Remote Homelab Admin with Tailscale


For a video version of me walking through this, you can watch here.

Network Overview

Tailscale is a super cool networking tool, and Iximiuz Labs Playgrounds are a great place to run a Remote Homelab, so I thought I’d smash em together and make a super cool tech sandwich!

Author’s note: This particular playground will only run on the Iximiuz Labs premium tier, but you can sign up for a tailscale account and play with your own VMs for free.

The setup

To make things at least a little authentic, I replicated a very basic homelab running two main services:

  • home assistant
  • jellyfin

and two monitoring components:

  • grafana
  • prometheus

along with tailscale on each of the nodes to allow me to connect securely via my phone (which I do via termux)

all inside of a multi-node k3s cluster. The graphic below shows these components and how they’re deployed across the 4 VMs.

Homelab architecture diagram

The cool networking part

The TL;DR of tailscale is that once you connect running nodes to your tailnet, they find each other by exchanging public keys through Tailscale’s centralized coordination server, and then use NAT traversal (including UDP hole punching) to discover each other’s public IP addresses and ports and open a direct connection.

The traffic heading out of a node first goes to the virtual network interface, where WireGuard encrypts each packet for the destination node (it encrypts rather than signs, using keys negotiated from that node’s public key). The encrypted packet is then wrapped in UDP and forwarded out the physical network interface to the public IP address of the destination node (found via UDP hole punching earlier).
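If you want to see which path your own traffic is taking, the tailscale CLI can show it. This is a minimal sketch, assuming the CLI is installed on a machine that has already joined the tailnet; the `inspect_tailnet` function name, the `TAILSCALE` variable, and the `node-02` peer name in the usage line are mine, not part of the original setup.

```shell
# Which binary to call; overridable so the function can be tried with a stub.
TAILSCALE="${TAILSCALE:-tailscale}"

# Print a quick picture of how this machine reaches a peer on the tailnet.
inspect_tailnet() {
  peer="$1"

  # Lists every peer in the tailnet and how each one is currently reached.
  "$TAILSCALE" status || return 1

  # Reports local network conditions, including whether UDP works at all.
  "$TAILSCALE" netcheck || return 1

  # Pings the peer over the tailnet and reports the path each reply took.
  "$TAILSCALE" ping "$peer"
}
```

Something like `inspect_tailnet node-02` should tell you whether you have a direct connection or whether traffic is being relayed through one of Tailscale's DERP servers, which is the fallback when hole punching fails.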

It’s all very slick and I’ve found it to be a great UX. If you want to read more in depth about how it all works, this is a good article.

And so if your phone has also joined the tailnet, it can communicate directly with the VMs inside the homelab (or in this case, remotelab) with end-to-end encrypted tunnels straight through to the private subnets behind NAT!

Your servers are never exposed to the public internet.

The scenario

I wanted to demo an authentic scenario to show off the capabilities, and so I did a bunch of work up front so I could crash one of the services (jellyfin) and then bring it back to life, all connecting via my phone over the secure tailnet.

…okay, so I cheat a little bit and crash the server from the Iximiuz Labs web interface, but you probably wouldn’t run that command from your phone anyway, so I’m gonna claim it’s still totally authentic ™️.

First, we set up our slideshow of http status cats via jellyfin…

Jellyfin slideshow with http status cats

Then we start filling up the server with fake logs

Filling up the node with fake logs

…and then all of a sudden, the slideshow turns to sadness!

Jellyfin slideshow broken

OH NO! The cats have crashed and we’re far away from home…time to debug remotely!

Luckily, we have a (very basic) monitoring stack set up with Prometheus and Grafana, so we can see if that tells us anything interesting…

Grafana dashboard showing node-02 down

Ah ha! It looks like one of the nodes is down, and we can also see that the disk usage spiked to 100%…that’s probably what crashed the server 💡
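For the curious, the same two signals can be pulled straight from the Prometheus HTTP API without opening Grafana. This is only a sketch: I'm assuming node-exporter's standard filesystem metrics and a root mountpoint of `/`, and the `PROM_URL` default and the function names are hypothetical. The dashboard in the screenshot may well use different queries.

```shell
# Hypothetical address; point this at wherever your Prometheus is reachable.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# PromQL for scrape targets that are currently failing (a down node shows up here).
down_targets_query() {
  printf '%s' 'up == 0'
}

# PromQL for the percentage of the root filesystem in use, per instance.
disk_used_query() {
  printf '%s' '100 * (1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})'
}

# Run an instant query against the Prometheus HTTP API and print the JSON reply.
prom_query() {
  curl -sS --get "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}
```

From termux that would look like `prom_query "$(down_targets_query)"` followed by `prom_query "$(disk_used_query)"`.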

Let’s dive into the node and see if we can work out where all the logs are filling up the server (we’ve already stopped the log generation script via the Iximiuz Labs web UI, so we don’t need to worry about that anymore).

Let’s first connect to the node with termux. Grab either the MagicDNS name or just the private (tailnet) IP of the node from the Tailscale UI and use that when you run:

ssh phoneuser@IP_ADDRESS_FROM_TAILSCALE_UI

Note: this requires a valid authorized_keys file in the phoneuser user’s ~/.ssh directory on the node, containing the public key from a keypair generated on your phone.
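If you haven't set that up yet, here is one way to do it. Treat it as a sketch under a few assumptions: an ed25519 key, the default `~/.ssh` layout, and a helper name (`authorize_key`) that I made up for illustration.

```shell
# On the phone (termux), generate a keypair and print the public half:
#   ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519
#   cat ~/.ssh/id_ed25519.pub

# On the node, add that public key for a user, with the permissions sshd expects.
# Arguments: the full public key line, then the user's home directory.
authorize_key() {
  pubkey="$1"
  home_dir="$2"
  ssh_dir="$home_dir/.ssh"
  auth_file="$ssh_dir/authorized_keys"

  # sshd tends to ignore keys when these are writable by other users.
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
  touch "$auth_file" && chmod 600 "$auth_file" || return 1

  # Append only if this exact line is not already present.
  grep -qxF -- "$pubkey" "$auth_file" || printf '%s\n' "$pubkey" >> "$auth_file"
}
```

Run as phoneuser on the node, that would be `authorize_key "<contents of id_ed25519.pub>" "$HOME"`. Running it twice with the same key leaves a single entry in the file.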

Then we can run this command to see where all the disk bloat is

sudo du -h --max-depth=1 /

Looks like it’s the /var directory. Looking further down the directory tree, we can see the culprit is in /var/log/fake_logs.
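Walking down the tree one `du` at a time gets tedious on a phone keyboard, so here's a rough sketch that repeatedly follows the largest subdirectory. It assumes GNU `du` (the same `--max-depth` flag used above); the function names are hypothetical, and unreadable directories are silently skipped, so run it from a root shell if you want complete numbers.

```shell
# Print the largest immediate subdirectory of the given directory.
largest_child() {
  dir="${1%/}"                 # strip a trailing slash so paths compare cleanly
  [ -n "$dir" ] || dir="/"     # "/" itself would otherwise become empty

  # -x stays on one filesystem, -k reports sizes in KiB so sort -n works.
  du -x -k --max-depth=1 "$dir" 2>/dev/null \
    | sort -rn \
    | cut -f2- \
    | grep -vxF -- "$dir" \
    | head -n 1
}

# Follow the largest subdirectory a few levels down, printing each step.
drill_down() {
  current="$1"
  depth="${2:-3}"
  while [ "$depth" -gt 0 ]; do
    next="$(largest_child "$current")"
    [ -n "$next" ] || break    # no subdirectories left to descend into
    echo "$next"
    current="$next"
    depth=$((depth - 1))
  done
}
```

In this scenario `drill_down / 3` would print one path per level, which is the same walk described above done in a single command.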

We found all the (fake) logs…let’s clean them up:

sudo rm -rf /var/log/fake_logs/*
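One caveat with that `rm`: the `*` is expanded by your own unprivileged shell before sudo runs, so it matches nothing if your user can't read the directory, and it skips dotfiles either way. A slightly more defensive alternative, sketched here with a hypothetical `clean_dir_contents` helper and assuming GNU `find`, empties the directory while keeping the directory itself.

```shell
# Delete everything inside a directory, but leave the directory in place.
clean_dir_contents() {
  dir="$1"
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }

  # -mindepth 1 protects the directory itself; -delete removes children first.
  find "$dir" -mindepth 1 -delete
}
```

On the node you would call this from a root shell, or just run the `find` line directly as `sudo find /var/log/fake_logs -mindepth 1 -delete`.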

Now that there’s free space on the server, let’s restart the k3s-agent (the thing that crashed when we ran out of disk space).

sudo systemctl restart k3s-agent
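The agent can take a few seconds to come back, so it's worth confirming it actually did before moving on. Here's a small polling sketch; `wait_until` is a hypothetical helper of mine, and the attempt count and delay in the usage line are arbitrary.

```shell
# Retry a command until it succeeds.
# Arguments: max attempts, delay between attempts in seconds, then the command.
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2

  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0                 # command succeeded, stop polling
    fi
    # Only sleep if there is another attempt coming.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1                     # never succeeded within the allowed attempts
}
```

For example, `wait_until 30 2 systemctl is-active --quiet k3s-agent` polls for up to a minute. If it gives up, `sudo journalctl -u k3s-agent -n 50 --no-pager` is a reasonable next place to look.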

We probably also want to clean up all the leftover crud in the pod namespace (k here is a shell alias for kubectl)…

k delete po -l app=jellyfin
k delete po -l app=node-exporter

Since these are stateless services that mount in their config, they’re completely fine to be deleted and recreated.
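To avoid refreshing the pod list by hand after the delete, you can have kubectl block until the replacements are ready. A minimal sketch, assuming the pods are managed by a controller that recreates them; the function name, the `KUBECTL` variable, and the 120s timeout are my own choices.

```shell
# Which binary to call; overridable so the function can be tried with a stub.
KUBECTL="${KUBECTL:-kubectl}"

# Delete the pods matching a label selector, then wait for replacements.
recreate_and_wait() {
  selector="$1"

  # Removes the stale pods; their controller should schedule new ones.
  "$KUBECTL" delete pod -l "$selector" || return 1

  # Blocks until matching pods report Ready. This can fail with "no matching
  # resources" if the replacement has not been created yet; just rerun it.
  "$KUBECTL" wait --for=condition=Ready pod -l "$selector" --timeout=120s
}
```

Usage would be `recreate_and_wait app=jellyfin` and then `recreate_and_wait app=node-exporter`.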

And now we can see that the node-exporter is back up and the pod has returned to normal functioning. Our jellyfin server is back up, and we can continue admiring how cats’ facial expressions never change, and that’s why they’re so funny.

Jellyfin cats slideshow is back

Wrapping up

Both Tailscale and Iximiuz Labs are very fun and approachable tools to help us learn about different homelab components, and they let us do it (basically) for free! What an internet!
