Intro
I've recently completed a major refactoring of my home server setup.
As originally configured, my home server acted as:
- A WireGuard entry point for my road-warrior-style home VPN.
- A makeshift virtualisation environment to host a small number of virtual machines, including a media server.
- A BorgBackup server.
Everything was running fine™.
In particular,
- I could connect to the media server from my LAN as well as from the VPN. For instance, I could use my phone to listen to my music and audiobooks while on the go.
- My computer would periodically send (encrypted, deduplicated) backups to the server, either directly when at home or via the VPN when outside.
It all sounds pretty good. Then why a major refactoring?
Three main reasons: simplicity, robustness, and ease of maintenance. My server's configuration had grown to a level of complexity that made me uncomfortable. Because of the many moving parts, the system had become a bit temperamental, and issues were typically hard to diagnose and fix.
Let's see what the major sources of complexity were. And what tradeoffs I eventually accepted to reestablish simplicity.
Some more context
All my machines (server, laptop, VMs) run Guix, the functional Guile Scheme-based operating system. My local network is dual-stack IPv4 and IPv6.
I like the idea of virtualisation for the level of isolation that it provides. Isolation is good both for running applications that you don't trust and for those applications that you do trust but are internet-facing and therefore could be exploited by a remote attacker.
For convenience, my VMs are usually also Guix-based, since that's the operating system I'm most familiar with. It's great to be able to programmatically create an image from a Guix system definition and then run it with QEMU.
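As an illustration, building a QCOW2 image from a system definition and booting it could look something like this (the file name is hypothetical and the exact store path will differ):

# Build a QCOW2 disk image from a system definition; the command prints
# the path of the resulting image in the store.
guix system image --image-type=qcow2 my-vm-config.scm

# Store items are read-only, so copy the image out before booting it
# with a writable disk.
cp /gnu/store/...-image.qcow2 vm.qcow2 && chmod +w vm.qcow2
qemu-system-x86_64 -enable-kvm -m 2048 -nographic -drive if=virtio,file=vm.qcow2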
Sources of complexity
So where was the complexity, and what made the system unstable and difficult to maintain?
WireGuard road-warrior setup
For the server to act as a WireGuard entry point in what is called a road-warrior or point-to-site setup, its firewall needs to forward and masquerade packets from and to the LAN.
flush ruleset

define public_interface = <network-public-interface>
define wireguard_interface = <wireguard-interface>
define wireguard_port = <wireguard-port>
define lan_v4 = {<network-lan-v4>}
define lan_v6 = {<network-lan-v6>}

table inet filter {
  chain inbound {
    type filter hook input priority 0; policy drop;
    iif lo accept
    meta l4proto {icmp, ipv6-icmp} accept
    ct state vmap {established: accept, related: accept, invalid: drop}
    ct state new limit rate over 1/second burst 10 packets drop
    # Accept WireGuard and SSH.
    iifname $public_interface udp dport $wireguard_port accept
    iifname $public_interface tcp dport ssh ip saddr $lan_v4 accept
    iifname $public_interface tcp dport ssh ip6 saddr $lan_v6 accept
    iifname $wireguard_interface tcp dport ssh accept
  }
  chain forward {
    # Forward packets from WireGuard to LAN.
    type filter hook forward priority 0; policy drop;
    ct state vmap {established: accept, related: accept, invalid: drop}
    iifname $wireguard_interface oifname $public_interface accept
  }
}

table inet nat {
  chain postrouting {
    # Masquerade packets from WireGuard to LAN and back.
    type nat hook postrouting priority 100; policy accept;
    iifname $wireguard_interface oifname $public_interface masquerade
  }
}
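On Guix, a ruleset like the one above can be wired into the system declaration through nftables-service-type. A minimal sketch, assuming the ruleset is kept in a local nftables.conf file next to the configuration (the file name is hypothetical):

;; Install the ruleset above as the system firewall.
;; "nftables.conf" is a hypothetical local file holding the ruleset.
(service nftables-service-type
         (nftables-configuration
          (ruleset (local-file "nftables.conf"))))

The service goes into the operating system's services field alongside the ones shown below.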
For the above to work, forwarding must be enabled via the kernel parameters net.ipv4.ip_forward and net.ipv6.conf.all.forwarding.
Once forwarding is enabled, however, the kernel no longer accepts IPv6 Router Advertisement packets and the machine is no longer able to self-assign an IPv6 address via SLAAC (stateless address autoconfiguration). This is documented in various blog posts such as this and this, as I found out after some debugging. It can be fixed with another kernel parameter, net.ipv6.conf.all.accept_ra: setting it to 2 tells the kernel to keep accepting router advertisements even when forwarding is enabled.
All of this can be sorted with the following sysctl configuration.
(define server-operating-system
  (operating-system
    ...
    (services
     (cons* ...
            (modify-services %base-services
              (sysctl-service-type
               config => (sysctl-configuration
                          (settings
                           (cons* '("net.ipv4.ip_forward" . "1")
                                  '("net.ipv6.conf.all.forwarding" . "1")
                                  '("net.ipv6.conf.all.accept_ra" . "2")
                                  %default-sysctl-settings)))))
            ...))))
Internet-reachable VMs
In most cases, I wanted the VMs to host a service and therefore to be reachable from the internet, as if they were ordinary physical machines in my local network. This requires a bridge network interface, which can be configured on the server as detailed below. The service takes inspiration from Sergey's setup; see this mailing list thread. 🙏
(define bridge-interface-service-type
  (shepherd-service-type
   'bridge-interface
   (lambda (config)
     (let ((bridge-name (car config))
           (eth-name (cdr config)))
       (shepherd-service
        (documentation "Define a bridge interface and use it as master.")
        (provision (list 'networking-bridge))
        (requirement '(udev))
        (modules '((ip link)))
        (start
         (with-extensions (list guile-netlink)
           #~(lambda _
               (wait-for-link #$eth-name #:blocking? #f)
               (link-add #$bridge-name "bridge")
               ;; Set the MAC address to get a fixed IPv6 address.
               (link-set #$bridge-name #:address "<network-bridge-mac-address>")
               (link-set #$bridge-name #:up #t)
               (link-set #$eth-name #:up #t #:master #$bridge-name))))
        (stop
         (with-extensions (list guile-netlink)
           #~(lambda _
               (let ((ip (string-append #$iproute "/sbin/ip")))
                 (system* ip "link" "set" #$eth-name "nomaster")
                 (link-set #$eth-name #:down #t)
                 (link-del #$bridge-name))))))))
   (description "Define a bridge interface and use it as master.")))

(define server-operating-system
  (operating-system
    ...
    (services
     (cons* ...
            (service bridge-interface-service-type
                     (cons "<network-public-interface>"
                           "<network-physical-interface>"))
            (service dhcp-client-service-type
                     (dhcp-client-configuration
                      (shepherd-requirement '(networking-bridge))
                      (interfaces '("<network-public-interface>"))))
            ...))))
For QEMU to work with the network bridge, one has to use a special QEMU configuration, as explained in this section of the Guix Cookbook.
(define server-operating-system
  (operating-system
    ...
    (setuid-programs
     (cons* (setuid-program
             (program (file-append qemu "/libexec/qemu-bridge-helper")))
            %setuid-programs))
    (services
     (cons* (extra-special-file "/etc/qemu/bridge.conf"
                                (plain-file "bridge.conf"
                                            "allow <network-public-interface>"))
            ...))))
With this in place, and with the following command line parameters, QEMU will take advantage of the new bridge interface.
screen qemu-system-x86_64 \
  ... \
  -device virtio-net-pci,netdev=net0,mac=<media-server-mac-address> \
  -netdev bridge,id=net0,br=br0,helper="/run/setuid-programs/qemu-bridge-helper"
See below for the full QEMU command.
As a final note, this assumes the server is connected to the network via Ethernet. While this wasn't a problem in my case, you'll need to do things differently if you use WiFi; see this page of the Guix Cookbook or search for "QEMU bridge with wifi" on the internet.
Shared storage and the quest for stateless virtualisation
Ideally, I wanted to treat the VMs as disposable systems, their state (or the relevant parts thereof) persisted in storage shared with the host. This is convenient and elegant. To update a system, for example, you just kill it, rebuild the image, and launch it again - a very much "functional" approach.
However, sharing storage across a host and its QEMU-run guests is not exactly trivial. Depending on the QEMU parameters used, one can experience file permission issues when accessing files from the guests. On the other hand, a carefree approach to setting these QEMU parameters, just to get guest access sorted, may weaken the host-guest isolation.
This is the full QEMU command that one can use to launch a VM on the host, set up the network, and give the VM read-only access to the host's /media/vm folder.
screen qemu-system-x86_64 \
  -enable-kvm \
  -m <media-server-machine-ram> \
  -smp <media-server-machine-cpus> \
  -device virtio-net-pci,netdev=i,mac=<media-server-mac-address> \
  -netdev bridge,id=i,br=b,helper="/run/setuid-programs/qemu-bridge-helper" \
  -drive if=virtio,file="<media-server-image-path>" \
  -virtfs local,path="/media/vm",security_model=mapped,mount_tag="vm" \
  -nographic
The folder can then be mounted by the guest with:
mount -t 9p -o trans=virtio,version=9p2000.L vm /media
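To make the mount persistent, the share can also be declared in the guest's Guix system definition. A sketch, assuming the mount tag and mount point used above:

;; Declare the 9p share in the guest's file-systems field so it is
;; mounted at boot; "vm" is the mount tag passed to -virtfs on the host.
(file-system
  (mount-point "/media")
  (device "vm")
  (type "9p")
  (options "trans=virtio,version=9p2000.L")
  (create-mount-point? #t))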
This is not as good as read-write access, and it doesn't accomplish my objective of purely disposable VMs. I wasn't able to find a solution to this, at least not before I decided to take a different approach anyway.
My new setup
I eventually decided to heavily refactor my setup. I wanted things to be simpler without compromising on security. The solution was to reduce the requirements - featureful, secure, simple, pick any two. I went for secure and simple.
To start with, no more virtualisation. Instead, I'll be running a small number of trusted applications directly on the server and only accessing them from within the VPN. For now, I'd rather give up untrusted applications and publicly exposed services in exchange for a leaner setup.
An alternative would be Proxmox or a similar virtualisation environment that does all the heavy lifting for me, but ultimately I'm fine with reduced functionality.
No virtualisation means no more need for:
- the forwarding and masquerading firewall rules,
- the forwarding and router advertisement kernel parameters,
- the network bridge,
- QEMU, the bridge helper, and scripts for storage sharing between host and guests.
Let's take the media server as an example. I used to run Jellyfin within a virtual machine. Jellyfin has many nice features, of course: a good user interface, a mobile client, a Roku client, and automatic fetching of film and music metadata, including images. But ultimately I'll be perfectly fine with a more minimalist app such as ReadyMedia (formerly known as MiniDLNA). I'm more comfortable running ReadyMedia natively (without virtualisation) as long as I keep it within the VPN.
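For reference, ReadyMedia is configured through a small minidlna.conf file. A sketch with hypothetical paths, bound to the WireGuard interface (wg0 here, as an example) so the service stays reachable only from the VPN:

# Hypothetical media locations (A = audio).
media_dir=A,/srv/media/music
media_dir=A,/srv/media/audiobooks
friendly_name=Home media
# Listen only on the WireGuard interface to keep the service VPN-only.
network_interface=wg0
inotify=yes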
I'll be sharing my final setup in my next post. Stay tuned if you want to know how my laptop and home server are configured!