Guix for machine provisioning and management

Fabio Natali, 17 January 2024

Intro

This post is about using Guix for the provisioning and management of cloud machines and services. Beyond the official documentation, there are various great tutorials around this topic already, like this one, this one, or this other one. I'm writing this up primarily as a note-to-self, and in case my specific approach can be of interest to anyone else.

In summary, we'll be looking at how to build a basic Guix system image and how to use it to provision a cloud machine. We'll see how the machine can be updated and reconfigured via guix deploy. We'll start with a simple operating system definition and then iterate on it to install extra packages and services, like an Nginx webserver.

If this sounds interesting, let's crack on then!

A base system definition

I love the idea of being able to quickly spin up a new cloud server when needed. There are countless cloud providers and provisioning tools that make this possible programmatically and with very little effort. Sweet!

However, I also want my machines to be based on Guix, and that's where things become less straightforward. At the time of this writing, Guix images are only offered by a limited number of providers. In general, one has to build their own image and then upload it to the provider's image repository.

Let's see how to build a system image which can then be uploaded to the cloud provider. What follows is a fairly minimal and generic system definition for a cloud machine.

We want the machine to be accessible via SSH, with the key indicated in %ssh-public-key. We also want to be able to operate on the machine via guix deploy, with the key indicated in %guix-key. SSH key pairs can be generated with ssh-keygen and Guix Archive public keys with guix archive --generate-key.

WARNING: These keys will be granted blanket access to any server based on this system definition. Make sure these keys are the correct ones and that their private counterparts are handled securely.
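
As a rough sketch of the key generation step, the following creates a dedicated Ed25519 SSH key pair and prints its public half, which is what goes into %ssh-public-key. The key directory, file name, and comment are hypothetical choices rather than anything the setup requires, and the empty passphrase is only there to keep the example non-interactive.

```shell
#!/bin/sh
set -eu

# Hypothetical location for the key pair; override with KEY_DIR.
key_dir="${KEY_DIR:-./keys}"
key_file="${key_dir}/guix-cloud-ed25519"

mkdir -p "${key_dir}"
chmod 700 "${key_dir}"

# Generate the pair only once, so an existing key is never overwritten.
if [ ! -f "${key_file}" ]; then
    ssh-keygen -q -t ed25519 -N "" -C "guix-cloud" -f "${key_file}"
fi

# Print the public key: this is the string to paste into %ssh-public-key.
cat "${key_file}.pub"

# The Guix Archive key pair is generated separately, as root, on the
# coordinator:
#   guix archive --generate-key
# Its public half should then be in /etc/guix/signing-key.pub.
```

For anything long-lived, a passphrase-protected key is probably the better choice.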

Save this as ./build/operating-system-base.scm.

(define-module (operating-system-base)
  #:use-module (gnu)
  #:use-module (gnu packages ssh)
  #:use-module (gnu services admin)
  #:use-module (gnu services networking)
  #:use-module (gnu services ssh)
  #:export (operating-system-base))

(define %ssh-public-key
  (plain-file
   "ssh-key.pub"
   "..."))

(define %guix-key
  (plain-file
   "guix-key.pub"
   "(public-key (ecc (curve Ed25519) (q ...)))"))

(define operating-system-base
  (operating-system
   (host-name "host")
   (timezone "Europe/London")
   (locale "en_US.UTF-8")
   (bootloader (bootloader-configuration
                (bootloader grub-bootloader)
                (targets '("/dev/vda"))))
   (file-systems (cons
                  (file-system
                   (device "/dev/vda2")
                   (mount-point "/")
                   (type "ext4"))
                  %base-file-systems))
   (users (cons
           (user-account
            (name "user")
            (group "users")
            (supplementary-groups '("wheel"))
            (home-directory "/home/user"))
           %base-user-accounts))
   (sudoers-file
    (plain-file
     "sudoers"
     (string-append (plain-file-content %sudoers-specification)
                    "%wheel ALL = NOPASSWD: ALL")))
   (services (cons*
              (service unattended-upgrade-service-type)
              (service dhcp-client-service-type)
              (service
               openssh-service-type
               (openssh-configuration
                (openssh openssh-sans-x)
                (authorized-keys `(("user" ,%ssh-public-key)))
                (password-authentication? #f)))
              (modify-services
               %base-services
               (guix-service-type config =>
                                  (guix-configuration
                                   (authorized-keys
                                    (cons %guix-key
                                          %default-authorized-guix-keys)))))))

operating-system-base

Build

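This is how to build the system image on a Guix system. The command refers to the system definition by its bare file name, so it is meant to be run from within ./build: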

cp `guix system image \
    --image-size=20GB \
    --image-type=qcow2 \
    --save-provenance \
    operating-system-base.scm` image.qcow2
chmod u+w image.qcow2

Upload

The image can now be uploaded to a cloud provider of your choice (if they accept custom images) and a cloud machine can be booted based on it.

I usually make the image available at a URL (e.g. at some S3-like storage service) and upload it as shown below. The following examples are based on DigitalOcean and its command-line tool doctl, but any provider that accepts custom images will do.

WARNING: Your cloud provider might charge you for some of these operations. Make sure you're familiar with your provider's pricing plan. Remember to turn resources off when not needed.

export url='https://example.com/image.qcow2'
export region=lon1
doctl compute image create "guix" --image-url "${url}" --region "${region}"

doctl compute image list will give you a list of available images. During the upload an image is reported as NEW; once uploaded, it is marked as available.

Provisioning

A new DigitalOcean machine can be provisioned as follows. Be careful though, as cloud providers are likely to charge something for this!

export image_id=...
export region=lon1
export size=s-2vcpu-4gb
export ssh_key_id=...
doctl compute droplet create \
    --image "${image_id}" \
    --region "${region}" \
    --size "${size}" \
    --ssh-keys "${ssh_key_id}" \
    --wait \
    test

The DigitalOcean API seems to require an SSH public key, even if this is not needed by the provided image. Provide one just to make the API happy. The list of available SSH keys can be retrieved with doctl compute ssh-key list.

Check whether the machine has been provisioned with doctl compute droplet list.
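
To get hold of the new machine's address, which is needed as <server-address> later on, something along these lines may help. It assumes that doctl's --format and --no-header options work with droplet list and that Name and PublicIPv4 are valid column names; droplet_ip is a hypothetical helper, not part of doctl.

```shell
# droplet_ip NAME: read "<name> <ipv4>" lines on standard input and print
# the address of the first droplet called NAME. Fails if none matches.
droplet_ip() {
    awk -v name="$1" '$1 == name { print $2; found = 1; exit }
                      END { exit !found }'
}

# Usage, assuming the droplet was named "test" as above:
#   doctl compute droplet list --format Name,PublicIPv4 --no-header \
#       | droplet_ip test
```

Keeping the parsing in a small function means it can be tried out on sample text without touching the provider's API.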

First connection

If everything has been configured correctly, it should be possible to SSH into the machine, using the SSH key authorised in the original system definition. It should also be possible to update or reconfigure the machine via guix deploy, using the Guix Archive key authorised in the original system definition.

WARNING: When connecting to the remote machine via SSH for the first time, we are asked to verify the authenticity of the server's SSH fingerprint. The current setup doesn't offer a way to verify the fingerprint, which must then be accepted on a trust-on-first-use (TOFU) basis. This is not good and must be fixed. See this thread.

A machine definition

We've looked at how to build a basic Guix system image and use it to provision a cloud machine. Let's now see how the machine can be updated and reconfigured via guix deploy. We'll be running guix deploy from the authorised coordinator, i.e. from the machine whose Guix Archive public key was included in the original system definition.

First, guix deploy will need a machine configuration. Assuming that our system definition was saved as ./build/operating-system-base.scm, you can save this as ./build/machine-configuration.scm.

(define-module (machine-configuration)
  #:use-module (gnu)
  #:use-module (gnu machine)
  #:use-module (gnu machine ssh)
  #:export (machine-configuration))

(define machine-configuration
  (machine-ssh-configuration
   (host-name "<server-address>")
   (host-key "<ssh-key>")
   (system "x86_64-linux")
   (user "user")
   (identity "/home/user/.ssh/id_rsa")))
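
The host-key field expects the server's public host key in OpenSSH format, i.e. a key type followed by the base64-encoded key (the example in the Guix manual also carries a trailing comment). As a minimal sketch, the helper below, under the hypothetical name host_key_from_scan, extracts that form from ssh-keyscan output. Bear in mind that a key fetched over the network this way is still only trusted on first use, unless it is compared against a copy obtained through another channel.

```shell
# host_key_from_scan: read ssh-keyscan output ("<host> <type> <base64>")
# on standard input and print "<type> <base64>" for the first key found,
# skipping any comment lines.
host_key_from_scan() {
    awk '!/^#/ && NF >= 3 { print $2 " " $3; exit }'
}

# Usage (needs network access to the server):
#   ssh-keyscan -t ed25519 "<server-address>" 2>/dev/null | host_key_from_scan
```

The printed string is what would replace the <ssh-key> placeholder in the machine configuration.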

Let's now combine the system and the machine definition together into a single file, say ./build/deploy-base.scm, that can be fed to guix deploy.

(use-modules (gnu))
(use-modules (operating-system-base))
(use-modules (machine-configuration))

(list
 (machine
  (operating-system operating-system-base)
  (environment managed-host-environment-type)
  (configuration machine-configuration)))

It is now possible to run commands on the remote machine from the coordinator, for instance:

guix deploy build/deploy-base.scm --load-path=build --execute -- guix describe

Or:

guix deploy build/deploy-base.scm --load-path=build --execute -- uname --all

The remote machine can be updated as follows:

guix pull && guix deploy build/deploy-base.scm --load-path=build

The hello world package

The remote machine can also be reconfigured via guix deploy. For instance, the following definition can be used to install the hello package on top of the base system. Save this as ./build/deploy-hello.scm.

(use-modules (gnu))
(use-modules (gnu packages base))
(use-modules (operating-system-base))
(use-modules (machine-configuration))

(define operating-system-hello
  (operating-system
   (inherit operating-system-base)
   (packages (cons* hello
                    (operating-system-packages operating-system-base)))))

(list
 (machine
  (operating-system operating-system-hello)
  (environment managed-host-environment-type)
  (configuration machine-configuration)))

One can now deploy the new system and launch the hello command remotely, as follows.

guix deploy build/deploy-hello.scm --load-path=build
guix deploy build/deploy-hello.scm --load-path=build --execute -- hello

Nginx

Let's expand on the previous operating system definition and install an Nginx webserver. Save this as ./build/deploy-nginx.scm.

(use-modules (gnu))
(use-modules (gnu services ssh))
(use-modules (gnu services web))
(use-modules (operating-system-base))
(use-modules (machine-configuration))

(define operating-system-nginx
  (operating-system
   (inherit operating-system-base)
   (services (cons*
              (service nginx-service-type
                       (nginx-configuration
                        (server-blocks
                         (list (nginx-server-configuration
                                (listen '("80"))
                                (root "/var/www"))))))
              (operating-system-user-services operating-system-base)))))

(list
 (machine
  (operating-system operating-system-nginx)
  (environment managed-host-environment-type)
  (configuration machine-configuration)))

Reconfigure the server:

guix deploy build/deploy-nginx.scm --load-path=build
ssh user@<server-address> "echo hello world | sudo tee /var/www/index.html"

Test that everything has worked correctly with curl <server-address>. You should get a hello world back. Exciting!

Nginx + TLS

Let's now add TLS support with the help of Certbot, an ACME client, and Let's Encrypt. We will need a domain name pointing to our server instance. Update the relevant parts in the following system definitions, e.g. domain name and ACME email. Save this as ./build/deploy-nginx-tls-init.scm.

(use-modules (gnu))
(use-modules (gnu services certbot))
(use-modules (gnu services ssh))
(use-modules (gnu services web))
(use-modules (operating-system-base))
(use-modules (machine-configuration))

(define %certbot-deploy-hook
  (program-file
   "certbot-deploy-hook.scm"
   (with-imported-modules
    '((gnu services herd))
    #~(begin
        (use-modules (gnu services herd))
        (with-shepherd-action 'nginx ('reload) result result)))))

;; You may want to use a staging ACME server when testing.
(define %certbot-server
  "https://acme-staging-v02.api.letsencrypt.org/directory")

(define (cert-path host file)
  (format #f "/etc/letsencrypt/live/~a/~a.pem" host (symbol->string file)))

(define operating-system-nginx-tls
  (operating-system
   (inherit operating-system-base)
   (services (cons*
              (service certbot-service-type
                       (certbot-configuration
                        (email "user@example.com")
                        (server %certbot-server)
                        (certificates
                         (list
                          (certificate-configuration
                           (domains '("example.com"))
                           (deploy-hook %certbot-deploy-hook))))))
              (service nginx-service-type
                       (nginx-configuration
                        (server-blocks
                         (list
                          (nginx-server-configuration
                           (listen '("80"))
                           (root "/var/www")
                           (server-name '("example.com")))))))
              (operating-system-user-services operating-system-base)))))

(list
 (machine
  (operating-system operating-system-nginx-tls)
  (environment managed-host-environment-type)
  (configuration machine-configuration)))

The system can be deployed and TLS can be initialised as follows.

guix deploy build/deploy-nginx-tls-init.scm --load-path=build
guix deploy build/deploy-nginx-tls-init.scm --load-path=build --execute -- \
    /var/lib/certbot/renew-certificates

Did you get a "Successfully received certificate" message? Sweet. We are good to go and deploy the final system, which reconfigures the web server to use the TLS certificates generated above. The two definitions differ in their nginx-server-configuration block, which now listens on port 443 and points at the certificate files, and in the addition of the openssl package. Save this as ./build/deploy-nginx-tls.scm.

(use-modules (gnu))
(use-modules (gnu packages tls))
(use-modules (gnu services certbot))
(use-modules (gnu services ssh))
(use-modules (gnu services web))
(use-modules (operating-system-base))
(use-modules (machine-configuration))

(define %certbot-deploy-hook
  (program-file
   "certbot-deploy-hook.scm"
   (with-imported-modules
    '((gnu services herd))
    #~(begin
        (use-modules (gnu services herd))
        (with-shepherd-action 'nginx ('reload) result result)))))

;; You may want to use a staging ACME server when testing.
(define %certbot-server
  "https://acme-staging-v02.api.letsencrypt.org/directory")

(define (cert-path host file)
  (format #f "/etc/letsencrypt/live/~a/~a.pem" host (symbol->string file)))

(define operating-system-nginx-tls
  (operating-system
   (inherit operating-system-base)
   (packages (cons*
              openssl
              (operating-system-packages operating-system-base)))
   (services (cons*
              (service certbot-service-type
                       (certbot-configuration
                        (email "user@example.com")
                        (server %certbot-server)
                        (certificates
                         (list
                          (certificate-configuration
                           (domains '("example.com"))
                           (deploy-hook %certbot-deploy-hook))))))
              (service nginx-service-type
                       (nginx-configuration
                        (server-blocks
                         (list
                          (nginx-server-configuration
                           (listen '("443 ssl" "[::]:443 ssl"))
                           (root "/var/www")
                           (server-name '("example.com"))
                           (ssl-certificate
                            (cert-path "example.com" 'fullchain))
                           (ssl-certificate-key
                            (cert-path "example.com" 'privkey)))))))
              (operating-system-user-services operating-system-base)))))

(list
 (machine
  (operating-system operating-system-nginx-tls)
  (environment managed-host-environment-type)
  (configuration machine-configuration)))

Redeploy and restart Nginx.

guix deploy build/deploy-nginx-tls.scm --load-path=build
guix deploy build/deploy-nginx-tls.scm --load-path=build --execute \
    -- herd restart nginx

The web server will now be reachable via TLS, i.e. via its https://... form. Yay! If you've used a staging ACME server rather than a production one (see the variable %certbot-server), you may have to force your browser to accept the certificate, since staging certificates are not trusted by browsers.
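
One way to tell which kind of certificate is being served is to look at its issuer. The sketch below assumes that certificates from the Let's Encrypt staging environment carry "(STAGING)" in their issuer name, and that openssl is available on the machine running the check; is_staging_issuer is a hypothetical helper.

```shell
# is_staging_issuer: succeed if the issuer line read from standard input
# looks like it belongs to the Let's Encrypt staging environment.
is_staging_issuer() {
    grep -q -F '(STAGING)'
}

# Usage, with example.com standing in for the real domain:
#   openssl s_client -connect example.com:443 -servername example.com \
#       </dev/null 2>/dev/null \
#       | openssl x509 -noout -issuer \
#       | is_staging_issuer && echo "staging certificate"
```

If the check reports a staging certificate, switching %certbot-server to the production ACME endpoint and renewing should yield one that browsers accept.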

Summary

In this post we've looked at how to build a basic Guix system image and use it to provision a cloud machine. We've seen how the machine can be updated and reconfigured via guix deploy. We've seen how to extend the initial system definition and install extra packages and services.

I hope the post does a decent job of exemplifying the power of our favourite operating system, also when it comes to the provisioning and management of cloud machines. If you have any questions or comments, please get in touch via email or on the Fediverse. Until next time!

Revision 3a220d5.