
Starting a home lab: Docker management containers

Preface

This article is going to discuss the containers I use to manage my homelab, as well as the installation process for getting everything set up. It’s best to think of this set of containers as tools that are only really useful to me as the administrator of the system, and that will likely never be looked at by anybody else. This builds on the previous articles in my homelab series (such as container networking), so if something isn’t explained, it might be a good idea to look at one of the previous articles.

This set of containers can essentially be split into two parts: the first allows for the management and viewing of information related to my containers, while the second handles the configuration management of other servers and machines within my homelab (i.e. machines that aren’t part of my Proxmox setup). There’s some overlap between the two, but I still think it’s useful to group the containers this way.

Considerations

As with all the other containers I’m choosing to use in my homelab, I’m following the same rules I laid out earlier, but in addition to those I’m looking for containers which provide the following:

  • Access to logs
    • using docker logs <container> is fine, but it means I need to be logged onto the server to see them and it can be a little clunky with the number of services I’m running
  • Remote desktop
    • I occasionally use machines that don’t have an SSH client installed, so having a web service that is ready to go can speed me up.
  • Container management
    • Really just a place to start/stop containers without having to log into the box and use the docker restart <container> command, as well as to get an overall view of my Docker installation
  • Automatic container updates
    • I run a lot of services, so having something that can update containers without manual intervention means I spend less time maintaining the environment, even accounting for the effort to fix breaking changes
  • PXE network boot support
    • This allows a machine to install an operating system over the network; the service that provides this is traditionally called a PXE boot server
  • Machine management
    • I want to use a tool of some description that can help me when setting up new machines, so that I can automate the provisioning process.

Choices

There’s a lot of options within this space, but I’ve ultimately decided to implement the following containers:

  • Docker Socket Proxy
    • This limits access to the Docker socket to only the API surfaces I want to expose to services that I’m using over the internet. As these containers require a lot of access to the underlying machine, I’ve opted to provide the Docker socket functions they need through a socket proxy.
  • Dozzle
    • This provides access to logs, and its deep focus on only logging gives an excellent experience for quickly finding what the issue is, along with some great features like built-in filtering using regular expressions and essentially instantly picking up logs when I start a new container. Over the past 6 months I’ve found Dozzle so useful for troubleshooting that I automatically open it whenever I start messing around in my homelab.
  • Portainer
    • I’m definitely not making use of the full abilities of Portainer around deploying containers, but as I mentioned it’s useful for viewing what’s happening, how my containers are set up (i.e. attached labels, environment variables, etc.) and restarting containers. It does have the ability to view logs, but I’ve found Dozzle more user-friendly for that. The function I’ve personally found most useful is the ability to connect to the console of a container, as finding the right shell manually can be a little time consuming.
  • Apache Guacamole
    • This is a remote desktop client that can be used through a web browser. While I’m currently only using SSH, it does have support for RDP, VNC, Telnet and Kubernetes. This is a nice to have, as I do tend to log onto machines directly given I usually have an SSH client on the machine I’m connecting from. However, it is useful for having access to all my machines in one place, and it does offer copy/paste, which is better than the default Proxmox client.
  • Watchtower
    • This automatically updates my containers to the latest version of the tagged container. As mentioned, this can be dangerous, so I do use specific version tags for some of my containers. Watchtower is an entirely command line based tool and doesn’t have a web UI, but I’ve found it to be very stable once running correctly.
  • Netboot XYZ
    • This container is a really simple PXE server that contains a lot of default tools and operating systems with a handy UI for selecting the version I want. It’s really made setting up new machines a breeze.
  • Semaphore
    • This is essentially a frontend to Ansible which, unlike Ansible Tower, doesn’t require Kubernetes to run. Instead it works by pulling playbooks from git repositories, which I’ve found really useful for making sure I keep everything backed up correctly.
    • In the past I have set up Ansible AWX, which is a first-party solution, but that requires Kubernetes to run, which is overkill in this setup; as of writing, its development has also been paused for 10 months for a major refactor.

Setup

File structure

The file structure is similar to what I’ve previously discussed and requires the following folders and files to be created:

text
data/
┗ management/
  ┣ guacamole/
  ┣ netboot/
  ┣ portainer/
  ┗ semaphore/
management/
┣ .env
┗ docker-compose.yaml
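
To save some clicking around, this layout can also be created in one go from a shell. This is just a sketch, assuming a bash-style shell run from the root of the homelab folder:

shell
mkdir -p data/management/{guacamole,netboot,portainer,semaphore}
mkdir -p management
touch management/.env management/docker-compose.yaml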

Common compose

This is a set of common Compose declarations. As mentioned in a previous article, I group my services by type as it currently feels more manageable, even though this differs from the common recommendation of one Compose file per service. As such, here’s a set of resources that will be used across services:

Environment variables:

shell
CADDY_HOST=<homelab domain> # setting the domain to make changing it easier
TIMEZONE=Europe/London # Used for some containers to set a local timezone - for me, this is London

Setting a time zone isn’t really necessary as it should be picked up normally. However, having it set like this doesn’t hurt and can occasionally catch weird errors.

YAML:

yaml
# name of the stack
name: management
# where the services I'm running are defined
services:
networks:
  # using the caddy network created in my networking tutorial using `docker network create caddy`
  caddy:
    external: true
# folders that persist between container restarts
volumes:

In future Compose declarations, I’ll mention the top level statement that it slots into. For example, when creating a service I’ll show it like this:

yaml
services:
  foo:
    networks:
      bar_network:

Docker socket proxy setup

This is a secondary container that’s only used by other containers within this YAML file, so the only ways to see that it works correctly are to use a health check and to try it with another service.

yaml
services:
  docker_socket_proxy:
    # making sure the container has a nice name
    container_name: docker_socket_proxy
    # I'm using the linuxserver.io version of this container, as I like the standard environment variables
    image: lscr.io/linuxserver/socket-proxy:latest
    # As discussed in the docker defaults article
    restart: always
    # healthcheck that calls the version endpoint of the docker socket
    healthcheck:
      test: wget --spider http://localhost:2375/version || exit 1
      interval: "30s"
      timeout: "5s"
      retries: 3
      start_period: "30s"
    # giving this container access to the docker socket on the host
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      # allow access to POST requests - this is not a good idea in "live" environments
      - POST=1
      # required for portainer
      - ALLOW_START=1
      # required for portainer
      - ALLOW_STOP=1
      # required for portainer
      - ALLOW_RESTARTS=1
      # required for portainer and dozzle
      - CONTAINERS=1
      # required for portainer
      - IMAGES=1
      # required for portainer
      - INFO=1
      # required for portainer
      - NETWORKS=1
      # required for portainer
      - SERVICES=1
      # required for portainer
      - TASKS=1
      # required for portainer
      - VOLUMES=1
    # access to a network that can be used by other containers
    networks:
      - management
    labels:
      # various autokuma configuration variables discussed in a future article
      kuma.management.group.name: "management"
      kuma.docker_socket_proxy.docker.parent_name: "management"
      kuma.docker_socket_proxy.docker.name: "docker_socket_proxy"
      kuma.docker_socket_proxy.docker.docker_container: "docker_socket_proxy"
      kuma.docker_socket_proxy.docker.docker_host: 1
networks:
  management:
    # give the management network a specific subnet
    # avoids issues with the Docker networking tool creating networks that conflict with my addresses on 192.168.0.0/16
    ipam:
      driver: default
      config:
        - subnet: 172.0.13.0/24

After this just run sudo docker compose up -d and the container should start.

I can then check that this container is running and healthy by running sudo docker ps -f name=docker_socket_proxy, which should return the following:

text
CONTAINER ID   IMAGE                                     COMMAND                  CREATED          STATUS                    PORTS      NAMES
68b0fa72925b   lscr.io/linuxserver/socket-proxy:latest   "/docker-entrypoint.…"   18 minutes ago   Up 18 minutes (healthy)   2375/tcp   docker_socket_proxy
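
Since the proxy has no web UI, another quick sanity check is to call its version endpoint from a throwaway container attached to the same network. This is only a sketch: the network name is an assumption (Compose normally prefixes it with the project name, so check sudo docker network ls first), and curlimages/curl is just a convenient image that ships with curl:

shell
# find the exact network name Compose created (likely management_management)
sudo docker network ls
# call the proxy's version endpoint from a temporary container on that network
sudo docker run --rm --network management_management curlimages/curl:latest \
  -s http://docker_socket_proxy:2375/version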

Dozzle setup

Dozzle would be really straightforward to set up as it doesn’t require any other services, but I’ve complicated things slightly by using the docker socket proxy in place of direct access to the docker socket.

NOTE: this container requires the networks and environment variables discussed in the Common compose section, as well as the docker_socket_proxy container

yaml
services:
  dozzle:
    # making sure the container has a nice name
    container_name: dozzle
    image: amir20/dozzle:latest
    # As discussed in the docker defaults article
    restart: always
    # calling the dozzle healthcheck command
    healthcheck:
      test: ["CMD", "/dozzle", "healthcheck"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 30s
    depends_on:
      # don't start until after the docker socket proxy has started
      - docker_socket_proxy
    # there's a web app for this, so we need the caddy network as well as management
    networks:
      # access to the external caddy network for SSL
      - caddy
      # access to the docker socket proxy
      - management
    environment:
      # use the London timezone set above
      - TZ=${TIMEZONE}
      # the DOCKER_HOST environment variable is how the docker socket proxy is accessed
      - DOCKER_HOST=tcp://docker_socket_proxy:2375
    labels:
      # various caddy config labels - see the networking article for more details
      caddy: dozzle.${CADDY_HOST}
      caddy.reverse_proxy: "{{upstreams 8080}}"
      caddy.tls.dns: cloudflare {env.CLOUDFLARE_API_TOKEN}
      # the dozzle container doesn't have any authentication by default, so I'm locking it behind the proxy auth discussed in the authentication article
      # details here: https://blog.jacklewis.dev/977-docker-auth/#forward-auth-setup
      caddy.import: authentik_server
      # various autokuma configuration variables discussed in a future article
      kuma.management.group.name: "management"
      kuma.dozzle.docker.parent_name: "management"
      kuma.dozzle.docker.name: "dozzle"
      kuma.dozzle.docker.docker_container: "dozzle"
      kuma.dozzle.docker.docker_host: 1

Once again, I can start the container by running docker compose up -d and then view the running container details with sudo docker ps -f name=dozzle:

text
CONTAINER ID   IMAGE                  COMMAND     CREATED          STATUS                    PORTS      NAMES
08c9b9577b20   amir20/dozzle:latest   "/dozzle"   10 minutes ago   Up 10 minutes (healthy)   8080/tcp   dozzle

I can now navigate to https://dozzle.<homelab domain> in my browser and view the front page of Dozzle:

Dozzle

As can be seen, there’s some basic information on the state of the containers in my homelab, but this application is designed around viewing logs easily. There are a lot of features for interacting with Dozzle, but my favourite is the text/regex search, which can be accessed by opening a log and pressing ctrl + f to quickly search through my logfiles:

Dozzle

Dozzle closing thoughts

I now have the ability to quickly access logs over the network, which vastly improves my workflow for checking for errors in my homelab.

Portainer setup

Portainer uses a single volume to store configuration values between restarts, and its setup has been slightly complicated by the use of the docker socket proxy.

Portainer environment

bash
PORTAINER_DATA=<homelab folder>/data/management/portainer # Used for a single volume

Portainer docker

NOTE: This Compose file depends on the common configuration above for setting the caddy URL and accessing the caddy network. In addition, this container takes advantage of the docker socket proxy created above.

yaml
services:
  portainer:
    container_name: portainer
    # alpine is needed for the healthcheck as the default tag doesn't have a way to do it
    image: portainer/portainer-ce:alpine
    volumes:
      - portainer:/data
    restart: always
    healthcheck:
      # calls the healthcheck endpoint using wget
      test: wget --spider http://localhost:9000/api/system/status || exit 1
      start_period: 30s
      interval: 30s
      timeout: 5s
      retries: 3
    # connection to caddy and the management network for access to the docker socket proxy
    networks:
      # access to the external caddy network for SSL
      - caddy
      # access to the docker socket proxy
      - management
    depends_on:
      # don't start until after the docker socket proxy has started
      - docker_socket_proxy
    labels:
      # various caddy config labels - see the networking article for more details
      caddy: portainer.${CADDY_HOST}
      caddy.reverse_proxy: "{{upstreams 9000}}"
      caddy.tls.dns: cloudflare {env.CLOUDFLARE_API_TOKEN}
      # autokuma configuration labels
      kuma.management.group.name: "management"
      kuma.portainer.docker.parent_name: "management"
      kuma.portainer.docker.name: "portainer"
      kuma.portainer.docker.docker_container: "portainer"
      kuma.portainer.docker.docker_host: 1
volumes:
  portainer:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ${PORTAINER_DATA}

After this is in the Docker compose, I can start the container by running docker compose up -d and then view the running container details with sudo docker ps -f name=portainer:

text
CONTAINER ID   IMAGE                           COMMAND                  CREATED          STATUS                    PORTS                          NAMES
0eea610e384c   portainer/portainer-ce:alpine   "/portainer -H tcp:/…"   25 minutes ago   Up 25 minutes (healthy)   8000/tcp, 9000/tcp, 9443/tcp   portainer

Portainer configuration

Portainer needs some configuration after start-up. I can do this by navigating to https://portainer.<homelab domain> in my browser, where I am greeted with an admin login screen:

Portainer

As I’m not restoring from backup, I just fill out the details with a username of admin and create a password for Portainer. Once this is done I click Create user, and I can then connect to my current environment by clicking the Add Environments button on the next screen:

Portainer

I’m using a Docker Standalone installation, so I then select that:

Portainer

I can finally add the environment by going to API, calling this environment something (I chose local) and entering the Docker API URL of docker_socket_proxy:2375. I’m also not using TLS as this is running locally, so I make sure to leave that unchecked:

Portainer

Selecting Connect causes a toast to appear in the top right with my new environment:

Portainer

Navigating back to the front page now shows that I’m connected to the local environment:

Portainer

As can be seen, there’s a fair amount of information here, but importantly I can see that there’s a stopped container, which is a hello world container I’d initially run to check that Docker was running. In order to make sure that everything is fitting together, I’m going to remove it by going to Containers, highlighting the stopped container and selecting Remove in the top right:

Portainer

After confirming I want to delete the container, it disappears from the Containers list and I get a toast notification in the top right:

Portainer

Portainer OAuth setup

Portainer is set up with its own authentication by default, but it’s easy to add Authentik as an OAuth provider. This can be configured by going to Administration -> Settings -> Authentication and selecting OAuth:

Portainer

I also leave the Use SSO toggle enabled, as this allows Portainer to log in without prompting. Additionally, I turn on the Automatic user provisioning toggle, which means I don’t need to manually create a user in Portainer for each of my Authentik users; instead, users are added to Portainer when they first log in with an Authentik account. This can be disabled later on if required.

Meanwhile, open up the OAuth provider setup in the Authentik admin UI so that the configuration can be seen:

Authentik

NOTE: the client id and secret can be found by pressing the Edit button.

Then I can fill out the details with the following mappings:

Setting value                         Portainer setting
Authentik Client Id                   Client Id
Authentik Client Secret               Client Secret
Authentik Authorize URL               Authorization Url
Authentik Token URL                   Access token URL
Authentik Userinfo URL                Resource URL
https://portainer.<homelab domain>/   Redirect URL
https://portainer.<homelab domain>/   Logout URL
preferred_username                    User identifier
email openid profile                  Scopes
Auto Detect                           Auth Style

NOTE: the spaces in Scopes are important; other punctuation (like a comma) won’t work.

Portainer

Once I click Save, I get a toast popup in the top right. I can test that this works by logging out, at which point I can see that the login page has changed to include an option for OAuth:

Portainer

Provided the authentication setup is correct, I can login straight away using credentials from Authentik and can see the following screen:

Portainer

As can be seen though, this new user doesn’t have any permissions to view details in Portainer. The next section of this guide will show how to grant them.

Portainer user management

Unfortunately, a lot of the user management features are locked behind the business edition of Portainer, and I pretty much only have the option to choose whether a user is an administrator or not. While not ideal, it is workable, and this is the process for enabling an Authentik user to view details in Portainer.

As Automatic user provisioning has been left enabled, logging in with an Authentik user will automatically provision them in Portainer. So, from the login screen I press Login with OAuth and then log into the service, after which I am greeted with the following screen:

Portainer

However, as can be seen for akadmin, there are no environments available to view. This is because the user currently has no permissions, so I need to grant them from the existing admin user in Portainer by logging back in as that user. Under Administration -> User-related -> Users, the “new” akadmin account can be seen:

Portainer

By opening up the akadmin user, there’s a toggle to enable Administrator:

Portainer

Pressing Save causes a toast confirming the user is saved, and then I can log in as akadmin again and see I now have access to the local environment:

Portainer

Apache Guacamole setup

Guacamole is initially pretty simple to set up, but it does require a small amount of additional configuration to use an extension that provides SSO support.

Apache Guacamole environment variables

In addition to the general environment variables, Guacamole uses a few of its own:

bash
GUACAMOLE_DATA=<homelab folder>/data/management/guacamole # Used for a single volume
# all of these are required to get oauth working
OPENID_AUTHORIZATION_ENDPOINT=<homelab url>/application/o/authorize/ # Authorize URL in Authentik
OPENID_CLIENT_ID=<client id> # The client ID from Authentik
OPENID_ISSUER=<homelab url>/application/o/oauth/ # OpenID Configuration Issuer in Authentik - note the /oauth/ section might be different
OPENID_JWKS_ENDPOINT=<homelab url>/application/o/oauth/jwks/ # JWKS URL in Authentik - note the /oauth/ section might be different
EXTENSIONS=auth-sso-openid # these are comma separated, but I only need the SSO plugin
EXTENSION_PRIORITY=*, openid # allows you to login via the default Guacamole login, as well as oauth
OPENID_USERNAME_CLAIM_TYPE=preferred_username # show the username, instead of the default email for showing who's logged in

Apache Guacamole docker

Finding a container to run Guacamole was actually pretty hard. I ended up using flcontainers, as running Apache Guacamole requires 3 separate applications: Guacamole itself, a database and a webserver. The Docker image from flcontainers packages Guacamole, Postgres and Tomcat into a single container, which massively simplifies deployment. Additionally, this container includes a set of extensions that allow single sign on to work.

yaml
services:
  guacamole:
    # as discussed, using an image from flcontainers
    image: flcontainers/guacamole
    container_name: guacamole
    volumes:
      # requires the config folder to be set to a volume for persistence
      - guacamole:/config
    environment:
      - TZ=${TIMEZONE}
      # A set of OpenID environment variables, set from the environment file
      - OPENID_AUTHORIZATION_ENDPOINT=${OPENID_AUTHORIZATION_ENDPOINT}
      - OPENID_CLIENT_ID=${OPENID_CLIENT_ID}
      - OPENID_ISSUER=${OPENID_ISSUER}
      - OPENID_JWKS_ENDPOINT=${OPENID_JWKS_ENDPOINT}
      - OPENID_REDIRECT_URI=https://guacamole.${CADDY_HOST}/
      - OPENID_USERNAME_CLAIM_TYPE=${OPENID_USERNAME_CLAIM_TYPE}
      # discussed above, but allows for sorting how extensions will be loaded by Guacamole
      - EXTENSION_PRIORITY=${EXTENSION_PRIORITY}
      # a comma separated list of extensions to add, which are set in environment variables
      - EXTENSIONS=${EXTENSIONS}
    restart: always
    healthcheck:
      # this healthcheck isn't perfect, but it's one of a very small number of endpoints I found that don't require authentication
      test: wget --spider http://localhost:8080/api/languages || exit 1
      start_period: 30s
      interval: 30s
      timeout: 5s
      retries: 3
    networks:
      - caddy
    labels:
      # various caddy config labels - see the networking article for more details
      caddy: guacamole.${CADDY_HOST}
      caddy.reverse_proxy: "{{upstreams 8080}}"
      caddy.tls.dns: cloudflare {env.CLOUDFLARE_API_TOKEN}
      # autokuma configuration labels
      kuma.management.group.name: "management"
      kuma.guacamole.docker.parent_name: "management"
      kuma.guacamole.docker.name: "guacamole"
      kuma.guacamole.docker.docker_container: "guacamole"
      kuma.guacamole.docker.docker_host: 1
volumes:
  guacamole:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ${GUACAMOLE_DATA}

As with the previous containers, I can start this by running docker compose up -d and then view the running container details with sudo docker ps -f name=guacamole:

text
CONTAINER ID   IMAGE                    COMMAND         CREATED         STATUS                   PORTS      NAMES
28d2831197b4   flcontainers/guacamole   "/startup.sh"   9 minutes ago   Up 9 minutes (healthy)   8080/tcp   guacamole

The initial boot of this container takes a while; fortunately, Dozzle provides quick feedback on when it’s ready.

Once this container is running, I can then check this is working by going to https://guacamole.<homelab domain>/. However, there should be an error shown:

Guacamole

This is because environment variables aren’t enabled in the default install, so I need to enable them via a config file that has been generated in the volume I’m using: specifically, the guacamole.properties file. In the flcontainers image, this lives in the guacamole folder of the volume, found at <homelab folder>/data/management/guacamole/guacamole/guacamole.properties. This file will have some initial config in it around the Postgres database, but needs to have enable-environment-properties: true added, so that the full file will look something like this:

yaml
enable-clipboard-integration: true
postgresql-hostname: localhost
postgresql-port: 5432
postgresql-database: guacamole_db
postgresql-username: guacamole
postgresql-password: <generated password>
# the added variable
enable-environment-properties: true

After doing this, use the Save as Root extension in Visual Studio Code to save the file and restart the Apache Guacamole container with the command sudo docker restart guacamole.
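
Alternatively, if a terminal is closer to hand than an editor, appending the property as root achieves the same thing. A rough sketch, using the volume path described above (adjust it to wherever the volume actually lives):

shell
# append the property as root, then restart Guacamole so it gets picked up
echo "enable-environment-properties: true" | \
  sudo tee -a <homelab folder>/data/management/guacamole/guacamole/guacamole.properties
sudo docker restart guacamole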

Once the container has restarted, the login screen will show up correctly:

Guacamole

I can test that the Authentik integration is working by pressing the Sign in with: OpenID link in the bottom left:

Guacamole

However, by default this user is not an admin, and so can’t be used to configure clients. Instead I need to login with the default admin user, which is guacadmin. The default password for this user is also guacadmin. Once logged in, I go to settings in the dropdown, select Users and then press New User:

Guacamole

In this screen, I set the username to the name of my Authentik user (akadmin in my case), select all the permissions I want to give this new admin user, and then press Save:

Guacamole

This is all I need to do with this user, so I can then log out and log back in with the Authentik user. I can verify that this user is working as an admin by going to the settings page, where I can now see the option to view and create users:

Guacamole

NOTE: at this point, it’s a good idea to either delete or change the password for the guacadmin user as this is no longer required.

Creating an SSH connection

Apache Guacamole supports a large number of protocols, but the one I’m most interested in is SSH, as it’s the one I need to access my various virtual machines remotely. In order to demonstrate this, I’m going to add a connection to the Proxmox instance that my Docker VM is running on. To do this I need to go to Settings -> Connections -> New Connection:

Guacamole

There’s a lot of settings here, but what I’ve set is as follows:

  • named the connection Proxmox
  • changed the Protocol to SSH
  • set the hostname to proxmox.<homelab domain>
  • set the port to 22

Guacamole

I can also set a private key or username and password for authentication, but I’m not going to do that, as I don’t want anybody with access to Guacamole to be logged straight into my virtual machine.

After I’ve saved the new connection, I can now view and login to the connection from the guacamole homepage:

Guacamole

Clicking on the link allows me to access an SSH console to login to my Proxmox instance.

Guacamole

NOTE: I can exit the full screen SSH session by entering the command logout.

Apache Guacamole closing thoughts

This is a really powerful tool that I’m not currently making full use of. For instance, I currently have sessions created for Proxmox, the Fileserver, the Docker VM and my OPNsense router but I could extend this out to my Windows desktop and various other computers in my house. However, I really like the ability to log into a browser window from essentially anywhere in the world (using Tailscale) and be able to quickly launch an SSH session to do what I want.

Watchtower setup

Watchtower is entirely driven by the Docker Compose file and has no GUI. As such it’s both simple to set up and has a lot of options that need to be searched for. As of writing this article, Watchtower hasn’t actually been updated in a while, but it’s pretty feature complete and I don’t have any issues running it in my environment, though I might look for something a little newer in the future.

NOTE: Watchtower will only work with containers pulled from a Docker registry; containers that are built using a Dockerfile (like Caddy) will still need to be updated manually.
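
For reference, a locally built service can still be refreshed by hand. A rough sketch, assuming the service is called caddy and is defined with a build: section in its own Compose file:

shell
# rebuild the image against the latest base image, then recreate the container
sudo docker compose build --pull caddy
sudo docker compose up -d caddy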

Watchtower environment variables

Watchtower uses a single environment variable that requires some explanation. This is a Discord webhook, and Watchtower uses Shoutrrr for its notification system. In order to get this URL, within Discord I go to Channel Settings -> Integrations -> Webhooks and create a new webhook called watchtower:

Watchtower

From here I can grab the Webhook URL using Copy Webhook URL, which contains the 2 variables needed by Watchtower to connect to Discord properly and is something like this - https://discord.com/api/webhooks/<webhook id>/<token>, which can then be converted into the environment variable below:

bash
# Watchtower
DISCORD_WEBHOOK=discord://<token>@<webhook id>

Watchtower Docker

As mentioned earlier, this is an older container, but it does work perfectly well at what it does.

yaml
services:
  watchtower:
    image: containrrr/watchtower:latest
    container_name: watchtower
    restart: always
    # command to call the health check in Watchtower
    healthcheck:
      test: ["CMD", "/watchtower", "--health-check"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 5s
    # requires access to the docker socket proxy to be able to pull down containers
    networks:
      - management
    depends_on:
      - docker_socket_proxy
    environment:
      - TZ=${TIMEZONE}
      # enabling this environment variable will mean watchtower won't update containers, but will output what would be updated
      # - WATCHTOWER_MONITOR_ONLY=true
      # CRON schedule that will run the update every Saturday at 2AM
      - WATCHTOWER_SCHEDULE=0 0 2 * * 6
      # allows old container images to be cleaned up - keeps disk space down
      - WATCHTOWER_CLEANUP=true
      # debug output for logging
      - WATCHTOWER_DEBUG=true
      # sets the Discord URL mentioned above
      - WATCHTOWER_NOTIFICATION_URL=${DISCORD_WEBHOOK}
      # tells watchtower where the docker socket proxy lives
      - DOCKER_HOST=tcp://docker_socket_proxy:2375
    labels:
      # various autokuma configuration labels for future monitoring
      kuma.management.group.name: "management"
      kuma.watchtower.docker.parent_name: "management"
      kuma.watchtower.docker.name: "watchtower"
      kuma.watchtower.docker.docker_container: "watchtower"
      kuma.watchtower.docker.docker_host: 1

After running the container using docker compose up -d I could view the running container details with sudo docker ps -f name=watchtower, but I can now also use Portainer or Dozzle to check that the container is running and healthy:

Watchtower

Additionally, I can see a notification in Discord to show that Watchtower has started listening for container updates:

Watchtower

Disabling Watchtower

While I don’t personally have any containers I’ve disabled Watchtower on, it’s possible to do this by adding a label to the labels block of the Compose definition for the container that should be excluded, for example:

yaml
services:
  whoami:
    image: "traefik/whoami"
    container_name: "whoami"
    labels:
      com.centurylinklabs.watchtower.enable: "false"

Watchtower closing thoughts

I like how simple it is to get Watchtower running and that it’s silent in the background. However, as mentioned, it’s nearly unmaintained now, so I will probably look for an alternative at some point.

Netboot XYZ setup

Netboot just requires a single container to set up, but it does use an interesting way of pulling images from the networked storage created in a previous article, and it also needs OPNsense to be configured to use Netboot as the network boot server.

However, Netboot does use storage pulled from the NAS discussed earlier in this series. In order to allow Docker to work with the NAS, I’m using CIFS, as it’s cross-platform and works out of the box with Docker. It is an older protocol now, but it does still work.

Fileserver setup

Really all the fileserver needs is a folder created to hold the images and other assets that could be useful for Netboot. This structure looks like this:

text
webshare/
┗ docker/
  ┗ data/
    ┗ assets/

I should note that I don’t actually use networked assets, but it’s useful to have in case I want to start working with them later.

Netboot environment

bash
# The local location of files
NETBOOT_DATA=<homelab folder>/data/management/netboot
# The below 2 settings are ONLY required if using network storage, as described by the NAS documentation
# CIFS connection string, if the guide is followed, the username would be nas, while the UID and GID are 1000
NAS_CIFS="username=<fileserver username>,password=<fileserver password>,uid=<user UID>,gid=<user GID>"
# location of the network share to use - this structure should match the folder structure set above
NAS_DATA_SHARE=//<fileserver ip address>/webshare/docker/data

Netboot docker

yaml
services:
  netboot:
    image: ghcr.io/netbootxyz/netbootxyz
    container_name: netboot
    # simple healthcheck, using the internal webserver host
    healthcheck:
      test: wget --spider http://localhost:3000 || exit 1
      interval: "30s"
      timeout: "5s"
      retries: 3
      start_period: "20s"
    environment:
      # explicitly set the webapp and Nginx ports
      - NGINX_PORT=80
      - WEB_APP_PORT=3000
    volumes:
      - netboot_data:/config # optional
      - netboot_assets:/assets # optional
    ports:
      # this is required to allow netboot to operate as a PXE server and is the default port for a TFTP server
      - 69:69/udp
    restart: always
    networks:
      - caddy
    labels:
      # various caddy config labels - see the networking article for more details
      caddy: netboot.${CADDY_HOST}
      caddy.reverse_proxy: "{{upstreams 3000}}"
      caddy.tls.dns: cloudflare {env.CLOUDFLARE_API_TOKEN}
      caddy.tls.resolvers: 1.1.1.1
      # the netboot container doesn't have any authentication by default, so I'm locking it behind the proxy auth discussed in the authentication article
      # details here: https://blog.jacklewis.dev/977-docker-auth/#forward-auth-setup
      caddy.import: authentik_server
      # autokuma configuration labels
      kuma.management.group.name: "management"
      kuma.netboot.docker.parent_name: "management"
      kuma.netboot.docker.name: "netboot"
      kuma.netboot.docker.docker_container: "netboot"
      kuma.netboot.docker.docker_host: 1
volumes:
  # connects to the local filesystem
  netboot_data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ${NETBOOT_DATA}
  # CIFS connection to the fileserver - this should be formatted the same as netboot_data to use a local drive
  netboot_assets:
    driver: local
    driver_opts:
      type: cifs
      o: ${NAS_CIFS}
      device: ${NAS_DATA_SHARE}/assets
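
If the NAS isn’t available, the assets volume can be bound to a local folder instead, formatted exactly like netboot_data. A sketch, where NETBOOT_ASSETS is a hypothetical extra path variable added to the .env file:

yaml
volumes:
  # local-only alternative to the CIFS share for netboot assets
  netboot_assets:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ${NETBOOT_ASSETS} # hypothetical variable, e.g. <homelab folder>/data/management/netboot_assets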

Getting the container to start can once again be done with sudo docker compose up -d and waiting for the container to start up. Once this is done, I can either use sudo docker ps -f name=netboot or go to Portainer to check that the container is working. However, it’s also easy to view this service through the web portal by going to https://netboot.<homelab domain> and viewing the front page of the admin console:

Netboot

Using netboot as a PXE server

A PXE server requires some configuration on the router that is used to connect to the internet and each make/model of router will have a slightly different format for doing this. In my case, I’m using OPNsense and so this documentation will focus on how to setup this functionality.

Before moving onto configuration, it needs to be understood that there are 2 main forms of firmware used to boot a computer: BIOS, which is older but still common in the enterprise space, and UEFI, which is newer and used in the consumer space. Additionally, there are differences between ARM and x86/x64 architectures. I want to be able to launch all of these from a single configuration, and in order to do that I need to apply the settings below.

Within OPNsense head to Services -> ISC DHCP V4 -> the VLAN/network that needs to be configured (VL50_IOT in my case) -> expand out the Network booting option and then fill the following settings out:

Setting name                         Value              Meaning
Set next-server IP                   0.0.0.0            The IP address that points to the Docker server hosting Netboot
Set default bios filename            netboot.xyz.kpxe   The name of the configuration file for a BIOS machine
Set x86 UEFI (32-bit) filename       netboot.xyz.efi    The name of the configuration file for an x86 UEFI machine
Set x64 UEFI/EBC (64-bit) filename   netboot.xyz.efi    The name of the configuration file for an x64 UEFI machine
Set ARM UEFI (32-bit) filename                          Netboot does not support Arm32
Set ARM UEFI (64-bit) filename                          The name of the configuration file for an Arm64 UEFI machine
Set iPXE boot filename                                  Setting this will override all the other configurations with the value here
Set root-path string                                    Not required by Netboot

Then tick Enable network booting and then save the config.

This will look something like this:

Netboot

Testing the setup

This is all the configuration required to get the basic PXE server up and running, so I just need to test this using a virtual machine in Proxmox. In this case, I’m going to show how to create a UEFI machine, as it requires a couple of changes in the System setup not needed by a BIOS machine, though I’ll note the differences.

To start with, as in the Docker virtual machine setup guide, select the Create a New VM button at the top right:

Netboot

Then set a name for the virtual machine (in this case I chose netbootUEFI) and select Next. As this is a test machine, I’m not setting it to start at boot, but in a normal setup I would enable this option:

Netboot

In order to use Netboot, I need to select Do not use any media and select Next. This option allows the VM to start without an OS in the drive, meaning it will fall back to the PXE server instead:

Netboot

Configuring a UEFI virtual machine involves switching the default BIOS setting to OVMF (UEFI) and importantly, deselecting the Pre-enroll keys option. Pre-enrolled keys are a component of Microsoft’s Secure Boot architecture and are essential for running their operating systems. However, these keys are not necessary for Linux virtual machines. In fact, leaving them enabled will interfere with the boot process, specifically preventing the VM from booting to the PXE server. Lastly, for the EFI Storage location, VM-Drives has been selected due to its larger capacity.

Netboot

NOTE: for BIOS, leave this as the default SeaBIOS

At this point, I leave everything at their default settings, but the settings for disks, CPU and memory should be modified to fit the needs of the machine.

For reference, here is the final summary of the machine, with Start after created enabled to automatically start the VM.

Netboot

As the machine starts up, a clear sign the PXE server is working is that the start-up process contains details of the PXE server:

Netboot

Shortly after this, the start page should appear showing (among other things) Distributions for Linux installs and Utilities for various helpers. This screen is navigable with a keyboard, using the Esc key to go back:

Netboot

NOTE: these options can differ between BIOS and UEFI; for example, the BIOS version of Netboot contains a copy of Memtest86 by default, which would have been very helpful in diagnosing the issues with RAM I came across during my first install.

At this point, Netboot is fully installed and configured. There are more things that can be done, such as creating ipxe files and local assets, but these are out of scope of this guide.

Semaphore setup

I did have some issues getting Semaphore working, due to it requiring some setup in a config.json file (generated during creation) to set the OAuth credentials properly. Additionally, the “default” setup of Semaphore uses an in-built bolt database, but given I have more experience with Postgres, I’ve decided to use that instead despite the added complexity.

Semaphore folder setup

Semaphore requires some folders to be setup in data, meaning that the full set of folders looks like this:

text
data/
┗ management/
  ┗ semaphore/
    ┣ base/
    ┗ database/
management/
┣ .env
┗ docker-compose.yaml
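
As with the earlier layout, these extra folders can be created from a shell in one line; a sketch, assuming a bash-style shell run from the homelab root:

shell
mkdir -p data/management/semaphore/{base,database}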

Semaphore environment

Semaphore has quite a few environment variables in my setup:

bash
SEMAPHORE_BASE_DATA=<homelab folder>/data/management/semaphore/base # used to store data from the semaphore container itself
SEMAPHORE_DATABASE_DATA=<homelab folder>/data/management/semaphore/database # where files required by the database will go
SEMAPHORE_ADMIN_PASSWORD=<randomly generated password> # used as the default password to log into semaphore
# make sure all of the SEMAPHORE_ADMIN environment variables are different from the OIDC users
SEMAPHORE_ADMIN_NAME=<admin username> # used as the name for initially logging into semaphore
SEMAPHORE_ADMIN_EMAIL=<admin email>
SEMAPHORE_ADMIN=admin # used as the username for initially logging into semaphore
# NOTE: this is not working with the latest version - using a notification bridge might help here
SEMAPHORE_SLACK_URL=<discord webhook URL>/slack # This is a slack URL, so Discord must be run as a slack notification
SEMAPHORE_PG_PASS=<randomly generated db password> # echo "SEMAPHORE_PG_PASS=$(openssl rand -base64 36 | tr -d '\n')" >> .env - from the management folder
SEMAPHORE_PG_DB=semaphore # name of the semaphore database
SEMAPHORE_PG_DB_USER=semaphore # semaphore db username
SEMAPHORE_PG_DB_PORT=5432 # default postgres port

NOTE: Details of how to get the Discord webhook URL can be found in the Authentication documentation, or in the Watchtower documentation earlier on in this guide.

NOTE: Discord using slack requires that the webhook URL be appended with /slack

Semaphore docker

yaml
services:
  semaphore_postgres:
    # postgres 14 is recommended by semaphore at the time of writing this article
    image: postgres:14
    container_name: semaphore_postgres
    # standard postgres healthcheck
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -d $${POSTGRES_DB} -U $${POSTGRES_USER}"]
      start_period: 20s
      interval: 30s
      retries: 5
      timeout: 5s
    environment:
      - POSTGRES_USER=${SEMAPHORE_PG_DB_USER} # shared with semaphore
      - POSTGRES_PASSWORD=${SEMAPHORE_PG_PASS} # shared with semaphore
      - POSTGRES_DB=${SEMAPHORE_PG_DB} # shared with semaphore
    volumes:
      # persist the postgres data folder
      - semaphore_database:/var/lib/postgresql/data
    restart: always
    networks:
      # allows semaphore and the database to talk to each other
      - semaphore
    labels:
      # autokuma configuration labels
      kuma.management.group.name: "management"
      kuma.semaphore.docker.parent_name: "management"
      kuma.semaphore.docker.name: "semaphore_postgres"
      kuma.semaphore.docker.docker_container: "semaphore_postgres"
      kuma.semaphore.docker.docker_host: 1
  semaphore:
    image: semaphoreui/semaphore:latest
    container_name: semaphore
    # healthcheck using the ping endpoint
    healthcheck:
      test: wget --spider http://localhost:3000/api/ping || exit 1
      interval: "30s"
      timeout: "5s"
      retries: 3
      start_period: "20s"
    environment:
      - SEMAPHORE_DB_USER=${SEMAPHORE_PG_DB_USER} # shared with the database
      - SEMAPHORE_DB_PASS=${SEMAPHORE_PG_PASS} # shared with the database
      - SEMAPHORE_DB_HOST=semaphore_postgres # name of the database container
      - SEMAPHORE_DB_PORT=${SEMAPHORE_PG_DB_PORT} # shared with the database
      - SEMAPHORE_DB_DIALECT=postgres # use postgres, instead of bolt or mysql
      - SEMAPHORE_DB=${SEMAPHORE_PG_DB} # shared with the database
      - SEMAPHORE_DB_OPTIONS={"sslmode":"disable"} # SSL is handled by caddy
      - SEMAPHORE_ADMIN_PASSWORD=${SEMAPHORE_ADMIN_PASSWORD} # initial admin login password
      - SEMAPHORE_ADMIN_NAME=${SEMAPHORE_ADMIN_NAME} # initial admin login name
      - SEMAPHORE_ADMIN_EMAIL=${SEMAPHORE_ADMIN_EMAIL} # initial admin login email
      - SEMAPHORE_ADMIN=${SEMAPHORE_ADMIN} # initial admin login username
      - TZ=${TIMEZONE}
      - SEMAPHORE_SLACK_ALERT=True # enable slack alerts
      - SEMAPHORE_SLACK_URL=${SEMAPHORE_SLACK_URL} # set the slack URL
      - SEMAPHORE_WEB_ROOT=/ # https://github.com/semaphoreui/semaphore/issues/2681
    volumes:
      - semaphore:/etc/semaphore
    restart: always
    networks:
      # access to the external caddy network for SSL
      - caddy
      # allows semaphore and the database to talk to each other
      - semaphore
    depends_on:
      # don't start until after the database has started
      - semaphore_postgres
    labels:
      # various caddy config labels - see the networking article for more details
      caddy: semaphore.${CADDY_HOST}
      caddy.reverse_proxy: "{{upstreams 3000}}"
      caddy.tls.dns: cloudflare {env.CLOUDFLARE_API_TOKEN}
      # autokuma configuration labels
      kuma.management.group.name: "management"
      kuma.semaphore.docker.parent_name: "management"
      kuma.semaphore.docker.name: "semaphore"
      kuma.semaphore.docker.docker_container: "semaphore"
      kuma.semaphore.docker.docker_host: 1
networks:
  # create a semaphore network with a specified subnet to avoid issues with conflicting IP addresses
  semaphore:
    ipam:
      driver: default
      config:
        - subnet: 172.0.14.0/24
volumes:
  # database volume setup
  semaphore_database:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ${SEMAPHORE_DATABASE_DATA}
  # semaphore volume
  semaphore:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ${SEMAPHORE_BASE_DATA}

Now that the Compose is set, running sudo docker compose up -d will pull and start the containers. After waiting a few minutes for Caddy to pull in the correct certificates, the web app should be available at https://semaphore.<homelab domain>:

Semaphore

The initial login can be completed using the SEMAPHORE_ADMIN environment variable for the username and SEMAPHORE_ADMIN_PASSWORD for the password, which should cause the following screen to display:

Semaphore

Enabling SSO in Semaphore

Semaphore has the ability to use SSO, but it’s not enabled by default. Instead it needs to be enabled from a config.json file generated in the <homelab folder>/data/management/semaphore/base folder that should have been created during the install process. This file then needs the following config block added to enable SSO:

json
"oidc_providers": {
"authentik": {
"color": "orange",
"display_name": "Sign in with authentik",
"client_id": "<Authentik client id>",
"redirect_url": "<Semaphore base address>/api/auth/oidc/authentik/redirect",
"client_secret": "<Authentik client secret>",
"scopes": ["email", "openid", "profile"],
"username_claim": "preferred_username",
"name_claim": "preferred_username",
"provider_url": "<Authentik base address>/application/o/<Slug>/"
}

Note: Slug in the provider_url is oauth if following the documentation in the Authentik guide

Once this is added and saved, restart the container with sudo docker restart semaphore, and there should now be an option to log in to the web UI with Authentik:

Semaphore

At this point it’s possible to login with the Authentik user, but given this is the main user, it needs to be converted into an admin account using the initial admin user.

Semaphore

To do this, login as the admin user, click on the username in the bottom left and select Users:

Semaphore

Within this screen, click the edit button, then modify the newly created user by enabling Admin user and Send alerts and then clicking SAVE:

Semaphore

Semaphore simple test project

At this point, Semaphore is now installed and running, but testing the setup is still needed. This is pretty straightforward in Semaphore as the install has an option to create a demo project. To do this, just select the CREATE DEMO PROJECT option when creating the first project:

Semaphore

What this does is create a simple project with a connection to the Semaphore demo Git repository, which contains a set of task templates that can be used to show Semaphore working. In order to test this, just run the Print system info task in Task Templates by clicking the play button:

Semaphore

This will then open a window showing progress and eventually the system info of the underlying hardware, with something like the following output:

text
--------------------------------------
8:48:03 PM CPU cores: 4
8:48:03 PM
8:48:03 PM Memory Information (Linux):
8:48:03 PM total used free shared buff/cache available
8:48:03 PM Mem: 4.4G 3.1G 194.6M 33.2M 1.2G 1.0G
8:48:03 PM Swap: 977.0M 630.9M 346.1M
8:48:03 PM
8:48:03 PM Disk Usage (root):
8:48:03 PM Filesystem Size Used Available Use% Mounted on
8:48:03 PM overlay 91.7G 13.9G 73.7G 16% /

Once this is done, the run will show up in the Dashboard under history, as well as on the Task Templates screen, which shows that it ran successfully and who ran the task last:

Semaphore

Setting up an actual test environment

While a full explanation of how Ansible works is outside the scope of this guide, I think it’s useful to show exactly how to get Semaphore working with a “real” setup.

Prerequisites

  • Ubuntu server (or similar) with OpenSSH (and the Proxmox QEMU agent if using Proxmox) installed
    • Netboot can be used to set this up quickly
  • private Git repository
    • Holds Ansible playbooks
    • I use Github

Additionally, a new project should be created in Semaphore:

Semaphore

Ansible git repository

Repository Contents

Semaphore uses a git repository to hold Ansible playbooks, so to demonstrate the end-to-end flow, I created a simple playbook designed to ping the hosts using Ansible’s ping module. This module basically connects to a machine and then runs a Python script that echoes pong back to the console.

This repository has a single file at the root called ping.yml with the below contents:

yaml
- name: test
  # hosts are controlled by the inventory, so all is fine
  hosts: all
  tasks:
    - name: Ping
      ansible.builtin.ping:

Github access via SSH

As this is a private repository, credentials are required to access Github, and the easiest type to use is SSH. There are more detailed instructions on how to do this here, but the steps are as follows:

From a shell, run the following command to generate a certificate:

shell
ssh-keygen -t ed25519 -C <Github email address>

Then save the file without entering a password and take a note of the saved file location (by default it should be C:\Users\<username>\.ssh\id_ed25519 on Windows). This should produce two files: the key itself (which is the private key) and key.pub (which is the public key).

Then in Github, click on your picture in the top right, go to Settings -> SSH and GPG keys -> New SSH key, paste the public key into the Key box and press Add SSH Key:

Github

At this point, the Github account should be ready to accept an SSH connection.

Creating an inventory

An inventory is where the IP addresses or hostnames of the machines that Ansible will deploy to are stored. In order to create one, go to Inventory -> NEW INVENTORY -> Ansible Inventory:

Semaphore

The following values are set:

  • Name can be anything, as long as it’s remembered
  • User Credentials of None
  • Type of Static, as the simplest inventory to set up (see the example inventory after this list)

NOTE: the credentials of None should be the default credentials created when the new project is made. These currently won’t work as they aren’t real credentials, but they can be used to help show that the connection to Github is working. If this isn’t there, they can be created in the Key Store with a Type of None.
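
For reference, the body of a Static inventory is just a standard Ansible INI-style inventory. A minimal sketch, reusing the test machine’s address from later in this guide (the group name homelab is arbitrary):

text
[homelab]
192.168.50.145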

Accessing Github in Semaphore

Within Semaphore, the SSH key needs to be added first, and this can be done by going to Key Store -> NEW KEY, setting a memorable name, selecting the Type to be SSH Key and then pasting in the generated private key:

Semaphore

Once this is done, grab the SSH Clone URL from Github, then in Semaphore go to Repositories -> NEW REPOSITORY and set it like the below:

Semaphore

NOTE: the Branch name of main is the name of the default branch created by Github.

Running the Ping playbook

Now that the inventory and Github connections are completed, a Task Template for the Ping playbook can be created to verify that the connection to Github is working. This can be done from Task Templates -> NEW TEMPLATE -> Ansible Playbook:

Semaphore

The values set are as follows:

  • Name set to something memorable
  • Path to playbook file is the location of ping.yml in the Git repository
  • Inventory is the inventory created above
  • Repository is the repository created above
  • Variable Group is the default empty variable group, as this simple Playbook doesn’t use any variables
  • CLI args has -vvv added, as this increases the logging verbosity; the default output can make it difficult to see what’s going on

Once CREATE has been clicked, the Playbook can be run to verify that the Task Template is working and able to download the Playbook from Github:

text
8:52:01 PM
Run TaskRunner with template: Ping
8:52:01 PM
Preparing: 3
8:52:01 PM Cloning Repository git@github.com:jlewis92/ansible.git
8:52:01 PM Cloning into 'repository_3_template_8'...
8:52:02 PM Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
8:52:03 PM Get current commit hash
8:52:03 PM Get current commit message
8:52:03 PM installing static inventory

However, the Playbook should still fail, because the credentials being sent to the machine are None, which is incorrect. The next step will show how to fix this.

Configuring the inventory connection

While it’s absolutely possible to use Login with password from the Key Store, it’s a better idea to use SSH where possible, so this guide will show the process for setting up the machine to connect via an SSH key instead of password authentication.

To start with, log onto the machine, move to either the .ssh or an empty directory, and run sudo ssh-keygen -t ed25519 -f <filename>. This should generate a public and private key named after the filename; for example, using -f test would produce the following:

text
total 8
-rw------- 1 root root 399 May 14 19:52 test
-rw-r--r-- 1 root root 91 May 14 19:52 test.pub

with test being the private key and test.pub being the public key.

In order to allow access via SSH, the server needs to have the key added to the authorized_keys file in the .ssh directory. Fortunately, there’s a shortcut to add this using the command sudo ssh-copy-id -i <public key location> <username>@<ip address>, which would produce the following results in ~/.ssh/authorized_keys when using jack@192.168.50.145 as the username and IP address:

text
ssh-ed25519 <public key> jack@192.168.50.145

The machine is now set up to accept this key, so Semaphore needs to be configured to use the matching private key to authenticate. To do this, go to Key Store -> NEW KEY in Semaphore and set the following values:

Key           Value             Explanation
Key Name      inventory         Something memorable
Type          SSH Key           This would be Login with password, if just using a login directly
Username      whatever is set   This is whatever was entered when performing ssh-copy-id, which was jack in my case
Passphrase    whatever is set   This is whatever was entered when performing ssh-keygen
Private Key   <private key>     Copied from the test file in the example given above

which looks like this:

Semaphore

Then go to Inventory -> <inventory name> -> Edit and set the User Credentials to be the newly created credential in the Key Store:

Semaphore

Once saved, go to Task Templates and run the Ping again and the inventory should now be working:

Semaphore

As can be seen above, I get a ping: pong back, showing the Python script executed. At this point, everything is set up and Ansible is now ready to be run.

Closing thoughts

This article was a lot longer than I thought it would be, as I’d assumed most of these applications were streamlined to set up. Actually writing down how I did everything made me realize how much work there is in getting these services running! However, now that they are, I’ve found they don’t change very often.

Over the past year or so of use, I’ve found Dozzle to be the most helpful to me personally, as I’ve constantly checked this service whenever there’s an issue. However, I find myself only reaching for Portainer when I can’t be bothered to SSH into my server to restart/stop something. Related to this, I’ve also very rarely found myself in a situation where I didn’t have an SSH client available and a list of server addresses, so just using a terminal has been quicker than using Apache Guacamole. In terms of Netboot and Semaphore, I’ve not had to create another machine recently, so these haven’t had much use (outside of writing this guide), but I’m still happy I have them as they definitely address some pain points I’ve historically had when I get the urge to build a new machine. Finally, Watchtower has been brilliant: I’ve not touched it at all and it’s just run quietly in the background. However, I do feel the need to point out that some of the containers Watchtower has pulled have been “broken” due to changes in configuration.

In the future, I think the only thing I might change is to look at Komodo or Dockge as a replacement for Portainer, as these services feel like they would fit my usage a bit better.
