Ansible Control Node
My laptop, running Ubuntu desktop, will be used as the Ansible Control Node.
As an alternative, a VirtualBox VM running on a Windows PC can be used as the Ansible Control Node, pimaster, for automating the provisioning of the Raspberry Pi cluster. Ubuntu 22.04 LTS Server can be used as the OS for pimaster.
Ansible Project structure and configuration
The Ansible source code is structured following a typical directory layout:
```
📁 ansible
├── 📁 host_vars
├── 📁 group_vars
├── 📁 vars
├── 📁 tasks
├── 📁 templates
├── 📁 roles
├── ansible.cfg
├── inventory.yml
├── requirements.yml
├── playbook1.yml
├── playbook2.yml
├── pyproject.toml
├── uv.lock
└── .python-version
```
Where:
- `host_vars` and `group_vars` contain Ansible variables belonging to hosts and groups
- `vars` contains Ansible's variables files used by playbooks
- `tasks` contains Ansible's tasks files used by playbooks
- `templates` contains Jinja2's templates used by playbooks
- `roles` contains Ansible's roles
- `inventory.yml` defines hosts, groups, and group relationships used as playbook targets
- `requirements.yml` defines Galaxy roles and collections dependencies
- `pyproject.toml`, `uv.lock`, and `.python-version` define the local Ansible execution environment: the dependencies and configuration used to build a Python virtual environment with the UV package manager for running Ansible commands locally
Ansible Local Running Environment
Create a Python virtual environment managed by UV, a fast Python package and project manager, to run Ansible commands locally. This environment ensures that all dependencies are isolated and consistent.
Python dependencies are managed in the UV virtual environment through the following files:
- `pyproject.toml` (dependency source of truth)
- `uv.lock` (pinned dependency graph)
- `.python-version` (Python 3.12)
Step-by-step local environment setup:
1. Install UV (if it is not already installed):

   ```shell
   curl -LsSf https://astral.sh/uv/install.sh | sh
   ```

2. Create initial UV project files within the `ansible` project directory (first-time setup only):

   ```shell
   cd ansible
   [ -f pyproject.toml ] || uv init --bare
   [ -f uv.lock ] || uv lock
   ```

   This creates `pyproject.toml` and `uv.lock` when starting from an empty local environment.

3. Add Ansible and related dependencies to `pyproject.toml` using `uv add`. At least `ansible-core` and `ansible-lint` are required, but additional Python dependencies can be added as needed for specific playbook requirements (e.g., `kubernetes`, `hvac`, `certbot`, etc.):

   ```shell
   uv add ansible-core
   uv add yamllint
   uv add ansible-lint
   ```

   Specific versions of dependencies can be added with `uv add <package>@<version>`.

4. Sync the local Ansible virtual environment dependencies:

   ```shell
   cd ansible
   uv sync --frozen
   ```

5. Verify local mode command execution:

   ```shell
   cd ansible
   uv run ansible-playbook --version
   ```

6. Run local lint and syntax checks:

   ```shell
   cd ansible
   uv run yamllint .
   uv run ansible-playbook --syntax-check external_services.yml
   ```
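After these steps, the generated `pyproject.toml` might look roughly like the following sketch (project name and version pins are illustrative, not the exact repository content):

```toml
# Sketch of ansible/pyproject.toml after `uv init --bare` + `uv add`
# (illustrative versions)
[project]
name = "ansible"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "ansible-core>=2.17",
    "ansible-lint>=24.0",
    "yamllint>=1.35",
]
```

`uv lock` resolves these constraints into exact pinned versions in `uv.lock`, which is what `uv sync --frozen` installs.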
UV project dependency update flow
When Ansible or other Python dependencies need to be updated, follow this flow:
```shell
uv add <package>
uv lock
uv sync --frozen
```
Ansible Galaxy dependencies
Ansible playbooks use Ansible Galaxy roles and collections, which are defined as dependencies in the requirements.yml file.
Sample requirements.yml file:
```yaml
---
roles:
  - name: ricsanfre.minio
    version: v1.1.15
  - name: ricsanfre.backup
    version: v1.1.3
  - name: ricsanfre.vault
    version: v1.0.5
collections:
  - name: community.general
    version: 12.3.0
  - name: kubernetes.core
    version: 6.3.0
  - name: community.hashi_vault
    version: 7.1.0
  - name: ansible.posix
    version: 2.1.0
  - name: community.crypto
    version: 3.1.1
  - name: prometheus.prometheus
    version: 0.27.6
```
The following Ansible community collections are added to requirements.yml:
- `community.general`: broad set of general-purpose Ansible modules/plugins.
- `kubernetes.core`: Kubernetes modules for managing cluster resources.
- `community.hashi_vault`: modules/lookups for Vault integration.
- `ansible.posix`: POSIX/Linux modules (system, users, mounts, ACLs, etc.).
- `community.crypto`: crypto and certificate/key management modules.
- `prometheus.prometheus`: modules/roles for Prometheus ecosystem automation.
Ansible Galaxy roles and collections will be installed in a specified location (e.g., ~/.ansible/roles and ~/.ansible/collections) and configured in ansible.cfg so they are available for playbook execution.
```
📁 $HOME
└── 📁 .ansible
    ├── 📁 roles
    └── 📁 collections
```
```shell
mkdir -p ~/.ansible/roles ~/.ansible/collections
```
Install Galaxy dependencies from ansible/requirements.yml:
```shell
uv run ansible-galaxy role install -r requirements.yml --roles-path ~/.ansible/roles
uv run ansible-galaxy collection install -r requirements.yml --collections-path ~/.ansible/collections
```
Ansible configuration
Configuration file
Ansible configuration is in the ansible.cfg file, containing the paths to roles, collections, and the inventory file:
ansible.cfg
```ini
[defaults]
# Inventory file location
inventory = ./inventory.yml
# Ansible execution threads
forks = 10
# Paths to search for roles in, colon separated
roles_path = ~/.ansible/roles:./roles
# Path for collections
collections_path = ~/.ansible/collections:./collections
# Disable SSH key host checking
host_key_checking = false
```
Inventory file
Ansible inventory file, inventory.yml, defines hosts and groups of hosts to target with playbooks. It also defines group relationships (e.g., parent-child) and variables specific to hosts or groups.
inventory.yml
```yaml
all:
  children:
    picluster:
      hosts:
        node1:
          ansible_host: 10.0.0.11
        node2:
          ansible_host: 10.0.0.12
    vault:
      hosts:
        node1:
```
ansible_host variable is used to specify the IP address or hostname to connect to for each host in the inventory, which can be different from the inventory hostname (e.g., node1, node2, etc.) used as an identifier in playbooks.
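As a quick sketch, a playbook can target the picluster group defined in this inventory (the play and file name below are illustrative, not part of the repository):

```yaml
# ping-cluster.yml -- illustrative play targeting the picluster group
- name: Check connectivity to all cluster nodes
  hosts: picluster
  gather_facts: false
  tasks:
    - name: Ping nodes over SSH
      ansible.builtin.ping:
```

Running it with `uv run ansible-playbook ping-cluster.yml` would connect to node1 and node2 using the `ansible_host` addresses defined in the inventory.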
SSH keys for Ansible connection
Authentication using SSH keys should be the only mechanism available to login to any server in the Pi Cluster.
To improve security, the default UNIX user created by cloud images, ubuntu, will be disabled. A new UNIX user, ricsanfre, will be created on all servers with root privileges (sudo permissions). This user will be used to connect to the servers from my home laptop and to automate configuration activities with Ansible (set as the ansible_user variable when connecting).
Public SSH keys can be added to this UNIX user's authorized keys on all servers to enable passwordless SSH connection.
Default user in cluster nodes and its authorized SSH public keys will be added to cloud-init configuration when installing Ubuntu OS.
SSH keys generation
To generate the Ansible SSH keys on the Linux server, execute:

```shell
ssh-keygen
```
The public and private key files for the user can be found in the $HOME/.ssh/ directory: id_rsa contains the private key and id_rsa.pub contains the public key.
The content of the id_rsa.pub file has to be used as ssh_authorized_keys of the UNIX user created in the cloud-init user-data:
```shell
cat id_rsa.pub

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDsVSvxBitgaOiqeX4foCfhIe4yZj+OOaWP+wFuoUOBCZMWQ3cW188nSyXhXKfwYK50oo44O6UVEb2GZiU9bLOoy1fjfiGMOnmp3AUVG+e6Vh5aXOeLCEKKxV3I8LjMXr4ack6vtOqOVFBGFSN0ThaRTZwKpoxQ+pEzh+Q4cMJTXBHXYH0eP7WEuQlPIM/hmhGa4kIw/A92Rm0ZlF2H6L2QzxdLV/2LmnLAkt9C+6tH62hepcMCIQFPvHVUqj93hpmNm9MQI4hM7uK5qyH8wGi3nmPuX311km3hkd5O6XT5KNZq9Nk1HTC2GHqYzwha/cAka5pRUfZmWkJrEuV3sNAl ansible@pimaster
```
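That public key then ends up in the cloud-init user-data used when installing the OS; a minimal sketch of the relevant fragment (the key below is truncated and the exact user-data layout may differ):

```yaml
#cloud-config
# Sketch of the cloud-init user-data fragment creating the Ansible user
users:
  - name: ricsanfre
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    ssh_authorized_keys:
      - ssh-rsa AAAAB3NzaC1yc2E... ansible@pimaster
```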
Configuring Ansible variables to use SSH keys
Ansible SSH connection settings can be defined in the ansible.cfg file (private_key_file option) or through the ansible_ssh_private_key_file variable in playbooks or vars files. The ansible_user variable can also be defined to specify the remote user to connect to the servers in the cluster (e.g., ricsanfre).
An ansible/vars/all.yml file containing Ansible variables for all hosts can be created with the following content:

```yaml
ansible_user: ricsanfre
ansible_ssh_private_key_file: /home/ricsanfre/.ssh/id_rsa
```

This way the same SSH key and user are used for all servers in the cluster, and the variables file can be included from every playbook instead of repeating the same variables in each one.
Additional tools
Other tools used within Ansible playbooks may need to be installed in the local environment.
Examples of such tools used in the Pi Cluster automation are:
| Tool | Purpose |
|---|---|
| OpenTofu | Automate Terraform workflows in Ansible playbooks. |
| kubectl | Automate Kubernetes CLI operations in Ansible playbooks. |
| Helm | Automate Helm chart operations in Ansible playbooks. |
| Helmfile | Automate Helmfile release workflows in Ansible playbooks. |
Ansible Dockerized Running Environment
Running Ansible commands inside the ansible-runner container avoids the need to install Ansible, its dependencies, or any of the additional tools used in the automation workflows in the local environment.
A single Docker image contains a complete Ansible execution environment with all dependencies and tools pre-installed and configured, providing a consistent runtime for Ansible commands across different host environments.
Installing Docker
Follow the official installation guide.
1. Uninstall old versions of Docker:

   ```shell
   for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done
   ```

2. Install packages to allow apt to use a repository over HTTPS:

   ```shell
   sudo apt-get update
   sudo apt-get install \
     apt-transport-https \
     ca-certificates \
     curl \
     gnupg \
     lsb-release
   ```

3. Add Docker's official GPG key:

   ```shell
   sudo install -m 0755 -d /etc/apt/keyrings
   curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.asc
   sudo chmod a+r /etc/apt/keyrings/docker.asc
   ```

4. Add the x86_64 repository:

   ```shell
   echo \
     "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
     $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
   ```

5. Install Docker Engine:

   ```shell
   sudo apt-get update
   sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
   ```

6. Enable Docker management with a non-privileged user:

   - Create docker group:

     ```shell
     sudo groupadd docker
     ```

   - Add user to docker group:

     ```shell
     sudo usermod -aG docker $USER
     ```

7. Configure Docker to start on boot:

   ```shell
   sudo systemctl enable docker.service
   sudo systemctl enable containerd.service
   ```

8. Configure the Docker daemon:

   - Edit `/etc/docker/daemon.json`: set the storage driver to overlay2 and use systemd for the management of the containers' cgroups. Optionally, the default directory for storing images/containers can be changed to a different disk partition (example: /data). Documentation about the possible options can be found here.

     ```json
     {
       "exec-opts": ["native.cgroupdriver=systemd"],
       "log-driver": "json-file",
       "log-opts": {
         "max-size": "100m"
       },
       "storage-driver": "overlay2",
       "data-root": "/data/docker"
     }
     ```

   - Restart docker:

     ```shell
     sudo systemctl restart docker
     ```
Creating ansible-runner docker environment
The following directory/file structure is needed for the Ansible runtime environment using Docker.
```
📁 pi-cluster
├── 📁 ansible-runner
│   ├── docker-compose.yaml
│   └── Dockerfile
└── 📁 ansible
    ├── ansible.cfg
    ├── inventory.yml
    ├── requirements.yml
    ├── pyproject.toml
    ├── uv.lock
    └── 📁 roles
```
Where:
- `ansible-runner` contains the image build file (`Dockerfile`) and the docker compose file (`docker-compose.yaml`) for lifecycle management of the runner container.
- `ansible` contains the Ansible project source and the UV-based local execution environment definition files (`pyproject.toml`, `uv.lock`, etc.).
Docker Image build details
The runner image is defined in ansible-runner/Dockerfile and uses:
- `python:3.12-slim` as base image
- UV (`ghcr.io/astral-sh/uv`) for Python environment and dependency management
- The same Ansible and Galaxy dependencies as the local environment:
  - `ansible/pyproject.toml` + `ansible/uv.lock` for reproducible Python dependencies
  - `ansible/requirements.yml` for Galaxy roles and collections
- Multi-stage binary sources for infrastructure tooling:
  - `ghcr.io/opentofu/opentofu:1.11.5` for `tofu` (copied through an Alpine intermediary stage)
  - `alpine/kubectl:1.35.1` for `kubectl`
  - `alpine/helm:3.18.5` for `helm`
  - `ghcr.io/helmfile/helmfile:v1.2.3` for `helmfile`
Build prerequisites checklist:
- Docker Engine and Docker Compose plugin are installed and working.
- Build context is the repository root (`pi-cluster/`) so all required files are available.
- Required files exist:
  - `ansible-runner/Dockerfile`
  - `ansible/pyproject.toml`
  - `ansible/uv.lock`
  - `ansible/requirements.yml`
Graphical build flow:
```mermaid
flowchart TD
    A["1) Prepare build sources: base image + helper stages"]
    B["2) Install OS dependencies: apt packages + cleanup"]
    C["3) Copy build inputs: Docker and Ansible files"]
    D["4) Install infrastructure tooling: OpenTofu + kubectl + helm + helmfile"]
    E["5) Prepare runner user environment: user, PATH, directories"]
    F["6) Install Python + Galaxy dependencies: uv sync + ansible-galaxy"]
    G["7) Finalize runtime tooling: helm plugins"]
    H["8) Set runtime defaults: WORKDIR /ansible"]
    A --> B --> C --> D --> E --> F --> G --> H
```
Steps/tasks performed when building the Docker image:
1. Prepare build sources:
   - Pull helper binaries from multi-stage images (`helmfile`, `uv`, `kubectl`, `helm`).
   - Start from `python:3.12-slim` as the runtime base.
2. Install OS-level runtime dependencies:
   - Install system packages required by Ansible workflows (`sudo`, `git`, `curl`, `gnupg`, etc.).
   - Clean apt caches to keep the image smaller.
3. Copy build inputs into the image:
   - Copy Python dependency files (`pyproject.toml`, `uv.lock`) and Galaxy dependency file (`requirements.yml`).
4. Install infrastructure tooling:
   - Copy `tofu` from `ghcr.io/opentofu/opentofu:1.11.5` via an Alpine intermediary stage.
   - Copy `kubectl` and `helm` binaries from dedicated Alpine images.
   - Copy `helmfile` binary from the build stage.
5. Create and prepare the `runner` user environment:
   - Create non-root `runner` user and home directory.
   - Configure PATH to include the UV virtual environment.
   - Create required directories for Ansible, cache/config, and runtime mounts.
6. Install Python and Galaxy dependencies:
   - Run `uv sync --frozen --no-dev` to create the pinned Python environment.
   - Install Galaxy roles and collections from `requirements.yml` (with retries).
7. Finalize runtime tooling:
   - Install Helm plugins (`helm-git`, `helm-diff`).
8. Set runtime defaults:
   - Set working directory to `/ansible` so commands run in the Ansible project context.
Current Dockerfile reference:
```dockerfile
FROM ghcr.io/helmfile/helmfile:v1.2.3 AS helmfile
FROM alpine/kubectl:1.35.1 AS kubectl
FROM alpine/helm:3.18.5 AS helm

FROM alpine:3.21 AS tofu
COPY --from=ghcr.io/opentofu/opentofu:1.11.5 /usr/local/bin/tofu /usr/local/bin/tofu

FROM ghcr.io/astral-sh/uv:0.10.4 AS uv

FROM python:3.12-slim

ARG ANSIBLE_GALAXY_CLI_COLLECTION_OPTS=--ignore-certs
ARG ANSIBLE_GALAXY_CLI_ROLE_OPTS=--ignore-certs

ENV UV_LINK_MODE=copy
ENV UV_PROJECT_ENVIRONMENT=/home/runner/.venv

RUN apt-get update -qq && \
    apt-get install sudo git apt-utils pwgen gnupg curl -y && \
    apt-get clean && \
    rm -rf /usr/share/doc/* /usr/share/man/* /var/lib/apt/lists/* /tmp/* /var/tmp/*

COPY --from=uv /uv /uvx /usr/local/bin/

WORKDIR /build
COPY ansible/pyproject.toml /build/pyproject.toml
COPY ansible/uv.lock /build/uv.lock
COPY ansible/requirements.yml /build/requirements.yml

# Install OpenTofu from dedicated image
COPY --from=tofu /usr/local/bin/tofu /usr/local/bin/tofu

# Install kubectl and Helm from dedicated images
COPY --from=kubectl /usr/local/bin/kubectl /usr/local/bin/kubectl
COPY --from=helm /usr/bin/helm /usr/local/bin/helm

# Copy helmfile
COPY --from=helmfile /usr/local/bin/helmfile /usr/local/bin/helmfile

ENV USER=runner
ENV FOLDER=/home/runner
RUN /usr/sbin/groupadd $USER && \
    /usr/sbin/useradd $USER -m -d $FOLDER -g $USER -s /bin/bash && \
    echo $USER 'ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers && \
    echo 'case ":$PATH:" in *:/home/runner/.venv/bin:*) ;; *) export PATH="/home/runner/.venv/bin:$PATH" ;; esac' >> /home/runner/.bashrc && \
    echo 'case ":$PATH:" in *:/home/runner/.venv/bin:*) ;; *) export PATH="/home/runner/.venv/bin:$PATH" ;; esac' >> /home/runner/.profile && \
    mkdir -p /home/runner/.ansible/roles /home/runner/.ansible/collections /home/runner/.local /home/runner/.cache /home/runner/.config /home/runner/.ssh && \
    chmod 700 /home/runner/.ssh && \
    chown -R $USER:$USER /home/runner

RUN for dir in \
      /runner \
      /ansible \
      /var/lib/letsencrypt \
      /etc/letsencrypt \
      /var/log/letsencrypt ; \
    do mkdir -p $dir ; chown $USER:$USER $dir; chmod 775 $dir ; done

USER $USER

# Install python dependencies
RUN uv sync --frozen --no-dev

# Install ansible roles/collections dependencies
RUN set -eux; \
    for attempt in 1 2 3; do \
      uv run ansible-galaxy role install $ANSIBLE_GALAXY_CLI_ROLE_OPTS -r requirements.yml --roles-path "/home/runner/.ansible/roles" --timeout 600 && break; \
      if [ "$attempt" -eq 3 ]; then exit 1; fi; \
      sleep 10; \
    done
RUN set -eux; \
    for attempt in 1 2 3; do \
      uv run ansible-galaxy collection install $ANSIBLE_GALAXY_CLI_COLLECTION_OPTS -r requirements.yml --collections-path "/home/runner/.ansible/collections" && break; \
      if [ "$attempt" -eq 3 ]; then exit 1; fi; \
      sleep 10; \
    done

# Install helmfile required plugins
RUN helm plugin install https://github.com/aslafy-z/helm-git
RUN helm plugin install https://github.com/databus23/helm-diff

WORKDIR /ansible
```
Docker Compose details
The ansible-runner service is defined in ansible-runner/docker-compose.yaml.
```yaml
services:
  ansible-runner:
    image: ansible-runner
    build:
      context: ..
      dockerfile: ansible-runner/Dockerfile
    command: tail -f /dev/null
    container_name: ansible-runner
    restart: unless-stopped
    volumes:
      - ./../ansible:/ansible
      - ./../kubernetes:/kubernetes
      - ./../terraform:/terraform
      - ./../metal/x86/pxe-files:/metal/x86/pxe-files
      - ${HOME}/.secrets:/home/runner/.secrets
      - ${HOME}/.ssh/id_rsa:/home/runner/.ssh/id_rsa:ro
      - ${HOME}/.ssh/id_rsa.pub:/home/runner/.ssh/id_rsa.pub:ro
      - ${HOME}/.kube:/home/runner/.kube
      - ${HOME}/.certbot/log:/home/runner/.certbot/log
      - ${HOME}/.certbot/config:/home/runner/.certbot/config
      - ${HOME}/.certbot/work:/home/runner/.certbot/work
```
Mounted volumes (localhost → container) quick reference:

| Host (localhost) | Container (ansible-runner) | Mode | Content |
|---|---|---|---|
| `./ansible` | `/ansible` | rw | Ansible project source (playbooks, roles, vars, inventory). |
| `./kubernetes` | `/kubernetes` | rw | Kubernetes manifests and platform/app cluster configuration. |
| `./terraform` | `/terraform` | rw | Terraform/OpenTofu code for Vault and MinIO automation. |
| `./metal/x86/pxe-files` | `/metal/x86/pxe-files` | rw | PXE boot files and related bare-metal assets. |
| `${HOME}/.secrets` | `/home/runner/.secrets` | rw | Local secret/env files used by automation workflows. |
| `${HOME}/.ssh/id_rsa` | `/home/runner/.ssh/id_rsa` | ro | Private SSH key for remote host authentication. |
| `${HOME}/.ssh/id_rsa.pub` | `/home/runner/.ssh/id_rsa.pub` | ro | Public SSH key paired with the private key. |
| `${HOME}/.kube` | `/home/runner/.kube` | rw | Kubeconfig and Kubernetes client context data. |
| `${HOME}/.certbot/log` | `/home/runner/.certbot/log` | rw | Certbot logs and execution history. |
| `${HOME}/.certbot/config` | `/home/runner/.certbot/config` | rw | Certbot account data and issued certificate material. |
| `${HOME}/.certbot/work` | `/home/runner/.certbot/work` | rw | Certbot temporary working files/state. |
This docker-compose file builds and starts the ansible-runner Docker container and mounts several host directories, including the Ansible project structure. The container is always running (command is tail -f /dev/null), so commands can be executed with docker exec, and there is no need to recreate a new container (docker run) every time a command needs to be executed.
Docker compose execution details
After the image is built and the compose service is defined, runner lifecycle can be managed manually as:
```mermaid
flowchart TD
    A["Repository root"] --> B["cd ansible-runner"]
    B --> C["docker compose build"]
    C --> D["docker compose up -d"]
    D --> E["ansible-runner container running"]
    E --> F["docker exec -it ansible-runner command"]
    E --> G["docker exec -it ansible-runner /bin/bash"]
    E --> H["(optional) docker compose down"]
    H --> I["container stopped and removed"]
```
```shell
cd ansible-runner
docker compose build
docker compose up -d
```
Any command can be executed in the running container:

```shell
docker exec -it ansible-runner <command>
```

Interactive shell:

```shell
docker exec -it ansible-runner /bin/bash
```
Command Wrapper Script
After building the local UV-based environment and the Dockerized ansible-runner environment, a wrapper script can provide a consistent interface for running commands in either environment, selected by the ANSIBLE_RUNNER_MODE environment variable.
Create the following wrapper script in the repository to run Ansible commands in both environments with the same interface:
ansible-runner/ansible-runner.sh

```shell
#!/bin/bash
if [ "$ANSIBLE_RUNNER_MODE" = "local" ]; then
    uv run "$@"
else
    docker exec -it ansible-runner "$@"
fi
```

- Local mode: `ANSIBLE_RUNNER_MODE=local ./ansible-runner/ansible-runner.sh <command>`
- Docker mode: `./ansible-runner/ansible-runner.sh <command>`

Examples:

```shell
# Local mode
ANSIBLE_RUNNER_MODE=local ./ansible-runner/ansible-runner.sh ansible-playbook --version

# Docker mode
./ansible-runner/ansible-runner.sh ansible-playbook --version
```
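The dispatch logic of the wrapper can be illustrated in isolation; the sketch below is a hypothetical helper (not part of the repository) that only prints the command each mode would execute, without actually running uv or docker:

```shell
# Hypothetical helper mirroring the wrapper's mode dispatch.
# It echoes the command each mode would run instead of executing it.
runner_cmd() {
  if [ "${ANSIBLE_RUNNER_MODE:-docker}" = "local" ]; then
    echo uv run "$@"
  else
    echo docker exec -it ansible-runner "$@"
  fi
}

ANSIBLE_RUNNER_MODE=local runner_cmd ansible-playbook --version
# → uv run ansible-playbook --version
ANSIBLE_RUNNER_MODE=docker runner_cmd ansible-playbook --version
# → docker exec -it ansible-runner ansible-playbook --version
```

Note the default branch: any value of ANSIBLE_RUNNER_MODE other than `local` (including unset) falls through to Docker mode, matching the wrapper script above.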
Makefile
The wrapper script can be used in a Makefile to create convenient shortcuts for common Ansible commands:
```makefile
.PHONY: lint syntax check ansible-playbook

lint:
	./ansible-runner/ansible-runner.sh yamllint .

syntax:
	./ansible-runner/ansible-runner.sh ansible-playbook --syntax-check

check: lint syntax

ansible-playbook:
	./ansible-runner/ansible-runner.sh ansible-playbook $(playbook) -i inventory.yml
```
This Makefile defines shortcuts for linting and syntax checking Ansible playbooks, as well as a generic target for running any playbook with the playbook variable. The same Makefile can be used in both local and Dockerized environments without modification, providing a consistent interface for Ansible command execution.
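The Makefile could also be extended with further convenience targets; for example, a hypothetical target (not in the repository) that opens an interactive shell inside the runner container when working in Docker mode:

```makefile
# Hypothetical extra target: interactive shell in the runner container
# (Docker mode only; assumes the ansible-runner container is already up)
shell:
	docker exec -it ansible-runner /bin/bash
```

Invoked as `make shell`, this drops into `/ansible` inside the container, since the image sets that as its working directory.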