Quick Start Instructions

These are the instructions to quickly deploy a Kubernetes Pi-cluster using the following tools:

  • cloud-init: to automate the initial OS installation/configuration on each node of the cluster
  • Ansible: to automatically configure cluster nodes, install and configure external services (DNS, DHCP, firewall, S3 storage server, HashiCorp Vault), install K3S, and bootstrap the cluster through the installation and configuration of FluxCD
  • Flux CD: to automatically deploy applications to the Kubernetes cluster from manifest files in the Git repository.

Ansible control node setup

  • Use your own Linux-based PC or set up an Ubuntu Server VM to become the Ansible control node (pimaster)

    In case of building a VM, check out the tip for automating its creation in “Ansible Control Node”.

  • Clone the Pi-Cluster Git repo or download it using the ‘Download ZIP’ link on GitHub.

    git clone https://github.com/ricsanfre/pi-cluster.git
    
  • Install Docker and docker-compose

    Follow instructions in “Ansible Control Node: Installing Ansible Runtime environment”.

  • Create and configure Ansible execution environment (ansible-runner):

    make ansible-runner-setup
    

    This will automatically build and start the ansible-runner Docker container (including all required packages and dependencies), generate a GPG key for encrypting with ansible-vault, and create an SSH key for remote connections.

Ansible configuration

Ansible configuration (variables and inventory files) might need to be adapted to your particular environment.

Inventory file

Adjust the ansible/inventory.yml inventory file to match your cluster configuration: IPs, hostnames, number of nodes, etc.

Add Raspberry Pi nodes to the rpi group and x86 nodes to the x86 group.

Set the node (i.e. node1) which is going to be used to install non-Kubernetes services (DNS server, PXE server, load balancer (HAProxy), Vault server). It has to be added to the groups dns, pxe, vault and haproxy, as shown in the sketch below.
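
As an illustration only, an inventory fragment implementing these groups might look like the following sketch (hostnames, IPs and nesting are assumptions; check the repository’s inventory.yml for the actual layout):

  # Hypothetical fragment of ansible/inventory.yml (illustrative values only)
  all:
    children:
      rpi:                    # Raspberry Pi nodes
        hosts:
          node2:
            ansible_host: 10.0.0.12
      x86:                    # x86 nodes
        hosts:
          node-hp-1:
            ansible_host: 10.0.0.21
      # node1 hosts the non-Kubernetes services, so it joins these groups
      dns:
        hosts:
          node1:
      pxe:
        hosts:
          node1:
      vault:
        hosts:
          node1:
      haproxy:
        hosts:
          node1: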

Configuring Ansible remote access

The UNIX user to be used in remote connections (e.g. ricsanfre) and its SSH key file location need to be specified.

Modify ansible/group_vars/all.yml to set the UNIX user to be used by Ansible in the remote connection, ansible_user (default value ansible), and its SSH private key, ansible_ssh_private_key_file:

  # Remote user name
  ansible_user: ricsanfre

  # Ansible ssh private key
  ansible_ssh_private_key_file: ~/.ssh/id_rsa

By default it uses the SSH key automatically created when initializing ansible-runner (make ansible-runner-setup), located in the ansible-runner/runner/.ssh directory.

Modify Ansible Playbook variables

Adjust the Ansible playbook/role variables defined within the group_vars, host_vars and vars directories to meet your specific configuration.

The following table shows the variable files defined at Ansible’s group and host levels:

Group/Host variable file              Nodes affected
ansible/group_vars/all.yml            all nodes of the cluster + gateway node + pimaster
ansible/group_vars/control.yml        control group: gateway node + pimaster
ansible/group_vars/k3s_cluster.yml    all nodes of the K3s cluster
ansible/group_vars/k3s_master.yml     K3s master nodes
ansible/host_vars/gateway.yml         gateway node specific variables
ansible/host_vars/node1.yml           external services node specific variables

The following table shows the variable files used for configuring storage, the backup server, and the K3S cluster and its services:

Specific variable file                                        Configuration
ansible/vars/picluster.yml                                    K3S cluster and external services configuration variables
ansible/vars/centralized_san/centralized_san_target.yml       iSCSI target local storage and LUNs configuration (centralized SAN setup)
ansible/vars/centralized_san/centralized_san_initiator.yml    iSCSI initiator configuration (centralized SAN setup)

Vault credentials generation

Generate the ansible-vault variable file (vars/vault.yml) containing all credentials/passwords. Random passwords will be generated for all cluster services.

Execute the following command:

make ansible-credentials

Credentials for external cloud services (IONOS DNS API credentials) or a GitHub PAT are requested during the execution of the playbook.
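
Purely as an illustration of the kind of entries the generated vault file holds, it might resemble the following sketch (every key and value here is hypothetical):

  # Hypothetical sketch of the generated vault file content (illustrative only)
  vault:
    minio:
      root_password: <randomly-generated-password>
    ionos:
      api_key: <provided-during-playbook-run>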

Prepare PXE server

Get the PXE boot files and ISO image to automate the installation of x86 nodes:

cd metal/x86
make get-kernel-files
make get-uefi-files

Cluster nodes setup

Update Raspberry Pi firmware

Update the firmware in all Raspberry Pis following the procedure described in “Raspberry PI firmware update”.

Install gateway node

Option 1: OpenWRT

Install the OpenWRT operating system on the gateway node (Raspberry PI or GL-Inet router). The installation and configuration process is described in “Cluster Gateway (OpenWRT)”.

Option 2: Ubuntu OS

The gateway router/firewall can be implemented by deploying Linux services on Ubuntu 22.04. The installation and configuration process is described in “Cluster Gateway (Ubuntu)”.

Install external services node

Once the gateway node is up and running, the external services node, node1, can be configured.

node1 is used to install common services: DNS server, PXE server, Vault, etc.

In the centralized SAN architecture, node1 can be configured as the SAN server.

Install the Ubuntu operating system on node1 (Raspberry Pi 4B 4GB).

The installation procedure is the one described in “Ubuntu OS Installation”, using cloud-init configuration files (user-data and network-config) for node1.

user-data depends on the storage architectural option selected:

Storage option     cloud-init file
Dedicated Disks    user-data
Centralized SAN    user-data

The network-config file is the same in both architectures.
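
As a hedged illustration, a minimal cloud-init network-config (version 2 syntax) might look like this; the interface name and addressing mode below are placeholders, not the repository’s actual values:

  # Illustrative cloud-init network-config; values are assumptions
  version: 2
  ethernets:
    eth0:
      dhcp4: true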

Configure external-services node

To automatically execute basic OS setup tasks and configure node1’s services (DNS, PXE server, etc.), execute the command:

make external-setup

Install cluster nodes

Once node1 is up and running, the rest of the nodes can be installed and connected to the LAN switch.

Install Raspberry PI nodes

Install the operating system on Raspberry Pi nodes node2-node6.

Follow the installation procedure indicated in “Ubuntu OS Installation”, using the corresponding cloud-init configuration files (user-data and network-config) depending on the storage setup selected. Since DHCP is used, there is no need to change the default /boot/network-config file located in the Ubuntu image.

Storage option     cloud-init file
Dedicated Disks    user-data
Centralized SAN    user-data

In the above user-data files, the hostname field needs to be changed for each node (node2-node6).
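
For example, a minimal user-data fragment setting the hostname might look like this (illustrative only; the real files in the repository contain many more settings):

  #cloud-config
  # Illustrative fragment: only the hostname changes per node
  hostname: node2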

Install x86 nodes

Install the operating system on the x86 nodes (node-hp-x).

Follow the installation procedure indicated in “OS Installation - X86 (PXE)” and adapt the cloud-init files to your environment.

To automate the deployment of the PXE server, execute the following command:

make pxe-setup

The PXE server is automatically deployed on node1 (the host belonging to the pxe hosts group in Ansible’s inventory file). cloud-init files are automatically created from the autoinstall Jinja template for every host belonging to the x86 hosts group.
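
Purely as an illustration of the autoinstall format such a template renders to (hostname, user and storage layout below are assumptions, not the repository’s actual values):

  #cloud-config
  # Illustrative Ubuntu autoinstall document; real values come from the Jinja template
  autoinstall:
    version: 1
    identity:
      hostname: node-hp-1
      username: ubuntu
      password: "<crypted-password-hash>"
    storage:
      layout:
        name: lvm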

The template and the corresponding host-variable files containing the storage configuration can be tweaked to adapt them to your needs:

  • autoinstall template
  • storage config (node-hp-1)

If the template or the storage config files are changed, make pxe-setup needs to be re-executed in order to deploy the changes to the PXE server.

Configure cluster nodes

To automatically execute basic OS setup tasks (DNS, DHCP, NTP, etc.), execute the command:

make nodes-setup

Configure External Services

DNS server

Install and configure the authoritative DNS server (Bind9) for the homelab subdomain on node1 (node belonging to Ansible’s dns host group).

The homelab subdomain is specified in the dns_domain variable configured in ansible/group_vars/all.yml, and the DNS server configuration in ansible/host_vars/node1.yml. Update both files to meet your cluster requirements.
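
For instance, a fragment of ansible/group_vars/all.yml could set the subdomain like this (the domain value is an assumption):

  # ansible/group_vars/all.yml (illustrative value)
  dns_domain: homelab.example.com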

make dns-setup

Minio and HashiCorp Vault

Install and configure the S3 storage server (Minio) and the secrets manager (HashiCorp Vault) by running the command:

make external-services

The Ansible playbook assumes the S3 server is installed on an external node, s3, and HashiCorp Vault on node1 (node belonging to Ansible’s vault host group).

Configuring OS level backup (restic)

Automate backup tasks at the OS level with restic on all nodes (node1-node6 and gateway) by running the command:

make configure-os-backup

The Minio S3 server VM, s3, hosted in a public cloud (Oracle Cloud Infrastructure), will be used as the backup backend.

Kubernetes Applications (GitOps)

FluxCD is used to automatically deploy the packaged applications contained in the repository. These applications are located in the /kubernetes directory.

  • Modify cluster configuration to point to your own repository

    Edit kubernetes/clusters/prod/config/cluster.yaml.

    In the GitRepository resource definition, set spec.url to the URL of your repository (see the sketch after this list).

  • In case of using a private Git repository

    Add the following configuration to the GitRepository resource:

    spec:
      secretRef:
        name: flux-system
    
  • Modify cluster global variables

    Edit kubernetes/clusters/prod/config/cluster-settings.yaml to use your own configuration: your own DNS domain, external services’ DNS names, etc.

  • Tune parameters of the different packaged Applications to meet your specific configuration

    Edit the Helm chart values.yaml files and other Kubernetes manifest files of the different applications located in the /kubernetes directory.
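
The following sketch shows what the resulting GitRepository resource could look like once spec.url points to your fork; metadata names, interval and branch are assumptions, not the repository’s exact manifest:

  # Illustrative GitRepository resource (Flux CD); names and values are assumptions
  apiVersion: source.toolkit.fluxcd.io/v1
  kind: GitRepository
  metadata:
    name: flux-system
    namespace: flux-system
  spec:
    interval: 1m
    url: https://github.com/<your-user>/pi-cluster
    ref:
      branch: master
    secretRef:          # only needed for a private repository
      name: flux-system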

K3S

K3S Installation

To install the K3S cluster, execute the command:

make k3s-install

K3S Bootstrap

To bootstrap the cluster, run the command:

make k3s-bootstrap

Flux CD will be installed and it will automatically deploy all cluster applications from the Git repo.

K3s Cluster reset

If you mess anything up in your Kubernetes cluster and want to start fresh, the K3s Ansible playbook includes a reset playbook that you can use to remove the K3S installation:

make k3s-reset

Shutting down the Raspberry Pi Cluster

To automatically shut down the Raspberry PI cluster, Ansible can be used.

The Kubernetes graceful node shutdown feature is enabled in the cluster. This feature is documented here, and it ensures that pods follow the normal pod termination process during node shutdown.
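
Graceful node shutdown is a kubelet feature; as a hedged sketch, the relevant kubelet configuration fields look like this (the grace-period values below are assumptions, not the values used by the playbooks):

  # Illustrative kubelet configuration fragment enabling graceful node shutdown
  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  shutdownGracePeriod: 30s              # total time reserved for the node shutdown
  shutdownGracePeriodCriticalPods: 10s  # portion of that time reserved for critical pods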

To perform a controlled shutdown of the cluster, execute the following commands:

  • Step 1: Shutdown K3S worker nodes:

    make shutdown-k3s-worker
    
  • Step 2: Shutdown K3S master nodes:

    make shutdown-k3s-master
    
  • Step 3: Shutdown gateway node:

    make shutdown-gateway
    

The shutdown commands connect to each Raspberry Pi in the cluster and execute the command sudo shutdown -h 1m, instructing the Raspberry Pi to shut down in 1 minute.
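
An Ansible task implementing this could look roughly like the following sketch (the task name and module choice are assumptions about the playbooks’ internals):

  # Illustrative Ansible task resembling what the shutdown playbooks run
  - name: Shut down node in 1 minute
    ansible.builtin.command: shutdown -h 1m
    become: true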

After a few minutes, all Raspberry Pis will be shut down. You can notice that when the switch’s Ethernet port LEDs are off. Then it is safe to unplug the Raspberry Pis.

Updating Ubuntu packages

To automatically update Ubuntu OS packages, run the following command:

make os-upgrade

This playbook automatically updates OS packages to the latest stable version and performs a system reboot if needed.


Last Update: Oct 06, 2024
