What is this project about?

Scope

The scope of this project is to create a Kubernetes cluster at home using Raspberry Pis and to automate its deployment and configuration by applying IaC (Infrastructure as Code) and GitOps methodologies with tools like Ansible, cloud-init and Argo CD.

As part of the project, the goal is to use a lightweight Kubernetes flavor based on K3S and to deploy basic cluster services such as: 1) distributed block storage for pod persistent volumes (Longhorn); 2) a backup/restore solution for the cluster (Velero and Restic); 3) a service mesh architecture (Linkerd); and 4) an observability platform made up of a metrics monitoring solution (Prometheus), a logging and analytics solution (EFK+LG stack: Elasticsearch-Fluentd/Fluentbit-Kibana + Loki-Grafana), and a distributed tracing solution (Tempo).

Design Principles

  • Use a 64-bit ARM operating system, which makes it possible to use Raspberry Pi B nodes with 8 GB RAM. Currently only Ubuntu offers a 64-bit ARM distribution for Raspberry Pi.
  • Use a lightweight Kubernetes distribution (K3S). Its smaller memory footprint makes it ideal for running on Raspberry Pis.
  • Use distributed block storage technology, instead of a centralized NFS system, for pod persistent storage. The latest versions of Kubernetes distributed block storage solutions, like Rook/Ceph or Longhorn, include 64-bit ARM support.
  • Use open-source projects under the Cloud Native Computing Foundation (CNCF) umbrella.
  • Use the latest version of each open-source project to be able to test the latest Kubernetes capabilities.
  • Use cloud-init to automate the initial OS installation.
  • Use Ansible to automate the configuration of the cluster nodes, the installation of Kubernetes and external services, and the triggering of the cluster bootstrap (Argo CD bootstrap).
  • Use Argo CD to automatically provision Kubernetes applications from a git repository.
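
To make the last principle more concrete, here is a minimal sketch of the kind of Application resource that Argo CD watches to keep the cluster in sync with a git repository. It is only an illustration, not the manifests used in this project: the application name, repository URL, path and target namespace are hypothetical placeholders, and the automated sync policy is an assumption. The manifest is built as a Python dict and printed as JSON, which kubectl also accepts.

```python
import json


def argocd_application(name, repo_url, path, namespace, revision="HEAD"):
    """Return an Argo CD Application manifest as a plain dict."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumption: Argo CD itself runs in the "argocd" namespace.
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            # Where the desired state lives: git repository, directory, revision.
            "source": {
                "repoURL": repo_url,
                "path": path,
                "targetRevision": revision,
            },
            # Where to deploy: the cluster Argo CD runs in, and a namespace.
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # Assumption: sync automatically, removing resources deleted from
            # git and reverting manual changes made in the cluster.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


# All values below are hypothetical placeholders.
app = argocd_application(
    name="example-app",
    repo_url="https://git.example.com/your-user/your-cluster-repo.git",
    path="argocd/example-app",
    namespace="example",
)
print(json.dumps(app, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl, or the same structure could be written directly as YAML in the git repository.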

Technology Stack

The following picture shows the set of opensource solutions used for building this cluster:

[Image: Cluster-Icons]

| Name | Description |
|------|-------------|
| Ansible | Automate OS configuration, external services installation and k3s installation and bootstrapping |
| ArgoCD | GitOps tool for deploying applications to Kubernetes |
| Cloud-init | Automate OS initial installation |
| Ubuntu | Cluster nodes OS |
| K3S | Lightweight distribution of Kubernetes |
| containerd | Container runtime integrated with K3S |
| Flannel | Kubernetes Networking (CNI) integrated with K3S |
| CoreDNS | Kubernetes DNS |
| Metal LB | Load-balancer implementation for bare metal Kubernetes clusters |
| Traefik | Kubernetes Ingress Controller |
| Linkerd | Kubernetes Service Mesh |
| Longhorn | Kubernetes distributed block storage |
| Minio | S3 Object Storage solution |
| Cert-manager | TLS Certificates management |
| Hashicorp Vault | Secrets Management solution |
| External Secrets Operator | Sync Kubernetes Secrets from Hashicorp Vault |
| Velero | Kubernetes Backup and Restore solution |
| Restic | OS Backup and Restore solution |
| Prometheus | Metrics monitoring and alerting |
| Fluentd | Logs forwarding and distribution |
| Fluentbit | Logs collection |
| Loki | Logs aggregation |
| Elasticsearch | Logs analytics |
| Kibana | Logs analytics Dashboards |
| Tempo | Distributed tracing monitoring |
| Grafana | Monitoring Dashboards |

External Resources and Services

Even though the premise is to deploy all services in the Kubernetes cluster, a few external services and resources are still needed. They are listed below, together with the reason each one is needed.

Cloud external services

| Provider | Resource | Purpose |
|----------|----------|---------|
| Let's Encrypt | TLS Certificate Authority | Signed valid TLS certificates |
| IONOS | DNS | DNS and DNS-01 challenge for certificates |

Alternatives:

  1. Use a private PKI (custom CA to sign certificates).

    Currently supported. Only minor changes are required. See the Quick Start instructions for details.

  2. Use another DNS provider.

    Cert-manager / Certbot, used to automatically obtain certificates from Let's Encrypt, can work with other DNS providers. This requires further modifications to the way the cert-manager application is deployed (new providers and/or webhooks/plugins might be required).

    Currently only an ACME issuer (Let's Encrypt) using IONOS as the DNS-01 challenge provider is configured. See cert-manager's list of supported DNS-01 providers.
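
As a rough illustration of what switching DNS providers involves, the sketch below builds a cert-manager ClusterIssuer that uses the ACME DNS-01 challenge through a webhook solver. It is not the configuration used in this project: the issuer name, e-mail address, webhook group name, solver name and solver config are hypothetical placeholders that depend entirely on the webhook chosen for the DNS provider. Only the overall shape of the resource (an `acme` issuer with a `dns01.webhook` solver) is what matters here.

```python
import json

# Let's Encrypt production ACME directory endpoint.
LETSENCRYPT_PRODUCTION = "https://acme-v02.api.letsencrypt.org/directory"


def acme_dns01_issuer(name, email, group_name, solver_name, solver_config):
    """Return a ClusterIssuer manifest using a DNS-01 webhook solver."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "ClusterIssuer",
        "metadata": {"name": name},
        "spec": {
            "acme": {
                "server": LETSENCRYPT_PRODUCTION,
                "email": email,
                # Secret where cert-manager stores the ACME account key.
                "privateKeySecretRef": {"name": f"{name}-account-key"},
                "solvers": [
                    {
                        "dns01": {
                            "webhook": {
                                # These three values are defined by the
                                # provider-specific webhook being deployed.
                                "groupName": group_name,
                                "solverName": solver_name,
                                "config": solver_config,
                            }
                        }
                    }
                ],
            }
        },
    }


# All values below are hypothetical placeholders.
issuer = acme_dns01_issuer(
    name="letsencrypt-issuer",
    email="admin@example.com",
    group_name="acme.example.com",
    solver_name="example-dns",
    solver_config={"apiKeySecretRef": {"name": "dns-api-key", "key": "token"}},
)
print(json.dumps(issuer, indent=2))
```

Changing provider then mostly means deploying the matching webhook and replacing the three webhook-specific values; the rest of the issuer stays the same.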

Self-hosted external services

There is another set of services that I have decided to self-host outside the Kubernetes cluster.

| External Service | Resource | Purpose |
|------------------|----------|---------|
| Minio | S3 Object Store | Cluster Backup |
| Hashicorp Vault | Secrets Management | Cluster secrets management |

The Minio backup service is hosted in a VM running in the public cloud, using the Oracle Cloud Infrastructure (OCI) free tier.

The Vault service runs on the gateway node. Vault's Kubernetes authentication method needs access to the Kubernetes API, so I do not host Vault in the public cloud.

What I have built so far

From a hardware perspective, I have built two different versions of the cluster:

  • Release 1.0: Basic version using a dedicated USB flash drive for each node and a centralized SAN as additional storage

[Image: Cluster-1.0]

  • Release 2.0: Adds a dedicated SSD disk to each node of the cluster, which greatly improves overall cluster performance

[Image: Cluster-2.0]

What I have developed so far

From a software perspective, I have developed the following:

  1. Cloud-init template files for the initial OS installation on Raspberry Pi nodes

    Source code can be found in Pi Cluster Git repository under metal/rpi/cloud-init directory.

  2. Ansible playbook and roles for configuring cluster nodes and for installing and bootstrapping the K3S cluster

    Source code can be found in Pi Cluster Git repository under /ansible directory.

    Additionally, several Ansible roles have been developed to automate different configuration tasks on Ubuntu-based servers; they can be reused in other projects. These roles are used by the Pi-Cluster Ansible playbooks.

    The source code of each Ansible role can be found in its dedicated GitHub repository, and each role is published in Ansible Galaxy so that it can be installed with the ansible-galaxy command.

    | Ansible role | Description |
    |--------------|-------------|
    | ricsanfre.security | Automate SSH hardening configuration tasks |
    | ricsanfre.ntp | Chrony NTP service configuration |
    | ricsanfre.firewall | NFtables firewall configuration |
    | ricsanfre.dnsmasq | Dnsmasq configuration |
    | ricsanfre.storage | Configure LVM |
    | ricsanfre.iscsi_target | Configure iSCSI Target |
    | ricsanfre.iscsi_initiator | Configure iSCSI Initiator |
    | ricsanfre.k8s_cli | Install kubectl and Helm utilities |
    | ricsanfre.fluentbit | Configure fluentbit |
    | ricsanfre.minio | Configure Minio S3 server |
    | ricsanfre.backup | Configure Restic |
    | ricsanfre.vault | Configure Hashicorp Vault |

  3. Packaged Kubernetes applications (Helm, Kustomize, manifest files) to be deployed using Argo CD

    Source code can be found in Pi Cluster Git repository under /argocd directory.

  4. This documentation website, picluster.ricsanfre.com, hosted on GitHub Pages.

    Static website generated with Jekyll.

    Source code can be found in the Pi-cluster repository under /docs directory.
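
Since the Ansible roles from item 2 are published in Ansible Galaxy, one way to reuse them in another project is to list them in a requirements file and install them with `ansible-galaxy install -r requirements.yml`. The snippet below is a small sketch that renders such a file from the role names in the table. It is an illustration only and does not necessarily match the requirements file in the Pi Cluster repository; the function name is hypothetical and no role versions are pinned.

```python
# Role names taken from the table of Ansible roles above.
ROLES = [
    "ricsanfre.security",
    "ricsanfre.ntp",
    "ricsanfre.firewall",
    "ricsanfre.dnsmasq",
    "ricsanfre.storage",
    "ricsanfre.iscsi_target",
    "ricsanfre.iscsi_initiator",
    "ricsanfre.k8s_cli",
    "ricsanfre.fluentbit",
    "ricsanfre.minio",
    "ricsanfre.backup",
    "ricsanfre.vault",
]


def render_requirements(roles):
    """Render an ansible-galaxy requirements file listing the given roles."""
    lines = ["---", "roles:"]
    # One list entry per role; versions are left unpinned in this sketch.
    lines += [f"  - name: {role}" for role in roles]
    return "\n".join(lines) + "\n"


# Print the file content; redirect it to requirements.yml to use it.
print(render_requirements(ROLES), end="")
```

Only the roles a project actually needs have to be kept in the list.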

Software used and latest version tested

The following table lists the software used and the latest tested version of each component:

| Type | Software | Latest Version tested | Notes |
|------|----------|-----------------------|-------|
| OS | Ubuntu | 22.04.2 | |
| Control | Ansible | 2.14.5 | |
| Control | cloud-init | 23.1.2 | version pre-integrated into Ubuntu 22.04.2 |
| Kubernetes | K3S | v1.27.1 | K3S version |
| Kubernetes | Helm | v3.12 | |
| Metrics | Kubernetes Metrics Server | v0.6.2 | version pre-integrated into K3S |
| Computing | containerd | v1.6.19-k3s1 | version pre-integrated into K3S |
| Networking | Flannel | v0.21.4 | version pre-integrated into K3S |
| Networking | CoreDNS | v1.10.1 | version pre-integrated into K3S |
| Networking | Metal LB | v0.13.9 | Helm chart version: 0.13.9 |
| Service Mesh | Linkerd | v2.13.3 | Helm chart version: linkerd-control-plane-1.12.3 |
| Service Proxy | Traefik | v2.10.1 | Helm chart version: 23.0.1 |
| Storage | Longhorn | v1.4.2 | Helm chart version: 1.4.2 |
| Storage | Minio | RELEASE.2023-04-28T18-11-17Z | Helm chart version: 5.0.9 |
| TLS Certificates | Certmanager | v1.12.0 | Helm chart version: v1.12.0 |
| Logging | ECK Operator | 2.7.0 | Helm chart version: 2.7.0 |
| Logging | Elastic Search | 8.6.0 | Deployed with ECK Operator |
| Logging | Kibana | 8.6.0 | Deployed with ECK Operator |
| Logging | Fluentbit | 2.1.3 | Helm chart version: 0.29.0 |
| Logging | Fluentd | 1.15.2 | Helm chart version: 0.3.9. Custom docker image from official v1.15.2 |
| Logging | Loki | 2.8.2 | Helm chart grafana/loki version: 5.5.1 |
| Monitoring | Kube Prometheus Stack | 0.65.1 | Helm chart version: 45.29.0 |
| Monitoring | Prometheus Operator | 0.65.1 | Installed by Kube Prometheus Stack. Helm chart version: 45.29.0 |
| Monitoring | Prometheus | 2.42.0 | Installed by Kube Prometheus Stack. Helm chart version: 45.29.0 |
| Monitoring | AlertManager | 0.25.0 | Installed by Kube Prometheus Stack. Helm chart version: 45.29.0 |
| Monitoring | Grafana | 9.5.2 | Helm chart version grafana-6.56.5. Installed as dependency of Kube Prometheus Stack chart. Helm chart version: 45.29.0 |
| Monitoring | Prometheus Node Exporter | 1.5.0 | Helm chart version: prometheus-node-exporter-4.16.0. Installed as dependency of Kube Prometheus Stack chart. Helm chart version: 43.3.1 |
| Monitoring | Prometheus Elasticsearch Exporter | 1.5.0 | Helm chart version: prometheus-elasticsearch-exporter-4.15.1 |
| Tracing | Grafana Tempo | 2.1.1 | Helm chart: tempo-distributed (1.4.0) |
| Backup | Minio External (self-hosted) | RELEASE.2023-05-04T18-10-16Z | |
| Backup | Restic | 0.13.1 | |
| Backup | Velero | 1.9.3 | Helm chart version: 2.32.1 |
| Secrets | Hashicorp Vault | 1.12.2 | |
| Secrets | External Secret Operator | 0.8.1 | Helm chart version: 0.8.1 |
| GitOps | Argo CD | v2.7.2 | Helm chart version: 5.33.2 |

Last Update: May 20, 2023
