What is this project about?
Scope
The scope of this project is to create a Kubernetes cluster at home using Raspberry Pis and to automate its deployment and configuration by applying IaC (Infrastructure as Code) and GitOps methodologies with tools like Ansible, cloud-init and Argo CD.
As part of the project, the goal is to use a lightweight Kubernetes flavor based on K3S and to deploy basic cluster services such as: 1) distributed block storage for pod persistent volumes (Longhorn), 2) a backup/restore solution for the cluster (Velero and Restic), 3) a service mesh architecture (Linkerd), and 4) an observability platform made up of a metrics monitoring solution (Prometheus), a logging and analytics solution (the EFK+LG stack: Elasticsearch-Fluentd/Fluentbit-Kibana + Loki-Grafana), and a distributed tracing solution (Tempo).
Design Principles
- Use a 64-bit ARM operating system, making it possible to use Raspberry Pi B nodes with 8GB RAM. Currently only Ubuntu provides a 64-bit ARM distribution for Raspberry Pi.
- Use a lightweight Kubernetes distribution (K3S), whose smaller memory footprint makes it ideal for running on Raspberry Pis.
- Use distributed block storage technology, instead of a centralized NFS system, for pod persistent storage. The latest versions of Kubernetes distributed block storage solutions, like Rook/Ceph or Longhorn, include 64-bit ARM support.
- Use open source projects under the CNCF (Cloud Native Computing Foundation) umbrella.
- Use the latest version of each open source project to be able to test the latest Kubernetes capabilities.
- Use cloud-init to automate the initial OS installation.
- Use Ansible to automate the configuration of the cluster nodes, the installation of Kubernetes and external services, and the triggering of the cluster bootstrap (Argo CD bootstrap).
- Use Argo CD to automatically provision Kubernetes applications from a git repository.
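To make the GitOps principle above more concrete, the following is a minimal sketch of an Argo CD Application manifest. It is not taken from this project: the application name, repository URL, path and namespaces are hypothetical placeholders, and the sync options shown are only one reasonable choice.

```yaml
# Illustrative Argo CD Application; all names, the URL and the path are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app              # hypothetical application name
  namespace: argocd              # assumes Argo CD is installed in the "argocd" namespace
spec:
  project: default
  source:
    repoURL: https://github.com/example/pi-cluster.git   # placeholder repository URL
    targetRevision: HEAD
    path: argocd/example-app     # hypothetical directory holding the packaged application
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD itself runs in
    namespace: example-app
  syncPolicy:
    automated:
      prune: true                # delete resources that were removed from git
      selfHeal: true             # revert manual changes made directly in the cluster
    syncOptions:
      - CreateNamespace=true     # create the destination namespace if it is missing
```

With automated sync enabled, Argo CD keeps the cluster state aligned with the contents of the git repository, so a change is deployed by committing it rather than by running commands against the cluster.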
Technology Stack
The following table lists the set of open source solutions used for building this cluster:
Name | Description
---|---
Ansible | Automate OS configuration, external services installation and k3s installation and bootstrapping
ArgoCD | GitOps tool for deploying applications to Kubernetes
Cloud-init | Automate OS initial installation
Ubuntu | Cluster nodes OS
K3S | Lightweight distribution of Kubernetes
containerd | Container runtime integrated with K3S
Flannel | Kubernetes Networking (CNI) integrated with K3S
CoreDNS | Kubernetes DNS
Metal LB | Load-balancer implementation for bare metal Kubernetes clusters
Traefik | Kubernetes Ingress Controller
Linkerd | Kubernetes Service Mesh
Longhorn | Kubernetes distributed block storage
Minio | S3 Object Storage solution
Cert-manager | TLS Certificates management
Hashicorp Vault | Secrets Management solution
External Secrets Operator | Sync Kubernetes Secrets from Hashicorp Vault
Velero | Kubernetes Backup and Restore solution
Restic | OS Backup and Restore solution
Prometheus | Metrics monitoring and alerting
Fluentd | Logs forwarding and distribution
Fluentbit | Logs collection
Loki | Logs aggregation
Elasticsearch | Logs analytics
Kibana | Logs analytics Dashboards
Tempo | Distributed tracing monitoring
Grafana | Monitoring Dashboards
External Resources and Services
Even though the premise is to deploy all services in the Kubernetes cluster, a few external services/resources are still needed. Below is a list of these external resources/services and why they are needed.
Cloud external services
Note: These resources are optional. The homelab still works without them, but it won’t have trusted certificates.
Provider | Resource | Purpose
---|---|---
Let’s Encrypt | TLS Certificate Authority | Signed valid TLS certificates
IONOS | DNS | DNS and DNS-01 challenge for certificates
Alternatives:
- Use a private PKI (custom CA to sign certificates). This is currently supported and only minor changes are required. See details in Doc: Quick Start instructions.
- Use another DNS provider. Cert-manager / Certbot, used to automatically obtain certificates from Let’s Encrypt, can be used with other DNS providers. This requires further modifications to the way the cert-manager application is deployed (new providers and/or webhooks/plugins might be required). Currently only the ACME issuer (Let’s Encrypt) using IONOS as the DNS-01 challenge provider is configured. Check the list of supported dns01 providers.
Self-hosted external services
There is another set of services that I have decided to self-host outside the Kubernetes cluster.
External Service | Resource | Purpose
---|---|---
Minio | S3 Object Store | Cluster Backup
Hashicorp Vault | Secrets Management | Cluster secrets management
The Minio backup service is hosted in a VM running in a public cloud, using the Oracle Cloud Infrastructure (OCI) free tier.
The Vault service runs on the gateway node. Since the Vault Kubernetes authentication method needs access to the Kubernetes API, I won’t host the Vault service in a public cloud.
What I have built so far
From a hardware perspective, I have built two different versions of the cluster:
- Release 1.0: Basic version using a dedicated USB flash drive for each node and a centralized SAN as additional storage.
- Release 2.0: Adds a dedicated SSD disk to each node of the cluster, which greatly improves the overall cluster performance.
What I have developed so far
From a software perspective, I have developed the following:
- Cloud-init template files for the initial OS installation on Raspberry Pi nodes.
  Source code can be found in the Pi Cluster Git repository under the metal/rpi/cloud-init directory.
- Ansible playbook and roles for configuring the cluster nodes and for installing and bootstrapping the K3S cluster.
  Source code can be found in the Pi Cluster Git repository under the /ansible directory.
  Additionally, several Ansible roles have been developed to automate different configuration tasks on Ubuntu-based servers. They can be reused in other projects and are used by the Pi Cluster Ansible playbooks.
  The source code of each Ansible role can be found in its dedicated Github repository, and each role is published in Ansible Galaxy so that it can be installed with the ansible-galaxy command.

  Ansible role | Description
  ---|---
  ricsanfre.security | Automate SSH hardening configuration tasks
  ricsanfre.ntp | Chrony NTP service configuration
  ricsanfre.firewall | NFtables firewall configuration
  ricsanfre.dnsmasq | Dnsmasq configuration
  ricsanfre.storage | Configure LVM
  ricsanfre.iscsi_target | Configure iSCSI Target
  ricsanfre.iscsi_initiator | Configure iSCSI Initiator
  ricsanfre.k8s_cli | Install kubectl and Helm utilities
  ricsanfre.fluentbit | Configure fluentbit
  ricsanfre.minio | Configure Minio S3 server
  ricsanfre.backup | Configure Restic
  ricsanfre.vault | Configure Hashicorp Vault

- Packaged Kubernetes applications (Helm, Kustomize, manifest files) to be deployed using ArgoCD.
  Source code can be found in the Pi Cluster Git repository under the /argocd directory.
- This documentation website, picluster.ricsanfre.com, hosted in Github pages.
  It is a static website generated with Jekyll.
  Source code can be found in the Pi Cluster Git repository under the /docs directory.
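As a sketch of how the Ansible roles listed above could be reused in another project, they can be declared in an Ansible Galaxy requirements file. The role names come from the list above; the file name requirements.yml and the selection of roles are only an example, and no versions are pinned because none are stated here.

```yaml
# requirements.yml: example selection of the roles listed above (no versions pinned)
roles:
  - name: ricsanfre.security    # SSH hardening
  - name: ricsanfre.ntp         # Chrony NTP service configuration
  - name: ricsanfre.firewall    # NFtables firewall configuration
```

Running ansible-galaxy install -r requirements.yml would then download the declared roles from Ansible Galaxy.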
Software used and latest version tested
The following table lists the software used and the latest tested version of each component:
Type | Software | Latest Version tested | Notes |
---|---|---|---|
OS | Ubuntu | 22.04.2 | |
Control | Ansible | 2.14.5 | |
Control | cloud-init | 23.1.2 | version pre-integrated into Ubuntu 22.04.2 |
Kubernetes | K3S | v1.27.1 | K3S version |
Kubernetes | Helm | v3.12 | |
Metrics | Kubernetes Metrics Server | v0.6.2 | version pre-integrated into K3S |
Computing | containerd | v1.6.19-k3s1 | version pre-integrated into K3S |
Networking | Flannel | v0.21.4 | version pre-integrated into K3S |
Networking | CoreDNS | v1.10.1 | version pre-integrated into K3S |
Networking | Metal LB | v0.13.9 | Helm chart version: 0.13.9 |
Service Mesh | Linkerd | v2.13.3 | Helm chart version: linkerd-control-plane-1.12.3 |
Service Proxy | Traefik | v2.10.1 | Helm chart version: 23.0.1 |
Storage | Longhorn | v1.4.2 | Helm chart version: 1.4.2 |
Storage | Minio | RELEASE.2023-04-28T18-11-17Z | Helm chart version: 5.0.9 |
TLS Certificates | Cert-manager | v1.12.0 | Helm chart version: v1.12.0 |
Logging | ECK Operator | 2.7.0 | Helm chart version: 2.7.0 |
Logging | Elasticsearch | 8.6.0 | Deployed with ECK Operator |
Logging | Kibana | 8.6.0 | Deployed with ECK Operator |
Logging | Fluentbit | 2.1.3 | Helm chart version: 0.29.0 |
Logging | Fluentd | 1.15.2 | Helm chart version: 0.3.9. Custom Docker image based on the official v1.15.2 image |
Logging | Loki | 2.8.2 | Helm chart grafana/loki version: 5.5.1 |
Monitoring | Kube Prometheus Stack | 0.65.1 | Helm chart version: 45.29.0 |
Monitoring | Prometheus Operator | 0.65.1 | Installed by Kube Prometheus Stack. Helm chart version: 45.29.0 |
Monitoring | Prometheus | 2.42.0 | Installed by Kube Prometheus Stack. Helm chart version: 45.29.0 |
Monitoring | AlertManager | 0.25.0 | Installed by Kube Prometheus Stack. Helm chart version: 45.29.0 |
Monitoring | Grafana | 9.5.2 | Helm chart version: grafana-6.56.5. Installed as a dependency of the Kube Prometheus Stack chart (Helm chart version: 45.29.0) |
Monitoring | Prometheus Node Exporter | 1.5.0 | Helm chart version: prometheus-node-exporter-4.16.0. Installed as a dependency of the Kube Prometheus Stack chart (Helm chart version: 43.3.1) |
Monitoring | Prometheus Elasticsearch Exporter | 1.5.0 | Helm chart version: prometheus-elasticsearch-exporter-4.15.1 |
Tracing | Grafana Tempo | 2.1.1 | Helm chart: tempo-distributed (1.4.0) |
Backup | Minio External (self-hosted) | RELEASE.2023-05-04T18-10-16Z | |
Backup | Restic | 0.13.1 | |
Backup | Velero | 1.9.3 | Helm chart version: 2.32.1 |
Secrets | Hashicorp Vault | 1.12.2 | |
Secrets | External Secrets Operator | 0.8.1 | Helm chart version: 0.8.1 |
GitOps | Argo CD | v2.7.2 | Helm chart version: 5.33.2 |