Cilium (Kubernetes CNI)

Cilium is an open source, cloud native solution for providing, securing, and observing network connectivity between workloads, powered by eBPF kernel technology.

In a Kubernetes cluster, Cilium can be used as:

  • High performance CNI

    See details in Cilium Use-case: Layer 4 Load Balancer

  • Kube-proxy replacement

    Kube-proxy is a component running on every node of the cluster which load-balances traffic targeted to Kubernetes Services (via Cluster IPs and Node Ports), routing it to the proper backend pods.

    Cilium can be used to replace the kube-proxy component, substituting kube-proxy’s iptables-based routing with eBPF technology.

    See details in Cilium Use-case: Kube-proxy Replacement

  • Layer 4 Load Balancer

    Software-based load balancer for the Kubernetes cluster, able to announce routes to Kubernetes Services using BGP or L2 protocols.

    Cilium’s LB IPAM is a feature that allows Cilium to assign IP addresses to Kubernetes Services of type LoadBalancer.

    Once an IP address is assigned, Cilium can advertise the assigned IPs through BGP or L2 announcements, so traffic can be routed to cluster Services from outside the cluster (North-bound traffic: external to Pod).

    See details in Cilium Use-case: Layer 4 Load Balancer

In the Pi Cluster, Cilium can be used as a replacement for the following networking components of the cluster:

  • Flannel CNI, installed by default by K3S, which uses a VXLAN overlay as its networking protocol. It is replaced by Cilium’s eBPF-based CNI networking.

  • Kube-proxy, so that eBPF-based routing can be used to increase performance.

  • MetalLB load balancer. MetalLB was used for LoadBalancer IP Address Management (LB-IPAM) and L2 announcements answering Address Resolution Protocol (ARP) requests over the local network. Cilium 1.13 introduced LB-IPAM support and 1.14 added L2 announcement capabilities, making it possible to replace MetalLB in my homelab. My homelab does not have a BGP-capable router, so the new L2-aware functionality is used instead.

K3S installation

By default K3s installs and configures basic Kubernetes networking components:

  • Flannel as the networking plugin, CNI (Container Network Interface), enabling pod-to-pod communication
  • CoreDNS providing cluster DNS services
  • Traefik as ingress controller
  • Klipper Load Balancer as embedded Service Load Balancer

K3S master nodes need to be installed with the following additional options (an equivalent config-file sketch is shown after this list):

  • --flannel-backend=none: to disable Flannel installation
  • --disable-network-policy: most CNI plugins come with their own network policy engine, so it is recommended to set --disable-network-policy as well to avoid conflicts
  • --disable-kube-proxy: to disable kube-proxy installation
  • --disable servicelb: to disable the default service load balancer installed by K3S (Klipper Load Balancer). Cilium will be used instead.
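
As a reference, the same settings can also be provided through the K3s configuration file instead of command-line flags. A minimal sketch of /etc/rancher/k3s/config.yaml for the server nodes, equivalent to the options above:

# /etc/rancher/k3s/config.yaml (server nodes)
flannel-backend: "none"
disable-network-policy: true
disable-kube-proxy: true
disable:
  - servicelb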

See the complete installation procedure and other configuration settings in “K3S Installation”

Cilium Installation

Installation using Helm (version 3):

  • Step 1: Add Cilium Helm repository:

      helm repo add cilium https://helm.cilium.io/
    
  • Step 2: Fetch the latest charts from the repository:

      helm repo update
    
  • Step 3: Create helm values file cilium-values.yaml

    # Cilium operator config
    operator:
      # replicas: 1  # Uncomment this if you only have one node
      # Roll out cilium-operator pods automatically when configmap is updated.
      rollOutPods: true
    
      # Install operator on master node
      nodeSelector:
        node-role.kubernetes.io/master: "true"
    
    # Roll out cilium agent pods automatically when ConfigMap is updated.
    rollOutCiliumPods: true
    
    # K8s API service
    k8sServiceHost: 127.0.0.1
    k8sServicePort: 6444
    
    # Replace Kube-proxy
    kubeProxyReplacement: true
    kubeProxyReplacementHealthzBindAddr: 0.0.0.0:10256
    
    # -- Configure IP Address Management mode.
    # ref: https://docs.cilium.io/en/stable/network/concepts/ipam/
    ipam:
      operator:
        clusterPoolIPv4PodCIDRList: "10.42.0.0/16"
    
    # Configure L2 announcements (LB-IPAM configuration)
    l2announcements:
      enabled: true
    externalIPs:
      enabled: true
    
    # Increase the k8s api client rate limit to avoid being limited due to increased API usage 
    k8sClientRateLimit:
      qps: 50
      burst: 200
    
  • Step 4: Install Cilium in kube-system namespace

      helm install cilium cilium/cilium --namespace kube-system -f cilium-values.yaml
    
  • Step 5: To confirm that the deployment succeeded, run:

      kubectl -n kube-system get pod
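
    Optionally, if the Cilium CLI is installed on a machine with access to the cluster (an extra assumption; it is not installed by the steps above), the overall deployment status can also be checked with:

      cilium status --wait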
    

Helm chart configuration details

  • Configure Cilium Operator to run on master nodes and to roll out automatically when its configuration is updated

    operator:
      # Roll out cilium-operator pods automatically when configmap is updated.
      rollOutPods: true
    
      # Install operator on master node
      nodeSelector:
        node-role.kubernetes.io/master: "true"
    
  • Configure Cilium agents to roll out automatically when configuration is updated

    # Roll out cilium agent pods automatically when ConfigMap is updated.
    rollOutCiliumPods: true
    
  • Configure Cilium to replace kube-proxy

    # Replace Kube-proxy
    kubeProxyReplacement: true
    kubeProxyReplacementHealthzBindAddr: 0.0.0.0:10256
    

    Further details about kube-proxy replacement mode can be found in the Cilium documentation.

  • Cilium LB-IPAM configuration

    l2announcements:
      enabled: true
    externalIPs:
      enabled: true
    

    Setting l2announcements.enabled makes Cilium perform L2 announcements and reply to ARP requests. This feature requires Cilium to run in kube-proxy replacement mode.

    Announcing the external IPs assigned to LoadBalancer-type Services (i.e. the IPs assigned by LB-IPAM) also needs to be enabled (externalIPs.enabled).

  • Configure access to Kubernetes API

    k8sServiceHost: 127.0.0.1
    k8sServicePort: 6444
    

    These variables could point to the Kubernetes API listening on the virtual IP configured in HAProxy (10.0.0.11), port 6443. However, K3s runs an API server proxy listening on 127.0.0.1:6444 on every node of the cluster, so it is not necessary to point to the external virtual IP address.

  • Increase the k8s api client rate limit to avoid being limited due to increased API usage

    k8sClientRateLimit:
      qps: 50
      burst: 200
    

    The leader election process used by L2 announcements continually generates API requests. The default client rate limit is 5 QPS (queries per second) with bursts allowed up to 10 QPS. This default limit is quickly reached when L2 announcements are enabled, so the client rate limit should be sized accordingly.

    See details in Cilium L2 announcements documentation

Configure Cilium monitoring and Hubble UI

  • Configure Prometheus Monitoring metrics and Grafana dashboards

    Add following configuration to helm chart values.yaml:

    operator:
      # Enable prometheus integration for cilium-operator
      prometheus:
        enabled: true
        serviceMonitor:
          enabled: true
      # Enable Grafana dashboards for cilium-operator
      dashboards:
        enabled: true
        annotations:
          grafana_folder: Cilium
    
    # Enable Prometheus integration for cilium-agent
    prometheus:
      enabled: true
      serviceMonitor:
        enabled: true
        # scrape interval
        interval: "10s"
        # -- Relabeling configs for the ServiceMonitor hubble
        relabelings:
          - action: replace
            sourceLabels:
              - __meta_kubernetes_pod_node_name
            targetLabel: node
            replacement: ${1}
        trustCRDsExist: true
    # Enable Grafana dashboards for cilium-agent
    # grafana can import dashboards based on the label and value
    # ref: https://github.com/grafana/helm-charts/tree/main/charts/grafana#sidecar-for-dashboards
    dashboards:
      enabled: true
      annotations:
        grafana_folder: Cilium
    
    
  • Enable Hubble UI

    Add following configuration to helm chart values.yaml:

    # Enable Hubble
    hubble:
      enabled: true
      # Enable Monitoring
      metrics:
        enabled:
          - dns:query
          - drop
          - tcp
          - flow
          - port-distribution
          - icmp
          - http
        serviceMonitor:
          enabled: true
          # scrape interval
          interval: "10s"
          # -- Relabeling configs for the ServiceMonitor hubble
          relabelings:
            - action: replace
              sourceLabels:
                - __meta_kubernetes_pod_node_name
              targetLabel: node
              replacement: ${1}
        # Grafana Dashboards
        dashboards:
          enabled: true
          annotations:
            grafana_folder: Cilium
      relay:
        enabled: true
        rollOutPods: true
        # Enable Prometheus for hubble-relay
        prometheus:
          enabled: true
          serviceMonitor:
            enabled: true
      ui:
        enabled: true
        rollOutPods: true
        # Enable Ingress
        ingress:
          enabled: true
          annotations:
            # Enable external authentication using Oauth2-proxy
            nginx.ingress.kubernetes.io/auth-signin: https://oauth2-proxy.picluster.ricsanfre.com/oauth2/start?rd=https://$host$request_uri
            nginx.ingress.kubernetes.io/auth-url: http://oauth2-proxy.oauth2-proxy.svc.cluster.local/oauth2/auth
            nginx.ingress.kubernetes.io/proxy-buffer-size: "16k"
            nginx.ingress.kubernetes.io/auth-response-headers: Authorization
    
            # Enable cert-manager to create automatically the SSL certificate and store in Secret
            # Possible Cluster-Issuer values:
            #   * 'letsencrypt-issuer' (valid TLS certificate using IONOS API)
            #   * 'ca-issuer' (CA-signed certificate, not valid)
            cert-manager.io/cluster-issuer: letsencrypt-issuer
            cert-manager.io/common-name: hubble.picluster.ricsanfre.com
          className: nginx
          hosts: ["hubble.picluster.ricsanfre.com"]
          tls:
            - hosts:
              - hubble.picluster.ricsanfre.com
              secretName: hubble-tls
    

See further details in Cilium Monitoring and Metrics and Cilium Hubble UI and Cilium Hubble Configuration.
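
If the Ingress is not reachable yet, the Hubble UI enabled above can also be accessed temporarily through port-forwarding (assuming the chart's default hubble-ui Service listening on port 80 in the kube-system namespace):

kubectl -n kube-system port-forward svc/hubble-ui 12000:80
# UI available at http://localhost:12000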

Configure LB-IPAM

  • Step 1: Configure the IP address pool and the announcement method (L2 configuration)

    Create the following manifest file: cilium-config.yaml

      ---
      apiVersion: "cilium.io/v2alpha1"
      kind: CiliumLoadBalancerIPPool
      metadata:
        name: "first-pool"
        namespace: kube-system
      spec:
        blocks:
          - start: "10.0.0.100"
            stop: "10.0.0.200"
    
      ---
      apiVersion: cilium.io/v2alpha1
      kind: CiliumL2AnnouncementPolicy
      metadata:
        name: default-l2-announcement-policy
        namespace: kube-system
      spec:
        externalIPs: true
        loadBalancerIPs: true
    
    

    Apply the manifest file

     kubectl apply -f cilium-config.yaml
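
    To verify that the pool and the announcement policy were created (resource names below follow Cilium's cilium.io/v2alpha1 CRDs):

      kubectl get ciliumloadbalancerippools.cilium.io
      kubectl get ciliuml2announcementpolicies.cilium.io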
    

Configuring LoadBalancer Services

Services can request specific IPs. The legacy way of doing so is via .spec.loadBalancerIP which takes a single IP address. This method has been deprecated in k8s v1.24.

With Cilium LB IPAM, the way of requesting specific IPs is to use the annotation io.cilium/lb-ipam-ips. This annotation takes a comma-separated list of IP addresses, allowing multiple IPs to be requested at once.

apiVersion: v1
kind: Service
metadata:
  name: service-blue
  namespace: example
  labels:
    color: blue
  annotations:
    io.cilium/lb-ipam-ips: "20.0.10.100,20.0.10.200"
spec:
  type: LoadBalancer
  ports:
  - port: 1234
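
Once the Service is created, the IP assigned by LB-IPAM appears in the EXTERNAL-IP column (the namespace and Service name below come from the example manifest above):

kubectl get svc service-blue -n example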

See further details in Cilium LB IPAM documentation.

Cilium and ArgoCD

ArgoCD automatic synchronization and pruning of resources might delete some of the resources automatically created by Cilium. This is a well-known behaviour of ArgoCD. Check the “ArgoCD installation” document.

ArgoCD needs to be configured to exclude the synchronization of CiliumIdentity resources:

resource.exclusions: |
 - apiGroups:
     - cilium.io
   kinds:
     - CiliumIdentity
   clusters:
     - "*"

Other issues, related to Hubble’s certificate rotation, also need to be considered, adding the following configuration to Cilium’s ArgoCD Application resource:

ignoreDifferences:
  - group: ""
    kind: ConfigMap
    name: hubble-ca-cert
    jsonPointers:
    - /data/ca.crt
  - group: ""
    kind: Secret
    name: hubble-relay-client-certs
    jsonPointers:
    - /data/ca.crt
    - /data/tls.crt
    - /data/tls.key
  - group: ""
    kind: Secret
    name: hubble-server-certs
    jsonPointers:
    - /data/ca.crt
    - /data/tls.crt
    - /data/tls.key
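
For reference, a minimal sketch of how this block fits into the Application resource (repository details, chart version and namespaces below are placeholders, not the actual homelab values):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cilium
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://helm.cilium.io
    chart: cilium
    targetRevision: 1.16.0  # placeholder chart version
  destination:
    server: https://kubernetes.default.svc
    namespace: kube-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
  # Ignore fields rotated by Hubble's certificate management
  ignoreDifferences:
    - group: ""
      kind: ConfigMap
      name: hubble-ca-cert
      jsonPointers:
        - /data/ca.crt
    # ...plus the Secret entries listed above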

See further details in Cilium documentation: Troubleshooting Cilium deployed with Argo CD.

K3S Uninstallation

If a custom CNI, like Cilium, is used, the K3s scripts to clean up an existing installation (k3s-uninstall.sh or k3s-killall.sh) need to be used carefully.

Those scripts do not clean up Cilium’s networking configuration, and executing them might cause the host to lose network connectivity when K3s is stopped.

Before running k3s-killall.sh or k3s-uninstall.sh on any node, cilium interfaces must be removed (cilium_host, cilium_net and cilium_vxlan):

ip link delete cilium_host
ip link delete cilium_net
ip link delete cilium_vxlan

Additionally, iptables rules for cilium should be removed:

iptables-save | grep -iv cilium | iptables-restore
ip6tables-save | grep -iv cilium | ip6tables-restore

The CNI config directory also needs to be removed:

rm -rf /etc/cni/net.d

Last Update: Sep 09, 2024
