A working Kubernetes installation in a single lunch break!


This is a write-up of my project to get Kubernetes working as a single-node, general purpose install on a baremetal server provided by OVH.

It took me a long time to find the right information on how to do this, as many of the components are new and thinly documented.


DISCLAIMER
A single-node instance like the one I'm about to describe is incredibly useful for a personal system or learning tool, but should not be confused with something that's acceptable for production workloads.

Background

Why are you doing this?

I've had a few low-end virtual servers in Sydney and Warsaw for a while now, which I've been using for various tasks, but I'd been bumping up against the limits of their (tiny) resources for quite some time.

To compound the frustration, I've been using OVH's OpenStack object storage containers as large virtual disks (thank you S3QL). Imagine using a server based in Sydney while its main data disk sits on the other side of the world, accessed over HTTP.

Regardless, it was time for a change, and the recent introduction of some temptingly cheap local baremetal servers has swayed me.

My Needs

I had to find a solution that would tick some boxes:

  • It should have at least 200GiB of space
  • It should be easily backed up (especially my Nextcloud data)
  • It should be easy to partition different workloads and control their resource usage
  • It should be secure
  • It should be easy to manage and automate (I'm sick of the traditional special-snowflake server method)
  • It should allow multiple domains and TCP services on the same static IP

Kubernetes in an Hour

Software I Used

  • RancherOS: a ridiculously small distro where all the things are Dockerised
  • RKE: the Rancher Kubernetes Engine
  • Rancher: lubricant for Kubernetes
  • MetalLB: Google's solution to software load-balancing
  • Helm and Tiller: a sort-of package manager for Kubernetes

Part 1: Preparation

I decided to use RancherOS for the server, as everything running on the system is Dockerised and it's designed for Kubernetes. Unfortunately, this is not one of the standard OS offerings from OVH.

OS Installation

Luckily, OVH do give you a Java Web Start KVM for your (Supermicro) server with the ability to connect an ISO to the virtual disc drive. The installation ISO for RancherOS is only about 100MiB, so this is absolutely perfect.

RancherOS uses the standard cloud-config YAML format to ingest its initial settings during installation. All you need is something similar to this:

#cloud-config
hostname: your.public.dns.address

rancher:
  network:
    interfaces:
      eth0:
        dhcp: true
    dns:
      nameservers:
        # Using Google's DNS
        - 8.8.8.8
        - 8.8.4.4

ssh_authorized_keys:
  - ssh-ed25519 AAAAblahblahblahtypicalkey email@add.ress
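
Before feeding this to the installer, it's worth linting the file; RancherOS ships a validator for exactly this (assuming a reasonably recent release, with the file saved as cloud-config.yml to match the install commands below):

# Check the cloud-config for YAML and schema problems
sudo ros config validate -i cloud-config.yml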

Once RancherOS is up and running, you can use some of the tricks in this helpful Gist to get a software RAID running in RancherOS:

# Create an empty GPT partition table on each disk (the 'g' command in fdisk)
sudo fdisk /dev/sda
sudo fdisk /dev/sdb

# Install once on each disk
sudo ros install -i rancher/os:v1.5.0 -t gptsyslinux -c cloud-config.yml -a "rancher.state.mdadm_scan" -d /dev/sda --no-reboot
sudo ros install -i rancher/os:v1.5.0 -t gptsyslinux -c cloud-config.yml -a "rancher.state.mdadm_scan" -d /dev/sdb --no-reboot

# Configure RAID
sudo mdadm --create /dev/md0 --level=1 --metadata=1.0 --raid-devices=2 /dev/sda1 /dev/sdb1

# Final sanity checks
sudo fsck /dev/md0
sudo resize2fs /dev/md0
sudo fsck /dev/md0
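
# Before rebooting, confirm the mirror assembled and both disks are members
cat /proc/mdstat
sudo mdadm --detail /dev/md0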

sudo reboot

Kubernetes Initialisation

The Rancher Kubernetes Engine is painless to set up and get going. I'll step through performing a single-node install using Let's Encrypt as the certificate manager, but veering off the path a little bit to make things work seamlessly on the node.

First off, you need to tell RKE what to install and where, using a YAML spec. There are a number of things you can tweak and configure, though for my deployment I left almost everything at its defaults. Create your rke.yaml file for deployment:

ignore_docker_version: true
ssh_agent_auth: true
nodes:
  - address: public.dns.add.ress
    hostname_override: public.dns.add.ress
    user: rancher
    role: [controlplane,worker,etcd]

network:
  plugin: canal
  options:
    canal_iface: eth0

services:
  etcd:
    snapshot: true
    creation: 6h
    retention: 24h

As long as you're using ssh-agent for your SSH key, you don't have to specify the actual key you're using.
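
If you haven't already got an agent running in that shell, it's only a couple of commands (the key path is just an example):

# Start an agent and load the key you put in the cloud-config earlier
eval $(ssh-agent)
ssh-add ~/.ssh/id_ed25519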

Next up, install Kubes and do some basic configuration:

rke up --config rke.yaml
export KUBECONFIG=$(pwd)/kube_config_rke.yaml
kubectl -n kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
helm install stable/cert-manager --name cert-manager --namespace kube-system
helm install rancher-stable/rancher --name rancher --namespace cattle-system --set hostname=your.public.dns.name

You might need to space the above commands out just to ensure that everything is deploying properly, or use kubectl to check the rollout status of each.
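
Something like the following will tell you when each piece has finished rolling out (the deployment names below are what the charts created for me; yours may differ slightly):

kubectl -n kube-system rollout status deploy/tiller-deploy
kubectl -n kube-system rollout status deploy/cert-manager
kubectl -n cattle-system rollout status deploy/rancher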

You're done, and should have a ready-to-go basic Kubernetes node. To make it truly useful, you'll need to get down and dirty with some additional software.

Part 2: MetalLB

As this installation is only single-node, and I wanted to make sure that I could run all the services I needed to, I had to ensure that I had a way to share ports on the public IP address without resorting to hand-crafted iptables rules.

Why Use a Load Balancer?

It might seem counter-intuitive to run a load balancer service on a single-node (and single-IP) installation, but it comes down to how Kubernetes exposes non-HTTP services. For the sake of a single node, you essentially have three options:

  • ClusterIP as part of a service for internal-only access (just like linking Docker containers)
  • NodePort to directly expose the service on your Kubernetes node, on a random port from 30000 to 32767
  • As part of a DaemonSet... which I will not go into

As a DaemonSet is not a recommended way to go about simple deployments, none of the options are especially useful for exposing a typical port (say, IMAP).

Enter MetalLB, Google's software implementation of load-balancing for Kubernetes. Exposing a service to the world is as simple as:

  1. Installing MetalLB
  2. Telling MetalLB which IPs it can use
  3. Telling a service to expose a port on a particular IP

Luckily, the load balancer implementation supports sharing multiple services with unique ports on a single IP address. This is especially useful when you only have a single IP.

Setting it Up

Create a YAML spec (lb-pool.yaml, referenced in the apply command below) containing the public IP address MetalLB is allowed to hand out, to instantiate the default pool:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:  
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      # A single address still needs to be written in CIDR form
      - your.public.ip.addr/32

Then install and apply:

kubectl apply -f https://raw.githubusercontent.com/google/metallb/v0.7.3/manifests/metallb.yaml
kubectl apply -n metallb-system -f lb-pool.yaml
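
A quick check that the controller and the per-node speaker came up, and that the pool was read:

kubectl -n metallb-system get pods
kubectl -n metallb-system get configmap config -o yaml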

Exposing Workloads

Following the official documentation, getting it to work is as easy as assigning your service to the specific IP and putting in an IP sharing annotation.

Take a look at these two sample services: a typical bastion host and a generic game server with a voice port. MetalLB will expose multiple ports on the same IP address when you set the services up:

apiVersion: v1
kind: Service
metadata:
  annotations:
    metallb.universe.tf/allow-shared-ip: ip1
  name: bastion-exposer
  namespace: bastion
spec:
  loadBalancerIP: your.public.ip.addr
  ports:
  - name: ssh
    port: 1234
    protocol: TCP
    targetPort: 22
  selector:
    service: disposable
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    metallb.universe.tf/allow-shared-ip: ip1
  labels:
    game.service: game1-exposer
  name: game1-service
  namespace: gamehost
spec:
  ports:
  - name: game1
    port: 5678
    targetPort: 5678
  - name: voice
    port: 2468
    targetPort: 2468
  selector:
    game.service: game1
  type: LoadBalancer
  loadBalancerIP: your.public.ip.addr

The important part of the service spec is the annotation:

metadata:
  annotations:
    metallb.universe.tf/allow-shared-ip: ip1

As long as the metallb.universe.tf/allow-shared-ip annotation carries the same value for everything using the same spec.loadBalancerIP, every port will happily be opened up on that IP. It really is that easy.
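
Once the two services above are applied, both should report the same external address, and you can poke the ports from outside to be sure (the port and namespaces match the sample specs):

# Both services should list the same EXTERNAL-IP
kubectl -n bastion get svc bastion-exposer
kubectl -n gamehost get svc game1-service

# From another machine, check the forwarded SSH port answers
nc -vz your.public.ip.addr 1234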

Part 3: Automatic Certificates

This last part is a rough guide on how to use cert-manager to generate Let's Encrypt TLS certificates on demand for any web services you're running.

I'm doing this using the default ingress-nginx that ships with the Rancher Kubernetes Engine. The one prerequisite is working DNS pointing at your web services' addresses, since we'll be using the normal HTTP-01 challenge with Let's Encrypt's ACME service.
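
A quick way to confirm the DNS side before going any further (using the domain from the Ghost example later on; it should come back with your server's public IP):

dig +short kelsey.id.au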

Creating a ClusterIssuer

Cert-manager lets you configure two kinds of issuers: the standard Issuer resource that's bound to a single namespace, or the ClusterIssuer resource that's available across all namespaces. As part of a normal cluster, you'd want to use namespaced issuers as part of good security, but for a single node where you're the only tenant, it's much easier to go with the latter type.

Cert-manager comes with a sample cluster issuer, but it's better to create your own so you know what you're doing. Simply throw together a YAML file to deploy one, following the official doco:

apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    email: your@email.address
    http01: {}
    privateKeySecretRef:
      name: letsencrypt-cluster
    server: https://acme-v02.api.letsencrypt.org/directory
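
Apply it like any other resource (the filename is simply whatever you saved the spec as):

kubectl apply -f cluster-issuer.yaml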

Once it's deployed, you can use kubectl get clusterissuer -o yaml to make sure it's registered properly. It should output something resembling the following:

status:
  acme:
    uri: https://acme-v02.api.letsencrypt.org/acme/acct/47017662
  conditions:
  - lastTransitionTime: "2018-12-04T05:31:27Z"
    message: The ACME account was registered with the ACME server
    reason: ACMEAccountRegistered
    status: "True"
    type: Ready

Creating Virtual SSL Servers

This part took me a while to figure out: every time I deployed an ingress for a service, it never generated a certificate. The official doco seems to be missing (at least it was for me) one little piece of YAML that makes it all work.

For example, getting this Ghost instance exposed externally required the following service and ingress spec:

---
apiVersion: v1
kind: Service
metadata:
  labels:
    ghost.service: ghost-web-service
  name: ghost-web-service
  namespace: ghost
spec:
  ports:
  - name: http
    port: 2368
    targetPort: 2368
  selector:
    ghost.service: ghost-web
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ghost-ingress
  namespace: ghost
  annotations:
    kubernetes.io/ingress.class: nginx
    certmanager.k8s.io/cluster-issuer: letsencrypt
    certmanager.k8s.io/acme-challenge-type: http01
spec:
  tls:
  - hosts:
    - kelsey.id.au
    secretName: ghost-cert
  rules:
  - host: kelsey.id.au
    http:
      paths:
      - path: /
        backend:
          serviceName: ghost-web-service
          servicePort: 2368
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: ghost-cert
  namespace: ghost
spec:
  secretName: ghost-cert
  dnsNames:
  - kelsey.id.au
  acme:
    config:
    - http01:
        ingressClass: nginx
      domains:
      - kelsey.id.au
  issuerRef:
    name: letsencrypt
    kind: ClusterIssuer

By creating a Certificate object that references the domain and the ClusterIssuer, and adding the ClusterIssuer and ACME challenge type annotations to the Ingress, cert-manager puts it all together and generates a certificate for you.
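
If a certificate doesn't appear straight away, cert-manager leaves a decent trail to follow (names and namespace match the Ghost example above; the deployment name assumes the Helm install from Part 1):

# Has the certificate been issued, and does the secret exist?
kubectl -n ghost describe certificate ghost-cert
kubectl -n ghost get secret ghost-cert

# The cert-manager logs show the ACME challenge progress
kubectl -n kube-system logs deploy/cert-manager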

Closing Thoughts

While this guide gives you a pretty neat Kubernetes instance, the resulting 'cluster' has its shortcomings, some of which I'm still trying to find a good solution to. You need to ask questions like:

How do I provision persistent storage for my deployments?

Take a look at the built-in storage classes for Kubernetes. There's not a lot on offer if you want to have flexible storage using the host's baremetal disks.
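
For what it's worth, the bluntest instrument is a plain hostPath PersistentVolume pinned to the node's disk; it works on a single node but it's hardly flexible, and the path and size below are made up for the sake of the sketch:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nextcloud-data
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /var/lib/rancher/volumes/nextcloud-data
EOF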

How do I back up and restore my data?

If you lose the node, you lose everything, and it's not as simple as setting up a cron job to back up your data, let alone things like etcd. This is a topic that I'll cover in a future article.


Otherwise, I hope this was useful for you. Feel free to reach out if there's a topic you'd like me to cover as I continue learning.