<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[From Stone Age to DevOps]]></title><description><![CDATA[Trial by fire: my notes on DevOps from scratch.]]></description><link>https://kelsey.id.au/</link><image><url>https://kelsey.id.au/favicon.png</url><title>From Stone Age to DevOps</title><link>https://kelsey.id.au/</link></image><generator>Ghost 3.3</generator><lastBuildDate>Sat, 08 Feb 2020 13:31:33 GMT</lastBuildDate><atom:link href="https://kelsey.id.au/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Automating RancherOS in vCloud Director]]></title><description><![CDATA[Automating deployment of cloud-init compatible network settings within VMware's vCloud Director using a Python script.]]></description><link>https://kelsey.id.au/automating-rancheros-in-vcenter/</link><guid isPermaLink="false">5d0b61b03f1a790001219826</guid><category><![CDATA[vCloud]]></category><category><![CDATA[VMware]]></category><category><![CDATA[RancherOS]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Angus Kelsey]]></dc:creator><pubDate>Sat, 22 Jun 2019 01:26:23 GMT</pubDate><content:encoded><![CDATA[<!--kg-card-begin: markdown--><p>One of the things I've been doing a lot of recently is deploying <a href="https://rancher.com/docs/os/v1.x/en/">RancherOS</a> nodes into VMware's <a href="https://www.vmware.com/au/products/vcloud-director.html">vCloud Director</a>.</p>
<p>While automating the process entirely has presented its own set of challenges, the most difficult one to overcome has been getting the <code>cloud-init</code> network interface settings to apply correctly.</p>
<h2 id="theproblem">The Problem</h2>
<p>In most cloud providers, container-optimised operating systems have no real trouble determining the network settings handed to them (including from vSphere/ESXi hypervisors). Unfortunately, vCloud is one of the providers that seems to make things difficult.</p>
<p>RancherOS is able to detect the hypervisor that it's running in, and when <a href="https://rancher.com/docs/os/v1.2/en/running-rancheros/cloud/vmware-esxi/">running in ESXi</a> installs <code>open-vm-tools</code> to read the virtual machine's exposed <code>guestinfo</code> keys, including the network interface settings. As described in <a href="https://github.com/rancher/os/issues/2151">this GitHub issue</a>, vCloud uses special keys in its OVF environment strings which are not compatible.</p>
<p>If you run <code>vmtoolsd --cmd &quot;info-get guestinfo.ovfenv&quot;</code> on a node, you'll get something like this:</p>
<pre><code class="language-xml">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
   &lt;Environment
      xmlns=&quot;http://schemas.dmtf.org/ovf/environment/1&quot;
      xmlns:xsi=&quot;http://www.w3.org/2001/XMLSchema-instance&quot;
      xmlns:oe=&quot;http://schemas.dmtf.org/ovf/environment/1&quot;
      xmlns:ve=&quot;http://www.vmware.com/schema/ovfenv&quot;
      oe:id=&quot;&quot;
      ve:vCenterId=&quot;vm-123456&quot;&gt;
      &lt;PlatformSection&gt;
         &lt;Kind&gt;VMware ESXi&lt;/Kind&gt;
         &lt;Version&gt;6.7.0&lt;/Version&gt;
         &lt;Vendor&gt;VMware, Inc.&lt;/Vendor&gt;
         &lt;Locale&gt;en&lt;/Locale&gt;
      &lt;/PlatformSection&gt;
      &lt;PropertySection&gt;
         &lt;Property oe:key=&quot;vCloud_UseSysPrep&quot; oe:value=&quot;None&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_bitMask&quot; oe:value=&quot;11&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_bootproto_0&quot; oe:value=&quot;static&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_bootproto_1&quot; oe:value=&quot;static&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_computerName&quot; oe:value=&quot;rancheros-node&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_dns1_0&quot; oe:value=&quot;&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_dns1_1&quot; oe:value=&quot;&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_dns2_0&quot; oe:value=&quot;&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_dns2_1&quot; oe:value=&quot;&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_gateway_0&quot; oe:value=&quot;192.168.0.1&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_gateway_1&quot; oe:value=&quot;192.168.1.1&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_ip_0&quot; oe:value=&quot;192.168.1.100&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_ip_1&quot; oe:value=&quot;192.168.2.50&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_ip_2&quot; oe:value=&quot;10.0.0.6&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_macaddr_0&quot; oe:value=&quot;00:50:56:01:01:01&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_macaddr_1&quot; oe:value=&quot;00:50:56:02:02:02&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_markerid&quot; oe:value=&quot;uuid-uuid-uuid-uuid&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_netmask_0&quot; oe:value=&quot;255.255.255.0&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_netmask_1&quot; oe:value=&quot;255.255.255.0&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_netmask_2&quot; oe:value=&quot;255.255.255.0&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_numnics&quot; oe:value=&quot;2&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_primaryNic&quot; oe:value=&quot;0&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_reconfigToken&quot; oe:value=&quot;1234567&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_resetPassword&quot; oe:value=&quot;0&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_suffix_0&quot; oe:value=&quot;&quot;/&gt;
         &lt;Property oe:key=&quot;vCloud_suffix_1&quot; oe:value=&quot;&quot;/&gt;
      &lt;/PropertySection&gt;
      &lt;ve:EthernetAdapterSection&gt;
         &lt;ve:Adapter ve:mac=&quot;00:50:56:01:02:03&quot; ve:network=&quot;rancher_network&quot; ve:unitNumber=&quot;7&quot;/&gt;
      &lt;/ve:EthernetAdapterSection&gt;
   &lt;/Environment&gt;
</code></pre>
<p>As you can see, vCloud prefixes every key with its own <code>vCloud_</code> label, and on top of that does not provide the correct information in some cases:</p>
<ul>
<li>The reported number of NICs (<code>vCloud_numnics</code>) is less than the number actually presented to the system</li>
<li>vCloud incorrectly presents gateways for non-routed networks</li>
<li>DNS settings do not always come through</li>
</ul>
<p>To add another layer of complexity, when vCloud creates the virtual machines, it is not consistent with the ordering of NICs. For instance, on some nodes I have found that the virtual machine sees its first NIC as <code>eth1</code> and its second NIC as <code>eth0</code>, which means you cannot assume that the string <code>vCloud_ip_0</code> matches <code>eth0</code> within the host.</p>
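<p>As an aside, one way to see how the kernel has actually ordered the interfaces is to map each MAC address to its interface name via <code>/sys/class/net</code>. This is a sketch of my own, not part of the deployment script:</p>
<pre><code class="language-python">import os

def mac_to_ifname(sysfs='/sys/class/net'):
    # Map lowercase MAC address to kernel interface name (eth0, eth1, ...)
    mapping = {}
    if not os.path.isdir(sysfs):
        return mapping
    for name in os.listdir(sysfs):
        try:
            with open(os.path.join(sysfs, name, 'address')) as handle:
                mapping[handle.read().strip().lower()] = name
        except OSError:
            continue
    return mapping
</code></pre>
<p>With that mapping in hand, a <code>vCloud_macaddr_0</code> key can be matched to whatever name the kernel chose, regardless of order.</p>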
<h2 id="thesolution">The Solution</h2>
<p>As the name of the game here is automation, I needed something that fit in with the deployment process:</p>
<ol>
<li>Build template with Packer and upload to vCloud</li>
<li>Deploy clusters with Terraform, including specific network settings</li>
<li>Deploy Kubernetes with RKE</li>
</ol>
<p>This means that I want to use a generic <code>cloud-config</code> <em>once</em> while building the machine and not have to manually set the network on each deployed node post-deployment.</p>
<p>I ended up going with a slapped-together Python script that does the following:</p>
<ol>
<li>Obtain the current <code>cloud-config</code> data reported by <code>ros config export</code></li>
<li>Obtain the <code>guestinfo</code> keys exposed by VMware</li>
<li>Parse the VMware data into a dictionary</li>
<li>Compare with what's configured on the node and see what needs changing</li>
<li>Verify the new configuration, write it and reboot</li>
</ol>
<p>This gets written to the host on boot using the initial <code>cloud-config.yaml</code> file, and runs after the Ubuntu console has ensured that Python and the pip dependencies are installed.</p>
<p>The main script will be kept updated <a href="https://github.com/nlseven/misc-scripts/blob/master/net-set.py">here</a>, but to give you an idea:</p>
<pre><code class="language-python">#!/usr/bin/env python3
__author__ = &quot;Angus Kelsey &lt;nlseven@users.noreply.github.com&gt;&quot;
__copyright__ = &quot;Copyright 2019&quot;
__license__ = &quot;MIT&quot;
from xml.dom.minidom import parseString
import warnings
import sys
import subprocess
from yaml import load, dump
try:
   from yaml import CLoader as Loader, CDumper as Dumper
except ImportError:
   # Fall back to the pure-Python loader/dumper if libyaml is unavailable
   from yaml import Loader, Dumper

def live_data():
   # Export the current list of network interfaces from VMware,
   # as well as the current cloud-config from RancherOS
   grab_vcloud = &quot;/usr/bin/vmtoolsd --cmd 'info-get guestinfo.ovfEnv'&quot;
   grab_ros = &quot;/usr/bin/sudo ros config export&quot;
   dasXML = subprocess.Popen(grab_vcloud,shell=True,stdout=subprocess.PIPE).stdout.read()
   dasYAML = subprocess.Popen(grab_ros,shell=True,stdout=subprocess.PIPE).stdout.read()
   return dasXML, dasYAML

def create_dict(rawData):
   # Convert the OVFEnv XML into a dictionary
   ovfData = parseString(rawData)
   ovfKeys = {}
   # Only get keys in the &lt;Property&gt; lines
   for ovfkey in ovfData.getElementsByTagName('Property'):
      key, value = [ ovfkey.attributes['oe:key'].value,
                     ovfkey.attributes['oe:value'].value ]
      ovfKeys[key] = value
   return ovfKeys

def builder(ovfKeys):
   # Build up a dictionary of network interfaces that we can return, for up to three NICs
   # Note: this could be done a whole lot better

   # Check for a primary NIC, and die if it doesn't exist
   # as sometimes vCloud doesn't set the guestinfo
   if not 'vCloud_ip_0' in ovfKeys:
      warnings.warn(&quot;Can't find a primary NIC! Machine may need to restart first.&quot;,Warning)
      sys.exit(1)
   ovfKeys[&quot;interfaces&quot;] = {}
   # In these loops: check for either one, two or three interfaces.
   # Set the MAC addresses as the interface names because vCloud/Terraform likes to screw up the order
   if all (key in ovfKeys for key in (&quot;vCloud_ip_0&quot;, &quot;vCloud_ip_1&quot;, &quot;vCloud_ip_2&quot;)):
      print(&quot;Found three NICs!&quot;)
      ovfKeys[&quot;num&quot;] = 3
      # Convert netmask binary bits into CIDR slash notation
      eth0cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys[&quot;vCloud_netmask_0&quot;].split('.'))
      eth1cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys[&quot;vCloud_netmask_1&quot;].split('.'))
      eth2cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys[&quot;vCloud_netmask_2&quot;].split('.'))
      # Update the IPs to use as proper CIDRs
      ovfKeys[&quot;vCloud_ip_0&quot;] += &quot;/&quot; + str(eth0cidr)
      ovfKeys[&quot;vCloud_ip_1&quot;] += &quot;/&quot; + str(eth1cidr)
      ovfKeys[&quot;vCloud_ip_2&quot;] += &quot;/&quot; + str(eth2cidr)
      # Use Rancher's format: https://rancher.com/docs/os/v1.2/en/networking/interfaces/
      key1 = 'mac=' + ovfKeys[&quot;vCloud_macaddr_0&quot;]
      key2 = 'mac=' + ovfKeys[&quot;vCloud_macaddr_1&quot;]
      key3 = 'mac=' + ovfKeys[&quot;vCloud_macaddr_2&quot;]
      # Build an interfaces dictionary in case we need to save the YAML data
      # Use the MAC address as the name in case the interfaces aren't in right order
      ovfKeys[&quot;interfaces&quot;] = {
         key1: {'address': ovfKeys[&quot;vCloud_ip_0&quot;],'gateway': ovfKeys[&quot;vCloud_gateway_0&quot;], 'dhcp': False},
         key2: {'address': ovfKeys[&quot;vCloud_ip_1&quot;], 'dhcp': False},
         key3: {'address': ovfKeys[&quot;vCloud_ip_2&quot;], 'dhcp': False}
         }

   # Repeat for 2 NICs... then 1...
   elif all (key in ovfKeys for key in (&quot;vCloud_ip_0&quot;, &quot;vCloud_ip_1&quot;)):
      print(&quot;Found two NICs!&quot;)
      ovfKeys[&quot;num&quot;] = 2
      eth0cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys[&quot;vCloud_netmask_0&quot;].split('.'))
      eth1cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys[&quot;vCloud_netmask_1&quot;].split('.'))
      ovfKeys[&quot;vCloud_ip_0&quot;] += &quot;/&quot; + str(eth0cidr)
      ovfKeys[&quot;vCloud_ip_1&quot;] += &quot;/&quot; + str(eth1cidr)
      # Use Rancher's format: https://rancher.com/docs/os/v1.2/en/networking/interfaces/
      key1 = 'mac=' + ovfKeys[&quot;vCloud_macaddr_0&quot;]
      key2 = 'mac=' + ovfKeys[&quot;vCloud_macaddr_1&quot;]
      ovfKeys[&quot;interfaces&quot;] = {
         key1: {'address': ovfKeys[&quot;vCloud_ip_0&quot;],'gateway': ovfKeys[&quot;vCloud_gateway_0&quot;], 'dhcp': False},
         key2: {'address': ovfKeys[&quot;vCloud_ip_1&quot;], 'dhcp': False}
         }
   else:
      print (&quot;Single NIC found!&quot;)
      ovfKeys[&quot;num&quot;] = 1
      eth0cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys[&quot;vCloud_netmask_0&quot;].split('.'))
      ovfKeys[&quot;vCloud_ip_0&quot;] += &quot;/&quot; + str(eth0cidr)
      # Use Rancher's format: https://rancher.com/docs/os/v1.2/en/networking/interfaces/
      key1 = 'mac=' + ovfKeys[&quot;vCloud_macaddr_0&quot;]
      ovfKeys[&quot;interfaces&quot;] = {
         key1: {'address': ovfKeys[&quot;vCloud_ip_0&quot;],'gateway': ovfKeys[&quot;vCloud_gateway_0&quot;], 'dhcp': False}}
   return ovfKeys

def exit_lane(ovfKeys,ros_yaml):
   # Check if configuration needs updating. If not, exit this script!
   ovfKeys[&quot;update&quot;] = False
   # cconfig == cloud-config, which we grabbed from ros export - load it in as a dictionary
   cconfig = load(ros_yaml,Loader=Loader)
   # Check if we have a proper network interfaces section at all
   if &quot;interfaces&quot; not in cconfig[&quot;rancher&quot;][&quot;network&quot;]:
       print(&quot;No interfaces section exists! Need to create.&quot;)
       ovfKeys[&quot;update&quot;] = True
       return
   # Iterate through X interfaces reported by vCloud, and check their config
   for eth in range (0,ovfKeys[&quot;num&quot;]):
      currInterface = &quot;mac=&quot; + ovfKeys[&quot;vCloud_macaddr_&quot;+str(eth)]
      # Is there a config section for this particular MAC address/interface?
      # If not, flag for update and stop checking this NIC
      if currInterface not in cconfig[&quot;rancher&quot;][&quot;network&quot;][&quot;interfaces&quot;]:
         print(&quot;Configuration does not contain &quot; + currInterface + &quot;. Need to add.&quot;)
         ovfKeys[&quot;update&quot;] = True
         continue
      # If so, is the config correct?
      if cconfig[&quot;rancher&quot;][&quot;network&quot;][&quot;interfaces&quot;][currInterface][&quot;address&quot;] == ovfKeys[&quot;vCloud_ip_&quot;+str(eth)]:
         print (&quot;Interface &quot; + currInterface + &quot; is up to date.&quot;)
      else:
         print (&quot;Interface &quot; + str(eth) + &quot; needs updating.&quot;)
         ovfKeys[&quot;update&quot;] = True
   # All interfaces checked. Exit the script if no updates are needed.
   if not ovfKeys[&quot;update&quot;]:
      print(&quot;Everything is up to date. Bye!&quot;)
      sys.exit(0)

def ros_yaml(ovfKeys,ros_yaml):
   # Update required, this function will render the YAML, merge config and reboot
   #
   # Create a skeleton dictionary
   newSpec = {'rancher': {'network': {}}}
   # Copy what we built back in builder() into the skeleton
   newSpec[&quot;rancher&quot;][&quot;network&quot;][&quot;interfaces&quot;] = ovfKeys[&quot;interfaces&quot;].copy()
   # Let the user see some pretty YAML
   print(dump(newSpec,default_flow_style=False))
   # Save it to /tmp
   with open('/tmp/ros_yaml.yaml','w') as cloud_config:
      cloud_config.write(dump(newSpec,default_flow_style=False))
   # Ensure it doesn't trip up RancherOS - wait for verify to complete
   ros_verify = &quot;/usr/bin/sudo ros config validate -i /tmp/ros_yaml.yaml&quot;
   validator = subprocess.Popen(ros_verify,shell=True,stdout=subprocess.PIPE)
   # Wait for validation to finish and capture its exit code
   validator.communicate()
   exit_code = validator.returncode
   # An exit code of 0 means the configuration is valid
   if exit_code == 0:
      print(&quot;Rancher has verified the change. Merging and rebooting ...&quot;)
      ros_merge = &quot;/usr/bin/sudo ros config merge -i /tmp/ros_yaml.yaml&quot;
      # Wait for the merge to finish before rebooting
      subprocess.Popen(ros_merge,shell=True,stdout=subprocess.PIPE).communicate()
      subprocess.run([&quot;sudo&quot;,&quot;reboot&quot;])
   else:
      # I don't know what else to check for if it doesn't validate
      print(&quot;Something went wrong. ROS returned &quot; + str(exit_code) + &quot;,\ntried running &quot;+ ros_verify)
      sys.exit(1)

# Now do it in the correct order
data, ros_export = live_data()
parsed = create_dict(data)
parsed = builder(parsed)
exit_lane(parsed,ros_export)
ros_yaml(parsed,ros_export)
</code></pre>
<p>To make sure it runs properly on the nodes, you need to include some extras in your initial configuration:</p>
<pre><code class="language-yaml">#cloud-config.yaml
runcmd:
- apt update
- sudo apt install -y open-vm-tools python3 python3-pip cython3 libyaml-dev
- pip3 install pyyaml -q
- python3 /tmp/netset.py
rancher:
  console: ubuntu
write_files:
- content: |+
    # python script
    # goes here
  owner: rancher
  path: /tmp/netset.py
  permissions: &quot;0770&quot;
</code></pre>
<p>The script has performed exactly as I need it to, and now all of the network settings are set correctly after the machine has been deployed.</p>
<h2 id="diggingfurther">Digging Further</h2>
<p>If you read through the script, you'll see that I'm using the MAC addresses as the keys for the network settings to avoid using interface names, in order to work around VMware's quirky behaviour during deployment.</p>
<p>This is documented on the <a href="https://rancher.com/docs/os/v1.2/en/networking/interfaces/">official RancherOS pages</a>, and essentially means you can use <code>mac=00:01:02:03:04:05:</code> instead of <code>eth0:</code>.</p>
<p>The script also takes care of converting the netmask of the interfaces into the slash notation used by Rancher, thanks to <a href="https://stackoverflow.com/questions/38085571/how-use-netaddr-to-convert-subnet-mask-to-cidr-in-python">this</a> Stack Overflow post.</p>
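<p>The conversion itself is a one-liner: count the set bits in each octet of the dotted-quad mask. Extracted here as a standalone function:</p>
<pre><code class="language-python">def netmask_to_cidr(netmask):
    # 255.255.255.0 has 24 set bits in total, so it becomes /24
    return sum(bin(int(octet)).count('1') for octet in netmask.split('.'))

print(netmask_to_cidr('255.255.255.0'))  # 24
</code></pre>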
<h2 id="drawbacksandpossibleimprovements">Drawbacks and Possible Improvements</h2>
<p>Unfortunately, this is still only a hack-job at best. There are some situations where this will not work, and some things that could be done a lot better.</p>
<h3 id="mainloop">Main Loop</h3>
<p>The script only supports up to three NICs, because each count is handled by its own hard-coded branch rather than a loop. In the future I'd like to make this a lot more elegant, but for my environment this isn't so much of a problem.</p>
<p>In addition, the decision whether to update settings is based solely on comparing the address of each interface in the <code>cloud-config</code> settings, and does not take into account changed gateways, suffixes, etc.</p>
<h3 id="multiplegateways">Multiple Gateways</h3>
<p>A few times, VMware has decided to assign a gateway to a network interface attached to a private, non-routable network. Terrible design aside, this becomes a problem when the interface order is wrong: the script assumes that <code>vCloud_gateway_0</code> is the correct gateway, which can cause headaches.</p>
<p>An updated version of the script that will soon be rolled into Git has an additional loop which assigns as many gateways as are reported, and this <em>does</em> work without any obvious problems, but it's nasty. I do not want private traffic being routed when it doesn't have to be, and ultimately you will need to tailor the script to fit your particular environment.</p>
<h3 id="manipulatingtheovfdirectly">Manipulating the OVF Directly</h3>
<p>If I were to fix this the <em>right</em> way during deployment for next time, I would instead look at writing a Terraform provider to set the correct <code>cloud-config</code> strings directly into the host using the <a href="https://pubs.vmware.com/vcd-56/index.jsp?topic=%2Fcom.vmware.vcloud.api.doc_56%2FGUID-E13A5613-8A41-46E3-889B-8E1EAF10ABBE.html">vCloud ProductSection API</a>.</p>
<p>The advantage of this is that once Terraform has applied a cluster deployment, it can go through and configure the ProductSection of the VMs to match the network settings supplied and remove the need entirely for the Python script; as RancherOS (and many other operating systems using cloud init) will accept the correct keys out-of-the-box.</p>
<p>The tricky bit to making this work is that it's a PUT operation, meaning you'll first have to pull the current ProductSection, update what you need to and push back the <em>entire</em> document. There's no way to patch just the sections you need to.</p>
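<p>Roughly, the flow would look like this. The sketch below uses only the standard library, and the endpoint path, API version header and token handling are assumptions based on my reading of the docs rather than tested code:</p>
<pre><code class="language-python">import urllib.request

def product_section_url(base_url, vm_id):
    # e.g. https://vcloud.example.com/api plus vm-123456 (hypothetical values)
    return '{0}/vApp/{1}/productSections'.format(base_url, vm_id)

def update_product_section(base_url, vm_id, token, transform):
    # GET the full ProductSection, apply transform(xml_text), then PUT the
    # whole document back: vCloud offers no PATCH for this resource.
    headers = {
        'Accept': 'application/*+xml;version=31.0',   # assumed API version
        'x-vcloud-authorization': token,
    }
    url = product_section_url(base_url, vm_id)
    with urllib.request.urlopen(urllib.request.Request(url, headers=headers)) as resp:
        current = resp.read().decode('utf-8')
    updated = transform(current)
    headers['Content-Type'] = 'application/vnd.vmware.vcloud.productSections+xml'
    put_req = urllib.request.Request(url, data=updated.encode('utf-8'),
                                     headers=headers, method='PUT')
    return urllib.request.urlopen(put_req)
</code></pre>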
<h2 id="furtherreading">Further Reading</h2>
<p>It looks like users of CoreOS have stumbled upon a similar issue, and someone has made an elegant program in Go to handle vCloud's weird networking. You can see it in <a href="https://github.com/xcompass/vmware-ovfenv">this GitHub repo</a>, but looking through it, it also relies on the gateway being set on the correct interface.</p>
<!--kg-card-end: markdown--><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Single-Node Kubernetes on Baremetal: Part Two]]></title><description><![CDATA[Exploring storage and backup options for your single-node Rancher Kubernetes installation.]]></description><link>https://kelsey.id.au/k8s-baremetal-2/</link><guid isPermaLink="false">5c931fb922aad90001c3589c</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[restic]]></category><dc:creator><![CDATA[Angus Kelsey]]></dc:creator><pubDate>Tue, 02 Apr 2019 23:08:27 GMT</pubDate><media:content url="https://kelsey.id.au/content/images/2019/04/vskube.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://kelsey.id.au/content/images/2019/04/vskube.jpg" alt="Single-Node Kubernetes on Baremetal: Part Two"><p>In this edition: longer lasting data!</p>
<hr>
<p>If you're coming here from <a href="https://kelsey.id.au/kubes-on-baremetal">part one</a>, you should have a single Rancher Kubernetes node just waiting to be explored. This part will go into more detail about some important components and how to use them to your advantage.</p>
<h1 id="storageinanutshell">Storage in a Nutshell</h1>
<p>Pods in Kubernetes, just like containers in Docker, are ephemeral. As soon as you stop a pod, the data it contained is gone forever. This is only really a bad thing if you're running something that needs persistent storage, like a database or wiki.</p>
<p>Keep in mind that most micro-services are designed to be configured through config-maps, secrets and environment variables without needing to store data, but there are easy ways to provide more permanent storage.</p>
<h1 id="persistentvolumes">Persistent Volumes</h1>
<p>Storage in Kubernetes is centered around the idea of Persistent Volumes being able to fulfil claims for storage. There are many types of built-in PVs that can be useful depending on the architecture and cloud environment of your system, such as NFS or vSphere. Take a look at the <a href="https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes">official docs</a> to see a list of all supported PV types.</p>
<p>As this is a standalone server that's not a part of the cluster (and more on this in another part), you probably won't have access to many of these offerings. Luckily, you can specify that a PV should simply use the local server's storage for the files. For instance:</p>
<pre><code class="language-yaml">kind: PersistentVolume
apiVersion: v1
metadata:
  name: testing-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: &quot;/mnt/testing-pv&quot;
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: testing-pvc
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
</code></pre>
<p>This will create a 5Gi volume backed by <code>/mnt/testing-pv</code> on the server, and a PVC that will be bound to it. You can then use <code>testing-pvc</code> as the claim for a container.</p>
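<p>To actually consume the claim, reference it from a pod or deployment spec. A minimal, hypothetical pod that mounts it might look like this:</p>
<pre><code class="language-yaml">kind: Pod
apiVersion: v1
metadata:
  name: testing-pod
spec:
  containers:
  - name: web
    image: nginx:1.15
    volumeMounts:
    - mountPath: /usr/share/nginx/html
      name: testing-data
  volumes:
  - name: testing-data
    persistentVolumeClaim:
      claimName: testing-pvc
</code></pre>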
<h1 id="storageprovisioners">Storage Provisioners</h1>
<p>The above example isn't ideal in the real world. Every time you need to deploy a workload, you'll have to manually configure additional storage for each PVC that needs to be fulfilled.</p>
<p>Kubernetes has the concept of <a href="https://kubernetes.io/docs/concepts/storage/storage-classes/">storage classes</a>, which are a way to provide different kinds of storage to your cluster.</p>
<p>When you have a class set up, you can use <a href="https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/">dynamic volume provisioning</a> to automatically provision <em>just</em> the right amount of storage that any PVCs require. By setting the <code>reclaimPolicy</code> in the storage class, you can also ensure that PVCs that are destroyed will clean up after themselves by removing the underlying PV and storage on-disk.</p>
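<p>For reference, a storage class with a reclaim policy looks something like this. The <code>provisioner</code> string is specific to whichever provisioner you deploy, so treat this as a sketch:</p>
<pre><code class="language-yaml">kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-nfs
provisioner: example.com/nfs   # depends on your chosen provisioner
reclaimPolicy: Delete
</code></pre>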
<h2 id="examplenfsprovisioner">Example: NFS Provisioner</h2>
<p>As this describes a single-node cluster, it makes sense to use the local disk as a storage source for containers running on it. There is a dead-simple <a href="https://github.com/kubernetes-incubator/external-storage/tree/master/nfs">NFS provisioner</a> available that does the job well.</p>
<h3 id="installingviarancher">Installing via Rancher</h3>
<p>As the previous section dealt with getting Rancher up and running, you should be able to follow along with deploying the server via a Helm chart in the GUI to make the process extremely easy:</p>
<ol>
<li>Log into the Rancher UI and ensure you're at the Global scope:</li>
</ol>
<p><img src="https://kelsey.id.au/content/images/2019/04/rke-01.png" alt="Single-Node Kubernetes on Baremetal: Part Two"></p>
<ol start="2">
<li>Click on Catalogs, and make sure <em>Helm Stable</em> is set to <em>Enabled</em></li>
<li>Drop into the <em>Cluster: local</em> scope and click on <em>Storage ▶️ Persistent Volumes</em></li>
<li>Add a new volume</li>
<li>Choose <em>Local Node Path</em> and a sane path (e.g. <code>/mnt/ganesha</code>), ensuring the directory is created</li>
<li>Drop into the <em>System</em> scope and click <em>Catalog Apps</em> at the top</li>
<li>Launch, and select <code>nfs-provisioner</code> from Library</li>
<li>Enable persistent volume, and ensure the volume size matches the above PV</li>
</ol>
<p>Watch the deployment and ensure everything sets itself up properly (it should!).</p>
<h3 id="dynamicprovisioninginaction">Dynamic Provisioning in Action</h3>
<p>Now that a dynamic provisioner and default storage class exist, the above example can be simplified to this:</p>
<pre><code class="language-yaml">kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: testing-pvc
spec:
  storageClassName: nfs-provisioner
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
</code></pre>
<p>Since the storage class is specified (or even when it isn't, if one is set as the default), Kubernetes will automatically create a PV on the NFS &quot;server&quot; to match the PVC created. Pretty slick!</p>
<h1 id="backups">Backups</h1>
<p>So, you have containers being able to tap into the local storage on your node. Great! But what happens if that node, or its disk, fails?</p>
<h2 id="stashandrestic">Stash and restic</h2>
<p>Luckily, AppsCode have open-sourced a product called <a href="https://github.com/appscode/stash">Stash</a>: a backup system for Kubernetes that's built on top of the excellent <a href="https://restic.net/">restic</a> binary (which you should check out for your other projects).</p>
<p>In a nutshell, restic is a tool that provides a very, <em>very</em> simple way to back up to a wide range of destinations; from simple <code>rsync</code> locations to S3 or OpenStack buckets. Stash builds on top of this by creating sidecars in your deployments that access the persistent storage of your containers while they're running; the data is snapshotted, deduplicated, encrypted and sent offsite.</p>
<h2 id="installingstash">Installing Stash</h2>
<p>Just like the above <code>nfs-provisioner</code>, Stash provides a Helm chart to make it easy to install in Kubernetes. In this example I'll step through doing it via the command line instead.</p>
<p>There's not much to the process: these steps are largely taken from the <a href="https://appscode.com/products/stash/0.8.3/setup/install/">official guide</a>:</p>
<pre><code class="language-bash">helm repo add appscode https://charts.appscode.com/stable/
helm repo update
helm install appscode/stash --name stash-operator --version 0.8.3 --namespace kube-system
</code></pre>
<p>It's very important that this is installed into the <code>kube-system</code> namespace. If you've followed the last part, you will most certainly have Role-based Access Control in effect, so Stash needs to be able to create its own role bindings to operate.</p>
<h2 id="configuringbackups">Configuring Backups</h2>
<h3 id="prerequisites">Prerequisites</h3>
<p>First and foremost, you are going to need a location to back up to. The <a href="https://restic.readthedocs.io/en/stable/030_preparing_a_new_repo.html">restic docs</a> contain a list of all supported back-ends.</p>
<p>Secondly, you'll need your API credentials (e.g. <code>AWS_SECRET_ACCESS_KEY</code>) for your chosen service. I'm using BackBlaze/B2 in this example, as I've configured it that way for my services.</p>
<h3 id="choosingatarget">Choosing a Target</h3>
<p>Let's take a look at an example workload to find out what's needed to get Stash running. You can pull out the deployment's YAML with <code>kubectl get -n your-namespace deployment foobar -o yaml</code> if you don't have it handy.</p>
<p>For instance, my Gitea deployment looks something like this:</p>
<pre><code class="language-yaml">apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    git.service: gitea-web
  name: gitea-web
  namespace: git
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        git.service: gitea-web
    spec:
      containers:
      - image: gitea/gitea:1.6
        name: gitea-web
        ports:
        - containerPort: 3000
        - containerPort: 22
        resources: {}
        volumeMounts:
        - mountPath: /data
          name: gitea-web-claim0
      restartPolicy: Always
      volumes:
      - name: gitea-web-claim0
        persistentVolumeClaim:
          claimName: gitea-web-claim0
</code></pre>
<p>As shown in the YAML, the deployment is mounting <code>gitea-web-claim0</code> to <code>/data</code>, and is running in the <code>git</code> namespace. The workload can be found using the label <code>git.service: gitea-web</code>.</p>
<h2 id="tellingstashwhattodo">Telling Stash What to Do</h2>
<p>Next, you should make a secret containing your API keys and a password used to encrypt your repository:</p>
<pre><code class="language-bash">pwgen -B 16 1 | tr -d '\n' &gt; RESTIC_PASSWORD
echo -n 'your_b2_account_id' &gt; B2_ACCOUNT_ID
echo -n 'your_b2_account_key' &gt; B2_ACCOUNT_KEY
kubectl create secret generic -n git gitea-restic-secret \
    --from-file=./RESTIC_PASSWORD \
    --from-file=./B2_ACCOUNT_ID \
    --from-file=./B2_ACCOUNT_KEY
</code></pre>
<p>It's very important that all trailing whitespace is removed from these files.</p>
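<p>If you want to be certain, a tiny check like this (my own addition, not from the Stash docs) will catch a stray newline before the secret is created:</p>
<pre><code class="language-python">def assert_clean(path):
    # Fail loudly if a credential file ends in whitespace, which will
    # quietly break authentication to the backup backend later.
    with open(path, 'rb') as handle:
        data = handle.read()
    if data != data.rstrip():
        raise ValueError(path + ' has trailing whitespace')
    return True
</code></pre>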
<p>You should then create the <a href="https://appscode.com/products/stash/0.8.3/guides/backup/">spec for the restic</a> that Stash will look after:</p>
<pre><code class="language-yaml">apiVersion: stash.appscode.com/v1alpha1
kind: Restic
metadata:
  name: gitea-backups
  namespace: git
spec:
  selector:
    matchLabels:
      git.service: gitea-web
  fileGroups:
  - path: /data
    retentionPolicyName: 'keep-last-14'
  backend:
    b2:
      bucket: git-data
    storageSecretName: gitea-restic-secret
  schedule: '30 1 * * *'
  volumeMounts:
  - mountPath: /data
    name: gitea-web-claim0
  retentionPolicies:
  - name: 'keep-last-14'
    keepLast: 14
    prune: true
</code></pre>
<p>It's in the restic specification that you can dictate the schedule and number of backups to keep. You can be creative here, and have multiple schedules running multiple retention policies. This is also where you specify that the destination is B2, and that the bucket name is <code>git-data</code>.</p>
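<p>The <code>schedule</code> field is a standard five-field cron expression, so <code>'30 1 * * *'</code> means 01:30 every day. As a hypothetical variation (the policy name here is invented, and the exact retention fields should be checked against the Stash docs), you could swap the keep-last policy for a date-based one:</p>
<pre><code class="language-yaml">  fileGroups:
  - path: /data
    retentionPolicyName: 'keep-daily-30'
  retentionPolicies:
  - name: 'keep-daily-30'
    keepDaily: 30
    prune: true
</code></pre>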
<blockquote>
<p><strong>Be warned:</strong> applying the Stash spec will cause the running container to restart as the sidecar is created!</p>
</blockquote>
<p>To finish, apply your restic spec with <code>kubectl apply -n git -f git-restic.yaml</code>.</p>
<h2 id="confirmingyourbackups">Confirming Your Backups</h2>
<p>Simply run the following:</p>
<pre><code class="language-bash">kubectl get -n git repo
NAME                   BACKUP-COUNT   LAST-SUCCESSFUL-BACKUP   AGE
deployment.gitea-db    1349           51m                      56d
deployment.gitea-web   1342           21m                      56d
</code></pre>
<h1 id="conclusion">Conclusion</h1>
<p>By adding persistent storage into your Kubernetes node, and ensuring that the data is properly backed up, you can get a lot more mileage out of just a single node. While I would certainly recommend against running any kind of production workloads on just one server, for personal services this is a great compromise and ensures your data is at least kept safe.</p>
<p>In future posts I'll be exploring more ways to add security and redundancy to Kubernetes, but for now I hope you find this information helpful.</p>
<!--kg-card-end: markdown-->]]></content:encoded></item><item><title><![CDATA[Single-Node Kubernetes on Baremetal: Part One]]></title><description><![CDATA[<!--kg-card-begin: markdown--><p>A working Kubernetes installation in a single lunch break!</p>
<hr>
<p>This is a write-up of my project to get Kubernetes working as a single-node, general purpose install on a baremetal server provided by OVH.</p>
<p>It took me a long time to find the <em>right</em> information on how to do this, as</p>]]></description><link>https://kelsey.id.au/kubes-on-baremetal/</link><guid isPermaLink="false">5c4aa7a3bd18d600018beb26</guid><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Angus Kelsey]]></dc:creator><pubDate>Tue, 05 Feb 2019 00:00:00 GMT</pubDate><media:content url="https://kelsey.id.au/content/images/2019/01/rancher.jpg" medium="image"/><content:encoded><![CDATA[<!--kg-card-begin: markdown--><img src="https://kelsey.id.au/content/images/2019/01/rancher.jpg" alt="Single-Node Kubernetes on Baremetal: Part One"><p>A working Kubernetes installation in a single lunch break!</p>
<hr>
<p>This is a write-up of my project to get Kubernetes working as a single-node, general purpose install on a baremetal server provided by OVH.</p>
<p>It took me a long time to find the <em>right</em> information on how to do this, as many of the components are new with little documentation provided.</p>
<hr>
<p><strong>DISCLAIMER</strong><br>A single-node instance like the one I'm about to describe is incredibly useful for a personal system or learning tool, but should not be confused with something that's acceptable for production workloads.</p>
<h1 id="background">Background</h1>
<blockquote>
<p>Why are you doing this?</p>
</blockquote>
<p>I've had a few low-end virtual servers in Sydney and Warsaw for a while that I've been using for various tasks, but I'd been bumping up against the limits of their (tiny) resources for quite some time.</p>
<p>To compound the frustration, I've been using OVH's OpenStack object storage containers as large virtual disks (thank you <a href="http://www.rath.org/s3ql-docs/about.html">S3QL</a>). Imagine using a server based in Sydney while its main data disk sits on the other side of the world, accessed over HTTP.</p>
<p>Regardless, it was time for a change, and the recent introduction of some temptingly cheap local baremetal servers has swayed me.</p>
<h1 id="myneeds">My Needs</h1>
<p>I had to find a solution that would tick some boxes:</p>
<ul>
<li>It should have at least 200GiB of space</li>
<li>It should be easily backed up (especially my Nextcloud data)</li>
<li>It should be easy to partition different workloads and control their resource usage</li>
<li>It should be secure</li>
<li>It should be easy to manage and automate (I'm sick of the traditional special-snowflake server method)</li>
<li>It should allow multiple domains and TCP services on the same static IP</li>
</ul>
<h1 id="kubernetesinanhour">Kubernetes in an Hour</h1>
<h2 id="softwareiused">Software I Used</h2>
<ul>
<li><a href="https://github.com/rancher/os/">RancherOS</a>: a ridiculously small distro where <em>all the things</em> are Dockerised</li>
<li><a href="https://github.com/rancher/rke/">RKE</a>: the Rancher Kubernetes Engine</li>
<li><a href="https://github.com/rancher/rancher/">Rancher</a>: lubricant for Kubernetes</li>
<li><a href="https://metallb.universe.tf/">MetalLB</a>: Google's solution to software load-balancing</li>
<li><a href="https://helm.sh/">Helm and Tiller</a>: a sort-of package manager for Kubernetes</li>
</ul>
<h2 id="part1preparation">Part 1: Preparation</h2>
<p>I decided to use RancherOS for the server, as everything running on the system is dockerised and it's designed for Kubernetes. Unfortunately, this is not one of the standard OS offerings from OVH.</p>
<h3 id="osinstallation">OS Installation</h3>
<p>Luckily, OVH do give you a Java Web Start KVM for your <em>(Supermicro)</em> server with the ability to connect an ISO to the virtual disc drive. The installation ISO for RancherOS is only about 100MiB, so this is absolutely perfect.</p>
<p>RancherOS uses the <a href="https://rancher.com/docs/os/v1.2/en/configuration/">standard cloud-config</a> YAML referencing to ingest its initial settings during installation. All you need is something similar to this:</p>
<pre><code class="language-yaml">#cloud-config
hostname: your.public.dns.address

rancher:
  network:
    interfaces:
      eth0:
        dhcp: true
    dns:
      nameservers:
      # Using Google's DNS
      - 8.8.8.8
      - 8.8.4.4

ssh_authorized_keys:
  - ssh-ed25519 AAAAblahblahblahtypicalkey email@add.ress
</code></pre>
<p>Once RancherOS is up and running, you can use some of the tricks in <a href="https://gist.github.com/krisnod/56ff894f400cce7c742fb11fb2fde9cf">this helpful Gist</a> to get a software RAID running in RancherOS:</p>
<pre><code class="language-bash"># Create an empty GPT partition table on each disk (the 'g' command in fdisk)
sudo fdisk /dev/sda
sudo fdisk /dev/sdb

# Install once on each disk
sudo ros install -i rancher/os:v1.5.0 -t gptsyslinux -c cloud-config.yml -a &quot;rancher.state.mdadm_scan&quot; -d /dev/sda --no-reboot
sudo ros install -i rancher/os:v1.5.0 -t gptsyslinux -c cloud-config.yml -a &quot;rancher.state.mdadm_scan&quot; -d /dev/sdb --no-reboot

# Configure RAID
sudo mdadm --create /dev/md0 --level=1 --metadata=1.0 --raid-devices=2 /dev/sda1 /dev/sdb1

# Final sanity checks
sudo fsck /dev/md0
sudo resize2fs /dev/md0
sudo fsck /dev/md0

sudo reboot
</code></pre>
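<p>Once the node is back up, you can sanity-check that the array actually assembled before going any further:</p>
<pre><code class="language-bash"># Both members should be listed as active in the md0 array
cat /proc/mdstat
sudo mdadm --detail /dev/md0
</code></pre>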
<h3 id="kubernetesinitialisation">Kubernetes Initialisation</h3>
<p>The Rancher Kubernetes Engine is painless to set up and get going. I'll step through performing a <a href="https://rancher.com/docs/rancher/v2.x/en/installation/single-node/">single-node install</a> using Let's Encrypt as the certificate manager, but veering off the path a little bit to make things work seamlessly on the node.</p>
<p>First off you need to tell RKE to install everything it needs, using a YAML spec. There are <a href="https://rancher.com/docs/rke/v0.1.x/en/config-options/">a number of things</a> you can tweak and configure, though for my deployment I left everything at its default. Create your <code>rke.yaml</code> file for deployment:</p>
<pre><code class="language-yaml">ignore_docker_version: true
ssh_agent_auth: true
nodes:
  - address: public.dns.add.ress
    hostname_override: public.dns.add.ress
    user: rancher
    role: [controlplane,worker,etcd]

network:
  plugin: canal
  options:
    canal_iface: eth0

services:
  etcd:
    snapshot: true
    creation: 6h
    retention: 24h
</code></pre>
<p>As long as you're using <code>ssh-agent</code> for your SSH key, you don't have to specify the actual key you're using.</p>
<p>Next up, install Kubes and do some basic configuration:</p>
<pre><code class="language-bash">rke up --config rke.yaml
export KUBECONFIG=$(pwd)/kube_config_rke.yaml
kubectl -n kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
helm install stable/cert-manager --name cert-manager --namespace kube-system
helm install rancher-stable/rancher --name rancher --namespace cattle-system --set hostname=your.public.dns.name
</code></pre>
<p>You might need to space the above commands out to give each release time to deploy properly, or use <code>kubectl</code> to check the rollout status of each.</p>
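<p>For example, assuming the default deployment names from those charts (verify with <code>kubectl get deploy --all-namespaces</code>), the rollout checks look something like:</p>
<pre><code class="language-bash">kubectl -n kube-system rollout status deploy/cert-manager
kubectl -n cattle-system rollout status deploy/rancher
</code></pre>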
<p>You're done, and should have a ready-to-go basic Kubernetes node. To make it truly useful, you'll need to get down and dirty with some additional software.</p>
<h2 id="part2metallb">Part 2: MetalLB</h2>
<p>As this installation is only single-node, and I wanted to make sure that I could run all the services I needed, I had to find a way to share ports on the public IP address <em>without</em> resorting to hand-crafted iptables rules.</p>
<h3 id="whyusealoadbalancer">Why Use a Load Balancer?</h3>
<p>It might seem counter-intuitive to run a load balancer service on a single-node <em>(and single-IP)</em> installation, but you need to know how Kubernetes networking works for non-HTTP services. For the sake of a single node, you <a href="https://kubernetes.io/docs/concepts/services-networking/service/">essentially have three</a> options:</p>
<ul>
<li><code>ClusterIP</code> as part of a service for internal-only access (just like linking Docker containers)</li>
<li><code>NodePort</code> to directly expose the service on your Kubernetes node, on a random port from 30000 to 32767</li>
<li>As part of a <code>DaemonSet</code>... which I will not go into</li>
</ul>
<p>As a <code>DaemonSet</code> is not a recommended way to go about simple deployments, none of the options are especially useful for exposing a typical port (say, IMAP).</p>
<p>Enter MetalLB, Google's software implementation of load-balancing for Kubernetes. Exposing a service to the world is as simple as:</p>
<ol>
<li>Installing MetalLB</li>
<li>Telling MetalLB which IPs it can use</li>
<li>Telling a service to expose a port on a particular IP</li>
</ol>
<p>Luckily, the load balancer implementation supports sharing multiple services with unique ports on a single IP address. This is especially useful when you only <em>have</em> a single IP.</p>
<h3 id="settingitup">Setting it Up</h3>
<p>Create a YAML spec containing the IP address of your service to instantiate the default pool:</p>
<pre><code class="language-yaml">apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:  
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - your.public.ip.addr
</code></pre>
<p>Then install and apply:</p>
<pre><code class="language-bash">kubectl apply -f https://raw.githubusercontent.com/google/metallb/v0.7.3/manifests/metallb.yaml
kubectl apply -n metallb-system -f lb-pool.yaml
</code></pre>
<h3 id="exposingworkloads">Exposing Workloads</h3>
<p>Following the <a href="https://metallb.universe.tf/usage/">official documentation</a>, getting it to work is as easy as assigning your service to the specific IP and putting in an IP sharing annotation.</p>
<p>Take a look at these two sample services: a typical bastion host and a generic game server with a voice port. MetalLB will expose multiple ports on the same IP address when you set the services up:</p>
<pre><code class="language-yaml">apiVersion: v1
kind: Service
metadata:
  annotations:
    metallb.universe.tf/allow-shared-ip: ip1
  name: bastion-exposer
  namespace: bastion
spec:
  loadBalancerIP: your.public.ip.addr
  ports:
  - name: ssh
    port: 1234
    protocol: TCP
    targetPort: 22
  selector:
    service: disposable
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    metallb.universe.tf/allow-shared-ip: ip1
  labels:
    game.service: game1-exposer
  name: game1-service
  namespace: gamehost
spec:
  ports:
  - name: game1
    port: 5678
    targetPort: 5678
  - name: voice
    port: 2468
    targetPort: 2468
  selector:
    game.service: game1
  type: LoadBalancer
  loadBalancerIP: your.public.ip.addr
</code></pre>
<p>The important part of the service spec is the annotation:</p>
<pre><code class="language-yaml">metadata:
  annotations:
    metallb.universe.tf/allow-shared-ip: ip1
</code></pre>
<p>As long as the <code>metallb.universe.tf/allow-shared-ip</code> annotation value is matched for everything using the same <code>spec.loadBalancerIP</code>, every port will happily be opened up on that IP. It really is that easy.</p>
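<p>You can confirm that both services landed on the same address by listing them; each should show the shared IP under EXTERNAL-IP:</p>
<pre><code class="language-bash">kubectl get svc -n bastion bastion-exposer
kubectl get svc -n gamehost game1-service
</code></pre>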
<h2 id="part3automaticcertificates">Part 3: Automatic Certificates</h2>
<p>This last part is a rough guide on how to use <a href="https://github.com/jetstack/cert-manager/">cert-manager</a> to generate Let's Encrypt SSL HTTPS certificates on-demand for any web services you're using.</p>
<p>I'm doing this using the default <code>ingress-nginx</code> that ships with the Rancher Kubernetes Engine. The one prerequisite is that you have working DNS pointing to your web services' addresses, since we'll be using the standard HTTP challenge against Let's Encrypt's ACME service.</p>
<h3 id="creatingaclusterissuer">Creating a ClusterIssuer</h3>
<p>Cert-manager lets you configure two kinds of issuers: the standard <code>Issuer</code> resource that's bound to a single namespace, or the <code>ClusterIssuer</code> resource that's available across <strong>all</strong> namespaces. In a normal cluster you'd want namespaced issuers for good security, but for a single node where you're the only tenant, it's much easier to go with the latter.</p>
<p>Cert-manager comes with a sample cluster issuer, but it's better to create your own so you know what you're doing. Simply throw together a YAML file to deploy one, following the <a href="https://cert-manager.readthedocs.io/en/latest/reference/clusterissuers.html">official doco</a>:</p>
<pre><code class="language-yaml">apiVersion: v1
items:
- apiVersion: certmanager.k8s.io/v1alpha1
  kind: ClusterIssuer
  metadata:
    name: letsencrypt
  spec:
    acme:
      email: your@email.address
      http01: {}
      privateKeySecretRef:
        key: &quot;&quot;
        name: letsencrypt-cluster
      server: https://acme-v02.api.letsencrypt.org/directory
</code></pre>
<p>When it's done and working, you can use <code>kubectl get clusterissuer -o yaml</code> to make sure it's worked. It should output something resembling the following:</p>
<pre><code class="language-yaml">status:
  acme:
    uri: https://acme-v02.api.letsencrypt.org/acme/acct/47017662
  conditions:
  - lastTransitionTime: &quot;2018-12-04T05:31:27Z&quot;
    message: The ACME account was registered with the ACME server
    reason: ACMEAccountRegistered
    status: &quot;True&quot;
    type: Ready
</code></pre>
<h3 id="creatingvirtualsslservers">Creating Virtual SSL Servers</h3>
<p>This part took me a while to figure out—every time I deployed an ingress for a service, it never generated a certificate. The <a href="https://cert-manager.readthedocs.io/en/latest/reference/ingress-shim.html">official doco</a> for this seems to be missing (at least for me) one little piece of YAML that made it all work.</p>
<p>For example, getting this Ghost instance exposed externally required the following service and ingress spec:</p>
<pre><code class="language-yaml">---
apiVersion: v1
kind: Service
metadata:
  labels:
    ghost.service: ghost-web-service
  name: ghost-web-service
  namespace: ghost
spec:
  ports:
  - name: http
    port: 2368
    targetPort: 2368
  selector:
    ghost.service: ghost-web
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ghost-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    certmanager.k8s.io/cluster-issuer: letsencrypt
    certmanager.k8s.io/acme-challenge-type: http01
spec:
  tls:
  - hosts:
    - kelsey.id.au
    secretName: ghost-cert
  rules:
  - host: kelsey.id.au
    http:
      paths:
      - path: /
        backend:
          serviceName: ghost-web-service
          servicePort: 2368
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: ghost-cert
spec:
  secretName: ghost-cert
  dnsNames:
  - kelsey.id.au
  acme:
    config:
    - http01:
        ingressClass: nginx
      domains:
      - kelsey.id.au
  issuerRef:
    name: letsencrypt
    kind: ClusterIssuer
</code></pre>
<p>By creating a minimal <code>Certificate</code> object that references the domain and <code>ClusterIssuer</code>, <strong>and</strong> putting the <code>ClusterIssuer</code> and ACME challenge type in the <code>Ingress</code>, the issuer puts it all together and generates a certificate for you.</p>
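<p>If a certificate doesn't appear, the events on the <code>Certificate</code> resource are the first place to look (this assumes the specs above were applied to the <code>ghost</code> namespace):</p>
<pre><code class="language-bash"># The resulting secret should exist and be of type kubernetes.io/tls
kubectl -n ghost describe certificate ghost-cert
kubectl -n ghost get secret ghost-cert
</code></pre>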
<h1 id="closingthoughts">Closing Thoughts</h1>
<p>While this guide gives you a pretty neat Kubernetes instance, the resulting 'cluster' has its shortcomings, some of which I'm still trying to find a good solution to. You need to ask questions like:</p>
<blockquote>
<p>How do I provision persistent storage for my deployments?</p>
</blockquote>
<p>Take a look at the built-in <a href="https://kubernetes.io/docs/concepts/storage/storage-classes/">storage classes</a> for Kubernetes. There's not a lot on offer if you want to have flexible storage using the host's baremetal disks.</p>
<blockquote>
<p>How do I back-up and restore my data?</p>
</blockquote>
<p>If you lose the node, you lose everything, but in this case it's not as easy to set up a simple cron job to back up your data, let alone for things like <code>etcd</code>. This is a topic that I'll cover in a future article.</p>
<hr>
<p>Otherwise, I hope this was useful for you. Feel free to reach out if there's a topic you'd like me to cover as I continue learning.</p>
<!--kg-card-end: markdown-->]]></content:encoded></item></channel></rss>