One of the things I’ve been doing a lot of recently is deploying RancherOS nodes into VMware’s vCloud Director.

While automating the process entirely has provided its own set of challenges, the most difficult one to overcome has been getting the cloud-init network interface settings to be applied correctly.

The Problem

In most cloud providers, container-optimised operating systems have no real issue determining the network settings given to them (including on vSphere/ESXi hypervisors). Unfortunately, vCloud is one of the providers that seems to make things difficult.

RancherOS is able to detect the hypervisor that it’s running in, and when running in ESXi it installs open-vm-tools to read the guestinfo keys exposed to the virtual machine, including the network interface settings. As encountered in this Git issue, vCloud uses special keys in its OVF environment strings which are not compatible.

If you run vmtoolsd --cmd "info-get guestinfo.ovfEnv" on a node, you’ll get something like this:

<?xml version="1.0" encoding="UTF-8"?>
   <Environment
      xmlns="http://schemas.dmtf.org/ovf/environment/1"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:oe="http://schemas.dmtf.org/ovf/environment/1"
      xmlns:ve="http://www.vmware.com/schema/ovfenv"
      oe:id=""
      ve:vCenterId="vm-123456">
      <PlatformSection>
         <Kind>VMware ESXi</Kind>
         <Version>6.7.0</Version>
         <Vendor>VMware, Inc.</Vendor>
         <Locale>en</Locale>
      </PlatformSection>
      <PropertySection>
         <Property oe:key="vCloud_UseSysPrep" oe:value="None"/>
         <Property oe:key="vCloud_bitMask" oe:value="11"/>
         <Property oe:key="vCloud_bootproto_0" oe:value="static"/>
         <Property oe:key="vCloud_bootproto_1" oe:value="static"/>
         <Property oe:key="vCloud_computerName" oe:value="rancheros-node"/>
         <Property oe:key="vCloud_dns1_0" oe:value=""/>
         <Property oe:key="vCloud_dns1_1" oe:value=""/>
         <Property oe:key="vCloud_dns2_0" oe:value=""/>
         <Property oe:key="vCloud_dns2_1" oe:value=""/>
         <Property oe:key="vCloud_gateway_0" oe:value="192.168.0.1"/>
         <Property oe:key="vCloud_gateway_1" oe:value="192.168.1.1"/>
         <Property oe:key="vCloud_ip_0" oe:value="192.168.1.100"/>
         <Property oe:key="vCloud_ip_1" oe:value="192.168.2.50"/>
         <Property oe:key="vCloud_ip_2" oe:value="10.0.0.6"/>
         <Property oe:key="vCloud_macaddr_0" oe:value="00:50:56:01:01:01"/>
         <Property oe:key="vCloud_macaddr_1" oe:value="00:50:56:02:02:02"/>
         <Property oe:key="vCloud_markerid" oe:value="uuid-uuid-uuid-uuid"/>
         <Property oe:key="vCloud_netmask_0" oe:value="255.255.255.0"/>
         <Property oe:key="vCloud_netmask_1" oe:value="255.255.255.0"/>
         <Property oe:key="vCloud_netmask_2" oe:value="255.255.255.0"/>
         <Property oe:key="vCloud_numnics" oe:value="2"/>
         <Property oe:key="vCloud_primaryNic" oe:value="0"/>
         <Property oe:key="vCloud_reconfigToken" oe:value="1234567"/>
         <Property oe:key="vCloud_resetPassword" oe:value="0"/>
         <Property oe:key="vCloud_suffix_0" oe:value=""/>
         <Property oe:key="vCloud_suffix_1" oe:value=""/>
      </PropertySection>
      <ve:EthernetAdapterSection>
         <ve:Adapter ve:mac="00:50:56:01:02:03" ve:network="rancher_network" ve:unitNumber="7"/>
      </ve:EthernetAdapterSection>
   </Environment>

As you can see, vCloud uses its own vCloud_ prefix for every item, and on top of that does not provide the correct information in some cases:

  • The reported number of NICs is lower than the number actually presented to the system
  • vCloud incorrectly presents gateways for non-routed networks
  • DNS settings do not always come through
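
The first of these is visible in the sample output above: vCloud_numnics claims two NICs, while vCloud_ip_0 through vCloud_ip_2 are all present. A minimal sketch of a check for that mismatch, assuming the properties have already been parsed into a plain dictionary (the function name count_ip_keys is made up for this example and is not part of the script further down):

```python
import re

def count_ip_keys(props):
    """Count the distinct NIC indexes that have a vCloud_ip_N property."""
    pattern = re.compile(r"^vCloud_ip_(\d+)$")
    indexes = {int(m.group(1)) for m in map(pattern.match, props) if m}
    return len(indexes)

# Values copied from the sample OVF environment above
props = {
    "vCloud_numnics": "2",
    "vCloud_ip_0": "192.168.1.100",
    "vCloud_ip_1": "192.168.2.50",
    "vCloud_ip_2": "10.0.0.6",
}

reported = int(props["vCloud_numnics"])
found = count_ip_keys(props)
if found != reported:
    print("vCloud reports %d NICs but exposes %d addresses" % (reported, found))
```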

To add another layer of complexity, when vCloud creates the virtual machines, it is not consistent with the ordering of NICs. For instance, on some nodes I have found that the virtual machine sees its first NIC as eth1 and its second NIC as eth0, which means you cannot assume that the string vCloud_ip_0 matches eth0 within the host.
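
One way to see which kernel name a given vCloud index actually ended up with is to match on the MAC address. A rough sketch, assuming a Linux guest that exposes /sys/class/net and a dictionary of already-parsed vCloud properties; both helper names are hypothetical:

```python
import os

def macs_to_names(sys_net="/sys/class/net"):
    """Return {mac: interface name} for the interfaces the kernel currently sees."""
    mapping = {}
    if not os.path.isdir(sys_net):
        return mapping
    for name in sorted(os.listdir(sys_net)):
        address_file = os.path.join(sys_net, name, "address")
        try:
            with open(address_file) as handle:
                mapping[handle.read().strip().lower()] = name
        except OSError:
            # Some entries in this directory are not interfaces; skip them
            continue
    return mapping

def name_for_index(props, index, mapping):
    """Look up the real interface name for vCloud NIC number `index`, or None."""
    mac = props.get("vCloud_macaddr_%d" % index, "").lower()
    return mapping.get(mac)

# MAC address copied from the sample OVF environment above
props = {"vCloud_macaddr_0": "00:50:56:01:01:01"}
print(name_for_index(props, 0, macs_to_names()))
```

On a node where the ordering has been swapped, this would report eth1 for index 0 rather than eth0.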

The Solution

As the name of the game here is automation, I needed something that fit in with the deployment process:

  1. Build template with Packer and upload to vCloud
  2. Deploy clusters with Terraform, including specific network settings
  3. Deploy Kubernetes with RKE

This means that I want to use a generic cloud-config once while building the machine and not have to manually set the network on each deployed node post-deployment.

I ended up going with a slapped-together Python script that does the following:

  1. Obtain the current cloud-config data reported by ros config export
  2. Obtain the guestinfo keys exposed by VMware
  3. Parse the VMware data into a dictionary
  4. Compare with what’s configured on the node and see what needs changing
  5. Verify the new configuration, write it and reboot

This gets written to the host on boot using the initial cloud-config.yaml file, and run after the Ubuntu console ensures Python and the pip dependencies are installed.

The main script will be kept updated here, but to give you an idea:

#!/usr/bin/env python3
__author__ = "Angus Kelsey <nlseven@users.noreply.github.com>"
__copyright__ = "Copyright 2019"
__license__ = "MIT"
from xml.dom.minidom import parseString
import warnings
import sys
import subprocess
from yaml import load, dump
from yaml import CLoader as Loader, CDumper as Dumper

def live_data():
   # Export the current list of network interfaces from VMware,
   # as well as the current cloud-config from RancherOS
   grab_vcloud = "/usr/bin/vmtoolsd --cmd 'info-get guestinfo.ovfEnv'"
   grab_ros = "/usr/bin/sudo ros config export"
   dasXML = subprocess.Popen(grab_vcloud,shell=True,stdout=subprocess.PIPE).stdout.read()
   dasYAML = subprocess.Popen(grab_ros,shell=True,stdout=subprocess.PIPE).stdout.read()
   return dasXML, dasYAML

def create_dict(rawData):
   # Convert the OVFEnv XML into a dictionary
   ovfData = parseString(rawData)
   ovfKeys = {}
   # Only get keys in the <Property> lines
   for ovfkey in ovfData.getElementsByTagName('Property'):
      key, value = [ ovfkey.attributes['oe:key'].value,
                     ovfkey.attributes['oe:value'].value ]
      ovfKeys[key] = value
   return ovfKeys

def builder(ovfKeys):
   # Build up a dictionary of network interfaces that we can return, for up to three NICs
   # Note: this could be done a whole lot better

   # Check for a primary NIC, and die if it doesn't exist
   # as sometimes vCloud doesn't set the guestinfo
   if 'vCloud_ip_0' not in ovfKeys:
      warnings.warn("Can't find a primary NIC! Machine may need to restart first.",Warning)
      sys.exit(1)
   ovfKeys["interfaces"] = {}
   # In these loops: check for either one, two or three interfaces.
   # Set the MAC addresses as the interface names because vCloud/Terraform likes to screw up the order
   if all (key in ovfKeys for key in ("vCloud_ip_0", "vCloud_ip_1", "vCloud_ip_2")):
      print("Found three NICs!")
      ovfKeys["num"] = 3
      # Convert netmask binary bits into CIDR slash notation
      eth0cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys["vCloud_netmask_0"].split('.'))
      eth1cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys["vCloud_netmask_1"].split('.'))
      eth2cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys["vCloud_netmask_2"].split('.'))
      # Update the IPs to use as proper CIDRs
      ovfKeys["vCloud_ip_0"] += "/" + str(eth0cidr)
      ovfKeys["vCloud_ip_1"] += "/" + str(eth1cidr)
      ovfKeys["vCloud_ip_2"] += "/" + str(eth2cidr)
      # Use Rancher's format: https://rancher.com/docs/os/v1.2/en/networking/interfaces/
      key1 = 'mac=' + ovfKeys["vCloud_macaddr_0"]
      key2 = 'mac=' + ovfKeys["vCloud_macaddr_1"]
      key3 = 'mac=' + ovfKeys["vCloud_macaddr_2"]
      # Build an interfaces dictionary in case we need to save the YAML data
      # Use the MAC address as the name in case the interfaces aren't in right order
      ovfKeys["interfaces"] = {
         key1: {'address': ovfKeys["vCloud_ip_0"],'gateway': ovfKeys["vCloud_gateway_0"], 'dhcp': False},
         key2: {'address': ovfKeys["vCloud_ip_1"], 'dhcp': False},
         key3: {'address': ovfKeys["vCloud_ip_2"], 'dhcp': False}
         }

   # Repeat for 2 NICs... then 1...
   elif all (key in ovfKeys for key in ("vCloud_ip_0", "vCloud_ip_1")):
      print("Found two NICs!")
      ovfKeys["num"] = 2
      eth0cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys["vCloud_netmask_0"].split('.'))
      eth1cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys["vCloud_netmask_1"].split('.'))
      ovfKeys["vCloud_ip_0"] += "/" + str(eth0cidr)
      ovfKeys["vCloud_ip_1"] += "/" + str(eth1cidr)
      # Use Rancher's format: https://rancher.com/docs/os/v1.2/en/networking/interfaces/
      key1 = 'mac=' + ovfKeys["vCloud_macaddr_0"]
      key2 = 'mac=' + ovfKeys["vCloud_macaddr_1"]
      ovfKeys["interfaces"] = {
         key1: {'address': ovfKeys["vCloud_ip_0"],'gateway': ovfKeys["vCloud_gateway_0"], 'dhcp': False},
         key2: {'address': ovfKeys["vCloud_ip_1"], 'dhcp': False}
         }
   else:
      print ("Single NIC found!")
      ovfKeys["num"] = 1
      eth0cidr = sum(bin(int(digit)).count('1') for digit in ovfKeys["vCloud_netmask_0"].split('.'))
      ovfKeys["vCloud_ip_0"] += "/" + str(eth0cidr)
      # Use Rancher's format: https://rancher.com/docs/os/v1.2/en/networking/interfaces/
      key1 = 'mac=' + ovfKeys["vCloud_macaddr_0"]
      ovfKeys["interfaces"] = {
         key1: {'address': ovfKeys["vCloud_ip_0"],'gateway': ovfKeys["vCloud_gateway_0"], 'dhcp': False}}
   return ovfKeys

def exit_lane(ovfKeys,ros_yaml):
   # Check if configuration needs updating. If not, exit this script!
   ovfKeys["update"] = False
   # cconfig == cloud-config, which we grabbed from ros export - load it in as a dictionary
   cconfig = load(ros_yaml,Loader=Loader)
   # Check if we have a proper network interfaces section at all
   # (use .get() so a missing rancher/network section doesn't raise KeyError)
   if "interfaces" not in cconfig.get("rancher", {}).get("network", {}):
       print("No interfaces section exists! Need to create.")
       ovfKeys["update"] = True
       return
   # Iterate through X interfaces reported by vCloud, and check their config
   for eth in range (0,ovfKeys["num"]):
      currInterface = "mac=" + ovfKeys["vCloud_macaddr_"+str(eth)]
      # Is there a config section for this particular MAC address/interface?
      # If not, flag for update and stop checking this NIC
      if currInterface not in cconfig["rancher"]["network"]["interfaces"]:
         print("Configuration does not contain " + currInterface + ". Need to add.")
         ovfKeys["update"] = True
         continue
      # If so, is the config correct?
      if cconfig["rancher"]["network"]["interfaces"][currInterface]["address"] == ovfKeys["vCloud_ip_"+str(eth)]:
         print ("Interface " + currInterface + " is up to date.")
      else:
         print ("Interface " + str(eth) + " needs updating.")
         ovfKeys["update"] = True
   # All interfaces checked. Exit the script if no updates are needed.
   if not ovfKeys["update"]:
      print("Everything is up to date. Bye!")
      sys.exit(0)

def ros_yaml(ovfKeys,ros_yaml):
   # Update required, this function will render the YAML, merge config and reboot
   #
   # Create a skeleton dictionary
   newSpec = {'rancher': {'network': {}}}
   # Copy what we built back in builder() into the skeleton
   newSpec["rancher"]["network"]["interfaces"] = ovfKeys["interfaces"].copy()
   # Let the user see some pretty YAML
   print(dump(newSpec,default_flow_style=False))
   # Save it to /tmp
   with open('/tmp/ros_yaml.yaml','w') as cloud_config:
      cloud_config.write(dump(newSpec,default_flow_style=False))
   # Ensure it doesn't trip up RancherOS - wait for verify to complete
   ros_verify = "/usr/bin/sudo ros config validate -i /tmp/ros_yaml.yaml"
   validator = subprocess.Popen(ros_verify,shell=True,stdout=subprocess.PIPE)
   validator.communicate()[0]
   is_valid = validator.returncode
   # Confusing variable name - an exit code of 0 means it IS valid
   if is_valid == 0:
      print("Rancher has verified the change. Merging and rebooting ...")
      ros_merge = "/usr/bin/sudo ros config merge -i /tmp/ros_yaml.yaml"
      # Wait for the merge to finish (and stop if it fails) before rebooting
      subprocess.run(ros_merge,shell=True,check=True)
      subprocess.run(["sudo","reboot"])
   else:
      # I don't know what else to check for if it doesn't validate
      print("Something went wrong. ROS returned " + str(is_valid) + ",\ntried running "+ ros_verify)
      sys.exit(1)  

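# Now do it in the correct order
# Note: the exported config must not be stored in a variable called ros_yaml,
# as that would shadow the ros_yaml() function and make the last call fail
data, current_config = live_data()
parsed = create_dict(data)
parsed = builder(parsed)
exit_lane(parsed,current_config)
ros_yaml(parsed,current_config)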

To make sure it runs properly on the nodes, you need to include some extras in your initial configuration:

#cloud-config.yaml
runcmd:
- apt update
- sudo apt install -y open-vm-tools python3 python3-pip cython3 libyaml-dev
- pip3 install pyyaml -q
- python3 /tmp/netset.py
rancher:
  console: ubuntu
write_files:
- content: |+
    # python script
    # goes here
  owner: rancher
  path: /tmp/netset.py
  permissions: "0770"

The script has performed exactly as I need it to and now all of the network settings are set correctly after the machine has been deployed.

Digging Further

If you read through the script, you’ll see that I’m using the MAC addresses as the keys for the network settings to avoid using interface names, in order to work around VMware’s quirky behaviour during deployment.

This is documented on the official RancherOS pages, and essentially means you can use mac=00:01:02:03:04:05: instead of eth0:.

The script also takes care of converting the netmask of the interfaces into the slash notation used by Rancher, thanks to this Stack Overflow post.
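
To show the conversion in isolation, here is a small sketch. The first function is the same bit-counting approach the script uses; the second is an alternative using Python’s standard ipaddress module, which is not what the script does but has the advantage of rejecting non-contiguous masks such as 255.0.255.0. Both function names are made up for this example.

```python
import ipaddress

def netmask_to_prefix(netmask):
    """Count the set bits in a dotted-quad netmask, as the script does."""
    return sum(bin(int(octet)).count("1") for octet in netmask.split("."))

def netmask_to_prefix_strict(netmask):
    """Same result via the standard library, which raises on invalid masks."""
    return ipaddress.IPv4Network("0.0.0.0/" + netmask).prefixlen

# Address and netmask copied from the sample OVF environment
print("192.168.1.100/%d" % netmask_to_prefix("255.255.255.0"))
```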

Drawbacks and Possible Improvements

Unfortunately, this is still only a hack-job at best. There are some situations where this will not work, and some things that could be done a lot better.

Main Loop

The script only supports up to three NICs, and only because each NIC count is handled by its own hard-coded branch rather than by a loop over however many it finds. In the future I’d like to make this a lot more elegant, but for my environment this isn’t so much of a problem.
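
A loop over the NIC index would remove the three-NIC limit. The following is only a sketch of that idea, not the current script: build_interfaces is a hypothetical name, it leaves the input dictionary untouched, and it stops at the first index that is missing an IP, MAC address or netmask.

```python
def build_interfaces(props):
    """Build a RancherOS-style interfaces dict for every fully described NIC."""
    interfaces = {}
    index = 0
    while all("vCloud_%s_%d" % (field, index) in props
              for field in ("ip", "macaddr", "netmask")):
        netmask = props["vCloud_netmask_%d" % index]
        # Convert the dotted-quad netmask into a CIDR prefix length
        prefix = sum(bin(int(octet)).count("1") for octet in netmask.split("."))
        entry = {
            "address": "%s/%d" % (props["vCloud_ip_%d" % index], prefix),
            "dhcp": False,
        }
        # Like the script above, only the first NIC is given a gateway
        if index == 0 and props.get("vCloud_gateway_0"):
            entry["gateway"] = props["vCloud_gateway_0"]
        interfaces["mac=" + props["vCloud_macaddr_%d" % index]] = entry
        index += 1
    return interfaces

# A subset of the sample OVF environment: three IPs but only two MAC addresses
sample = {
    "vCloud_ip_0": "192.168.1.100", "vCloud_netmask_0": "255.255.255.0",
    "vCloud_macaddr_0": "00:50:56:01:01:01", "vCloud_gateway_0": "192.168.0.1",
    "vCloud_ip_1": "192.168.2.50", "vCloud_netmask_1": "255.255.255.0",
    "vCloud_macaddr_1": "00:50:56:02:02:02",
    "vCloud_ip_2": "10.0.0.6", "vCloud_netmask_2": "255.255.255.0",
}
for name, settings in build_interfaces(sample).items():
    print(name, settings)
```

With the sample data this produces two interfaces, because the third address has no matching MAC address to key it on.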

In addition, the decision to update is based solely on comparing the address of each interface against the cloud-config settings, and does not take into account changes to gateways, suffixes, etc.
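
Comparing more than the address could look something like the sketch below. It assumes both sides are plain dictionaries in the shape the script builds (address, gateway, dhcp); the function name and the default list of fields are made up for this example.

```python
def interface_needs_update(current, wanted, fields=("address", "gateway", "dhcp")):
    """Return True if any of the compared fields differs between the two dicts."""
    return any(current.get(field) != wanted.get(field) for field in fields)

# What the node currently has versus what vCloud now reports (example values)
current = {"address": "192.168.1.100/24", "gateway": "192.168.0.1", "dhcp": False}
wanted = {"address": "192.168.1.100/24", "gateway": "192.168.1.1", "dhcp": False}

print(interface_needs_update(current, wanted))
```

Here the address is unchanged but the gateway differs, so the interface is flagged for an update where an address-only check would skip it.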

Multiple Gateways

A few times, VMware has decided to assign a gateway to a network interface attached to a private, non-routable network. Terrible design aside, this becomes a problem when the interface order is wrong: the script assumes that vCloud_gateway_0 is the correct gateway, which can cause headaches.

An updated version of the script that will soon be rolled into Git has an additional loop which assigns as many gateways as are reported, and this does work without any obvious problems, but it’s nasty. I do not want private traffic being routed when it doesn’t have to be, and ultimately you will need to tailor the script to fit your particular environment.
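
One option I have not tested is to take the gateway from whichever NIC the vCloud_primaryNic property points at, rather than always using index 0. The sketch below assumes that property reliably identifies the routed network, which may well not hold in every environment; pick_gateway is a hypothetical helper.

```python
def pick_gateway(props):
    """Return (nic_index, gateway) for the NIC named by vCloud_primaryNic, or None."""
    primary = props.get("vCloud_primaryNic", "0")
    gateway = props.get("vCloud_gateway_" + primary, "")
    if not gateway:
        return None
    return int(primary), gateway

# Values copied from the sample OVF environment
props = {
    "vCloud_primaryNic": "0",
    "vCloud_gateway_0": "192.168.0.1",
    "vCloud_gateway_1": "192.168.1.1",
}
print(pick_gateway(props))
```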

Manipulating the OVF Directly

If I were to fix this the right way during deployment for next time, I would instead look at writing a Terraform provider to set the correct cloud-config strings directly into the host using the vCloud ProductSection API.

The advantage of this is that once Terraform has applied a cluster deployment, it can go through and configure the ProductSection of the VMs to match the network settings supplied. That removes the need for the Python script entirely, as RancherOS (and many other operating systems using cloud-init) will accept the correct keys out-of-the-box.

The tricky bit to making this work is that it’s a PUT operation, meaning you’ll first have to pull the current ProductSection, update what you need to and push back the entire document. There’s no way to patch just the sections you need to.
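
The read-modify-write pattern could be sketched roughly as follows. This deliberately leaves out the HTTP side: fetch and push are hypothetical callables standing in for the GET and PUT requests, the XML is a heavily simplified stand-in for the real vCloud document (which has more elements and namespaces), and example_key is a placeholder rather than a real guestinfo key.

```python
import xml.etree.ElementTree as ET

OVF = "http://schemas.dmtf.org/ovf/envelope/1"
ET.register_namespace("ovf", OVF)

def with_property(document, key, value):
    """Return the whole document as a string with one ovf:Property set to `value`."""
    root = ET.fromstring(document)
    section = root.find("{%s}ProductSection" % OVF)
    if section is None:
        raise ValueError("no ProductSection in document")
    for prop in section.findall("{%s}Property" % OVF):
        if prop.get("{%s}key" % OVF) == key:
            prop.set("{%s}value" % OVF, value)
            break
    else:
        # Key not present yet: append a new Property element
        ET.SubElement(section, "{%s}Property" % OVF, {
            "{%s}key" % OVF: key,
            "{%s}type" % OVF: "string",
            "{%s}value" % OVF: value,
        })
    return ET.tostring(root, encoding="unicode")

def update_product_section(fetch, push, key, value):
    """Pull the full document, change one property, push the full document back."""
    push(with_property(fetch(), key, value))

# Simplified stand-in document and in-memory stand-ins for GET and PUT
SAMPLE = (
    '<ProductSectionList xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1">'
    '<ovf:ProductSection>'
    '<ovf:Property ovf:key="example_key" ovf:type="string" ovf:value="old"/>'
    '</ovf:ProductSection>'
    '</ProductSectionList>'
)
pushed = []
update_product_section(lambda: SAMPLE, pushed.append, "example_key", "new")
print(pushed[0])
```

The point is that whatever gets pushed is always the complete document, with the untouched properties carried along unchanged.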

Further Reading

It looks like users of CoreOS have stumbled upon a similar issue, and someone has made an elegant program in Go to handle vCloud’s weird networking. You can see it in this Git repo, but looking through it, it also relies on the gateway being set on the correct interface.