Updated by Agile Stacks
KubeFlex has been designed to be a highly automated solution to difficult on-prem problems. We strive to deliver an experience that works out of the box with minimal effort. Here you'll find key information about the physical infrastructure KubeFlex expects to work with and how you can get the most from it.
Read on for more detail about all requirements, but the gist of what you need is:
- An Agile Stacks account
- Some number of hosts (virtual or physical) that can PXE boot
- A host for our "Boot Proxy" container
- Internet access
Agile Stacks Control Plane
The central feature of KubeFlex is extending the power of your Agile Stacks stacks to physical environments without sacrificing functionality. This means KubeFlex is deeply integrated with the Agile Stacks Control Plane and therefore an Agile Stacks account is required to operate KubeFlex.
The Control Plane lets you manage multiple KubeFlex clusters and enables fully automated Kubernetes deployments on both bare-metal and virtualized infrastructure.
One philosophy driving KubeFlex design is that Agile Stacks users will prefer to treat clusters as "cattle" (i.e., not "pets") and will use entire clusters for environment separation (i.e., dev, test, prod) or multi-tenancy. With its default settings, KubeFlex is best at handling many clusters of a smaller size (i.e., dozens of nodes), but it is able to handle clusters of up to hundreds of nodes.
Due to its high adaptability, KubeFlex is well suited to brown-field deployments, where there may be existing infrastructure or other clusters to contend with. There are a variety of ways to extend, adapt, and otherwise modify the clusters that KubeFlex creates so that it will work in your specific situation. Furthermore, Agile Stacks is able to work with you to determine the best course of action for the most challenging situations so you are not on your own.
KubeFlex assumes that there are logically separate WANs and LANs.
Good internet access is critical to the operation of a KubeFlex cluster.
WANs must allow outbound internet communication to the Agile Stacks Control Plane which happens over standard HTTP/S protocols. Internet access via NAT on the WAN is perfectly fine. DNS resolution of public names must work correctly.
Furthermore, there are a number of software components that will, by default, expect to access the internet for one reason or another. For example, KubeFlex nodes will want to update their OS packages, Kubernetes will want to update its components, and many containers will need to access internet APIs. Restricting internet access for KubeFlex clusters will lead to unpredictable results ranging from using outdated packages to entirely broken services. If some level of control is desired over this situation, Agile Stacks can help you tailor a solution that works.
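The WAN requirements above can be sanity-checked from any host on the network. This is a minimal sketch; the hostname used here (Docker Hub) is only an example endpoint, so substitute the actual Agile Stacks endpoints for your account.

```shell
check_https() {
  # prints "ok" when an HTTPS request succeeds within 5 seconds, else "fail"
  if curl -fsS --max-time 5 -o /dev/null "$1" 2>/dev/null; then
    echo "ok"
  else
    echo "fail"
  fi
}

check_dns() {
  # prints "ok" when the public name resolves, else "fail"
  if getent hosts "$1" >/dev/null 2>&1; then echo "ok"; else echo "fail"; fi
}

echo "DNS (hub.docker.com):   $(check_dns hub.docker.com)"
echo "HTTPS (hub.docker.com): $(check_https https://hub.docker.com)"
```

Both checks should print "ok" from any WAN-connected host; a "fail" usually points at a blocked egress rule or broken DNS resolution.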
Lastly, when a KubeFlex cluster is deployed in "hybrid" mode, KubeFlex will also create a private WAN with the Agile Stacks Control Plane. This allows you to conveniently access services deployed to your on-prem clusters over the internet without needing to share the same local network. Alternatively, when that is not needed or wanted, KubeFlex supports deploying clusters in a "standalone" mode where you are responsible for managing your own access to services.
LANs must allow ProxyDHCP propagation from the Agile Stacks Boot Proxy, either by residing in the same layer-2 domain or using DHCP helpers if needed. DHCP Snooping, if in use, should trust the Boot Proxy traffic. A single Boot Proxy may service many domains, clusters, and nodes. There may also be many Boot Proxies servicing the same physical infrastructure but the operator must take care to ensure that they do not respond to the same nodes.
KubeFlex currently assumes that all node-to-node communication will be in a LAN and that all nodes in the same cluster will be in the same layer-2 domain. Multiple clusters may occupy the same layer-2 domain when MAC filtering is applied. Other clusters in other layer-2 domains are supported as long as a Boot Proxy can service the domain.
Advanced networking options to support additional security layers, multi-link interfaces, or dedicated high-performance networks are also possible and Agile Stacks can help you get started.
Outbound Network Access
Even for isolated clusters, the Kubernetes cluster and any stacks deployed on top of it must reach out to the internet to fetch container images. In addition, the OS installation uses cloud-init scripts that rely on Ubuntu package registries as well as Python PIP repositories.
The cluster is also controlled by our Automation Hub, which directs the construction of the Kubernetes cluster and the installation of application stacks.
For these reasons, outbound network access is required on ports 80, 443, and 8888 (HTTP).
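Outbound reachability on those ports can be verified with a small sketch like the following; Docker Hub is used purely as an example destination, so point it at the endpoints relevant to your deployment.

```shell
check_port() {
  # prints "open" if a TCP connection to host:port succeeds within 3 seconds
  host=$1; port=$2
  if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

for port in 80 443 8888; do
  echo "hub.docker.com:$port -> $(check_port hub.docker.com "$port")"
done
```

A "closed" result for a port your cluster needs typically means a firewall or proxy rule must be adjusted before deployment.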
These requirements can be softened, altered or removed with some customization.
A "machine" is a computer that KubeFlex will transform into a Kubernetes node. Machines may be large data center servers or small single-board systems as long as they meet some basic requirements:
- 2+ CPU cores, x86-64 architecture
- 3GB+ of RAM
- 10GB+ of storage on at least 1 device
- Able to PXE boot from a suitable network interface
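The minimums above can be checked on a candidate machine with a short preflight script. This is a sketch using standard Linux tools (the PXE capability itself must be checked in firmware settings):

```shell
# Thresholds mirror the minimum machine specs listed above.
MIN_CORES=2
MIN_RAM_MB=3072
MIN_DISK_GB=10

cores=$(nproc)
ram_mb=$(( $(awk '/MemTotal/ {print $2}' /proc/meminfo) / 1024 ))
# portable df invocation: size of the filesystem backing /, in GB
disk_gb=$(df -Pk / | awk 'NR==2 {print int($2/1024/1024)}')

echo "CPU cores: $cores (need >= $MIN_CORES)"
echo "RAM: ${ram_mb}MB (need >= $MIN_RAM_MB)"
echo "Root disk: ${disk_gb}GB (need >= $MIN_DISK_GB)"

if [ "$cores" -ge "$MIN_CORES" ] && [ "$ram_mb" -ge "$MIN_RAM_MB" ] \
   && [ "$disk_gb" -ge "$MIN_DISK_GB" ]; then
  echo "PASS"
else
  echo "FAIL: below minimum specs"
fi
```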
Network booting via PXE is the only mode of booting that KubeFlex currently supports because each boot contains unique code for the machine and the current state of your clusters, ensuring that your machines are running only what they need to.
A storage device and a little extra RAM are needed because we run your OS in RAM and bulk data gets offloaded to the storage device. This means that every time a machine is rebooted you can be sure that it is running a fresh OS while at the same time ensuring bulk data does not have to be recreated.
In order to have the best experience, here are some tips for configuring your systems for KubeFlex:
- If using LACP, configure a PXE-compatible fallback mode in your network
- EFI boot mode is preferred (though legacy will work too)
- Set BIOS to boot from the desired NIC first to optimize boot times
- Disable booting from other devices to optimize boot times and simplify retries
- Toggle PCI Option ROMs as needed (enabled for devices needed when booting; disabled otherwise)
- Check for firmware updates of BIOS and/or NIC if encountering trouble (especially if the behavior doesn't make sense)
- Reach out to Agile Stacks, we can help!
Because of Kubernetes' dynamic nature, dynamically updatable DNS is a critical piece of infrastructure for publishing web services, whether internally or externally.
To serve internal customers:
We can ship a DNS server that will publish the exposed ingresses of your Kubernetes cluster, and you can simply delegate the subdomain to the server that we supply. Or, if your DNS server can support RFC2136 dynamic DNS updates, we can send updates to it directly.
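If you go the RFC2136 route, the updates sent to your server follow the standard dynamic-update format, as produced by tools such as `nsupdate`. A sketch of what such an update looks like (server and zone names are placeholders):

```
server ns1.corp.example 53
zone apps.corp.example
update delete app.apps.corp.example. A
update add app.apps.corp.example. 300 A 10.0.0.42
send
```

Your DNS server must be configured to permit dynamic updates for the delegated zone, typically authenticated with a TSIG key.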
To serve external customers on the internet:
Our DNS management system can integrate with dozens of internet-based DNS providers, including Route53, Dyn, One.com, and many others.
Certificate Management System
In order to serve web services or other assets over TLS using a Kubernetes Cluster, TLS Certificates are required.
If you don't have a CMS or a Root Certificate already, we strongly recommend using an ICANN Top Level Domain for your certificate. This eliminates a common problem of having to distribute your root CA certificate to every container that needs to access your cluster. We can manage the Root CA and deploy a certificate management system, but you will need to supply the Root CA Certificate and Key.
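If you need to create a Root CA Certificate and Key from scratch, a minimal sketch with OpenSSL follows; the subject name and validity period are placeholders, and you should protect the resulting key material accordingly.

```shell
# Generate a self-signed Root CA certificate and private key (placeholder
# subject; 10-year validity). Keep rootca.key secret.
openssl req -x509 -newkey rsa:4096 -sha256 -days 3650 -nodes \
  -keyout rootca.key -out rootca.crt \
  -subj "/CN=Example Corp Internal Root CA"

# Inspect the result before handing it over
openssl x509 -in rootca.crt -noout -subject -dates
```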
If you already run your own internal CA, we can integrate with that as well.
The Boot Proxy piggybacks on top of existing DHCP systems. It is not meant to replace them, but to complement them. For this reason, KubeFlex must be deployed in an L2 network with a working DHCP server. It also greatly simplifies things if this server can be configured either to use the KubeFlex-provided DNS server, or to delegate specific zones to it.
It is also a requirement that hostnames within the system be unique and somewhat meaningful. This is a requirement of Kubernetes, as it uses the hostname for the naming of nodes, and for finding nodes via DNS. The DHCP system should assign hostnames and update the DNS server with its entries.
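A quick per-node sanity check of the hostname requirement can look like this sketch; it verifies that the node has a hostname and that the hostname resolves in DNS.

```shell
# Kubernetes uses the hostname for node naming and node discovery via DNS,
# so each node's hostname should be unique and resolvable.
h=$(hostname -f 2>/dev/null || hostname)
echo "hostname: $h"
getent hosts "$h" || echo "WARNING: $h does not resolve in DNS"
```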
In order to achieve the goal of truly "Location Independent" containers in Kubernetes, they need to be able to start on any node in the cluster.
On-prem deployments with traditional storage face a serious bottleneck: their storage is fixed to the local machine. If you have a stateful application running on a host that crashes, that stateful application must either start up on a separate host without its state, or wait until the node returns. Neither is a good scenario. For this reason, we recommend a storage solution such as Ceph, Portworx, or LINSTOR to make logical disks that can move with their containers around your cluster.
We can also support external storage solutions that are based around NFS or S3.
It is worth noting that, unlike many other technologies, KubeFlex does not assume a machine has a BMC. In the KubeFlex system, BMC protocols such as IPMI and Redfish (or proxy products for them) are handled as optional add-ons. For most deployments nothing is lost with this model, while it enables KubeFlex to manage smaller, inexpensive machines that do not possess a BMC at all (such as an Intel NUC).
By default, this means you need to press the power button; KubeFlex will take over and do the rest.
Boot Proxy Host
Boot Proxy is a lightweight agent that communicates with KubeFlex and is responsible for PXE booting bare-metal machines in the data center, on the edge, maybe in an office or store. KubeFlex requires a Boot Proxy anywhere it will be managing machines.
Although Agile Stacks supplies Boot Proxy as a container via Docker Hub, it needs a host to run on. Boot Proxy has very low compute requirements but it does need thoughtful network connectivity. This document has already mentioned it but we will reiterate that Boot Proxy, and thus its host too, needs outbound internet access and layer-2 access to any network where it will be booting machines. Besides that, a modest VM or similar (1+ CPU, 2GB RAM, and 10GB+ storage) would suffice.
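Running the Boot Proxy container typically looks like the following sketch. The image name, tag, and flags here are assumptions, so consult Agile Stacks for the exact invocation; host networking is assumed so the proxy can see DHCP broadcast traffic on the LAN.

```shell
IMAGE="agilestacks/boot-proxy:latest"   # placeholder image name
RUN_CMD="docker run -d --name boot-proxy --restart=always --net=host $IMAGE"
echo "$RUN_CMD"   # review the command, then execute it with: eval "$RUN_CMD"
```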
The system expects all clocks on all participating nodes to be synchronized to within 1 second. An internal NTP installation is recommended; if internal NTP is not available, then ports must be opened to reach regional NTP strata.
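One common way to meet the 1-second requirement is to point every node at internal NTP servers. A sketch, assuming chrony as the client (server names are placeholders):

```
# /etc/chrony/chrony.conf -- sketch only; replace with your internal servers
server ntp1.corp.example iburst
server ntp2.corp.example iburst
# step the clock at boot if it is far off, then keep it within tolerance
makestep 1.0 3
```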