DC/OS Kubernetes supports clusters that meet all the following criteria:
- No more than 85 Kubernetes nodes (total of private and public).
- No more than 8500 total Kubernetes pods.
- No more than 17000 total containers.
- No more than 100 Kubernetes pods per Kubernetes node.
- No more than 10 Kubernetes pods per core running on a Kubernetes node.
Note that these values are based on lightweight pods or containers like NGINX. It is up to you as the DC/OS cluster operator to size each Kubernetes cluster appropriately for the specific type of workload you plan on running on top of it. Also note that these limits apply on a per-Kubernetes-cluster basis (e.g. it is possible to have 170 Kubernetes nodes on the same DC/OS cluster by creating two separate Kubernetes clusters).
Planning the number of Kubernetes nodes
Private Kubernetes nodes
Before creating a Kubernetes cluster, it is important to consider what your scale requirements are. It is your responsibility to ensure there are enough resources available in the DC/OS cluster to satisfy the requirements of the Kubernetes cluster(s) you intend to create.
A good way to size your Kubernetes cluster(s) is to ask yourself “how many pods do I require?”. From this, you can work out the minimum number of private nodes that your Kubernetes cluster(s) will require:
Number of pods / 100 = Minimum number of private nodes
It is your responsibility to ensure that both the DC/OS cluster and each Kubernetes cluster have sufficient capacity, and that there is enough spare capacity to ensure the stability of each Kubernetes cluster.
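For example, a requirement of roughly 800 lightweight pods translates, at 100 pods per node, into a minimum of 8 private Kubernetes nodes. A minimal package options sketch for such a cluster is shown below; the pod count is purely illustrative, and the nested JSON layout assumes the usual mapping of dotted option names (such as `kubernetes.private_node_count`) to nested keys:

```json
{
  "kubernetes": {
    "private_node_count": 8
  }
}
```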
Public Kubernetes nodes
Another important question you must ask yourself is “do I require ingress traffic into my Kubernetes cluster(s)?”. If the answer is yes, you need at least one public Kubernetes node, and hence at least one public DC/OS agent. Please note that, contrary to what happens with private Kubernetes nodes (of which there can be multiple instances per DC/OS agent), each public Kubernetes node requires a dedicated public DC/OS agent.
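As a sketch, a cluster that needs ingress through a single public Kubernetes node could add the following to its package options, assuming one public DC/OS agent is available to host it:

```json
{
  "kubernetes": {
    "public_node_count": 1
  }
}
```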
Other considerations
Enabling or disabling Calico’s Typha
If your requirements dictate that your Kubernetes cluster must have more than 50 Kubernetes nodes, you must enable Calico’s Typha using the `calico.typha.enabled` package option.
The recommended number of Typha replicas in this scenario is 3, and you should set `calico.typha.replicas` accordingly.
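For example, a cluster planned for 60 Kubernetes nodes could enable Typha with the recommended three replicas using a package options sketch along these lines (the nested layout again assumes the usual dotted-name-to-JSON mapping):

```json
{
  "calico": {
    "typha": {
      "enabled": true,
      "replicas": 3
    }
  }
}
```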
Maximum number of Kubernetes nodes
Finally, it is very important to keep in mind that the total number of private and public Kubernetes nodes MUST NOT exceed 85 per Kubernetes cluster. In other words, you MUST ensure that the following condition is respected at any given time:
kubernetes.private_node_count + kubernetes.public_node_count <= 85
You must take this into consideration when doing capacity planning and, if necessary, create multiple Kubernetes clusters to accommodate your scale requirements, distributing your workloads among them. Failure to abide by this rule will result in a failed cluster, which may in turn result in permanent data loss.
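As an illustration, the following node counts respect the limit, since 80 + 5 = 85; the particular split between private and public nodes is an assumption chosen only for the example:

```json
{
  "kubernetes": {
    "private_node_count": 80,
    "public_node_count": 5
  }
}
```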
Planning the capacity of each Kubernetes node
After performing large-scale testing, we have concluded that the default configuration works well in most scenarios. However, you may still want to tweak the resources reserved for Kubernetes workloads on private Kubernetes nodes. You can do this by specifying the following package options for private Kubernetes nodes (and their public counterparts for public Kubernetes nodes), as illustrated in the sketch after this list:
- `kubernetes.private_reserved_resources.kube_cpus`
- `kubernetes.private_reserved_resources.kube_mem`
- `kubernetes.private_reserved_resources.kube_disk`
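The sketch below reserves 2 CPUs, 4 GiB of memory, and 10 GiB of disk per private Kubernetes node. The figures are assumptions chosen only for illustration, and the memory and disk values are assumed to be expressed in MB; consult the package’s configuration schema for the authoritative units and defaults:

```json
{
  "kubernetes": {
    "private_reserved_resources": {
      "kube_cpus": 2,
      "kube_mem": 4096,
      "kube_disk": 10240
    }
  }
}
```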
You should take into consideration that, for each Kubernetes node, the value specified for `kubernetes.private_reserved_resources.kube_disk` is shared by the following entities:
- Kubernetes volumes of type `emptyDir` requested by Kubernetes pods running on said Kubernetes node;
- Images for the containers used by Kubernetes pods running on said Kubernetes node;
- Logs for each Kubernetes pod running on said Kubernetes node;
- Logs for the `kubelet` process itself.
You should also take into consideration that, depending on the number of Kubernetes pods running on each particular Kubernetes node, you may need to tweak the following package options (and/or their public counterparts):
- `kubernetes.private_reserved_resources.system_cpus`
- `kubernetes.private_reserved_resources.system_mem`
However, scenarios where this is required are uncommon. In any case, and as mentioned above, you are responsible for making sure that each DC/OS agent has sufficient capacity to accommodate your Kubernetes cluster’s resource requirements.
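If you do end up in one of those uncommon scenarios, a sketch such as the following could be used to bump the system reservation on private Kubernetes nodes; the values (and the assumption that memory is expressed in MB) are illustrative only:

```json
{
  "kubernetes": {
    "private_reserved_resources": {
      "system_cpus": 1,
      "system_mem": 2048
    }
  }
}
```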
Planning the capacity of the Kubernetes control-plane
By default, DC/OS Kubernetes deploys a single `etcd` node and a single control-plane node per Kubernetes cluster.
However, whenever the intended usage of your Kubernetes cluster is running production workloads, you are STRONGLY ADVISED to set `kubernetes.high_availability` to `true`.
This can either be done when first creating the Kubernetes cluster, or later on when promoting said Kubernetes cluster to production.
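A minimal package options sketch enabling high availability, assuming the same nested-JSON form used above, is:

```json
{
  "kubernetes": {
    "high_availability": true
  }
}
```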
Given that each control-plane task is itself a Kubernetes node, there are control-plane-specific counterparts to most of the aforementioned node-related options:
- `kubernetes.control_plane_reserved_resources.cpus`
- `kubernetes.control_plane_reserved_resources.mem`
- `kubernetes.control_plane_reserved_resources.disk`
We find the default values for these options to be sufficient for most clusters, given the current limit of 85 Kubernetes nodes per Kubernetes cluster. However, they can be tweaked when required.
This is especially important if your workloads make heavy use of the Kubernetes API (e.g. when you are deploying workloads such as etcd-operator or cert-manager).
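As an illustration only, a cluster whose workloads hit the Kubernetes API heavily might raise the control-plane reservation along these lines; the figures are assumptions, and memory and disk are again assumed to be in MB:

```json
{
  "kubernetes": {
    "control_plane_reserved_resources": {
      "cpus": 4,
      "mem": 8192,
      "disk": 10240
    }
  }
}
```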
Increasing etcd’s performance
Kubernetes uses `etcd` as its data store. Hence, increasing the performance of `etcd` can lead to significant performance and stability benefits for a Kubernetes cluster.
DC/OS Kubernetes provides reasonable defaults for `etcd`, but in order to get the best performance out of it, it is important to use fast disks. Therefore, when possible, it is advised to back `etcd`’s storage with an SSD.
Like in the scenarios described above, it is your responsibility to provision these disks in your DC/OS cluster.
Finally, you may find it necessary to tweak the value of the following options under some circumstances:
- `etcd.cpus`
- `etcd.mem`
- `etcd.wal_disk`
- `etcd.data_disk`
It should be noted that `etcd.wal_disk` and `etcd.data_disk` MUST NOT be updated after the Kubernetes cluster has been created. Doing so will cause permanent data loss.
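The sketch below shows how these options might be set at cluster creation time. The sizes are illustrative assumptions (with memory and disk assumed to be in MB), and, as noted above, `etcd.wal_disk` and `etcd.data_disk` must be chosen before the cluster is created:

```json
{
  "etcd": {
    "cpus": 1,
    "mem": 2048,
    "wal_disk": 512,
    "data_disk": 4096
  }
}
```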
etcd cluster with 5 nodes
An etcd cluster automatically recovers from temporary failures. For a cluster of size N, it can withstand up to (N-1)/2 permanent failures. When a member permanently fails it loses access to the cluster. If the cluster permanently loses more than (N-1)/2 members then it fails, losing quorum. Once quorum is lost, the cluster cannot reach consensus and cannot continue accepting updates.
To increase etcd cluster availability, you need to set both `kubernetes.high_availability` and `etcd.5_etcd_nodes` to `true`. Deploying an etcd cluster with 5 nodes requires a DC/OS cluster with at least 5 private agents.
Note that by enabling this option, your `etcd` cluster may experience some performance degradation.
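Putting the two options together, a production cluster opting into a 5-member etcd ensemble could use a sketch such as the following, assuming at least 5 private DC/OS agents are available and the usual nested-JSON option layout:

```json
{
  "kubernetes": {
    "high_availability": true
  },
  "etcd": {
    "5_etcd_nodes": true
  }
}
```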