Certificate Renewal
During cluster creation, Kubernetes establishes a Public Key Infrastructure (PKI) for generating the TLS certificates needed to secure cluster communication for components such as etcd, kube-apiserver, and front-proxy. The certificates created for these components have a default expiration of one year and are normally renewed only when an administrator upgrades the cluster.
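To check the current expiration date of a certificate on a control plane host, you can inspect it directly with openssl; the path below assumes the default kubeadm PKI location. On recent kubeadm versions, kubeadm certs check-expiration reports the expiration of all managed certificates at once.
sudo openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -enddate
sudo kubeadm certs check-expiration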
Kubernetes provides a facility to renew all certificates automatically during control plane updates. For administrators who need long-running clusters, or clusters that are not upgraded often, dkp provides automated certificate renewal without a cluster upgrade.
Requirements
This feature requires Python 3.5 or greater to be installed on all control plane hosts.
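To verify the installed Python version on a control plane host, run:
python3 --version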
Prerequisites
- Complete the bootstrap topic in the Cluster Lifecycle documentation.
Create a cluster with automated certificate renewal
To enable automated certificate renewal, create a Konvoy cluster using the certificate-renew-interval flag:
dkp create cluster aws --certificate-renew-interval=60 --cluster-name=long-running
The certificate-renew-interval flag specifies the number of days after which Kubernetes-managed PKI certificates are renewed. For example, a certificate-renew-interval value of 30 means the certificates are renewed every 30 days.
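For instance, the following command (with an illustrative cluster name) creates a cluster whose certificates are renewed monthly:
dkp create cluster aws --certificate-renew-interval=30 --cluster-name=monthly-renewal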
Technical details
The following manifests are modified on the control plane hosts, and are located in /etc/kubernetes/manifests. Modifications to these files require sudo access.
kube-controller-manager.yaml
kube-apiserver.yaml
kube-scheduler.yaml
kube-proxy.yaml
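To confirm when these manifests were last changed, you can list them directly on a control plane host:
sudo ls -l /etc/kubernetes/manifests/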
The following annotation indicates the time each component was restarted:
metadata:
  annotations:
    konvoy.d2iq.io/restartedAt: $(date +%s)
This only occurs when the PKI certificates are older than the interval given at cluster creation time. The renewal is activated by a systemd timer called renew-certs.timer, which triggers an associated systemd service called renew-certs.service that runs on all of the control plane hosts.
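To inspect the unit definitions behind this mechanism on a control plane host, you can use systemctl cat (standard systemd tooling; the unit names are those given above):
systemctl cat renew-certs.timer
systemctl cat renew-certs.service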
Debugging
To debug the automatic certificate renewal feature, a cluster administrator can inspect several components to verify that the certificates were renewed. For example, an administrator might start by looking at a control plane pod definition to check the last restart time. To determine whether a scheduler pod was properly restarted, run the command:
kubectl get pod -n kube-system kube-scheduler-ip-10-0-xx-xx.us-west-2.compute.internal -o yaml
The output of the command will be similar to the following:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    konvoy.d2iq.io/restartedAt: "1626124940.735733"
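To extract just the annotation and convert the epoch timestamp into a human-readable date, something like the following works (the pod name is illustrative, and the dots in the annotation key must be escaped in the jsonpath expression):
kubectl get pod -n kube-system kube-scheduler-ip-10-0-xx-xx.us-west-2.compute.internal -o jsonpath='{.metadata.annotations.konvoy\.d2iq\.io/restartedAt}'
date -d @1626124940   # GNU date; prints the epoch timestamp as a calendar date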
Administrators who want more details on the execution of the systemd service can use ssh to connect to the control plane hosts, and then use the systemctl and journalctl commands that follow to help diagnose potential issues.
To check the status of the timers, when they last ran, and when they are scheduled to run next, use the command:
systemctl list-timers
To check the status of the renew-certs service, use the command:
systemctl status renew-certs
To get the logs of the last run of the service, use the command:
journalctl -u renew-certs
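To narrow the output to a recent window, journalctl's standard time filters can be combined with the unit flag, for example:
journalctl -u renew-certs --since "24 hours ago"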