
Kubernetes Short Notes(3)


Cluster Maintenance

OS Upgrade

Pod Eviction Timeout

When a node has been down for more than 5 minutes (the default pod eviction timeout), its pods are terminated. A pod is recreated on another node only if it is managed by a ReplicaSet.

Drain, Cordon, Uncordon

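If we’re not sure the node will come back online within 5 minutes, we can drain it first. Draining evicts the pods so they are recreated on other nodes, and marks the node unschedulable (cordoned). Cordon on its own only marks the node unschedulable; it does not move the pods already running there.

After the drained node has been upgraded and comes back, it is still unschedulable. Uncordon the node to make it schedulable again.

Note that the previous pods won’t be automatically rescheduled back onto the node.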

Cluster Upgrade

The core control plane components can run different versions, but they should follow certain rules:

  • the kube-api is the primary component; no other component may run a higher version than the kube-api
  • the other components can be one or two minor versions lower:
    • kube-api: x
    • controller-manager, kube-scheduler: x, x-1
    • kubelet, kube-proxy: x, x-1, x-2
  • kubectl is the exception: it can be one minor version higher or lower than kube-api: x+1, x, x-1
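
The skew rules above can be sketched as a small check. This is only an illustration of the rules as listed in these notes; the table and function names are made up for the example, and it assumes all components share the same major version.

```python
# Allowed minor-version difference relative to kube-apiserver (low, high),
# taken from the rules listed in these notes. Names here are hypothetical.
ALLOWED_SKEW = {
    "kube-controller-manager": (-1, 0),
    "kube-scheduler": (-1, 0),
    "kubelet": (-2, 0),
    "kube-proxy": (-2, 0),
    "kubectl": (-1, 1),  # the only component allowed to be newer
}


def minor(version: str) -> int:
    """Extract the minor number from a version such as 'v1.27.3'."""
    return int(version.lstrip("v").split(".")[1])


def skew_ok(component: str, component_version: str, apiserver_version: str) -> bool:
    """Return True if the component's minor version is within the allowed skew."""
    low, high = ALLOWED_SKEW[component]
    diff = minor(component_version) - minor(apiserver_version)
    return low <= diff <= high


if __name__ == "__main__":
    print(skew_ok("kubelet", "v1.25.4", "v1.27.1"))         # kubelet at x-2
    print(skew_ok("kube-scheduler", "v1.25.4", "v1.27.1"))  # scheduler at x-2
    print(skew_ok("kubectl", "v1.28.0", "v1.27.1"))         # kubectl at x+1
```

A check like this is handy before an upgrade: bump the kube-api version in the call and see which components would fall outside the allowed range.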

Kubernetes supports only the three most recent minor versions. The recommended approach is to upgrade one minor version at a time.

How you upgrade the cluster depends on how it was deployed:

  • cloud provider: a few clicks in the UI
  • kubeadm: use the kubeadm upgrade command (upgrade the kubeadm tool itself first!)
  • the hard way, from scratch: manually upgrade each component yourself

Two major steps:

  1. upgrade the master node: while the control plane components are down, all management functions are unavailable, but the applications deployed on the worker nodes keep serving
  2. upgrade the worker nodes, using one of these strategies:
    • upgrade all at once, with downtime
    • upgrade one node at a time
    • create new nodes, move the workloads onto them, then remove the old nodes

When you run a command like kubectl get nodes, the VERSION column shows the version of the kubelet on each node, not the version of the kube-api.

Backup and Restore

Master / Node DR

  • Cordon & drain
  • Provision replacement master / node


Option: Backup resources

Save a copy of the resource objects by querying the kube-api.

Option: Backup ETCD

Make a copy of the ETCD data directory.

Or use the etcdctl command-line tool:

  1. Take a snapshot

    Remember to specify the certificate files for authentication
  2. Stop kube-api
  3. Restore snapshot

    When ETCD restores from a backup, it initializes a new cluster configuration and configures the ETCD members as new members of a new cluster. This prevents a new member from accidentally joining an existing cluster.
    For example, when you use a snapshot to provision a new etcd cluster for testing purposes, you don’t want the members of the new test cluster to accidentally join the production cluster.

  4. Configure the etcd.service with new data directory and new cluster token

    During a restore, you must provide a new cluster token and the same initial cluster configuration

  5. Restart ETCD service
  6. Start kube-api
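
The snapshot and restore steps above can be sketched as command builders. The endpoint, the file paths and the token value are assumptions for illustration (the certificate paths follow the usual kubeadm layout); the sketch only prints the commands rather than running them. In newer etcd releases the restore subcommand is also provided by etcdutl, so check the tooling shipped with your etcd version.

```python
import shlex

# Hypothetical values; adjust to your cluster. The certificate paths follow
# the usual kubeadm layout, which is an assumption here.
ENDPOINT = "https://127.0.0.1:2379"
CERT_FLAGS = [
    "--cacert=/etc/kubernetes/pki/etcd/ca.crt",
    "--cert=/etc/kubernetes/pki/etcd/server.crt",
    "--key=/etc/kubernetes/pki/etcd/server.key",
]


def snapshot_save_cmd(snapshot_path: str) -> list[str]:
    """Step 1: take a snapshot, authenticating with the etcd certificates."""
    return ["etcdctl", "snapshot", "save", snapshot_path,
            f"--endpoints={ENDPOINT}", *CERT_FLAGS]


def snapshot_restore_cmd(snapshot_path: str, data_dir: str, token: str) -> list[str]:
    """Step 3: restore into a new data directory with a new cluster token."""
    return ["etcdctl", "snapshot", "restore", snapshot_path,
            f"--data-dir={data_dir}",
            f"--initial-cluster-token={token}"]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    print("ETCDCTL_API=3", shlex.join(snapshot_save_cmd("/tmp/snapshot.db")))
    print("ETCDCTL_API=3", shlex.join(
        snapshot_restore_cmd("/tmp/snapshot.db",
                             "/var/lib/etcd-from-backup",
                             "etcd-cluster-1")))
```

After the restore, the etcd.service (or static pod manifest) has to point at the same new data directory and cluster token, as described in step 4.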

Persistent Volume DR

You can’t rely on Kubernetes to back up and restore persistent volumes.

If you’re using cloud-provider-specific persistent volumes such as EBS volumes, Azure managed disks or GCE persistent disks, you should use the cloud provider’s snapshot APIs.



Accessing the hosts

Only SSH key-based authentication should be available. Root access and password-based authentication should be disabled.

Accessing the kube-api

Any action in the cluster can be performed through the Kubernetes API server, so securing the kube-api is the first line of defence in the cluster.


Authentication mechanisms (who can access the cluster):

  • Files – Username and Password/Tokens
  • Certificates
  • External Authentication providers – LDAP
  • Service Accounts


Authorization mechanisms (what they can do):

  • Role Based – RBAC
  • Attribute Based – ABAC
  • Node Authorization
  • Webhook Mode

All communication between the various components of the cluster is secured with TLS.

Communication between applications

By default, all pods can access each other within the cluster. You can restrict access using Network Policies.


Users access the cluster:

  • Admins: Human
  • Developers: Human
  • Bots (CI/CD): Service Account
  • Application End Users

Service accounts can be created and managed through the kube-api.

Admins and developers are not managed natively by Kubernetes; it relies on an external source such as a file, certificates or LDAP.


Basic authentication

This is not a recommended mechanism. If you use it in a kubeadm setup, remember that the auth file must also be volume-mounted into the kube-apiserver pod. The static password file option was removed in Kubernetes 1.19; static token files are still supported.

  • Static Password File
    • create a csv file that contains password, username, userID and an optional group field
    • pass the file to the kube-api with the --basic-auth-file argument, either in kube-apiserver.service or, in a kubeadm setup, in /etc/kubernetes/manifests/kube-apiserver.yaml
    • restart the kube-api
    • access the kube-api with the username and password
  • Static Token File
    • create a csv file that contains token, username, userID and an optional group field
    • as above, but specify the argument --token-auth-file
    • access the kube-api by providing the token in an Authorization: Bearer header
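
As a rough illustration of the static token file, the sketch below parses such a csv and builds the header a client would send. The tokens, usernames and group names are made up for the example, and the lookup only mimics the idea of how a token maps to a user; it is not the kube-api’s actual implementation.

```python
import csv
import io

# Hypothetical token file content: token, username, userID, optional group(s).
TOKEN_FILE = '''\
s3cr3t-token-1,alice,u0001,"dev,qa"
s3cr3t-token-2,bob,u0002
'''


def load_tokens(text: str) -> dict[str, dict]:
    """Map each token to the user it authenticates."""
    users = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        token, name, uid, *rest = row
        groups = rest[0].split(",") if rest else []
        users[token] = {"name": name, "uid": uid, "groups": groups}
    return users


def auth_header(token: str) -> dict[str, str]:
    """Header a client sends so the server can look the token up."""
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    users = load_tokens(TOKEN_FILE)
    print(users["s3cr3t-token-1"])
    print(auth_header("s3cr3t-token-1"))
```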

Authentication alone grants no permissions: to authorize the user, you need to create roles and bind the user to them.

TLS Basics

SSH connection between admin user and server

  1. The admin user generates a key pair
  2. An entry with the admin user’s public key (the lock) is added to the server
  3. The server verifies the user using the user’s public key

HTTPS traffic between user and web app


Without client certificate exchange

  1. The server generates a pair of keys
  2. The server sends a CSR (Certificate Signing Request) to a CA (Certificate Authority)
  3. The CA signs the request and returns a certificate
  4. When the user accesses the web application, the server sends the certificate, which contains the server’s public key
  5. The user’s browser verifies the certificate using the CA’s public key (stored in the browser) and retrieves the server’s public key
  6. The user generates a symmetric key to use for all further communication and encrypts it with the server’s public key
  7. The server decrypts the user’s symmetric key with its private key

With client certificate exchange

  1. The server requests a certificate from the client
  2. The client presents a certificate that is signed by a CA
  3. The server verifies the client’s certificate using the CA’s public key

Certificates in cluster

General naming conventions

  • Public key (certificate): *.crt, *.pem
  • Private key: *.key, *-key.pem


  • Root Certs (CA): k8s requires at least one CA
  • Server Certs
    • kube-api
    • etcd
    • kubelet(s)
  • Client Certs
    • admin users to kube-api
    • kube-scheduler to kube-api
    • kube-controller-manager to kube-api
    • kube-proxy to kube-api
    • kube-api to etcd
    • kube-api to kubelet(s)

Certificate Creation

Tools: easyrsa, openssl, cfssl

Create CA certificate

Note that the CA’s certificate must be known to all the clients and servers

  1. Generate a private key

  2. Generate a CSR

  3. Generate a self-signed certificate

Create admin user certificate

  1. Generate a private key

  2. Generate a CSR with group info

  3. Generate a signed certificate
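
The CA and admin-user steps above map onto three openssl invocations each. The sketch below only builds and prints the commands; the file names and the subject names KUBERNETES-CA and kube-admin are assumptions for illustration, while system:masters is the group Kubernetes treats as cluster administrators.

```python
import shlex


def key_cmd(name: str) -> list[str]:
    """Step 1: generate a private key."""
    return ["openssl", "genrsa", "-out", f"{name}.key", "2048"]


def csr_cmd(name: str, subject: str) -> list[str]:
    """Step 2: generate a CSR; the subject carries the CN and the group (O)."""
    return ["openssl", "req", "-new", "-key", f"{name}.key",
            "-subj", subject, "-out", f"{name}.csr"]


def self_signed_cmd(name: str) -> list[str]:
    """Step 3 for the CA: sign the CSR with the CA's own key."""
    return ["openssl", "x509", "-req", "-in", f"{name}.csr",
            "-signkey", f"{name}.key", "-out", f"{name}.crt"]


def ca_signed_cmd(name: str, ca: str = "ca") -> list[str]:
    """Step 3 for everyone else: sign the CSR with the CA key pair."""
    return ["openssl", "x509", "-req", "-in", f"{name}.csr",
            "-CA", f"{ca}.crt", "-CAkey", f"{ca}.key",
            "-CAcreateserial", "-out", f"{name}.crt"]


if __name__ == "__main__":
    commands = [
        key_cmd("ca"),
        csr_cmd("ca", "/CN=KUBERNETES-CA"),                   # hypothetical CN
        self_signed_cmd("ca"),
        key_cmd("admin"),
        csr_cmd("admin", "/CN=kube-admin/O=system:masters"),  # group info in O
        ca_signed_cmd("admin"),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
```

The only difference between the two flows is the last step: the CA signs its own request, while every other certificate is signed with the CA’s key pair.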

Using certificates

Specify the certificate, key and CA certificate in each request.

Or put them in a configuration file such as kube-config.yaml.

TLS for Kube-apiserver

The kube-apiserver goes by many names and aliases, and even IP addresses, in the cluster.

All of these names must be present in the certificate generated for the kube-apiserver. Only then can other services that refer to the kube-apiserver by these names establish a valid connection.

Use an OpenSSL config file to provide this information.
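
A minimal sketch of such an OpenSSL config, generated from a list of names. The DNS names are the usual in-cluster aliases of the API server; the IP addresses are placeholders that depend on your service CIDR and host network, and the function name is made up for the example.

```python
def apiserver_openssl_cnf(dns_names: list[str], ip_addresses: list[str]) -> str:
    """Render an OpenSSL config whose alt_names section lists every name and IP."""
    lines = [
        "[req]",
        "req_extensions = v3_req",
        "distinguished_name = req_distinguished_name",
        "[req_distinguished_name]",
        "[v3_req]",
        "basicConstraints = CA:FALSE",
        "keyUsage = nonRepudiation, digitalSignature, keyEncipherment",
        "subjectAltName = @alt_names",
        "[alt_names]",
    ]
    lines += [f"DNS.{i} = {name}" for i, name in enumerate(dns_names, 1)]
    lines += [f"IP.{i} = {ip}" for i, ip in enumerate(ip_addresses, 1)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(apiserver_openssl_cnf(
        ["kubernetes", "kubernetes.default", "kubernetes.default.svc",
         "kubernetes.default.svc.cluster.local"],
        ["10.96.0.1", "172.17.0.87"],  # placeholder service IP and host IP
    ))
```

The resulting file would be passed to openssl req with the -config option when generating the kube-apiserver CSR.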

The kube-apiserver is also a client of the ETCD server and the kubelets, so generate client certificates for it and specify them in its configuration.

Inspecting the certificate

If you use kubeadm, you can find the certificate paths in the manifest files stored in /etc/kubernetes/manifests

TLS for System Components

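For system components such as kube-scheduler, the CN (Common Name) must be prefixed with the keyword system, for example system:kube-scheduler.


When generating node (kubelet) certificates, you must specify the group system:nodes. Don’t forget the system prefix for the CN either: it takes the form system:node:<node-name>.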

Certificate API

The CA is really just a key and certificate pair that needs to be protected in a safe environment. In Kubernetes these files are stored on the master node, so we can say the master is the CA server.

Kubernetes has a built-in Certificates API that can help you sign certificates:

  1. The developer generates a key and a CSR, then sends the CSR to the admin user
  2. The admin user creates a CertificateSigningRequest object containing the base64-encoded CSR
  3. The admin approves or denies the request
  4. The admin retrieves the certificate from the object and decodes it with base64
  5. The admin sends the certificate to the developer
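
Steps 2 and 4 both revolve around base64. The sketch below builds a CertificateSigningRequest manifest as a plain dictionary and decodes an issued certificate from the object’s status; the object name and the PEM placeholder are assumptions, and the approval itself (step 3) would be done separately, for example with kubectl certificate approve.

```python
import base64
import json


def build_csr_object(name: str, csr_pem: bytes) -> dict:
    """Step 2: wrap the developer's CSR in a CertificateSigningRequest object."""
    return {
        "apiVersion": "certificates.k8s.io/v1",
        "kind": "CertificateSigningRequest",
        "metadata": {"name": name},
        "spec": {
            # The request field carries the PEM CSR, base64 encoded.
            "request": base64.b64encode(csr_pem).decode("ascii"),
            "signerName": "kubernetes.io/kube-apiserver-client",
            "usages": ["client auth"],
        },
    }


def decode_issued_certificate(csr_object: dict) -> bytes:
    """Step 4: the issued certificate is base64 encoded in status.certificate."""
    return base64.b64decode(csr_object["status"]["certificate"])


if __name__ == "__main__":
    # Placeholder bytes standing in for a real PEM-encoded CSR.
    fake_csr = b"-----BEGIN CERTIFICATE REQUEST-----\n...\n-----END CERTIFICATE REQUEST-----\n"
    print(json.dumps(build_csr_object("jane", fake_csr), indent=2))
```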

Kubernetes Configuration File

The default config file at $HOME/.kube/config is a Config object in yaml format. It defines multiple clusters and users, then maps them together by defining contexts; a context can also pin a specific namespace.

Change the current context with kubectl config use-context <context-name>.

Use the KUBECONFIG environment variable to point to another config file.

To provide the certificates, use the certificate-authority field with the path to the certificate file (preferably an absolute path), or use certificate-authority-data with the base64-encoded content.

API Groups

The Kubernetes API is grouped into several paths according to purpose:

  • /metrics
  • /healthz
  • /version
  • /api
  • /apis
  • /logs

Let’s focus on the cluster functionality:

Core Group: /api

All core functionality


Named Group: /apis

More organized; newer features are made available through these named groups


You can look these API endpoints up in the online API reference, or list them by calling the kube-api directly.

Role Base Access Control

RBAC is an authorization mode that is enabled in the kube-api server settings (the --authorization-mode argument).

Define a role that can access certain resources using a Role object.

Then bind the role to a user using a RoleBinding object.

Check a user’s access using the kubectl auth can-i command.
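
A minimal sketch of a Role, a RoleBinding and a simplified permission check, written as Python dictionaries. The role, user and namespace names are hypothetical, and the can_i function is a deliberately simplified stand-in for the real authorizer (it ignores wildcards, resourceNames and subresources).

```python
# Hypothetical Role: what can be done, within the "dev" namespace.
ROLE = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "developer", "namespace": "dev"},
    "rules": [
        {"apiGroups": [""], "resources": ["pods"],
         "verbs": ["get", "list", "create"]},
        {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get"]},
    ],
}

# Hypothetical RoleBinding: who gets the role.
ROLE_BINDING = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "dev-user-binding", "namespace": "dev"},
    "subjects": [{"kind": "User", "name": "dev-user",
                  "apiGroup": "rbac.authorization.k8s.io"}],
    "roleRef": {"kind": "Role", "name": "developer",
                "apiGroup": "rbac.authorization.k8s.io"},
}


def can_i(role: dict, verb: str, resource: str, api_group: str = "") -> bool:
    """Simplified check: does any rule allow this verb on this resource?"""
    for rule in role["rules"]:
        if (api_group in rule["apiGroups"]
                and resource in rule["resources"]
                and verb in rule["verbs"]):
            return True
    return False


if __name__ == "__main__":
    print(can_i(ROLE, "create", "pods"))
    print(can_i(ROLE, "delete", "pods"))
```

The empty string in apiGroups refers to the core group, which is why pods and configmaps are listed under it.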

Cluster Roles

Unlike Role and RoleBinding, ClusterRole and ClusterRoleBinding are cluster-scoped, not namespaced.

They are used for cluster-wide resources such as:

  • nodes
  • PV
  • certificatesigningrequests
  • namespaces
  • user

Note that you can still create a cluster role for namespaced resources as well. For example, a cluster role that can create pods lets its subjects create pods in every namespace across the cluster.

Image security

Let’s say we want to run a pod using an image from a private registry. To do that, we need credentials to log in to that registry.

In Kubernetes, images are pulled and run by the container runtime (Docker, in this example) on the worker node, so the credentials need to be passed to the worker node.

First we create a secret of type docker-registry; this type of secret is built for storing Docker credentials.

Then specify the imagePullSecrets field in the pod spec.
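
The two steps can be sketched as manifests built in Python. The registry address, credentials and object names are placeholders; in practice the secret is usually created with kubectl create secret docker-registry rather than assembled by hand.

```python
import base64
import json


def docker_registry_secret(name: str, server: str,
                           username: str, password: str) -> dict:
    """Build a Secret of the type used for registry credentials."""
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    docker_config = {"auths": {server: {"username": username,
                                        "password": password,
                                        "auth": auth}}}
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "kubernetes.io/dockerconfigjson",
        # Secret data values are base64 encoded.
        "data": {".dockerconfigjson":
                 base64.b64encode(json.dumps(docker_config).encode()).decode()},
    }


def pod_with_pull_secret(pod_name: str, image: str, secret_name: str) -> dict:
    """Build a Pod that references the secret via imagePullSecrets."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [{"name": pod_name, "image": image}],
            "imagePullSecrets": [{"name": secret_name}],
        },
    }


if __name__ == "__main__":
    secret = docker_registry_secret("regcred", "registry.example.com",
                                    "registry-user", "registry-password")
    pod = pod_with_pull_secret("internal-app",
                               "registry.example.com/apps/internal-app",
                               "regcred")
    print(json.dumps(secret, indent=2))
    print(json.dumps(pod, indent=2))
```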

Security Contexts

When you run a container, you have the option to define a set of security settings such as the ID of the user, the Linux capabilities, etc.

In Kubernetes, you can configure the securityContext field at the container level or at the pod level (the container-level setting overrides the pod-level setting).
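
The override behaviour can be pictured as a dictionary merge in which container-level values win. This is only a mental model for fields that exist at both levels, such as runAsUser; the pod spec below is hypothetical and the helper is not how the kubelet is actually implemented.

```python
# Hypothetical pod spec with a pod-level and a container-level securityContext.
POD_SPEC = {
    "securityContext": {"runAsUser": 1000},
    "containers": [
        {"name": "web", "image": "nginx",
         "securityContext": {"runAsUser": 2000,
                             "capabilities": {"add": ["NET_ADMIN"]}}},
        {"name": "sidecar", "image": "busybox"},
    ],
}


def effective_security_context(pod_spec: dict, container_name: str) -> dict:
    """Merge pod-level settings with the container's own; the container wins."""
    pod_level = pod_spec.get("securityContext", {})
    container = next(c for c in pod_spec["containers"]
                     if c["name"] == container_name)
    return {**pod_level, **container.get("securityContext", {})}


if __name__ == "__main__":
    print(effective_security_context(POD_SPEC, "web"))
    print(effective_security_context(POD_SPEC, "sidecar"))
```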

Network Policy

Traffic Flow

Looking at the direction in which the traffic originated:

  • ingress: traffic coming in to a pod, e.g. requests from the users to the frontend
  • egress: traffic going out from a pod, e.g. requests from the frontend to the app server

Take a web application as an example, made up of a frontend server (port 80), a backend server (port 5000) and a database (port 3306). When a user logs in to the website, the traffic flows are: ingress to the frontend on port 80, egress from the frontend and ingress to the backend on port 5000, then egress from the backend and ingress to the database on port 3306.

Because of the default allow-all policy, services can communicate with each other within the cluster by service name or IP address. In the example above, you may have security concerns about letting the frontend service communicate directly with the database.

We can use a NetworkPolicy object to restrict the traffic, selecting pods by labels and selectors.
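
For the web application example, a policy that lets only the backend reach the database on port 3306 could look like the sketch below. The label keys and values (role: db, role: backend, role: frontend) and the policy name are assumptions, and the ingress_allowed helper is a simplified evaluation for illustration, not the logic of any real network plugin.

```python
# Hypothetical NetworkPolicy: only pods labelled role=backend may reach
# pods labelled role=db, and only on TCP port 3306.
DB_POLICY = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "db-policy"},
    "spec": {
        "podSelector": {"matchLabels": {"role": "db"}},
        "policyTypes": ["Ingress"],
        "ingress": [{
            "from": [{"podSelector": {"matchLabels": {"role": "backend"}}}],
            "ports": [{"protocol": "TCP", "port": 3306}],
        }],
    },
}


def ingress_allowed(policy: dict, source_labels: dict, port: int) -> bool:
    """Simplified check: does any ingress rule match the source pod and port?"""
    for rule in policy["spec"]["ingress"]:
        port_ok = any(p["port"] == port for p in rule["ports"])
        source_ok = any(
            all(source_labels.get(key) == value
                for key, value in peer["podSelector"]["matchLabels"].items())
            for peer in rule["from"]
        )
        if port_ok and source_ok:
            return True
    return False


if __name__ == "__main__":
    print(ingress_allowed(DB_POLICY, {"role": "backend"}, 3306))
    print(ingress_allowed(DB_POLICY, {"role": "frontend"}, 3306))
```

Because only Ingress is listed in policyTypes, the database pod’s outgoing traffic is not restricted by this policy.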

Note that Network Policies are enforced by the Network Solution implemented on the Kubernetes Cluster. And not all network solutions support network policies.

  • Supported: Kube-router, Calico, Romana, Weave-net
  • Not supported: Flannel

