Cluster Maintenance
OS Upgrade
Pod Eviction Timeout
When a node is down for more than 5 minutes (the default pod eviction timeout), its pods are terminated; a pod is recreated on another node if it is part of a ReplicaSet.
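The timeout is set on the controller manager; a minimal illustration of the flag (the value shown is the default):
```
kube-controller-manager --pod-eviction-timeout=5m0s ...
```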
Drain, Cordon, Uncordon
We're not sure the node will come back online within 5 minutes, so we can drain the node instead.
After the drained node is upgraded and comes back, it is still unschedulable; uncordon the node to make it schedulable again.
Note that the previous pods won't automatically be rescheduled back onto the node.
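For example, assuming a node named node-1:
```
# Move the workloads off the node and mark it unschedulable
kubectl drain node-1 --ignore-daemonsets

# Mark the node unschedulable without evicting the existing pods
kubectl cordon node-1

# Mark the node schedulable again after the upgrade
kubectl uncordon node-1
```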
Cluster Upgrade
The core control plane components' versions can differ, but they should follow certain rules:
- the kube-apiserver is the primary component; no other component's version may be higher than the kube-apiserver's
- the other components can be lower by 1-2 minor versions
- kube-api: x
- controller-manager, kube-scheduler: x, x-1
- kubelet, kube-proxy: x, x-1, x-2
- the kubectl can be one version higher than kube-api: x+1, x, x-1
Kubernetes supports only the three most recent minor versions. The recommended approach is to upgrade one minor version at a time.
How you upgrade the cluster depends on how you deployed it:
- cloud provider: few clicks at the UI
- kubeadm: using the `kubeadm upgrade` command (you should upgrade the kubeadm tool itself first; see the sketch after the steps below)
- the hard way (deployed from scratch): manually upgrade each component yourself
Two major steps:
- upgrade the master node: the control plane components go down and all management functions are unavailable, but the applications deployed on the worker nodes keep serving
- upgrade the worker nodes, with one of these strategies:
- upgrade all at once, with downtime
- upgrade one at a time
- create new nodes and move the workloads over, then finally remove the old nodes
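A rough kubeadm-based sketch of these steps; the package names and version numbers below are illustrative, check `kubeadm upgrade plan` for what is actually available in your cluster:
```
# On the master node
apt-get upgrade -y kubeadm=1.12.0-00
kubeadm upgrade plan
kubeadm upgrade apply v1.12.0
apt-get upgrade -y kubelet=1.12.0-00
systemctl restart kubelet

# On each worker node (drain it from the master first)
apt-get upgrade -y kubeadm=1.12.0-00
kubeadm upgrade node
apt-get upgrade -y kubelet=1.12.0-00
systemctl restart kubelet
```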
When you run a command like `kubectl get nodes`, the VERSION column indicates the version of the kubelet.
Backup and Restore
Master / Node DR
- Cordon & drain
- Provision replacement master / node
ETCD DR
Option: Backup resources
Save a copy of objects by querying the kube-api:
```
kubectl get all --namespace=default -o yaml > default-deployment-services.yaml
```
Option: Backup ETCD
Making copies of the ETCD data directory
```
# etcd.service
ExecStart=/usr/local/bin/etcd \
  --name=${ETCD_NAME} \
  ...
  --data-dir=/var/lib/etcd
```
Or use the `etcdctl` command line tool:
- Make a snapshot
```
ETCDCTL_API=3 etcdctl snapshot save etcd.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.crt \
  --cert=/etc/etcd/etcd-server.crt \
  --key=/etc/etcd/etcd-server.key
```
Remember to specify the certificate files for authentication.
- Stop the kube-apiserver
```
service kube-apiserver stop
```
- Restore the snapshot
```
ETCDCTL_API=3 etcdctl snapshot restore etcd.db \
  --data-dir=/var/lib/etcd-backup-dir \
  --initial-cluster master-1=https://192.168.5.11:2380,master-2=https://192.168.5.12:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls https://${INTERNAL_IP}:2380
```
When ETCD restores from a backup, it initializes a new cluster configuration and configures the ETCD members as new members of a new cluster. This is to prevent a new member from accidentally joining an existing cluster.
For example, you might use a snapshot to provision a new etcd cluster for testing purposes; you don't want the members of the new test cluster to accidentally join the production cluster.
- Configure etcd.service with the new data directory and the new cluster token
During a restore, you must provide a new cluster token and the same initial cluster configuration
- Restart the ETCD service
```
systemctl daemon-reload
service etcd restart
```
- Start the kube-apiserver
```
service kube-apiserver start
```
Persistent Volume DR
You can't rely on Kubernetes for backing up and restoring persistent volumes.
If you're using cloud-provider-specific persistent volumes like EBS volumes, Azure managed disks or GCE persistent disks, you should use the cloud provider's snapshot APIs.
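For example, with an EBS-backed volume you might trigger a snapshot through the AWS CLI (the volume ID is a placeholder):
```
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "Backup of the volume backing the PV"
```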
Security
Primitives
Accessing the hosts
Only SSH key-based authentication is available. Root access is disabled and password-based authentication is disabled.
Accessing the kube-api
You can perform any action through the Kubernetes API server, so the security of the kube-api is the first line of defence in the cluster.
Authentication
- Files – Username and Password/Tokens
- Certificates
- External Authentication providers – LDAP
- Service Accounts
Authorization
- Role Based – RBAC
- Attribute Based – ABAC
- Node Authorization
- Webhook Mode
All communication with the cluster between various components is using TLS
Communication between applications
By default, all pods can access each other within the cluster; you can restrict access using Network Policies.
Authentication
Users access the cluster:
- Admins: Human
- Developers: Human
- Bots (CI/CD): Service Account
- Application End Users
Service accounts can be managed through the kube-api:
```
kubectl create serviceaccount sa1
kubectl get serviceaccount
```
Kubernetes does not manage admin and developer users natively; it relies on an external source like a file, certificates, or LDAP.
Basic authentication
This is not a recommended mechanism. To use it in a kubeadm setup, you also need to mount the auth file into the kube-apiserver pod as a volume.
- Static Password File
- create a csv file that contains password, username, userID and an optional group field (see the example after this list)
- pass the file to the kube-api by specifying the argument in `kube-apiserver.service`, or for a kubeadm setup, by updating `/etc/kubernetes/manifests/kube-apiserver.yaml`
```
ExecStart=/usr/local/bin/kube-apiserver \
  ...
  --basic-auth-file=user-details.csv
```
- restart the kube-api
- access the kube-api
```
curl -v -k https://<master-node-ip>/api/v1/pods -u "<username>:<password>"
```
- Static Token File
- create a csv file that contains token, username, userID and an optional group field (same format as the example after this list, with a token instead of a password)
- as above, pass the file to the kube-api with the `--token-auth-file` argument
- access the kube-api by providing the token in a header
```
curl -v -k https://<master-node-ip>/api/v1/pods --header "Authorization: Bearer <user-token>"
```
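For reference, the two auth files might look something like this (all values are illustrative):
```
# user-details.csv (static password file): password,username,userID[,group]
password123,user1,u0001,group1
password456,user2,u0002,group2

# user-token-details.csv (static token file): token,username,userID[,group]
KpjCVbI7rCFAHYPk,user10,u0010,group1
```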
To authorize user permissions, you need to create roles and bind users to those roles:
```
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
# This role binding allows "user1" to read pods in the "default" namespace.
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: user1 # Name is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role # this must be Role or ClusterRole
  name: pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io
```
TLS Basic
SSH connection between admin user and server
- The admin user generates a key pair
- An entry with the admin user's public key (the lock) is added to the server
- The server verifies the user using the user's public key
HTTPS traffic between user and web app
Without client certificate exchange
- The server generates a pair of keys
- The server sends a CSR (Certificate Signing Request) to a CA (Certificate Authority)
- The CA signs and returns a certificate
- The user accesses the web application; the server sends the certificate containing the server's public key
- The user's browser verifies the certificate using the CA's public key (stored in the browser) and retrieves the server's public key
- The user generates a symmetric key to use for all further communication and encrypts it with the server's public key
- The server decrypts the user's symmetric key with its private key
With client certificate exchange
- The server requests a certificate from the client
- The client generates a certificate signed by a CA
- The server verifies the user's certificate using the CA's public key
Certificates in cluster
General naming conventions
- Public key: *.crt, *.pem
- Secret key: *.key, *-key.pem
- Root Certs (CA): k8s requires at least one CA
- Server Certs
- kube-api
- etcd
- kubelet(s)
- Client Certs
- admin users to kube-api
- kube-scheduler to kube-api
- kube-controller manager to kube-api
- kube-proxy to kube-api
- kube-api to etcd
- kube-api to kubelet(s)
Certificate Creation
Tools: easyrsa, openssl, cfssl
Create CA certificate
Note that the CA's certificate must be known to all clients and servers
- Generate a private key
```
openssl genrsa -out ca.key 2048
```
- Generate a CSR
```
openssl req -new -key ca.key -subj "/CN=KUBERNETES-CA" -out ca.csr
```
- Generate a self-signed certificate
```
openssl x509 -req -in ca.csr -signkey ca.key -out ca.crt
```
Create admin user certificate
- Generate a private key
```
openssl genrsa -out admin.key 2048
```
- Generate a CSR with group info
```
openssl req -new -key admin.key -subj "/CN=kube-admin/O=system:masters" -out admin.csr
```
- Generate a signed certificate
```
openssl x509 -req -in admin.csr -CA ca.crt -CAkey ca.key -out admin.crt
```
Using certificates
Specify in the request
```
curl https://kube-apiserver:6443/api/v1/pods --key admin.key --cert admin.crt --cacert ca.crt
```
Or use a configuration file `kube-config.yaml`:
```
apiVersion: v1
clusters:
- cluster:
    certificate-authority: ca.crt
    server: https://kube-apiserver:6443
  name: kubernetes
kind: Config
users:
- name: kubernetes-admin
  user:
    client-certificate: admin.crt
    client-key: admin.key
```
TLS for Kube-apiserver
The kube-apiserver goes by many names, aliases, and even IP addresses in the cluster.
These names must be present in the certificate generated for the kube-apiserver; only then will other services that refer to the kube-apiserver by those names be able to establish a valid connection.
Use an openssl config file to provide this information:
```
[req]
req_extensions = v3_req
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation
subjectAltName = @alt_names
[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
IP.1 = 10.96.0.1
IP.2 = 172.17.0.87
```
```
openssl req -new -key apiserver.key -subj "/CN=kube-apiserver" -out apiserver.csr -config openssl.cnf
```
As a client of the ETCD server and the kubelets, the kube-apiserver also needs client certificates generated and specified.
Inspecting the certificate
```
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text
```
```
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1825300361453630837 (0x1954c50204d5f575)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=kubernetes
        Validity
            Not Before: Dec 15 11:44:37 2019 GMT
            Not After : Dec 14 11:44:37 2020 GMT
        Subject: CN=kube-apiserver
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:ef:16:5a:11:10:67:b6:3c:15:39:bb:55:f6:1a:
                    ...
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication
            X509v3 Subject Alternative Name:
                DNS:master, DNS:kubernetes, DNS:kubernetes.default,
                DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local,
                IP Address:10.96.0.1, IP Address:172.17.0.99
    Signature Algorithm: sha256WithRSAEncryption
         5f:5b:e2:99:df:66:43:29:35:df:59:26:a4:71:87:05:f1:08:
         ...
-----BEGIN CERTIFICATE-----
MIIDVjCCAj6gAwIBAgIIGVTFAgTV9XUwDQYJKoZIhvcNAQELBQAwFTETMBEGA1UE
...
-----END CERTIFICATE-----
```
If using the kubeadm tool, you can find the certificate paths in the config files stored in /etc/kubernetes/manifests
TLS for System Components
For system components such as the kube-scheduler, the CN (Common Name) must be prefixed with the keyword system, e.g. system:kube-scheduler.
Nodes
The group SYSTEM:NODES must be specified when generating a node's certificate. And don't forget the system prefix for the CN.
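A sketch for a hypothetical node01, following the convention above:
```
openssl genrsa -out node01.key 2048
openssl req -new -key node01.key \
  -subj "/CN=system:node:node01/O=SYSTEM:NODES" -out node01.csr
openssl x509 -req -in node01.csr -CA ca.crt -CAkey ca.key -out node01.crt
```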
Certificate API
The CA is really just a pair of key and certificate files which need to be protected in a safe environment. For Kubernetes, these files are stored on the master node, so we can say the master is the CA server.
Kubernetes has a built-in certificate API that can help you sign certificates:
- The developer generates a key and a CSR, then sends the CSR to the admin user
```
openssl genrsa -out john.key 2048
openssl req -new -key john.key -subj "/CN=john" -out john.csr
```
- The admin user creates a `CertificateSigningRequest` object
```
apiVersion: certificates.k8s.io/v1beta1
kind: CertificateSigningRequest
metadata:
  name: john
spec:
  groups:
  - system:authenticated
  usages:
  - digital signature
  - key encipherment
  - server auth
  request: <encoded john.csr with base64>
```
- The admin approves or denies the request
```
kubectl get csr
kubectl certificate approve john
```
- Retrieve the certificate and decode it with base64 (see the snippet after this list)
```
kubectl get csr john -o yaml   # look for the status.certificate field
```
- Send the certificate to the developer
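For the retrieval step, something like this should work; the certificate is base64-encoded in the `status.certificate` field:
```
kubectl get csr john -o jsonpath='{.status.certificate}' | base64 --decode > john.crt
```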
Kubernetes Configuration File
The default config file at `$HOME/.kube/config` is a `Config` object in YAML format, which defines multiple clusters and users, then maps them together (optionally with a specific namespace) by defining contexts.
Change current context:
```
kubectl config use-context <context name>
```
Use the `KUBECONFIG` environment variable to point to another config file.
To provide certificates, use the `certificate-authority` field with the certificate file path (preferably an absolute path), or use `certificate-authority-data` with base64-encoded content.
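A minimal sketch of a `Config` with a context mapping a user to a cluster (the names and paths are illustrative):
```
apiVersion: v1
kind: Config
current-context: dev-user@development
clusters:
- name: development
  cluster:
    certificate-authority: /etc/kubernetes/pki/ca.crt
    server: https://kube-apiserver:6443
contexts:
- name: dev-user@development
  context:
    cluster: development
    user: dev-user
    namespace: development
users:
- name: dev-user
  user:
    client-certificate: /etc/kubernetes/pki/users/dev-user.crt
    client-key: /etc/kubernetes/pki/users/dev-user.key
```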
API Groups
The Kubernetes API is categorized by purpose:
- /metrics
- /healthz
- /version
- /api
- /apis
- /logs
Let’s focus on the cluster functionality:
Core Group: /api
All core functionality
Named Group: /apis
More organized; newer features are made available through these named groups
View these API endpoints in the online reference, or via API calls:
```
$ kubectl proxy
$ curl http://127.0.0.1:8001 -k
$ curl http://127.0.0.1:8001/apis -k | grep "named"
```
Role Based Access Control
RBAC is an authorization mode in the cluster that is configured in the kube-apiserver settings.
Define which resources a role can access using a `Role` object:
```
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer
  namespace: development
rules:
- apiGroups: ["apps", "extensions"]
  resourceNames: ["mail-sender"]
  resources: ["deployments"]
  verbs: ["list", "get", "create", "update", "delete"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
```
Then bind the role to a user using a `RoleBinding` object:
```
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-bot-rolebinding
  namespace: development
subjects:
- kind: User
  name: bot
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer
  apiGroup: rbac.authorization.k8s.io
```
Check user access using the `can-i` command:
```
kubectl auth can-i create deployments
kubectl auth can-i delete nodes

# For admin users to test other user permissions
kubectl auth can-i create pods --as bot
```
Cluster Roles
Unlike `Role` and `RoleBinding`, `ClusterRole` and `ClusterRoleBinding` are cluster-scoped, not namespaced.
There are other cluster-wide resources, such as:
- nodes
- PV
- certificatesigningrequests
- namespaces
- user
Note that you can still create a cluster role for namespaced resources as well; for example, a cluster role that can create pods grants permission to create pods in all namespaces across the cluster.
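A hedged sketch of a cluster role over nodes and its binding (the names are illustrative):
```
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-admin
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: node-admin-binding
subjects:
- kind: User
  name: cluster-admin-user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: node-admin
  apiGroup: rbac.authorization.k8s.io
```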
Image security
Let's say we want to run a pod using an image from a private registry. To do that, we need credentials to log in to that registry:
```
docker login private-registry.io
```
In Kubernetes, images are pulled and run by the Docker runtime on the worker nodes, so the credentials need to be passed to the worker nodes.
First we create a secret of type docker-registry; this secret type is built for storing Docker credentials:
```
kubectl create secret docker-registry my-registry-secret \
  --docker-server=private-registry.io \
  --docker-username=registry-user \
  --docker-password=registry-password \
  --docker-email=registry-user@org.com
```
Then specify the `imagePullSecrets` field in the object spec:
```
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: private-registry.io/apps/internal-app/nginx
  imagePullSecrets:
  - name: my-registry-secret
```
Security Contexts
When you run a container, you have the option to define a set of security settings, such as the user ID, Linux capabilities, etc.
```
docker run --user=1001 ubuntu sleep 3600
docker run --cap-add MAC_ADMIN ubuntu
```
In Kubernetes, you can configure the `securityContext` field at the container level or at the pod level (container-level settings override pod-level settings).
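A minimal sketch showing both levels (values are illustrative; note that capabilities are only supported at the container level):
```
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-sleeper
spec:
  securityContext:        # pod level, applies to all containers
    runAsUser: 1001
  containers:
  - name: ubuntu
    image: ubuntu
    command: ["sleep", "3600"]
    securityContext:      # container level, overrides the pod-level setting
      runAsUser: 1002
      capabilities:
        add: ["MAC_ADMIN"]
```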
Network Policy
Traffic Flow
Looking at the direction in which the traffic originated:
- ingress: the incoming traffic from the users
- egress: the outgoing requests to the app server
Take a web application as an example, which combines a frontend server (port 80), a backend server (port 5000) and a database (port 3306). When a user logs in to the website, we can list these traffic rules:
```
1. ingress 80   ----> frontend
2. egress 5000  ----> frontend
3. ingress 5000 ----> backend
4. egress 3306  ----> backend
5. ingress 3306 ----> database
```
Because of the default all-allow policy, services can communicate with each other within the cluster by service name or IP address. Using the example above, you may have security concerns about letting the frontend service communicate directly with the database.
We can implement a `NetworkPolicy` object to restrict the traffic, using labels and selectors:
```
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-policy
spec:
  podSelector:
    matchLabels:
      role: db
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          name: backend-pod
    ports:
    - protocol: TCP
      port: 3306
```
Note that Network Policies are enforced by the network solution implemented on the Kubernetes cluster, and not all network solutions support network policies.
- Support: Kube-router, Calico, Romana, Weave-net
- Not-support: Flannel