Scheduling
Manual Scheduling
Bind the pod to a node through the nodeName property. Until it is bound, the pod stays in the Pending state.
Manual ways to bind:
- specify spec.nodeName when creating the pod; the field cannot be updated afterwards
- create a Binding object and POST it to the pod's binding endpoint:
curl --header "Content-Type:application/json" \
  --request POST --data '{"apiVersion": "v1", "kind": "Binding" ...}' \
  http://$SERVER/api/v1/namespaces/default/pods/$PODNAME/binding/
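As an illustration of both approaches, the sketch below writes a Pod manifest that sets spec.nodeName and a complete Binding body that could be sent to the binding endpoint. The pod name nginx, the node name node01 and the file names are assumptions made up for this example. Applying them needs a real cluster, so those commands are left as comments.

```shell
# Option 1: set spec.nodeName at creation time (hypothetical pod "nginx", node "node01")
cat > pod-nodename.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  nodeName: node01   # assumed node name; the scheduler is bypassed
  containers:
    - name: nginx
      image: nginx
EOF

# Option 2: body of a Binding object for a pod that already exists and is Pending
cat > binding.json <<'EOF'
{
  "apiVersion": "v1",
  "kind": "Binding",
  "metadata": { "name": "nginx" },
  "target": { "apiVersion": "v1", "kind": "Node", "name": "node01" }
}
EOF

# With a cluster available:
#   kubectl apply -f pod-nodename.yaml
#   curl --header "Content-Type:application/json" --request POST \
#     --data @binding.json http://$SERVER/api/v1/namespaces/default/pods/nginx/binding/
```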
Labeling
Labels group and select objects. For example, a ReplicaSet object configures:
- metadata.labels: labels on the ReplicaSet itself
- spec.template.metadata.labels: labels on the Pods it creates
- spec.selector.matchLabels: how the ReplicaSet discovers its Pods
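To make the three label locations concrete, here is a minimal ReplicaSet sketch. The name web-rs, the label app=web and the nginx image are illustrative assumptions, not values from these notes.

```shell
cat > replicaset-labels.yaml <<'EOF'
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-rs
  labels:
    app: web          # label on the ReplicaSet itself
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web        # must match the pod template labels below
  template:
    metadata:
      labels:
        app: web      # label on every Pod the ReplicaSet creates
    spec:
      containers:
        - name: nginx
          image: nginx
EOF

# With a cluster available:
#   kubectl apply -f replicaset-labels.yaml
#   kubectl get pods --selector app=web
```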
Annotations
Annotations record other details for integration purposes, e.g. build info or contact details
Restriction
Taint/Toleration
Pods without a matching toleration cannot be scheduled onto a tainted node
Taint the nodes
kubectl taint nodes master node-role.kubernetes.io/master:NoSchedule   # Taint
kubectl taint nodes master node-role.kubernetes.io/master:NoSchedule-  # Untaint
Set the pods' tolerations. A taint has one of three effects on pods that do not tolerate it:
- NoSchedule: the pod is not scheduled onto the node
- PreferNoSchedule: the scheduler tries to avoid the node, but this is not guaranteed
- NoExecute: new pods are not scheduled and existing pods are evicted
Note: put the values of the toleration keys in double quotes so that YAML reads them as strings
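A toleration on the pod side might look like the sketch below. The taint app=blue:NoSchedule, the node name node01 and the pod name are hypothetical, chosen only to show how the toleration fields line up with the taint.

```shell
# Matching taint on the node (needs a cluster):
#   kubectl taint nodes node01 app=blue:NoSchedule

cat > pod-toleration.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: blue-app
spec:
  containers:
    - name: nginx
      image: nginx
  tolerations:
    - key: "app"              # taint key
      operator: "Equal"
      value: "blue"           # taint value
      effect: "NoSchedule"    # taint effect being tolerated
EOF
```

A toleration only allows the pod onto the tainted node; it does not force the pod to land there.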
Node Selector
Restrict the pod to nodes that carry a given label
- Label the node
- Set the nodeSelector
Note: nodeSelector cannot express OR or NOT conditions; use node affinity for those
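The two steps can be sketched as follows. The label size=Large, the node name node01 and the pod name are assumptions for illustration.

```shell
# Step 1, label the node (needs a cluster):
#   kubectl label nodes node01 size=Large

# Step 2, select that label from the pod
cat > pod-nodeselector.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: data-processor
spec:
  containers:
    - name: data-processor
      image: nginx
  nodeSelector:
    size: Large     # pod is only scheduled onto nodes labelled size=Large
EOF
```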
Node Affinity
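Restrict the pod to one or more particular groups of nodes, using richer expressions than nodeSelector
- Label the node
- Set the nodeAffinity
- operators: In, NotIn, Exists, DoesNotExist, Gt, Lt
- 3 types: requiredDuringSchedulingIgnoredDuringExecution, preferredDuringSchedulingIgnoredDuringExecution, and requiredDuringSchedulingRequiredDuringExecution (planned, not available)
Combine Taint/Toleration with NodeSelector or NodeAffinity to cover both directions: taints keep other pods off the dedicated nodes, and the selector or affinity keeps the dedicated pods off other nodes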
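For comparison with nodeSelector, a node affinity rule that expresses an OR condition could be sketched like this. The label key size and its values are hypothetical.

```shell
cat > pod-nodeaffinity.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: data-processor
spec:
  containers:
    - name: data-processor
      image: nginx
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: size
                operator: In      # size is Large OR Medium
                values:
                  - Large
                  - Medium
EOF
```

Swapping the expression for `operator: NotIn` with `values: [Small]` would express the NOT case instead.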
Resources
Request
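- The scheduler places a pod based on its resource requests
- Kubernetes has no built-in default request; a default such as 0.5 cpu and 256Mi memory applies only if a LimitRange in the namespace defines it
Limit
- Likewise, a default limit such as 1 cpu and 512Mi memory applies only if a LimitRange defines it
- When a container tries to use more resources than its limit:
  - cpu: k8s throttles the cpu; the container is not killed
  - memory: k8s kills the container with an OOM (out of memory) error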
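As a sketch of how requests and limits are declared, the block below writes a Pod with explicit values and a LimitRange, which is one way to give a namespace such defaults. The numbers reuse the figures quoted above; the object names are assumptions.

```shell
cat > pod-resources.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: nginx
      resources:
        requests:
          cpu: "500m"       # 0.5 cpu, used by the scheduler for placement
          memory: "256Mi"
        limits:
          cpu: "1"          # throttled above this
          memory: "512Mi"   # OOM-killed above this
EOF

cat > limitrange-defaults.yaml <<'EOF'
apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources
spec:
  limits:
    - type: Container
      defaultRequest:       # applied when a container sets no request
        cpu: 500m
        memory: 256Mi
      default:              # applied when a container sets no limit
        cpu: "1"
        memory: 512Mi
EOF
```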
Static Pods
Used to deploy the control plane components (this is how kubeadm sets up a cluster)
Without any intervention from the kube-apiserver, the kubelet can manage a node on its own by watching manifest files in a directory on the file system. It can create, recreate, update and delete objects this way, but only Pods.
- --pod-manifest-path=/etc/kubernetes/manifests
- --config=kubeconfig.yaml (with staticPodPath set in that file)
Once a static pod is created, the kube-apiserver only gets a read-only mirror of it and cannot update or delete it
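A minimal sketch of the file-based workflow follows. It uses a local directory as a stand-in for the kubelet's manifest directory so that it can run anywhere; the directory name, the pod name and the config file name are assumptions.

```shell
# Stand-in for the kubelet's static pod directory (commonly /etc/kubernetes/manifests)
static_dir=./static-pod-demo
mkdir -p "$static_dir"

# Dropping a Pod manifest into the watched directory is all it takes;
# removing the file makes the kubelet delete the pod again.
cat > "$static_dir/static-web.yaml" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: static-web
spec:
  containers:
    - name: web
      image: nginx
EOF

# Kubelet configuration fragment that points at the real directory
cat > kubelet-config-fragment.yaml <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
staticPodPath: /etc/kubernetes/manifests
EOF
```

On a real node the mirror pod shows up in `kubectl get pods` with the node name appended to the pod name.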
Multiple Scheduler
- copy the kube-scheduler manifest from /etc/kubernetes/manifests
- rename the scheduler with --scheduler-name
- if a single master node runs multiple schedulers:
  - set --leader-elect=false
- if multiple masters run multiple schedulers, only one instance can be active at a time:
  - set --leader-elect=true
  - set --lock-object-name to differentiate the custom scheduler from the default one
- specify the scheduler for a pod with schedulerName
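On the pod side, selecting a scheduler is a single field. In this sketch the scheduler name my-custom-scheduler is hypothetical and must match the name the custom scheduler was started with.

```shell
cat > pod-custom-scheduler.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  schedulerName: my-custom-scheduler   # assumed name of the custom scheduler
  containers:
    - name: nginx
      image: nginx
EOF

# With a cluster available, the events show which scheduler placed the pod:
#   kubectl apply -f pod-custom-scheduler.yaml
#   kubectl get events -o wide
```

If no scheduler with that name is running, the pod stays in the Pending state.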
Logging & Monitoring
Kubernetes does not come with a full-featured built-in monitoring solution, but open-source and commercial solutions are available:
- Metrics Server (the successor to Heapster): keeps metrics in memory only
- Prometheus
- Elastic Stack
- Datadog
- Dynatrace
The kubelet contains a subcomponent, cAdvisor (Container Advisor), which is responsible for retrieving performance metrics from pods
Application Lifecycle
Rollout & Versioning
Strategy
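- Recreate: scale the old version down to zero, then scale the new version up (the application is unavailable in between)
- RollingUpdate: the default; replace pods incrementally, taking old pods down and bringing new ones up a few at a time so the application stays available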
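The strategy is set on the Deployment. The sketch below assumes a Deployment named web with four replicas and caps the rollout at one pod down and one extra pod up at a time; all names and numbers are illustrative.

```shell
cat > deployment-rolling.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 4
  strategy:
    type: RollingUpdate      # use "Recreate" for the other strategy
    rollingUpdate:
      maxUnavailable: 1      # at most one pod below the desired count
      maxSurge: 1            # at most one pod above the desired count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
EOF

# With a cluster available:
#   kubectl apply -f deployment-rolling.yaml
#   kubectl rollout status deployment/web
#   kubectl rollout undo deployment/web     # roll back to the previous revision
```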
Configure Application
Command and Argument
- ENTRYPOINT in the Dockerfile specifies the default executable program that runs when the container starts
- CMD in the Dockerfile supplies the default parameters passed to that program; they are appended after ENTRYPOINT
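In a pod spec, command overrides the image's ENTRYPOINT and args overrides its CMD. A small sketch, assuming a made-up image called ubuntu-sleeper built from the Dockerfile shown:

```shell
# Image side: "sleep" is the program, "5" the default parameter
cat > Dockerfile.sleeper <<'EOF'
FROM ubuntu
ENTRYPOINT ["sleep"]
CMD ["5"]
EOF

# Pod side: override one or both defaults
cat > pod-sleeper.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: sleeper
spec:
  containers:
    - name: sleeper
      image: ubuntu-sleeper    # hypothetical image built from Dockerfile.sleeper
      command: ["sleep"]       # overrides ENTRYPOINT
      args: ["10"]             # overrides CMD, so the container runs "sleep 10"
EOF
```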
Environment Variable
- use plain value
- use value from ConfigMap
- use value from Secret
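The three sources can sit side by side in one container spec. In this sketch the ConfigMap app-config, the Secret app-secret and the variable names are assumptions.

```shell
cat > pod-env.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: webapp
spec:
  containers:
    - name: webapp
      image: nginx
      env:
        - name: APP_COLOR          # plain value
          value: pink
        - name: APP_MODE           # value from a ConfigMap
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: APP_MODE
        - name: DB_PASSWORD        # value from a Secret
          valueFrom:
            secretKeyRef:
              name: app-secret
              key: DB_PASSWORD
EOF
```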
Secret
- values are base64-encoded, not encrypted, so a Secret is not secure in that sense
- a secret is only sent to a node if a pod on that node requires it
- the kubelet stores the secret in a tmpfs, so it is not written to disk
- the local copy is deleted together with the pod that uses it
- to improve security:
  - know the risks
  - enable encryption at rest
  - use Helm-Secrets or HashiCorp Vault
  - use a KMS provider for better security
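A quick way to see that base64 is only an encoding is to round-trip a value in the shell. The password and the Secret name below are made up for the example.

```shell
# Encode a value the way it is stored under a Secret's data field
encoded=$(printf '%s' 'mypassword' | base64)
echo "$encoded"        # bXlwYXNzd29yZA==

# Anyone who can read the Secret can reverse the encoding
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"        # mypassword

# Unquoted EOF so that $encoded is expanded into the manifest
cat > secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
data:
  DB_PASSWORD: $encoded
EOF
```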
Multi-container PODs Design Patterns
- Sidecar
- Adapter
- Ambassador
initContainers
initContainers run processes that run to completion. Every process defined in initContainers
must finish before the real container hosting the application starts. Use cases:
- pull code or a binary from a repository
- wait for an external service or database to be up
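The wait-for-a-service case could be sketched as below. The service name mydb and the pod name are hypothetical; the init container loops until the name resolves, and only then does the application container start.

```shell
cat > pod-init.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  initContainers:
    - name: wait-for-db
      image: busybox:1.36
      # Runs to completion: exits once the (assumed) mydb service resolves
      command: ['sh', '-c', 'until nslookup mydb; do echo waiting for mydb; sleep 2; done']
  containers:
    - name: myapp
      image: nginx
EOF
```

If several initContainers are listed, they run one at a time in the order given.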
Self Healing
- Liveness Probes
- Readiness Probes
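Both probes are declared per container. When a liveness probe fails, the kubelet restarts the container; when a readiness probe fails, the pod stops receiving traffic from Services until it passes again. The paths, port and timings in this sketch are assumptions about a hypothetical application.

```shell
cat > pod-probes.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: webapp
spec:
  containers:
    - name: webapp
      image: nginx
      livenessProbe:             # failure => container is restarted
        httpGet:
          path: /healthz         # assumed health endpoint
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 5
      readinessProbe:            # failure => pod is taken out of Service endpoints
        httpGet:
          path: /ready           # assumed readiness endpoint
          port: 8080
        periodSeconds: 5
EOF
```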