Kubernetes Short Notes (1)

  • Devops

Cluster Architecture

Master Node

  • ETCD cluster
  • kube-scheduler
  • kube-controller-manager

These components communicate via the kube-apiserver

Worker Node

  • container runtime engine, e.g. Docker, rkt, containerd
  • kubelet: the agent that runs on each node and listens for instructions from the kube-apiserver
  • containers

Services deployed on worker nodes communicate with each other via kube-proxy

Objectives

ETCD

  • a distributed reliable key-value store
  • client communication on port 2379
  • server to server on port 2380

kube-api

  • primary management component

  • setup:

    1. using kubeadm tools

      • deploys kube-apiserver as a static pod in the kube-system namespace

      • the manifest is at /etc/kubernetes/manifests/kube-apiserver.yaml

    2. in a non-kubeadm setup

      • the options are at /etc/systemd/system/kube-apiserver.service

    3. either way, you can inspect the effective options by searching for the kube-apiserver process on the master node

  • example: applying a deployment with kubectl

    1. kube-apiserver authenticates the user

    2. validates the HTTP request

    3. the kube-scheduler monitors changes through kube-apiserver, then:

      • retrieves the node information from kube-apiserver

      • schedules the pod to a node, instructing the kubelet through kube-apiserver

    4. kube-apiserver updates the pod info in ETCD
  • kube-controller-manager

    • continuously monitors the state of components
    • the controllers are packaged into a single process called kube-controller-manager, which includes:
      1. deployment-controller, cronjob, service-account-controller …
      2. namespace-controller, job-controller, node-controller …
      3. endpoint-controller, replicaset, replication-controller (replica set) …
    • remediates the situation when the actual state drifts from the desired state

    kube-scheduler

    • decides which pod goes onto which node, in two steps:
      1. filter nodes
      2. rank nodes

    kubelet

    • follows instructions from the kube-scheduler and controls the container runtime engine (e.g. Docker) to run or remove containers
    • when a cluster is deployed with kubeadm tools, the kubelet is not installed on worker nodes by default; it must be installed manually

    kube-proxy

    • runs on each node in the cluster
    • creates iptables rules on each node to forward traffic headed for a service IP to the IP of an actual pod
    • the kubeadm tool deploys kube-proxy as a DaemonSet on each node

    pod

    • containers are encapsulated into a pod
    • a pod is a single instance of an application, the smallest object in k8s
    • containers in the same pod share storage and network namespaces, and are created and removed together
    • multi-container pods are a rare use case

    ReplicationController

    • apiVersion: supported in v1
    • a process that monitors pods
    • maintains HA and keeps the specified number of pods running across the nodes
    • only cares about pods whose RestartPolicy is set to Always
    • scalable and replaceable applications should be managed by a controller
    • use cases: rolling updates, multiple release tracks (multiple replication controllers replicating the same pod template but using different labels)

    ReplicaSets

    • the next generation of ReplicationController
    • apiVersion: supported in apps/v1
    • enhanced filtering in .spec.selector (the major difference)
    • be aware of non-template pods that carry the same labels, as they are counted too
    • using a Deployment as a replacement is recommended; it owns and manages its ReplicaSets

    Deployment

    • provides replication via a ReplicaSet, plus other features:
      • rolling update
      • rollout
      • pause and resume

    Namespace

    • namespaces created at cluster creation:

      1. kube-system
      2. kube-public
      3. default

    • each namespace can be assigned a quota of resources

    • a DNS entry in the SERVICE_NAME.NAMESPACE.svc.cluster.local format is automatically created at service creation

      1. cluster.local is the default domain name of the cluster

    • permanently configure the default namespace for the current context (e.g. with kubectl config set-context)


    Generator as Coroutines

    • Python

    Generator as Coroutines

    • cooperative multitasking (cooperative routines)
    • concurrent, not parallel (a Python program executes on a single thread)

    The way to create coroutines:

    • generators (the mechanism asyncio was originally built on)
    • native coroutines (using async/await)
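    A generator-based coroutine can be sketched as follows; the averager name is illustrative. The generator must be primed with next() before the first send():

```python
def averager():
    """A coroutine that keeps a running average of the values sent in."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # pause here until a value is sent in
        total += value
        count += 1
        average = total / count

coro = averager()
next(coro)              # prime the coroutine: advance to the first yield
print(coro.send(10))    # 10.0
print(coro.send(20))    # 15.0
```

    Priming matters: sending a non-None value to a just-started generator raises TypeError.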

    Concepts

    • concurrency: tasks start, run and complete in overlapping time periods
    • parallelism: tasks run simultaneously


    • cooperative: control is relinquished to another task voluntarily; controlled by the application (developer)
    • preemptive: control is relinquished to another task involuntarily; controlled by the OS.

      some sort of scheduler is involved


    • Global Interpreter Lock (GIL)

      Only one native thread executes at a time.

      Use process-based parallelism to avoid the GIL, not thread-based.

      The Python threading module uses threads instead of processes. Threads run in the same memory heap, whereas processes run in separate memory heaps, which makes sharing information harder between processes and object instances. Because threads share the same memory heap, multiple threads can write to the same location in it; this is why the global interpreter lock (GIL) in CPython was created, as a mutex to prevent that from happening.

    Make the right choice

    • CPU bound => multiprocessing
    • I/O bound, fast I/O, limited connections => multithreading
    • I/O bound, slow I/O, many connections => async concurrency
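    As a sketch of the I/O-bound case, a thread pool overlaps the waiting time of simulated I/O tasks; fetch and the timings are illustrative:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(n):
    # simulated slow I/O: sleep releases the GIL, so the waits overlap across threads
    time.sleep(0.05)
    return n * 2

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fetch, range(8)))
elapsed = time.perf_counter() - start

print(results)   # [0, 2, 4, 6, 8, 10, 12, 14]
# run serially this would take about 8 * 0.05s; the pool finishes in roughly 0.05s
```

    For the CPU-bound case, concurrent.futures.ProcessPoolExecutor offers the same interface but sidesteps the GIL by using separate processes.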

    Use deque

    A much more efficient way to implement a stack or a queue.

    Operating on 10,000 items, averaged over 1,000 runs (times in seconds):

    | operation     | list  | deque  |
    |---------------|-------|--------|
    | append(right) | 0.87  | 0.87   |
    | pop(right)    | 0.002 | 0.0005 |
    | insert(left)  | 20.8  | 0.84   |
    | pop(left)     | 0.012 | 0.0005 |

    Create an unbounded deque with deque() or deque(iterable).
    Create a bounded deque with deque(maxlen=n); when it is full, appending discards a corresponding number of items from the opposite end.
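    Both behaviours can be checked directly:

```python
from collections import deque

# unbounded deque: O(1) appends and pops at both ends
d = deque([1, 2, 3])
d.appendleft(0)    # deque([0, 1, 2, 3])
d.append(4)        # deque([0, 1, 2, 3, 4])
d.popleft()        # 0
d.pop()            # 4

# bounded deque: appending to a full deque discards from the opposite end
ring = deque(maxlen=3)
for i in range(5):
    ring.append(i)
print(list(ring))  # [2, 3, 4]
```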

    Implement producer / consumer coroutine using deque
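    A minimal sketch of the idea, with illustrative names; each generator yields control back to the driver after one step:

```python
from collections import deque

queue = deque()
consumed = []

def produce(n):
    """Push items onto the left of the queue, one per scheduling turn."""
    for i in range(n):
        queue.appendleft(i)
        yield                      # hand control back to the caller

def consume():
    """Pop items from the right of the queue, one per scheduling turn."""
    while True:
        if queue:
            consumed.append(queue.pop())
        yield

# drive the two coroutines by hand, alternating producer and consumer
producer, consumer = produce(3), consume()
for _ in range(3):
    next(producer)
    next(consumer)

print(consumed)   # [0, 1, 2]
```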

    Implement simple event loop
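    A simple round-robin event loop over generator-based tasks might look like this (names are illustrative):

```python
from collections import deque

def countdown(label, n, log):
    """A task that records (label, n) each turn, counting n down to zero."""
    while n:
        log.append((label, n))
        n -= 1
        yield          # give other tasks a turn

def run_event_loop(tasks):
    """Run generator tasks round-robin until all are exhausted."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)             # resume the task until its next yield
            ready.append(task)     # still alive: reschedule at the back
        except StopIteration:
            pass                   # task finished: drop it

log = []
run_event_loop([countdown("a", 2, log), countdown("b", 1, log)])
print(log)   # [('a', 2), ('b', 1), ('a', 1)]
```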


    Context Manager

    • Python

    Context Manager

    what is a context

    the state surrounding a section of code

    why we need a context manager

    • writing try/finally every time gets cumbersome
    • it is easy to forget to close the file

    use cases

    Useful for programs that need enter / exit handling

    • create / releasing resources
    • database transaction
    • set and reset decimal context

    Common patterns

    • open / close
    • lock / release
    • change / reset
    • start / stop
    • enter / exit

    protocol

    implement these two dunder methods:

    • __enter__

      perform the setup, optionally return an object

    • __exit__

      receives the exception (to silence or propagate it)

      • needs the arguments exc_type, exc_value, exc_trace to handle the exception
      • return True to silence the exception

      performs the cleanup

    examples
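    As an illustration of the protocol, a hypothetical Timer context manager:

```python
import time

class Timer:
    """A context manager that measures the wall-clock time of its block."""

    def __enter__(self):
        # setup: record the start time; the returned object is bound by `as`
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, exc_trace):
        # cleanup: always runs, even if the block raised
        self.elapsed = time.perf_counter() - self.start
        return False   # False/None: propagate any exception

with Timer() as t:
    total = sum(range(10_000))

print(total, t.elapsed)
```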

    contextlib
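    contextlib.contextmanager turns a generator into a context manager: code before the yield plays the role of __enter__, the finally block plays __exit__. The tag helper is illustrative:

```python
from contextlib import contextmanager

@contextmanager
def tag(name, out):
    """Wrap the body of the with-block in open/close markers."""
    out.append(f"<{name}>")     # runs on enter
    try:
        yield out               # the with-block executes here
    finally:
        out.append(f"</{name}>")  # runs on exit, even if the block raised

out = []
with tag("b", out):
    out.append("hello")
print(out)   # ['<b>', 'hello', '</b>']
```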

    nested contexts
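    Contexts nest in LIFO order, and contextlib.ExitStack handles a dynamic number of them. A sketch with an illustrative resource helper:

```python
from contextlib import ExitStack, contextmanager

@contextmanager
def resource(name, log):
    log.append(f"open {name}")
    try:
        yield name
    finally:
        log.append(f"close {name}")

log = []
# nested with: inner contexts exit first
with resource("a", log), resource("b", log):
    pass
print(log)   # ['open a', 'open b', 'close b', 'close a']

# ExitStack enters any number of contexts and unwinds them in reverse
log2 = []
with ExitStack() as stack:
    for name in "xyz":
        stack.enter_context(resource(name, log2))
print(log2)  # ['open x', 'open y', 'open z', 'close z', 'close y', 'close x']
```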

    Redis

    • Devops

    Redis

    compared to memcached

    • supports persistent volumes
      • RDB
      • AOF
    • supports multiple data types
    • pub/sub

    commands

    • redis-cli: command-line interface
    • redis-sentinel: high-availability monitoring and failover tool
    • redis-server: runs the server
    • redis-benchmark: stress testing
    • redis-check-aof: checks the AOF file
    • redis-check-dump: checks the RDB file

    configuration

    Use redis.conf. The official Docker redis image does not contain this file; mount it yourself or pass options as redis-server arguments.

    types

    • String: get, set, mget, mset
    • Integer (strings holding numbers): incr, decr, setbit
    • List: lpush, lrange, lpop
    • Hash Map: hset, hget, hmset, hmget
    • Set: sadd, smembers, sdiff, sinter, sunion

    use docker

    Before starting

    To connect to a container, you need to know its name and port, and be in an associated network to be able to discover the service.

    There is no DNS resolution in Docker's default bridge network; there you must specify --link to connect containers, and --link is a legacy feature.

    Therefore, creating a user-defined network is recommended; it provides automatic DNS resolution.

    Create a bridge network

    Run a redis instance in the user-defined network

    Run redis-cli and connect to the redis instance

    Transaction

    all commands in a transaction are executed as a single isolated operation, serialized and executed sequentially
    atomic: the queued commands either all run or none run (note that commands failing after EXEC are not rolled back)

    • MULTI: opens a transaction; always returns OK
    • EXEC: executes the queued commands in the transaction
    • DISCARD: flushes the queued commands and exits the transaction
    • WATCH: check-and-set; if a watched key changes before EXEC, the transaction is not executed

    Errors

    • before EXEC: e.g. a syntax error while queueing
    • after EXEC: e.g. a value error; the remaining commands still run

    The pipeline discards the transaction automatically if there was an error during command queueing

    … To be continued

    Generator

    • Python

    Generator

    • a type of iterator
    • generator function: a function that uses the yield statement
    • implements the iterator protocol: call next() on it
    • raises StopIteration when exhausted

    Less code

    Implement an iterator
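    For example, an iterator class must implement __iter__ and __next__ and raise StopIteration itself (Squares is illustrative):

```python
class Squares:
    """Iterator over the first n squares, via the iterator protocol."""

    def __init__(self, n):
        self.n = n
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        result = self.i ** 2
        self.i += 1
        return result

print(list(Squares(4)))   # [0, 1, 4, 9]
```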

    Implement a generator
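    The same iterator as a generator function; yield supplies __iter__, __next__ and StopIteration for free:

```python
def squares(n):
    # each yield suspends the function; resuming picks up right after it
    for i in range(n):
        yield i ** 2

g = squares(4)
print(next(g))   # 0
print(list(g))   # [1, 4, 9] -- the generator resumes where it left off
```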

    More efficient

    Generator Comprehensions

    • local scope
    • lazy evaluation
    • is an iterator, can be exhausted
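    All three points in one sketch:

```python
gen = (i ** 2 for i in range(4))   # lazy: nothing is computed yet
print(next(gen))   # 0
print(list(gen))   # [1, 4, 9]
print(list(gen))   # [] -- exhausted, like any iterator
```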

    Delegating Generator

    Use the yield from syntax to yield items from a sub-generator (delegation)
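    For example:

```python
def inner():
    yield 1
    yield 2

def outer():
    yield 0
    yield from inner()   # delegate: items of inner() flow straight through
    yield 3

print(list(outer()))   # [0, 1, 2, 3]
```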

    Memcached

    • Devops

    Memcached

    Stores and retrieves data in memory (not persistent), locating items with a specific hash function.

    concepts

    • Slab: allocated as many pages as are available

    • Page: a memory area of default 1MB, which contains as many chunks as fit

    • Chunk: the minimum allocated space for a single item

    • LRU: the least-recently-used list

    ref: Journey to the centre of memcached

    we could say that we would run out of memory when all the available pages are allocated to slabs

    memcached is designed to evict old/unused items in order to store new ones

    every item operation (get, set, update or remove) requires the item in question to be locked

    memcached only tries to remove the first 5 items of the LRU — after that it simply gives up and answers with OOM (out of memory)
