Generator as Coroutines

  • cooperative multitasking (cooperative routines)
  • concurrent, not parallel (the coroutines of a Python program all execute on a single thread)

Ways to create coroutines:

  • generators (generator-based coroutines, the style asyncio originally used)
  • native coroutines (using async/await)

Concepts

  • concurrency: tasks start, run and complete in overlapping time periods
  • parallelism: tasks run simultaneously

  • cooperative: a task gives up control voluntarily; switching is controlled by the application (the developer)
  • preemptive: a task is forced to give up control; switching is controlled by the OS

    Preemptive multitasking involves some sort of scheduler.

  • Global Interpreter Lock (GIL)

    Only one native thread executes Python bytecode at a time.

    Use process-based parallelism, not thread-based parallelism, to avoid the GIL.

    The Python threading module uses threads instead of processes. Threads run in the same memory heap, whereas processes run in separate memory heaps, which makes sharing information and object instances harder between processes. Because threads share one heap, several threads can write to the same memory location at once; the global interpreter lock (GIL) in CPython was created as a mutex to protect the interpreter's internal state from such concurrent access.

Make the right choice

  • CPU bound => multiprocessing
  • I/O bound, fast I/O, limited connections => multithreading
  • I/O bound, slow I/O, many connections => concurrency

Use deque

A deque is a much more efficient way to implement a stack or a queue than a list.

Operations on 10,000 items, averaged over 1,000 runs:

(times in seconds)   list     deque
append (right)       0.87     0.87
pop (right)          0.002    0.0005
insert (left)        20.8     0.84
pop (left)           0.012    0.0005

Create an unbounded deque with deque() or deque(iterable).
Create a bounded deque with deque(maxlen=n). When it is full, adding items to one end discards the same number of items from the opposite end.
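
A minimal sketch of both forms, using only collections.deque from the standard library. The variable names are illustrative.

```python
from collections import deque

# Unbounded deque built from an iterable.
queue = deque([1, 2, 3])
queue.append(4)          # add on the right
queue.appendleft(0)      # add on the left
first = queue.popleft()  # remove from the left
last = queue.pop()       # remove from the right

# Bounded deque: once full, appending on the right drops items on the left.
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
# recent now holds only the last three values that were appended
```

Using append with popleft gives a queue, and append with pop gives a stack.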

Implement producer / consumer coroutine using deque
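
One possible sketch of this idea, not necessarily the original post's code: two generators share a bounded deque and give up control with a bare yield when they cannot make progress. The names produce, consume and coordinate are assumptions made for this illustration.

```python
from collections import deque


def produce(queue, items):
    """Append items to the queue, yielding control whenever it is full."""
    for item in items:
        while len(queue) == queue.maxlen:
            yield  # queue is full: let the consumer run
        queue.append(item)


def consume(queue, out):
    """Move items from the queue into `out`, yielding when it is empty."""
    while True:
        while queue:
            out.append(queue.popleft())
        yield  # queue is empty: let the producer run


def coordinate(items, maxlen=3):
    """Alternate between producer and consumer until the producer is done."""
    queue = deque(maxlen=maxlen)
    received = []
    producer = produce(queue, items)
    consumer = consume(queue, received)
    while True:
        try:
            next(producer)
        except StopIteration:
            next(consumer)  # drain whatever is left in the queue
            break
        next(consumer)
    consumer.close()
    return received


result = coordinate(range(10))
```

The coordinator is what makes this cooperative: neither generator is ever interrupted, each one decides where it yields.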

Implement simple event loop
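
A rough sketch of a round-robin loop over generator-based tasks, assuming each task yields whenever it is willing to give up control. The names countdown and run are hypothetical and only serve this example.

```python
from collections import deque


def countdown(name, n, log):
    """A task that records one step per turn, then finishes."""
    while n > 0:
        log.append((name, n))
        n -= 1
        yield  # hand control back to the loop


def run(tasks):
    """Round-robin scheduler: run each task until its next yield."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished: do not reschedule it
        ready.append(task)  # task yielded: put it at the back of the queue


log = []
run([countdown("a", 2, log), countdown("b", 3, log)])
```

The log shows the two tasks interleaving on a single thread, which is concurrency without parallelism.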

Context Manager

what is context

the state surrounding a section of code

why we need a context manager

  • writing try/finally every time gets cumbersome
  • it is easy to forget to close the file

use cases

Useful for programs that need enter / exit handling:

  • creating / releasing resources
  • database transaction
  • set and reset decimal context

Common patterns

  • open / close
  • lock / release
  • change / reset
  • start / stop
  • enter / exit

protocol

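Implement these two dunder methods:

  • __enter__

    performs the set-up and optionally returns an object (bound to the name after as)

  • __exit__

    performs the clean-up and receives any exception raised in the block, which it can silence or propagate

    • takes the arguments exc_type, exc_value, exc_trace to handle the exception
    • return True to silence the exception; a falsy return value lets it propagate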
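
The protocol can be sketched with a small class. SilenceValueError is a hypothetical name invented for this example; it records when each method runs and silences only ValueError.

```python
class SilenceValueError:
    """Hypothetical context manager: logs calls and swallows ValueError."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")  # set-up
        return self                  # bound to the name after `as`

    def __exit__(self, exc_type, exc_value, exc_trace):
        self.events.append("exit")   # clean-up runs even when an error occurred
        # Returning True silences the exception; a falsy value propagates it.
        return exc_type is not None and issubclass(exc_type, ValueError)


with SilenceValueError() as ctx:
    raise ValueError("this one is silenced")

events = ctx.events
```

Any exception that is not a ValueError still propagates out of the with block, after __exit__ has run.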

examples

contextlib
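
A short sketch of the change / reset pattern written with contextlib.contextmanager, here applied to the decimal context. The function name precision is an assumption for this example; the standard library also ships decimal.localcontext for the same job.

```python
from contextlib import contextmanager
from decimal import getcontext


@contextmanager
def precision(prec):
    """Temporarily set the decimal precision and restore it on exit."""
    ctx = getcontext()
    old = ctx.prec
    ctx.prec = prec     # set-up: what __enter__ would do
    try:
        yield ctx       # the value bound by `as`
    finally:
        ctx.prec = old  # clean-up: what __exit__ would do


before = getcontext().prec
with precision(5) as ctx:
    inside = ctx.prec
after = getcontext().prec
```

The try/finally around the yield is what guarantees the reset when the body raises.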

nested contexts
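
Two ways to nest contexts, sketched with a hypothetical tag manager that only logs when it is entered and exited: several managers in one with statement, and contextlib.ExitStack for a number of managers that is known only at run time.

```python
from contextlib import ExitStack, contextmanager

log = []


@contextmanager
def tag(name):
    """Hypothetical manager that records when it is entered and exited."""
    log.append(f"enter {name}")
    try:
        yield name
    finally:
        log.append(f"exit {name}")


# Several managers in one `with`: entered left to right, exited in reverse.
with tag("a"), tag("b"):
    log.append("body 1")

# ExitStack: enter as many managers as needed, all exited at the end.
with ExitStack() as stack:
    names = [stack.enter_context(tag(n)) for n in ("c", "d")]
    log.append("body 2")
```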

Redis

compared to memcached

  • supports persistence
    • RDB (point-in-time snapshots)
    • AOF (append-only file of write commands)
  • supports multiple data types
  • pub/sub

commands

  • redis-cli: command-line interface
  • redis-sentinel: monitoring and automatic failover (high availability)
  • redis-server: runs the server
  • redis-benchmark: stress testing
  • redis-check-aof: checks an AOF file
  • redis-check-dump: checks an RDB file (named redis-check-rdb in newer releases)

configuration

Use redis.conf. The official Docker redis image does not contain this file; mount your own, or pass the settings as redis-server arguments.

types

  • String: get, set, mget, mset
  • Integer (stored as a String): incr, decr; setbit treats a String value as a bitmap
  • List: lpush, lrange, lpop
  • Hash Map: hset, hget, hmset, hmget
  • Set: sadd, smembers, sdiff, sinter, sunion

use docker

Before you start

To connect to a container, you need to know its name and port, and you need to be on the same network to discover the service.

There is no DNS resolution between containers on Docker's default bridge network. On the default network you have to pass --link to connect containers, and --link is a legacy feature.

Therefore, creating a user-defined network is recommended; it provides automatic DNS resolution.

Create a bridge network

Run a redis instance in the user-defined network

Run redis-cli and connect to the redis instance

Transaction

All commands are executed as a single isolated operation: they are serialized and executed sequentially.
Atomic: either all queued commands are processed or none are. A command that fails during EXEC does not roll back the others.

  • MULTI: opens a transaction; always returns OK
  • EXEC: executes the queued commands
  • DISCARD: flushes the queued commands and exits the transaction
  • WATCH: check-and-set; if a watched key changes before EXEC, the transaction is not executed

Errors

  • before EXEC: e.g. a syntax error while a command is being queued
  • after EXEC: e.g. an operation against a value of the wrong type

The transaction is discarded automatically if an error occurs while commands are being queued.

… To be continued

Generator

  • a type of iterator
  • generator function: a function that uses the yield statement
  • implements the iterator protocol, so next() can be called on it
  • raises StopIteration when exhausted

Less code

Implement an iterator

Implement a generator
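
To make the comparison concrete, here is a sketch of the same sequence written both ways. The names Squares and squares are chosen for this example only.

```python
class Squares:
    """Iterator written by hand: state and protocol are explicit."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        result = self.i ** 2
        self.i += 1
        return result


def squares(n):
    """Generator function: Python supplies __iter__ and __next__."""
    for i in range(n):
        yield i ** 2


from_class = list(Squares(4))
from_generator = list(squares(4))
```

Both produce the same items; the generator version keeps its state in local variables instead of instance attributes.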

More efficient

Generator Comprehensions

  • local scope
  • lazy evaluation
  • is an iterator, can be exhausted
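
A small sketch of the laziness and exhaustion described above:

```python
squares = (n * n for n in range(5))  # nothing is computed yet

first = next(squares)  # computes only the first item
rest = list(squares)   # consumes the remaining items
again = list(squares)  # the generator is exhausted, so this is empty
```

A list comprehension with the same body would compute all five values immediately and could be iterated repeatedly.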

Delegating Generator

Use the yield from syntax to yield the items of another generator.
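
A minimal sketch of delegation, with hypothetical generators inner and outer. It also shows that the return value of the inner generator becomes the value of the yield from expression.

```python
def inner():
    yield 1
    yield 2
    return "done"  # becomes the value of the `yield from` expression


def outer():
    yield 0
    result = yield from inner()  # delegates until inner() is exhausted
    yield result


values = list(outer())
```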

Memcached

Stores and retrieves data in memory (not persistent), based on a specific hash function.

concepts

  • Slab: groups chunks of one size; it is allocated pages for as long as pages are available

  • Page: a memory area of 1 MB by default, which contains as many chunks as fit

  • Chunk: the minimum space allocated for a single item

  • LRU: least recently used list

ref: Journey to the centre of memcached

we could say that we would run out of memory when all the available pages are allocated to slabs

memcached is designed to evict old/unused items in order to store new ones

every item operation (get, set, update or remove) requires the item in question to be locked

memcached only tries to remove the first 5 items of the LRU — after that it simply gives up and answers with OOM (out of memory)

commands with telnet

  • get

  • set

  • add: adds the key, or returns NOT_STORED if it already exists

  • replace: replaces the key, or returns NOT_STORED if it does not exist

  • append, prepend

  • incr, decr

  • delete

  • flush_all

  • stats

  • version

  • quit

Run Service

Image used: memcached

Python client: pymemcache

Distributed Caching

Modulo Hashing

  • Pros: balances the distribution of keys across the instances in the cluster
  • Cons: 1. data is lost if an instance goes down 2. hard to scale, because changing the number of instances remaps most keys
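
The scaling problem can be sketched in a few lines. This is an illustration under stated assumptions, not pymemcache's implementation: SHA-256 stands in for the real hash function, the server addresses are placeholders, and stable_hash and pick_server are names invented here.

```python
import hashlib


def stable_hash(key):
    """Deterministic integer hash of a string (SHA-256 as a stand-in)."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


def pick_server(key, servers):
    """Modulo hashing: the hash modulo the number of servers picks one."""
    return servers[stable_hash(key) % len(servers)]


servers = ["localhost:11211", "localhost:11212", "localhost:11213"]
keys = [f"user:{i}" for i in range(1000)]

before = {k: pick_server(k, servers) for k in keys}
after = {k: pick_server(k, servers[:2]) for k in keys}  # one server removed

# Count the keys that now map to a different server.
moved = sum(before[k] != after[k] for k in keys)
```

With three servers reduced to two, roughly two thirds of the keys are expected to move, including many that were never stored on the removed server.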

Example

Run 3 instances, exposed on ports 11211, 11212 and 11213

Use the Python client to set a key

Get the client instance

pymemcache uses Murmur3 hashing

Test with telnet

telnet to the third cache server

Consistent Hashing

Scaling up or down does not affect all the servers on the ring; only the keys next to the added or removed server move.
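
A simplified sketch of a hash ring, assuming SHA-256 as the hash and 100 virtual nodes per server. HashRing, stable_hash and the server addresses are hypothetical names for this illustration; real clients differ in the details.

```python
import bisect
import hashlib


def stable_hash(key):
    """Deterministic integer hash of a string (SHA-256 as a stand-in)."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, replicas=100):
        # Each server is placed at `replicas` points ("virtual nodes").
        self._ring = sorted(
            (stable_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def pick_server(self, key):
        # First point clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[index][1]


servers = ["localhost:11211", "localhost:11212", "localhost:11213"]
keys = [f"user:{i}" for i in range(1000)]

before = HashRing(servers)
after = HashRing(servers[:2])  # the third server is removed
moved = [k for k in keys if before.pick_server(k) != after.pick_server(k)]
```

Every key in moved was stored on the removed server; keys on the two remaining servers keep their place.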

High Availability

  • Repcached: replicates data between masters
  • KeepAlive: forwards the port to the slave if the master goes down

Iterable and Iterator

iterator

  • get next item (__next__)
  • no indexes needed (it does not have to be a Sequence type)
  • consumable

iterable

  • a collection that can hand out an iterator over its items (implements __iter__)

Protocol

Python needs to count on certain functionality: __next__, __iter__, StopIteration

compare to sequence type

iteration can be more general than sequential indexing, we only need:

  • a bucket of items: collection, container
  • a way to get the next item, no need to care about ordering
  • an exception to raise if there is no next item

Try writing a custom iterator ourselves:

Why re-create?

Separate the collection from the iterator

Iterable object

  • maintaining the data of the collection is the job of one object
  • created once
  • implements __iter__, which returns a new iterator instance

Iterator object

  • iterating over that data is the job of another object
  • the iterator can be thrown away without throwing away the collection
  • created every time iteration starts
  • implements __iter__, which returns itself
  • implements __next__, which returns the next item
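
The split between the two objects can be sketched as follows. Cities and CityIterator are hypothetical classes written for this example.

```python
class Cities:
    """Iterable: owns the data and hands out fresh iterators."""

    def __init__(self, names):
        self._names = list(names)

    def __iter__(self):
        return CityIterator(self._names)  # a new iterator on every call


class CityIterator:
    """Iterator: holds only the current position within the data."""

    def __init__(self, names):
        self._names = names
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._names):
            raise StopIteration
        item = self._names[self._index]
        self._index += 1
        return item


cities = Cities(["Taipei", "Tokyo"])
first_pass = list(cities)
second_pass = list(cities)  # the collection can be iterated again

iterator = iter(cities)
once = list(iterator)
twice = list(iterator)      # the iterator itself is exhausted
```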

iterable can be lazy

Do not calculate the next item of an iterable until it is actually requested

lazy evaluation

  • often used in class properties
  • the properties of a class may not always be populated when the object is created
  • the value of a property only becomes known when the property is requested (deferred)
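
A sketch of a lazily evaluated property. The Circle class and its computations counter are assumptions made for this example; the counter exists only to make the deferred calculation visible.

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius
        self._area = None      # not computed when the object is created
        self.computations = 0  # counter, only to make the laziness visible

    @property
    def area(self):
        if self._area is None:  # compute on first request, then cache
            self.computations += 1
            self._area = math.pi * self.radius ** 2
        return self._area


circle = Circle(2)
untouched = circle.computations  # nothing has been computed yet
value = circle.area              # first request triggers the calculation
value_again = circle.area        # second request reuses the cached value
```

In this sketch the cache is never invalidated, so changing radius afterwards would leave a stale area.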

infinite iterables

  • itertools.cycle

Python Built-ins

  • range: returns an iterable
  • zip: returns an iterator
  • enumerate: returns an iterator
  • open: returns an iterator
  • reversed: returns an iterator

The type is important: an iterator object can be iterated over only once.

iter()

When iter is called:

  • Python first looks for __iter__; if it is missing:
  • Python looks for __getitem__ and builds an iterator from it; if that is missing too:
  • Python raises TypeError

Test it:

The __iter__ must return an iterator!
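
The lookup order can be checked with three small classes. All names here (SquareSeq, Plain, Broken, is_iterable) are hypothetical and exist only for this sketch.

```python
class SquareSeq:
    """No __iter__: iter() falls back to __getitem__ with 0, 1, 2, ..."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if index >= self.n:
            raise IndexError(index)  # signals the end of the iteration
        return index ** 2


class Plain:
    """Neither __iter__ nor __getitem__."""


class Broken:
    def __iter__(self):
        return 42  # not an iterator, so iter() rejects it


def is_iterable(obj):
    """Return True if iter(obj) succeeds."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


values = list(SquareSeq(4))
```

Calling iter() is the reliable test: it covers the __getitem__ fallback and also rejects an __iter__ that returns a non-iterator.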

Iterating callable
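
The two-argument form iter(callable, sentinel) calls the callable repeatedly until it returns the sentinel. A sketch with a hypothetical countdown_from helper:

```python
def countdown_from(n):
    """Return a callable that counts down by one on each call."""
    def step():
        nonlocal n
        n -= 1
        return n
    return step


# Calls step() until it returns the sentinel 0, which is not included.
values = list(iter(countdown_from(4), 0))
```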

iterator delegation

Example 1

Example 2
