Object Mutability in Python

Internal state

Changing the data inside an object is called modifying its internal state: the state (data) changes, but the object's memory address does not.

Mutable

an object whose internal state can be changed

  • Lists, Sets, Dictionaries, User-defined Classes

Immutable

an object whose internal state cannot be changed

  • Numbers (int, float, Booleans), Strings, Tuples, Frozen Sets, User-defined Classes (when designed to be immutable)
  • variable re-assignment changes the reference, not the internal value
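A minimal sketch of both cases, using id() and the `is` operator to watch what happens to the object; the variable names are illustrative only.

```python
# Mutable: the list is modified in place, so its memory address is unchanged.
my_list = [1, 2, 3]
list_id_before = id(my_list)
my_list.append(4)
list_id_after = id(my_list)
print(list_id_before == list_id_after)  # True

# Immutable: "changing" a string really binds the name to a new object.
my_str = "hello"
original = my_str              # keep a second reference to the first object
my_str = my_str + " world"     # re-assignment: my_str now references a new string
print(my_str is original)      # False
print(original)                # hello (the first object was never modified)
```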

 


Variables in Python

Variables are memory references: a variable is not the object itself but a reference (alias) to the object in memory.

Find out memory address referenced: using id()
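A small sketch of what id() reports when two names reference the same object (in CPython the returned integer is the object's memory address); the names `a` and `b` are arbitrary.

```python
a = [1, 2, 3]
b = a  # b is another reference (alias) to the same list object

print(hex(id(a)))        # the address, shown in hexadecimal
print(id(a) == id(b))    # True: both names reference one object
```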

reference counting

After we create an object in memory, Python keeps track of the number of references to that object. As soon as the count drops to 0, the Python memory manager destroys the object and reclaims the memory space.

Find out reference count:

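  • using sys.getrefcount(my_var) (passing the variable creates an extra reference, so the result is usually one higher than expected)
  • using ctypes.c_long.from_address(address).value (takes the memory address as an integer, so it adds no reference)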
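The two approaches can be sketched as follows. This assumes a standard CPython build, where the reference count is stored at the start of the object; the helper name `ref_count` is made up for this example. Exact counts can vary between Python versions, so the sketch only compares the count before and after adding a second name.

```python
import ctypes
import sys


def ref_count(address):
    """Read the reference count stored at the given CPython object address."""
    return ctypes.c_long.from_address(address).value


a = [1, 2, 3]
count_one_name = ref_count(id(a))

b = a  # a second reference to the same list
count_two_names = ref_count(id(a))

print(count_one_name, count_two_names)  # the second value is one higher

# sys.getrefcount() generally reports one more than expected, because
# passing the object as an argument creates a temporary reference.
print(sys.getrefcount(a))
```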

circular references

With a circular reference, the reference counts can never drop to 0, so reference counting alone would leak the memory; the garbage collector is needed to identify and clean up the cycle.
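A minimal sketch of a cycle that reference counting cannot free on its own. The `Node` class is hypothetical, a weak reference is used only to observe whether the object is still alive, and automatic collection is paused so the result is deterministic.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


gc.disable()  # pause automatic collection for the duration of the demo
try:
    a = Node()
    b = Node()
    a.other = b
    b.other = a              # a and b now reference each other

    probe = weakref.ref(a)   # a weak reference does not keep `a` alive
    del a, b                 # the names are gone, but the cycle remains

    alive_before = probe() is not None
    gc.collect()             # an explicit collection still works while disabled
    alive_after = probe() is not None
finally:
    gc.enable()

print(alive_before, alive_after)  # True False
```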

garbage collection

It can be controlled programmatically using the gc module and is turned on by default; be careful about turning it off. For Python < 3.4, if even one of the objects in a circular reference has a destructor (__del__), the destruction order may matter, but the GC does not know what the order should be, so the objects in the cycle are marked as uncollectable and cause a memory leak.
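As a rough illustration of the destructor case on Python 3.4 or later, where such cycles are collectable: the `Resource` class and the `finalized` list are invented for this sketch.

```python
import gc

finalized = []  # records which destructors have run


class Resource:
    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        finalized.append(self.name)


x = Resource("x")
y = Resource("y")
x.partner = y
y.partner = x   # circular reference between two objects with destructors

del x, y
gc.collect()    # on Python >= 3.4 both destructors run and the cycle is freed

print(sorted(finalized))  # ['x', 'y']
```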

dynamically typing

A Python variable name is not tied to any type. When we call type(), Python looks up the object that the variable references and returns the type of that object.
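A short sketch of the same name referencing objects of different types over time; `value` is an arbitrary name.

```python
value = "hello"
first_type = type(value)    # type of the str object currently referenced

value = 10                  # the name now references an int object
second_type = type(value)

print(first_type, second_type)  # <class 'str'> <class 'int'>
```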

variable equality

  • identity operator (var_a is var_b) compares the memory addresses
  • equality operator (var_a == var_b) compares the object state
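The difference can be sketched with two separate lists that hold the same data; the names are illustrative.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a different object with the same contents
c = a           # another reference to the same object as a

print(a == b)   # True: same state
print(a is b)   # False: different memory addresses
print(a is c)   # True: same object
```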

Python Naming Conventions

A name must start with an underscore (_) or a letter (a-z, A-Z), followed by any number of underscores, letters (a-z, A-Z) or digits (0-9), and it cannot be a reserved word.
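One way to check these rules is with str.isidentifier() and the keyword module. The helper `is_valid_name` is made up for this sketch; note that isidentifier() also accepts non-ASCII letters, which goes beyond the a-z rule stated above.

```python
import keyword


def is_valid_name(name):
    """Return True if name is a legal identifier and not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["my_var", "_private", "var2", "2var", "my-var", "class"]:
    print(f"{candidate!r}: {is_valid_name(candidate)}")
```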

Conventions

_my_var: indicates an “internal use” or “private” object; it is not imported by from module import *

__my_var: triggers name mangling of class attributes, useful in inheritance chains

__my_var__: system defined
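A brief sketch of how the single and double underscore prefixes behave inside a class; the `Account` class and its attributes are hypothetical.

```python
class Account:
    def __init__(self):
        self._owner = "internal"   # convention only: still accessible
        self.__balance = 100       # mangled to _Account__balance


acct = Account()
print(acct._owner)                  # internal
print(acct._Account__balance)       # 100
print(hasattr(acct, "__balance"))   # False: the unmangled name does not exist
```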

PEP8 style guide


Python Multi-line Statements

How Python turns multi-line code into logical lines:
  1. Python program
  2. physical lines of code (each ends with a physical newline character, created by pressing Enter)
  3. logical lines of code (each ends with a logical NEWLINE token)
  4. tokenized
  5. executed

physical newlines vs logical newlines

Sometimes physical newlines are ignored in order to combine multiple physical lines into a single logical line.

break implicitly: [], (), {}

break explicitly: using the backslash (\) character
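Both kinds of line break can be sketched as follows: implicit continuation inside brackets, and explicit continuation with a trailing backslash. The names are illustrative.

```python
# Implicit: physical newlines inside [], () or {} do not end the logical line.
numbers = [
    1,  # comments are allowed inside the brackets
    2,
    3,
]

# Explicit: a trailing backslash joins this physical line with the next one.
total = numbers[0] + \
    numbers[1] + \
    numbers[2]

print(total)  # 6
```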

multi-line strings

Multi-line strings are regular strings, not comments (they can be used as docstrings).

Escaped characters (\n, \t) and non-visible characters (newlines, tabs) typed inside a multi-line string are part of the string; escaped characters are rendered when the string is printed.
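A small sketch of a triple-quoted string used both as a docstring and as ordinary data; the function `greet` and the sample text are invented for illustration.

```python
def greet():
    """Return a greeting.

    This multi-line string is the function's docstring."""
    return "hi"


# The physical newline typed inside the quotes becomes part of the string,
# and the escaped \t is stored as a real tab character.
text = """first line
second line\tafter a tab"""

print(repr(text))     # shows the embedded \n and \t
print(text)           # prints two lines, with a tab in the second
print(greet.__doc__)
```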

ref:

https://github.com/fbaptiste/python-deepdive/blob/master/Part%201/Section%2002%20-%20A%20Quick%20Refresher/01%20-%20Multi-Line%20Statements%20and%20Strings.ipynb


Python Type Hierarchy

Number

  • Integral: Integer, Booleans
  • Non-Integral: Floats, Complex, Decimals, Fractions(1/3)
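The numeric part of the hierarchy can be explored with the abstract base classes in the standard numbers module; this is only a sketch, and the sample values are arbitrary.

```python
import numbers
from decimal import Decimal
from fractions import Fraction

print(issubclass(bool, int))                         # True: Booleans are integers
print(isinstance(True, numbers.Integral))            # True
print(isinstance(1.5, numbers.Integral))             # False
print(isinstance(Fraction(1, 3), numbers.Rational))  # True
print(isinstance(1 + 2j, numbers.Complex))           # True
print(isinstance(Decimal("0.1"), numbers.Number))    # True
```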

Collection

  • Sequences
    • Mutable: List
    • Immutable: Tuples, Strings
  • Sets
    • Mutable: Sets
    • Immutable: Frozen Sets
  • Mappings
    • Dictionaries (closely related to sets: both are hash-based)
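The collection categories above roughly correspond to the abstract base classes in collections.abc, which can be checked with isinstance(); a minimal sketch:

```python
from collections.abc import (
    MutableMapping,
    MutableSequence,
    MutableSet,
    Sequence,
    Set,
)

print(isinstance([], MutableSequence))          # True: lists are mutable sequences
print(isinstance((), Sequence))                 # True
print(isinstance((), MutableSequence))          # False: tuples are immutable
print(isinstance("abc", Sequence))              # True
print(isinstance(set(), MutableSet))            # True
print(isinstance(frozenset(), Set))             # True
print(isinstance(frozenset(), MutableSet))      # False: frozen sets are immutable
print(isinstance({}, MutableMapping))           # True
```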

Callables

  • Built-in Functions (e.g. len())
  • User-Defined Functions
  • Instance Methods
  • Built-in Methods (e.g. my_list.append(x))
  • Generators
  • Classes
  • Class Instances(__call__())
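The built-in callable() can be used to check each of these kinds of object. The `Multiplier` class, `square` and `countdown` are hypothetical examples.

```python
class Multiplier:
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        # Defining __call__ makes instances of this class callable.
        return x * self.factor


def square(x):
    return x * x


def countdown(n):
    while n > 0:
        yield n
        n -= 1


double = Multiplier(2)

print(callable(len))          # True: built-in function
print(callable(square))       # True: user-defined function
print(callable([].append))    # True: built-in method
print(callable(countdown))    # True: generator function
print(callable(Multiplier))   # True: class
print(callable(double))       # True: instance with __call__
print(callable(42))           # False: plain int
print(double(21))             # 42
```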

Singletons

  • None
  • NotImplemented
  • Ellipsis (...)
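A quick sketch showing that each of these types has exactly one instance: calling the type again returns the very same object.

```python
print(type(None)() is None)                      # True
print(type(NotImplemented)() is NotImplemented)  # True
print(... is Ellipsis)                           # True: ... is the Ellipsis literal
print(type(Ellipsis)() is Ellipsis)              # True
```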

Multiprocessing: Pickle Issue

multiprocessing uses the pickle module to serialize objects passed between processes, but pickle doesn't support functions with closures, lambdas, or functions defined in __main__ (for example, in an interactive session)
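The limitation can be reproduced with pickle alone, without starting any processes. This is a minimal sketch; `can_pickle` and `make_adder` are names invented for the illustration.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


def make_adder(n):
    def add(x):  # nested function that closes over n
        return x + n
    return add


print(can_pickle(len))               # True: pickled by reference to its name
print(can_pickle(lambda x: x * 2))   # False: a lambda has no importable name
print(can_pickle(make_adder(1)))     # False: local function with a closure
```

Functions are pickled by reference to their qualified name, which is why anything that cannot be looked up by name in an importable module fails.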

Here is the example I tried:

solution: dill

dill and multiprocessing: pathos

dill: a utility to serialize all of python
– pox: utilities for filesystem exploration and automated builds
– klepto: persistent caching to memory, disk, or database
– multiprocess: better multiprocessing and multithreading in python
– ppft: distributed and parallel python
– pyina: MPI parallel map and cluster scheduling
– pathos: graph management and execution in heterogenous computing

 
