This is a collection of tips about Python development that I have gathered and put together over the years. Some of this content may be a bit outdated but can still apply today.

Language Basics

Introduction

Documentation

python-logo.gif

The Python language has very extensive official documentation (written in reStructuredText). Most of the information in this presentation comes from it.

When not sure about something, look here first:

http://docs.python.org/3/

The Interactive Console

Python is an interpreted language. By simply typing python you can access the Python console and execute Python code line by line.

$ python
Python 3.11.4 (main, Jun  7 2023, 00:00:00) [GCC 13.1.1 20230511 (Red Hat 13.1.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print("python is bada" + 20 * "s")
python is badassssssssssssssssssss
>>>

There is an enhanced Python console called IPython. It has useful features such as tab-completion, syntax coloring and integrated documentation.

Accessing Inline Help

The help() built-in function displays a “man page” for any symbol in the interactive console.

>>> import scapy
>>> help(scapy)
Help on package scapy:

NAME
    scapy - Scapy: create, send, sniff, dissect and manipulate network packets.

DESCRIPTION
    Usable either from an interactive console or as a Python library.
    https://scapy.net

PACKAGE CONTENTS
    __main__
    all
    ansmachine
    arch (package)
    as_resolvers
    asn1 (package)
    ...
>>> from scapy.layers import ipsec
>>> sa = ipsec.SecurityAssociation(ipsec.ESP, spi=1234)
>>> help(sa)  # works on dynamic symbols too
Help on class SecurityAssociation in module scapy.layers.ipsec:

class SecurityAssociation(builtins.object)
 |  SecurityAssociation(proto, spi, seq_num=1, crypt_algo=None, crypt_key=None, crypt_icv_size=None, auth_algo=None, auth_key=None, tunnel_header=None, nat_t_header=None, esn_en=False, esn=0)
 |  
 |  This class is responsible of "encryption" and "decryption" of IPsec packets.  # noqa: E501
 |  
 |  Methods defined here:
 |  
 |  __init__(self, proto, spi, seq_num=1, crypt_algo=None, crypt_key=None, crypt_icv_size=None, auth_algo=None, auth_key=None, tunnel_header=None, nat_t_header=None, esn_en=False, esn=0)
 |      :param proto: the IPsec proto to use (ESP or AH)
 |      :param spi: the Security Parameters Index of this SA
 |      :param seq_num: the initial value for the sequence number on encrypted
 |                      packets
 |      :param crypt_algo: the encryption algorithm name (only used with ESP)
 |      :param crypt_key: the encryption key (only used with ESP)
 |      :param crypt_icv_size: change the default size of the crypt_algo
 |                             (only used with ESP)
 |      :param auth_algo: the integrity algorithm name
 |      :param auth_key: the integrity key
 |      :param tunnel_header: an instance of a IP(v6) header that will be used
 |                            to encapsulate the encrypted packets.
 |      :param nat_t_header: an instance of a UDP header that will be used
 |                           for NAT-Traversal.
 |      :param esn_en: extended sequence number enable which allows to use
 |                     64-bit sequence number instead of 32-bit when using an
 |                     AEAD algorithm
 |      :param esn: extended sequence number (32 MSB)
...

Syntax

Variables / Types

Python variables are strongly but dynamically typed. Declaration and initialization are done at the same time.

>>> a = 1
>>> type(a)
<class 'int'>
>>> a = "pouet"
>>> type(a)
<class 'str'>

Variables can be explicitly “undefined” with del:

>>> a = 1
>>> type(a)
<class 'int'>
>>> del a
>>> type(a)
NameError: name 'a' is not defined

Base Types: int, float

Python uses arbitrary precision integers. There is no max/min integer value and no overflow is possible.

>>> 2 ** 10
1024
>>> 1 << 120
1329227995784915872903807060280344576
>>> 0xff
255
>>> hex(20)
'0x14'
>>> bin(20)
'0b10100'
>>> int('42')
42
>>> int('0x4b', 16)
75
>>> 5 / 2  # the division operator *always* returns floats
2.5
>>> 32 / 4
8.0
>>> 5 // 2  # explicit integer division
2
>>> 32 // 4
8

Base Types: list, tuple

>>> l = [3, 1, 4, 'awesome', 5, 9]
>>> l[2]
4
>>> l[-1]  # reverse indexing
9
>>> l[2:4]  # slicing
[4, 'awesome']
>>> l[0] = 7
>>> l.append(8)
>>> l
[7, 1, 4, 'awesome', 5, 9, 8]
>>> 4 in l
True

Tuples are immutable lists.

>>> t = (2, 6, 34)
>>> t[2] = 32
TypeError: 'tuple' object does not support item assignment

Base Types: str, bytes

Literal strings indiscriminately use single ' and double " quotes. They are strictly identical. For consistency, it is best to stick to one type of quote.

>>> s = "voilà une chaîne unicode"
>>> "_".join(s[-7:])
'u_n_i_c_o_d_e'
>>> b = b"this is a byte string, it only supports ASCII characters"
>>> b[40:45]
b'ASCII'
>>> b[40]  # individual bytes are returned as int values
65
>>> r"this is a \raw stri\ng, no escape charac\ters"
'this is a \\raw stri\\ng, no escape charac\\ters'
>>> '''this is
... a multiline
... string'''
'this is\na multiline\nstring'

It is possible to combine any of raw r"", byte b"" and multi-line """ """ strings.

>>> rb"""this
... is a multiline \r
... byte string"""
b'this\nis a multiline \\r\nbyte string'

Base Types: dict

>>> d = {0: 'zero', 6: 'six', 'foo': [0, 8]}
>>> d[6]
'six'
>>> d['foo']
[0, 8]
>>> d['cuir']
KeyError: 'cuir'
>>> d.get('cuir', 0)
0
>>> d['cuir'] = 1000
>>> d
{0: 'zero', 6: 'six', 'foo': [0, 8], 'cuir': 1000}
>>> del d[0]
>>> d
{6: 'six', 'foo': [0, 8], 'cuir': 1000}
>>> 'cuir' in d
True

Code Blocks / Indentation

Python does not have “curly braces”.

if condition1 and condition2:
    for p in packet_list:
        tag_packet(p)
else:
    print('blah')

while len(packet_list) > 0:
    try:
        send_packet(packet_list.pop())
        update_stats(sent=1)
    except IOError:
        update_stats(dropped=1)

A level of indentation can be any number of spaces or tabs as long as it is consistent throughout the same block. Python 3 rejects indentation that mixes tabs and spaces inconsistently (TabError). The standard is 4 spaces.

Functions

# function definition with type hints
# (the flag must not be named print_l2, or it would shadow the function)
def print_packet(p: Packet, with_l2: bool = False):
    if with_l2:
        print_l2(p)
    print_l3(p)
    print_l4(p)

# function call
print_packet(my_packet, True)

# function call with keyword arguments
print_packet(with_l2=False, p=my_other_packet)

# return type hint
def function_returning_a_value() -> int:
    return 54

def function_returning_None():
    print("I don't like functions")

https://docs.python.org/3/library/typing.html

Classes

# class definition
class Version:
    # constructor
    def __init__(self, x, y):
        self.x = x
        self.y = y
    # instance method
    def show(self):
        print(f"x={self.x} y={self.y}")

# constructor call
v = Version(17, 1)
# method call
v.show()

# subclass
class DottedVersion(Version):
    # method override
    def show(self):
        print(f"{self.x}.{self.y}")
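
When a subclass needs its own constructor, the parent one is usually called through super(). The following is a minimal sketch of that pattern; PatchVersion and its z attribute are hypothetical names introduced for illustration, and show() returns the string instead of printing it to make the result easy to check.

```python
class Version:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def show(self):
        return f"x={self.x} y={self.y}"


# hypothetical subclass, not part of the original example
class PatchVersion(Version):
    def __init__(self, x, y, z):
        # let the parent class initialize x and y
        super().__init__(x, y)
        self.z = z

    # method override
    def show(self):
        return f"{self.x}.{self.y}.{self.z}"


v = PatchVersion(17, 1, 3)
print(v.show())
print(isinstance(v, Version))  # a PatchVersion is also a Version
```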

Exceptions

Like Java, C++ and other high level languages, Python has exceptions. These are an alternate way to return from a function (i.e. not with a return statement). Exceptions are raised with the raise keyword:

if foo:
    raise BarException("epic fail")

To handle an exception, one must use try except blocks:

try:
    os.rmdir("bar")
except FileNotFoundError:
    pass

Any expression may raise an exception. If it is not handled explicitly, the function will return and the exception will bubble up to the caller frame until it is handled. If it is not handled at the top level frame, the program execution will terminate and a stack trace will be printed on stderr.

[dlrepo]$ python -m dlrepo
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/rjarry/upstream/dlrepo/dlrepo/__main__.py", line 14, in <module>
    from . import views
  File "/home/rjarry/upstream/dlrepo/dlrepo/views/__init__.py", line 11, in <module>
    import aiohttp_jinja2
ModuleNotFoundError: No module named 'aiohttp_jinja2'

https://docs.python.org/3/tutorial/errors.html
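
The bubbling behaviour described above can be sketched as follows. ConfigError, parse_port and load_port are hypothetical names used only for illustration. The sketch also shows the optional else and finally clauses of a try block.

```python
class ConfigError(Exception):
    """Hypothetical application-specific exception."""


def parse_port(value):
    # int() raises ValueError on bad input; it is not handled here,
    # so it bubbles up to the caller
    port = int(value)
    if not 0 < port < 65536:
        raise ConfigError(f"port out of range: {port}")
    return port


calls = []


def load_port(value, default=8080):
    try:
        port = parse_port(value)
    except ValueError:
        # raised inside parse_port(), handled one frame above
        return default
    else:
        # only runs when no exception was raised
        return port
    finally:
        # always runs, even when an exception keeps bubbling up
        calls.append(value)


print(load_port("22"))
print(load_port("ssh"))
```

A ConfigError is not caught by load_port(): it keeps bubbling up to whoever called it, after the finally clause has run.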

Docstrings

The first string literal in the body of a function/class/method is interpreted as its docstring. It is used by the interactive console to provide contextual help and by tools to generate documentation.

def func(a: int, b: str):
    """This does blah
    """
    print(a, b)

class Foo:
    """
    This class is awesome
    """
    def fetch(self):
        "This function does nothing but this is also a docstring"
        pass
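
Docstrings are also available at runtime. As a small sketch (the area function is a made-up example), the raw text is stored in the __doc__ attribute and inspect.getdoc() returns it with the indentation cleaned up:

```python
import inspect


def area(width: float, height: float) -> float:
    """Return the area of a rectangle.

    Both arguments are expected in the same unit.
    """
    return width * height


print(area.__doc__)          # raw docstring, as stored on the function
print(inspect.getdoc(area))  # same text with indentation cleaned up
```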

Coding Style

Official (and rather inspiring) recommendations: http://www.python.org/dev/peps/pep-0008/

# indentation 4 spaces per level
for i in range(4):
    if i > 2:
        print(i)

# local variables, functions, class methods
my_variable = 2
def underscore_separated_lower_case():
    pass

# classes
class CamelCaseWithoutUnderscores:
    pass

A few years ago, I switched all my projects to black https://github.com/psf/black and I no longer pay attention to code formatting. Here is a very nice article in French about black: https://sametmax2.com/once-you-go-black-you-never-go-back/index.html

Reusing code from other modules

The import Keyword

A .py file is called a Python module. It can be “included” from another module by using the import keyword.

import os
os.makedirs('/home/rjarry/devel/test')

Only some symbols of a module can be imported to make the code clearer.

from docutils import ApplicationError

NEVER use “wild” imports. It makes the code hard to understand and can cause infamous name collisions.

from scapy import *

The Module Search Path

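For the import to work, the .py file must be located in the module search path. It is initialized when the interpreter starts, from the directory of the script being run (or the current directory in the interactive console), the PYTHONPATH environment variable and installation-dependent defaults. The value of the search path can be accessed/modified through the sys.path variable.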

>>> import sys
>>> sys.path
['', '/usr/lib64/python311.zip', '/usr/lib64/python3.11',
'/usr/lib64/python3.11/lib-dynload', '/usr/lib64/python3.11/site-packages']
>>> sys.path.insert(0, '/home/rjarry/devel/test')
>>> import pathlib
>>> pathlib
<module 'pathlib' from '/home/rjarry/devel/test/pathlib.py'>

In the interactive console, the current directory is included in the search path by default; when running a script, the directory containing the script is used instead. Make sure not to shadow any standard library module with a local script. Additional information about the search path initialization: http://docs.python.org/3/library/site.html#module-site

Packages

To structure the code, it is advised to use packages. A package is a folder containing a __init__.py file. In order to import sub-modules one can use the following syntaxes.

import os.path
import xml.dom.minidom as xml
from django.contrib.admin import models, views

The package itself can also be “imported” by its name:

>>> from django import utils
>>> utils.__file__
......ges/Django-1.5.1-py2.7.egg/django/utils/__init__.pyc'

Note: Only the top-level package needs to be in the module search path.

Overview Of The Standard Library

Operating System Interface: os

http://docs.python.org/3/library/os.html

>>> import os
>>> import signal
>>> os.environ['HOME']
'/home/rjarry'
>>> os.kill(some_pid, signal.SIGTERM)
>>> os.rename('/home/rjarry/test.py', '/home/rjarry/test.py.old')
>>> os.remove('/home/rjarry/test.py.old')
>>> os.makedirs('/home/rjarry/tmp/test')
>>> os.rmdir('/home/rjarry/tmp/test')
>>> for root, dirs, files in os.walk('/home/rjarry/.vim'):
...     print(root, dirs, files)
/home/rjarry/.vim ['colors', 'bundle', 'autoload', ...] ['.netrwhist', 'filetype.vim']
/home/rjarry/.vim/colors [] ['tir_diab.vim']
/home/rjarry/.vim/bundle ['vim-fugitive'] []
...

File Path Utilities: pathlib

http://docs.python.org/3/library/pathlib.html

>>> import pathlib
>>> p = pathlib.Path("/home/rjarry")
>>> p.exists()
True
>>> x = p / "devel" / "../wesh.py"
>>> x
PosixPath('/home/rjarry/devel/../wesh.py')
>>> x.resolve()
PosixPath('/home/rjarry/wesh.py')
>>> pathlib.Path("~/devel").expanduser()
PosixPath('/home/rjarry/devel')
>>> p.parent
PosixPath('/home')

Python Interpreter Internals: sys

http://docs.python.org/3/library/sys.html

>>> import sys
>>> sys.argv
['./autobench.py', '--help']
>>> sys.stdout
<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
>>> sys.version_info >= (3, 9)
True
>>> sys.byteorder
'little'
>>> sys.exit(1)

Command Line Arguments: argparse, glob

http://docs.python.org/3/library/argparse.html

>>> import argparse
>>> parser = argparse.ArgumentParser(prog='my_script.py')
>>> parser.add_argument('-f', '--force', action='store_true',
...                     help='Do not ask for confirmation')
>>> parser.print_help()
usage: my_script.py [-h] [-f]

options:
  -h, --help   show this help message and exit
  -f, --force  Do not ask for confirmation
>>> args = parser.parse_args(['-f'])  # the program name is not part of the list
>>> args.force
True

http://docs.python.org/3/library/glob.html

>>> import glob
>>> glob.glob('results/jacky_vm/*.log')
['results/jacky_vm/trace.log', 'results/jacky_vm/catherine.log', 'results/jacky_vm/localhost.log', 'results/jacky_vm/jacky_hypervisor.log', 'results/jacky_vm/jacky_vm.log']

Pattern Matching: re

http://docs.python.org/3/library/re.html

Python supports Perl-like regular expressions with some useful extensions.

>>> import re
>>> pattern = re.compile(r'processor\s+:\s\d+.+?model name\s+:\s+(.+?)\n',
...                      re.DOTALL)
>>> buf = open('/proc/cpuinfo').read()
>>> cores = 0
>>> model = None
>>> for match in pattern.finditer(buf):
...     cores += 1
...     model = match.group(1)
>>> cores
4
>>> model
'Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz'
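
One of the useful extensions is named groups, written (?P<name>...). Here is a minimal sketch, assuming a made-up "host status size" line format that is not taken from any real tool:

```python
import re

# hypothetical line format, for illustration only: "<host> <status> <size>"
LINE = re.compile(r"(?P<host>\S+) (?P<status>\d{3}) (?P<size>\d+)")


def parse(line):
    # fullmatch() only succeeds if the whole string matches the pattern
    m = LINE.fullmatch(line)
    if m is None:
        return None
    # groupdict() maps each group name to the text it captured
    fields = m.groupdict()
    return fields["host"], int(fields["status"]), int(fields["size"])


print(parse("example.org 200 1532"))
print(parse("garbage"))
```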

String Formatting

F-strings:

>>> src = ".vimrc"
>>> dst = "~/.vimrc"
>>> user = "root"
>>> host = "tom"
>>> f'scp {src} {user}@{host}:{dst}'
'scp .vimrc root@tom:~/.vimrc'

https://docs.python.org/3/reference/lexical_analysis.html#f-strings

The str.format() method:

>>> 'scp {} {}@{}:{}'.format(src, user, host, dst)
'scp .vimrc root@tom:~/.vimrc'
>>> 'scp {s} {u}@{h}:{d}'.format(s=src, u=user, h=host, d=dst)
'scp .vimrc root@tom:~/.vimrc'

https://docs.python.org/3/library/string.html#format-string-syntax

Legacy printf formatting:

>>> 'scp %s %s@%s:%s' % (src, user, host, dst)
'scp .vimrc root@tom:~/.vimrc'
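
F-strings and str.format() share the same format specification mini-language for alignment, padding and number formatting. A short sketch with made-up values:

```python
name = "eth0"
rx_bytes = 1234567
ratio = 0.8731

padded = f"{name:<8}|"        # left-aligned in 8 columns
grouped = f"{rx_bytes:>12,}"  # right-aligned, thousands separator
percent = f"{ratio:.1%}"      # percentage with one decimal
hexa = f"{255:#06x}"          # hexadecimal, zero-padded, with 0x prefix
quoted = f"{name!r}"          # repr() of the value

for text in (padded, grouped, percent, hexa, quoted):
    print(text)
```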

Interlude

snake-chick.png

Development Tools

Integrated Development Environments

Why Use An IDE?

Python is a very “frameworked” ecosystem. One often uses a lot of external libraries. Having a little help from the IDE can save a lot of time.

PROs

  • Source code completion
  • Error tracking
  • Easier source browsing
  • Interactive debugger

CONs

  • Requires setup
  • Slow to start (only GUI IDEs)
  • Needs graphical environment (only GUI IDEs)
  • Memory footprint (only GUI IDEs)

As a general rule, when working on a project for more than an hour, it makes sense to use an IDE, especially if the project has multiple source files and uses a lot of external libraries which you don’t know.

Graphical IDEs

NB: I used PyDev in the past. I tried VS Code and PyCharm but did not find either of them very ergonomic.

Vim

Using vim as an IDE requires a little more involvement. However it is by far the most flexible and powerful solution for any language (not only Python). Installing extensions/plugins for vim can be done in multiple ways, here is the one I use:

curl -fLo ~/.vim/autoload/plug.vim --create-dirs \
     https://raw.githubusercontent.com/junegunn/vim-plug/master/plug.vim
touch ~/.vim/vimrc

" ~/.vim/vimrc

set nocompatible
set wildmode=longest,list,full
set wildmenu
set complete-=i
set completeopt=menu,menuone,noselect,noinsert

runtime plugconfig.vim
runtime pluginstall.vim

" ~/.vim/plugconfig.vim

"jedi
let g:jedi#show_call_signatures = 2
let g:jedi#goto_command = "<F3>"

"ale
let g:ale_linters_explicit = 1
let g:ale_linters = { 'python': ['pylint'] }
let g:ale_set_signs = 0
let g:ale_use_global_executables = 0
let g:ale_completion_enabled = 0
let g:ale_fixers = { 'python': ['black'] }
let g:ale_fix_on_save = 0
nnoremap <A-f> :ALEFix<CR>
if has("nvim")
  let g:ale_use_neovim_diagnostics_api = 1
else
  let g:ale_virtualtext_cursor = "all"
  let g:ale_set_highlights = 1
endif

" ~/.vim/pluginstall.vim

filetype off

call plug#begin()

"Plugins
Plug 'davidhalter/jedi-vim'
Plug 'dense-analysis/ale'

call plug#end()

filetype plugin indent on

Open vim, and run :PlugInstall.

My personal configuration files contain extensive customization of vim. Feel free to have a look for inspiration: https://git.sr.ht/~rjarry/dotfiles

Debugging Python

Using The Command Line Debugger

[linux-tools]$ python -m pdb -c 'b 213' -c c -m linux_tools.irqstat
Breakpoint 1 at linux_tools/irqstat.py:213
> linux_tools/irqstat.py(213)main()
-> t.print(sys.stdout)
(Pdb) s
--Call--
> linux_tools/table.py(41)print()
-> def print(self, fileobj: io.StringIO, with_headers: bool = True):
(Pdb) p fileobj
<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
(Pdb) b 52
Breakpoint 2 at linux_tools/table.py:52
(Pdb) c
> linux_tools/table.py(52)print()
-> fileobj.write(self.separator.join(headers) + "\n")
(Pdb) n
IRQ       AFFINITY  EFFECTIVE-CPU  DESCRIPTION
> linux_tools/table.py(53)print()
-> for row in self.rows:
(Pdb) 

More info here: http://docs.python.org/3/library/pdb.html

Alternative Debugging Tools

Profiling Python Code

The cProfile Module

[linux-tools]$ python -m cProfile -s time -m linux_tools.netgraph
...
      13095 function calls (12831 primitive calls) in 0.037 seconds

Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    22    0.004    0.000    0.004    0.000 {method 'read' of '_io.BufferedReader' objects}
    17    0.003    0.000    0.003    0.000 {built-in method marshal.loads}
 66/64    0.003    0.000    0.003    0.000 {built-in method builtins.__build_class__}
     5    0.002    0.000    0.002    0.000 {built-in method _posixsubprocess.fork_exec}
     8    0.001    0.000    0.001    0.000 {built-in method _imp.create_dynamic}
    59    0.001    0.000    0.003    0.000 <frozen importlib._bootstrap_external>:1604(find_spec)
 25/15    0.001    0.000    0.002    0.000 _parser.py:507(_parse)
   123    0.001    0.000    0.001    0.000 {built-in method posix.stat}
     3    0.001    0.000    0.001    0.000 enum.py:1657(convert_class)
    26    0.000    0.000    0.005    0.000 <frozen importlib._bootstrap>:1054(_find_spec)
 45/14    0.000    0.000    0.001    0.000 _compiler.py:37(_compile)
   282    0.000    0.000    0.001    0.000 <frozen importlib._bootstrap_external>:128(<listcomp>)
    17    0.000    0.000    0.000    0.000 {built-in method io.open_code}

http://docs.python.org/3/library/profile.html
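
The profiler can also be driven from code to measure only a portion of a program. This is a minimal sketch using the cProfile and pstats modules; slow_sum is a made-up function standing in for the code under test.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # deliberately naive loop so that it shows up in the profile
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# format the statistics into a string instead of printing to stdout
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("tottime").print_stats(5)  # 5 most expensive entries
print(buf.getvalue())
```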

Analyzing Profiling Results With KCacheGrind

[linux-tools]$ sudo dnf install kcachegrind
[linux-tools]$ pip install --user pyprof2calltree
[linux-tools]$ python -m cProfile -o profile.stats -m linux_tools.netgraph
[linux-tools]$ pyprof2calltree -i profile.stats -k

kcachegrind.png

VirtualEnv

What Is a VirtualEnv?

venv is a tool to create isolated Python environments.

https://docs.python.org/3/library/venv.html

It allows installing python dependencies into a dedicated folder.

How To Use VirtualEnv

[tmp]$ python -c "import scapy; print(scapy)"
<module 'scapy' from '/usr/lib/python3.11/site-packages/scapy/__init__.py'>
[tmp]$ python -m venv .venv
[tmp]$ . .venv/bin/activate
(.venv) [tmp]$ python -c "import scapy"
Traceback (most recent call last):
 File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'scapy'
(.venv) [tmp]$ pip install scapy
Collecting scapy
 Downloading scapy-2.5.0.tar.gz (1.3 MB)
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 14.4 MB/s eta 0:00:00
...
(.venv) [tmp]$ python -c "import scapy; print(scapy)"
<module 'scapy' from '/home/rjarry/tmp/.venv/lib64/python3.11/site-packages/scapy/__init__.py'>
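
A script can check whether it runs inside a virtual environment by comparing sys.prefix and sys.base_prefix. The helper name in_virtualenv below is only an illustration:

```python
import sys


def in_virtualenv():
    # inside a venv, sys.prefix points to the venv folder while
    # sys.base_prefix still points to the base Python installation
    return sys.prefix != sys.base_prefix


print(sys.prefix)
print(in_virtualenv())
```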

Interlude

* Sonium has joined #python-dev
<Sonium> someone speak python here?
<lucky>  HHHHHSSSSSHSSS
<lucky>  SSSSS
<Sonium> the programming language

Code Distribution

NB: I had lots of outdated info here. I removed everything. There is now an official packaging guide.

https://packaging.python.org/en/latest/tutorials/packaging-projects/

Interlude

LUKE Is Perl better than Python?

YODA No… no… no. Quicker, easier, more seductive.

LUKE But how will I know why Python is better than Perl?

YODA You will know. When your code you try to read six months from now.

Advanced Concepts

Useful Syntax Constructs

Boolean Operators

http://docs.python.org/3/library/stdtypes.html#boolean-operations-and-or-not

>>> init_value = None
>>> r = init_value or 3  # falls back to 3 when init_value is falsy
>>> x = 0 if r == 3 else -1  # ternary operator (t ? a : b)
>>> r
3
>>> x
0

>>> 4 in [0, 2, 4, 6, 8]
True

>>> d = {'moustache': 1, 'cuir': 8}
>>> 'cuir' in d  # key membership test (dict.has_key() no longer exists in Python 3)
True

List-Comprehensions

http://docs.python.org/3/tutorial/datastructures.html#list-comprehensions

>>> numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

>>> even = [ x for x in numbers if x % 2 == 0 ]
>>> even
[0, 2, 4, 6, 8]

>>> [ x * 10 for x in even ]
[0, 20, 40, 60, 80]

>>> import math
>>> [ math.sin(x) for x in even ]
[0.0, 0.9092974268256817, -0.7568024953079282, -0.27941549819892586, 0.9893582466233818]
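
The same syntax exists for dicts, sets and lazy generator expressions. A short sketch with made-up data:

```python
words = ["cuir", "moustache", "jacky", "cuir"]

# dict comprehension: word -> length (duplicate keys collapse)
lengths = {w: len(w) for w in words}

# set comprehension: unique first letters
initials = {w[0] for w in words}

# generator expression: no intermediate list is built
total = sum(len(w) for w in words)

print(lengths, initials, total)
```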

The with Keyword

>>> with open('/etc/passwd', 'r') as f:
...     buf = f.read()
...
>>> f.closed
True
>>> buf
'root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/bin/sh\nbin:x:2:2:bin:/bin:/bin/sh\nsys:x:3:

This is roughly equivalent to:

f = open('/etc/passwd', 'r')
try:
    buf = f.read()
finally:
    f.close()
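
The with keyword works with any context manager, not only files. As a minimal sketch, contextlib.contextmanager turns a generator into one; working_directory is a hypothetical helper written for this example.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def working_directory(path):
    # code before the yield runs when entering the "with" block
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield path
    finally:
        # code after the yield runs when leaving it, even on error
        os.chdir(previous)


before = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        inside = os.getcwd()
after = os.getcwd()

print(before == after)
```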

The yield Keyword

>>> def big_list():
...     result = []
...     for i in range(1 << 28):
...         result.append(i)
...     return result
>>> for x in big_list():  # unnecessary allocation of a list
...     print(x)
>>> def big_list():
...     for i in range(1 << 28):
...         yield i
>>> big_list()
<generator object big_list at 0x7f22b84e5370>
>>> for x in big_list():  # limited & constant memory usage
...     print(x)
0
1
2
3
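
Because values are produced on demand, a generator can even be infinite. A small sketch, using itertools.islice to take only the first items (fibonacci is an example function, not something from the standard library):

```python
import itertools


def fibonacci():
    # never returns: each value is computed only when requested
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


first = list(itertools.islice(fibonacci(), 10))
print(first)
```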

The *args And **kwargs Magic

http://docs.python.org/3/tutorial/controlflow.html#unpacking-argument-lists

>>> def f(*args, **kwargs):
...     print('args:', args, 'kwargs:', kwargs)
...
>>> f(4, 5, cuir=None, jacky='ok')
args: (4, 5) kwargs: {'cuir': None, 'jacky': 'ok'}

>>> def f(a, b, c=-1):
...     return a * b * c
...
>>> k = {'b': 1, 'a': 8}
>>> f(**k)
-8
>>> a = [2, 4, 6]
>>> f(*a)
48

Lambda Expressions

Lambda expressions are “anonymous” function declarations.

>>> f = lambda x: x ** 2
>>> f
<function <lambda> at 0x7f775f9c8320>
>>> f(8)
64

Here is a practical use example:

>>> k = [Job('j1', prio=2), Job('j2', prio=1), Job('j3', prio=8)]
>>> k.sort(key=lambda j: j.prio)
>>> k
[Job('j2', prio=1), Job('j1', prio=2), Job('j3', prio=8)]

Decorators

Decorators allow adding “behaviour” to a function.

>>> def trace(f):  # define a new "trace" decorator
...     def __wrapped(*args, **kwargs):
...         print(f'{f.__name__}: args {args} {kwargs}')
...         val = f(*args, **kwargs)
...         print(f'{f.__name__}: return {val}')
...         return val
...     return __wrapped
...
>>> @trace  # equivalent of cuir = trace(cuir)
... def cuir(a, b, c):
...     return a * b + c
...
>>> r = cuir(1, 2, c=4)
cuir: args (1, 2) {'c': 4}
cuir: return 6
>>> r
6

Decorators also work on class definitions.
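
One caveat of wrapping a function is that the wrapper hides the name and docstring of the original. A possible variant of a tracing decorator, sketched here with functools.wraps and a list instead of print() so the result is easy to inspect:

```python
import functools

calls = []


def trace(f):
    @functools.wraps(f)  # copy __name__, __doc__, etc. from f
    def wrapped(*args, **kwargs):
        calls.append((f.__name__, args, kwargs))
        return f(*args, **kwargs)
    return wrapped


@trace
def cuir(a, b, c):
    """Multiply then add."""
    return a * b + c


print(cuir(1, 2, c=4))
print(cuir.__name__, cuir.__doc__)
```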

Asynchronous I/O

Python has had the async and await keywords since version 3.5.

By themselves, these keywords do nothing. They require an event loop to process the values that they return.

Available implementations:

Example:

async def foo(a, b):
    if a == 2:
        x = await bar(a)  # does not block the current thread
    else:
        x = await bar(b)  # yields back control to the event loop
    return x
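
Here is a self-contained sketch using asyncio, the event loop shipped with the standard library. The fetch coroutine and its delays are made up; asyncio.sleep() stands in for a real I/O operation.

```python
import asyncio


async def fetch(name, delay):
    # while this coroutine sleeps, the event loop runs the other one
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # run both coroutines concurrently in a single thread;
    # results come back in the order the coroutines were passed
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
print(results)
```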

Memory Management

Variable Scope

There are 3 types of scopes: global, class, function.

GLOBAL_VARIABLE = 2

class Test:
    # "GLOBAL_VARIABLE" is visible
    CLASS_VARIABLE = 'super bien'
    def method(self, local1, local2):
        # "GLOBAL_VARIABLE" and "local*" are visible
        def _inner_func(a, b):
            # "GLOBAL_VARIABLE", "a", "b" and "local*" are visible
            # in order to modify the non-global variables, they need to be
            # declared as nonlocal first
            nonlocal local1
            local1 = 5
            ...
        # to access class variables
        Test.CLASS_VARIABLE or self.CLASS_VARIABLE
        if local1 == 2:
            x = 0
        # "x" is visible even "outside" the "if"
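
The nonlocal keyword is mostly useful for closures that keep state between calls. A minimal sketch (make_counter is an example name):

```python
def make_counter():
    count = 0

    def increment():
        # without "nonlocal", the assignment below would create a new
        # local variable and raise UnboundLocalError
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
print(counter(), counter(), counter())
```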

Reference Counting & Garbage Collecting

Python variables are “labels” attached to objects. When an object is referenced by a label, its reference counter is incremented. When the reference counter of an object reaches 0, the memory used by the object is freed immediately.

Interesting (old) post http://foobarnbaz.com/2012/07/08/understanding-python-variables/

Note: Reference counting alone cannot free objects that reference each other in a cycle. CPython complements it with a cyclic garbage collector, which can be tuned, triggered or disabled through the gc module: http://docs.python.org/3/library/gc.html. Normally the default behaviour will suit most situations, use this with care.
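
The following sketch illustrates both mechanisms using weak references, which do not increment the reference counter. It assumes CPython (other interpreters may free objects later), and Node is a made-up class.

```python
import gc
import weakref


class Node:
    pass


# 1. plain reference counting: freed as soon as the last label goes away
a = Node()
alive = weakref.ref(a)
del a
freed_immediately = alive() is None

# 2. reference cycle: the counters never reach 0 on their own
was_enabled = gc.isenabled()
gc.disable()  # make sure no automatic collection happens meanwhile
x, y = Node(), Node()
x.other, y.other = y, x
cycle = weakref.ref(x)
del x, y
still_alive = cycle() is not None
gc.collect()  # the cyclic garbage collector reclaims both objects
collected = cycle() is None
if was_enabled:
    gc.enable()

print(freed_immediately, still_alive, collected)
```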

Threading & GIL

Threading support

Thread support is always available since Python 3.7 (it used to be a compile-time option). The interpreter can spawn new pthreads through the threading module.

http://docs.python.org/3/library/threading.html

It exposes an API similar to the Java Thread Interface.

import threading

class MyThread(threading.Thread):
    def __init__(self, dataset, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.dataset = dataset
    def run(self):
        # do stuff with self.dataset
        pass
threads = [ MyThread(d) for d in datasets ]
for t in threads:
    t.start()
for t in threads:
    t.join()
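
Subclassing is not mandatory: a plain function can be passed as target. Below is a minimal runnable sketch with made-up datasets, using a threading.Lock to protect the shared results list.

```python
import threading


def worker(dataset, results, lock):
    total = sum(dataset)
    # only one thread at a time may modify the shared list
    with lock:
        results.append(total)


datasets = [[1, 2, 3], [10, 20], [100]]
results = []
lock = threading.Lock()

threads = [
    threading.Thread(target=worker, args=(d, results, lock))
    for d in datasets
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# threads may finish in any order
print(sorted(results))
```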

The Global Interpreter Lock

The Python interpreter is not fully “thread-safe”, mainly because of its memory management system. As a consequence, only one stream of code can be “executed” by the interpreter at a time.

Each time a thread is executed by the interpreter, it acquires the GIL. It releases it when it performs an I/O operation, and it is forced to release it at every switch interval (5 ms by default, see sys.setswitchinterval()) to allow other threads to run too.

Very detailed explanation on “why GIL?”: http://www.dabeaz.com/python/UnderstandingGIL.pdf

The GIL prevents the Python interpreter from exploiting multi-core platforms. In order to work around this problem, the multiprocessing module was created.

http://docs.python.org/3/library/multiprocessing.html

Since 2021, there’s an initiative from Guido van Rossum (the creator of Python) to optimize CPython and get rid of the single-threaded bottleneck.

https://github.com/faster-cpython/ideas

The End

from __future__ import *

logo-big.png