This is a collection of tips about Python development that I have gathered over the years and put together. Some of this content may be a bit outdated but can still apply today.
Language Basics
Introduction
Documentation
The Python language has VERY extensive official documentation (written in reStructuredText). Most of the information in this presentation comes from it.
When not sure about something, look here first: https://docs.python.org/3/
The Interactive Console
Python is an interpreted language. By simply typing python you can access the Python console and execute Python code line by line.
$ python
Python 3.11.4 (main, Jun 7 2023, 00:00:00) [GCC 13.1.1 20230511 (Red Hat 13.1.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print("python is bada" + 20 * "s")
python is badassssssssssssssssssss
>>>
There is an enhanced Python console called IPython. It has useful features such as tab completion, syntax coloring and integrated documentation.
Accessing Inline Help
The help() built-in function displays a “man page” for any symbol in the interactive console.
>>> import scapy
>>> help(scapy)
Help on package scapy:
NAME
scapy - Scapy: create, send, sniff, dissect and manipulate network packets.
DESCRIPTION
Usable either from an interactive console or as a Python library.
https://scapy.net
PACKAGE CONTENTS
__main__
all
ansmachine
arch (package)
as_resolvers
asn1 (package)
...
>>> from scapy.layers import ipsec
>>> sa = ipsec.SecurityAssociation(ipsec.ESP, spi=1234)
>>> help(sa) # works on dynamic symbols too
Help on class SecurityAssociation in module scapy.layers.ipsec:
class SecurityAssociation(builtins.object)
| SecurityAssociation(proto, spi, seq_num=1, crypt_algo=None, crypt_key=None, crypt_icv_size=None, auth_algo=None, auth_key=None, tunnel_header=None, nat_t_header=None, esn_en=False, esn=0)
|
| This class is responsible of "encryption" and "decryption" of IPsec packets. # noqa: E501
|
| Methods defined here:
|
| __init__(self, proto, spi, seq_num=1, crypt_algo=None, crypt_key=None, crypt_icv_size=None, auth_algo=None, auth_key=None, tunnel_header=None, nat_t_header=None, esn_en=False, esn=0)
| :param proto: the IPsec proto to use (ESP or AH)
| :param spi: the Security Parameters Index of this SA
| :param seq_num: the initial value for the sequence number on encrypted
| packets
| :param crypt_algo: the encryption algorithm name (only used with ESP)
| :param crypt_key: the encryption key (only used with ESP)
| :param crypt_icv_size: change the default size of the crypt_algo
| (only used with ESP)
| :param auth_algo: the integrity algorithm name
| :param auth_key: the integrity key
| :param tunnel_header: an instance of a IP(v6) header that will be used
| to encapsulate the encrypted packets.
| :param nat_t_header: an instance of a UDP header that will be used
| for NAT-Traversal.
| :param esn_en: extended sequence number enable which allows to use
| 64-bit sequence number instead of 32-bit when using an
| AEAD algorithm
| :param esn: extended sequence number (32 MSB)
...
Syntax
Variables / Types
Python variables are strongly but dynamically typed. Declaration and initialization are done at the same time.
>>> a = 1
>>> type(a)
<class 'int'>
>>> a = "pouet"
>>> type(a)
<class 'str'>
Variables can be explicitly “undefined” with del:
>>> a = 1
>>> type(a)
<class 'int'>
>>> del a
>>> type(a)
NameError: name 'a' is not defined
Base Types: int, float
Python uses arbitrary precision integers. There is no max/min integer value and no overflow is possible.
>>> 2 ** 10
1024
>>> 1 << 120
1329227995784915872903807060280344576
>>> 0xff
255
>>> hex(20)
'0x14'
>>> bin(20)
'0b10100'
>>> int('42')
42
>>> int('0x4b', 16)
75
>>> 5 / 2 # the division operator *always* returns floats
2.5
>>> 32 / 4
8.0
>>> 5 // 2 # explicit integer division
2
>>> 32 // 4
8
Base Types: list, tuple
>>> l = [3, 1, 4, 'awesome', 5, 9]
>>> l[2]
4
>>> l[-1] # reverse indexing
9
>>> l[2:4] # slicing
[4, 'awesome']
>>> l[0] = 7
>>> l.append(8)
>>> l
[7, 1, 4, 'awesome', 5, 9, 8]
>>> 4 in l
True
Tuples are immutable lists.
>>> t = (2, 6, 34)
>>> t[2] = 32
TypeError: 'tuple' object does not support item assignment
Base Types: str, bytes
Literal strings can use single ' or double " quotes indiscriminately. They are strictly identical. For consistency, it is best to stick to one type of quote.
>>> s = "voilà une chaîne unicode"
>>> "_".join(s[-7:])
'u_n_i_c_o_d_e'
>>> b = b"this is a byte string, it only supports ASCII characters"
>>> b[40:45]
b'ASCII'
>>> b[40] # individual bytes are returned as int values
65
>>> r"this is a \raw stri\ng, no escape charac\ters"
'this is a \\raw stri\\ng, no escape charac\\ters'
>>> '''this is
... a multiline
... string'''
'this is\na multiline\nstring'
It is possible to combine any of raw r"", byte b"" and multi-line """ """ strings.
>>> rb"""this
... is a multiline \r
... byte string"""
b'this\nis a multiline \\r\nbyte string'
Base Types: dict
>>> d = {0: 'zero', 6: 'six', 'foo': [0, 8]}
>>> d[6]
'six'
>>> d['foo']
[0, 8]
>>> d['cuir']
KeyError: 'cuir'
>>> d.get('cuir', 0)
0
>>> d['cuir'] = 1000
>>> d
{0: 'zero', 6: 'six', 'foo': [0, 8], 'cuir': 1000}
>>> del d[0]
>>> d
{6: 'six', 'foo': [0, 8], 'cuir': 1000}
>>> 'cuir' in d
True
Code Blocks / Indentation
Python does not use “curly braces” to delimit blocks. Indentation does that job.
if condition1 and condition2:
    for p in packet_list:
        tag_packet(p)
else:
    print('blah')
while len(packet_list) > 0:
    try:
        send_packet(packet_list.pop())
        update_stats(sent=1)
    except IOError:
        update_stats(dropped=1)
A level of indentation can be any number of spaces or tabs as long as it is consistent within the same block. Mixing tabs and spaces inconsistently is an error in Python 3 (TabError). The standard is 4 spaces.
Functions
# function definition with type hints
def print_packet(p: Packet, with_l2: bool = False):
    # the flag must not be named "print_l2" or it would shadow the function
    if with_l2:
        print_l2(p)
    print_l3(p)
    print_l4(p)
# function call
print_packet(my_packet, True)
# function call with keyword arguments
print_packet(with_l2=False, p=my_other_packet)
# return type hint
def function_returning_a_value() -> int:
    return 54
def function_returning_None():
    print("I don't like functions")
https://docs.python.org/3/library/typing.html
Classes
# class definition
class Version:
    # constructor
    def __init__(self, x, y):
        self.x = x
        self.y = y
    # instance method
    def show(self):
        print(f"x={self.x} y={self.y}")
# constructor call
v = Version(17, 1)
# method call
v.show()
# subclass
class DottedVersion(Version):
    # method override
    def show(self):
        print(f"{self.x}.{self.y}")
Exceptions
Like Java, C++ and other high-level languages, Python has exceptions. These are an alternate way to leave a function (i.e. not with a return statement). Exceptions are raised with the raise keyword:
if foo:
    raise BarException("epic fail")
To handle an exception, one must use try/except blocks:
try:
    os.rmdir("bar")
except FileNotFoundError:
    pass
Any expression may raise an exception. If it is not handled explicitly, the function will return and the exception will bubble up to the caller frame until it is handled. If it is not handled at the top level frame, the program execution will terminate and a stack trace will be printed on stderr.
[dlrepo]$ python -m dlrepo
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/rjarry/upstream/dlrepo/dlrepo/__main__.py", line 14, in <module>
    from . import views
  File "/home/rjarry/upstream/dlrepo/dlrepo/views/__init__.py", line 11, in <module>
    import aiohttp_jinja2
ModuleNotFoundError: No module named 'aiohttp_jinja2'
https://docs.python.org/3/tutorial/errors.html
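Here is a minimal sketch of an exception bubbling up through several frames before being handled. The names ConfigError, parse_port, load_config and main are hypothetical and only serve the illustration.

```python
class ConfigError(Exception):
    """Hypothetical application-specific exception."""


def parse_port(value):
    # int() raises ValueError on bad input. Nothing handles it here,
    # so it propagates to the caller.
    return int(value)


def load_config(raw):
    try:
        return {"port": parse_port(raw["port"])}
    except ValueError as e:
        # Translate the low-level error and keep the original exception
        # chained as __cause__.
        raise ConfigError(f"invalid port: {raw['port']!r}") from e


def main(raw):
    try:
        return load_config(raw)
    except ConfigError as e:
        print("error:", e)
        return None
    finally:
        # Runs whether or not an exception was raised.
        print("done")


main({"port": "80"})    # returns {'port': 80}
main({"port": "http"})  # prints the error and returns None
```

The finally clause is the usual place for cleanup code since it runs on both paths.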
Docstrings
The first string literal in the body of a function, class or method is interpreted as its docstring. It is used by the interactive console to provide contextual help and by tools to generate documentation.
def func(a: int, b: str):
    """This does blah
    """
    print(a, b)
class Foo:
    """
    This class is awesome
    """
    def fetch(self):
        "This function does nothing but this is also a docstring"
        pass
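Docstrings are plain attributes and can be read at runtime. A small sketch, where the area function is a made-up example:

```python
import inspect


def area(width: float, height: float) -> float:
    """Return the area of a rectangle.

    The first line is a short summary; details follow after a blank line.
    """
    return width * height


# The raw docstring is stored on the object itself.
print(area.__doc__.splitlines()[0])
# inspect.getdoc() also strips the common indentation.
print(inspect.getdoc(area))
```

This is the same text that help(area) displays in the interactive console.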
Coding Style
Official (and rather inspiring) recommendations: http://www.python.org/dev/peps/pep-0008/
# indentation 4 spaces per level
for i in range(4):
    if i > 2:
        print(i)
# local variables, functions, class methods
my_variable = 2
def underscore_separated_lower_case():
    pass
# classes
class CamelCaseWithoutUnderscores:
    pass
For a few years now, I have been using black https://github.com/psf/black on all my projects and no longer pay attention to code formatting. Here is a very nice article in French about black: https://sametmax2.com/once-you-go-black-you-never-go-back/index.html
Reusing code from other modules
The import Keyword
A .py file is called a Python module. It can be “included” from another module by using the import keyword.
import os
os.makedirs('/home/rjarry/devel/test')
It is also possible to import only selected symbols from a module, which makes the code clearer.
from docutils import ApplicationError
NEVER use “wild” imports. They make the code hard to understand and can cause infamous name collisions.
from scapy import *  # do not do this
The Module Search Path
For the import to work, the .py file must be located in the module search path. The search path is initialized when the interpreter starts from the directory of the script being run (or the current directory in the interactive console), the PYTHONPATH environment variable and installation-dependent defaults. Its value can be accessed/modified through the sys.path variable.
>>> import sys
>>> sys.path
['', '/usr/lib64/python311.zip', '/usr/lib64/python3.11',
'/usr/lib64/python3.11/lib-dynload', '/usr/lib64/python3.11/site-packages']
>>> sys.path.insert(0, '/home/rjarry/devel/test')
>>> import pathlib
>>> pathlib
<module 'pathlib' from '/home/rjarry/devel/test/pathlib.py'>
The current directory is included in the search path by default. Make sure not to shadow any standard library module with a local script. Additional information about the search path initialization: http://docs.python.org/3/library/site.html#module-site
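When in doubt about which file an import would pick up, the location can be checked without importing anything. A minimal sketch using importlib from the standard library:

```python
import importlib.util
import sys

# find_spec() locates a module without importing it.
spec = importlib.util.find_spec("pathlib")
print(spec.origin)  # path of the file that "import pathlib" would load

# sys.path is a plain list of directories searched in order.
for entry in sys.path:
    print(repr(entry))
```

If the printed origin points into your own project instead of the Python installation, a local file is shadowing the standard library module.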
Packages
To structure the code, it is advised to use packages. A package is a folder containing an __init__.py file. To import sub-modules, one can use any of the following forms.
import os.path
import xml.dom.minidom as xml
from django.contrib.admin import models, views
The package itself can also be “imported” by its name:
>>> from django import utils
>>> utils.__file__
......ges/Django-1.5.1-py2.7.egg/django/utils/__init__.pyc'
Note: Only the top-level package needs to be in the module search path.
Overview Of The Standard Library
Operating System Interface: os
http://docs.python.org/3/library/os.html
>>> import os
>>> import signal
>>> os.environ['HOME']
'/home/rjarry'
>>> os.kill(some_pid, signal.SIGTERM)
>>> os.rename('/home/rjarry/test.py', '/home/rjarry/test.py.old')
>>> os.remove('/home/rjarry/test.py.old')
>>> os.makedirs('/home/rjarry/tmp/test')
>>> os.rmdir('/home/rjarry/tmp/test')
>>> for root, dirs, files in os.walk('/home/rjarry/.vim'):
...     print(root, dirs, files)
/home/rjarry/.vim ['colors', 'bundle', 'autoload', ...] ['.netrwhist', 'filetype.vim']
/home/rjarry/.vim/colors [] ['tir_diab.vim']
/home/rjarry/.vim/bundle ['vim-fugitive'] []
...
File Path Utilities: pathlib
http://docs.python.org/3/library/pathlib.html
>>> import pathlib
>>> p = pathlib.Path("/home/rjarry")
>>> p.exists()
True
>>> x = p / "devel" / "../wesh.py"
>>> x
PosixPath('/home/rjarry/devel/../wesh.py')
>>> x.resolve()
PosixPath('/home/rjarry/wesh.py')
>>> pathlib.Path("~/devel").expanduser()
PosixPath('/home/rjarry/devel')
>>> p.parent
PosixPath('/home')
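Path objects also cover most day-to-day file operations. A short sketch that runs in a temporary directory (the results/vm1 layout and the file names are made up):

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    log_dir = root / "results" / "vm1"  # hypothetical layout
    log_dir.mkdir(parents=True)
    (log_dir / "trace.log").write_text("hello\n")
    (log_dir / "notes.txt").write_text("ignored\n")

    # glob() is available directly on Path objects.
    logs = sorted(p.name for p in log_dir.glob("*.log"))
    content = (log_dir / "trace.log").read_text()
    stem = (log_dir / "trace.log").stem
    suffix = (log_dir / "trace.log").suffix

print(logs, repr(content), stem, suffix)
```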
Python Interpreter Internals: sys
http://docs.python.org/3/library/sys.html
>>> import sys
>>> sys.argv
['./autobench.py', '--help']
>>> sys.stdout
<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
>>> sys.version_info >= (3, 9)
True
>>> sys.byteorder
'little'
>>> sys.exit(1)
Command Line Arguments: argparse, glob
http://docs.python.org/3/library/argparse.html
>>> import argparse
>>> parser = argparse.ArgumentParser(prog='my_script.py')
>>> parser.add_argument('-f', '--force', action='store_true',
...                     help='Do not ask for confirmation')
>>> parser.print_help()
usage: my_script.py [-h] [-f]
options:
  -h, --help   show this help message and exit
  -f, --force  Do not ask for confirmation
>>> args = parser.parse_args(['-f'])  # the program name is not part of the list
>>> args.force
True
http://docs.python.org/3/library/glob.html
>>> import glob
>>> glob.glob('results/jacky_vm/*.log')
['results/jacky_vm/trace.log', 'results/jacky_vm/catherine.log', 'results/jacky_vm/localhost.log', 'results/jacky_vm/jacky_hypervisor.log', 'results/jacky_vm/jacky_vm.log']
Pattern Matching: re
http://docs.python.org/3/library/re.html
Python supports Perl-like regular expressions with some useful extensions.
>>> import re
>>> pattern = re.compile(r'processor\s+:\s\d+.+?model name\s+:\s+(.+?)\n',
...                      re.DOTALL)
>>> buf = open('/proc/cpuinfo').read()
>>> cores = 0
>>> model = None
>>> for match in pattern.finditer(buf):
...     cores += 1
...     model = match.group(1)
>>> cores
4
>>> model
'Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz'
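One of the useful extensions is named groups, written (?P<name>...). Here is a sketch that parses "key : value" lines similar to /proc/cpuinfo. The sample text is invented for the example.

```python
import re

# One "key : value" pair per line; MULTILINE makes ^ and $ match at each line.
LINE = re.compile(r"^(?P<key>[\w ]+?)\s*:\s*(?P<value>.*)$", re.MULTILINE)

sample = "processor\t: 0\nmodel name\t: Hypothetical CPU @ 3.40GHz\n"

# Groups can be accessed by name on the match object.
info = {m["key"]: m["value"] for m in LINE.finditer(sample)}
print(info)
```

Named groups keep the code readable when a pattern grows beyond one or two captures.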
String Formatting
F-strings:
>>> src = ".vimrc"
>>> dst = "~/.vimrc"
>>> user = "root"
>>> host = "tom"
>>> f'scp {src} {user}@{host}:{dst}'
'scp .vimrc root@tom:~/.vimrc'
https://docs.python.org/3/reference/lexical_analysis.html#f-strings
Lazy formatting:
>>> 'scp {} {}@{}:{}'.format(src, user, host, dst)
'scp .vimrc root@tom:~/.vimrc'
>>> 'scp {s} {u}@{h}:{d}'.format(s=src, u=user, h=host, d=dst)
'scp .vimrc root@tom:~/.vimrc'
https://docs.python.org/3/library/string.html#format-string-syntax
Legacy printf formatting:
>>> 'scp %s %s@%s:%s' % (src, user, host, dst)
'scp .vimrc root@tom:~/.vimrc'
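F-strings and str.format() share the same format specification mini-language (width, alignment, precision, etc.). A small sketch with made-up values:

```python
name = "eth0"
rx_bytes = 1234567
ratio = 0.98765

# Width, alignment, thousands separator and precision go after the colon.
line = f"{name:<8}|{rx_bytes:>12,}|{ratio:6.1%}"
print(line)

# The same specification works with str.format() and format().
assert "{:>12,}".format(rx_bytes) == format(rx_bytes, ">12,")

# "=" (Python 3.8+) shows the expression along with its value.
debug = f"{rx_bytes=}"
print(debug)
```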
Interlude
Development Tools
Integrated Development Environments
Why Use An IDE?
Python is a very “frameworked” ecosystem. One often uses a lot of external libraries. Having a little help from the IDE can save a lot of time.
PROs
- Source code completion
- Error tracking
- Easier source browsing
- Interactive debugger
CONs
- Requires setup
- Slow to start (GUI IDEs only)
- Needs a graphical environment (GUI IDEs only)
- Memory footprint (GUI IDEs only)
As a general rule, when working on a project for more than an hour, it makes sense to use an IDE, especially if the project has multiple source files and uses a lot of external libraries which you don’t know.
Graphical IDEs
- Eclipse with PyDev http://pydev.org/
- Microsoft Visual Studio Code with the Python extension https://marketplace.visualstudio.com/items?itemName=ms-python.python
- JetBrains PyCharm (not open source, requires a license) https://www.jetbrains.com/pycharm/
NB: I used PyDev in the past. I tried VS Code and PyCharm but did not find either of them very ergonomic.
Vim
Using vim as an IDE requires a little more involvement. However it is by far the most flexible and powerful solution for any language (not only Python). Installing extensions/plugins for vim can be done in multiple ways, here is the one I use:
curl -fLo ~/.vim/autoload/plug.vim --create-dirs \
https://raw.githubusercontent.com/junegunn/vim-plug/master/plug.vim
touch ~/.vim/vimrc
" ~/.vim/vimrc
set nocompatible
set wildmode=longest,list,full
set wildmenu
set complete-=i
set completeopt=menu,menuone,noselect,noinsert
runtime plugconfig.vim
runtime pluginstall.vim
" ~/.vim/plugconfig.vim
"jedi
let g:jedi#show_call_signatures = 2
let g:jedi#goto_command = "<F3>"
"ale
let g:ale_linters_explicit = 1
let g:ale_linters = { 'python': ['pylint'] }
let g:ale_set_signs = 0
let g:ale_use_global_executables = 0
let g:ale_completion_enabled = 0
let g:ale_fixers = { 'python': ['black'] }
let g:ale_fix_on_save = 0
nnoremap <A-f> :ALEFix<CR>
if has("nvim")
let g:ale_use_neovim_diagnostics_api = 1
else
let g:ale_virtualtext_cursor = "all"
let g:ale_set_highlights = 1
endif
" ~/.vim/pluginstall.vim
filetype off
call plug#begin()
"Plugins
Plug 'davidhalter/jedi-vim'
Plug 'dense-analysis/ale'
call plug#end()
filetype plugin indent on
Open vim, and run :PlugInstall.
My personal configuration files contain extensive customization of vim. Feel free to have a look for inspiration: https://git.sr.ht/~rjarry/dotfiles
Debugging Python
Using The Command Line Debugger
[linux-tools]$ python -m pdb -c 'b 213' -c c -m linux_tools.irqstat
Breakpoint 1 at linux_tools/irqstat.py:213
> linux_tools/irqstat.py(213)main()
-> t.print(sys.stdout)
(Pdb) s
--Call--
> linux_tools/table.py(41)print()
-> def print(self, fileobj: io.StringIO, with_headers: bool = True):
(Pdb) p fileobj
<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
(Pdb) b 52
Breakpoint 2 at linux_tools/table.py:52
(Pdb) c
> linux_tools/table.py(52)print()
-> fileobj.write(self.separator.join(headers) + "\n")
(Pdb) n
IRQ AFFINITY EFFECTIVE-CPU DESCRIPTION
> linux_tools/table.py(53)print()
-> for row in self.rows:
(Pdb)
More info here: http://docs.python.org/3/library/pdb.html
Alternative Debugging Tools
- PDB++ https://github.com/pdbpp/pdbpp Drop-in replacement of the built-in pdb module with some enhancements.
- vimspector https://github.com/puremourning/vimspector Multi-language debugger plugin for vim.
- PyDev debugger https://www.pydev.org/manual_101_run.html
- PyCharm debugger https://www.jetbrains.com/help/pycharm/part-1-debugging-python-code.html
- VS Code debugger https://code.visualstudio.com/docs/languages/python#_debugging
Profiling Python Code
The cProfile Module
[linux-tools]$ python -m cProfile -s time -m linux_tools.netgraph
...
13095 function calls (12831 primitive calls) in 0.037 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
22 0.004 0.000 0.004 0.000 {method 'read' of '_io.BufferedReader' objects}
17 0.003 0.000 0.003 0.000 {built-in method marshal.loads}
66/64 0.003 0.000 0.003 0.000 {built-in method builtins.__build_class__}
5 0.002 0.000 0.002 0.000 {built-in method _posixsubprocess.fork_exec}
8 0.001 0.000 0.001 0.000 {built-in method _imp.create_dynamic}
59 0.001 0.000 0.003 0.000 <frozen importlib._bootstrap_external>:1604(find_spec)
25/15 0.001 0.000 0.002 0.000 _parser.py:507(_parse)
123 0.001 0.000 0.001 0.000 {built-in method posix.stat}
3 0.001 0.000 0.001 0.000 enum.py:1657(convert_class)
26 0.000 0.000 0.005 0.000 <frozen importlib._bootstrap>:1054(_find_spec)
45/14 0.000 0.000 0.001 0.000 _compiler.py:37(_compile)
282 0.000 0.000 0.001 0.000 <frozen importlib._bootstrap_external>:128(<listcomp>)
17 0.000 0.000 0.000 0.000 {built-in method io.open_code}
http://docs.python.org/3/library/profile.html
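The profiler can also be driven from code, which helps when only a portion of a program is of interest. A minimal sketch where slow_sum is a made-up workload:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# Format the five most expensive entries, sorted by internal time.
out = io.StringIO()
stats = pstats.Stats(profiler, stream=out)
stats.sort_stats("tottime").print_stats(5)
report = out.getvalue()
print(report)
```

The report has the same columns as the command line output above.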
Analyzing Profiling Results With KCacheGrind
[linux-tools]$ sudo dnf install kcachegrind
[linux-tools]$ pip install --user pyprof2calltree
[linux-tools]$ python -m cProfile -o profile.stats -m linux_tools.netgraph
[linux-tools]$ pyprof2calltree -i profile.stats -k
VirtualEnv
What Is a VirtualEnv?
venv is a tool to create isolated Python environments.
https://docs.python.org/3/library/venv.html
It allows installing python dependencies into a dedicated folder.
How To Use VirtualEnv
[tmp]$ python -c "import scapy; print(scapy)"
<module 'scapy' from '/usr/lib/python3.11/site-packages/scapy/__init__.py'>
[tmp]$ python -m venv .venv
[tmp]$ . .venv/bin/activate
(.venv) [tmp]$ python -c "import scapy"
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'scapy'
(.venv) [tmp]$ pip install scapy
Collecting scapy
Downloading scapy-2.5.0.tar.gz (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 14.4 MB/s eta 0:00:00
...
(.venv) [tmp]$ python -c "import scapy; print(scapy)"
<module 'scapy' from '/home/rjarry/tmp/.venv/lib64/python3.11/site-packages/scapy/__init__.py'>
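A script can tell whether it runs inside a virtual environment by comparing two attributes of the sys module. A small sketch (in_virtualenv is a hypothetical helper name):

```python
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("in venv:    ", in_virtualenv())
```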
Interlude
* Sonium has joined #python-dev
<Sonium> someone speak python here?
<lucky> HHHHHSSSSSHSSS
<lucky> SSSSS
<Sonium> the programming language
Code Distribution
NB: I had lots of outdated info here. I removed everything. There is now an official packaging guide.
https://packaging.python.org/en/latest/tutorials/packaging-projects/
Interlude
LUKE Is Perl better than Python?
YODA No… no… no. Quicker, easier, more seductive.
LUKE But how will I know why Python is better than Perl?
YODA You will know. When your code you try to read six months from now.
Advanced Concepts
Useful Syntax Constructs
Boolean Operators
http://docs.python.org/3/library/stdtypes.html#boolean-operations-and-or-not
>>> init_value = None
>>> r = init_value or 3
>>> x = 0 if r == 3 else -1 # ternary operator (t ? a : b); "r == 3 and 0 or -1" would give -1 since 0 is falsy
>>> r
3
>>> x
0
>>> 4 in [0, 2, 4, 6, 8]
True
>>> d = {'moustache': 1, 'cuir': 8}
>>> 'cuir' in d # tests whether the key exists (dict.has_key() was removed in Python 3)
True
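The and/or operators return one of their operands rather than a strict boolean, which is why the old "cond and a or b" idiom is fragile. A small sketch (pick_old and pick are hypothetical names):

```python
# "or" returns the first truthy operand, "and" the first falsy one
# (or the last operand when none qualifies).
assert (None or 3) == 3
assert (0 or "" or []) == []
assert (1 and 2) == 2
assert (1 and 0 and 2) == 0


def pick_old(cond, a, b):
    return cond and a or b  # breaks when "a" is falsy


def pick(cond, a, b):
    return a if cond else b  # conditional expression


print(pick_old(True, 0, -1))  # -1, not 0
print(pick(True, 0, -1))      # 0
```

The "x or default" form remains handy for fallback values, as long as falsy values such as 0 or "" are not legitimate inputs.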
List-Comprehensions
http://docs.python.org/3/tutorial/datastructures.html#list-comprehensions
>>> numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> even = [ x for x in numbers if x % 2 == 0 ]
>>> even
[0, 2, 4, 6, 8]
>>> [ x * 10 for x in even ]
[0, 20, 40, 60, 80]
>>> import math
>>> [ math.sin(x) for x in even ]
[0.0, 0.9092974268256817, -0.7568024953079282, -0.27941549819892586, 0.9893582466233818]
The with Keyword
>>> with open('/etc/passwd', 'r') as f:
...     buf = f.read()
...
>>> f.closed # the file is closed when the block exits
True
>>> buf
'root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/bin/sh\nbin:x:2:2:bin:/bin:/bin/sh\nsys:x:3:
This is roughly equivalent to:
f = open('/etc/passwd', 'r')
try:
    buf = f.read()
finally:
    f.close()
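The with statement is not limited to files. Any object implementing the context manager protocol works, and contextlib makes it easy to write one. A sketch, where working_directory is a hypothetical helper:

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current directory."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield path
    finally:
        # Executed on normal exit and when the block raises.
        os.chdir(previous)


before = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        inside = os.getcwd()
after = os.getcwd()
print(before == after, inside != before)
```

The code before the yield runs when entering the block, the code in finally when leaving it.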
The yield Keyword
>>> def big_list():
...     result = []
...     for i in range(1 << 28):
...         result.append(i)
...     return result
>>> for x in big_list(): # unnecessary allocation of a list
...     print(x)
>>> def big_list():
...     for i in range(1 << 28):
...         yield i
>>> big_list()
<generator object big_list at 0x7f22b84e5370>
>>> for x in big_list(): # limited & constant memory usage
...     print(x)
0
1
2
3
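Since generators produce values on demand, they can be chained into pipelines and can even be infinite. A small sketch (naturals and squares are made-up names):

```python
import itertools


def naturals():
    """Infinite generator: values are produced only when requested."""
    n = 0
    while True:
        yield n
        n += 1


def squares(numbers):
    for n in numbers:
        yield n * n


# Nothing is computed until the consumer asks for values.
pipeline = squares(x for x in naturals() if x % 2 == 0)
first = list(itertools.islice(pipeline, 5))
print(first)
```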
The *args And **kwargs Magic
http://docs.python.org/3/tutorial/controlflow.html#unpacking-argument-lists
>>> def f(*args, **kwargs):
...     print('args:', args, 'kwargs:', kwargs)
...
>>> f(4, 5, cuir=None, jacky='ok')
args: (4, 5) kwargs: {'cuir': None, 'jacky': 'ok'}
>>> def f(a, b, c=-1):
...     return a * b * c
...
>>> k = {'b': 1, 'a': 8}
>>> f(**k)
-8
>>> a = [2, 4, 6]
>>> f(*a)
48
Lambda Expressions
Lambda expressions are “anonymous” function declarations.
>>> f = lambda x: x ** 2
>>> f
<function <lambda> at 0x7f775f9c8320>
>>> f(8)
64
Here is a practical use example:
>>> k = [Job('j1', prio=2), Job('j2', prio=1), Job('j3', prio=8)]
>>> k.sort(key=lambda j: j.prio)
>>> k
[Job('j2', prio=1), Job('j1', prio=2), Job('j3', prio=8)]
Decorators
Decorators add “behaviour” to a function by wrapping it in another function.
>>> def trace(f): # define a new "trace" decorator
...     def __wrapped(*args, **kwargs):
...         print(f'{f.__name__}: args {args} {kwargs}')
...         val = f(*args, **kwargs)
...         print(f'{f.__name__}: return {val}')
...         return val
...     return __wrapped
...
>>> @trace # equivalent of cuir = trace(cuir)
... def cuir(a, b, c):
...     return a * b + c
...
>>> r = cuir(1, 2, c=4)
cuir: args (1, 2) {'c': 4}
cuir: return 6
>>> r
6
Decorators also work on class definitions.
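Two refinements are worth a sketch: functools.wraps, which preserves the name and docstring of the decorated function, and a class decorator. This is a variant for illustration, with hypothetical names (REGISTRY, register, Plugin, add):

```python
import functools


def trace(f):
    @functools.wraps(f)  # copy __name__, __doc__, etc. onto the wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return f(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


REGISTRY = {}


def register(cls):
    """Class decorator: record the class by name and return it unchanged."""
    REGISTRY[cls.__name__] = cls
    return cls


@trace
def add(a, b):
    """Add two numbers."""
    return a + b


@register
class Plugin:
    pass


add(1, 2)
add(3, 4)
print(add.__name__, add.calls)
print(REGISTRY)
```

Without functools.wraps, add.__name__ would be "wrapper" and help(add) would lose the docstring.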
Asynchronous I/O
Since Python 3.5, Python has the async and await keywords.
By themselves, these keywords do nothing. They require an event loop to run the coroutines that they define.
Available implementations:
- https://docs.python.org/3/library/asyncio.html The standard implementation included in the standard library.
- https://trio.readthedocs.io/en/stable/ alternative event loop and async framework.
- https://github.com/MagicStack/uvloop drop-in replacement of the asyncio event loop with libuv-based acceleration.
Example:
async def foo(a, b):
    if a == 2:
        x = await bar(a) # does not block the current thread
    else:
        x = await bar(b) # yields back control to the event loop
    return x
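Here is a self-contained sketch that can actually be run with the standard asyncio event loop. The fetch coroutine is hypothetical and asyncio.sleep() stands in for a real network call.

```python
import asyncio
import time


async def fetch(name, delay):
    # asyncio.sleep() stands in for a real network call.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Run the three coroutines concurrently on the same thread.
    return await asyncio.gather(
        fetch("a", 0.2), fetch("b", 0.2), fetch("c", 0.2)
    )


start = time.monotonic()
results = asyncio.run(main())
elapsed = time.monotonic() - start
print(results, f"{elapsed:.2f}s")
```

The three waits overlap, so the total time should be close to one delay rather than the sum of all three.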
Memory Management
Variable Scope
There are 3 types of scopes: global, class, function.
GLOBAL_VARIABLE = 2
class Test:
    # "GLOBAL_VARIABLE" is visible
    CLASS_VARIABLE = 'super bien'
    def method(self, local1, local2):
        # "GLOBAL_VARIABLE" and "local*" are visible
        def _inner_func(a, b):
            # "GLOBAL_VARIABLE", "a", "b" and "local*" are visible
            # in order to modify the non-global variables, they need to be
            # declared as nonlocal first
            nonlocal local1
            local1 = 5
            ...
        # to access class variables
        Test.CLASS_VARIABLE or self.CLASS_VARIABLE
        if local1 == 2:
            x = 0
        # "x" is visible even "outside" the "if"
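A common practical use of nonlocal is a closure that keeps private state. A minimal sketch (make_counter is a made-up example):

```python
def make_counter(start=0):
    count = start

    def increment(step=1):
        # Without "nonlocal", the assignment would create a new local
        # variable and "count += step" would raise UnboundLocalError.
        nonlocal count
        count += step
        return count

    return increment


counter = make_counter(10)
print(counter())   # 11
print(counter(5))  # 16
other = make_counter()  # each call gets its own "count"
print(other())     # 1
```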
Reference Counting & Garbage Collecting
Python variables are “labels” attached to objects. When a label is attached to an object, the object's reference counter is incremented. When the label is deleted or rebound, the counter is decremented. When the reference counter of an object reaches 0, the memory used by the object is freed immediately.
Interesting (old) post http://foobarnbaz.com/2012/07/08/understanding-python-variables/
Note: Reference counting alone cannot free objects that reference each other in a cycle. CPython has an additional cyclic garbage collector for those, which can be inspected and tuned through the gc module: http://docs.python.org/3/library/gc.html. Normally the default settings will suit most situations, use this with care.
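Both behaviours can be observed with weak references. This sketch assumes CPython (other interpreters may free objects later), and the Node class is made up for the example:

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


# No cycle: the object is freed as soon as the last reference goes away.
a = Node()
probe = weakref.ref(a)
del a
freed_immediately = probe() is None

# Cycle: the two objects keep each other alive after "del".
gc.disable()  # keep the demonstration deterministic
x, y = Node(), Node()
x.other, y.other = y, x
probe = weakref.ref(x)
del x, y
alive_after_del = probe() is not None
gc.collect()  # the cycle collector reclaims them
freed_after_collect = probe() is None
gc.enable()

print(freed_immediately, alive_after_del, freed_after_collect)
```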
Threading & GIL
Threading support
Since Python 3.7, the interpreter is always built with thread support (it used to depend on the --with-threads build option). It can spawn new native threads (pthreads on Linux) through the threading module.
http://docs.python.org/3/library/threading.html
It exposes an API similar to the Java Thread Interface.
import threading
class MyThread(threading.Thread):
    def __init__(self, dataset, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.dataset = dataset
    def run(self):
        # do stuff with self.dataset
        pass
threads = [MyThread(d) for d in datasets]
for t in threads:
    t.start()
for t in threads:
    t.join()
The Global Interpreter Lock
The CPython interpreter is not fully “thread-safe”, mainly because of its memory management system. As a consequence, only one thread can execute Python bytecode at a time.
Each time a thread runs in the interpreter, it holds the GIL. It releases it when it does a blocking I/O operation, and it is asked to release it after a switch interval (5 ms by default, see sys.setswitchinterval()) to allow other threads to run too.
Very detailed explanation on “why GIL?”: http://www.dabeaz.com/python/UnderstandingGIL.pdf
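Threads remain useful for I/O-bound work because a blocked thread does not hold the GIL. A small sketch where time.sleep() stands in for a blocking I/O call (io_task is a hypothetical function):

```python
import threading
import time


def io_task(delay, results, i):
    time.sleep(delay)  # sleeping releases the GIL, like blocking I/O
    results[i] = i


results = [None] * 4
threads = [
    threading.Thread(target=io_task, args=(0.2, results, i)) for i in range(4)
]
start = time.monotonic()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
print(f"{elapsed:.2f}s")  # should be close to 0.2 s rather than 4 * 0.2 s
```

With CPU-bound work in io_task instead of a sleep, the four threads would take about as long as running the tasks one after the other.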
The GIL prevents a single Python process from exploiting multi-core platforms with threads.
In order to work around this problem, the multiprocessing module was created.
http://docs.python.org/3/library/multiprocessing.html
Since 2021, there has been an initiative from Guido van Rossum (the creator of Python) to optimize CPython and get rid of the single-threaded bottleneck.
https://github.com/faster-cpython/ideas
The End
from __future__ import *