We use Python in a complex and critical scenario: the automated deposit and withdrawal of cryptocurrencies.

It is hard to say whether choosing Python here was a good decision. It is a highly dynamic, untyped (by default) language with poor runtime performance. However, we did observe real merits in our engineering experience: flexibility, ease of debugging, speed of development, pleasant syntax, and plenty of well-designed libraries.

In this article, I'll focus mostly on the technical aspects, such as code snippets and workflows that might be helpful to you, though we will also discuss some high-level ideas.

Writing Code

Repo organization

.
├── DEBIAN
│   ├── control
│   └── postinst
├── Makefile
├── README.md
├── base*
│   ├── __init__.py
│   ├── ...
│   └── util.py
├── bin
│   ├── load_addresses
│   └── ...
├── btc
│   ├── __init__.py
│   ├── ...
│   └── worker
│       ├── __init__.py
│       ├── ...
│       └── broadcast.py
├── build.sh
├── db
│   ├── 1001-initial-tables.sql
│   └── ...
├── broadcast.service
├── requirements.txt
├── setup.py
└── tests
    └── main

NOTE: the base module is annotated with a * because it is in fact maintained as a git submodule.

We basically follow what most Python packages do: the library source code lives under a dedicated folder (btc in this case). Every folder level needs an __init__.py if you want that folder path to be importable as a module as well (e.g. things defined in btc/worker/__init__.py appear under the namespace btc.worker after importing).
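
For example, assuming btc/worker/broadcast.py defines a function broadcast_transactions (a hypothetical name for illustration), it becomes importable from anywhere once the package is installed:

# broadcast_transactions is a hypothetical name used for illustration
from btc.worker import broadcast

broadcast.broadcast_transactions()

# or equivalently, through the full dotted path
import btc.worker.broadcast
btc.worker.broadcast.broadcast_transactions()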

The setup.py is responsible for registering and installing the package, so that you can import it from Python anywhere:

from setuptools import setup, find_packages

setup(
    name='example-project',
    version='1.2.0',
    packages=find_packages(exclude=['contrib', 'docs', 'tests*']),
    python_requires='>=3.5'
)

tests/main is a Python script for the integration test. In our project, all files relevant to one integration test run are kept inside a local, exclusive folder such as tests/test01, so with N such folders we can run up to N test jobs concurrently. However, we should use virtualization in the future to properly solve races over external resources.
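
A minimal sketch of how a run could claim an exclusive folder (the flock-based locking here is an assumption for illustration, not our exact implementation):

# Each concurrent test run claims one exclusive folder under tests/.
# The flock-based scheme is an assumption for illustration.
import fcntl
import pathlib

def claim_test_folder(root='tests', n=4):
    """Try to lock one of tests/test01 .. tests/test0N; return (folder, lock)."""
    for i in range(1, n + 1):
        folder = pathlib.Path(root) / 'test{:02d}'.format(i)
        lock = open(str(folder / '.lock'), 'w')
        try:
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return folder, lock   # keep `lock` open for the whole run
        except BlockingIOError:
            lock.close()          # folder busy, try the next one
    raise RuntimeError('all test folders are busy')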

We put all MySQL schemas in db. They are numbered by check-in date, and every schema modification must be appended incrementally as a new SQL file, rather than by editing the previous .sql files directly.
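
A minimal sketch of applying these numbered schema files in order (the helper is illustrative, not our actual migration runner):

# Illustrative only: apply db/*.sql in filename order (1001-..., 1002-..., ...).
# The naive split on ';' is fine for simple schema files.
import glob

import pymysql

def apply_schemas(conn):  # conn: a pymysql connection
    for path in sorted(glob.glob('db/*.sql')):
        with open(path) as f:
            statements = f.read().split(';')
        with conn.cursor() as cur:
            for statement in statements:
                if statement.strip():
                    cur.execute(statement)
        conn.commit()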

We also rely on virtualenv for our development.

This is the script run by the build bot on each CI run:

#!/bin/bash
set -x
set -e

## run in build bot

virtualenv -p python3 .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .
make check
tests/main

Conventions

  • All pure utilities in bin are chmod a+x'd, stripped of their file extension, and given a #!/usr/bin/env python3 shebang.
  • All command line tools should use argparse to parse arguments.
  • Respect PEP-8 for basic code style.
  • Respect PEP-257 and PEP-287 for writing inline documentation.
  • Write type hints on a best-effort basis.
  • Use pymysql through the standard database API of PEP-248 and PEP-249.
  • Do NOT use float for calculations that must be precise (several of these conventions are illustrated in the sketch after this list).
  • For a complex constructor with a lot of string arguments, name each argument at the call site.
  • Extend generic functionality through default arguments, and try not to break legacy code.
  • Don't catch the generic Exception; catch what you specifically need.
  • Don't overuse the exception mechanism; in some cases it is more meaningful to abort.
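
A small sketch illustrating several of the conventions above (the tool and its flags are made up for illustration):

#!/usr/bin/env python3
"""Illustrative command line utility; the tool and its flags are made up."""
import argparse
import decimal

def main() -> None:
    # Convention: command line tools parse arguments with argparse.
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument('--address', required=True)
    parser.add_argument('--amount', required=True)
    args = parser.parse_args()

    # Convention: catch exactly what can fail here, not a generic Exception.
    try:
        amount = decimal.Decimal(args.amount)
    except decimal.InvalidOperation:
        parser.error('--amount must be a decimal number')

    # Convention: never use float for money; Decimal arithmetic stays exact.
    fee = amount * decimal.Decimal('0.001')
    print('sending', amount, 'to', args.address, 'with fee', fee)

if __name__ == '__main__':
    main()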

Static Check

Lint code with PyLint

PyLint can detect most static errors. A few tips on using it:

  1. Install it in the project's virtualenv (along with everything else in requirements.txt), or it might fail to find some of the packages your code depends on.
  2. Vanilla pylint is too strict for CI purposes, so our build bot runs pylint -d W -d C -d R -d U instead. You might need a # pylint: disable=<error-name> annotation to suppress a particular type of error for a particular line, file, imported module, or the entire project, as in the snippet below.
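
For example (the disabled messages are real pylint message names; the code is contrived):

import os, sys  # pylint: disable=multiple-imports

RESULT = eval('1 + 1')  # pylint: disable=eval-used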

Check type hints statically with mypy

mypy can check the code against type hints. We usually run it as mypy --ignore-missing-imports.

Type hints also serve as natural and important documentation.
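
For example, a hypothetical helper whose signature documents itself and is verified by mypy at every call site:

# find_deposit is a hypothetical helper; the hints double as documentation
# and are checked by `mypy --ignore-missing-imports`.
from decimal import Decimal
from typing import Optional

def find_deposit(txid: str, vout: int) -> Optional[Decimal]:
    """Return the deposited amount for (txid, vout), or None if unknown."""
    ...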

Dynamic Check

Check type hints dynamically with enforce

enforce makes further use of type hints to check types dynamically (Python's reflection makes this possible). This catches type violations at runtime that static analysis can miss.

However, as you might expect, performance suffers; in our case, performance is the last thing we worry about.
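
A minimal sketch of enforce in use, assuming its runtime_validation decorator (the function is made up):

import enforce

@enforce.runtime_validation
def confirmations(height: int, tip: int) -> int:
    return tip - height + 1

confirmations('100', 105)  # raises a RuntimeTypeError at call time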

Defensive Programming

We use a lot of asserts in our code. They make the code more robust and also serve as documentation of its preconditions.

When writing a conditional without an else branch, ask yourself whether the missing branch is actually an exceptional case that should be reported.
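
A sketch of this style (the helper functions are hypothetical):

from decimal import Decimal

def credit_deposit(account_id: int, amount: Decimal) -> None:
    # The assert doubles as documentation: a non-positive deposit here
    # means an upstream bug, so crash loudly instead of continuing.
    assert amount > 0, 'deposit amount must be positive: {}'.format(amount)

    status = lookup_status(account_id)    # hypothetical helper
    if status == 'active':
        apply_credit(account_id, amount)  # hypothetical helper
    else:
        # The conditional's "missing else": an inactive account is not
        # a normal case here, so report it instead of silently skipping.
        raise RuntimeError('deposit to inactive account {}'.format(account_id))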

Integration Test and Code Coverage

Our integration test achieves more than 80% code coverage; the uncovered parts are almost all exception-handling code.

The major external states of our service are the database and the blockchain. Both are version-controlled and initialized from scratch on each CI run, which caught most bugs in a reproducible way before they crept into production. The integration test is also written from the end user's perspective, which keeps it flexible against code changes, and everything is made as close to the production environment as possible.

Continuous Integration

Our CI process includes two parts:

  • static check
    • pylint
    • mypy
  • integration test
    • during which coverage will be collected and checked as well

By going through all 357 build logs from the past two months of one typical project, we found the following distribution of failure reasons:

  • integration test error: 39
  • lint error: 9
  • type error: 5
  • coverage error: 1

Packaging and Release

Our software consists of two types: daemons and utilities.

We pack the Python runtime, the libraries, and our scripts together into a standalone Linux executable with the pyinstaller tool. The advantage is that we don't have to install any Python dependencies on the target machine; the drawback is that debugging becomes harder, since only bytecode is available in the released binary.

The daemons are managed as systemd services. We release the software as a .deb package.

All the packaging work is done in the Makefile.

Operation and Monitoring

The stdout logs of our daemons are collected by syslog. We manage the daemons through the standard interface provided by systemd, such as systemctl status/start/stop/restart. Finally, our monitoring infrastructure is based on collecting simple metrics: latency, traffic, errors, and saturation (see Prometheus for more). We also health-check our services through an active heartbeat.
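
As a rough sketch of the idea, here is what such metrics could look like with the prometheus_client library (an assumption; the metric names are made up):

from prometheus_client import Counter, Histogram, start_http_server

# Made-up metric names covering errors and latency of one worker.
BROADCAST_ERRORS = Counter('broadcast_errors_total', 'Failed broadcasts')
BROADCAST_LATENCY = Histogram('broadcast_latency_seconds', 'Broadcast latency')

@BROADCAST_LATENCY.time()
def broadcast(tx):
    try:
        ...  # actual broadcasting elided
    except ConnectionError:
        BROADCAST_ERRORS.inc()
        raise

start_http_server(9100)  # expose /metrics for Prometheus to scrape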

Revision – three months later

Our tools have now been in production for about one month and have handled more than half a million dollars' worth (at today's prices, at least) of deposits and withdrawals. This is just the bootstrapping period, but we have already learned a lot about the consequences of our design choices and made many revisions to our workflow and code layout.

No postinst

The adduser responsibility is handed to SRE instead; only a single line in the README is provided as an example.

Check in the dependencies

To 1) better track changes in our release dependencies and 2) make our repository more self-contained, we decided to check all Python dependencies into our repository.

For development-only dependencies, e.g. pylint, we pip download them, put them inside the packages folder, and check them in.

For released dependencies, e.g. pymysql, we use some Makefile logic to maintain the extracted contents of the packages downloaded from PyPI inside a folder called third-party. For a .whl package, simply unzipping it and deleting the .whl is enough; for a .tar.gz package, we need to manually symlink the relevant module directories up to the third-party level. We use these dependencies by pointing PYTHONPATH at this folder.

This revision changed not only our code review process but also our release process. We now use git archive to build a source distribution, including everything in our repository, into a .deb package, which means we release source code to our servers. The lack of compilation and obfuscation is a double-edged sword: debugging became much easier, but keeping the code confidential became harder.

Code style is checked as part of CI

We started making autopep8-based code style checking part of CI.

DB status is monitored separately

Since the DB status is not closely tied to how our code runs, we found it easier to monitor it separately. This made the monitoring more centralized and easier to audit.


Configurations are part of the code

During the integration test, the configuration is generated on the fly and fed to the code under test. This ensures that running the tests always yields an up-to-date example of the configuration.
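
A sketch of the idea (the keys and paths are made up for illustration):

# The integration test writes a fresh config before starting the
# service under test; keys and paths here are made up.
import json

def write_test_config(path='tests/test01/config.json'):
    config = {
        'mysql': {'host': '127.0.0.1', 'port': 3306, 'user': 'worker'},
        'bitcoind': {'rpc_url': 'http://127.0.0.1:18443'},
    }
    with open(path, 'w') as f:
        json.dump(config, f, indent=2)
    return config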

SQL GRANTs are added to integration test

We added the GRANT statements to the integration test so that our permission control is always up to date.

It also means that our CI tests are role-explicit: each operation must be issued by the role that would issue it in the production deployment.

Improvements to make check

PYSRC=...

# One stamp-style target per source file.
CHECKED_PYSRC=$(foreach s,${PYSRC},$(s).checked)

# Type-check and format-check a single file. No stamp file is written,
# so every file is re-checked on each run.
%.checked: %
        @echo "type checking $^..."
        mypy --ignore-missing-imports --strict-optional $^
        ./check-format $^

NUM_CORES=$(shell grep -c ^processor /proc/cpuinfo)

# Rewrite files in place to conform to the style checks.
format:
        autopep8 --max-line-length 80 -i -j ${NUM_CORES} ${PYSRC}

strict-lint:
        pylint -j ${NUM_CORES} ${PYSRC}

# The relaxed lint used by CI (warning/convention/refactor checks disabled).
lint:
        pylint -j ${NUM_CORES} -d W -d C -d R -d U ${PYSRC}

# mypy runs per file via the %.checked pattern rule above, so it is
# not a separate target here.
check: lint ${CHECKED_PYSRC}

coverage:
        coverage run tests/main
        coverage html

And this is the check-format script used above:

#!/bin/bash

# Print the diff autopep8 would apply; fail if any file needs reformatting.
DIFF=$(autopep8 --max-line-length 80 -d "$@")
if [ -n "$DIFF" ]
then
    echo "$DIFF"
    exit 1
fi

[Chart: CI failure statistics during that period]