Code management#

We will present a few key concepts about software management. Modern practices of software management (especially within a group of developers) have evolved into the so-called DevOps methodology, which integrates several steps to develop, test and release software.

Branching#

git allows you to keep multiple instances of your project using branches.

[Figure: a git branching model, from https://nvie.com/posts/a-successful-git-branching-model/]

Keep it simple: a single master / main branch


Use feature branches for new / experimental features


You can work on several branches, but only one at a time can be “checked out”. Keep an eye on the prompt to know which branch you are on, or use the following command

git branch -va
* feature_branch 4d64f07 Update text
  master         4d64f07 Update text

For collaborations, you may want to follow GitHub flow or a similar practice.
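As a sketch, a minimal feature-branch workflow could look like the following (the branch name and commit message are illustrative):

```shell
# Create a feature branch and switch to it
git checkout -b feature_branch
# ... edit files, then record the changes ...
git commit -am "Add experimental feature"
# Switch back to the main branch and merge the feature
git checkout master
git merge feature_branch
# Delete the branch once it has been merged
git branch -d feature_branch
```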

Tagging#

Tag your releases whenever you need to provide an official version of your code, like 1.0.0. Actually, the tag can be any string; it need not be a numeric version.

This information is typically stored in some file within the code folder, so you should remember to update it when a new version is ready. I will assume this is done with some script

./update_version 1.0.0
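As an illustration, here is a minimal Python sketch of what such a script could do, assuming the version string lives in mypackage/__init__.py as a __version__ variable (both the file layout and the variable name are assumptions, not part of the template):

```python
# Hypothetical sketch of an update_version script: it rewrites the
# __version__ string stored in a package's __init__.py file.
import re

def update_version(path, version):
    """Replace the __version__ assignment in `path` with `version`."""
    with open(path) as fh:
        text = fh.read()
    # Match __version__ = "..." or __version__ = '...' and substitute
    text = re.sub(r'__version__\s*=\s*["\'][^"\']*["\']',
                  f'__version__ = "{version}"', text)
    with open(path, "w") as fh:
        fh.write(text)
```

A small command-line wrapper around this function (reading the new version from the command line) would reproduce the ./update_version interface assumed above.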

Then you can commit the change with

git commit -am "Bump version 1.0.0"

Next, you can create a tag. A git tag is just a label attached to a commit

git tag 1.0.0

To get a list of the current tags

git tag -l
1.0.0

Finally, to push the tag 1.0.0 to the server

git push origin 1.0.0

Versioning#

Possible software versioning schemes:

  • X: simple incremental versioning

  • X.Y: major-minor (ex: 0.1, that’s where you start from)

  • X.Y.Z: major-minor-patch (ex. git 2.17.1)

  • X.Y.Z.*: alpha / beta / release candidate info

There is one very reasonable way to decide how to increment versions: semantic versioning https://semver.org

In a nutshell:

  • increase X: backward incompatible changes

  • increase Y: new features

  • increase Z: backward compatible bug fixes
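To make the comparison rules concrete, here is a small Python sketch (parse_version and is_breaking_change are hypothetical helpers, not standard library functions):

```python
def parse_version(version):
    """Split a version string like '2.17.1' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))

def is_breaking_change(old, new):
    """Under semantic versioning, only a major (X) bump signals
    backward incompatible changes."""
    return parse_version(new)[0] > parse_version(old)[0]

# Tuples of integers compare element by element, so ordering works
# correctly even where string comparison would put '1.10.0' before '1.9.0'
print(parse_version("1.10.0") > parse_version("1.9.0"))  # True
print(is_breaking_change("1.9.0", "2.0.0"))              # True
print(is_breaking_change("1.9.0", "1.10.0"))             # False
```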

Note

While semantic versioning does not fix all the issues related to dependencies between multiple software versions (aka dependency hell!), it is a highly recommended practice. The underlying assumption is that X evolves on a slower time scale than Y and that the public API of the code is stable enough.

A Python package template#

This repository contains a template Python package that you can use to start new software projects based on Python. It is meant to illustrate in a specific case some general features of code management, which are largely independent of the programming language of your choice.

Clone the project into the my_new_code folder

git clone https://framagit.org/coslo/template-python.git my_new_code
Cloning into 'my_new_code'...

and delete the .git/ folder at the root of directory my_new_code, so that you can start developing a brand new code.


Environment setup#

You should provide a list of packages / libraries on which your code depends.

Rules of thumb:

  1. Freeze the major versions (X) of all dependencies, so that changes to their public API do not break your code

  2. Provide a lower bound on Y, corresponding to the lowest minor version that is compatible with your code

Rule 1. can be relaxed if you realize that your code is working with a more recent major version.

In practice, Python provides a few ways to define dependencies:

  1. requirements.txt: provide a list of packages to be installed via the pip package manager (see a few paragraphs further down)

  2. pyproject.toml: currently the official way of handling dependencies and code packaging in Python projects; the use of a setup.py script is deprecated in favor of pyproject.toml.

For instance, this is an excerpt of requirements.txt in the template-python project

argh>=0.28.1

For finer control on package versions, see the official pypa documentation.
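As an illustration of the rules of thumb above, here are a few hypothetical requirements.txt entries using pip's version specifier syntax (the package names and bounds are examples only):

```
numpy>=1.21,<2       # at least 1.21, but stay within major version 1
requests==2.28.*     # any patch release of 2.28
argh~=0.28.1         # "compatible release": equivalent to >=0.28.1, <0.29
```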

I recommend keeping a per-project Python virtual environment for every code you develop and deleting old environments from time to time. The venv package has been integrated in Python since version 3.3 and provides a simple and fairly robust way to handle Python virtual environments, described in full detail in this nice tutorial.

In essence, make sure the venv package is installed in the Python distribution provided by your OS (this may require sudo apt install python3-venv or an analogous command for your OS). To create a virtual environment, type

python -m venv env

This will create a directory named env/ in the current directory. To activate the environment

. env/bin/activate

Note the dot (.) at the beginning of the command! If your prompt is correctly set up, it will tell you that you are working in that specific environment.

Check which Python executable we are using now

which python
/home/coslo/teaching/tools/env/bin/python
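From within Python itself you can also check whether the interpreter is running inside a virtual environment: in a venv, sys.prefix points to the environment directory, while sys.base_prefix points to the base installation.

```python
import sys

# True when running inside a virtual environment (Python 3.3+)
in_venv = sys.prefix != sys.base_prefix
print(sys.prefix)
print(in_venv)
```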

Then install your dependencies

pip install -r requirements.txt

and you are ready to develop and test your code. If you need additional Python packages for code development, just install them in the env/ environment.

Alternatively, you can install your package with pip in editable mode (although, at the time of writing, this does not work with pyproject.toml unless you use the flit build backend)

pip install -e .

pip can also install local or remote Python projects (over a network, from a remote git server, …), see https://pip.pypa.io/en/stable/cli/pip_install/#vcs-support

# A local version of numpy
pip install /home/coslo/usr/numpy
# A specific tag / branch / commit of numpy from github
pip install numpy@git+https://github.com/numpy/numpy@v1.21.0

When you are done working with this virtual environment, deactivate it

deactivate

Note

Consider the virtual environment as disposable: it should always be possible to delete it and recreate it at any time from a requirements.txt file. Keeping multiple environments can easily eat up a lot of disk space, so remember to delete them when they are not actively needed.

Unit testing#

From https://en.wikipedia.org/wiki/Unit_testing

Unit testing is a software testing method by which individual units of
source code are tested to determine whether they are fit for use

In Python, the unittest package provides a rather straightforward approach to unit testing. Just keep one or multiple files with your tests in a tests/ folder, where each file has the following structure

import unittest

class Test(unittest.TestCase):

    def setUp(self):
        # Executed at the beginning of each Test method
        pass

    def test_simple(self):
        self.assertTrue(True)
        self.assertEqual(1, 1)
        self.assertAlmostEqual(1.0, 1.0)

    def tearDown(self):
        # Executed at the end of each Test method
        pass

You should of course import your own package and perform some actual tests in there, but you should get the idea.
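For instance, a test for a hypothetical hello() function could look like this (the function is defined inline here as a stand-in; in a real project you would import it from your package instead):

```python
import unittest

def hello(name):
    """Stand-in for a function normally imported from your package."""
    return f"Hello, {name}!"

class TestHello(unittest.TestCase):

    def test_greeting(self):
        self.assertEqual(hello("world"), "Hello, world!")

    def test_result_type(self):
        self.assertIsInstance(hello("world"), str)
```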

To see this in action, let’s use template-python again. From the root of the package, execute

python -m unittest discover -s tests
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK

Note

There are ways to execute these tests within your code editor, which lets you jump to the relevant line of code in case a test fails (ex. Ctrl+C Ctrl+T in Emacs).

Ideally, the tests should cover almost 100% of the source code of your package - however, this does not guarantee that it will always run correctly in all circumstances! To measure your code coverage, you can use the coverage package

coverage run -m unittest discover -s tests
coverage report
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK
Name                    Stmts   Miss  Cover
-------------------------------------------
mypackage/__init__.py       0      0   100%
mypackage/cli.py            6      2    67%
tests/test_hello.py         9      0   100%
-------------------------------------------
TOTAL                      15      2    87%

Continuous integration#

From https://en.wikipedia.org/wiki/Continuous_integration

Continuous integration (CI) is the practice of merging all developers'
working copies to a shared mainline several times a day. Nowadays it
is typically implemented in such a way that it triggers an automated
build with testing.

The goal is to spot bugs quickly and to reduce the chance of conflicts between the main branch and those on which other developers are working.

In practice, this means that code testing and other tasks (ex. generating code documentation) are executed every time you push your code repository to your git server, which will perform these tasks remotely and send you an e-mail if anything fails. While this is rather overkill for single-user projects, it is very useful for large collaborations.

There are several implementations of CI, and the main git platforms (GitHub and GitLab) provide their own too.

Here we have a look at a minimal GitLab CI configuration, which performs

  • unit tests on different Python versions

  • documentation update

At the root of the template project you will find a file .gitlab-ci.yml with the following content.

before_script:
  - python -V  # print out python version for debugging
  - pip install virtualenv
  - virtualenv env
  - source env/bin/activate
  - pip install -r requirements.txt

.test:
  script:
    - pip install coverage
    - coverage run -m unittest discover -s tests
  artifacts:
    paths:
      - .coverage
  coverage: '/^TOTAL.+?(\d+\%)$/'

The target before_script will install all necessary dependencies in a virtual environment, while .test provides a template job that runs the unit tests under coverage and extracts the corresponding coverage fraction.

We can now add targets to perform the above tasks for different Python versions using different Docker images.

test:3.6:
  image: python:3.6
  extends: .test

test:latest:
  image: python:latest
  extends: .test

We also add a target to generate the documentation from the files under docs and move them to a public web page accessible on the GitLab server (the path is specific to each GitLab instance). The only parameter restricts this task to runs where a new tag is created and pushed to the git server (an eco-friendly practice).

pages:
  script:
    - pip install sphinx
    - make -C docs html
    - mv docs/_build/html public
  artifacts:
    paths:
      - public
  only:
    - tags

Through this mechanism you can check the status of the tests, coverage and doc generation from the project web page.