Code management#
We will present a few key concepts about software management. Modern practices of software management (especially within a group of developers) have evolved into the so-called DevOps methodology, which integrates several steps to develop, test and release software.
Branching#
git allows you to keep multiple instances of your project using branches.
From https://nvie.com/posts/a-successful-git-branching-model/
- Keep it simple: a single master / main branch
- Use feature branches for new / experimental features
You can work on several branches, but only one at a time can be “checked out”. Keep an eye on the prompt to know which branch you are on, or use the following command
git branch -va
* feature_branch 4d64f07 Update text
master 4d64f07 Update text
For collaborations, you may want to follow GitHub flow or a similar practice.
Tagging#
Tag your releases whenever you need to provide an official version of your code, like 1.0.0. Actually, the tag can be any string; it need not be a numeric version.
This information is typically stored in some file within the code folder, so you should remember to update it when a new version is ready. I will assume this is done with some script
./update_version 1.0.0
Then you can commit the change with
git commit -am "Bump version 1.0.0"
Finally, you can create a tag. A git
tag is just a label attached to a commit
git tag 1.0.0
To get a list of the current tags
git tag -l
1.0.0
Finally, to push the tag 1.0.0
to the server
git push origin 1.0.0
Versioning#
Possible software versioning schemes:

- X: simple incremental versioning
- X.Y: major-minor (ex. 0.1, that's where you start from)
- X.Y.Z: major-minor-patch (ex. git 2.17.1)
- X.Y.Z.*: alpha / beta / release candidate info
There is one very reasonable way to decide how to increment versions: semantic versioning https://semver.org
In a nutshell:
- increase X: backward-incompatible changes
- increase Y: new, backward-compatible features
- increase Z: backward-compatible bug fixes / cosmetic changes
Note
While semantic versioning does not fix all the issues related to dependencies between multiple software versions (aka dependency hell!), it is a highly recommended practice. The underlying assumption is that X
evolves on a slower time scale than Y
and that the public API of the code is stable enough.
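As an illustrative sketch, the increment rules above can be encoded in a small function. Note that `bump` is a made-up helper for this example, not part of any standard library or of semver itself:

```python
# Hypothetical helper illustrating semantic versioning increments
def bump(version, change):
    """Return the next X.Y.Z version string for a given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":    # backward-incompatible change
        return f"{major + 1}.0.0"
    elif change == "minor":  # new, backward-compatible feature
        return f"{major}.{minor + 1}.0"
    elif change == "patch":  # bug fix / cosmetic change
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")

print(bump("2.17.1", "minor"))  # 2.18.0
```

Note that bumping the major or minor number resets the less significant components to zero, as semver prescribes.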
A Python package template#
This repository contains a template Python package that you can use to start new software projects based on Python. It is meant to illustrate in a specific case some general features of code management, which are largely independent of the programming language of your choice.
Clone the project into the my_new_code
folder
git clone https://framagit.org/coslo/template-python.git my_new_code
Cloning into 'my_new_code'...
and delete the .git/
folder at the root of directory my_new_code
, so that you can start developing a brand new code.
Environment setup#
You should provide a list of packages / libraries on which your code depends.
Rules of thumb:

- Freeze the major versions (X) of all dependencies, so that changes to their public APIs do not break your code
- Provide a lower bound on Y, corresponding to the lowest minor version that is compatible with your code
Rule 1 can be relaxed if you realize that your code works with a more recent major version.
In practice, Python provides a few ways to define dependencies:

- requirements.txt: a list of packages to be installed via the pip package manager (see a few paragraphs further down)
- pyproject.toml: currently the official way of handling dependencies and code packaging in Python projects; the use of a setup.py script is deprecated in favor of pyproject.toml.
For instance, this is an excerpt of requirements.txt
in the template-python project
argh>=0.28.1
For finer control on package versions, see the official pypa documentation.
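To see why `argh>=0.28.1` expresses a lower bound, here is a minimal sketch of how such a comparison works. Real tools like pip handle many more cases (pre-releases, wildcards, multiple specifiers), so this is only for intuition; the function name is made up:

```python
# Toy illustration of a >= version constraint (real resolvers are more general)
def satisfies_lower_bound(installed, minimum):
    """Check a >= constraint by comparing numeric version components."""
    to_tuple = lambda v: tuple(int(x) for x in v.split("."))
    return to_tuple(installed) >= to_tuple(minimum)

print(satisfies_lower_bound("0.28.1", "0.28.1"))  # True
print(satisfies_lower_bound("0.27.0", "0.28.1"))  # False
```

Comparing tuples rather than raw strings is essential: as strings, "0.9.0" would sort after "0.28.1".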
I recommend keeping a per-project Python virtual environment for every code you develop and deleting old environments from time to time. The venv package has been integrated in Python since version 3.3 and provides a simple and fairly robust way to handle Python virtual environments, described in full detail in this nice tutorial.
In essence, make sure the venv
package is installed in the Python distribution provided by your OS (this may require sudo apt install python3-venv
or an analogous command for your OS). To create a virtual environment, type
python -m venv env
This will create a directory named env/
in the current directory. To activate the environment
. env/bin/activate
Note the dot (.
) at the beginning of the command! If your prompt is correctly set up, it will tell you that you are working in that specific environment.
Check which Python executable we are using now
which python
/home/coslo/teaching/tools/env/bin/python
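You can also check from within Python itself whether a virtual environment is active: inside a venv, `sys.prefix` points to the environment directory while `sys.base_prefix` still points to the base installation. The helper name below is made up for illustration:

```python
import sys

def in_virtualenv():
    """True if the interpreter is running inside a virtual environment."""
    return sys.prefix != sys.base_prefix

print(sys.executable)    # path of the Python interpreter in use
print(in_virtualenv())   # True inside an activated venv, False otherwise
```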
Then install your dependencies
pip install -r requirements.txt
and you are ready to develop and test your code. If you need additional Python packages for code development, just install them in the env/
environment.
Alternatively, you can install your package with pip
in editable mode (although at the time of writing, this does not work with pyproject.toml
unless you use the flit
build backend)
pip install -e .
pip can also install local or remote Python projects (over a network, on a remote git server, …); see https://pip.pypa.io/en/stable/cli/pip_install/#vcs-support
# A local version of numpy
pip install /home/coslo/usr/numpy
# A specific tag / branch / commit of numpy from github
pip install numpy@git+https://github.com/numpy/numpy@v1.21.0
When you are done working with this virtual environment, deactivate it
deactivate
Note
Consider the virtual environment as disposable: it should always be possible to delete it and recreate it at any time from a requirements.txt file. Keeping multiple environments can easily eat up a lot of disk space, so remember to delete them when they are not actively needed.
Unit testing#
From https://en.wikipedia.org/wiki/Unit_testing
Unit testing is a software testing method by which individual units of
source code are tested to determine whether they are fit for use
In Python, the unittest
package provides a rather straightforward approach to unit testing. Just keep one or more files with your tests in a tests/
folder, where each file has the following structure
import unittest
class Test(unittest.TestCase):
def setUp(self):
# Executed at the beginning of each Test method
pass
def test_simple(self):
self.assertTrue(True)
self.assertEqual(1, 1)
self.assertAlmostEqual(1.0, 1.0)
def tearDown(self):
# Executed at the end of each Test method
pass
You should of course import your own package and perform some actual tests in there, but you get the idea.
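For concreteness, here is a self-contained variant that tests an actual function. The toy `greet` function is defined inline here purely for illustration; in a real project you would import it from your package. The tests are run programmatically, which is equivalent to what unittest discovery does:

```python
import unittest

def greet(name):
    # Toy function standing in for your package's code
    return f"Hello, {name}!"

class TestGreet(unittest.TestCase):
    def test_greet(self):
        self.assertEqual(greet("world"), "Hello, world!")
        self.assertTrue(greet("tester").startswith("Hello"))

# Run the tests programmatically, as unittest discover would
suite = unittest.TestLoader().loadTestsFromTestCase(TestGreet)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```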
To see this in action, let’s use template-python again. From the root of the package, execute
python -m unittest discover -s tests
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
Note
There are ways to execute these tests within your code editor, which lets you jump to the relevant line of code in case a test fails (ex. Ctrl+C Ctrl+T
in emacs).
Ideally, the tests should cover almost 100% of the source code of your package - however, this does not guarantee that it will always run correctly in all circumstances! To check your code coverage, you can use the coverage
package
coverage run -m unittest discover -s tests
coverage report
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
Name Stmts Miss Cover
-------------------------------------------
mypackage/__init__.py 0 0 100%
mypackage/cli.py 6 2 67%
tests/test_hello.py 9 0 100%
-------------------------------------------
TOTAL 15 2 87%
Continuous integration#
From https://en.wikipedia.org/wiki/Continuous_integration
Continuous integration (CI) is the practice of merging all developers'
working copies to a shared mainline several times a day. Nowadays it
is typically implemented in such a way that it triggers an automated
build with testing.
The goal is to spot bugs quickly and to reduce the chance of conflicts between the main branch and those on which other developers are working.
In practice, this means that code testing and other tasks (ex. generating code documentation) are executed every time you push your code repository to your git server, which will perform these tasks remotely and send you an e-mail if anything fails. While this is rather overkill for single-user projects, it is very useful for large collaborations.
There are several implementations of CI, and the main git platforms (GitHub and GitLab) provide their own too:
GitHub Actions (see also this example)
Jenkins, Travis, …
Here we have a look at a minimal GitLab CI configuration, which performs

- unit tests on different Python versions
- documentation updates
At the root of the template project you will find a file .gitlab-ci.yml
with the following content.
before_script:
- python -V # print out python version for debugging
- pip install virtualenv
- virtualenv env
- source env/bin/activate
- pip install -r requirements.txt
.test:
script:
- pip install coverage
- coverage run -m unittest discover -s tests
artifacts:
paths:
- .coverage
coverage: '/^TOTAL.+?(\d+\%)$/'
The target before_script
will install all necessary dependencies in a virtual environment, while .test
provides a template for running the tests under coverage and extracting the corresponding coverage fraction.
We can now add targets to perform the above tasks for different Python versions using different Docker images.
test:3.6:
image: python:3.6
extends: .test
test:latest:
image: python:latest
extends: .test
We also add a target to generate the documentation from the files under docs
and move them to a public web page accessible on the GitLab server (the path is specific to each GitLab instance). The only
keyword restricts this task to runs where a new tag is created and pushed to the git server (an eco-friendly practice).
pages:
script:
- pip install sphinx
- make -C docs html
- mv docs/_build/html public
artifacts:
paths:
- public
only:
- tags
Through this mechanism you can check the status of the tests, coverage and doc generation from the project web page.