Implementing dependency management with Python Poetry
One of the biggest issues with Python (our preferred development language) is dependency management. Any individual deployment can result in minor differences in the versions of the files and libraries which make up the application. And these minor variations introduce uncertainty and randomness into our deployment process.
So, we've investigated the available solutions for making our deployments more reliable and repeatable. There are several tools which we already use (virtual environments, pip, etc.) to isolate each individual project from the default Python libraries installed by the operating system. While this is a good first step towards our goal, it doesn't go far enough.
The Python packaging and deployment space has seen tremendous growth over the past several years and there are a number of possible solutions to the problem of repeatable dependency management: pipenv, pip-tools, hatch and the tool we ultimately chose: Poetry.
These tools try to manage both the top-level and the transitive dependencies of a project. The top-level dependencies are the packages that your project depends upon directly, such as Django, django-cms, and psycopg2-binary. Transitive dependencies are the packages that your top-level dependencies depend upon, the packages those depend on in turn, and so on. Minor variations in these dependencies usually cause no problems, but when they do, it is often difficult to determine which package is responsible.
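To make the distinction concrete, here is a minimal sketch that splits a dependency graph into top-level and transitive packages. The graph below is hypothetical: the edges are made up for illustration and are not the real metadata of these packages, and `split_dependencies` is a name invented for this example.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it requires directly.
# The edges are illustrative only, not the real metadata of these packages.
DEPENDS_ON = {
    "my-project": ["django", "django-cms", "psycopg2-binary"],
    "django": ["asgiref", "sqlparse"],
    "django-cms": ["django", "django-treebeard"],
    "django-treebeard": ["django"],
    "psycopg2-binary": [],
    "asgiref": [],
    "sqlparse": [],
}


def split_dependencies(project, graph):
    """Return (top_level, transitive) sets of package names for project."""
    top_level = set(graph.get(project, []))
    seen = set(top_level)
    queue = deque(top_level)
    # Breadth-first walk: follow every requirement until nothing new appears.
    while queue:
        package = queue.popleft()
        for requirement in graph.get(package, []):
            if requirement not in seen:
                seen.add(requirement)
                queue.append(requirement)
    # Anything reached that was not requested directly is transitive.
    return top_level, seen - top_level


top_level, transitive = split_dependencies("my-project", DEPENDS_ON)
print("top-level: ", sorted(top_level))
print("transitive:", sorted(transitive))
```

Even in this tiny graph, half of what gets installed was never named by the project itself, which is why pinning only the top-level packages is not enough.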
The tools we've explored solve this problem in different ways but the end result is the same. Poetry creates two files for the project: pyproject.toml and poetry.lock. PEP 518 specified the pyproject.toml file, which is intended to be a standard place for build tools and common configuration information. Poetry extends it to include the top-level dependency specifications, similar to requirements.txt (though there is an existing PEP, PEP 621, to incorporate that information in a standard way instead of using tool-specific entries). The poetry.lock file records the exact resolved version of every dependency, top-level and transitive. Both files are intended to be checked into your version control system, and should be kept in sync.
Generating a New Project
When you create a new project, you initialize it with 'poetry new', which creates a project directory structure like so:
$ poetry new poetry-demo
poetry-demo
├── pyproject.toml
├── README.rst
├── poetry_demo
│   └── __init__.py
└── tests
    ├── __init__.py
    └── test_poetry_demo.py
And a pyproject.toml file like so:
[tool.poetry]
name = "poetry-demo"
version = "0.1.0"
description = ""
authors = ["Scott Sharkey <ssharkey@imagescape.com>"]

[tool.poetry.dependencies]
python = "*"

[tool.poetry.dev-dependencies]
pytest = "^3.4"
One of the first things that we do is edit the Python dependency to name a specific Python version (e.g. 3.7.10) instead of the wildcard. This way, Poetry will insist on finding and installing dependencies which support that version of Python.
Converting an Existing Project
If you have an existing project that you want to bring under Poetry's management, run 'poetry init' in the project directory. It asks some questions interactively and creates a pyproject.toml for you:
$ cd pre-existing-project
$ poetry init
Converting from Requirements.txt
If you are converting from pip and requirements.txt, there is a wonderful tool named dephell which will convert from pip or pipenv or other formats to Poetry.
$ dephell deps convert --from-format=pip --to-format=poetry
Configuring Dependencies
Once we've initialized the project, we add dependencies manually, and Poetry generates the lock file and makes the matching entries in our pyproject.toml. (Alternatively, you can use dephell as discussed above.)
$ poetry add "pendulum@^1.4"
In our example, we are requesting the pendulum package with the version constraint ^1.4. This means any version greater than or equal to 1.4.0 and less than 2.0.0 (>=1.4.0,<2.0.0). The caret is Poetry's own shorthand; you can also write constraints using the comparison operators defined in PEP 440. These constraints control how far Poetry is allowed to move each dependency when you ask it to update the lock file.
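The caret rule can be sketched in a few lines of Python: the upper bound is found by bumping the left-most non-zero component of the version. This is a simplified illustration and not Poetry's own implementation. It assumes plain numeric versions with at most three components (no pre-releases or local tags), and `caret_range` and `satisfies` are names invented for this example.

```python
def caret_range(constraint):
    """Expand a caret constraint such as "^1.4" into (lower, upper) tuples.

    lower is inclusive, upper is exclusive.
    """
    if not constraint.startswith("^"):
        raise ValueError(f"not a caret constraint: {constraint!r}")
    parts = [int(piece) for piece in constraint[1:].split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected 1 to 3 numeric components: {constraint!r}")
    # The left-most non-zero component may not change; if every given
    # component is zero, the last one given is the one that gets bumped.
    pivot = next((i for i, n in enumerate(parts) if n != 0), len(parts) - 1)
    lower = tuple(parts + [0] * (3 - len(parts)))
    upper_parts = parts[:pivot] + [parts[pivot] + 1]
    upper = tuple(upper_parts + [0] * (3 - len(upper_parts)))
    return lower, upper


def satisfies(version, constraint):
    """Return True if a dotted numeric version falls inside the caret range."""
    lower, upper = caret_range(constraint)
    candidate = tuple(int(piece) for piece in version.split("."))
    candidate += (0,) * (3 - len(candidate))
    return lower <= candidate < upper


print(caret_range("^1.4"))         # the range requested for pendulum above
print(satisfies("1.5.1", "^1.4"))  # inside the range
print(satisfies("2.0.0", "^1.4"))  # a new major version is excluded
```

Note that the rule is stricter for 0.x versions: under this sketch "^0.2.3" only allows versions below 0.3.0, because the minor number is the left-most non-zero component.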
I strongly recommend that you use the "add" command above, as it will generate the proper syntax, but you may also edit the pyproject.toml file manually. If you do edit manually, be sure to run 'poetry lock' afterwards to sync up the .toml file and the transitive dependencies in poetry.lock.
You should check both the .toml file and the .lock file into your version control so that the exact dependencies will be installed by other developers, ensuring consistency across environments.
Development Dependencies
For some projects, you have development dependencies which are tools that you install while developing the project, but which should NOT be installed in a production environment (debuggers, etc.). To add these dependencies to your project use the following command:
$ poetry add --dev <package>
Installing Your Project
When you are ready to install your project, or when a new developer joins your team and needs to get up to speed on your project, you use 'poetry install' to install the libraries specified. There are two situations:
Without an Existing Poetry.lock
If you do not have a 'poetry.lock' file, Poetry will load pyproject.toml, evaluate the top-level dependencies, resolve the transitive dependencies that match the various package constraints, and install everything into the currently active virtualenv (we will discuss virtual environments in a future blog post, but we do not use Poetry to manage them). It will also create a new 'poetry.lock' file, recording the exact versions of everything installed.
If your pyproject.toml and poetry.lock files are out of sync, Poetry will inform you and ask you to run 'poetry lock' to re-sync them.
With an Existing Poetry.lock
If there is an existing 'poetry.lock' file, the process changes slightly. Assuming the .toml and .lock files are in sync, Poetry will use the existing .lock file to install exactly those dependencies, ensuring that everyone will have exactly the same versions, in all environments. If you need to update your dependencies, see the section below.
Installation Options
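By default, Poetry installs all dependencies, both dev and production. If you want to install just the production dependencies, without development tools, use the '--no-dev' flag. (Poetry 1.2 and later deprecate this flag in favor of dependency groups, where the equivalent is '--without dev'.)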
Additionally, Poetry will install your project (i.e., the code in the current directory) by default. This is useful when developing libraries, but less so for applications. To stop Poetry from installing your local project, use the '--no-root' option.
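In a deployment script, these options can be assembled once so that every environment runs the same command. The sketch below only builds the argument list. The flag names are the ones described in this article, and `install_command` and its parameters are hypothetical names chosen for the example.

```python
import subprocess


def install_command(include_dev=True, install_root=True):
    """Build the 'poetry install' argument list for the requested options."""
    command = ["poetry", "install"]
    if not include_dev:
        command.append("--no-dev")   # skip development-only dependencies
    if not install_root:
        command.append("--no-root")  # do not install the project itself
    return command


def run_install(include_dev=True, install_root=True):
    """Run the install in the current directory; raises if Poetry fails."""
    subprocess.run(install_command(include_dev, install_root), check=True)


# A production deployment of an application might use:
print(" ".join(install_command(include_dev=False, install_root=False)))
```

Calling `run_install(include_dev=False, install_root=False)` would execute the same command, assuming Poetry is on the PATH and the target virtualenv is active.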
Updating Dependencies
Now that we've solved the issue of repeatable builds, we have only one remaining task: to update the dependencies as safely and uniformly as possible. Fortunately, the 'poetry update' command recalculates the dependencies when run, updates every package that has a newer release still satisfying its constraint (usually meaning minor version changes), and rewrites poetry.lock accordingly. If you want to move to a new major version, you change the constraint in pyproject.toml to allow it, and Poetry will recalculate all the other dependencies.
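The selection an update performs for a single package can be pictured as "pick the newest release that still fits the allowed range". The sketch below shows only that step, under simplifying assumptions: versions are plain dotted numbers, the list of available releases is hypothetical, and `best_update` is a name invented for this example. A real resolver also has to satisfy every other package's constraints at the same time.

```python
def parse(version):
    """Turn a dotted numeric version like "1.5.1" into a comparable tuple."""
    return tuple(int(piece) for piece in version.split("."))


def best_update(available, lower, upper):
    """Return the newest version v with lower <= v < upper, or None."""
    candidates = [
        version
        for version in available
        if parse(lower) <= parse(version) < parse(upper)
    ]
    return max(candidates, key=parse, default=None)


# Hypothetical list of published releases for one package.
AVAILABLE = ["1.4.0", "1.4.4", "1.5.1", "2.0.0", "2.1.2"]

# With the constraint ^1.4 (>=1.4.0,<2.0.0) an update stays on the 1.x line.
print(best_update(AVAILABLE, "1.4.0", "2.0.0"))

# After changing the constraint to ^2.0 (>=2.0.0,<3.0.0) the major version moves.
print(best_update(AVAILABLE, "2.0.0", "3.0.0"))
```

This is why a major upgrade never happens by accident: until the constraint itself is edited, releases outside the range are simply not candidates.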
Additional Poetry Features
In addition to all of this, Poetry has many other features including environment management, library building, etc. As we become more familiar with it, we may add additional thoughts through this blog. Until then, enjoy the reliability and consistency of Poetry.