packagehowto/README.md
2024-04-30 18:53:40 +02:00

322 lines
8.8 KiB
Markdown

# Package HowTo
This is a brief introduction to python packages. How to make them, how
to install them, how to upload them, how to maintain them, how to work
with them. For details consult https://packaging.python.org, or
https://py-pkgs.org, or other sites.
Before you create a package please consider building a [virtual
environment](./venv.md) for your project.
## Why packages?
When working on your project, you typically end up with some scripts,
functions and classes that could be of more general interest. For
example, if you would use this code in some other projects as well.
As a first step, you make some modules, i.e. separate python files,
where you collect this code. These modules can be easily imported from
other scripts in the very same directory.
For example, consider a module `addition.py` and a script `analyze.py`
both in the same directory:
```txt
├── addition.py
└── analyze.py
```
In `addition.py` we define a function `add_two()`:
```
def add_two(x):
return x + 2
```
We can use this function in `analyze.py` by importing it from
`addition.py`:
```
from addition import add_two
x = 5
y = add_two(x)
```
This works, however, only if scripts and modules are in the same
directory (or if modules are in sub-directories). To make modules
available in other places of your file system or even for other
people, you need to turn the modules into packages.
## Minimal package
First we make a project directory for our new package. Usually the
name of the project directory is the same as the one of the
package. Here, however, we call the project directory `packagehowto`
and the name of the package `numerix`.
A package is a directory, and the name of the package is the name of
this directory, here `numerix`. Inside the package directory we put
all the modules, for now just `addition.py`.
What makes this directory a package is the presence of a file named
`__init__.py`. This file is executed when the package is imported. For
now we leave it empty.
The package directory resides in the `src/` directory of the
project. The layout of your package project then looks like this:
```txt
packagehowto/
├── pyproject.toml
└── src/
└── numerix
├── __init__.py
└── addition.py
```
A package also needs a `pyproject.toml` file. This file contains some
metadata about the package and informations about how to build a
package. As the bare minimum the content of the `pyproject.toml` file
specifies `setuptools` to be used for building the project:
```txt
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
```
With this `pyproject.toml` file, and `addition.py` and `__init__.py`
in the `src/numerix` directory, you can install the `numerix` package
on your machine and import it from wherever you want.
To install the project, go to the project root, here `packagehowto/` and run
from your shell
```sh
pip3 install .
```
This installs the package of the current project folder `.` somewhere
in your home directory. From anywhere in your home file system you now
can import this package. The import line of our `analyze.py` file
needs to look like this:
```
from numerix.addition import add_two
```
From the addition module of the numerix package the function add_two
is imported.
You would need to reinstall the package whenever you change your
package, for exmple when you add a new function or when you just fix
some package code. This is tedious, and that is why there is a `-e`
option ("editable install") for `pip install`. So install your package
with
```sh
pip3 install -e .
```
Then all future changes on your package are immediately available
without the need to reinstall the package again.
## The `__init__.py` file
This file is your package. In fact, you could write a package that
only has an `__init__.py` file in the package directory. All code you
want to make available in you package could be placed in the
`__init__.py` file. For example, if you define an `add_four()` function
in `__init__.py` like this:
```
def add_four(x):
return x + 4
```
then it can be imported directly from the package:
```
from numerix import add_four
```
Alternatively, you can define all your functions in module files, like
we did with the function `add_two()` in the `addition.py` module. In
the `__init__.py` file you can import this function and this way make
it available directly from the package. With this line in `__init__.py`
```
from .addition import add_two
```
(note the `.` in front of `addition` - this makes it a relative
import) you can import `add_two()` without specifying the module:
```
from numerix import add_two
```
## Multiple modules within a package
You can have as many modules as you need in your package. Let's add a
module `numbers.py` to the `numerix` package
```txt
└── numerix
├── __init__.py
├── addition.py
└── numbers.py
```
with the following content:
```
from .addition import add_two
def three():
return add_two(1)
```
This makes a function `three()` available that can be imported via
```
from numerix.numbers import three
```
Note, that the `numbers.py` module imports a function from the
`addition.py` module via a relative import with the `.` in front of
`addition`.
And of course you can also import this function in `__init__.py` so
that it can be imported from the package directly.
## Versioning
Your package needs a version number (for details see below). But how to
choose the right version number? A nice and brief overview is given at
the [Python Packaging User
Guide](https://packaging.python.org/en/latest/discussions/versioning/).
In short, for semantic versioning the version number is composed of
three integer numbers called `major`, `minor`, and `patch`, separated
by a dot. These numbers are incremented in the following way:
- `major`, when incompatible API changes have been introduced,
- `minor`, when functionality is added in a backwards-compatible manner, and
- `patch`, for backwards-compatible bug fixes.
## Package version
For uploading your package to [PyPi](https://pypi.org/) you need to
specify a version number in the `pyproject.toml` file. However, your
package should also provide a `__version__` variable. And some
scripts of your package might want to know about the version number
as well when you call them with a `--version' argument.
For the latter two cases, you can define a `__version__` variable in
one of the modules, or even in a dedicated module, that you then
import whereever you need it. For example, you may create a simple
module called `version.py` file with the following content:
```
__version__ = '1.4.2'
""" Current version of the numerix package as string 'x.y.z'. """
__year__ = '2024'
""" Year of the current numerix version as string. """
```
In `__init__.py` you import the version:
```
from .version import __version__
```
Then a script can easily check the package's version like this:
```
from numerix import __version__
print(__version__)
```
Other modules of the package may also import the version by a relative
import to be able to use it:
```
from .version import __version__
def about():
print(f'{__name__} {__version__}')
```
Why not defining `__version__` directly in `__init__.py`? Each module
using the version string would import it from `__init__.py`. If you
also want to import some functions from such a module in `__init__.py`
to make them more easily importable from the package, you end up in
cyclic imports. Python does not like them for a good reason...
But how do you get the very same version into the `pyproject.toml`
file? It would be a very bad idea to write it there directly and
update it whenever you change it in the `version.py` file. The
version number should be set only in one place.
This is possible, of course. You need to specify the `version` field
as `dynamic` in the `[project]` section. This tells the package tool
that the version will be set dynamically by some code. There are
several options to do so. The simplest one is to add a
`[tool.setuptools.dynamic]` section where the version is read as an
attribute from the `version` module of the `numerix` package:
```txt
...
[project]
name = "numerix"
dynamic = ["version"]
...
[tool.setuptools.dynamic]
version = {attr = "numerix.version.__version__"}
```
## Distribute your package
Full pyproject.toml file...
README.md
LICENSE
[PyPi](https://pypi.org/)
```sh
python3 -m build
```
```sh
python3 -m twine upload dist/*
```
token
## Using `poetry`
You have now seen how you can do everything "from scratch". This is of course
good to know. But there are tools to significantly simplify package creation
and management, such as [`poetry`](https://python-poetry.org/). Its worth
checking it out. Heres a nice demo: https://python-poetry.org/docs/basic-usage/
## Unit tests
pytest
coverage
```sh
pytest -v --cov-report html --cov numerix tests/
```
## Documentation