Skip to content

Packaging a Python project

Louis Maddox edited this page Feb 8, 2021 · 41 revisions

Initial setup for package metadata

  • git init

  • .gitignore

    Examples:

    • pyx-sys and mvdef
      • packaged projects with vim swap files and Python build/dist artifacts (build, dist, .eggs etc.)
    • dx
      • a packaged project as above, also with store/ and tmp/ data directories
    • emoji-liif with Jupyter notebooks
  • README.md

    More details:

    • title e.g. # myproject (add a brief description, copy this to the GitHub repo description)
    • ## Usage (brief note on what usage will look like if you've not written it yet)
    • ## Requirements list - Python 3 and if there are many packages, put them on one line usually
      • These should be the proper human-readable names (the pip names may differ, they go in requirements.txt)
      • If there are complex conda environment requirements or different options, write a little backticked block to reproduce it (as here for emoji-liif)
  • LICENSE

    Details/example:

    • If you haven't pushed to GitHub yet, copy a LICENSE (the text will be detected)
      • E.g. from dx (click 'Raw' to get plain text)
  • pyproject.toml

    More details:

    • Handles the requirements for building binaries/wheels to ship (not required for package users), namely setuptools, wheel, setuptools_scm for auto-versioning
    • Copy the one for mvdef
    • Versioning is incremented by git tags (see the tags for mvdef for examples). These allow you to do 'releases' (again see mvdef)
  • echo > requirements.txt (if you haven't written any code yet, otherwise list requirements for pip)

  • setup.py

    Details/examples:

    • Handles the metadata for pip installation
    • Copy an example like dx
      • Change the classifiers for 'intended audience', license, and so on, to match your project
      • find_packages will auto-detect the package inside src
      • If you want command-line entrypoints these can be added (e.g. mvdef has 4 such console_scripts)
      • The version_scheme interprets the git tag to auto-version the package accordingly

Package source code

Paths in the following sections are under src/pkgname (replacing 'pkgname' with your package)

  • share indicates /src/pkgname/share

Path management

  • Create modules for each section (e.g. share, dataset, model) as well as a top-level __init__.py and any computation for things to be exported into the main namespace there in utils.py (but do not crowd this directory with anything else!)

Data management

More details:

  • Make a deliberate plan about where your data will go within the package, not just the first place that comes to mind.

    • Does it really need to be elsewhere on your system? (You can exclude it from the distributed package easily)
  • Keep data separate from computation on that data

    • src contains both but not in the same subdirectories
  • Keep fixed data files used for computation in share/data

    • This will be included in the distributed package via setup.py by setting include_package_data=True in setuptools.setup()
  • Keep artifacts [outputs computed from running your package code] in store/

    • This will be excluded from the distributed package via .gitignore by the line **/store/*
  • Don't put data at the top-level (above src) outside the package, your code won't be able to interact with it properly (by exposing directory paths from __init__.py files)

    • as done for dx in share/data/__init__.py

      from . import __path__ as _dir_nspath
      from pathlib import Path as _Path
      
      __all__ = ["dir_path"]
      
      dir_path = _Path(list(_dir_nspath)[0])

      which is then provided to the share.scraper.parse_topics module:

      from ..data import dir_path as data_dir

GitHub

Once you've done initial set up for your project locally, go to GitHub and create a repo, copy the description you put in your README to the description

More details - If it's for a specific organisation (e.g. the engine to drive a website) put it there rather than your personal account

Immediately I run these steps (comments are my bashrc aliases)

git add --all # gadd
git commit -m "Initial commit" # gcmmtinit
git remote add origin [email protected]:lmmx/pkgname.git # gorigadd pkgname
git push -u origin master # gpushu
Clone this wiki locally