Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update build system and rebuild with numpy >= 2.0 #611

Closed
wants to merge 5 commits into from

Conversation

cvanelteren
Copy link
Contributor

Hi there,

I made some changes to the build chain and update the numpy dependencies. The tests pass both here as in our package Ultraplot. Let me know if you wish me to comment on these changes.
Sincerely,
C

@cvanelteren
Copy link
Contributor Author

The PR would address #604 #608 #609

@cvanelteren
Copy link
Contributor Author

Perhaps one of the maintainers can comment on the build failure. Pinging @kant 🔔

@molinav
Copy link
Member

molinav commented Jan 24, 2025

@cvanelteren The error is coming because of the deprecated versions used in all the artifacts actions; v1 was needed because of using pleistocenic base images to still build with the manylinux1 tag.

I appreciate all your efforts in this PR, and I think it is actually a good moment to start thinking of basemap 2.0.0 and bring all the breaking changes that I was avoiding for the purpose of keeping the legacy things working. Some items I can list fast right now:

  • Moving to "Python >= 3.X", with X to be defined (reason for 3.7? better 3.8 because of the Cython 3.1 target versions? 3.9 to stay with the non-EOL versions?
  • Moving from manylinux1 to manylinux_2_17 for the GNU/Linux wheels. This already allows the use of not so old base images in the GitHub Actions, and probably v4 of the artifacts actions may work as is.
  • Start relying on implicit namespace packages (no empty __init__.py files). There is a PR from long ago still open.
  • Migrate everything in setup.py and setup.cfg to pyproject.toml. Is this even possible, considering that GEOS needs to be built? I do not have the knowledge but I see that you define something about that in the pyproject.toml of this PR.

My only concern with the current PR is that it makes several big changes all at once, I would be in favour of addressing each of the current problems one by one in separate PR. I think that, as of today, the develop branch should become the branch targeting a major release basemap 2.0.0. I will look at the latest commit and create a support-1.x branch, so that I can still keep my legacy support without disturbing the majority of the basemap users, because it really is a problem for you right now: the latest basemap patch release from my side was almost 1 year ago, and there is still no support for Python 3.13 and numpy >= 2.0.

@DWesl
Copy link
Contributor

DWesl commented Jan 24, 2025

  • Migrate everything in setup.py and setup.cfg to pyproject.toml. Is this even possible, considering that GEOS needs to be built?

It is not currently possible to create extension modules in pyproject.toml, but all the other metadata should be there. I think there's a PR over on PyPA/setuptools about what extension modules in pyproject.toml should look like.

@cvanelteren
Copy link
Contributor Author

@molinav no worries if this PR is trashed ;-). I wasn't keeping track of the main. I am happy to chip in to work on stuff. I don't have full overview of the pipeline but noticed some verbose and missing stuff in pyproject.toml (hence this PR).

I agree that this is perhaps a bit too rushed and fast. We could split this PR up by first tackling pyproject.toml (to move everything there, and then update the dependency for higher python versions (say 3.9 and up?). Let me know what you think.

To answer one of you questions:

Migrate everything in setup.py and setup.cfg to pyproject.toml. Is this even possible, considering that GEOS needs to be built? I do not have the knowledge but I see that you define something about that in the pyproject.toml of this PR.

Yes I have included it as such in the PR here

@ReimarBauer
Copy link

I'e just seen that there is a ruff checker available.
https://numpy.org/devdocs/numpy_2_0_migration_guide.html#numpy-2-migration-guide

@cvanelteren
Copy link
Contributor Author

I have some time to work on this but don't wanna step on anyone's toes. If the team could outline what they have in mind and my time is appreciated then by all means I can and will under some guidance where you wanna move the package.

@molinav
Copy link
Member

molinav commented Feb 16, 2025

I would really appreciate your help, @cvanelteren. Unfortunately, I am quite overloaded and close to my limits, so I doubt I can be hands-on myself at least until the second half of March. Reviewing changes should be something that I can still handle despite my limited availability.

My comment #611 (comment) above summarises the main changes needed. The following PRs into develop would be sensible, in this order (and making each PR address one thing at a time):

  • PR making the GitHub workflows work again. Here, we have two possibilities:

    1. Fix the current workflows. This means upgrading all the artifacts actions to v4. However, because of the node requirements for these actions, my Debian-based running the workflows need to be bumped to something newer (e.g. Debian 8?). This implies that the generated wheels will not be manylinux1 anymore, but at least manylinux_2_17. The current workflows are limited to Windows and GNU/Linux (glibc): musllinux wheels are missing, and MacOS wheels were generated and uploaded manually in one of my personal machines.
    2. Reimplement the workflow using cibuildwheel to build the wheels. This is probably the long-term solution, because we delegate the build setup to a project that is dedicated exclusively to it. For example, the current workflow cannot generate wheels for Python 3.13 on GNU/Linux simply because I did not have time to rebuild my Debian-based images together with Python 3.13 (the last time I had some free time to play with it, Python 3.12 was the latest). Here, I cannot be of much help due to my lack of knowledge with cibuildwheel.
  • PR removing support for some EOL Python versions. Depending on the approach for the former PR:

    1. With my Debian-based images, the minimum Python version is not really relevant, but it should be at least Python 3.3 because there is the PR New style namespace packages #576 blocked for long time requesting the use of implicit namespace packages (i.e. removing empty __init__.py files as indicated in PEP 420). Probably, other newer Python version is more sensible, but still not sure which one.
    2. With the cibuildwheel approach, the supported Python versions are constrained to what cibuildwheel decides to support, I think they follow a strict version upgrade based on the Python version support, but I am not sure about the details (support for all non-EOL versions or for the four latest non-EOL versions?
  • PR New style namespace packages #576 can then be revisited and accepted (only the changes related to the implicit namespace packages).

  • PR with the upgrade for numpy >= 2.0.

All these PRs (fixed workflows, manylinux_2_17 wheels, Python 3-only migration, NumPy 2.0 support, Python 3.13 support) should be merged into develop to create the next major release basemap 2.0.0. I am happy if I can create a support-1.x branch from the current latest develop commit (461b402) and continue some support for basemap 1.x, but I do not want that my specific requirements block the main basemap project. It was okay when I took over the maintenance in 2021, because the project was stale (we did not even have wheels on PyPI), but it is time to move on in the develop branch.

@cvanelteren
Copy link
Contributor Author

Thanks for the guidelines. I started working on the first task here: #612

@cvanelteren
Copy link
Contributor Author

Closing in favor of #614

@cvanelteren
Copy link
Contributor Author

I think all points above are now addressed @molinav. Can we proceed to push develop to main?

@molinav
Copy link
Member

molinav commented Feb 26, 2025

@cvanelteren All the major changes are now addressed. I would like to mention some points regarding open items:

  • Add linter to precommit bot #617 may remain open for the time being. I realised that flake8 and pylint were only running on a subset of files. A lot of cleanup is still needed before we can run both in the whole library. Furthermore, I would avoid free refactorings based on black at least until the library is clean (I am also not a big fan of black, but this is subjective).

  • [Proposal] Refactor project structure #616 may remain open for the time being. In this issue, I see two different topics:

    • The migration to pyproject.toml. I always have mixed feelings with these configuration files, because in the end any Python package seems to still need a setup.py for the dynamic things. I would not mind moving everything possible to pyproject.toml, and especially to migrate the setup.cfg options to it. We have three files (setup.py, setup.cfg and pyproject.toml) to handle the metadata and build process when the legacy approach only needed one (setup.py). Removing setup.cfg at least allows to go back to two (setup.py, and pyproject.toml). But I am not sure it makes sense to block basemap 2.0.0 for this refactoring, when it currently works and we can address it e.g. in basemap 2.1.0.
    • The project restructuring. In this item I have to disagree, but from your open issue I see that the documentation on the project structure needs to be improved. We need to distinguish "project" from "package" (you must have seen the two terms already during the cibuildwheel migration because they use the same terms). For most projects, one single package is developed, and the concept of project and package mix. In the basemap project, we build three packages, so project and package do no overlap directly. If you move to the packages folder (basemap, basemap_data, basemap_data_hires), each subfolder corresponds to a Python package tree with the standard src-based package skeleton. The approach suggested in the issue also moves from a src-based skeleton to a flat skeleton, and I think this is not a good idea. I would like to describe my reasons better in the issue itself, but I am quite limited in time right now.
  • The current cibuildwheel workflow does not build the wheels from the sdists. I am not totally happy with it, because the sdist -> wheel approach guarantees that the sdists allow to build wheels for the platforms in which precompiled wheels are not available. I uploaded broken sdists in the past here because of not using the sdist -> wheel approach. The legacy workflows implemented the build process in this manner to address this problem. I guess we can live with it for now, but it should be fixed at some point (I was playing around with cibuildwheel yesterday but without success).

  • The legacy workflows had a specific job to build the basemap documentation with Sphinx. This should come back, because I was using the artifacts of this job to later update the gh-pages branch without having to build the documentation manually.

  • The cibuildwheel workflow needs to install basemap-data and basemap-data-hires from the previous job during the basemap test stage. Otherwise, I cannot move the basemap-data requirement below to basemap-data >= 2.0, < 3.0 because basemap-data 2.0.0 and basemap-data-hires 2.0.0 packages are still not available on PyPI (they will be uploaded together with the release of basemap 2.0.0):

    basemap_data >= 1.3.2, < 3.0

I think the last two points should be addressed before I prepare basemap 2.0.0 for release. The first three points can wait.

@cvanelteren
Copy link
Contributor Author

Thanks for the detailed reply, it helps me understand the vision.

  • Regarding point 2, I would be in favor of the flat structure as it logically will dictate what the package is -- part source part data. As far as I can tell the data packages are not used independently from basemap, and thus should be built and released as part of the whole. Moving towards an integrated approach will be more in line with how common python packages are structured, and would feel less complex as it is currently. The usage of src is more common for compiled-based code as seen in C++ and C, but more modern languages (e.g. Nim) adopt the approach that python uses. In all I think would reduce the complexity of managing 3 separate setups and integrate them in a pyproject.toml and or build system (for example using setup.py.

  • I can see if I can address the wheel building, including the last point you addressed

  • I can make a sphinx workflow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants