Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken 0.9.3 release by delaying kerchunk.netCDF3 import #466

Merged
merged 3 commits into from
Jan 6, 2023

Conversation

cisaacstern
Copy link
Member

This recipe test attempted in pangeo-forge/C-iTRACE-feedstock#5 (comment) failed, and server logs reveal that this issue is that

from kerchunk.netCDF3 import NetCDF3ToZarr

requires scipy, which we do not require on install. The kerchunk[netcdf3] option added in this PR will bring this in via kerchunk. The issue can be reproduced by pip-installing pangeo-forge-recipes==0.9.3 in a blank conda environment, and then:

>>> from pangeo_forge_recipes.recipes import XarryZarrRecipe
Traceback (most recent call last):
  File "/Users/charlesstern/miniconda3/envs/kerchunk-import/lib/python3.9/site-packages/kerchunk/netCDF3.py", line 8, in <module>
    from scipy.io._netcdf import ZERO, NC_VARIABLE, netcdf_file, netcdf_variable
ModuleNotFoundError: No module named 'scipy'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/charlesstern/miniconda3/envs/kerchunk-import/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/__init__.py", line 2, in <module>
    from .reference_hdf_zarr import HDFReferenceRecipe
  File "/Users/charlesstern/miniconda3/envs/kerchunk-import/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/reference_hdf_zarr.py", line 14, in <module>
    from ..reference import create_kerchunk_reference, unstrip_protocol
  File "/Users/charlesstern/miniconda3/envs/kerchunk-import/lib/python3.9/site-packages/pangeo_forge_recipes/reference.py", line 8, in <module>
    from kerchunk.netCDF3 import NetCDF3ToZarr
  File "/Users/charlesstern/miniconda3/envs/kerchunk-import/lib/python3.9/site-packages/kerchunk/netCDF3.py", line 10, in <module>
    raise ImportError(
ImportError: Scipy is required for kerchunking NetCDF3 files. Please install with `pip/conda install scipy`. See https://scipy.org/install/ for more details.

Our CI testing let this through, because scipy is brought into the dev environment those tests run in.

@cisaacstern cisaacstern requested a review from rabernat January 6, 2023 19:17
@cisaacstern
Copy link
Member Author

cisaacstern commented Jan 6, 2023

I can unblock Pangeo Forge Cloud by using this branch for the moment.

@rabernat how would you like to handle this from a release standpoint?

Should we yank 0.9.3 and then release this as 0.9.4?

@rabernat
Copy link
Contributor

rabernat commented Jan 6, 2023

Can we make this a runtime optional dependency rather than a required dependency? Scipy is huge. Can't we just pip install it directly in the cloud environment?

Making something a required dependency of pangeo forge recipes should not be the only way to get a package into our environment.

@cisaacstern
Copy link
Member Author

Can we make this a runtime optional dependency rather than a required dependency? Scipy is huge. Can't we just pip install it directly in the cloud environment?

Yes, I see your point about scipy's size.

I guess this will require ImportError exception handling along the lines of:

https://github.com/fsspec/kerchunk/search?q=ImportError

because currently in 0.9.3, even if the user is not planning to use netcdf3 features, there is no way to import XarrayZarrRecipe (as demonstrated in the Details dropdown in the first comment above).

@rabernat
Copy link
Contributor

rabernat commented Jan 6, 2023

Ok, now I understand that pangeo forge recipes is not even importable without scipy. That's bad. Unfortunately we do need a new release.

Let's put the optional import here.

elif file_type == FileType.netcdf3:
chunks = NetCDF3ToZarr(url, max_chunk_size=100_000_000)

@cisaacstern
Copy link
Member Author

Decided not to handle the error because kerchunk will give us a descriptive message if scipy is missing:

https://github.com/fsspec/kerchunk/blob/c2a7656f637382c865eeac83198fb52d41aab7d5/kerchunk/netCDF3.py#L7-L13

@cisaacstern cisaacstern changed the title Fix broken 0.9.3 release with kerchunk[netcdf3] Fix broken 0.9.3 by delaying kerchunk.netCDF3 import Jan 6, 2023
@cisaacstern cisaacstern changed the title Fix broken 0.9.3 by delaying kerchunk.netCDF3 import Fix broken 0.9.3 release by delaying kerchunk.netCDF3 import Jan 6, 2023
@cisaacstern
Copy link
Member Author

I'll merge, pull 0.9.3, and release 0.9.4 once tests pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants