Improve dask example notebooks #100
Comments
@brendancol Since we are using things like black and isort in our pre-commit configuration for the main package, we could also add some notebook-specific linting to our pre-commit config. Something like this:

    - repo: https://github.com/nbQA-dev/nbQA
      rev: 1.2.2
      hooks:
        - id: nbqa-black
          files: \.ipynb$
        - id: nbqa-flake8
          files: \.ipynb$
        - id: nbqa-isort
          files: \.ipynb$
    - repo: https://github.com/kynan/nbstripout
      rev: 0.5.0
      hooks:
        - id: nbstripout
          files: \.ipynb$

This might need to be adapted, I haven't tested it!
Now that we have the notebooks for the "small" examples, we want to build an additional notebook with a larger dataset to showcase the Dask integration: #118
@brendancol @sehnem As discussed today:

These large problems have lots of columns and layers to really stress the code. The RRTMGP code is vectorized on the inside. The default chunk size is 720x720x1. If you pass Robert's code a chunk of that size, it breaks because it runs out of memory. For the radiative transfer solve, for example, that might be too much. So we need to chunk things up.

The other, related question: it seems that right now (maybe only in gas optics) we are calling several kernels in a row. Does this mean that a single worker works through these calls one after another? In theory, different problems are sent to different workers, and a more complex problem simply takes more time.

The dataset we have is interesting (the diamond dataset from #118). It has half a dozen dimensions. We don't need to overfit to this dataset, but it has several dimensions that vary, a level dimension, and maybe some others as well. To do the radiative transfer, we need some number of one-dimensional columns, each with the full level dimension.

Maybe the calling code should use whatever kind of chunking it needs, and the solver then does the chunking it needs. Or it might make more sense to do the chunking upfront so that we have arbitrary points on all the levels.

To-Dos
Improve the dask-based examples as discussed here:
#98 (comment)
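
The chunking concern raised in the comments can be made concrete with a rough sizing calculation. The following is only a sketch under stated assumptions, not the project's implementation: `columns_per_chunk` and `plan_chunks` are hypothetical helpers, the dimension names (`time`, `cell`, `layer`) are placeholders rather than the actual names in the #118 dataset, and the spectral-point count, the number of working arrays, and the memory budget are made-up numbers. It keeps the level dimension whole and splits only the column-like dimensions so that one chunk's working arrays fit a per-worker memory budget.

```python
def columns_per_chunk(n_levels, n_spectral, budget_bytes,
                      bytes_per_value=8, n_work_arrays=4):
    """Hypothetical helper: how many columns fit into the memory budget.

    Assumes each column needs `n_work_arrays` arrays of shape
    (n_levels, n_spectral); both numbers are illustrative guesses.
    """
    bytes_per_column = n_levels * n_spectral * bytes_per_value * n_work_arrays
    return max(1, budget_bytes // bytes_per_column)


def plan_chunks(dim_sizes, level_dim, max_columns):
    """Hypothetical helper: chunk sizes that keep `level_dim` in one piece.

    The remaining dimensions are treated as column dimensions and are
    split greedily so that their product stays within `max_columns`.
    """
    chunks = {level_dim: dim_sizes[level_dim]}  # never split the levels
    remaining = max_columns
    for name, size in dim_sizes.items():
        if name == level_dim:
            continue
        chunks[name] = max(1, min(size, remaining))
        remaining = max(1, remaining // chunks[name])
    return chunks


# Placeholder dimension sizes, not taken from the real dataset.
dim_sizes = {"time": 8, "cell": 1000, "layer": 60}
max_columns = columns_per_chunk(n_levels=60, n_spectral=256,
                                budget_bytes=64 * 1024**2)
chunks = plan_chunks(dim_sizes, level_dim="layer", max_columns=max_columns)
print(max_columns, chunks)
```

If the data lives in an xarray dataset, a mapping like `chunks` could then be passed to `Dataset.chunk` before calling the solver. Whether this happens upfront in the notebook or inside the solver is exactly the open design question from the discussion.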