
variable python version #2152


Closed
ganapetya opened this issue Oct 5, 2024 · 3 comments
Labels
type:Enhancement (A proposed enhancement to the docker images)

Comments

ganapetya commented Oct 5, 2024

What docker image(s) is this feature applicable to?

docker-stacks-foundation, all-spark-notebook, scipy-notebook

What change(s) are you proposing?

Hi,
It's actually a question, but maybe also a feature request.
It's super important for the Python version to be variable, because even a minor-version conflict with Spark's Python version prevents the driver from working.
I'm aware that in docker-stacks-foundation there is an ARG for the Python version, but when I try to build the top project, all-spark-notebook, with this arg, it does not change the Python version in the container.
(The reason, I think, is that all the images are already pre-built in the repo.)
I even tried to build the whole hierarchy one by one on my Docker Desktop, assigning local images, but it did not help: there is a [line] in scipy-notebook where everything fails during the local build without a proper error message.
Thus I cannot change Python from 3.11 to 3.12.
Do you have any suggestions?
Thanks.

(In the end, I just built the entire hierarchy with the newer Python.)

How does this affect the user?

You can't really use Spark when it runs on a different Python minor version.

Anything else?

No response

ganapetya added the type:Enhancement label Oct 5, 2024
mathbunnyru (Member) commented Oct 5, 2024

There are a few questions here, so let me answer them one by one.

> I'm aware that in docker-stacks-foundation there is an ARG for the Python version, but when I try to build the top project, all-spark-notebook, with this arg, it does not change the Python version in the container.
> (The reason, I think, is that all the images are already pre-built in the repo.)

The reason is that `ARG PYTHON_VERSION` exists only in the docker-stacks-foundation image (and nowhere else), so setting it when building the other images has no effect.
When you build any other image, it will have the same Python version as the image it is built on top of.

To change the Python version, you indeed need to build all the images in the correct order and set this variable when building docker-stacks-foundation.
I suggest taking a look at our guide on how to build a custom set of images: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/custom-images.html#custom-arguments
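Roughly, the build order looks like this (an untested sketch: the `my/` tags are placeholders, and I'm assuming each Dockerfile accepts a `BASE_IMAGE` build arg pointing at its parent image - check the guide and the Dockerfiles themselves for the exact argument names):

```bash
# Build the chain in dependency order; PYTHON_VERSION only matters for
# the foundation image, every later image inherits its Python.
prev=docker-stacks-foundation
docker build "images/${prev}" \
    --build-arg PYTHON_VERSION=3.12 \
    -t "my/${prev}"

for img in base-notebook minimal-notebook scipy-notebook pyspark-notebook all-spark-notebook; do
    docker build "images/${img}" \
        --build-arg BASE_IMAGE="my/${prev}" \
        -t "my/${img}"
    prev="${img}"
done
```

If one step fails, you can rerun just that build; the earlier images are already tagged, so retrying is cheap.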

> I even tried to build the whole hierarchy one by one on my Docker Desktop, assigning local images, but it did not help: there is a [line] in scipy-notebook where everything fails during the local build without a proper error message.

Unless you can provide more details, I don't think we can help here - it might have been some network error; that happens sometimes.
My advice is to always try building at least twice.
Docker's error messages can be a bit difficult to find - the relevant one might not be at the end of the logs.

> (In the end, I just built the entire hierarchy with the newer Python.)

As far as I understand, your main goal is to have an all-spark-notebook image with Python 3.12.
Unfortunately, that is not possible with Spark v3 (it doesn't and won't support Python 3.12).
We already have a PR updating our images to Python 3.12; the only thing we're waiting for is the Spark v4 release.

If you really want to use Python 3.12 with Spark right now, I think you can modify our script so it installs spark-4.0.0-preview2 instead of the latest stable version.
Fortunately, spark-4.0.0-preview2 exists in the archive we're installing from: https://archive.apache.org/dist/spark/spark-4.0.0-preview2/
There is a chance that you only need to slightly modify this line: https://github.com/jupyter/docker-stacks/blob/main/images/pyspark-notebook/setup_spark.py#L39 (so that it no longer excludes preview builds).
Or you can even hardcode the version in the get_latest_spark_version function, as sketched below.
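Something like this (an untested sketch - I'm assuming the function returns the bare version string that the rest of the script turns into a spark-<version> download URL; adjust to match the actual code):

```python
def get_latest_spark_version() -> str:
    # Sketch: skip scraping https://archive.apache.org/dist/spark/ and
    # filtering out previews - just pin the preview release. The value
    # matches the spark-4.0.0-preview2/ directory in the archive above.
    return "4.0.0-preview2"
```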

Hope this helps

ganapetya (Author) commented

Thank you, @mathbunnyru!
I have actually built everything successfully with Python 3.12, and it works with Spark 3.5 from Bitnami, which also works with Python 3.12.
(I executed several simple Spark jobs to test it.)
The only significant change I've made to your images is an upgrade to pandas=2.2.3.
Also, in several images I had to split the combined 'RUN mamba install --yes ...' call into N separate calls - for some reason that resolved the build issue I mentioned above (see the sketch below).
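For reference, this is the kind of split I mean (a sketch - apart from pandas=2.2.3, the package names are placeholders, not the actual lists from the images):

```dockerfile
# Before: one combined call - a failure anywhere produces one opaque error
# RUN mamba install --yes 'pandas=2.2.3' 'matplotlib-base' 'scipy' && \
#     mamba clean --all -f -y

# After: one small install per layer, so a failure points at a specific
# package group and the earlier layers stay cached
RUN mamba install --yes 'pandas=2.2.3' && mamba clean --all -f -y
RUN mamba install --yes 'matplotlib-base' && mamba clean --all -f -y
RUN mamba install --yes 'scipy' && mamba clean --all -f -y
```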

mathbunnyru (Member) commented

Great. I think we can close this issue then.
