Dask dependency management plugin for coffea-casa #219
Current status: …

Note on the plugin itself. It needs to: …
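For orientation (this is not the plugin from this issue), here is a minimal sketch of the distributed `WorkerPlugin` hooks that a dependency-management plugin would typically build on; the class name is a hypothetical stand-in:

```python
from distributed.diagnostics.plugin import WorkerPlugin

class DependencyPlugin(WorkerPlugin):  # hypothetical name, for illustration only
    def setup(self, worker):
        # Runs once on each worker when the plugin is registered:
        # install packages or unpack shipped source files here.
        pass

    def teardown(self, worker):
        # Runs when the worker shuts down or the plugin is removed.
        pass

# Registered from the client side, e.g.:
# client.register_worker_plugin(DependencyPlugin())
```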
These are the changes I'd made to add a …
Another direction to consider here is to leverage the new ability of cloudpickle 2.0 to serialize modules: cloudpipe/cloudpickle#417
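For reference, a minimal sketch of that cloudpickle 2.0 API (`mymodule` is a hypothetical local module that is not installed on the workers):

```python
import cloudpickle
import mymodule  # hypothetical local module, not installed on the workers

# Pickle the module by value, so its code travels inside the pickle stream
# instead of being looked up by name on the receiving side.
cloudpickle.register_pickle_by_value(mymodule)
payload = cloudpickle.dumps(mymodule)

# Undo the registration once it is no longer needed.
cloudpickle.unregister_pickle_by_value(mymodule)
```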
@Andrew42 check the README in https://github.com/oshadura/dask-custom-docker to get an idea of how to do local development until I manage to deploy coffea-casa in Minikube or similar.
@oshadura and I have been working on setting up a k8s + docker + helm local deployment of Dask, in an effort to mirror the coffea-casa production environment. After a lot of debugging and adjustments to the helm chart setup, I think we've finally got the workers to start up with a nanny process at a host+port that is correctly broadcast back to the scheduler, and which also lets the nanny process communicate back and forth with the user's Dask client.

So now we're able to utilize …
Update: I've made a new plugin called …

There was an issue where the worker would continue to use a cached version of the code, even when the updated files were sent over. My workaround for this was to use …

Another issue that came up was a …

At this point, most of the code is in place and should be ready to be tested, unless anyone has additional or outstanding concerns that they would like to have addressed first.

Next Steps: …

If there are any other steps that I should include, please let me know!
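As a sketch of the module-cache problem mentioned above (assuming the updated file, say `foo.py`, is already on the workers' `sys.path`): Python caches imports in `sys.modules`, so a worker that imported the old file keeps using it until the cached entry is reloaded.

```python
import importlib
import sys

def refresh_module(name):
    # Reload the cached copy (or import it fresh) so the worker picks up
    # the newly transferred file instead of a stale cached version.
    if name in sys.modules:
        importlib.reload(sys.modules[name])
    else:
        importlib.import_module(name)

# e.g. run on every worker after shipping the new files:
# client.run(refresh_module, "foo")
```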
It will be tested after PR #265 is merged.
Hello, … but I have no success, as I continue to get an error that the file was not found. What am I doing wrong?

FYI, I am not writing this code in a notebook, but rather as a .py script, in case this changes anything for what course of action I should take.
So I think it is going to depend on how exactly your code runs, namely the part that runs on the worker. In any case, I agree that it doesn't look like …

```python
import os
import dask
import pathlib
from dask.distributed import Client
from coffea_casa import CoffeaCasaCluster

directory = "tmp_test"
if not os.path.exists(directory):
    os.makedirs(directory)

with open(pathlib.Path(directory) / "foo.py", "w") as f:
    f.write("x = 123")

with open(pathlib.Path(directory) / "bar.py", "w") as f:
    f.write("from foo import x\n")
    f.write("print(x)")

cluster = CoffeaCasaCluster(job_extra={
    'docker_image': "coffeateam/coffea-casa-analysis:latest",
    'transfer_input_files': "tmp_test",
})
host = os.getenv("HOST_IP")
client = Client(f"tls://{host}:8786")

print(client.run(os.listdir))
# >>> ['.bash_logout', '.bashrc', '.profile', 'dask-worker-space', '.conda', '.condor', '.local', 'work']

print(client.run(os.listdir, "dask-worker-space"))
# >>> ['global.lock', 'purge.lock', 'worker-p3calcjl', 'worker-p3calcjl.dirlock']
```

Now if all you want to do is get a directory onto the worker so that your code can access some static data content, you can try using the dask `UploadDirectory` plugin:

```python
from distributed.diagnostics.plugin import UploadDirectory

client.register_worker_plugin(UploadDirectory(directory, restart=True, update_path=True), nanny=True)

print(client.run(os.listdir, "dask-worker-space"))
# >>> ['global.lock', 'purge.lock', 'tmp_test', 'worker-p3calcjl', 'worker-p3calcjl.dirlock']

print(client.run(os.listdir, "dask-worker-space/tmp_test"))
# >>> ['bar.py', 'foo.py']
```

Which should show that your directory exists on the worker machine.
I am trying the …

For some context, my processor is found at …

Here is how I defined my client, and checked to see that my directory was uploaded:

…
After running the job, I am given some information about the cluster, followed by my print statements, before preprocessing my dataset:

…

I can see that my directory, and in turn all of the relevant paths, have been uploaded. However, after preprocessing, the job still fails, giving me the following error:

…

BTW, it appears that my client is starting, stopping, and starting again, according to more logs that are output to the terminal after displaying the client information above. The messages are listed below:

…

Why does my memory show up as zero (…)?
Is it possible your issue is just a relative path difference? I'm not sure why the memory shows up as zero; maybe @oshadura can comment on this?

You're also starting up your own … The …

Suffice it to say that all this headache with remote file placement is a dask issue, and one we're actively trying to get ironed out, as it presents a lot of potential problems for many analyses if they weren't designed with this in mind (as you seem to be finding out... sorry!)
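As a quick diagnostic sketch for chasing such relative-path differences (not from the original thread): `client.run` injects the worker object into any function that declares a `dask_worker` argument, so you can compare the working directory with the worker's scratch directory on every worker (`client` here is the same Dask client as in the snippets above).

```python
import os

def where_am_i(dask_worker):
    # dask_worker is supplied automatically by client.run.
    return {"cwd": os.getcwd(), "local_directory": dask_worker.local_directory}

print(client.run(where_am_i))
```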
YES! After uploading the directory, the code does run an extra directory up, namely …
OK. I see now that when I use … Thanks!
Related issues: …