Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrating Hugging Face Hub 🤗 (updated) #5856

Merged
merged 34 commits into from
Jun 6, 2023

Conversation

simoninithomas
Copy link
Contributor

Hey there 👋 ,

Updated version of #5732

Proposed change(s)

  • Integrating Hugging Face Hub 🤗 in ML-Agents to allow users to download and upload models to the Hub.
  • The difference with Integrating Hugging Face Hub 🤗  #5732 is that we updated the Hugging Face API process to push to the Hub + some cleanup.

An example of a model card and the tensorboard dashboard integrated in the Hub: https://huggingface.co/ThomasSimonini/testpyramidsrndNewintegration

A notebook to test the feature: https://colab.research.google.com/drive/1qcdo3LEA5H1f3Ds4MWFfM4mgCWUPWqzj?usp=sharing

Here's the process

  1. You train your agent mlagents-learn --config="huggy.yaml" --run-id="Huggy_1"
  2. After the training is done you push your model to the 🤗 Hub with mlagents-push-to-hf --run-id="Huggy_1" --local-dir="results/Huggy_1" --repo-id="ThomasSimonini/Huggy_test" --commit-message="Huggy upload"
  3. Someone else wants to download it, it just need to type mlagents-load-from-hf --repo-id="ThomasSimonini/Huggy_test" --local-dir="./downloads"

What features do we want to add:

1️⃣ Add an evaluation step before pushing to the Hub the model

So I checked Statswriter but I have some thoughts and questions about the complete push to hub process design.
So people are going to train their model with mlagents-learn
After that to be able to evaluate the model they need to run mlagents-learn "folder containing the onnx" --resume --inference.
Because we created a new StatWriter we get the reward or elo and create a results.json put in the results folder and the result in the Hugging Face model card.
After that they will be able to push to hub using utils.push-to-hf
For the 2 instead of asking to write the command I suppose we can use subprocess (to run mlagents-learn "folder containing onnx" --resume --inference with specific plugin) in utils.push-to-hf?
The problem of the 2 comes from:

a. The fact that the number of testing timesteps == summary_freq => so it can be big. Do we have a solution for that or using episode instead?
b. Plus how to stop the inference? When it runs for one summary freq?

The very short-term solution could be instead, taking the final checkpoint in training_status.json directly on utils.push_to_hub.hf that contains either ELO or mean_reward (not std though).

2️⃣ Generate a video of the agent

  • For me, this is a good idea, but it might be a second PR.

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

@CLAassistant
Copy link

CLAassistant commented Jan 3, 2023

CLA assistant check
All committers have signed the CLA.

@simoninithomas simoninithomas changed the base branch from main to develop January 3, 2023 13:58
@osanseviero osanseviero mentioned this pull request Apr 24, 2023
10 tasks
*Change loading to snapshot_download (able to use cache)
* Use create_repo and upload_folder instead of git
@simoninithomas
Copy link
Contributor Author

Hey there 👋 here is the integration update:

From outside nothing changed like the 3000 models already in the Hugging Face Hub, we run mlagents-push-to-hf --run-id --local-dir --repo-id --commit-message to upload a model and mlagents-load-from-hf --repo-id --local-dir to load a model.

The models: https://huggingface.co/models?library=ml-agents

If you want to try the implementation I wrote a notebook (start at START HERE 😄 ): https://colab.research.google.com/drive/1omWRS5ArBC6kkyOV_-oG54Iz3BORvAsO?usp=sharing

Some questions:

  1. Do you want to use a logger or print for the info given to the user during the downloading and the uploading?
  2. Omar for the model card do you have something you want me to change?
  3. Omar for setup.py I use "huggingface_hub~=0.8",.
  4. I have a problem specific to Colab: ImportError: cannot import name 'notf' from 'tensorboard.compat' (/usr/local/lib/python3.9/dist-packages/tensorboard/compat/init.py)
    What solutions do we have for that? I suppose it's due to Colab that now uses Python 3.9

🤗

Copy link
Contributor

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice implementation @simoninithomas! I reviewed the logic and mostly made cosmetic comments. I haven't tested it myself as I was more focused on feedback about the huggingface_hub integration.

Copy link
Contributor

@osanseviero osanseviero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool! I left some minor suggestions but this is looking great!

simoninithomas and others added 4 commits April 27, 2023 10:41
Co-authored-by: Lucain <[email protected]>
Co-authored-by: Omar Sanseviero <[email protected]>
* Delete Huggy config file
* Update load_from_hf
* Update push_to_hf
Co-authored-by: Omar Sanseviero <[email protected]>
Co-authored-by: Lucain <[email protected]>
Copy link
Contributor

@osanseviero osanseviero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool! Thanks a lot!

@simoninithomas
Copy link
Contributor Author

simoninithomas commented May 8, 2023

Updated ✅ what do you prefer for logging? Usually, I like to print messages so that people can see them on their console or notebook but logging is also a correct solution. 🤔

@simoninithomas
Copy link
Contributor Author

simoninithomas commented May 15, 2023

Edit ✅ We changed the logger to MLAgents logger. However, the MLAgents logger does not display our info messages on collab. I'm wondering why 🤔 ?

Edit thanks to @osanseviero for the solution 🤗

The integration is ready feel free to tell me if you want us to change something 🤗

@simoninithomas
Copy link
Contributor Author

simoninithomas commented May 30, 2023

All checks passed 🥳 are we ready to merge? Or do you need some other changes before?

Copy link
Collaborator

@miguelalonsojr miguelalonsojr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there's been many files touched without changes other than removing empty lines. Is there a reason for this? Did you rebase with develop on the main repo? Otherwise, the changes look good. The only request is to add a new documentation page that describes the new HF features and give some examples. You can add it to the docs folder and call it Hugging-Face-Integration.md. You'll also need to add a new section to the documentation overview https://github.com/Unity-Technologies/ml-agents/blob/develop/docs/ML-Agents-Toolkit-Documentation.md called Hugging Face Integration and then add a reference to the new integration md doc. You can follow the same format that's already in there. Once that's ready, I'll approve the PR and we can merge. Almost at the finish line!

@simoninithomas
Copy link
Contributor Author

simoninithomas commented Jun 2, 2023

I've just added the documentation and run black let me know if you want me to change something else.

I don't think I rebase with develop on the main repo. HF-integration was created from develop because develop is the default branch in ML-Agents. Do I need to change something on this part?

@miguelalonsojr
Copy link
Collaborator

I don't think I rebase with develop on the main repo. HF-integration was created from develop because develop is the default branch in ML-Agents. Do I need to change something on this part?

You should rebase with the develop branch from the main repo. It may have diverged from the time that you created your fork/feature. I'll rerun the workflows and see if there are conflicts.

Copy link
Collaborator

@miguelalonsojr miguelalonsojr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@miguelalonsojr miguelalonsojr merged commit 0696876 into Unity-Technologies:develop Jun 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants