# Introduction

This module covers the LLM fine-tuning pipeline, where we download versioned datasets from Comet ML and manage training and deployment at scale using Qwak.
**By completing this lesson**, you'll gain a solid understanding of the following:

- what Qwak AI is and how it helps solve MLOps challenges
- how to fine-tune Mistral-7B-Instruct on our custom llm-twin dataset
- what PEFT (parameter-efficient fine-tuning) is
- what purpose QLoRA adapters and BitsAndBytes configs serve
- how to fetch versioned datasets from Comet ML
- how to log training metrics and the model to Comet ML
- understanding model-specific special tokens
- a detailed walkthrough of how the Qwak build system works

# What Is Fine-Tuning?

Fine-tuning is the process of taking a pre-trained model and further training it on a smaller, task-specific dataset to refine its capabilities and improve its performance in a particular task or domain. In short, fine-tuning turns general-purpose models into specialized ones.

> [!IMPORTANT]
> Foundation models know a lot about a lot, but for production, we need models that know a lot about a little.

In our LLM-Twin use case, we aim to fine-tune our model away from its general knowledge corpus and towards a targeted context that reflects your writing persona.

We use the following concepts, widely adopted when fine-tuning LLMs (a minimal configuration sketch follows the list):

- [PEFT](https://huggingface.co/docs/peft/en/index) - Parameter-Efficient Fine-Tuning
- [QLoRA](https://github.com/microsoft/LoRA) - Quantized Low-Rank Adaptation
- [BitsAndBytes](https://huggingface.co/blog/4bit-transformers-bitsandbytes) - a library that enables low-precision operations through custom GPU kernels

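To make these concrete, here's a minimal sketch of how the three pieces typically fit together with Hugging Face `transformers` and `peft`. The model ID is the public Mistral checkpoint; the LoRA hyperparameters (`r`, `lora_alpha`, `lora_dropout`, `target_modules`) are illustrative assumptions, not the course's exact values:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# BitsAndBytes: load the base model quantized to 4-bit NF4 so it fits on a single GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)

# QLoRA via PEFT: train small low-rank adapters on top of the frozen 4-bit weights
lora_config = LoraConfig(
    r=16,                                 # adapter rank (assumed)
    lora_alpha=32,                        # scaling factor (assumed)
    lora_dropout=0.05,                    # adapter dropout (assumed)
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train
```
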
You can learn more about the Dataset Generation and Fine-tuning Pipeline from the Decoding ML LLM Twin Course:

- Lesson 6: [The Role of Feature Stores in Fine-Tuning LLMs](https://medium.com/decodingml/the-role-of-feature-stores-in-fine-tuning-llms-22bd60afd4b9)
- Lesson 7: [How to fine-tune LLMs on custom datasets at Scale using Qwak and CometML](https://medium.com/decodingml/how-to-fine-tune-llms-on-custom-datasets-at-scale-using-qwak-and-cometml-12216a777c34)

## Refresher from Previous Lessons

- In **Lesson 2**: [The Importance of Data Pipelines in the Era of Generative AI](https://medium.com/decodingml/the-importance-of-data-pipelines-in-the-era-of-generative-ai-673e1505a861)
  we described the data-ingestion process, where we scrape articles from Medium, posts from LinkedIn, and code snippets from GitHub, storing them in our MongoDB database.
- In **Lesson 3**: [Change Data Capture: Enabling Event-Driven Architectures](https://medium.com/decodingml/the-3nd-out-of-11-lessons-of-the-llm-twin-free-course-ba82752dad5a)
  we showcased how to listen to the MongoDB Oplog via the CDC pattern and use RabbitMQ to stream the captured events; this is our ingestion pipeline.
- In **Lesson 6**: [The Role of Feature Stores in Fine-Tuning LLMs](https://medium.com/decodingml/the-role-of-feature-stores-in-fine-tuning-llms-22bd60afd4b9)
  we showcased how to use filtered data samples from Qdrant and, using knowledge distillation, have GPT-3.5 Turbo structure and generate the fine-tuning dataset, which is versioned with Comet ML.

# Architecture Overview

**Here's what we're going to learn**:

- How to set up the HuggingFace connection to download the Mistral-7B-Instruct model.
- How to leverage Qwak to manage our training job at scale.
- How to efficiently fine-tune a large model using PEFT & QLoRA.
- How to download datasets versioned with Comet ML.
- How the Qwak build lifecycle works.

# Dependencies

## Installation

To prepare your environment for these components, follow these steps:

```shell
poetry install
```

# Setup External Services

1. [HuggingFace](https://huggingface.co)
2. [Comet ML](https://www.comet.com/signup/?utm_source=decoding_ml&utm_medium=partner&utm_content=github)
3. [Qwak](https://www.qwak.com/lp/end-to-end-mlops/?utm_source=github&utm_medium=referral&utm_campaign=decodingml)

## 1. HuggingFace Integration

We need a Hugging Face access token to download the model checkpoint and use it for fine-tuning.

**Here's how to get it:**

- Log in to [HuggingFace](https://huggingface.co)
- Head over to your profile (top-right) and click on Settings.
- On the left panel, go to Access Tokens and generate a new token.
- Save the token.

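Once you have the token, you can authenticate from code before pulling the checkpoint. Below is a minimal sketch, assuming the token is exported as `HUGGINGFACE_ACCESS_TOKEN` (the variable name used in `build_config.yaml` later in this lesson):

```python
import os

from huggingface_hub import login
from transformers import AutoModelForCausalLM, AutoTokenizer

# Authenticate against the Hugging Face Hub with the saved access token
login(token=os.environ["HUGGINGFACE_ACCESS_TOKEN"])

# Download the base checkpoint we fine-tune in this lesson
model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```
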
## 2. Comet ML Integration

### Overview

[Comet ML](https://www.comet.com/signup/?utm_source=decoding_ml&utm_medium=partner&utm_content=github) is a cloud-based platform that provides tools for tracking, comparing, explaining, and optimizing experiments and models in machine learning. Comet ML helps data scientists and teams better manage and collaborate on machine learning experiments.

### Why Use Comet ML?

- **Experiment Tracking**: Comet ML automatically tracks your code, experiments, and results, allowing you to visually compare different runs and configurations.
- **Model Optimization**: It offers tools to compare models side by side, analyze hyperparameters, and track model performance across various metrics.
- **Collaboration and Sharing**: Share findings and models with colleagues or the ML community, enhancing team collaboration and knowledge transfer.
- **Reproducibility**: By logging every detail of the experiment setup, Comet ML ensures experiments are reproducible, making it easier to debug and iterate.

### Comet ML Variables

When integrating Comet ML into your projects, you'll need to set up several environment variables to manage authentication and configuration:

- `COMET_API_KEY`: Your unique API key that authenticates your interactions with the Comet ML API.
- `COMET_PROJECT`: The project name under which your experiments will be logged.
- `COMET_WORKSPACE`: The workspace name that organizes various projects and experiments.

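To see how these variables are consumed, here's a minimal sketch that creates a Comet ML experiment and logs a training metric (the metric name and values are placeholders):

```python
import os

from comet_ml import Experiment

# The three COMET_* variables are passed explicitly here;
# Comet ML can also pick them up from the environment automatically.
experiment = Experiment(
    api_key=os.environ["COMET_API_KEY"],
    project_name=os.environ["COMET_PROJECT"],
    workspace=os.environ["COMET_WORKSPACE"],
)

experiment.log_metric("train/loss", 0.47, step=100)  # placeholder metric
```
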
### Obtaining Comet ML Variables

To access and set up the necessary Comet ML variables for your project, follow these steps:

1. **Create an Account or Log In**:

   - Visit [Comet ML's website](https://www.comet.com/signup/?utm_source=decoding_ml&utm_medium=partner&utm_content=github) and log in if you already have an account, or sign up if you're a new user.

2. **Create a New Project**:

   - Once logged in, navigate to your dashboard. Here, you can create a new project by clicking on "New Project" and entering the relevant details.

3. **Access API Key**:

   - After creating your project, obtain your API key. Navigate to your account settings by clicking on your profile at the top-right corner, select "API Keys" from the menu, and generate or copy your existing API key.

4. **Set Environment Variables**:
   - These variables, `COMET_API_KEY`, `COMET_PROJECT`, and `COMET_WORKSPACE`, should be added to `build_config.yaml` when deploying on Qwak. Follow the next section to integrate Qwak.

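Once the variables are in place, the training pipeline can pull the versioned dataset back down and log results. Here's a minimal sketch using Comet ML Artifacts — the artifact name `llm-twin-dataset` and the paths are hypothetical placeholders, not the course's exact identifiers:

```python
from comet_ml import Experiment

experiment = Experiment()  # picks up the COMET_* variables from the environment

# Fetch a versioned dataset artifact (hypothetical artifact name)
artifact = experiment.get_artifact("llm-twin-dataset", version_or_alias="latest")
artifact.download("./dataset")

# ... fine-tune on the downloaded dataset ...

# Log the resulting model weights back to Comet ML (hypothetical path)
experiment.log_model("llm-twin-mistral7b", "./checkpoints/final")
experiment.end()
```
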
## 3. Qwak Integration

### Overview

[Qwak](https://www.qwak.com/lp/end-to-end-mlops/?utm_source=github&utm_medium=referral&utm_campaign=decodingml) is an all-in-one MLOps platform designed to streamline the entire machine learning lifecycle, from data preparation to deployment and monitoring. It offers a comprehensive suite of tools that allows data science teams to build, train, deploy, manage, and monitor AI and machine learning models efficiently.

### Why Use Qwak?

Qwak is used by a range of companies across various industries, from banking and finance to e-commerce and technology, underscoring its versatility and effectiveness in handling diverse AI and ML needs. Here are a few reasons:

- **End-to-End MLOps Platform**: Qwak provides tools for every stage of the machine learning lifecycle, including data preparation, model training, deployment, and monitoring. This integration eliminates the need for multiple disparate tools and simplifies the workflow for data science teams.
- **Integration with Existing Tools**: Qwak supports integrations with popular tools and platforms such as HuggingFace, Snowflake, Kafka, PostgreSQL, and more, facilitating seamless incorporation into existing workflows and infrastructure.
- **User-Friendly Interface**: Qwak offers a user-friendly interface and managed Jupyter notebooks, making it accessible both to experienced data scientists and to those new to the field.
- **Smooth Developer Experience**: The CLI and SDK are intuitive and easy to use, allowing developers to scale inference and training jobs without the hassle of managing infrastructure.

### Setting Up Qwak

[Qwak.ai](https://www.qwak.com/lp/end-to-end-mlops/?utm_source=github&utm_medium=referral&utm_campaign=decodingml) is straightforward to set up.

To configure your environment for Qwak, log in to [Qwak.ai](https://www.qwak.com/lp/end-to-end-mlops/?utm_source=github&utm_medium=referral&utm_campaign=decodingml), go to your profile → Settings → Account Settings → Personal API Keys, and generate a new key.

In your terminal, run `qwak configure`; when prompted for your `API-KEY`, paste it and you're done!

### Creating a New Qwak Model

To deploy model versions remotely on Qwak, you first have to initialize a `model` and a `project`. To do that, run in the terminal:

```shell
qwak models create "ModelName" --project "ProjectName"
```

Once you've done that, make sure you have these environment variables:

```plaintext
HUGGINGFACE_TOKEN="your-hugging-face-token"
COMET_API_KEY="your-key"
COMET_WORKSPACE="your-workspace"
COMET_PROJECT="your-project"
```

Now, populate the environment variables in `build_config.yaml` to complete the Qwak deployment prerequisites:

```yaml
build_env:
  docker:
    assumed_iam_role_arn: null
    base_image: public.ecr.aws/qwak-us-east-1/qwak-base:0.0.13-gpu
    cache: true
    env_vars:
      - HUGGINGFACE_ACCESS_TOKEN=""
      - COMET_API_KEY=""
      - COMET_WORKSPACE=""
      - COMET_PROJECT=""
    no_cache: false
    params: []
    push: true
  python_env:
    dependency_file_path: finetuning/requirements.txt
    git_credentials: null
    git_credentials_secret: null
    poetry: null
    virtualenv: null
  remote:
    is_remote: true
    resources:
      cpus: null
      gpu_amount: null
      gpu_type: null
      instance: gpu.a10.2xl
      memory: null
build_properties:
  branch: finetuning
  build_id: null
  gpu_compatible: false
  model_id: ---MODEL_NAME---
  model_uri:
    dependency_required_folders: []
    git_branch: master
    git_credentials: null
    git_credentials_secret: null
    git_secret_ssh: null
    main_dir: finetuning
    uri: .
  tags: []
deploy: false
deployment_instance: null
post_build: null
pre_build: null
purchase_option: null
step:
  tests: true
  validate_build_artifact: true
  validate_build_artifact_timeout: 120
verbose: 0
```
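With these values populated, the `make deploy-inference-pipeline` target described in the Usage section below submits the build using this configuration.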

# Usage

The project includes a `Makefile` for easy management of common tasks. Here are the main commands you can use:

- `make help`: Displays help for each make command.
- `make local-test-inference-pipeline`: Runs tests on a local Qwak deployment.
- `make create-qwak-project`: Creates a Qwak project to deploy the model.
- `make deploy-inference-pipeline`: Triggers a new fine-tuning job on Qwak remotely, using the configuration specified in `build_config.yaml`.

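A typical end-to-end run with these targets, assuming the credentials above are configured, might look like this:

```shell
# One-time setup: create the Qwak project and model
make create-qwak-project

# Sanity-check the pipeline on a local Qwak deployment before paying for GPU time
make local-test-inference-pipeline

# Kick off the remote fine-tuning build defined in build_config.yaml
make deploy-inference-pipeline
```
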
------

# Meet your teachers!

The course is created under the [Decoding ML](https://decodingml.substack.com/) umbrella by:

<table>
  <tr>
    <td><a href="https://github.com/iusztinpaul" target="_blank"><img src="https://github.com/iusztinpaul.png" width="100" style="border-radius:50%;"/></a></td>
    <td>
      <strong>Paul Iusztin</strong><br />
      <i>Senior ML & MLOps Engineer</i>
    </td>
  </tr>
  <tr>
    <td><a href="https://github.com/alexandruvesa" target="_blank"><img src="https://github.com/alexandruvesa.png" width="100" style="border-radius:50%;"/></a></td>
    <td>
      <strong>Alexandru Vesa</strong><br />
      <i>Senior AI Engineer</i>
    </td>
  </tr>
  <tr>
    <td><a href="https://github.com/Joywalker" target="_blank"><img src="https://github.com/Joywalker.png" width="100" style="border-radius:50%;"/></a></td>
    <td>
      <strong>Răzvanț Alexandru</strong><br />
      <i>Senior ML Engineer</i>
    </td>
  </tr>
</table>

# License

This course is an open-source project released under the MIT license. Thus, as long as you distribute our LICENSE and acknowledge our work, you can safely clone or fork this project and use it as a source of inspiration for whatever you want (e.g., university projects, college degree projects, personal projects, etc.).

# 🏆 Contribution

A big "Thank you 🙏" to all our contributors! This course is possible only because of their efforts.

<p align="center">
  <a href="https://github.com/decodingml/llm-twin-course/graphs/contributors">
    <img src="https://contrib.rocks/image?repo=decodingml/llm-twin-course" />
  </a>
</p>