Patch utils and models #167
Conversation
A Great Leap Forward! My run of the script bails out (see end of transcript). My own script just ploughs on (but doesn't properly reveal the warning) for whatever reason. It would be even better if it revealed some load/run timings, if only for performance tracking and relative performance comparisons. Suggestion: get those reporting bugs to run the smoke test on the failing model (and allow a model to be specified as an alternative to a list of them in a file).
from transformers import AutoModelForCausalLM, AutoProcessor

# Download the model and processor, then save them to a local directory
model_id = "<huggingface_model_id>"
model = AutoModelForCausalLM.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)
model.save_pretrained("<local_dir>")
processor.save_pretrained("<local_dir>")
python -m mlx_vlm.convert --hf-path <local_dir> --mlx-path <mlx_dir>
Thank you very much!
I suspect the image is too big for a model that size, even if you have 128GB. I will add a way to handle it, and an image resize shape option, to the smoke test.
I can add load time. But when it comes to run time, it's best measured in tokens-per-sec, which already exists.
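For what it's worth, a minimal sketch of measuring load time, assuming the top-level load() helper from mlx_vlm (the exact signature may differ between versions):

import time
from mlx_vlm import load

start = time.perf_counter()
model, processor = load("<mlx_dir>")  # path to a converted MLX model
elapsed = time.perf_counter() - start
print(f"Model load time: {elapsed:.2f}s")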
Could you elaborate? I don't understand what you mean.
Update on this error. I don't know of a way to handle this, because it's literally like having too many Chrome windows open: your PC just freezes and the task manager wants to kill the culprit. So the only solution is to reduce the size of the image, use a lower quant, or use a smaller model.
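For the lower-quant route, a hedged example: this assumes mlx_vlm.convert supports the same -q/--q-bits quantization flags as mlx_lm.convert, which may not hold for the version in this PR.

python -m mlx_vlm.convert --hf-path <local_dir> --mlx-path <mlx_dir> -q --q-bits 4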
Added model load time (screenshot attached).
Spoke too soon, this used not to break, but now does:
from transformers import AutoModelForCausalLM, AutoProcessor

# Download the model and processor, then save them to a local directory
model_id = "<huggingface_model_id>"
model = AutoModelForCausalLM.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)
model.save_pretrained("<local_dir>")
processor.save_pretrained("<local_dir>")
python -m mlx_vlm.convert --hf-path <local_dir> --mlx-path <mlx_dir>
(now just seems to die)
Don't mention it!
Big images always used to plough on with my script, but no longer do... I don't want to have to fiddle with image sizes when submitting them to the oracle. This is supposed to be AI! The script knows how much memory I have, and could know how much the model needs. Why are some models capable of running the same image without balking when others aren't?
Could you share a reproducible example with 1 to 2 models? Please include the image.
Hey, I get where you're coming from with automatic image sizing - it would be super convenient! But here's the tricky part: these AI models are surprisingly quirky with how they handle images. You might have two models that look similar on paper (same size, same number of parameters), but one could be way more memory-hungry just because of how its vision system is built. It's kind of like how two cars might have the same horsepower but totally different fuel efficiency.

Sure, we could try to guess how much memory each model needs, but it'd be like throwing darts blindfolded. Some users would end up with slower performance and might not even realize why. Plus, keeping it working with new hardware would be a constant headache.

So while I'd love to make this work, I think for now it's better to let users control their own image sizes. That way, you know exactly what you're getting and can tune it to what works best for your setup.
I'm working on other features that might bring the resource usage down, such as cache quantization, a rotating cache, and image+prompt caching. But even with these features, if the image is too big you will need to resize it manually to a size your preferred models run at with good accuracy. Or you can make your own heuristic to handle resizing.
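For anyone wanting such a heuristic, a minimal sketch using Pillow; resize_to_fit and the 2048-pixel cap are illustrative choices, not values from this thread:

from PIL import Image

def resize_to_fit(path: str, max_side: int = 2048) -> Image.Image:
    # Downscale so the longest side is at most max_side, preserving aspect ratio
    img = Image.open(path)
    scale = max_side / max(img.size)
    if scale < 1.0:
        new_size = (int(img.width * scale), int(img.height * scale))
        img = img.resize(new_size, Image.LANCZOS)
    return img

img = resize_to_fit("photo.jpg")
img.save("photo_resized.jpg")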
One thing that could help with the tuning would be to build on your smoke test app and apply a model + prompt to a given image, resized to max 512 pixels in length/width, then 1024, 2048, 4096 and 8192, in each case recording memory usage and memory usage per megapixel. This is, in any case, a useful thing to know about a model.
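A rough sketch of that sweep; run_model is a hypothetical stand-in for "apply model + prompt to the image", and the peak-memory helpers assume mlx's mx.metal API (which has moved between versions):

import mlx.core as mx
from PIL import Image

def run_model(image_path: str) -> None:
    # Hypothetical stand-in: load image_path and run the model + prompt on it
    ...

for max_side in (512, 1024, 2048, 4096, 8192):
    img = Image.open("photo.jpg")
    img.thumbnail((max_side, max_side), Image.LANCZOS)  # caps the longest side, in place
    img.save("photo_resized.jpg")
    megapixels = (img.width * img.height) / 1e6

    mx.metal.reset_peak_memory()
    run_model("photo_resized.jpg")
    peak_gb = mx.metal.get_peak_memory() / 1e9
    print(f"max side {max_side}: {peak_gb:.2f} GB peak, {peak_gb / megapixels:.3f} GB per megapixel")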
But it seems to be a bit more complicated. With a different image (from the same camera), https://live.staticflickr.com/65535/54245767632_324aaa7699_c.jpg, I can get:
from transformers import AutoModelForCausalLM, AutoProcessor

# Download the model and processor, then save them to a local directory
model_id = "<huggingface_model_id>"
model = AutoModelForCausalLM.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)
model.save_pretrained("<local_dir>")
processor.save_pretrained("<local_dir>")
python -m mlx_vlm.convert --hf-path <local_dir> --mlx-path <mlx_dir>
(i.e., my script runs all the way through)
# Various fixes and improvements

## Core Changes

## Testing and Documentation

## Smoke Test
This PR introduces a smoke test suite for validating model functionality. The suite verifies that each model loads and generates output, and reports model load time and generation speed (tokens-per-sec).
Inspired by @jrp2014's eval harness.
## Usage

Run the test suite against the models listed in models.txt:
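The invocation itself isn't preserved in this scrape; a hypothetical example, assuming the suite is exposed as an mlx_vlm.smoke_test module that takes a models file, with a single model id as the suggested alternative:

python -m mlx_vlm.smoke_test --models-file models.txt
python -m mlx_vlm.smoke_test --model <huggingface_model_id>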
Closes #166