Releases: Blaizzy/mlx-vlm

v0.1.7

30 Dec 01:41
78920b0

What's Changed

  • Fix multi-image and 2x speed improvements (DS-VL2) by @Blaizzy in #157
  • Refactor utils (model loading, inference and output processing) by @Blaizzy in #161
  • Fix Llama-3.2-Vision (18x faster generation and 75% less memory usage) by @Blaizzy in #163

⚠️ Breaking Changes

This release introduces some breaking changes. If you encounter any issues, please open an issue or submit a PR.

Full Changelog: v0.1.6...v0.1.7

v0.1.6

22 Dec 20:00
f0b0058

What's Changed

Full Changelog: v0.1.5...v0.1.6

v0.1.5

22 Dec 17:19
398cb62

What's Changed

Full Changelog: v0.1.4...v0.1.5

v0.1.4

05 Dec 22:32
3f5e162

What's Changed

Full Changelog: v0.1.3...v0.1.4

v0.1.3

28 Nov 15:57
595c1f0

What's Changed

New Contributors

Full Changelog: v0.1.2...v0.1.3

v0.1.2

26 Nov 21:46
ebeb19d

What's Changed

New Contributors

Full Changelog: v0.1.1...v0.1.2

v0.1.1

23 Nov 15:15
88c31b4

What's Changed

  • Add example notebooks and support for system role by @Blaizzy in #95
  • Fix Pixtral image prompt order for doc VQA by @ndurner in #99
  • Fix Qwen2-VL OCR and repetition penalty by @Blaizzy in #109
  • Qwen2-VL performance improvements by @Blaizzy in #113
  • Faster / more memory efficient Qwen VL by @awni in #114
  • Add support for Molmo by @Blaizzy in #112
  • Add support for Florence-2 by @Blaizzy in #105
  • Fix image masks and update pointing example by @Blaizzy in #117

New Contributors

Full Changelog: v0.1.0...v0.1.1

v0.1.0

18 Oct 00:15
c478b7b

What's Changed

  • Add support for Pixtral-12B by @Blaizzy in #67
  • Fix pixtral multi-image by @hiima234 in #41
  • Added: Qwen2-VL Unit Tests, Refactored Weight Sanitization by @benzimring in #63
  • Trainer + Multi image v0.1.0 by @Blaizzy in #41
  • Fix example scripts in the readme.md to import and use load_config by @mark-lord in #82
  • Qwen2-VL Improvements (1-2x speedup) by @Blaizzy in #89
  • Fix Paligemma object detection and segmentation by @Blaizzy in #90
  • Add support for Llama-3.2-vision & Resize image by @Blaizzy in #83
  • Fix idefics-2 mask by @Blaizzy in #91

New Contributors

Full Changelog: v0.0.15...v0.1.0

v0.0.15

29 Sep 00:24
50961f6

What's Changed

  • Qwen2-VL fix vision tower bug for HD images by @Blaizzy in #62

Full Changelog: v0.0.14...v0.0.15

v0.0.14

28 Sep 16:14

What's Changed

Full Changelog: v0.0.13...v0.0.14