Latest commit: Blaizzy, Fix idefics (2 and 3) do-image-split (#191), 5c7d159, Jan 29, 2025 (197 commits)

Name	Last commit message	Last commit date
.github	Patch utils and models (#167)	Jan 3, 2025
examples	Fix image masks and update pointing example (#117)	Nov 22, 2024
mlx_vlm	Fix idefics (2 and 3) do-image-split (#191)	Jan 29, 2025
.gitignore	Add example notebooks and support for system role (#95)	Oct 19, 2024
.pre-commit-config.yaml	add license, precommit and update readme	Apr 16, 2024
CONTRIBUTING.md	add license, precommit and update readme	Apr 16, 2024
LICENSE	add license, precommit and update readme	Apr 16, 2024
MANIFEST.in	Create vlm module and update package name	Apr 16, 2024
README.md	Corrected order of parameters on example (#169)	Jan 2, 2025
pytest.ini	Add support for Deepseek-vl2 and Language only inputs (#153)	Dec 22, 2024
requirements.txt	Pin latest mlx (#184)	Jan 18, 2025
setup.py	add console script entry_points	May 18, 2024


MLX-VLM

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

Installation

The easiest way to get started is to install the mlx-vlm package using pip:

pip install mlx-vlm

Usage

Command Line Interface (CLI)

Generate output from a model using the CLI:

python -m mlx_vlm.generate --model mlx-community/Qwen2-VL-2B-Instruct-4bit --max-tokens 100 --temp 0.0 --image http://images.cocodataset.org/val2017/000000039769.jpg

Chat UI with Gradio

Launch a chat interface using Gradio:

python -m mlx_vlm.chat_ui --model mlx-community/Qwen2-VL-2B-Instruct-4bit

Python Script

Here's an example of how to use MLX-VLM in a Python script:

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model_path = "mlx-community/Qwen2-VL-2B-Instruct-4bit"
model, processor = load(model_path)
config = load_config(model_path)

# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(image)
)

# Generate output
output = generate(model, processor, formatted_prompt, image, verbose=False)
print(output)

Multi-Image Chat Support

With select models, MLX-VLM can analyze multiple images in a single conversation. This enables visual reasoning tasks that span several images, such as comparing two photos.

Supported Models

The following models support multi-image chat:

Idefics 2
LLaVA (Interleave)
Qwen2-VL
Phi3-Vision
Pixtral

Usage Examples

Python Script

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "mlx-community/Qwen2-VL-2B-Instruct-4bit"
model, processor = load(model_path)
config = load_config(model_path)

images = ["path/to/image1.jpg", "path/to/image2.jpg"]
prompt = "Compare these two images."

formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(images)
)

output = generate(model, processor, formatted_prompt, images, verbose=False)
print(output)
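
Because loading a model takes a while, it can help to confirm that local image paths exist before calling `load`. The following is a minimal sketch under stated assumptions: `is_url`, `missing_local_images`, and `describe_images` are hypothetical helper names, and only `http` and `https` sources are treated as URLs. The MLX-VLM calls inside `describe_images` mirror the script above and are imported lazily so the path check runs even where MLX is not installed.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """Return True for http(s) URLs; anything else is treated as a local path."""
    return urlparse(source).scheme in ("http", "https")


def missing_local_images(images: list[str]) -> list[str]:
    """Return the local paths in `images` that are not files on disk. URLs are skipped."""
    return [src for src in images if not is_url(src) and not Path(src).is_file()]


def describe_images(model_path: str, images: list[str], prompt: str):
    """Hypothetical wrapper around the calls shown in the script above."""
    missing = missing_local_images(images)
    if missing:
        raise FileNotFoundError(f"Image file(s) not found: {missing}")

    # Imported here so the check above does not require MLX to be installed.
    from mlx_vlm import load, generate
    from mlx_vlm.prompt_utils import apply_chat_template
    from mlx_vlm.utils import load_config

    model, processor = load(model_path)
    config = load_config(model_path)
    formatted_prompt = apply_chat_template(
        processor, config, prompt, num_images=len(images)
    )
    return generate(model, processor, formatted_prompt, images, verbose=False)
```

Calling `describe_images("mlx-community/Qwen2-VL-2B-Instruct-4bit", images, "Compare these two images.")` then fails fast with `FileNotFoundError` if either placeholder path is missing, instead of after the model has been loaded.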

Command Line

python -m mlx_vlm.generate --model mlx-community/Qwen2-VL-2B-Instruct-4bit --max-tokens 100 --prompt "Compare these images" --image path/to/image1.jpg path/to/image2.jpg

These examples demonstrate how to use multiple images with MLX-VLM for more complex visual reasoning tasks.
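
When the image list is only known at run time, the multi-image command shown above can be assembled from Python. This is a sketch rather than part of the package: `build_generate_command` is a hypothetical helper, and it uses only the flags that appear in the commands above (`--model`, `--max-tokens`, `--prompt`, `--image`).

```python
import shlex
import sys


def build_generate_command(model, images, prompt=None, max_tokens=100):
    """Hypothetical helper: build the argument list for `python -m mlx_vlm.generate`."""
    cmd = [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", model,
        "--max-tokens", str(max_tokens),
    ]
    if prompt is not None:
        cmd += ["--prompt", prompt]
    if images:
        # All image paths or URLs follow a single --image flag, as in the command above.
        cmd += ["--image", *images]
    return cmd


cmd = build_generate_command(
    "mlx-community/Qwen2-VL-2B-Instruct-4bit",
    ["path/to/image1.jpg", "path/to/image2.jpg"],
    prompt="Compare these images",
)
# Print a shell-quoted version for inspection.
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a prompt or file path contains spaces.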

Fine-tuning

MLX-VLM supports fine-tuning models with LoRA and QLoRA.

LoRA & QLoRA

To learn more about LoRA, please refer to the LoRA.md file.
