This is the code base for the ARC-AGI Without Pretraining project.

Installation

> git clone https://github.com/iliao2345/CompressARC.git
> cd CompressARC
> python -m venv arc_agi_without_pretraining
> source arc_agi_without_pretraining/bin/activate
> pip install -r requirements.txt

How to solve an ARC-AGI task

Run analyze_example.py to initialize a new model and train from scratch:

> python analyze_example.py
Enter which split you want to find the task in (training, evaluation, test): <split>
Enter which task you want to analyze (eg. 272f95fa): <task>
Performing a training run on task <task> and placing the results in <task>/
|100%|███████████████████████████████████████████████| 1500/1500 [12:22<00:00, 2.01it/s]
done

The code creates a folder <task>/ and, after 1500 steps of training, places the following plots there:

solutions at every 50 steps
interpretable tensors of task representations
graph of each tensor's contribution to the KL over time
graph of the KL vs reconstruction error over time

Most tasks take up to 20 minutes to run on a single NVIDIA GeForce RTX 4070 GPU.
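
If you would rather not type the two answers by hand, the prompts can probably be answered through standard input. The sketch below is not part of the repo. It assumes that analyze_example.py reads the split and then the task from stdin, in the order shown in the transcript above; build_answers and run_analysis are hypothetical helper names.

```python
import subprocess
import sys

ALLOWED_SPLITS = ("training", "evaluation", "test")


def build_answers(split, task):
    """Return the text to feed to the two prompts, one answer per line."""
    if split not in ALLOWED_SPLITS:
        raise ValueError(f"split must be one of {ALLOWED_SPLITS}, got {split!r}")
    return f"{split}\n{task}\n"


def run_analysis(split, task, script="analyze_example.py"):
    """Launch the script and answer its prompts through stdin.

    Assumption: the script reads the split first and the task second.
    """
    return subprocess.run(
        [sys.executable, script],
        input=build_answers(split, task),
        text=True,
        check=True,  # raise CalledProcessError if the script exits with an error
    )


# Show what would be sent to the script for the example task ID from the prompt.
print(repr(build_answers("training", "272f95fa")))  # 'training\n272f95fa\n'
```

Calling run_analysis("training", "272f95fa") from inside the CompressARC folder, with the virtual environment active, would then start the same training run as the interactive session.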

Tips for Reading the Code

A basic description of the code files in this repo:

For running via command line:

analyze_example.py: Demonstrates how to solve one ARC-AGI problem using our method, with visualizations of learned task representations and plots of metrics.
plot_problems.py: Plots all of the ARC-AGI problems in a split.
plot_accuracy.py: Plots pass@n accuracies during/after a bulk training run with train.py.
train.py: Trains a model for every task in a split, plotting the accuracy. Contains code that computes the loss function.
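
For readers unfamiliar with pass@n, here is a minimal sketch of the usual definition: a task counts as solved if the correct output grid is among the first n ranked guesses. This is an illustration only and may differ from how train.py and plot_accuracy.py compute it; pass_at_n and the toy grids are hypothetical.

```python
def pass_at_n(ranked_guesses, answers, n):
    """Fraction of tasks whose correct answer is among the first n guesses.

    ranked_guesses: dict mapping task name -> list of guesses, best first.
    answers: dict mapping task name -> correct answer.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not answers:
        raise ValueError("answers must not be empty")
    solved = sum(
        1
        for task, answer in answers.items()
        if answer in ranked_guesses.get(task, [])[:n]
    )
    return solved / len(answers)


# Toy 2x2 grids stored as tuples of tuples so they compare by value.
grid_a = ((0, 1), (1, 0))
grid_b = ((1, 1), (0, 0))

guesses = {
    "task_1": [grid_a, grid_b],  # correct on the first guess
    "task_2": [grid_b, grid_a],  # correct only on the second guess
}
answers = {"task_1": grid_a, "task_2": grid_a}

print(pass_at_n(guesses, answers, 1))  # 0.5
print(pass_at_n(guesses, answers, 2))  # 1.0
```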

Functionality, not for running via command line:

arc_compressor.py: The network architecture and forward pass.
initializers.py: Model initialization, and handling of equivariances via weight tying.
layers.py: Implementation of individual layers in the forward pass.
multitensor_systems.py: Handling multitensors.
preprocessing.py: Converting the dataset into a form usable by the repo.
solution_selection.py: Logging metrics and converting model outputs into solution predictions.
visualization.py: Drawing problems and solutions.

Some classes that the repo defines and uses:

MultiTensorSystem (in multitensor_systems.py): A class that can spawn MultiTensors using stored dimensional information.
MultiTensor (in multitensor_systems.py): Container class for groups of tensors.
Logger (in solution_selection.py): For postprocessing of solutions output by the model, and for collecting them over time during training.
Task (in preprocessing.py): Contains information about an ARC-AGI task, such as grid dimensions and masks, pixel colors, etc.
ARCCompressor (in arc_compressor.py): Model class, with forward pass.
Initializer (in initializers.py): For initializing model weights.

Some repo-specific language that we use for variable naming, etc.:

dims is a length-5 list of zeros and ones indicating the presence or absence of each of the five multitensor dimensions $(example, color, direction, height, width)$. The channel dimension is implicitly included.
axis always refers to the index of some dim in a tensor, counted from zero. For example, in an $(example, color, height)$ tensor, the $height$ dim is axis 2, whereas in a $(height, width)$ tensor, it is axis 0.
This repo uses x and y to refer to the $height$ and $width$ dimensions, respectively.
The @multitensor_systems.multify decorator takes a function and modifies it so that it is applied once for every tensor in a multitensor. Wherever the original function takes a tensor/object as input, the decorated function takes a multitensor/multiobject instead. The function must be written with an additional parameter dims.
The @layers.add_residual decorator takes a function and creates a residual connection around it, with projections to/from the input/output of the function and the residual stream. Optional parameters control whether the projections use biases and whether pre-norm and post-norm are applied.
The @layers.only_do_for_certain_shapes(*shapes) decorator takes a function with dims as its first input, and applies the function only if dims is in shapes. Otherwise, it applies the identity function. Useful for chaining with the @multitensor_systems.multify decorator.
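
To make the dims/axis convention and the decorator chaining more concrete, here is a small self-contained sketch. It is not the repo's implementation: it stands in for a multitensor with a plain dict keyed by dims, uses strings in place of tensors, and ignores the implicit channel dimension. The names axis_of, toy_multify and toy_only_do_for_certain_shapes are hypothetical.

```python
from functools import wraps

DIM_NAMES = ("example", "color", "direction", "height", "width")


def axis_of(dims, name):
    """Return the zero-indexed axis of dimension `name` in a tensor described by dims."""
    position = DIM_NAMES.index(name)
    if not dims[position]:
        raise ValueError(f"{name} is absent from dims={dims}")
    # The axis is the number of present dimensions that come before this one.
    return sum(dims[:position])


def toy_multify(fn):
    """Turn fn(dims, value) into a function over a dict {dims: value}."""
    @wraps(fn)
    def wrapper(multi):
        return {dims: fn(list(dims), value) for dims, value in multi.items()}
    return wrapper


def toy_only_do_for_certain_shapes(*shapes):
    """Apply fn only when dims is one of shapes; otherwise return the input unchanged."""
    allowed = {tuple(shape) for shape in shapes}

    def decorator(fn):
        @wraps(fn)
        def wrapper(dims, value):
            if tuple(dims) in allowed:
                return fn(dims, value)
            return value  # identity for every other shape
        return wrapper
    return decorator


@toy_multify
@toy_only_do_for_certain_shapes((1, 1, 0, 1, 0), (0, 0, 0, 1, 1))
def mark(dims, value):
    """Stand-in for a layer: appends '!' to show where it was applied."""
    return value + "!"


multi = {
    (1, 1, 0, 1, 0): "a",  # (example, color, height)
    (0, 0, 0, 1, 1): "b",  # (height, width)
    (1, 0, 0, 0, 0): "c",  # (example,)
}

print(axis_of([1, 1, 0, 1, 0], "height"))  # 2
print(axis_of([0, 0, 0, 1, 1], "height"))  # 0
print(mark(multi))  # "a" and "b" gain "!", "c" passes through unchanged
```

The only-for-certain-shapes wrapper sits innermost so that the multified function can visit every entry while the wrapped function only acts on the selected shapes.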

Different files may be written in slightly different styles, because individual code files were polished by ChatGPT.

Citation

If you'd like to cite this blog post, use the following entry:

@online{liao2025arcagiwithoutpretraining,
	author = {Isaac Liao and Albert Gu},
	title = {ARC-AGI Without Pretraining},
	year = {2025},
	url = {https://iliao2345.github.io/blog_posts/arc_agi_without_pretraining/arc_agi_without_pretraining.html},
}
