ARBORproject/arborproject.github.io


Welcome to ARBOR

Welcome! ARBOR (Analysis of Reasoning Behavior through Open Research) is an open collaboration where people across the internet work together publicly and in real time to analyze and interpret AI reasoning models. By sharing partial progress and early results, we aim to eliminate duplicative work and dramatically accelerate progress. To get started, take a look at the list of project discussions. If you're curious who's involved, see the People page of the wiki, and please add yourself if you like!

Our goal is to create a clearinghouse where people can:

  • Ask, discuss, and prioritize research questions
  • Post, discuss, and analyze early and partial experimental results
  • Provide a reference for what the research community has learned

Our strategy is to find ways for people of all backgrounds to participate. Our ongoing projects can be found in Discussions.

Why is interpretability research urgent for reasoning models?

Interpretability research is especially urgent in the case of reasoning models. (By "reasoning model" we mean LLMs that have been trained to exploit extended inference-time computation.) These systems appear to be powerful, yet remain mysterious. Understanding them better can help us create safer technology, sooner. It's also a grand intellectual challenge.

See Discussions to participate!

Why work collectively, in the open?

Open-source reasoning models have appeared relatively recently, and many different groups are looking at them right now for the first time, likely working along similar lines. We think the community can move faster and save work by sharing partial results and early ideas. Although the academic machine learning community already has an admirable culture of sharing preprints of finished work, we think it's worth experimenting with an even more open model.

An important element of our plan is to find ways for people at many levels of experience and skill to participate. In a project like this, there will be important roles for experienced ML researchers, for people looking to break into ML research with small concrete coding projects, for people who want to work on pure theory, for people who like analyzing data, and so on. In fact, there may be room for "citizen science", where people with no direct machine learning experience can help find patterns in model outputs and inputs.

Our efforts are inspired by the world of open-source software, and also by mathematical initiatives like Polymath and the Equational Theories Project.

Join Discussions to participate!

How to participate

All projects listed on Discussions are seeking collaborators by default. (Projects that are not looking for contributors are labeled "No longer seeking collaborators".) Please see CONTRIBUTING.md for details, but here are some starting points:

  • We're using GitHub Discussions to track active projects. If you want to propose or start a new project, feel free to start a new project thread; you can use this intro as a template.
  • Join our Discord. We are using NDIF's server; once you join, look for channels related to ARBOR.
  • See who's involved at the People page. Feel free to add your name and your interests!
  • If you have finished work, or know of a paper on arXiv that people might be interested in, please publicize it on Discord and add a link to the Bibliography.
  • Have an interesting small observation? Add it to the Observations wiki page.
  • Please read through our notes on authorship/credit, and the license for material in this repo.

A note on authorship

Our hope is that this project will catalyze multiple "classic" conference papers and collaborations. We advocate a fair, generous attitude toward authorship; see CREDIT.md for more information.

How to use this repository

We track projects in the Discussions section. Please browse through this list to get a sense of ongoing work and to find projects you might want to join. If you want to get started with interpretability research, check the Introductory Resources for the Curious.

To propose a new project, add a new discussion. The first post of each thread should follow a strict template, which includes:

  • Research Question: captures the main goal of the project
  • Owners: names (and GitHub usernames) of project leads
  • Project status: is this project ongoing? Are you looking for collaborators?

The first post should also be treated as a project-specific "Wiki" that summarizes the current status of the project. Please see instructions here and an example here.

Discord

Join our Discord! We are using NDIF's server; once you join, look for channels related to ARBOR.

Maintainers

This project is maintained by the Bau Lab at Northeastern and the Insight and Interaction Lab at Harvard. Please visit the GitHub repository for more information.
