title | group | category |
---|---|---|
Adding a New Language | Documents | Guides |
Before you start your work, we recommend you create a GitHub Issue to let us know. This is not a commitment of any sort; it simply lets other contributors know about your work and helps avoid duplicate effort.
We're using tree-sitter to parse code into ASTs. Each language requires its own parser. Find yours in tree-sitter's list of parsers.
Once you find the parser, you need to install it:
bun add --dev tree-sitter-<language>
After installing it, add it to `./scripts/generate-parsers.ts` and run `bun generate-parsers` to try to generate the `.wasm` parser file from it.
If the package contains a pre-built `.wasm` file, this will work.
If it fails, follow the [tree-sitter instructions for generating .wasm language files][build wasm] to set up Emscripten, then run `bun generate-parsers` again.
Once the command completes successfully, your new parser should be inside `./parsers`.
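To see up front whether an installed parser package ships a pre-built `.wasm` file, a small script along these lines may help. It is a minimal sketch and not part of the project's tooling: the `findWasmFiles` helper and the `tree-sitter-python` package name are illustrative assumptions, and the layout of a given parser package may differ.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

/**
 * Recursively collect every .wasm file below `dir`.
 * Returns an empty list when the directory does not exist.
 */
function findWasmFiles(dir: string): string[] {
  if (!fs.existsSync(dir)) {
    return [];
  }
  const found: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Descend into sub-directories of the package.
      found.push(...findWasmFiles(fullPath));
    } else if (entry.name.endsWith(".wasm")) {
      found.push(fullPath);
    }
  }
  return found;
}

// Hypothetical usage: "tree-sitter-python" stands in for your parser package.
const packageDir = path.join("node_modules", "tree-sitter-python");
console.log(findWasmFiles(packageDir));
```

An empty list suggests the package has no pre-built file, in which case the Emscripten setup described above is likely needed.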
Each CFG-builder resides in its own file inside `./src/control-flow`.
Name yours `cfg-<language>.ts`.
Your builder is expected to expose a `createCFGBuilder(options: BuilderOptions): CFGBuilder` function.
A naive implementation to get started with would look something like this:
import type Parser from "web-tree-sitter";
import type { BasicBlock, BuilderOptions, CFGBuilder } from "./cfg-defs";
import {
  type Context,
  GenericCFGBuilder,
  type StatementHandlers,
} from "./generic-cfg-builder.ts";

export function createCFGBuilder(options: BuilderOptions): CFGBuilder {
  return new GenericCFGBuilder(statementHandlers, options);
}

// No language-specific handlers yet: every statement uses the default.
const statementHandlers: StatementHandlers = {
  named: {},
  default: defaultProcessStatement,
};

// Turn any statement into a single node that is both entry and exit.
function defaultProcessStatement(
  syntax: Parser.SyntaxNode,
  ctx: Context,
): BasicBlock {
  const newNode = ctx.builder.addNode(
    "STATEMENT",
    syntax.text,
    syntax.startIndex,
  );
  ctx.link.syntaxToNode(syntax, newNode);
  return { entry: newNode, exit: newNode };
}
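
The handler table pairs a `named` map with a `default` fallback. Assuming the generic builder picks a handler by the tree-sitter node type (an assumption about its internals, not something confirmed here), the idea can be pictured with the stand-in types below. None of these types are the project's own, and the `return_statement` handler and `"RETURN"` kind are hypothetical.

```typescript
// Stand-in types for illustration only. The project's real definitions live
// in ./cfg-defs and ./generic-cfg-builder.ts and may differ.
interface SyntaxNodeLike {
  type: string; // tree-sitter node type, e.g. "return_statement"
  text: string;
}
interface BlockLike {
  kind: string;
  text: string;
}
type HandlerLike = (syntax: SyntaxNodeLike) => BlockLike;
interface HandlersLike {
  named: Record<string, HandlerLike | undefined>;
  default: HandlerLike;
}

const handlers: HandlersLike = {
  named: {
    // Hypothetical handler for one node type of your language's grammar.
    return_statement: (syntax) => ({ kind: "RETURN", text: syntax.text }),
  },
  // Everything without a dedicated handler becomes a plain statement.
  default: (syntax) => ({ kind: "STATEMENT", text: syntax.text }),
};

// Assumed dispatch rule: prefer a handler registered under the node's type,
// otherwise fall back to the default handler.
function dispatch(table: HandlersLike, syntax: SyntaxNodeLike): BlockLike {
  const handler = table.named[syntax.type] ?? table.default;
  return handler(syntax);
}

// prints "RETURN"
console.log(dispatch(handlers, { type: "return_statement", text: "return x" }).kind);
// prints "STATEMENT"
console.log(dispatch(handlers, { type: "expression_statement", text: "x = 1" }).kind);
```

Starting with an empty `named` map and adding one handler per node type keeps every intermediate state of the builder usable, since unhandled statements still fall through to the default.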
Once you have your initial builder file, there is quite a lot of wiring to do to register the language.
Search the code for `ADD-LANGUAGES-HERE` and add the language at every marked location.
Those will include:
- Language & builder definitions in `src/control-flow/cfg.ts`
- Mapping languages to `.wasm` files in `src/components/parser-loader/wasmMappings.ts`
- Mapping VSCode's `languageId` to our language definitions in `src/vscode/extension.ts`
- Adding test-collectors and tests in `src/test/commentTestCollector.ts`
- Adding the language in the demo's UI in `src/components/Demo.svelte`
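
Most of these wiring steps amount to adding one entry per lookup table. As a rough illustration only (every name below is hypothetical and not taken from the project's source), typing such tables against a single language list lets the compiler flag a table that was not updated:

```typescript
// Hypothetical names throughout; the real definitions live in the files
// listed above and may be shaped differently.
const languages = ["ExistingLanguage", "NewLanguage"] as const;
type Language = (typeof languages)[number];

// Typing the table as Record<Language, ...> makes the compiler report a
// missing entry as soon as a language is added to the list above.
const wasmFiles: Record<Language, string> = {
  ExistingLanguage: "tree-sitter-existing.wasm",
  NewLanguage: "tree-sitter-new.wasm",
};

// Editor language id -> our language name. A Map avoids accidental hits on
// inherited object properties such as "constructor".
const languageIds = new Map<string, Language>([
  ["existing", "ExistingLanguage"],
  ["newlang", "NewLanguage"],
]);

// Resolve an editor language id to the parser file, if the id is known.
function wasmFileForLanguageId(languageId: string): string | undefined {
  const language = languageIds.get(languageId);
  return language === undefined ? undefined : wasmFiles[language];
}

console.log(wasmFileForLanguageId("newlang")); // prints "tree-sitter-new.wasm"
console.log(wasmFileForLanguageId("unknown")); // prints undefined
```

Whether the project's own tables are typed this way is not stated here, so treat the `ADD-LANGUAGES-HERE` markers as the authoritative list of places to update.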
Once all the wiring is in place, it's time to actually generate the CFG. It is highly recommended that you read the existing CFG implementations for reference.
While you're working, the tree-sitter playground will prove highly valuable in understanding the AST and creating queries.