
The Tree Sitter for Deno!

This is a patched and enhanced version of web-tree-sitter, made to run on Deno.

Usage: How do I __ ?

  1. Install / Import
  2. Get an AST data structure
  3. Find a specific part of code (query the AST)
  4. Edit the tree (replace nodes, etc)

1. How to Install / Import

Thanks to Deno, most of the boilerplate could be removed!

The Legacy web-tree-sitter Way 🤢

const Parser = require('web-tree-sitter');

(async () => {
  await Parser.init();
  const parser = new Parser();
  const Lang = await Parser.Language.load('tree-sitter-javascript.wasm');
  parser.setLanguage(Lang);
  const tree = parser.parse('let x = 1;');
  console.log(tree.rootNode.toString());
})();

The New Way ✨

import { createParser } from "https://deno.land/x/deno_tree_sitter@1.0.0.0/main/main.js"
import javascript from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/676ffa3b93768b8ac628fd5c61656f7dc41ba413/main/javascript.js"

const parser = await createParser(javascript) // path or Uint8Array or URL
const tree = parser.parse('let x = 1;')

2. How to Parse

  1. Find a tree-sitter.wasm file for the language you want to parse (I precompiled a bunch over here: https://github.com/jeff-hykin/common_tree_sitter_languages)
  2. Load that wasm file using a URL, file path, or Uint8Array
  3. Call .parse() on a string of code

import { createParser } from "https://deno.land/x/deno_tree_sitter@1.0.0.0/main/main.js"
import javascriptUint8Array from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/676ffa3b93768b8ac628fd5c61656f7dc41ba413/main/javascript.js"

// ex: uint8array
const parser1 = await createParser(javascriptUint8Array)
// ex: file path
const parser2 = await createParser('./path/to/javascript.wasm')
// ex: url
const parser3 = await createParser('https://github.com/jeff-hykin/common_tree_sitter_languages/raw/676ffa3b93768b8ac628fd5c61656f7dc41ba413/main/javascript.wasm')

// parse a string
const tree1 = parser1.parse('let x = 1;')
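
The three input kinds above have to be told apart somewhere before loading. Here is a minimal sketch of one way to do that. classifyWasmSource is a hypothetical helper written for illustration; it is not part of deno_tree_sitter, and the library may well decide this differently.

```javascript
// Hypothetical helper (NOT part of deno_tree_sitter): decide how a wasm
// source argument would need to be loaded.
function classifyWasmSource(source) {
    if (source instanceof Uint8Array) {
        return "bytes" // already in memory, nothing to load
    }
    if (source instanceof URL) {
        return "url"
    }
    if (typeof source === "string") {
        // assumption: strings with an http(s) scheme are URLs, anything else is a file path
        return /^https?:\/\//.test(source) ? "url" : "path"
    }
    throw new TypeError("expected a Uint8Array, URL, or string")
}

const wasmMagicBytes = new Uint8Array([0, 97, 115, 109]) // "\0asm"
console.log(classifyWasmSource(wasmMagicBytes))                        // "bytes"
console.log(classifyWasmSource("./path/to/javascript.wasm"))           // "path"
console.log(classifyWasmSource("https://example.com/javascript.wasm")) // "url"
```

The example.com URL is a placeholder, not a real grammar location.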

Quick Languages

I aggregated some wasm parsers here for quick use.

import html from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/html.js"
import c from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/c.js"
import python from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/python.js"
import bash from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/bash.js"
import typescript from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/typescript.js"
import yaml from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/yaml.js"
import javascript from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/javascript.js"
import rust from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/rust.js"
import css from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/css.js"
import json from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/json.js"
import wat from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/wat.js"
import wast from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/wast.js"
import tsx from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/tsx.js"
import toml from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/toml.js"
import nix from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/nix.js"
import cpp from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/cpp.js"
import gitignore from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/gitignore.js"
import treeSitterQuery from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/tree-sitter-query.js"

The Tree Data Structure

import { Parser, createParser } from "https://deno.land/x/deno_tree_sitter@1.0.0.0/main/main.js"
import rust from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/rust.js"

const parser = await createParser(rust)
const tree = parser.parse(' fn main() { }')

tree.rootNode // main thing you probably care about
tree.rootNode.children // array of nodes
tree.rootNode.children[0].fields // object of nodes
tree.rootNode.children[0].fields.parameters.text // "()"
tree.rootNode.children[0].fields.body.text // "{ }"
tree.rootNode.children[0].fields.name.text // "main"
tree.rootNode.text == " fn main() { }" // true

tree.language.types  // array
tree.language.fields // array

// roughly what tree.rootNode looks like when logged:
tree.rootNode == {
  type: "source_file",
  typeId: 139,
  startPosition: { row: 0, column: 0 },
  startIndex: 0,
  endPosition: { row: 0, column: 13 },
  endIndex: 13,
  indent: "",
  hasChildren: true,
  children: [
    {
      type: "function_item",
      typeId: 170,
      startPosition: { row: 0, column: 0 },
      startIndex: 0,
      endPosition: { row: 0, column: 13 },
      endIndex: 13,
      indent: undefined,
      hasChildren: true,
      children: [ [Object], [Object], [Object], [Object] ]
    }
  ]
}
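
Since every node carries startIndex, endIndex, and children, finding the node under a cursor position is a short recursive walk. Here is a minimal sketch. fakeRoot is a hand-written plain object modeled on the dump above for the input 'fn main() { }' (the child types and indices are assumptions, not real parser output), and deepestNodeAt is a hypothetical helper, not part of this library.

```javascript
// Hand-written stand-in for a parsed tree of 'fn main() { }'.
// Assumption: child types/indices are illustrative, not real parser output.
const fakeRoot = {
    type: "source_file", startIndex: 0, endIndex: 13,
    children: [
        {
            type: "function_item", startIndex: 0, endIndex: 13,
            children: [
                { type: "fn", startIndex: 0, endIndex: 2, children: [] },
                { type: "identifier", startIndex: 3, endIndex: 7, children: [] },
                { type: "parameters", startIndex: 7, endIndex: 9, children: [] },
                { type: "block", startIndex: 10, endIndex: 13, children: [] },
            ],
        },
    ],
}

// Hypothetical helper: return the deepest node whose [startIndex, endIndex)
// range contains `index`, or null if the index is outside the node.
function deepestNodeAt(node, index) {
    if (index < node.startIndex || index >= node.endIndex) {
        return null
    }
    for (const child of node.children) {
        const found = deepestNodeAt(child, index)
        if (found) {
            return found
        }
    }
    return node // no child covers the index, so this node is the deepest match
}

console.log(deepestNodeAt(fakeRoot, 4).type)  // "identifier" (inside "main")
console.log(deepestNodeAt(fakeRoot, 2).type)  // "function_item" (the space after "fn")
console.log(deepestNodeAt(fakeRoot, 20))      // null (past the end)
```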

3. How to Find Specific Things (Query)

If you want a jQuery-like approach to an AST, you’re in luck. Tree-sitter has a whole query syntax, and here’s how to use it:

// 
// setup
// 
import { Parser, createParser } from "https://deno.land/x/deno_tree_sitter@1.0.0.0/main/main.js"
import javascript from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/javascript.js"
var parser = await createParser(javascript) // path or Uint8Array
var tree = parser.parse('let a = 1;let b = 1;let c = 1;')
var root = tree.rootNode

// if you already have a node in a variable, console.log(node.getQueryForSelf()) prints a query that selects it
// ex:
    console.log(root.children[0].children[0].getQueryForSelf())
    // (program (lexical_declaration (let)))

// 
// quickQuery will return the high-level node, even if you don't specify names (like @name1)
// 
    var firstLexicalNode = root.quickQuery(`(lexical_declaration)`)[0]
    // Alternatively:
    var firstLexicalNode = root.quickQueryFirst(`(lexical_declaration)`)

// 
// quickQueryFirst
// 
    var firstIdentifierNode = root.quickQueryFirst(`(lexical_declaration)`).quickQueryFirst(`(identifier)`)

// 
// you can also specify extraction names
// 
    var { blahInner, blahOuter } = root.quickQuery(`(lexical_declaration (identifier) @blahInner ) @blahOuter`)[0]

// 
// full .query()
// 
    // basic
    var results = tree.rootNode.query(`(identifier) @blahBlahBlah`)
    // capped count
    var results = tree.rootNode.query(`(identifier) @blahBlahBlah`, { matchLimit: 2 })
    // limited range
    var results = tree.rootNode.query(
        `(identifier) @blahBlahBlah`,
        {
            matchLimit: 2,
            startPosition: { row: 0, column: 0 },
            endPosition: {row: 1000, column: 1000}
        }
    )

// output structure
results == [
    {
        pattern: 0,
        captures: [
            {
                name: "blahBlahBlah",
                node: {
                    type: "identifier",
                    typeId: 1,
                    startPosition: { row: 0, column: 4 },
                    startIndex: 4,
                    endPosition: { row: 0, column: 5 },
                    endIndex: 5,
                    indent: undefined,
                    hasChildren: false,
                    children: []
                }
            }
        ]
    },
    {
        pattern: 0,
        captures: [
            {
                name: "blahBlahBlah",
                node: {
                    type: "identifier",
                    typeId: 1,
                    startPosition: { row: 0, column: 14 },
                    startIndex: 14,
                    endPosition: { row: 0, column: 15 },
                    endIndex: 15,
                    indent: undefined,
                    hasChildren: false,
                    children: []
                }
            }
        ]
    },
    {
        pattern: 0,
        captures: [
            {
                name: "blahBlahBlah",
                node: {
                    type: "identifier",
                    typeId: 1,
                    startPosition: { row: 0, column: 24 },
                    startIndex: 24,
                    endPosition: { row: 0, column: 25 },
                    endIndex: 25,
                    indent: undefined,
                    hasChildren: false,
                    children: []
                }
            }
        ]
    }
]
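
The nested match/capture structure is often easier to work with once it is grouped by capture name. Here is a minimal sketch of that, run against a trimmed copy of the output above (only the fields it needs). capturesByName is a hypothetical helper, not a method of this library.

```javascript
// The source string and a trimmed copy of the query output shown above.
const source = 'let a = 1;let b = 1;let c = 1;'
const results = [
    { pattern: 0, captures: [ { name: "blahBlahBlah", node: { type: "identifier", startIndex: 4, endIndex: 5 } } ] },
    { pattern: 0, captures: [ { name: "blahBlahBlah", node: { type: "identifier", startIndex: 14, endIndex: 15 } } ] },
    { pattern: 0, captures: [ { name: "blahBlahBlah", node: { type: "identifier", startIndex: 24, endIndex: 25 } } ] },
]

// Hypothetical helper: flatten all matches into { captureName: [node, ...] }.
function capturesByName(matches) {
    const grouped = {}
    for (const match of matches) {
        for (const { name, node } of match.captures) {
            if (!grouped[name]) {
                grouped[name] = []
            }
            grouped[name].push(node)
        }
    }
    return grouped
}

const grouped = capturesByName(results)
// recover each identifier's text from the source using its index range
const names = grouped.blahBlahBlah.map(node => source.slice(node.startIndex, node.endIndex))
console.log(names) // [ "a", "b", "c" ]
```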

Traversing

It is surprisingly handy to be able to iterate over every node (at any depth) in order.

import { createParser } from "https://deno.land/x/deno_tree_sitter@1.0.0.0/main/main.js"
import javascript from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/javascript.js"
const parser = await createParser(javascript) // path or Uint8Array
const tree = parser.parse(`
    function thing(arg1) {
        let a = 10
    }
`)

// 
// example with nice printout:
// 
let indent = ""
for (const [ parents, node, direction ] of tree.rootNode.traverse()) {
    const isLeafNode = direction == "-"
    if (isLeafNode) {
        console.log(indent+`<${node.type} text=${JSON.stringify(node.text)} />`)
    } else if (direction == "->") {
        console.log(indent+`<${node.type}>`)
        indent += "    "
    } else if (direction == "<-") {
        indent = indent.slice(0,-4)
        console.log(indent+`</${node.type}>`)
    }
}

// prints:
// <program>
//     <function_declaration>
//         <function text="function" />
//         <identifier text="thing" />
//         <formal_parameters>
//             <( text="(" />
//             <identifier text="arg1" />
//             <) text=")" />
//         </formal_parameters>
//         <statement_block>
//             <{ text="{" />
//             <lexical_declaration>
//                 <let text="let" />
//                 <variable_declarator>
//                     <identifier text="a" />
//                     <= text="=" />
//                     <number text="10" />
//                 </variable_declarator>
//             </lexical_declaration>
//             <} text="}" />
//         </statement_block>
//     </function_declaration>
// </program>
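
To make the three direction values concrete, here is a minimal sketch of a generator that yields the same kind of [ parents, node, direction ] triples over plain objects. It is not the library's implementation: traverseSketch and fakeTree are hypothetical, and the ordering of the parents array (root first) is an assumption.

```javascript
// Hypothetical generator: "->" when entering a node with children,
// "<-" when leaving it, and "-" for a leaf node.
function* traverseSketch(node, parents = []) {
    const isLeaf = !node.children || node.children.length === 0
    if (isLeaf) {
        yield [parents, node, "-"]
        return
    }
    yield [parents, node, "->"]
    const childParents = [...parents, node] // assumption: root-first ordering
    for (const child of node.children) {
        yield* traverseSketch(child, childParents)
    }
    yield [parents, node, "<-"]
}

// Hand-written stand-in for the `let a = 10` part of the tree above.
const fakeTree = {
    type: "lexical_declaration",
    children: [
        { type: "let", children: [] },
        {
            type: "variable_declarator",
            children: [
                { type: "identifier", children: [] },
                { type: "=", children: [] },
                { type: "number", children: [] },
            ],
        },
    ],
}

const events = []
for (const [parents, node, direction] of traverseSketch(fakeTree)) {
    events.push(`${direction} ${node.type} (depth ${parents.length})`)
}
console.log(events.join("\n"))
```

Each non-leaf node shows up twice (once entering, once leaving), which is what lets the printout above open and close its tags.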

Whitespace and Soft Nodes

If you’re making a formatter or a code refactoring tool, whitespace would normally be a bit of a pain, because tree-sitter doesn’t handle it well.

Typically, tree-sitter grammars don’t have whitespace nodes at all (and sometimes they even skip normal code!). This means some text is stored outside of the nodes in the AST. This library solves that problem by automatically injecting “soft nodes” back into the AST. Most of the time they are just whitespace nodes, but sometimes they can be other syntax, depending on your grammar.

import { createParser } from "https://deno.land/x/deno_tree_sitter@1.0.0.0/main/main.js"
import javascript from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/javascript.js"

const parser = await createParser(javascript)
const tree = parser.parse('   let x = 1;')
tree.rootNode.children[0] // whitespace node
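
The idea behind soft nodes can be sketched with nothing but index ranges: wherever a gap exists between neighboring children, fill it with a node holding the skipped text. This is only an illustration of the concept, not the library's implementation. withSoftNodes, the "whitespace" type name, and the isSoft flag are all hypothetical.

```javascript
// Hypothetical helper: given a parent's index range and its real children,
// return a child list where every uncovered gap is filled by a "soft" node.
function withSoftNodes(source, parentStart, parentEnd, children) {
    const output = []
    let cursor = parentStart
    const addGap = (gapStart, gapEnd) => output.push({
        type: "whitespace", // assumption: name chosen for this sketch
        isSoft: true,
        startIndex: gapStart,
        endIndex: gapEnd,
        text: source.slice(gapStart, gapEnd),
    })
    for (const child of children) {
        if (child.startIndex > cursor) {
            addGap(cursor, child.startIndex) // text the grammar skipped over
        }
        output.push(child)
        cursor = child.endIndex
    }
    if (cursor < parentEnd) {
        addGap(cursor, parentEnd) // trailing gap after the last child
    }
    return output
}

const source = '   let x = 1;'
const realChildren = [ { type: "lexical_declaration", startIndex: 3, endIndex: 13 } ]
const filled = withSoftNodes(source, 0, source.length, realChildren)

console.log(filled.map(node => node.type)) // [ "whitespace", "lexical_declaration" ]
// with the gaps filled, the children's ranges cover the whole source again
const rebuilt = filled.map(node => source.slice(node.startIndex, node.endIndex)).join("")
console.log(rebuilt === source) // true
```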

4. How to Edit the Tree

import { createParser } from "https://deno.land/x/deno_tree_sitter@1.0.0.0/main/main.js"
import javascript from "https://github.com/jeff-hykin/common_tree_sitter_languages/raw/a1c34a3a73a173f82657e25468efc76e9e593843/main/javascript.js"

const parser = await createParser(javascript)
const tree = parser.parse(`
    function thing(arg1) {
        let a = 10
    }
`)

// replace several nodes (note: .replaceInnards() destroys the .children of the node it is called on)
tree.rootNode.children[0].children[0].replaceInnards(`async function`)
tree.rootNode.children[0].children[5].children[2].children[2].children[4].replaceInnards(`999`)
tree.rootNode.children[0].children[5].children[2].children[2].children[2].replaceInnards(`+=`)

console.log(tree.rootNode.text)
// prints:
    // async function thing(arg1) {
    //     let a += 999
    // }
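
If you only need the final string and not an updated tree, the same kind of change can be made on the source text directly using each node's startIndex and endIndex. Here is a minimal sketch of that alternative. applyTextEdits is a hypothetical helper (not part of this library), and the index ranges below were worked out by hand for the string 'let a = 10'.

```javascript
// Hypothetical helper: apply non-overlapping { startIndex, endIndex, replacement }
// edits to a string. Edits are applied from the end of the string backwards so
// earlier index ranges stay valid while later text changes length.
function applyTextEdits(source, edits) {
    const sorted = [...edits].sort((a, b) => b.startIndex - a.startIndex)
    let output = source
    for (const { startIndex, endIndex, replacement } of sorted) {
        output = output.slice(0, startIndex) + replacement + output.slice(endIndex)
    }
    return output
}

const source = 'let a = 10'
const edited = applyTextEdits(source, [
    { startIndex: 6, endIndex: 7, replacement: "+=" },   // the "=" token
    { startIndex: 8, endIndex: 10, replacement: "999" }, // the "10" number
])
console.log(edited) // "let a += 999"
```

Unlike .replaceInnards(), this leaves the tree untouched, so you would need to re-parse the result to get nodes for the new text.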

Contributing

You can edit main.js or tree_sitter.js. However, if you edit tree_sitter.js, you’ll also need to edit run/pull_tree_sitter (which is JavaScript). run/pull_tree_sitter is what allows this repo to stay up to date with web-tree-sitter. It injects changes into the official tree-sitter codebase, and you’ll have to do the same for any changes you make.