Overview

Mogglo is a multi-language AST-based code search and rewriting tool. Mogglo supports embedding Lua code in search patterns and replacements.

Mogglo focuses on the following goals:

  • Minimal setup: Mogglo will work right away on any codebase in a supported language.
  • Many languages: 12 and counting!
  • High-quality grammars: Mogglo is based on tree-sitter grammars.
  • Lua: Mogglo exposes a rich API to embedded Lua snippets.

Introduction

The following examples give a taste of Mogglo. Here's how to find pointless assignments of an expression to itself:

mogglo-rust --detail 'let $x = $x;' ./**/*.rs

The --detail flag helps you understand why something matched. It produces output like:

   ╭─[./test/nonlinear.rs:4:1]
   │
 4 │ ╭─▶ let a =
   │ │       ┬
   │ │       ╰── $x
 5 │ ├─▶     a;
   │ │       ┬
   │ │       ╰──── $x
   │ │
   │ ╰──────────── let $x = $x;
   │
   │ Note: Multiple occurrences of $x were structurally equal
───╯

Lua code is wrapped in double braces, as in ${{lua code}}. Lua can recursively match patterns with rec. Here's a pattern to detect out-of-bounds array accesses:

mogglo-rust 'while $i <= $buf.len() { ${{ rec("$buf.get($i)") }} }' ./**/*.rs
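To see why the pattern above is worth searching for, here is a minimal Rust sketch of code with the shape it targets. The function name sum_with_inclusive_bound is made up for illustration and is not part of Mogglo or its tests. Because the loop condition uses <= rather than <, the last iteration asks for index buf.len(), which is one past the end.

```rust
/// Sums a slice using the off-by-one loop condition `i <= buf.len()`.
/// Returns the sum and the number of `get` calls that fell outside the slice.
/// (Hypothetical example code, not taken from Mogglo's test suite.)
fn sum_with_inclusive_bound(buf: &[u32]) -> (u32, usize) {
    let mut i = 0;
    let mut sum = 0;
    let mut out_of_bounds = 0;
    while i <= buf.len() {
        match buf.get(i) {
            Some(value) => sum += value,
            // Reached exactly once, when `i == buf.len()`.
            None => out_of_bounds += 1,
        }
        i += 1;
    }
    (sum, out_of_bounds)
}
```

Here get returns None instead of panicking, so the bug is silent; with direct indexing (buf[i]) the same loop would panic on its final iteration.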

Here's how to unroll a simple loop:

mogglo-rust \
  'for $i in 0..$h { $b; }' \
  --where 'h_num = tonumber(h); return h_num ~= nil and h_num % 4 == 0' \
  --replace 'for $i in 0..${{ string.format("%.0f", h / 4) }} { $b; $b; $b; $b; }' \
  ./**/*.rs

This transformation demonstrates the power of using Lua: it can't be done with regular expression substitutions and would be very difficult with other codemod tools.
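One caveat is worth checking before applying a rewrite like this. The replacement repeats $b four times per iteration but also shrinks the range of $i, so the two loops only behave identically when the body does not read the loop variable. The Rust sketch below compares both forms for a body that ignores $i; the function names are hypothetical.

```rust
/// Original loop shape: `for $i in 0..$h { $b; }` with a body that ignores `$i`.
fn run_original(h: u32) -> u32 {
    let mut total = 0;
    for _i in 0..h {
        total += 3;
    }
    total
}

/// Unrolled shape produced by the replacement: the range shrinks to `h / 4`
/// and the body appears four times.
fn run_unrolled(h: u32) -> u32 {
    let mut total = 0;
    for _i in 0..h / 4 {
        total += 3;
        total += 3;
        total += 3;
        total += 3;
    }
    total
}
```

For values of h that are multiples of 4 (the condition enforced by --where), both functions return the same total. For other values, such as 6, integer division drops iterations and the results differ, which is why the where clause is needed.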

Lua snippets can match and negate patterns, or even compose new patterns dynamically! See the guide for more detailed explanations, examples, and features.

Supported languages

Mogglo currently ships pre-built executables for the following languages:

Additionally, the following can be built from source or via Cargo/crates.io:

Languages are very easy to add, so file an issue or a PR if you want a new one!

Comparison to related tools

Mogglo is not as polished as any of the tools mentioned in this section.

Mogglo is most similar to other multi-language code search and codemod tools.

  • Mogglo is similar to ast-grep, but supports more languages and allows embedding Lua in patterns.
  • Mogglo is similar to Comby. Comby uses lower-fidelity parsers, but is much more battle-tested and better documented. Mogglo also embeds Lua in patterns.
  • Mogglo has less semantic understanding of code (e.g., name resolution) than Semgrep or CodeQL, but is much easier to set up.

There are many excellent language-specific code search and codemod tools; these tend to be more polished but less general than Mogglo.

Installation

Pre-compiled binaries

Pre-compiled binaries are available on the releases page.

Fetching binaries with cURL

You can download binaries with curl like so (replace X.Y.Z with a real version number, LANG with a supported language, and TARGET with your OS):

curl -sSL https://github.com/langston-barrett/mogglo/releases/download/vX.Y.Z/mogglo-LANG_TARGET -o mogglo-LANG

Build from source

To install from source, you'll need to install Rust and Cargo. Follow the instructions on the Rust installation page.

From a release on crates.io

You can build a released version from crates.io. To install the latest release of Mogglo for the language <LANG>, run:

cargo install mogglo-<LANG>

This will automatically download the source from crates.io, build it, and install it in Cargo's global binary directory (~/.cargo/bin/ by default).

From the latest unreleased version on GitHub

To build and install the very latest unreleased version, run:

cargo install --git https://github.com/langston-barrett/mogglo.git mogglo-LANG

From a local checkout

See the developer's guide.

Uninstalling

To uninstall, run cargo uninstall mogglo-<LANG>.

Guide

Mogglo searches for patterns in code. Mogglo patterns consist of code augmented with metavariables and embedded Lua code.

Metavariables

Metavariables match nodes in the syntax tree. For example, the pattern let $x = (); finds pointless assignments of the unit value (); the metavariable $x matches any expression.

Multiple uses of the same metavariable imply equality. For example, the pattern let $x = $x; finds pointless assignments of an identifier to itself.

The special metavariable $_ matches any syntax node, and multiple uses don't imply equality. For example, $_ == $_ finds an equality comparison between any two expressions.
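The binding rules described so far can be sketched as a toy matcher in Rust. This is only a model under a strong simplifying assumption: it works on flat token lists and compares text, whereas Mogglo matches syntax-tree nodes and compares them structurally. The function match_tokens is hypothetical.

```rust
use std::collections::HashMap;

/// Toy matcher over flat tokens (an assumption for illustration: Mogglo itself
/// matches syntax-tree nodes, not token lists).
/// Returns the metavariable bindings on success, or `None` if matching fails.
fn match_tokens<'a>(pattern: &[&'a str], source: &[&'a str]) -> Option<HashMap<&'a str, &'a str>> {
    if pattern.len() != source.len() {
        return None;
    }
    let mut bindings: HashMap<&'a str, &'a str> = HashMap::new();
    for (&pat, &src) in pattern.iter().zip(source.iter()) {
        if pat == "$_" {
            // Wildcard: matches anything and never records a binding.
            continue;
        }
        if let Some(name) = pat.strip_prefix('$') {
            match bindings.get(name) {
                // Repeated metavariable: the new occurrence must equal the first.
                Some(&bound) if bound != src => return None,
                Some(_) => {}
                None => {
                    bindings.insert(name, src);
                }
            }
        } else if pat != src {
            // Literal token: must match exactly.
            return None;
        }
    }
    Some(bindings)
}
```

In this model, the tokens of let $x = $x; match let a = a; (binding x to a) but not let a = b;, while $_ == $_ matches a == b because $_ never binds.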

The special metavariable $.. (read "ellipsis") can match any number of sibling nodes in the AST. For example, here's how to find the main function:

fn main() $.. { $.. }

Matching nodes with multiple children

Consider that there are several possible readings of the following pattern:

{ $f($x); $y + $z; }

It might only match blocks with exactly two statements, a call and an addition. It might match a block that contains any number of statements, as long as there is a call followed immediately by an addition. In fact, Mogglo interprets this pattern as matching any block that contains any number of statements, including a function call that is followed at some point by an addition.
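The interpretation Mogglo chooses amounts to an in-order subsequence check. The Rust sketch below models it, assuming statements are reduced to hypothetical kind labels such as "call" and "add"; it illustrates the rule and is not Mogglo's implementation.

```rust
/// Toy model of block matching: each pattern statement must match some block
/// statement, in order, with any number of other statements in between.
/// Statements are represented by hypothetical kind labels such as "call".
fn block_matches(pattern: &[&str], block: &[&str]) -> bool {
    let mut wanted = pattern.iter();
    let mut next = wanted.next();
    for stmt in block {
        // Advance to the next pattern statement once the current one is found.
        if next == Some(stmt) {
            next = wanted.next();
        }
    }
    // Every pattern statement was found, in order.
    next.is_none()
}
```

Under this model, a block with the statements let, call, let, add matches the pattern, while a block where the addition comes before the call does not.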

Lua

Lua code is written between curly braces: ${{lua code goes here}}. See the API reference for details.

Speed

Regular expressions are slow compared to plain string operations. Avoid rx when a string comparison or string.find will do.

Usage

By default, matches are non-recursive:

echo 'let a = { let b = c; c };' | mogglo-rust 'let $x = $y;' -
   ╭─[-:1:1]
   │
 1 │ let a = { let b = c; c };
   │ ────────────┬────────────
   │             ╰────────────── Match
───╯

The --recursive flag requests recursive matches. For the input above, it additionally prints:

   ╭─[-:1:11]
   │
 1 │ let a = { let b = c; c };
   │           ─────┬────
   │                ╰────── Match
───╯
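The difference between the two modes can be modeled with a toy tree in Rust. The Node type and count_matches function are hypothetical and only mirror the behavior shown in the two outputs above: without --recursive, the search does not look inside a node that has already matched.

```rust
/// A toy syntax tree node: `matches` records whether the pattern matches here.
struct Node {
    matches: bool,
    children: Vec<Node>,
}

/// Counts reported matches. Without `recursive`, the search does not look
/// inside a node that already matched; with it, nested matches count too.
fn count_matches(node: &Node, recursive: bool) -> usize {
    if node.matches && !recursive {
        return 1;
    }
    let nested: usize = node
        .children
        .iter()
        .map(|child| count_matches(child, recursive))
        .sum();
    nested + usize::from(node.matches)
}
```

For a tree shaped like the example input (an outer let that matches, containing a block, containing an inner let that also matches), the model reports one match by default and two with recursion enabled.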

Contributing

Thank you for your interest in Mogglo! We welcome and appreciate all kinds of contributions. Please feel free to file an issue or open a pull request.

You may want to take a look at:

If you have questions, please ask them on the discussions page!

Lua API reference

Contexts

Lua code is evaluated in three different contexts:

  • Patterns: Lua code embedded in ${{}} when matching code
  • Replacements: Lua code embedded in ${{}} when replacing code
  • Where clauses: Lua code passed to --where.

The value produced by code evaluated in patterns and where clauses is treated as a boolean. If code in a pattern evaluates to false or nil, the node is not matched; if the code evaluates to anything else, it is. For example, ${{true}} is equivalent to $_. Code evaluated in a replacement is treated as a string.
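This follows Lua's own truthiness rule, which can surprise authors used to other languages: only false and nil are falsy, so 0 and the empty string both count as a match. The Rust sketch below models the rule with a hypothetical LuaValue enum covering a small subset of Lua values.

```rust
/// A small subset of Lua values, enough to illustrate truthiness.
/// (Hypothetical type for illustration; not part of Mogglo's API.)
enum LuaValue {
    Nil,
    Boolean(bool),
    Number(f64),
    Str(String),
}

/// Lua's truthiness rule: only `nil` and `false` are falsy.
fn counts_as_match(value: &LuaValue) -> bool {
    !matches!(value, LuaValue::Nil | LuaValue::Boolean(false))
}
```

This is also why a snippet can return the result of string.find directly: it yields an index when the search succeeds and nil when it fails.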

The APIs available to code in patterns differ from those available to replacements and where clauses. For example, code in patterns can write to metavariables; code in replacements and where clauses can only read them. In the rest of this guide, (P) denotes an API available only to patterns, and (A) denotes an API available in all contexts.

Globals

(P) Lua code has access to a global variable t that holds the text of the syntax node in question. For example, this pattern finds let-bound variables that contain the letter x:

let ${{string.find(t, "x")}} = $_;

(A) All other metavariables are bound to globals; the pattern author is responsible for not clobbering other important globals.

Conventions

In the remainder of this document:

  • Option<T> means T or nil.
  • If the return type is omitted, it is nil.

Functions

  • bind(String), (P): Binds a metavariable to the current node

    • 1st argument: Metavariable name (without the $)
    • Example: ${{bind("x")}} is equivalent to $x if $x is not yet bound
    • Note: This function can overwrite existing bindings; use with care
  • match(String) -> bool, (P): Matches the current node against a pattern

    • 1st argument: A pattern
    • Returns: Whether or not the current node matches the pattern
    • Example: Patterns can be negated with match: ${{not match("<pattern>")}}, e.g., ${{not match("${{false}}")}} is equivalent to ${{true}}.
    • Note: Metavariables in the argument pattern are inherited from the overall pattern; variables bound inside the argument pattern are not bound outside of it.
  • meta(String) -> Option<String>, (A): Returns the binding for a metavariable

    • 1st argument: Metavariable name (without the $)
    • Returns: Value of the metavariable, or nil if there is none
    • Example: ${{meta("x") == t}} is roughly equivalent to $x if $x is already bound (though not exactly: it matches textually instead of structurally)
  • rec(String) -> bool, (P): Recursively matches all descendants of the current node against a pattern

    • 1st argument: A pattern
    • Returns: Whether or not some descendant of the current node matches the pattern
    • Example: let $x = ${{rec("$x")}} + $y; matches let a = (b + a) + c;
    • Note: Metavariables in the argument pattern are inherited from the overall pattern; variables bound inside the argument pattern are not bound outside of it.
  • rx(String, String) -> bool, (A): Returns whether its first argument is a regular expression that matches its second.

    • 1st argument: Regular expression
    • 2nd argument: String to be matched
    • Returns: Whether the regex matched the string

Nodes

In addition to the "textual" API given by the t variable, Lua code has access to a "structured" API for AST nodes. The type of node objects is denoted Node. The "current node" is stored in the global focus.

Node methods:

Node kinds

Each node in a tree-sitter parse tree has a kind, e.g., "binary expression" or "compound statement". Some of these kinds are children of each other, e.g., "call expression" might be a child of "expression". The following functions can query such relationships:

  • is_child_of(String, String) -> bool, (A)
  • is_descendant_of(String, String) -> bool, (A): Recursive, reflexive version of is_child_of
  • is_parent_of(String, String) -> bool, (A)
  • is_ancestor_of(String, String) -> bool, (A): Recursive, reflexive version of is_parent_of

For a list of possible node kinds, see grammar.js and node-types.json for the grammar in question (or just use the pattern ${{print(focus:kind())}}).
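The relationship between the direct and the "recursive, reflexive" queries can be sketched in Rust. Everything here is an assumption made for illustration: the kind names, the single-parent hierarchy (real grammars may be richer), and the argument order (child first, parent second), which the signatures above do not spell out.

```rust
use std::collections::HashMap;

/// Hypothetical kind hierarchy: maps each kind to its direct parent kind.
fn example_hierarchy() -> HashMap<&'static str, &'static str> {
    HashMap::from([
        ("call_expression", "expression"),
        ("binary_expression", "expression"),
        ("expression", "node"),
    ])
}

/// Direct relationship (assumed argument order: child first, parent second).
fn is_child_of(parents: &HashMap<&str, &str>, child: &str, parent: &str) -> bool {
    parents.get(child) == Some(&parent)
}

/// Reflexive, recursive version: a kind is a descendant of itself and of
/// every kind reachable by following parent links (assumes no cycles).
fn is_descendant_of(parents: &HashMap<&str, &str>, kind: &str, ancestor: &str) -> bool {
    let mut current = kind;
    loop {
        if current == ancestor {
            return true;
        }
        match parents.get(current) {
            Some(&parent) => current = parent,
            None => return false,
        }
    }
}
```

In this model, "call_expression" is a child of "expression" but only a descendant of "node", and every kind is a descendant of itself.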

State and evaluation order

When matching against a single node, Lua snippets in a pattern share the same global state. Therefore, they can interact via global variables. For example, the following pattern is functionally equivalent to let $_ = $_;:

let ${{foo = "bar"; return true}} = ${{foo == "bar"}};

Evaluation order is depth-first, left-to-right. But be careful! It's hard to tell when and how many times a given snippet will execute.

Developer's guide

Build

To install from source, you'll need to install Rust and Cargo. Follow the instructions on the Rust installation page. Then, get the source:

git clone https://github.com/langston-barrett/mogglo
cd mogglo

Finally, build everything:

cargo build --release

You can find binaries in target/release. Run tests with cargo test.

Docs

HTML documentation can be built with mdBook:

cd doc
mdbook build

Release

  • Create a branch with a name starting with release

  • Update CHANGELOG.md

  • Update the version numbers in ./crates/**/Cargo.toml

    find crates/ -type f -name "*.toml" -print0 | \
      xargs -0 sed -E -i 's/^version = "U.V.W"$/version = "X.Y.Z"/'
    
  • Run cargo build --release

  • Commit all changes and push the release branch

  • Check that CI was successful on the release branch

  • Merge the release branch to main

  • git checkout main && git pull origin && git tag -a vX.Y.Z -m vX.Y.Z && git push --tags

  • Verify that the release artifacts work as intended

  • Release the pre-release created by CI

  • Check that the crates were properly uploaded to crates.io

Test

Run end-to-end tests with lit and FileCheck:

cargo build
lit --path=$PWD/test/bin --path=$PWD/target/debug test/