Overview
Mogglo is a multi-language AST-based code search and rewriting tool. Mogglo supports embedding Lua code in search patterns and replacements.
Mogglo focuses on the following goals:
- Minimal setup: Mogglo will work right away on any codebase in a supported language.
- Many languages: 12 and counting!
- High-quality grammars: Mogglo is based on tree-sitter grammars.
- Lua: Mogglo exposes a rich API to embedded Lua snippets.
Introduction
The following examples give a taste of Mogglo. Here's how to find pointless assignments of an expression to itself:
mogglo-rust --detail 'let $x = $x;' ./**/*.rs
The --detail flag helps you understand why something matched; it produces fancy output like:
╭─[./test/nonlinear.rs:4:1]
│
4 │ ╭─▶ let a =
│ │ ┬
│ │ ╰── $x
5 │ ├─▶ a;
│ │ ┬
│ │ ╰──── $x
│ │
│ ╰──────────── let $x = $x;
│
│ Note: Multiple occurrences of $x were structurally equal
───╯
Lua code is wrapped in ${{ and }}. Lua can recursively match patterns with rec.
Here's a pattern to detect out-of-bounds array accesses:
mogglo-rust 'while $i <= $buf.len() { ${{ rec("$buf.get($i)") }} }' ./**/*.rs
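To see why a loop of this shape is suspicious, here is a small, made-up Rust function with the same structure: a `<=` bound on `len()` combined with an access at the loop index. The function name and data are illustrative only and do not come from Mogglo's test suite.

```rust
/// Sums a buffer using the off-by-one loop bound `i <= buf.len()`.
/// Returns the sum and the number of out-of-bounds lookups.
fn sum_with_bad_bound(buf: &[u32]) -> (u32, usize) {
    let mut sum = 0;
    let mut misses = 0;
    let mut i = 0;
    // Bug: the final iteration runs with i == buf.len(), one past the end.
    while i <= buf.len() {
        match buf.get(i) {
            Some(v) => sum += v,
            // `get` returns None out of bounds; `buf[i]` would panic instead.
            None => misses += 1,
        }
        i += 1;
    }
    (sum, misses)
}

fn main() {
    let (sum, misses) = sum_with_bad_bound(&[1, 2, 3]);
    println!("sum = {sum}, out-of-bounds lookups = {misses}");
}
```

Every call performs exactly one lookup past the end of the slice, which is the situation the pattern above is meant to surface.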
Here's how to unroll a simple loop:
mogglo-rust \
'for $i in 0..$h { $b; }' \
--where 'h_num = tonumber(h); return h_num ~= nil and h_num % 4 == 0' \
--replace 'for $i in 0..${{ string.format("%.0f", h / 4) }} { $b; $b; $b; $b; }' \
./**/*.rs
This transformation demonstrates the power of using Lua: it can't be done with regular expression substitutions and would be very difficult with other codemod tools.
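As a rough illustration of what this rewrite does to a program, the sketch below models the loop before and after unrolling, plus the divisibility guard from the --where clause. All function names are hypothetical, and parsing with `u64` only approximates Lua's `tonumber`, which accepts more number formats.

```rust
/// Mirrors the `--where` guard: the bound must be a number divisible by 4.
/// (Approximation: Lua's `tonumber` also accepts floats and hex literals.)
fn can_unroll(h: &str) -> bool {
    matches!(h.parse::<u64>(), Ok(n) if n % 4 == 0)
}

/// The original loop: runs `body` once per iteration, `h` times.
fn original(h: u64, mut body: impl FnMut()) {
    for _ in 0..h {
        body();
    }
}

/// The rewritten loop: `h / 4` iterations with four copies of the body each.
fn unrolled(h: u64, mut body: impl FnMut()) {
    for _ in 0..h / 4 {
        body();
        body();
        body();
        body();
    }
}

fn main() {
    let (mut a, mut b) = (0u64, 0u64);
    original(8, || a += 1);
    unrolled(8, || b += 1);
    println!("original ran the body {a} times, unrolled ran it {b} times");
    println!("can_unroll(\"n\") = {}", can_unroll("n"));
}
```

The two loops agree here because the body ignores the loop index. If the body matched by $b used $i, the unrolled version would see different index values, so a rewrite like this is best reviewed before committing.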
Lua snippets can match and negate patterns, or even compose new patterns dynamically! See the guide for more detailed explanations, examples, and features.
Supported languages
Mogglo currently ships pre-built executables for the following languages:
Additionally, the following can be built from source or via Cargo/crates.io:
Languages are very easy to add, so file an issue or a PR if you want a new one!
Comparison to related tools
Mogglo is not as polished as any of the tools mentioned in this section.
Mogglo is most similar to other multi-language code search and codemod tools.
- Mogglo is similar to ast-grep, but supports more languages and allows embedding Lua in patterns.
- Mogglo is similar to Comby. Comby uses lower-fidelity parsers, but is much more battle-tested and better documented. Mogglo also embeds Lua in patterns.
- Mogglo has less semantic understanding of code (e.g., name resolution) than Semgrep or CodeQL, but is much easier to set up.
There are many excellent language-specific code search and codemod tools; these tend to be more polished but less general than Mogglo.
Installation
Pre-compiled binaries
Pre-compiled binaries are available on the releases page.
Fetching binaries with cURL
You can download binaries with curl like so (replace X.Y.Z with a real version number, LANG with a supported language, and TARGET with your OS):
curl -sSL https://github.com/langston-barrett/mogglo/releases/download/vX.Y.Z/mogglo-LANG_TARGET -o mogglo-LANG
Build from source
To install from source, you'll need to install Rust and Cargo. Follow the instructions on the Rust installation page.
From a release on crates.io
You can build a released version from crates.io. To install the latest release of Mogglo for the language <LANG>, run:
cargo install mogglo-<LANG>
This will automatically download the source from crates.io, build it, and install it in Cargo's global binary directory (~/.cargo/bin/ by default).
From the latest unreleased version on GitHub
To build and install the very latest unreleased version, run:
cargo install --git https://github.com/langston-barrett/mogglo.git mogglo-LANG
From a local checkout
See the developer's guide.
Uninstalling
To uninstall, run cargo uninstall mogglo-<LANG>.
Guide
Mogglo searches for patterns in code. Mogglo patterns consist of code augmented with metavariables and embedded Lua code.
Metavariables
Metavariables match nodes in the syntax tree. For example, the pattern
let $x = ();
finds pointless assignments of the unit value (); the metavariable $x matches any expression.
Multiple uses of the same metavariable imply equality. For example, the pattern
let $x = $x;
finds pointless assignments of an identifier to itself.
The special metavariable $_ matches any syntax node, and multiple uses don't imply equality. For example, $_ == $_ finds an equality comparison between any two expressions.
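One way to picture the difference between $x and $_ is as a table of bindings that is consulted on every use. The sketch below is a simplified model, not Mogglo's implementation: the names `bind` and `matches_let` are hypothetical, and it compares node text, whereas Mogglo compares nodes structurally (as the --detail output notes).

```rust
use std::collections::HashMap;

/// Tries to bind metavariable `name` to the node text `text`.
/// `_` matches anything and is never recorded; any other name must agree
/// with its earlier binding, if it has one.
fn bind(env: &mut HashMap<String, String>, name: &str, text: &str) -> bool {
    if name == "_" {
        return true;
    }
    if let Some(bound) = env.get(name) {
        return bound.as_str() == text;
    }
    env.insert(name.to_string(), text.to_string());
    true
}

/// Matches both sides of `let <lhs> = <rhs>;` against two metavariables.
fn matches_let(lhs_var: &str, rhs_var: &str, lhs: &str, rhs: &str) -> bool {
    let mut env = HashMap::new();
    bind(&mut env, lhs_var, lhs) && bind(&mut env, rhs_var, rhs)
}

fn main() {
    // `let $x = $x;` against `let a = a;` and `let a = b;`
    println!("{}", matches_let("x", "x", "a", "a"));
    println!("{}", matches_let("x", "x", "a", "b"));
    // `let $_ = $_;` against `let a = b;`
    println!("{}", matches_let("_", "_", "a", "b"));
}
```

In this model, repeating $x forces both positions to hold the same text, while repeating $_ places no constraint at all.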
The special metavariable $.. (read "ellipsis") can match any number of sibling nodes in the AST. For example, here's how to find the main function:
fn main() $.. { $.. }
Matching nodes with multiple children
Consider that there are several possible readings of the following pattern:
{ $f($x); $y + $z; }
It might match only blocks with exactly two statements: a call and an addition. It might match a block that contains any number of statements, as long as a call is followed immediately by an addition. In fact, Mogglo interprets this pattern as matching any block that contains any number of statements, including a function call that is followed at some point by an addition.
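The reading Mogglo chooses amounts to an in-order subsequence check over the children of the block. The sketch below models only that ordering rule, using made-up statement labels ("call", "add", "let") in place of real syntax nodes; it is not how Mogglo's matcher is written.

```rust
/// Returns true if `pattern` occurs in `block` as a subsequence:
/// in the same order, but not necessarily adjacent.
fn matches_in_order(pattern: &[&str], block: &[&str]) -> bool {
    let mut remaining = block.iter();
    // Each pattern element consumes block elements until it finds its match.
    pattern
        .iter()
        .all(|want| remaining.any(|have| have == want))
}

fn main() {
    let pattern = ["call", "add"];
    // A call, then something unrelated, then an addition: matches.
    println!("{}", matches_in_order(&pattern, &["let", "call", "let", "add"]));
    // The addition comes before the call: does not match.
    println!("{}", matches_in_order(&pattern, &["add", "call"]));
}
```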
Lua
Lua code is written between curly braces: ${{lua code goes here}}.
See the API reference for details.
Speed
Regular expressions are slow. Don't use them if string matching will do.
Usage
By default, matches are non-recursive:
echo 'let a = { let b = c; c };' | mogglo-rust 'let $x = $y;' -
╭─[-:1:1]
│
1 │ let a = { let b = c; c };
│ ────────────┬────────────
│ ╰────────────── Match
───╯
The --recursive flag requests recursive matches; with it, Mogglo additionally prints:
╭─[-:1:11]
│
1 │ let a = { let b = c; c };
│ ─────┬────
│ ╰────── Match
───╯
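The difference between the two modes can be modeled on a toy tree. The sketch below assumes that the search always descends through nodes that do not match, and descends into a matched node only when recursion is requested; that is inferred from the output above rather than taken from Mogglo's source. The `Node` type and kind labels are made up for the example.

```rust
/// A toy syntax tree node: a kind label plus children.
struct Node {
    kind: &'static str,
    children: Vec<Node>,
}

fn node(kind: &'static str, children: Vec<Node>) -> Node {
    Node { kind, children }
}

/// A rough tree for `let a = { let b = c; c };`.
fn example() -> Node {
    let block = node("block", vec![node("let", vec![]), node("ident", vec![])]);
    node("file", vec![node("let", vec![block])])
}

/// Counts nodes of the given kind, looking inside matches only if `recursive`.
fn count_matches(root: &Node, kind: &str, recursive: bool) -> usize {
    let is_match = root.kind == kind;
    let inner: usize = if is_match && !recursive {
        0
    } else {
        root.children
            .iter()
            .map(|c| count_matches(c, kind, recursive))
            .sum()
    };
    usize::from(is_match) + inner
}

fn main() {
    let tree = example();
    println!("non-recursive: {}", count_matches(&tree, "let", false));
    println!("recursive: {}", count_matches(&tree, "let", true));
}
```

Under these assumptions the default mode reports only the outer let, and the recursive mode reports the nested one as well.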
Contributing
Thank you for your interest in Mogglo! We welcome and appreciate all kinds of contributions. Please feel free to file an issue or open a pull request.
You may want to take a look at:
If you have questions, please ask them on the discussions page!
Lua API reference
Contexts
Lua code is evaluated in three different contexts:
- Patterns: Lua code embedded in ${{}} when matching code
- Replacements: Lua code embedded in ${{}} when replacing code
- Where clauses: Lua code passed to --where
The value produced by code evaluated in patterns and where clauses is treated as a boolean. If code in a pattern evaluates to false or nil, the node is not matched; if the code evaluates to anything else, it is. For example, ${{true}} is equivalent to $_. Code evaluated in a replacement is treated as a string.
The APIs available to code in patterns differ from those available to replacements and where clauses. For example, code in patterns can write to metavariables; code in replacements and where clauses can only read them. In the rest of this guide, (P) denotes an API available only to patterns, and (A) denotes an API available in all contexts.
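The boolean rule for patterns and where clauses follows Lua's usual notion of truthiness, which differs from some other languages: 0 and the empty string count as true. The sketch below models that rule with a hypothetical `LuaValue` enum covering only a few Lua types; it is an illustration, not part of Mogglo's API.

```rust
/// A small subset of Lua values, enough to illustrate truthiness.
enum LuaValue {
    Nil,
    Boolean(bool),
    Number(f64),
    Str(String),
}

/// Lua treats only `false` and `nil` as false; every other value is true.
fn is_truthy(value: &LuaValue) -> bool {
    !matches!(value, LuaValue::Nil | LuaValue::Boolean(false))
}

fn main() {
    let samples = [
        ("nil", LuaValue::Nil),
        ("false", LuaValue::Boolean(false)),
        ("0", LuaValue::Number(0.0)),
        ("\"\"", LuaValue::Str(String::new())),
    ];
    for (label, value) in &samples {
        println!("{label} -> matched: {}", is_truthy(value));
    }
}
```

This is why a snippet can return something other than true or false and still act as a condition: a function such as string.find returns an index on success and nil on failure.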
Globals
(P) Lua code has access to a global variable t that holds the text of the syntax node in question. For example, this pattern finds let-bound variables that contain the letter x:
let ${{string.find(t, "x")}} = $_;
(A) All other metavariables are bound to globals; the pattern author is responsible for not clobbering other important globals.
Conventions
In the remainder of this document:
- Option<T> means T or nil.
- If the return type is omitted, it is nil.
Functions
- bind(String), (P): Binds a metavariable to the current node
  - 1st argument: Metavariable name (without the $)
  - Example: ${{bind("x")}} is equivalent to $x if $x is not yet bound
  - Note: This function can overwrite existing bindings; use with care
- match(String) -> bool, (P): Matches the current node against a pattern
  - 1st argument: A pattern
  - Returns: Whether or not the current node matches the pattern
  - Example: Patterns can be negated with match: ${{not match("<pattern>")}}, e.g., ${{not match("${{false}}")}} is equivalent to ${{true}}.
  - Note: Metavariables in the argument pattern are inherited from the overall pattern; variables bound inside the argument pattern are not bound outside of it.
- meta(String) -> Option<String>, (A): Returns the binding for a metavariable
  - 1st argument: Metavariable name (without the $)
  - Returns: Value of the metavariable, or nil if there is none
  - Example: ${{meta("x") == t}} is roughly equivalent to $x if $x is already bound (though not exactly: it matches textually instead of structurally)
- rec(String) -> bool, (P): Recursively matches all descendants of the current node against a pattern
  - 1st argument: A pattern
  - Returns: Whether or not some descendant of the current node matches the pattern
  - Example: let $x = ${{rec("$x")}} + $y; matches let a = (b + a) + c;
  - Note: Metavariables in the argument pattern are inherited from the overall pattern; variables bound inside the argument pattern are not bound outside of it.
- rx(String, String) -> bool, (A): Returns whether its first argument is a regular expression that matches its second.
  - 1st argument: Regular expression
  - 2nd argument: String to be matched
  - Returns: Whether the regex matched the string
Nodes
In addition to the "textual" API given by the t variable, Lua code has access to a "structured" API for AST nodes. The type of node objects is denoted Node. The "current node" is stored in the global focus.
Node methods:
- child(int) -> Option<Node>: see upstream docs
- child_count() -> int: see upstream docs
- kind() -> String: see upstream docs
- next_named_sibling() -> Option<Node>: see upstream docs
- next_sibling() -> Option<Node>: see upstream docs
- prev_named_sibling() -> Option<Node>: see upstream docs
- prev_sibling() -> Option<Node>: see upstream docs
- parent() -> Option<Node>: see upstream docs
- text() -> String: Return the text of the node
Node kinds
Each node in a tree-sitter parse tree has a kind, e.g., "binary expression" or "compound statement". Some of these kinds are children of each other, e.g., "call expression" might be a child of "expression". The following functions can query such relationships:
- is_child_of(String, String) -> bool, (A)
- is_descendant_of(String, String) -> bool, (A): Recursive, reflexive version of is_child_of
- is_parent_of(String, String) -> bool, (A)
- is_ancestor_of(String, String) -> bool, (A): Recursive, reflexive version of is_parent_of
See grammar.js and node-types.json for the grammar in question for a list of possible node kinds (or just use the pattern ${{print(focus:kind())}}).
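To make "recursive, reflexive" concrete, here is a sketch of such a relation over a made-up table of kinds. The table contents, the function names, and the (child, parent) argument order are all assumptions for illustration; check the real grammar and Mogglo's behavior before relying on any of them.

```rust
use std::collections::HashMap;

type KindTable = HashMap<&'static str, Vec<&'static str>>;

/// Hypothetical table mapping each kind to the kinds it is a direct child of.
fn parents() -> KindTable {
    HashMap::from([
        ("call_expression", vec!["expression"]),
        ("binary_expression", vec!["expression"]),
        ("expression", vec!["statement"]),
    ])
}

/// Direct relationship only: `child` is listed under `parent`.
fn is_child_kind(table: &KindTable, child: &str, parent: &str) -> bool {
    table
        .get(child)
        .map_or(false, |ps| ps.iter().any(|&p| p == parent))
}

/// Reflexive (a kind is its own descendant) and recursive (follows chains).
fn is_descendant_kind(table: &KindTable, kind: &str, ancestor: &str) -> bool {
    kind == ancestor
        || table.get(kind).map_or(false, |ps| {
            ps.iter().any(|&p| is_descendant_kind(table, p, ancestor))
        })
}

fn main() {
    let table = parents();
    println!("{}", is_child_kind(&table, "call_expression", "statement"));
    println!("{}", is_descendant_kind(&table, "call_expression", "statement"));
    println!("{}", is_descendant_kind(&table, "expression", "expression"));
}
```

The direct check fails for a two-step relationship, while the recursive, reflexive version accepts it and also accepts a kind compared with itself.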
State and evaluation order
When matching against a single node, Lua snippets in a pattern share the same global state. Therefore, they can interact via global variables. For example, the pattern let $_ = $_; is functionally equivalent to the following:
let ${{foo = "bar"; return true}} = ${{foo == "bar"}};
Evaluation order is depth-first, left-to-right. But be careful! It's hard to tell when and how many times a given snippet will execute.
Developer's guide
Build
To install from source, you'll need to install Rust and Cargo. Follow the instructions on the Rust installation page. Then, get the source:
git clone https://github.com/langston-barrett/mogglo
cd mogglo
Finally, build everything:
cargo build --release
You can find binaries in target/release. Run tests with cargo test.
Docs
HTML documentation can be built with mdBook:
cd doc
mdbook build
Release
- Create a branch with a name starting with release
- Update CHANGELOG.md
- Update the version numbers in ./crates/**/Cargo.toml
  find crates/ -type f -name "*.toml" -print0 | xargs -0 sed -E -i 's/^version = "U.V.W"$/version = "X.Y.Z"/'
- Run cargo build --release
- Commit all changes and push the release branch
- Check that CI was successful on the release branch
- Merge the release branch to main
- git checkout main && git pull origin && git tag -a vX.Y.Z -m vX.Y.Z && git push --tags
- Verify that the release artifacts work as intended
- Release the pre-release created by CI
- Check that the crates were properly uploaded to crates.io
Test
Run end-to-end tests with lit and FileCheck:
cargo build
lit --path=$PWD/test/bin --path=$PWD/target/debug test/