Shawn Hurley 4ece0f0dc7
Fix proof conformance: correct origin assignment, receiver type-chain, and bare identifier refs
Three issues caused completeness loss against the ast-index proof:

1. determine_origin assigned Local origin to same-file non-selector refs
   (types, functions, variables). The proof's Case 5' requires Global
   origin for type-level defs. Changed step 3 to return Global, preserving
   the name-matching check to maintain Go's local-shadows-dot-import rule.

2. Method receivers were not extracted as parameter defs, so selectors
   like b.ID inside method bodies couldn't resolve through the receiver's
   type. Removed the skip_first logic in extract_param_defs so receivers
   enter def_spans and enable type-chain resolution (Case 6).

3. Bare identifier references (same-file function calls, variable reads)
   were not emitted as SymbolRefs. Added an identifier handler to
   collect_refs_recursive with parent-kind checks to skip definition
   sites. Removed the now-redundant call_expression dot-import handler
   and the duplicate composite_literal type_identifier ref push.
2026-04-23 12:19:43 -04:00

ast-index-golang

Go language indexer built on ast-index, using tree-sitter and tree-sitter-go for parsing.

Given a Go project with a go.mod, it parses every .go file, extracts definitions, imports, and references, and feeds them into ast-index's ProjectIndex for cross-file symbol tracing and querying. No Go toolchain is required -- tree-sitter-go parses Go source directly.

Installation

cargo build --release

The binary is at target/release/ast-index-golang.

CLI Usage

The tool has two subcommands: index (build and summarize) and query (build, then accept queries via stdin).

index -- Build and Summarize

ast-index-golang index /path/to/go/project

Prints a JSON summary to stdout:

{
  "type": "index_summary",
  "files_scanned": 7,
  "files_cached": 7,
  "files_failed": 0,
  "unresolved_modules": ["fmt"],
  "errors": []
}

unresolved_modules lists external dependencies (stdlib, third-party) that aren't part of the project. These can be resolved with dependency resolution flags (see Dependency Resolution).

query -- Interactive Query Mode

ast-index-golang query /path/to/go/project

After printing the index summary, the tool reads JSONL from stdin (one JSON object per line) and prints results to stdout.

Query Actions

query -- Search for Symbols

Find symbol definitions and their usage sites across the project.

{"action": "query", "pattern": "*.Handler", "symbol_kind": "TypeDef"}

Fields:

Field        Required  Description
action       yes       "query"
pattern      yes       Dot-separated search pattern (supports wildcards)
symbol_kind  no        Filter by kind (see Symbol Kinds)
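
As a rough illustration of the request format, the sketch below builds JSONL lines for the query action using only the Rust standard library. The helper names json_escape and query_line are hypothetical and not part of ast-index-golang; a real client would more likely use a JSON library.

```rust
/// Escape a string so it can be embedded in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build one JSONL request line for the `query` action.
/// `symbol_kind` is optional, matching the Fields table above.
fn query_line(pattern: &str, symbol_kind: Option<&str>) -> String {
    match symbol_kind {
        Some(kind) => format!(
            r#"{{"action": "query", "pattern": "{}", "symbol_kind": "{}"}}"#,
            json_escape(pattern),
            json_escape(kind)
        ),
        None => format!(
            r#"{{"action": "query", "pattern": "{}"}}"#,
            json_escape(pattern)
        ),
    }
}

fn main() {
    // Each printed line is one request for the tool's stdin.
    println!("{}", query_line("*.Handler", Some("TypeDef")));
    println!("{}", query_line("fmt.*", None));
}
```

Each line produced this way can be written to the stdin of ast-index-golang query /path/to/go/project, one request per line.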

Response:

{
  "type": "query_result",
  "pattern": "*.Handler",
  "traces": [
    {
      "definition": {
        "kind": "project_file",
        "file": "handlers/handler.go",
        "package": null,
        "span": {"start": 133, "end": 190},
        "symbol": {
          "name": "Handler",
          "qualified_name": "github.com/myproject/handlers.Handler",
          "kind": "TypeDef",
          "exported": true
        }
      },
      "chain": [
        {
          "file": "api/api.go",
          "kind": "reexport",
          "span": {"start": 173, "end": 199},
          "original_name": "Handler"
        }
      ],
      "usage_sites": [
        {
          "file": "api/api.go",
          "span": {"start": 192, "end": 199},
          "ref_kind": "Type"
        },
        {
          "file": "main.go",
          "span": {"start": 787, "end": 794},
          "ref_kind": "Type"
        }
      ]
    }
  ]
}

Each trace contains:

  • definition -- where the symbol is defined (project_file, dependency, or unresolved)
  • chain -- how the symbol reaches the usage site (re-exports, imports)
  • usage_sites -- every place the symbol is referenced in the project

analyze_file -- Single File Analysis

Get the full structure of one file without using the index.

{"action": "analyze_file", "file": "main.go"}

Returns imports, definitions, and references for that file:

{
  "type": "analyze_result",
  "file": "main.go",
  "package": "github.com/myproject",
  "imports": [
    {"source": "fmt", "span": {"start": 24, "end": 29}},
    {"source": "github.com/example/lib", "span": {"start": 31, "end": 55}}
  ],
  "definitions": [
    {
      "name": "main",
      "qualified_name": "github.com/myproject.main",
      "kind": "Function",
      "exported": false,
      "span": {"start": 58, "end": 200}
    }
  ],
  "references": [
    {
      "name": "Println",
      "qualified_name": "fmt.Println",
      "span": {"start": 80, "end": 87},
      "usage": "Read",
      "origin": "Import",
      "qualifier": "fmt"
    }
  ]
}

list_packages -- List Indexed Packages

{"action": "list_packages"}

Response:

{
  "type": "packages",
  "packages": [
    {"name": "github.com/myproject", "file_count": 1},
    {"name": "github.com/myproject/api", "file_count": 1},
    {"name": "github.com/myproject/handlers", "file_count": 2},
    {"name": "github.com/myproject/models", "file_count": 2}
  ]
}

Search Patterns

Patterns are dot-separated and match against fully qualified symbol names. Go qualified names use the format <import_path>.<Symbol> for top-level symbols and <import_path>.<Type>.<Member> for methods and fields.

Pattern                           Matches
*.Handler                         Any symbol named Handler in any package
fmt.Println                       Exactly fmt.Println
fmt.*                             All symbols in the fmt package
*.Server.*                        All members (methods, fields) of any type named Server
github.com/myproject/models.User  The User type in the models package
*.New*                            Any symbol starting with New (factory functions)
Function|TypeDef                  Regex alternation on symbol names

Wildcards:

  • * matches any sequence of characters within a single name segment
  • Patterns are matched against both name and qualified_name
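
To make the wildcard rule concrete, here is a minimal sketch of matching a single name segment against a pattern containing *. It is an illustration only: segment_matches is a hypothetical helper, it assumes * may also match an empty run, and it does not cover how the tool splits a qualified name into segments or how regex alternation is handled.

```rust
/// Match one name segment against a pattern in which `*` stands for any
/// (possibly empty) run of characters. Iterative glob with backtracking.
fn segment_matches(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0usize, 0usize);
    // (pattern index just after the last '*', name index that '*' has reached)
    let mut star: Option<(usize, usize)> = None;

    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            // Remember the star and first try to let it match nothing.
            star = Some((pi + 1, ni));
            pi += 1;
        } else if pi < p.len() && p[pi] == n[ni] {
            pi += 1;
            ni += 1;
        } else if let Some((sp, sn)) = star {
            // Backtrack: let the last '*' absorb one more character.
            pi = sp;
            ni = sn + 1;
            star = Some((sp, sn + 1));
        } else {
            return false;
        }
    }
    // Whatever is left of the pattern must consist of '*' only.
    p[pi..].iter().all(|&c| c == '*')
}

fn main() {
    for (pattern, name) in [("New*", "NewServer"), ("*", "Handler"), ("Handler", "Handlers")] {
        println!("{pattern} vs {name}: {}", segment_matches(pattern, name));
    }
}
```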

Symbol Kinds

Use these values for the symbol_kind filter:

Kind       Go Construct
Function   func Foo(), func (r *T) Foo(), interface method specs
TypeDef    type Foo struct{}, type Foo interface{}
TypeAlias  type X = Y, type X Y
Variable   var x int
Const      const X = 1
Field      Struct fields

Dependency Resolution

By default, external imports (stdlib and third-party) appear in unresolved_modules and cannot be queried. To enable dependency resolution, provide paths to the Go module cache and/or GOROOT:

# Resolve both stdlib and third-party dependencies
ast-index-golang query \
  --goroot /usr/local/go \
  --deps-dir ~/go/pkg/mod \
  /path/to/project

# Omit --goroot: GOROOT falls back to the $GOROOT environment variable (stdlib resolves only if it is set)
ast-index-golang query \
  --deps-dir ~/go/pkg/mod \
  /path/to/project

# Use GOMODCACHE environment variable for the module cache path
ast-index-golang query \
  --deps-dir "$GOMODCACHE" \
  --goroot "$GOROOT" \
  /path/to/project

Flags

Flag               Description                                                      Fallback
--deps-dir <PATH>  Path to the Go module cache (e.g., ~/go/pkg/mod or $GOMODCACHE)  None -- third-party deps not resolved
--goroot <PATH>    Path to the Go installation root (e.g., /usr/local/go)           $GOROOT env var

How It Works

Dependencies are resolved lazily at query time. When a query matches an unresolved module, the indexer:

  1. Determines if the import is stdlib (no dots in first path segment) or third-party
  2. For stdlib: looks in $GOROOT/src/<import_path>/
  3. For third-party: parses go.mod require directives to find the version, then looks in <deps_dir>/<module>@<version>/
  4. Finds all .go files in the directory (skips _test.go, dotfiles, _-prefixed files)
  5. Parses each file with tree-sitter and extracts exported definitions only
  6. Returns the definitions to the query engine, which caches them for subsequent queries
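
Steps 1 through 4 can be sketched roughly as follows. This is not the code in resolve.rs; the function names are hypothetical and the sketch only mirrors the rules as stated in the list above (it ignores go.mod parsing, packages nested below the module root, and case-encoding of the module path).

```rust
use std::path::{Path, PathBuf};

/// Step 1: treat an import as stdlib when its first path segment has no dot.
fn is_stdlib(import_path: &str) -> bool {
    let first = import_path.split('/').next().unwrap_or("");
    !first.contains('.')
}

/// Step 2: directory that holds a stdlib package's sources.
fn stdlib_dir(goroot: &Path, import_path: &str) -> PathBuf {
    goroot.join("src").join(import_path)
}

/// Step 3: root directory of a third-party module at a known version.
fn module_dir(deps_dir: &Path, module: &str, version: &str) -> PathBuf {
    deps_dir.join(format!("{}@{}", module, version))
}

/// Step 4: keep .go files, skipping tests, dotfiles, and _-prefixed files.
fn is_candidate_file(file_name: &str) -> bool {
    file_name.ends_with(".go")
        && !file_name.ends_with("_test.go")
        && !file_name.starts_with('.')
        && !file_name.starts_with('_')
}

fn main() {
    let goroot = Path::new("/usr/local/go");
    let deps = Path::new("/home/user/go/pkg/mod");

    for import in ["net/http", "github.com/gin-gonic/gin"] {
        if is_stdlib(import) {
            println!("{import}: stdlib at {}", stdlib_dir(goroot, import).display());
        } else {
            // The version would normally come from go.mod; hard-coded here.
            println!("{import}: module at {}", module_dir(deps, import, "v1.9.1").display());
        }
    }
    println!("server.go kept: {}", is_candidate_file("server.go"));
    println!("server_test.go kept: {}", is_candidate_file("server_test.go"));
}
```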

Dependency definitions appear as "kind": "dependency" in query results:

{
  "definition": {
    "kind": "dependency",
    "file": null,
    "package": "fmt",
    "span": null,
    "symbol": {
      "name": "Println",
      "qualified_name": "fmt.Println",
      "kind": "Function",
      "exported": true
    }
  },
  "chain": [],
  "usage_sites": [
    {"file": "main.go", "span": {"start": 121, "end": 128}, "ref_kind": "Read"}
  ]
}

Module Cache Layout

The tool expects the standard Go module cache layout:

$GOMODCACHE/
  github.com/
    gin-gonic/
      gin@v1.9.1/
        *.go
        middleware/
          *.go

$GOROOT/
  src/
    fmt/
      print.go
    net/
      http/
        server.go

Third-party module versions are resolved from the project's go.mod require block. If a module is not listed in require, the tool falls back to scanning the cache directory for the latest available semver version.
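
The fallback scan could look something like the sketch below, which picks the highest version among directory names such as gin@v1.9.1. The helper names are hypothetical, and the sketch makes a simplifying assumption: only plain vMAJOR.MINOR.PATCH versions are considered, so pre-release or +incompatible suffixes are skipped. The tool's actual ordering rules may differ.

```rust
/// Parse "vMAJOR.MINOR.PATCH" into a comparable tuple.
/// Anything with extra suffixes or components is rejected in this sketch.
fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.strip_prefix('v')?.split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Given cache directory names like "gin@v1.9.1", return the highest
/// version string available for the directory leaf `module_leaf`.
fn latest_version<'a>(dir_names: &[&'a str], module_leaf: &str) -> Option<&'a str> {
    dir_names
        .iter()
        .filter_map(|&name| {
            let (leaf, version) = name.split_once('@')?;
            if leaf != module_leaf {
                return None;
            }
            Some((parse_version(version)?, version))
        })
        // Compare numerically so that v1.10.0 sorts above v1.9.1.
        .max_by_key(|(parsed, _)| *parsed)
        .map(|(_, version)| version)
}

fn main() {
    let dirs = ["gin@v1.9.1", "gin@v1.10.0", "gin@v1.2.3", "other@v9.0.0"];
    println!("{:?}", latest_version(&dirs, "gin"));
}
```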

The Go module cache case-encoding convention is supported (uppercase letters encoded as ! + lowercase, e.g., github.com/Azure/... stored as github.com/!azure/...).
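
For reference, the case-encoding rule amounts to the small transformation sketched below. The function name escape_module_path is hypothetical and is not taken from this crate; the sketch only handles the uppercase rule described above.

```rust
/// Encode a module path the way the Go module cache stores it on disk:
/// every ASCII uppercase letter becomes '!' followed by its lowercase form.
fn escape_module_path(path: &str) -> String {
    let mut out = String::with_capacity(path.len());
    for c in path.chars() {
        if c.is_ascii_uppercase() {
            out.push('!');
            out.push(c.to_ascii_lowercase());
        } else {
            out.push(c);
        }
    }
    out
}

fn main() {
    println!("{}", escape_module_path("github.com/Azure/azure-sdk-for-go"));
    println!("{}", escape_module_path("github.com/BurntSushi/toml"));
}
```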

Library Usage

The indexer can be embedded as a Rust library. Add the dependency:

[dependencies]
ast-index-golang = { path = "../ast-index-languages/golang" }
ast-index = { path = "../ast-index" }

Basic Indexing

use std::path::Path;
use ast_index::ProjectIndex;
use ast_index::trace::QueryParams;
use ast_index_golang::analyzer::GoAnalyzer;
use ast_index_golang::lang::*;

type GoIndex = ProjectIndex<GoPackageData, GoImportData, GoDefData, GoRefData, GoAnalyzer>;

// Build the index
let path = Path::new("/path/to/go/project");
let analyzer = GoAnalyzer::new(path.to_path_buf());
let index = GoIndex::new(analyzer);
let stats = index.build(path);

println!("Indexed {} files", stats.files_cached);
println!("Unresolved: {:?}", stats.unresolved_modules);

// Query for a symbol
let params = QueryParams {
    symbol_kind: None,
    search_pattern: "*.Handler".to_string(),
};
let traces = index.params(params).query().unwrap();
for trace in &traces {
    println!("{}: {} usage sites",
        trace.definition.name(),
        trace.usage_sites.len()
    );
}

With Dependency Resolution

use std::path::PathBuf;

let analyzer = GoAnalyzer::with_deps(
    PathBuf::from("/path/to/project"),
    Some(PathBuf::from("/home/user/go/pkg/mod")),  // deps_dir
    Some(PathBuf::from("/usr/local/go")),           // goroot
);
let index = GoIndex::new(analyzer);
index.build(Path::new("/path/to/project"));

// Now queries for "fmt.Println" will resolve to stdlib definitions
let traces = index
    .params(QueryParams {
        symbol_kind: Some(ast_index::SymbolKindTag::Function),
        search_pattern: "fmt.Println".to_string(),
    })
    .query()
    .unwrap();

Single File Analysis

use std::path::Path;
use ast_index::FileAnalyzer;
use ast_index_golang::parser::parse_go;
use ast_index_golang::extraction::extract;
use ast_index_golang::conversion::to_analysis_result;

let source = r#"
package main

import "fmt"

func main() {
    fmt.Println("hello")
}
"#;

let tree = parse_go(source).unwrap();
let extraction = extract(&tree, source);
let result = to_analysis_result(
    &extraction,
    Path::new("main.go"),
    &Some("github.com/myproject".to_string()),
);

for def in &result.symbol_defs {
    println!("{} ({:?})", def.name, def.kind.tag());
}
for imp in &result.imports {
    println!("import {}", imp.source);
}
for r in &result.symbol_refs {
    println!("ref {} ({:?})", r.name, r.usage);
}

Dependency Extraction Only

Extract exported definitions from Go source without building a full index, useful for custom dependency resolution pipelines:

use ast_index_golang::parser::parse_go;
use ast_index_golang::extraction::extract_exported_defs;
use ast_index_golang::conversion::to_dependency_defs;

let source = r#"
package lib

type Client struct { BaseURL string }
func NewClient(url string) *Client { return nil }
func (c *Client) Get(path string) error { return nil }
func unexported() {}
"#;

let tree = parse_go(source).unwrap();
let (package_name, raw_defs) = extract_exported_defs(&tree, source);
let sym_defs = to_dependency_defs(&raw_defs, "github.com/example/lib");

// Only exported symbols: Client, NewClient, Get, BaseURL
for def in &sym_defs {
    println!("{} ({})", def.name, def.qualified_name.as_deref().unwrap_or(""));
}

Go-to-ast-index Type Mapping

Go Construct              SymbolKind  Notes
func Foo()                Function    Exported status derived from capitalization
func (r *T) Foo()         Function    Receiver in GoDefData.receiver
type Foo struct{}         TypeDef     GoDefData.is_interface = false
type Foo interface{}      TypeDef     GoDefData.is_interface = true
type X = Y                TypeAlias   True alias with =
type X Y                  TypeAlias   Named/defined type (no =)
var x int                 Variable
const X = 1               Const
Struct field Name string  Field
Interface method Read()   Function    GoDefData.is_interface = true
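
The capitalization rule behind the exported flag can be approximated as below. The helper is_exported is a hypothetical name, not the crate's API. Note that Rust's char::is_uppercase covers a slightly broader set than Go's "Unicode uppercase letter" class, so this is an approximation for unusual identifiers.

```rust
/// Go exports an identifier when its first character is an uppercase letter.
/// Approximation: uses Rust's Unicode `Uppercase` property.
fn is_exported(name: &str) -> bool {
    name.chars().next().map_or(false, |c| c.is_uppercase())
}

fn main() {
    for name in ["NewClient", "unexported", "_Hidden", "BaseURL"] {
        println!("{name}: exported = {}", is_exported(name));
    }
}
```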

Qualified Name Format

Symbol            Format                Example
Plain function    pkg.Function          github.com/myproject.NewServer
Method            pkg.Type.Method       github.com/myproject.Server.Start
Interface method  pkg.Interface.Method  github.com/myproject.Reader.Read
Struct field      pkg.Struct.Field      github.com/myproject.Config.Host
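
Read as a construction rule, the table above corresponds to something like the following sketch. The function qualified_name and its owner parameter are illustrative assumptions, not the crate's API.

```rust
/// Build a qualified name from an import path, an optional owning type
/// (struct or interface), and the symbol's own name.
fn qualified_name(pkg: &str, owner: Option<&str>, name: &str) -> String {
    match owner {
        Some(ty) => format!("{}.{}.{}", pkg, ty, name),
        None => format!("{}.{}", pkg, name),
    }
}

fn main() {
    println!("{}", qualified_name("github.com/myproject", None, "NewServer"));
    println!("{}", qualified_name("github.com/myproject", Some("Server"), "Start"));
}
```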

Configuration

Variable    Description
RUST_LOG    Controls tracing output to stderr. Set to debug or trace for verbose logging. Example: RUST_LOG=debug ast-index-golang index .
GOROOT      Fallback for the --goroot flag when it is not provided. Typically set by the Go installation.
GOMODCACHE  Common source for the --deps-dir value. Defaults to $GOPATH/pkg/mod (~/go/pkg/mod when GOPATH is unset). Find yours with go env GOMODCACHE.

Build & Test

cargo build
cargo test

All tests run against fixture projects in tests/fixtures/. No external Go toolchain is required.

Project Structure

src/
  main.rs          CLI entry point. Two subcommands: index, query.
  lib.rs           Module declarations.
  lang.rs          Go-specific type parameters for ast-index generics.
  parser.rs        Wrapper: parse_go(content) -> tree_sitter::Tree.
  extraction.rs    CST walking. Extracts RawDef, RawImport, RawRef.
  conversion.rs    Maps extraction types to ast-index types.
  resolve.rs       go.mod parsing, import resolution, dependency
                     directory resolution.
  analyzer.rs      GoAnalyzer (FileAnalyzer impl). Orchestrates the
                     pipeline and implements dependency resolution.
  io.rs            Serde JSON types for CLI input/output.

tests/
  integration.rs   177 tests covering Go semantics, query API, and
                     dependency resolution.
  fixtures/
    simple/        Minimal 2-file Go project.
    comprehensive/ 7-file Go project with 5 packages.
    typechain/     3-file project for type-chain resolution.
    deps_project/  Test project for dependency resolution.
    fake_goroot/   Fake GOROOT with minimal fmt and net/http.
    fake_gomodcache/ Fake module cache with a versioned third-party module.

Known Limitations

  • replace directives not supported: go.mod replace blocks that redirect module paths to local directories are not yet handled.
  • Build constraints not filtered: Files with //go:build tags are parsed regardless of target OS/architecture.
  • Range clause variables not extracted: for i, v := range items loop variables are not extracted as local definitions.

License

Apache-2.0