c-sharp-indexer

A C# source code indexer built on ast-index. Parses C# source files using tree-sitter, extracts definitions, references, and imports with full C# language semantics, builds a cross-file index, and provides an interactive query REPL. Supports lazy dependency resolution from .NET SDK and NuGet assemblies.

Given a C# project, the indexer answers questions like:

  • Where is MyApp.Models.User defined and where is it used across the codebase?
  • What types and members are in the MyApp.Services namespace?
  • Who calls ProcessOrder() and from which files?
  • What does a file import via using directives, and how do those types flow through the project?

Quick Start

# Build
cargo build --release

# Index a project and enter the query REPL
./target/release/c-sharp-indexer /path/to/your/csharp/project

# With .NET SDK dependency resolution
./target/release/c-sharp-indexer /path/to/project \
  --deps-path /usr/local/share/dotnet/packs/Microsoft.NETCore.App.Ref/9.0.5/ref/net9.0/

On startup, the indexer scans all .cs files, builds the cross-file index, and drops you into an interactive REPL:

Indexing C# files in: /path/to/project
Dependency paths: 1 entries
  /usr/local/share/dotnet/packs/Microsoft.NETCore.App.Ref/9.0.5/ref/net9.0/
  DLLs found: 168
Index built:
  Files scanned:  24
  Files cached:   24
  Files failed:   0
  Packages found: 3
    MyApp.Models (4 files, 18 definitions)
    MyApp.Services (6 files, 32 definitions)
    MyApp.Data (3 files, 14 definitions)
  Unresolved imports: 2
    System.Text.Json
    Microsoft.Extensions.Logging

Ready for queries. Enter JSON, one per line:
  {"pattern": "MyApp.Models.User", "kind": "TypeDef"}
Press Ctrl+D to exit.

Installation

Requires Rust 1.85+ (edition 2024).

git clone <repo>
cd ast-index-languages/c-sharp-indexer
cargo build --release

The binary is at ./target/release/c-sharp-indexer.

CLI Reference

c-sharp-indexer [OPTIONS] <FOLDER>

Arguments

| Argument | Required | Description |
|---|---|---|
| FOLDER | Yes | Path to the C# project root to index |

Options

| Flag | Description |
|---|---|
| --deps-path <DIR> | Path to a directory of .NET assembly DLLs (SDK reference assemblies, NuGet package libs). Searched recursively for .dll files. Can be specified multiple times. |
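The recursive search for .dll files can be pictured with a small standard-library walk. This is a minimal sketch, not the crate's scanner: find_dlls is a hypothetical helper, and it assumes a case-insensitive match on the extension.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every `.dll` file under `dir`.
/// ASSUMPTION: hypothetical helper for illustration; the real scanner
/// lives in deps/scanner.rs and may behave differently.
fn find_dlls(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Descend into subdirectories (symlinked directories are followed).
            find_dlls(&path, out)?;
        } else if path
            .extension()
            .and_then(|ext| ext.to_str())
            .map_or(false, |ext| ext.eq_ignore_ascii_case("dll"))
        {
            out.push(path);
        }
    }
    Ok(())
}
```

Calling find_dlls once per --deps-path value and concatenating the results would give a count comparable to the "DLLs found" line in the startup output.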

Logging is controlled via the RUST_LOG environment variable (default: warn).

Examples

# Basic usage -- index project files only
c-sharp-indexer /path/to/project

# With .NET 9 SDK reference assemblies
c-sharp-indexer /path/to/project \
  --deps-path /usr/local/share/dotnet/packs/Microsoft.NETCore.App.Ref/9.0.5/ref/net9.0/

# SDK + NuGet package
c-sharp-indexer /path/to/project \
  --deps-path /usr/local/share/dotnet/packs/Microsoft.NETCore.App.Ref/9.0.5/ref/net9.0/ \
  --deps-path ~/.nuget/packages/newtonsoft.json/13.0.1/lib/netstandard2.0/

# SDK + ASP.NET Core
c-sharp-indexer /path/to/project \
  --deps-path /usr/local/share/dotnet/packs/Microsoft.NETCore.App.Ref/9.0.5/ref/net9.0/ \
  --deps-path /usr/local/share/dotnet/packs/Microsoft.AspNetCore.App.Ref/9.0.5/ref/net9.0/

# Verbose logging to see dependency resolution
RUST_LOG=debug c-sharp-indexer /path/to/project \
  --deps-path /usr/local/share/dotnet/packs/Microsoft.NETCore.App.Ref/9.0.5/ref/net9.0/

# Pipe a query (non-interactive)
echo '{"pattern": "MyApp.Models.User", "kind": "TypeDef"}' | c-sharp-indexer /path/to/project

Query Reference

Queries are JSON objects, one per line, entered at the REPL prompt.

Format

{"pattern": "<search pattern>", "kind": "<optional kind filter>"}

Search Patterns

| Pattern | Meaning | Use case |
|---|---|---|
| MyApp.Models.User | Exact fully-qualified name | Find one specific type |
| MyApp.Models.* | All symbols in a namespace | Browse a namespace |
| MyApp.Models.User.* | All members of a type | See methods, properties, fields |
| *.User | Symbol named "User" in any namespace | Find by short name |
| MyApp.*.Run | Method named "Run" in any type | Find across namespaces |
| * | Everything | All symbols of a given kind |
| MyApp.Models.Status.Active | Enum member by FQN | Find specific constants |
| MyApp.Models.User.Settings | Nested type by FQN | Navigate nested types |
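To make the wildcard rows concrete, here is a minimal sketch of a matcher that reproduces the table. It assumes that * stands for any run of characters, dots included; pattern_matches is a hypothetical function, and the matcher inside ast-index may apply different rules (for example, per-segment matching).

```rust
/// Returns true if `name` matches `pattern`, where `*` stands for any
/// (possibly empty) run of characters, dots included.
/// ASSUMPTION: illustrative matcher only, not ast-index's implementation.
fn pattern_matches(pattern: &str, name: &str) -> bool {
    let parts: Vec<&str> = pattern.split('*').collect();
    if parts.len() == 1 {
        return pattern == name; // no wildcard: exact match
    }
    let first = parts[0];
    let last = parts[parts.len() - 1];
    // Text before the first `*` and after the last `*` is anchored.
    if name.len() < first.len() + last.len()
        || !name.starts_with(first)
        || !name.ends_with(last)
    {
        return false;
    }
    // Pieces between wildcards must appear in order in the remaining text.
    let mut rest = &name[first.len()..name.len() - last.len()];
    for piece in &parts[1..parts.len() - 1] {
        match rest.find(piece) {
            Some(pos) => rest = &rest[pos + piece.len()..],
            None => return false,
        }
    }
    true
}
```

Under this assumption, *.User matches MyApp.Models.User but not MyApp.Models.UserService, and MyApp.*.Run requires at least one segment between MyApp and Run.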

Kind Filters

| Kind | Aliases | Matches |
|---|---|---|
| typedef | type, class, struct, interface, record | All type definitions (classes, structs, records, interfaces) |
| enum | | Enum types |
| enum_member | enummember | Individual enum constants |
| function | method | Methods (including delegates) |
| constructor | ctor | Constructors |
| property | prop | Properties, events with accessors |
| field | | Fields, field-like events |
| const | constant | Compile-time constants |
| variable | var | Local variables (for type-chain resolution) |

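To distinguish between class/struct/record/interface within typedef results, check the type_kind field in the response (Class, Struct, Record, RecordStruct, Interface). Delegates also carry a type_kind (Delegate), but they are indexed as functions, so they appear under the function kind rather than typedef.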
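The alias table amounts to a lookup from user input to a canonical kind. The sketch below is one way to write it; canonical_kind is a hypothetical name, and the case-insensitive matching is an assumption based on the examples in this README using both "TypeDef" and "function".

```rust
/// Maps a user-supplied kind string to the canonical kind name.
/// ASSUMPTION: matching is case-insensitive; the real parser in main.rs
/// may be stricter or accept further spellings.
fn canonical_kind(input: &str) -> Option<&'static str> {
    match input.to_ascii_lowercase().as_str() {
        "typedef" | "type" | "class" | "struct" | "interface" | "record" => Some("typedef"),
        "enum" => Some("enum"),
        "enum_member" | "enummember" => Some("enum_member"),
        "function" | "method" => Some("function"),
        "constructor" | "ctor" => Some("constructor"),
        "property" | "prop" => Some("property"),
        "field" => Some("field"),
        "const" | "constant" => Some("const"),
        "variable" | "var" => Some("variable"),
        _ => None, // unknown kinds are rejected rather than guessed
    }
}
```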

Output Format

Each query returns an object with the query echo and an array of traces. A trace connects a symbol's definition to all the places it's used:

{
  "query": {"pattern": "MyApp.Models.User", "kind": "TypeDef"},
  "results": [
    {
      "definition": {
        "name": "User",
        "qualified_name": "MyApp.Models.User",
        "kind": "TypeDef",
        "file": "src/Models/User.cs",
        "span": {"start": 42, "end": 820},
        "exported": true,
        "visibility": "public",
        "type_kind": "Class"
      },
      "usage_sites": [
        {
          "file": "src/Services/UserService.cs",
          "name": "User",
          "span": {"start": 204, "end": 208},
          "ref_kind": "Type"
        },
        {
          "file": "src/Controllers/UserController.cs",
          "name": "User",
          "span": {"start": 312, "end": 316},
          "ref_kind": "Read"
        }
      ]
    }
  ]
}

Definition types:

  • project_file -- defined in a project source file (has file and span)
  • dependency -- resolved from a DLL via --deps-path (has package, span is 0:0)
  • unresolved -- imported but could not be resolved (has package only)

Usage types (the ref_kind field): Type (type reference), Read (value access), Write (assignment), ReadWrite (both read and written)
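Spans are reported as start/end numbers, so a consumer that wants editor-style positions has to convert them. The following sketch assumes the numbers are byte offsets into the UTF-8 source file, which this README does not state; line_col is a hypothetical helper.

```rust
/// Converts an offset into a 1-based (line, column) pair.
/// ASSUMPTION: `offset` is a byte offset into `source`; the unit used by
/// the indexer's spans is not documented here. Columns count characters.
fn line_col(source: &str, offset: usize) -> Option<(usize, usize)> {
    if offset > source.len() || !source.is_char_boundary(offset) {
        return None; // out of range or inside a multi-byte character
    }
    let before = &source[..offset];
    let line = before.matches('\n').count() + 1;
    // Start of the current line: just after the last newline, or 0.
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let col = before[line_start..].chars().count() + 1;
    Some((line, col))
}
```

Applied to the start of a usage site, this yields the position to jump to in the file named by the file field.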

Query Examples

Find a class and see where it's used:

{"pattern": "MyApp.Models.User", "kind": "TypeDef"}

Find all types in a namespace:

{"pattern": "MyApp.Models.*", "kind": "TypeDef"}

Find a method across the codebase:

{"pattern": "*.ProcessOrder", "kind": "function"}

Query an SDK type (resolved through --deps-path):

{"pattern": "System.String", "kind": "TypeDef"}

Returns a dependency definition since it comes from the SDK DLLs.

Find all members of a class:

{"pattern": "MyApp.Models.User.*"}

Dependency Resolution

The indexer resolves references to external .NET assemblies (SDK types, NuGet packages) through the --deps-path flag.

How It Works

  1. At index time: using directives like using System.Collections.Generic; are recorded as unresolved modules
  2. At query time: when you query for a symbol in an unresolved module, the indexer lazily resolves it:
    • Scans all DLLs to build a namespace -> DLL index (reads PE headers + TypeDef table only -- fast)
    • Parses full metadata from matching DLLs using goblin (PE container) + clrmeta (ECMA-335 metadata)
    • Extracts types, methods, fields, properties, constructors with full type annotations and generics
    • Results are cached by ast-index's dependency_cache -- each namespace is parsed at most once
  3. No fallback: unlike the Java indexer's JDK stubs, there are no hardcoded stubs. Use --deps-path to point at the .NET SDK reference assemblies for full coverage.
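The steps above can be modeled as a cheap namespace index plus a cache that is filled on first use. This is a toy sketch under stated assumptions: LazyDeps, parse_dll, and the string-based "types" are hypothetical stand-ins, not the real DepsIndex or ast-index's dependency_cache.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Toy model of lazy, per-namespace resolution with caching.
/// ASSUMPTION: all names here are hypothetical illustrations.
struct LazyDeps {
    /// namespace -> DLL names (the cheap index built from TypeDef tables)
    namespace_index: HashMap<String, Vec<String>>,
    /// namespace -> resolved type names, filled on first request
    cache: Mutex<HashMap<String, Vec<String>>>,
    /// counts how often the expensive parse step ran
    parses: Mutex<usize>,
}

impl LazyDeps {
    fn new(namespace_index: HashMap<String, Vec<String>>) -> Self {
        Self { namespace_index, cache: Mutex::new(HashMap::new()), parses: Mutex::new(0) }
    }

    /// Stand-in for the full metadata parse of one DLL.
    fn parse_dll(&self, dll: &str, namespace: &str) -> Vec<String> {
        *self.parses.lock().unwrap() += 1;
        vec![format!("{namespace}.TypeFrom_{dll}")]
    }

    /// Resolves a namespace, parsing its DLLs only on the first call.
    fn resolve(&self, namespace: &str) -> Option<Vec<String>> {
        let mut cache = self.cache.lock().unwrap();
        if let Some(hit) = cache.get(namespace) {
            return Some(hit.clone());
        }
        let dlls = self.namespace_index.get(namespace)?;
        let types: Vec<String> =
            dlls.iter().flat_map(|dll| self.parse_dll(dll, namespace)).collect();
        cache.insert(namespace.to_string(), types.clone());
        Some(types)
    }
}
```

Holding the cache lock across the parse keeps the "at most once" property simple at the cost of serializing concurrent resolutions; a real implementation might choose finer-grained locking.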

Finding Your SDK Path

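# macOS (default install root; on Linux the root is often /usr/share/dotnet or /usr/lib/dotnet)
ls /usr/local/share/dotnet/packs/Microsoft.NETCore.App.Ref/*/ref/

# Windows (PowerShell; cmd.exe does not expand * in the middle of a path)
dir "C:\Program Files\dotnet\packs\Microsoft.NETCore.App.Ref\*\ref\"

# Or use the dotnet CLI to find the install root (packs/ sits next to shared/)
dotnet --list-runtimes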

The reference assemblies are metadata-only DLLs (~20-50KB each, no IL code), designed for tooling. The .NET 9 SDK path typically looks like:

/usr/local/share/dotnet/packs/Microsoft.NETCore.App.Ref/9.0.5/ref/net9.0/
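When several SDK versions are installed, the packs directory holds one subdirectory per version, and a plain string sort would rank 9.0.5 above 10.0.0. A minimal sketch of numeric version selection follows; newest_version and parse_version are hypothetical helpers, and preview directories with non-numeric parts are simply skipped.

```rust
/// Parses "9.0.5" into [9, 0, 5]; returns None when any part is not a
/// plain number (for example a preview suffix).
fn parse_version(name: &str) -> Option<Vec<u64>> {
    name.split('.').map(|part| part.parse().ok()).collect()
}

/// Picks the highest purely numeric version from a list of directory names.
/// ASSUMPTION: hypothetical helper; the indexer itself takes whatever path
/// is passed via --deps-path and does not choose a version.
fn newest_version<'a>(names: &[&'a str]) -> Option<&'a str> {
    names
        .iter()
        .copied()
        .filter_map(|name| parse_version(name).map(|version| (version, name)))
        .max_by(|a, b| a.0.cmp(&b.0)) // Vec<u64> compares element by element
        .map(|(_, name)| name)
}
```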

What Gets Extracted from DLLs

For each public/protected type in a namespace:

| Element | Extracted Data |
|---|---|
| Type | Name, qualified name, kind (class/struct/interface/enum/delegate), modifiers, base type (extends), interfaces (implements), generic type parameters |
| Methods | Name, parameter names and types, return type, modifiers (public/static/abstract/virtual) |
| Constructors | Parameter names and types, modifiers |
| Properties | Name, type annotation, modifiers |
| Fields | Name, type annotation, modifiers (static/readonly) |
| Constants | Name, type annotation (literal fields) |
| Enum members | Name |

Private and internal members are skipped. Generic type parameters are fully extracted (e.g., List<T> has type_parameters: ["T"], Dictionary<TKey, TValue> has type_parameters: ["TKey", "TValue"]). Signature type arguments are preserved (e.g., a field of type List<string> carries type_arguments: [{ name: "String", package: "System" }]).
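In assembly metadata, generic types are conventionally named with a backtick and their arity (List`1, Dictionary`2), so a reader of the TypeDef table has to separate the source-level name from that suffix. The sketch below shows the split; split_arity is a hypothetical helper, and whether the indexer performs this exact step is an assumption.

```rust
/// Splits a metadata type name such as "Dictionary`2" into its source-level
/// name and generic arity. Names without a numeric suffix have arity 0.
/// ASSUMPTION: illustrative only; not taken from deps/assembly.rs.
fn split_arity(metadata_name: &str) -> (&str, usize) {
    match metadata_name.rsplit_once('`') {
        Some((base, suffix)) => match suffix.parse() {
            Ok(arity) => (base, arity),
            Err(_) => (metadata_name, 0), // backtick not followed by a number
        },
        None => (metadata_name, 0),
    }
}
```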

Library Usage

The indexer can be embedded as a Rust library crate (c_sharp_indexer).

Basic Indexing

use std::path::Path;
use ast_index::{ProjectIndex, QueryParams, SymbolKindTag};
use c_sharp_indexer::analyzer::CSharpAnalyzer;
use c_sharp_indexer::lang_data::*;

type CSharpIndex = ProjectIndex<
    CSharpPackageData, CSharpImportData, CSharpDefData, CSharpRefData, CSharpAnalyzer
>;

// Create analyzer and build index
let analyzer = CSharpAnalyzer::new().expect("failed to create analyzer");
let index: CSharpIndex = ProjectIndex::new(analyzer);
let stats = index.build(Path::new("/path/to/project"));

println!("Indexed {} files", stats.files_scanned);

// Query for a type
let traces = index
    .params(QueryParams {
        search_pattern: "MyApp.Models.User".to_string(),
        symbol_kind: Some(SymbolKindTag::TypeDef),
    })
    .query()
    .expect("query failed");

for trace in &traces {
    println!("Found: {:?}", trace.definition);
    println!("  Used in {} places", trace.usage_sites.len());
}

With Dependency Resolution

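use std::path::Path;
use ast_index::{ProjectIndex, QueryParams, SymbolKindTag};
use c_sharp_indexer::analyzer::CSharpAnalyzer;
use c_sharp_indexer::deps::DepsIndex;
// CSharpIndex is the type alias defined under Basic Indexing.

// Create dependency index from SDK and NuGet paths.
// Rust does not expand `~` in string paths, so pass absolute paths.
let deps = DepsIndex::new(vec![
    "/usr/local/share/dotnet/packs/Microsoft.NETCore.App.Ref/9.0.5/ref/net9.0/".into(),
    "/home/<user>/.nuget/packages/newtonsoft.json/13.0.1/lib/netstandard2.0/".into(),
]);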

// Create analyzer with deps
let analyzer = CSharpAnalyzer::with_deps(deps)
    .expect("failed to create analyzer");
let index: CSharpIndex = ProjectIndex::new(analyzer);
index.build(Path::new("/path/to/project"));

// Now queries can resolve SDK types
let traces = index
    .params(QueryParams {
        search_pattern: "System.String".to_string(),
        symbol_kind: Some(SymbolKindTag::TypeDef),
    })
    .query()
    .expect("query failed");

// Definition comes from dependency cache, not a project file
assert!(matches!(
    &traces[0].definition,
    ast_index::trace::SymbolDefinition::Dependency { .. }
));

Filtering by Language-Specific Data

// Find all static methods
let traces = index
    .params(QueryParams {
        search_pattern: "MyApp.Services.*".to_string(),
        symbol_kind: Some(SymbolKindTag::Function),
    })
    .filter_defs(|lang_data: &CSharpDefData| {
        lang_data.is_static && lang_data.visibility == Visibility::Public
    })
    .query()
    .expect("query failed");

// Find all interfaces
let traces = index
    .params(QueryParams {
        search_pattern: "MyApp.*".to_string(),
        symbol_kind: Some(SymbolKindTag::TypeDef),
    })
    .filter_defs(|lang_data: &CSharpDefData| {
        lang_data.type_kind == Some(CSharpTypeKind::Interface)
    })
    .query()
    .expect("query failed");

Architecture

Source files (.cs)
    |
    v
[tree-sitter-c-sharp parser] --> Concrete Syntax Tree
    |
    v
[Extraction]  --> AnalysisResult { package, imports, definitions, references }
    |                                    |
    |    .scm query files define         |  Each reference gets a SymbolOrigin:
    |    what to capture:                |    - Import (from a using directive)
    |    - definitions.scm              |    - Local (same-file definition)
    |    - references.scm              |    - Global (unresolved)
    |                                    |
    |    Imports extracted via            |  Type-chain resolution:
    |    CST tree walk (not .scm)        |    - TypeAnnotation (name + package)
    |                                    |    - receiver_ref_span (chained calls)
    |    FQDNs built by walking          |    - initializer_ref_span (var inference)
    |    CST parent chain                |
    v                                    v
[ast-index::ProjectIndex::build()]  --> Cross-file index
    |
    v
[Query + Trace]  --> SymbolTrace { definition, usage_sites }
    |                      |
    |                      |  Lazy dependency resolution:
    |                      |    DepsIndex scans DLLs on first query,
    |                      |    parses PE metadata via goblin + clrmeta,
    |                      |    caches in dependency_cache
    v
[JSON output]
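The diagram notes that FQDNs are built by walking the CST parent chain. A minimal sketch of that walk over a hand-built tree is shown here; Node, the arena layout, and the kind strings are illustrative assumptions (the analyzer works on tree-sitter nodes), and file-scoped namespaces are not covered because they are siblings rather than ancestors.

```rust
/// A minimal stand-in for a CST node.
/// ASSUMPTION: hypothetical type; kind strings are for illustration.
struct Node {
    kind: &'static str,
    name: Option<&'static str>,
    parent: Option<usize>, // index into the arena; None for the root
}

/// Builds a fully qualified name by walking parent links and collecting
/// the names of enclosing namespaces and types.
fn fqdn(arena: &[Node], idx: usize) -> String {
    let mut parts: Vec<&str> = Vec::new();
    parts.extend(arena[idx].name); // the definition's own name, if any
    let mut current = arena[idx].parent;
    while let Some(i) = current {
        let node = &arena[i];
        let is_scope = matches!(
            node.kind,
            "namespace_declaration" | "class_declaration" | "struct_declaration"
        );
        if is_scope {
            parts.extend(node.name);
        }
        current = node.parent;
    }
    parts.reverse(); // collected innermost-first, so flip to outermost-first
    parts.join(".")
}
```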

Module Map

| Module | Purpose |
|---|---|
| analyzer.rs | CSharpAnalyzer -- FileAnalyzer trait implementation, definition/reference/import extraction, type-chain resolution, FQDN building |
| deps/mod.rs | DepsIndex -- lazy namespace-to-DLL resolver with thread-safe caching |
| deps/scanner.rs | Directory scanning, PE header parsing, CLI metadata extraction, namespace index building |
| deps/assembly.rs | DLL metadata to SymbolDef conversion (types, methods, fields, properties, enums, generics) |
| deps/signatures.rs | .NET TypeSig to ast-index TypeAnnotation conversion (primitives, class refs, generic instantiations, arrays) |
| lang_data.rs | C#-specific metadata types: CSharpDefData (visibility, modifiers, type_kind), CSharpRefData, CSharpImportData |
| queries/definitions.scm | Tree-sitter queries for definition extraction (20+ capture patterns) |
| queries/references.scm | Tree-sitter queries for reference extraction (14 capture patterns) |
| main.rs | CLI binary, JSON REPL, --deps-path flag |

Key Design Decisions

  1. SCM queries for definitions/references, tree walk for imports -- alias using directives caused overlapping SCM matches; tree walk is more reliable.
  2. FQDN via parent-chain walking -- fully qualified names are built by walking up the CST collecting namespace/class/struct names. File-scoped namespaces (namespace Foo;) are handled by scanning compilation_unit children.
  3. Multi-namespace support -- C# files can have multiple namespace blocks. Each gets its own package via additional_package_defs.
  4. Type-chain resolution -- local variables, fields, and parameters are emitted as SymbolDef with TypeAnnotation (name + package). This enables x.Method() resolution by following x -> variable def -> type -> Method definition.
  5. Lazy dependency resolution -- DLLs are scanned on first query, not at index time. Namespace index is built once, full metadata parsed per-namespace on demand.
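Decision 4 describes resolving x.Method() by following x -> variable def -> type -> Method definition. The chain can be sketched with two lookups; Symbols and resolve_call are hypothetical, and the real index stores SymbolDef values with TypeAnnotation rather than plain strings.

```rust
use std::collections::{HashMap, HashSet};

/// Toy symbol table for following a receiver to its type's member.
/// ASSUMPTION: hypothetical structure, not ast-index's index layout.
struct Symbols {
    /// variable name -> fully qualified type name (from its type annotation)
    variable_types: HashMap<String, String>,
    /// fully qualified names of known member definitions
    members: HashSet<String>,
}

impl Symbols {
    /// Resolves `receiver.method()` to the method's fully qualified name.
    fn resolve_call(&self, receiver: &str, method: &str) -> Option<String> {
        // Step 1: variable definition -> declared type.
        let type_name = self.variable_types.get(receiver)?;
        // Step 2: type -> member definition.
        let candidate = format!("{type_name}.{method}");
        if self.members.contains(&candidate) {
            Some(candidate)
        } else {
            None
        }
    }
}
```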

For the complete design documentation with 17 numbered design decisions, see AGENTS.md.

C#-to-ast-index Type Mapping

| C# Construct | ast-index SymbolKind | type_kind | Notes |
|---|---|---|---|
| class | TypeDef | Class | |
| struct | TypeDef | Struct | |
| record | TypeDef | Record | |
| record struct | TypeDef | RecordStruct | |
| interface | TypeDef | Interface | |
| enum | Enum | -- | Members are EnumMember |
| delegate | Function | Delegate | Delegate is a callable type |
| Method | Function | -- | |
| Constructor | Constructor | -- | |
| Property | Property | -- | |
| Field | Field | -- | |
| const field | Const | -- | Compile-time constant |
| event (field-like) | Field | -- | event EventHandler Foo; |
| event (with accessors) | Property | -- | event EventHandler Foo { add; remove; } |
| Local variable | Variable | -- | For type-chain resolution |
| Indexer (this[int i]) | Skipped | -- | No simple name |

Inheritance heuristic: a C# base list (: BaseClass, IInterface) does not syntactically distinguish the base class from interfaces. A name that starts with I followed by an uppercase letter is classified as Implements; the first name that does not match this pattern is classified as Extends.
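The heuristic is small enough to sketch directly. The names below are hypothetical, and one detail is an assumption: any further non-matching names after the first are treated as Implements, since a C# class can have only one base class.

```rust
#[derive(Debug, PartialEq)]
enum BaseKind {
    Extends,
    Implements,
}

/// True when a name looks like an interface: a leading `I` followed by an
/// uppercase letter (IDisposable, IList).
fn looks_like_interface(name: &str) -> bool {
    let mut chars = name.chars();
    chars.next() == Some('I') && chars.next().map_or(false, char::is_uppercase)
}

/// Classifies each entry of a base list in order.
/// ASSUMPTION: illustrative sketch, not the analyzer's code.
fn classify_bases(bases: &[&str]) -> Vec<BaseKind> {
    let mut seen_base = false;
    bases
        .iter()
        .map(|name| {
            if looks_like_interface(name) || seen_base {
                BaseKind::Implements
            } else {
                seen_base = true; // only the first non-interface name extends
                BaseKind::Extends
            }
        })
        .collect()
}
```

Because the check is purely lexical, a class whose name happens to fit the pattern, such as IOException, would be classified as an interface.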

Testing

Running Tests

# All tests (193 project tests + 11 dependency tests)
cargo test

# Run a specific test
cargo test find_abstract_method

# Run a category of tests
cargo test reference_    # all reference tests
cargo test namespace_    # all namespace tests
cargo test find_type_    # all type-finding tests
cargo test deps_         # all dependency resolution tests
cargo test typechain_    # all type-chain tests

Test Categories (204 tests)

| Category | Count | What's Tested |
|---|---|---|
| Find types | 12 | class, struct, record, record struct, interface, enum, delegate, generics |
| Find members | 20 | methods, constructors, properties, fields, enum members, events, local functions |
| Find by modifier | 12 | static, abstract, sealed, virtual, override, async, readonly, partial |
| Find by visibility | 11 | public, private, protected, internal, protected internal, private protected, implicit public |
| Search patterns | 9 | exact FQDN, wildcards, middle wildcard, kind filter, no filter |
| Inheritance | 8 | extends, implements, multiple interfaces, deep hierarchy, generic base |
| Namespaces | 11 | block-scoped, file-scoped, nested types, nested namespaces, multi-namespace |
| Generics | 7 | generic class, interface, methods, constructor, extension methods |
| Imports | 5 | plain, static, alias, global, from multiple files |
| Partial classes | 4 | defs from both files, members, all members combined |
| Method signatures | 6 | void, params, return type, async, delegate, interface method |
| Structs | 4 | typedef, fields, methods, readonly struct |
| Events/delegates | 5 | delegate types, generic delegate, all delegates, event methods |
| Constants | 6 | const int/string, static readonly, all constants |
| References | 16 | type refs, new, typeof, cast, is, as, member access, attributes, generic type args |
| C#-specific | 9 | abstract property, init property, nested struct, local function, indexer, record params |
| Type-chain | 42 | local vars, fields, params, var inference, chained calls, receiver_ref_span, generics |
| Build stats | 3 | file counts, package names, zero failures |
| Multi-namespace | 1 | separate packages per namespace block |
| Search edge cases | 2 | wildcard middle, deep FQDN |
| Dependency resolution | 11 | SDK namespace resolution, generic type params, properties, methods, enums, end-to-end query, interface detection |

Test fixtures are in tests/fixtures/ -- 20 .cs files covering the C# constructs listed in the table above. Dependency tests require the .NET 9 SDK and are skipped automatically if it is not installed.

Known Limitations

  • Extension method dispatch -- System.Linq.Enumerable (the type that declares Where()) is resolved in the dependency cache, but extension method dispatch (list.Where(...)) is not implemented.
  • Builtin type method calls -- int x = 5; x.ToString() does not resolve because the analyzer filters out builtin types (int, string, etc.) and doesn't emit refs for them.
  • Lambda parameter types -- Lambda params are not emitted as definitions. Inferring their types requires resolving enclosing generic method type parameters.
  • Method-level generic substitution -- CallableData does not carry type_parameters. Only class-level generic substitution is supported.
  • foreach element type inference -- foreach (var item in list) cannot determine the element type from the CST. Requires resolving the collection's generic type arguments.
  • Namespace import package resolution -- Types from namespace-level using imports default to the file's own namespace, which is incorrect for cross-namespace types.

License

Apache-2.0