  • Rust 98.9%
  • Shell 1.1%
Shawn Hurley b984e63166
Some checks failed
CI / Check & Test (push) Failing after 1m21s
refactor: remove all language-specific code from core crates
Consolidate git utilities into crates/core/src/git.rs (WorktreeGuard,
read_git_file, git_diff_file, sanitize_ref_name) — eliminates
duplication between TS and Java crates. Java now uses core WorktreeGuard
directly; TS delegates utility functions to core while keeping its own
error types.
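
A minimal sketch of what a shared worktree guard might look like, assuming an RAII design in which the worktree is removed when the guard is dropped. The field names, the `create` constructor, and the sanitizing rule are illustrative assumptions, not the actual contents of crates/core/src/git.rs.

```rust
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Hypothetical sanitizer: keep characters that are safe in a directory
/// name and replace everything else (for example `/` in `release/v1`) with `_`.
pub fn sanitize_ref_name(git_ref: &str) -> String {
    git_ref
        .chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '_') {
                c
            } else {
                '_'
            }
        })
        .collect()
}

/// Hypothetical RAII guard: owns a git worktree and removes it on drop.
pub struct WorktreeGuard {
    repo: PathBuf,
    path: PathBuf,
}

impl WorktreeGuard {
    /// Runs `git worktree add --detach <base>/<sanitized-ref> <ref>`.
    pub fn create(repo: &Path, git_ref: &str, base: &Path) -> io::Result<Self> {
        let path = base.join(sanitize_ref_name(git_ref));
        let status = Command::new("git")
            .arg("-C")
            .arg(repo)
            .args(["worktree", "add", "--detach"])
            .arg(&path)
            .arg(git_ref)
            .status()?;
        if !status.success() {
            return Err(io::Error::new(io::ErrorKind::Other, "git worktree add failed"));
        }
        Ok(Self { repo: repo.to_path_buf(), path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for WorktreeGuard {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored because drop cannot fail.
        let _ = Command::new("git")
            .arg("-C")
            .arg(&self.repo)
            .args(["worktree", "remove", "--force"])
            .arg(&self.path)
            .status();
    }
}
```

With a guard like this, language crates only need to hold the value for as long as they read from the worktree; cleanup happens on every exit path, including early returns.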

Make Language::build_report() a method (&self) instead of an associated
function, improving call-site ergonomics (lang.build_report() vs
turbofish syntax).
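
The ergonomic difference can be illustrated with a small sketch. The `Report` type, the `name` method, and the `build_report_for` helper below are simplified assumptions for illustration; only the idea of `build_report` taking `&self` comes from the commit message.

```rust
/// Simplified, hypothetical stand-in for the real report type.
#[derive(Debug, PartialEq, Eq)]
pub struct Report {
    pub language: String,
    pub change_count: usize,
}

pub trait Language {
    fn name(&self) -> &'static str;

    /// Method form: the receiver tells the compiler which implementation
    /// to use, so call sites read `lang.build_report(..)`.
    fn build_report(&self, change_count: usize) -> Report {
        Report {
            language: self.name().to_string(),
            change_count,
        }
    }
}

#[derive(Default)]
pub struct TypeScript;

impl Language for TypeScript {
    fn name(&self) -> &'static str {
        "typescript"
    }
}

/// "Before" shape for comparison: nothing in the argument list mentions `L`,
/// so callers must spell the type out with turbofish syntax.
pub fn build_report_for<L: Language + Default>(change_count: usize) -> Report {
    L::default().build_report(change_count)
}
```

Under these assumptions, `TypeScript.build_report(3)` and `build_report_for::<TypeScript>(3)` produce the same value; the method form simply lets the type be inferred from the receiver.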

Remove all language-specific code from crates/core/:
- MinimalSemantics: strip TS-specific overrides (star re-export
  filtering, deprecated/next path handling); add TsLikeTestSemantics
  for tests that need those behaviors
- type_category(): make pluggable via primitive_type_names() trait
  method; TS adds undefined/never/any/unknown, default covers
  string/number/boolean/void/null
- is_primitive_or_absent(): same — thread primitives through
  detect_renames
- ExtendedAnalysisParams: rename dep_css_dir → dep_dir,
  removed_css_blocks → removed_dep_components
- renders_element: remove from ApiChange, LlmApiChange, FileApiChange,
  and the LLM prompt template
- ComponentStatus → TypeStatus
- Fix all doc comments using TS/React terminology to be cross-language
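
One way to picture the pluggable primitive list is a trait method with a default body that language crates override. The trait name `LanguageSemantics`, the `TsSemantics` struct, and the method signatures below are assumptions; the two lists of primitive names are the ones given in the commit message.

```rust
pub trait LanguageSemantics {
    /// Default primitives, as listed in the commit message.
    fn primitive_type_names(&self) -> Vec<&'static str> {
        vec!["string", "number", "boolean", "void", "null"]
    }

    /// A type is "primitive or absent" when there is no annotation at all
    /// or when the annotation names one of the language's primitives.
    fn is_primitive_or_absent(&self, ty: Option<&str>) -> bool {
        match ty {
            None => true,
            Some(t) => {
                let t = t.trim();
                self.primitive_type_names().iter().any(|p| *p == t)
            }
        }
    }
}

/// Language-neutral semantics: relies entirely on the defaults.
pub struct MinimalSemantics;

impl LanguageSemantics for MinimalSemantics {}

/// Hypothetical TypeScript semantics: extends the default list.
pub struct TsSemantics;

impl LanguageSemantics for TsSemantics {
    fn primitive_type_names(&self) -> Vec<&'static str> {
        let mut names = MinimalSemantics.primitive_type_names();
        names.extend(["undefined", "never", "any", "unknown"]);
        names
    }
}
```

Because the list is reached through `&self`, functions such as rename detection can stay in core and simply receive whichever semantics the caller passes in.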

Remove dead code from LLM crate:
- LlmCompositionChange, CompositionPatternResponse types
- parse_composition_pattern_response, parse_composition_from_file_response
- Simplify analyze_file_diff return from 3-tuple to 2-tuple
- Remove composition_entries collection from orchestrator

Move TS-specific LLM prompts to crates/ts/src/llm_prompts.rs:
- build_hierarchy_inference_prompt (React/JSX hierarchy)
- build_suffix_rename_prompt (CSS logical properties)
- Refactor LLM API: infer_hierarchy_from_prompt and
  infer_suffix_renames_from_prompt take pre-built prompt strings

Improve konveyor-core separation:
- Refactor consolidation_key to accept package_extractor closure,
  removing internal extract_package_from_path dependency
- Extract DEFAULT_FILE_PATTERN constant from hardcoded JS fallback
- Remove dead increment_version_prefix
- Document all JS-specific functions in module header
- Re-export JS-specific functions and config types via
  crates/ts/src/konveyor_frontend.rs
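
The closure-based refactor of consolidation_key might look roughly like the sketch below. The parameter list, the key format, and the `js_package_from_path` helper are hypothetical; the point is only that the path-to-package logic is supplied by the caller instead of living in konveyor-core.

```rust
/// Hypothetical simplified signature: the caller decides how a file path
/// maps to a package name, so this function needs no JS-specific knowledge.
pub fn consolidation_key<F>(file_path: &str, change_kind: &str, package_extractor: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    let package = package_extractor(file_path).unwrap_or_else(|| "unknown".to_string());
    format!("{package}:{change_kind}")
}

/// Hypothetical JS-flavoured extractor that would live in the TS crate:
/// returns the directory name that follows a `packages/` path segment.
pub fn js_package_from_path(path: &str) -> Option<String> {
    let mut parts = path.split('/');
    while let Some(part) = parts.next() {
        if part == "packages" {
            return parts.next().map(str::to_string);
        }
    }
    None
}
```

A Java crate could pass a different extractor (for example one based on Maven module directories) without any change to the shared function.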

Add missing derives (PartialEq/Eq on 15 types, Debug on 3 internal
diff types), merge redundant Symbol<M> impl blocks, replace complex
migration candidate tuple with named MigrationCandidate struct,
tighten normalize_type_structure visibility.

Wire Java degradation tracking and document stub implementations.

0 clippy warnings, 1157 tests pass.
2026-04-14 15:04:53 -04:00
.forgejo/workflows update for the workflow to work-corectly 2026-03-30 14:13:18 -04:00
.opencode/plans feat: reduce false positives — remove redundant composition rules, fix migration rule triggers, fix BEM collision 2026-04-07 12:23:44 -04:00
crates refactor: remove all language-specific code from core crates 2026-04-14 15:04:53 -04:00
design fix: correct edge strengths in composition trees for ground truth alignment 2026-04-09 22:56:58 -04:00
docs feat: shared removed-prop classifier and unmapped props in family strategies 2026-04-10 23:13:19 -04:00
hack feat: add explicit CSS custom property rename mappings and fix token mappings 2026-04-06 17:05:36 -04:00
src refactor: remove all language-specific code from core crates 2026-04-14 15:04:53 -04:00
.gitignore adding docs slides and konveyor rule generation 2026-03-16 13:45:06 -04:00
AGENTS.md refactor: wire deferred items — ExtendedAnalysisParams, hierarchy to TS, LLM category parameterization 2026-04-14 10:06:51 -04:00
Cargo.lock feat: add Java language support to validate multi-language architecture 2026-04-14 09:15:02 -04:00
Cargo.toml feat: add Java language support to validate multi-language architecture 2026-04-14 09:15:02 -04:00
open-issues.md refactor to make generic and add forgejo actions 2026-03-30 13:09:29 -04:00
PLAN.md able to generate fairly accurate results for semver analysis 2026-03-13 17:14:28 -04:00
README.md feat: CLI improvements, comprehensive documentation, and pipeline default swap 2026-04-08 22:40:41 -04:00

semver-analyzer

Deterministic, structured analysis of semantic versioning breaking changes between two git refs. Extracts API surfaces, diffs them, performs source-level analysis, and generates Konveyor migration rules with fix strategies.

Currently supports TypeScript/JavaScript/React projects.

Quick Start

# Build
cargo build --release

# Analyze breaking changes between two tags
semver-analyzer analyze typescript \
  --repo /path/to/your-ts-project \
  --from v1.0.0 \
  --to v2.0.0 \
  -o report.json

# Generate Konveyor migration rules from the report
semver-analyzer konveyor typescript \
  --from-report report.json \
  --output-dir ./rules

A convenience script is provided for running against PatternFly, the primary validation target. See docs/patternfly-walkthrough.md for the full setup guide.

hack/run-patternfly.sh

Prerequisites

  • Rust (stable toolchain) -- build the analyzer
  • Node.js >= 18 and npm/yarn/pnpm -- required by target projects for tsc and dependency installation
  • Git -- worktree creation and diff parsing
  • TypeScript (tsc) -- installed as a dev dependency in the target project, or globally

Installation

git clone <repo-url>
cd semver-analyzer
cargo build --release

# Binary is at target/release/semver-analyzer
# Optionally add to PATH:
export PATH="$PWD/target/release:$PATH"

Commands

analyze typescript -- Full Pipeline

Runs the complete analysis: extract API surfaces at both refs, diff them, perform source-level analysis, and diff package.json manifests.

semver-analyzer analyze typescript \
  --repo /path/to/repo \
  --from v5.0.0 \
  --to v6.0.0 \
  -o report.json

Option                      Description
--repo <path>               Path to local git repository
--from <ref>                Old git ref (tag, branch, SHA)
--to <ref>                  New git ref (tag, branch, SHA)
-o, --output <path>         Output file (JSON). Defaults to stdout

Pipeline
--behavioral                Use the behavioral analysis (BU) pipeline instead of the default source-level diff (SD). See Pipelines

LLM Options
--no-llm                    Skip LLM-based behavioral analysis (static only)
--llm-command <cmd>         Command to invoke for LLM analysis (see LLM Integration)
--llm-timeout <secs>        Timeout per LLM invocation (default: 120)
--llm-all-files             Send all changed files to LLM, not just those with test changes. Requires --behavioral

Build
--build-command <cmd>       Custom build command. If not set, the analyzer detects the package manager and runs tsc with monorepo-aware fallbacks

Dependency Repo
--dep-repo <path>           Path to a dependency git repo (e.g., a CSS framework repo). Enables CSS profile extraction
--dep-from <ref>            Old git ref for the dependency repo
--dep-to <ref>              New git ref for the dependency repo
--dep-build-command <cmd>   Build command for the dependency repo

konveyor typescript -- Generate Migration Rules

Generates Konveyor-compatible YAML rules from breaking change analysis. Rules can be consumed by kantra or the Konveyor frontend analyzer to detect migration issues in consumer codebases. See docs/konveyor-rules.md for detailed documentation on rule types, conditions, fix strategies, and customization.

Two modes:

  1. From a report (recommended for iteration):

    semver-analyzer konveyor typescript \
      --from-report report.json \
      --output-dir ./rules
    
  2. Inline analysis (runs the full pipeline then generates rules):

    semver-analyzer konveyor typescript \
      --repo /path/to/repo \
      --from v5.0.0 --to v6.0.0 \
      --output-dir ./rules
    
Option                      Description
--from-report <path>        Load a pre-existing analysis report (mutually exclusive with --repo)
--repo <path>               Path to git repository (runs full analysis)
--from <ref>                Old git ref
--to <ref>                  New git ref
--output-dir <path>         Output directory for the generated ruleset

Rule Generation
--rename-patterns <path>    YAML file with regex-based rename patterns
--no-consolidate            Keep one rule per declaration change (disable merging)
--file-pattern <glob>       File glob for filecontent rules (default: *.{ts,tsx,js,jsx,mjs,cjs})
--ruleset-name <name>       Name for the generated ruleset (default: semver-breaking-changes)

The konveyor command also accepts --behavioral, LLM, build, and dependency repo flags when running in inline analysis mode (--repo). Run semver-analyzer konveyor typescript --help for the full list.

Output structure:

rules/
├── ruleset.yaml              # Ruleset metadata
└── breaking-changes.yaml     # Migration rules

Example rule:

- ruleID: component-prop-removed-button-variant
  labels:
    - "source=semver-analyzer"
    - "change-type=prop-removed"
  effort: 3
  category: mandatory
  description: "Property 'variant' was removed from Button"
  message: |
    The `variant` prop was removed from `Button`.
    Remove this prop or migrate to the replacement API.
  when:
    frontend.referenced:
      pattern: "^variant$"
      location: JSX_PROP
      component: "^Button$"

extract typescript -- Extract API Surface

Extracts the public API surface at a single git ref. Useful for inspecting or caching surfaces.

semver-analyzer extract typescript \
  --repo /path/to/repo \
  --ref v5.0.0 \
  -o surface.json

Option                      Description
--repo <path>               Path to local git repository
--ref <ref>                 Git ref to extract from
-o, --output <path>         Output file (JSON). Defaults to stdout
--build-command <cmd>       Custom build command

diff -- Compare Two Surfaces

Compares two previously extracted API surface JSON files. This command is language-agnostic.

semver-analyzer diff \
  --from old-surface.json \
  --to new-surface.json \
  -o changes.json
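
To make the comparison concrete, here is a minimal sketch of the simplest matching step (exact-name matching) over two surfaces. It models a surface as a map from symbol name to signature string, which is a deliberate simplification; the real surface JSON schema and the later matching phases (relocation, rename detection) are not reproduced here.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq, Eq)]
pub enum Change {
    Removed(String),
    Added(String),
    SignatureChanged {
        name: String,
        before: String,
        after: String,
    },
}

/// Exact-name matching only: symbols present in both surfaces are compared
/// by signature; symbols present on one side become Removed or Added.
pub fn diff_exact(
    old: &BTreeMap<String, String>,
    new: &BTreeMap<String, String>,
) -> Vec<Change> {
    let mut changes = Vec::new();
    for (name, before) in old {
        match new.get(name) {
            None => changes.push(Change::Removed(name.clone())),
            Some(after) if after != before => changes.push(Change::SignatureChanged {
                name: name.clone(),
                before: before.clone(),
                after: after.clone(),
            }),
            Some(_) => {}
        }
    }
    for name in new.keys() {
        if !old.contains_key(name) {
            changes.push(Change::Added(name.clone()));
        }
    }
    changes
}
```

In a multi-phase differ, the `Removed` and `Added` leftovers from a step like this would be the input to rename and relocation detection, rather than being reported immediately.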

How It Works

Each run combines two pipelines. The TD (structural) pipeline always runs. It is paired with the SD (source-level) pipeline by default, or with the BU (behavioral) pipeline when --behavioral is passed.

TD (Top-Down) Pipeline -- Structural Analysis

Always runs. Extracts and diffs the public API surface:

  1. Creates git worktrees for each ref
  2. Detects the package manager (npm/yarn/pnpm) and installs dependencies
  3. Runs tsc --declaration --emitDeclarationOnly with monorepo-aware fallbacks:
    • Solution tsconfig detection (tsc --build)
    • Project build script fallback (yarn build)
    • Custom --build-command override
  4. Parses generated .d.ts files with OXC
  5. Builds the ApiSurface with type canonicalization (union/intersection sorting, Array<T> normalization, whitespace, never/unknown absorption, import resolution)
  6. Diffs old vs new surface with 4-phase matching: exact name, relocation/deprecated detection, fingerprint+LCS rename detection, unmatched
  7. Detects 30+ categories of structural changes (removed exports, signature changes, type changes, visibility, generics, class hierarchy, enum members, etc.)
  8. Diffs package.json for manifest-level breaks (entry points, module system, exports map, peer deps, engines, bins)
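
Step 5 is what keeps cosmetic differences from showing up as type changes. The sketch below illustrates one such rule, sorting the members of a top-level union, under the assumption that types are handled as strings. It is not the analyzer's implementation (which works on OXC-parsed declarations), and it leaves nested unions untouched.

```rust
/// Split a type string on `|` characters that are not nested inside
/// `<>`, `()`, `[]`, or `{}`.
fn split_top_level_union(ty: &str) -> Vec<String> {
    let mut members = Vec::new();
    let mut current = String::new();
    let mut depth = 0usize;
    let mut prev = ' ';
    for c in ty.chars() {
        match c {
            '<' | '(' | '[' | '{' => depth += 1,
            // `=>` is a function arrow, not a closing angle bracket.
            '>' if prev == '=' => {}
            '>' | ')' | ']' | '}' => depth = depth.saturating_sub(1),
            _ => {}
        }
        if c == '|' && depth == 0 {
            members.push(current.trim().to_string());
            current.clear();
        } else {
            current.push(c);
        }
        prev = c;
    }
    members.push(current.trim().to_string());
    // A leading `|` (common in multi-line unions) yields an empty member.
    members.retain(|m| !m.is_empty());
    members
}

/// Canonical form: whitespace collapsed, members sorted and deduplicated.
pub fn canonicalize_union(ty: &str) -> String {
    let mut members: Vec<String> = split_top_level_union(ty)
        .iter()
        .map(|m| m.split_whitespace().collect::<Vec<_>>().join(" "))
        .collect();
    members.sort();
    members.dedup();
    members.join(" | ")
}
```

With this rule in place, `string | number` and `number|string` canonicalize to the same string, so the differ in step 6 would not report a change between them.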

SD (Source-Level Diff) Pipeline -- Source Analysis (default)

Runs by default alongside TD. Performs deterministic, AST-based analysis of source code changes between refs:

  • Component composition trees -- Builds parent-child relationship trees for component families using 10 evidence-based signals (including internal rendering, CSS selectors, React context, DOM nesting, and cloneElement). Generates conformance rules that detect incorrect component nesting in consumer code.
  • CSS token analysis -- Extracts BEM-structured CSS class/variable usage per component. Detects removed CSS classes, renamed variables, and layout-affecting changes (grid, flex context).
  • React API changes -- Tracks portal usage, forwardRef/memo wrapping, context dependencies, and cloneElement injection patterns across versions.
  • Prop defaults and bindings -- Extracts default values from destructuring patterns and detects prop-to-CSS-class binding changes.
  • DOM structure -- Compares rendered element trees, ARIA attributes, roles, and data attributes.
  • Deprecated replacement detection -- When a component is relocated to /deprecated/ and replaced by a differently-named component (e.g., Chip -> Label), detects the replacement via rendering swap signals.

The SD pipeline produces fully deterministic results -- no LLM or heuristics involved.
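
As an illustration of the CSS token analysis, the sketch below splits a class name using the generic BEM convention (`block__element--modifier`) and lists classes that disappeared between two versions. The struct and function names are hypothetical, and real projects may encode modifiers differently, so treat this as a sketch of the idea only.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq, Eq)]
pub struct BemClass {
    pub block: String,
    pub element: Option<String>,
    pub modifier: Option<String>,
}

/// Parse `block__element--modifier`; element and modifier are optional.
pub fn parse_bem(class: &str) -> BemClass {
    // Everything after the first `--` is treated as the modifier.
    let (rest, modifier) = match class.split_once("--") {
        Some((rest, modifier)) => (rest, Some(modifier.to_string())),
        None => (class, None),
    };
    // Everything after the first `__` is treated as the element.
    let (block, element) = match rest.split_once("__") {
        Some((block, element)) => (block.to_string(), Some(element.to_string())),
        None => (rest.to_string(), None),
    };
    BemClass { block, element, modifier }
}

/// Classes used in the old version that no longer appear in the new one,
/// in the order they appear in `old`.
pub fn removed_classes(old: &[&str], new: &[&str]) -> Vec<String> {
    let new_set: BTreeSet<&str> = new.iter().copied().collect();
    old.iter()
        .copied()
        .filter(|class| !new_set.contains(class))
        .map(str::to_string)
        .collect()
}
```

Because both functions are pure string and set operations, the same inputs always give the same output, which is the property the SD pipeline relies on.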

BU (Bottom-Up) Pipeline -- Behavioral Analysis (opt-in)

Opt-in via --behavioral. Replaces the SD pipeline with test-delta heuristics and optional LLM inference:

  1. Parses git diff to find changed source files
  2. Extracts function bodies at both refs using OXC
  3. Identifies functions whose implementations changed
  4. Cross-references with TD findings to avoid duplicates (via DashMap + broadcast channel)
  5. Discovers associated test files (7 strategies covering common project layouts)
  6. Flags a HIGH-confidence behavioral break when the associated test assertions changed
  7. Sends diffs to an external LLM for semantic analysis when an LLM is enabled
  8. Walks up the call graph for private functions with behavioral breaks
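
The decision logic in steps 3 to 7 can be sketched as a small classifier. The type names, the `Verdict` variants, and the use of a plain `HashSet` in place of the shared DashMap are assumptions made for illustration; only the ordering of the checks follows the list above.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    /// The implementation did not change between the two refs.
    Unchanged,
    /// TD already reported this symbol, so BU skips it to avoid a duplicate.
    AlreadyReportedByTd,
    /// Implementation and test assertions both changed.
    HighConfidenceBreak,
    /// Implementation changed; hand the diff to the LLM for a verdict.
    NeedsLlmReview,
    /// Implementation changed but nothing corroborates a behavioral break.
    NoEvidence,
}

pub struct FunctionDelta {
    pub qualified_name: String,
    pub body_changed: bool,
    pub test_assertions_changed: bool,
}

pub fn classify(
    delta: &FunctionDelta,
    td_findings: &HashSet<String>,
    llm_enabled: bool,
) -> Verdict {
    if !delta.body_changed {
        return Verdict::Unchanged;
    }
    if td_findings.contains(&delta.qualified_name) {
        return Verdict::AlreadyReportedByTd;
    }
    if delta.test_assertions_changed {
        return Verdict::HighConfidenceBreak;
    }
    if llm_enabled {
        Verdict::NeedsLlmReview
    } else {
        Verdict::NoEvidence
    }
}
```

Checking the TD findings before the test evidence keeps a symbol from being reported once as a structural break and again as a behavioral one.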

Output

The report is a JSON document. Key top-level fields:

{
  "repository": "/path/to/repo",
  "comparison": {
    "from_ref": "v5.0.0",
    "to_ref": "v6.0.0",
    "from_sha": "abc123",
    "to_sha": "def456",
    "commit_count": 142,
    "analysis_timestamp": "2026-03-16T12:00:00Z"
  },
  "summary": {
    "total_breaking_changes": 1523,
    "breaking_api_changes": 1500,
    "breaking_behavioral_changes": 23,
    "files_with_breaking_changes": 87
  },
  "changes": [
    {
      "file": "packages/react-core/src/components/Card/Card.d.ts",
      "status": "modified",
      "breaking_api_changes": [
        {
          "symbol": "CardProps.isFlat",
          "kind": "property",
          "change": "removed",
          "before": "isFlat?: boolean",
          "after": null,
          "description": "Property 'isFlat' was removed from CardProps"
        }
      ]
    }
  ],
  "packages": [ "..." ],
  "sd_result": { "..." },
  "manifest_changes": [],
  "metadata": { "tool_version": "0.0.4" }
}

The packages field contains a per-package hierarchical view used by rule generation. The sd_result field (populated by the SD pipeline) contains source-level changes, composition trees, and conformance checks.
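
For consumers written in Rust, the entries under changes could be modelled along the lines below. The structs mirror only the fields shown in the sample above; the optionality of `before` and the helper function are assumptions, and docs/report-format.md remains the authoritative schema. Deserialization (for example with a JSON library) is intentionally left out so the sketch depends on the standard library only.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ApiChange {
    pub symbol: String,
    pub kind: String,
    pub change: String,
    /// Assumed optional, by symmetry with `after`.
    pub before: Option<String>,
    /// JSON `null` (as in the removed-property sample) maps to `None`.
    pub after: Option<String>,
    pub description: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileChanges {
    pub file: String,
    pub status: String,
    pub breaking_api_changes: Vec<ApiChange>,
}

/// Collect `(file, symbol)` pairs for every change whose kind of change
/// is `"removed"`.
pub fn removed_symbols(changes: &[FileChanges]) -> Vec<(&str, &str)> {
    changes
        .iter()
        .flat_map(|file| {
            file.breaking_api_changes
                .iter()
                .filter(|c| c.change == "removed")
                .map(move |c| (file.file.as_str(), c.symbol.as_str()))
        })
        .collect()
}
```

Applied to the sample report, this would yield a single pair for `CardProps.isFlat` in `Card.d.ts`.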

LLM Integration

The analyzer can optionally use any CLI-accessible LLM for behavioral analysis. LLM analysis is only used with the --behavioral pipeline -- the default SD pipeline is fully deterministic and requires no LLM.

# Using goose with the behavioral pipeline
semver-analyzer analyze typescript \
  --repo /path/to/repo \
  --from v5.0.0 --to v6.0.0 \
  --behavioral \
  --llm-command "goose run --no-session -q -t"

# Using any command that accepts a prompt as its last argument
semver-analyzer analyze typescript \
  --repo /path/to/repo \
  --from v5.0.0 --to v6.0.0 \
  --behavioral \
  --llm-command "my-llm-cli"

See docs/llm-integration.md for detailed setup instructions, goose installation, the CLI contract for custom providers, and cost considerations.
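
The "prompt as last argument" convention shown in the examples above can be sketched as follows. This is an assumption-laden illustration: it splits the command on whitespace (so shell quoting inside --llm-command is not honored), does not enforce --llm-timeout, and is not the code in crates/llm/src/invoke.rs.

```rust
use std::io;
use std::process::{Command, Output};

/// Build the argument vector: the configured command split on whitespace,
/// followed by the prompt as a single final argument.
pub fn build_llm_argv(llm_command: &str, prompt: &str) -> Vec<String> {
    let mut argv: Vec<String> = llm_command
        .split_whitespace()
        .map(str::to_string)
        .collect();
    argv.push(prompt.to_string());
    argv
}

/// Run the command and capture its output. The caller would read the
/// model's answer from `Output::stdout`.
pub fn invoke_llm(llm_command: &str, prompt: &str) -> io::Result<Output> {
    if llm_command.trim().is_empty() {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "empty LLM command"));
    }
    let argv = build_llm_argv(llm_command, prompt);
    // argv has at least two entries here: the program and the prompt.
    Command::new(&argv[0]).args(&argv[1..]).output()
}
```

Passing the prompt as its own argument, instead of interpolating it into a shell string, avoids quoting problems when the prompt contains code, quotes, or newlines.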

Architecture

semver-analyzer (binary)
├── src/main.rs              # CLI entry, report building
├── src/orchestrator.rs      # Pipeline orchestrator (TD+SD or TD+BU)
└── src/cli/mod.rs           # Clap CLI definitions

crates/
├── core/                    # Language-agnostic types and diff engine
│   └── src/
│       ├── traits.rs        # Pluggable language support trait
│       ├── shared.rs        # SharedFindings (DashMap + broadcast)
│       ├── diff/            # 6-phase API surface differ
│       └── types/           # ApiSurface, Symbol, AnalysisReport
├── ts/                      # TypeScript/JavaScript support
│   └── src/
│       ├── extract/         # OXC-based .d.ts API extraction
│       ├── canon/           # 6-rule type canonicalization
│       ├── source_profile/  # Component source profile extraction
│       ├── composition/     # Composition tree builder (v2)
│       ├── sd_pipeline.rs   # Source-level diff pipeline
│       ├── diff_parser/     # Git diff -> changed functions
│       ├── test_analyzer/   # Test discovery + assertion detection
│       ├── call_graph/      # Same-file caller detection
│       ├── jsx_diff/        # Deterministic JSX render diffing
│       ├── css_scan/        # CSS variable/class prefix scanning
│       ├── manifest/        # package.json diff
│       ├── konveyor.rs      # Konveyor rule generation (TD pipeline)
│       ├── konveyor_v2.rs   # Konveyor rule generation (SD pipeline)
│       └── worktree/        # Git worktree lifecycle, tsc, pkg mgr
├── konveyor-core/           # Shared Konveyor rule types and utilities
│   └── src/lib.rs           # Rule construction, consolidation, fix strategies
└── llm/                     # LLM behavioral analysis
    └── src/
        ├── invoke.rs        # External LLM command execution
        ├── prompts.rs       # Structured prompt templates
        └── spec_compare.rs  # Structural spec comparison

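The core crate defines a Language trait, making the architecture language-pluggable. TypeScript is the first implementation. The latest commits listed above add Java support to validate the multi-language architecture, with some implementations still documented as stubs.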

Development

# Run all tests
cargo test

# Run tests for a specific crate
cargo test -p semver-analyzer-core
cargo test -p semver-analyzer-ts

# Build in debug mode
cargo build

# Build release
cargo build --release

Documentation

Guide                     Description
TypeScript/React Guide    What the analyzer detects, how to interpret results
Konveyor Rules            Rule types, conditions, fix strategies, customization
Report Format             Complete JSON report schema reference
PatternFly Walkthrough    Step-by-step guide for analyzing PatternFly v5 -> v6
LLM Integration           Goose setup, CLI contract, cost considerations

Known Limitations

  • ESM/CJS declaration deduplication: Projects that emit both ESM and CJS builds will have roughly doubled symbol counts, because the analyzer picks up .d.ts files from both output directories.
  • MCP server: The serve subcommand is defined but not yet implemented.
  • Language support: Only TypeScript/JavaScript is currently supported.

License

See LICENSE for details.