rlsbl v0.92.0 /Import scanning

Import scanning architecture that validates dependencies and detects dead modules using tree-sitter parsers for Python, Go, and JS/TS plus regex for Dart.

#Import scanning

rlsbl scans source code imports across 4 language ecosystems to detect unused dependencies, undeclared dependencies, dead modules, and circular dependencies. It uses 3 tree-sitter parsers for Python, Go, and JavaScript/TypeScript parsing, and regex for Dart. Import results are cached per check context to avoid redundant source tree walks across the 4 dependency checks that share scan data.

#Architecture

The import scanning system has two layers, each serving a different purpose. The workspace layer maps imports to project names for dependency validation, while the file layer builds intra-project graphs for dead-module and circular-dependency detection:

#Workspace-level scanners (import_scanners.py)

These scanners parse source files and map discovered import statements to workspace project names. They answer the question "which workspace siblings does this project actually import?" and feed their results to the four dependency validation checks (unused, undeclared, runtime-test-only, dev-in-lib).

  • PythonImportScanner -- uses PythonAstLinter.scan_imports(), filters by workspace membership
  • GoImportScanner -- uses scan_imports() from lint.go_ast, matches against workspace Go module paths
  • NpmImportScanner -- uses NpmAstLinter.scan_imports(), extracts bare package names
  • DartImportScanner -- regex-based extraction of package: imports

All scanners return list[ImportInfo], where each ImportInfo carries the matched workspace package name, file path, line number, and whether the file is in a test context.
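As a rough illustration of the post-processing step described above, the sketch below turns raw scanner tuples into ImportInfo records. The field names (`package`, `file_path`, `line`, `is_test`) and the helper `to_import_infos` are assumptions made for this example, not rlsbl's actual definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportInfo:
    """Hypothetical shape; field names are assumed, not taken from rlsbl."""
    package: str     # matched workspace package name
    file_path: str   # file containing the import
    line: int        # line number of the import statement
    is_test: bool    # True when the file is in a test context


def to_import_infos(raw, workspace_names, is_test):
    """Keep only imports that name a workspace member.

    raw: iterable of tuples starting with (package_name, file_path, line_number);
         any trailing fields are ignored.
    is_test: callable that classifies a file path as test code.
    """
    infos = [
        ImportInfo(pkg, path, line, is_test(path))
        for pkg, path, line, *_ in raw
        if pkg in workspace_names
    ]
    return sorted(infos, key=lambda info: (info.file_path, info.line))


raw = {
    ("core", "src/app.py", 3),
    ("requests", "src/app.py", 4),       # not a workspace member, dropped
    ("core", "tests/test_app.py", 1),
}
infos = to_import_infos(raw, {"core", "utils"}, lambda p: p.startswith("tests/"))
```

Carrying the test-context flag on each record lets the downstream checks split library usage from test-only usage without re-reading the files.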

#File-level graph builders (dep_validation.py)

These functions build intra-package import graphs by resolving each import statement to a concrete file path within the same project. They answer the question "which files within this project reference each other?" and produce the adjacency data used by dead-module BFS traversal and circular-dependency detection via Tarjan's algorithm.

  • find_dead_modules() -- builds the Python graph implicitly from _collect_python_imports(), in place of a dedicated _build_python_import_graph()
  • find_dead_go_packages() -- uses scan_imports() per file, groups by package directory
  • _build_npm_import_graph() -- resolves relative imports to absolute file paths
  • _build_dart_import_graph() -- resolves relative and self-package imports via regex
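For readers unfamiliar with Tarjan's algorithm, here is a minimal sketch of cycle detection over a file-level adjacency map. The function name `find_cycles` and the dict-of-lists graph shape are assumptions for illustration. The recursive form is used for brevity and may hit Python's recursion limit on very deep graphs.

```python
def find_cycles(graph):
    """Return strongly connected components that form import cycles.

    graph: {file: [files it imports]}. A component counts as a cycle if it
    has more than one member or a file imports itself.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    cycles = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = low[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for nxt in graph.get(node, ()):
            if nxt not in index:
                visit(nxt)
                low[node] = min(low[node], low[nxt])
            elif nxt in on_stack:
                low[node] = min(low[node], index[nxt])
        if low[node] == index[node]:
            # node is the root of a component: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1 or node in graph.get(node, ()):
                cycles.append(sorted(component))

    for node in sorted(graph):
        if node not in index:
            visit(node)
    return cycles


graph = {
    "a.py": ["b.py"],
    "b.py": ["c.py"],
    "c.py": ["a.py"],
    "d.py": ["a.py"],   # imports the cycle but is not part of it
}
cycles = find_cycles(graph)
```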

#Shared infrastructure

#ImportScanner protocol (lint/protocol.py)

python
from typing import Protocol


class ImportScanner(Protocol):
    def scan_imports(self, project_path: str) -> set[tuple[str, str, int, bool]]:
        """Returns (package_name, file_path, line_number, guarded) tuples."""

This is the interface that low-level AST linters implement. The workspace-level scanners (above) consume this output and post-process it.

#walk_source_files() (lint/utils.py)

File discovery utility shared by both the workspace-level scanners and the file-level graph builders. It walks the project directory tree, filters by file extension, and excludes non-source directories to produce the set of files that should be scanned for imports. Key features:

  • Extension matching (e.g., (".py",), (".go",), (".js", ".ts", ".mjs", ".cjs", ".tsx"))
  • Built-in exclusion of common non-source directories: .venv, node_modules, __pycache__, .git, build, dist, .selfdoc, _build, static, public, assets
  • Automatic .egg-info directory exclusion
  • exclude_patterns parameter for fnmatch-style glob filtering
  • exclude_dirs parameter for preventing scans of sibling workspace project directories (critical for root-path monorepo projects where sibling project dirs are immediate children)
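The features listed above can be approximated with os.walk and fnmatch. This is a sketch under stated assumptions, not the real implementation: `walk_sources` and `EXCLUDED_DIRS` are names invented here, the exclusion set is only a subset of the built-in list, and results are returned as paths relative to the project root.

```python
import fnmatch
import os

# Illustrative subset of the built-in exclusions listed above.
EXCLUDED_DIRS = {".venv", "node_modules", "__pycache__", ".git", "build", "dist"}


def walk_sources(project_path, extensions, exclude_patterns=(), exclude_dirs=None):
    """Return sorted relative paths of files whose names end with extensions."""
    # Relative entries are resolved against project_path; absolute ones pass through.
    skip_abs = {
        os.path.normpath(os.path.join(project_path, d)) for d in exclude_dirs or ()
    }
    found = []
    for dirpath, dirnames, filenames in os.walk(project_path):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [
            d for d in dirnames
            if d not in EXCLUDED_DIRS
            and not d.endswith(".egg-info")
            and os.path.normpath(os.path.join(dirpath, d)) not in skip_abs
        ]
        for name in filenames:
            if not name.endswith(extensions):
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), project_path)
            if any(fnmatch.fnmatch(rel, pattern) for pattern in exclude_patterns):
                continue
            found.append(rel)
    return sorted(found)
```

Pruning `dirnames` in place is what keeps a root-path project from walking into its sibling projects: the excluded subtree is never visited, rather than being visited and filtered afterwards.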

#_is_test_context()

Classifies a file as production vs test code by checking its path against known test directory names and file naming conventions. This classification determines whether an import counts toward runtime dependency usage or test-only usage, which directly affects the deps-runtime-test-only and deps-dev-in-lib checks. Classification is based on:

  • Directory names: test, tests, __tests__, examples, example
  • File name patterns: test_*.py, *_test.py, *_test.go, *_test.dart, *.test.[jt]sx?, *.spec.[jt]sx?, conftest.py

#_NON_PRODUCTION_PATTERNS

Shared constant exposing the file classification patterns as a dict with 3 keys: test_dirs (5 directory names like test, tests, __tests__), example_dirs (2 directory names), and test_file_patterns (7 glob patterns for test file naming conventions). This dict is reused by both import_scanners.py and dep_validation.py to keep production vs test classification consistent across all dependency checks.

#Per-language details

| Language | Parser | Workspace scanner | Graph builder | Exclusions |
| --- | --- | --- | --- | --- |
| Python | tree-sitter-python | PythonImportScanner | _collect_python_imports() + find_dead_modules() | stdlib (sys.stdlib_module_names), relative imports |
| Go | tree-sitter-go | GoImportScanner | find_dead_go_packages() via scan_imports() | self-module imports |
| npm (JS/TS) | tree-sitter-javascript + tree-sitter-typescript | NpmImportScanner | _build_npm_import_graph() | Node.js builtins, relative imports |
| Dart | regex | DartImportScanner | _build_dart_import_graph() | dart: imports, external package: imports |

#Go module path mapping

GoImportScanner handles Go's module-path-based import system by building a reverse lookup from Go module paths to workspace project names. This mapping is necessary because Go imports use full module paths like github.com/org/repo/pkg, not bare package names:

  1. Reads go.mod from each workspace project to extract its module declaration
  2. Builds a module_path_map: dict[str, str] mapping workspace project name to module path
  3. For each import in source files, checks if the import path equals or starts with (+ /) any workspace module path
  4. Excludes self-imports by reading the scanning project's own module path

For example, an import of github.com/org/repo/internal/pkg maps to the workspace project whose go.mod declares module github.com/org/repo.
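The lookup can be sketched in a few lines. The helper names below (`parse_module_path`, `match_workspace_import`) and the regex are assumptions for illustration. The sketch handles only unquoted `module` directives and leaves out the self-import exclusion in step 4.

```python
import re

# Matches the module directive of a go.mod file (unquoted form only).
_MODULE_RE = re.compile(r"^\s*module\s+(\S+)", re.MULTILINE)


def parse_module_path(go_mod_text):
    """Extract the module path from go.mod contents, or None if absent."""
    match = _MODULE_RE.search(go_mod_text)
    return match.group(1) if match else None


def match_workspace_import(import_path, module_to_name):
    """Return the workspace project that owns import_path, or None."""
    for module_path, name in module_to_name.items():
        # The "/" suffix prevents github.com/org/repo from matching
        # github.com/org/repository.
        if import_path == module_path or import_path.startswith(module_path + "/"):
            return name
    return None


module_path_map = {"repo": "github.com/org/repo", "tools": "github.com/org/tools"}
# Invert project name -> module path into module path -> project name.
module_to_name = {path: name for name, path in module_path_map.items()}
owner = match_workspace_import("github.com/org/repo/internal/pkg", module_to_name)
```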

#npm resolution

The npm file-level graph builder (_build_npm_import_graph()) implements a subset of Node.js module resolution to accurately map import statements to source files. This is necessary because JavaScript and TypeScript have several implicit resolution conventions that affect which file an import actually refers to:

  • Extension appending: tries .ts, .tsx, .js, .mjs, .cjs when bare path has no extension
  • .js to .ts mapping: TypeScript projects compile .ts to .js; resolves .js references back to .ts source
  • Directory to index file: resolves ./utils to ./utils/index.ts (tries index.ts, index.tsx, index.js, index.mjs, index.cjs)
  • Import types: ES6 import, CommonJS require(), dynamic import()
  • Conditional exports: _collect_export_paths() recursively traverses package.json exports maps (string, dict with condition keys, nested subpath maps, arrays)
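To make the first three conventions concrete, the sketch below generates candidate paths for a relative specifier and returns the first one that exists. It is an approximation: the function name, the candidate order, and the injectable `exists` callback are assumptions, and conditional exports are not covered.

```python
import os
import posixpath

EXTENSIONS = (".ts", ".tsx", ".js", ".mjs", ".cjs")


def resolve_relative_import(importer, specifier, exists=os.path.isfile):
    """Return the source file a relative specifier refers to, or None.

    Paths are treated as POSIX-style, matching how specifiers are written.
    """
    base = posixpath.normpath(
        posixpath.join(posixpath.dirname(importer), specifier)
    )
    candidates = [base]
    # A .js reference in TypeScript source: try the .ts/.tsx original.
    if base.endswith(".js"):
        candidates += [base[:-3] + ".ts", base[:-3] + ".tsx"]
    # Bare path without extension: append each known extension.
    candidates += [base + ext for ext in EXTENSIONS]
    # Directory import: look for an index file inside it.
    candidates += [posixpath.join(base, "index" + ext) for ext in EXTENSIONS]
    for candidate in candidates:
        if exists(candidate):
            return candidate
    return None


files = {"src/app.ts", "src/api.ts", "src/utils/index.ts"}
from_dir = resolve_relative_import("src/app.ts", "./utils", exists=files.__contains__)
from_js = resolve_relative_import("src/app.ts", "./api.js", exists=files.__contains__)
```

Passing `exists` as a parameter keeps the sketch testable without touching the filesystem. By default it checks real files.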

#Caching

Workspace-level import results are cached on the check context object (ctx._dep_import_cache) to avoid scanning the same source trees multiple times. Since four separate checks all need the same import data, caching reduces the total number of source tree walks from four per project to one. The _build_dep_import_cache() function in rlsbl/checks/_common.py:

  1. Iterates all workspace projects once
  2. Computes (lib_imports, test_imports) per project using _get_imported_workspace_packages()
  3. Stores the result dict on ctx._dep_import_cache
  4. Returns the cached result on subsequent calls

This cache is shared across four workspace dependency checks:

  • deps-unused
  • deps-undeclared
  • deps-runtime-test-only
  • deps-dev-in-lib
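The caching pattern amounts to memoizing on the context object. In this sketch `CheckContext`, `build_dep_import_cache`, and the `scan_project` callback are hypothetical stand-ins. Only the `_dep_import_cache` attribute name comes from the description above.

```python
class CheckContext:
    """Hypothetical stand-in for the check context object."""

    def __init__(self, projects):
        self.projects = projects
        self._dep_import_cache = None


def build_dep_import_cache(ctx, scan_project):
    """Return {project: (lib_imports, test_imports)}, scanning at most once."""
    if ctx._dep_import_cache is None:
        ctx._dep_import_cache = {
            name: scan_project(name) for name in ctx.projects
        }
    return ctx._dep_import_cache


calls = []


def fake_scan(name):
    """Record each scan so the number of source tree walks can be observed."""
    calls.append(name)
    return ({"core"}, set())


ctx = CheckContext(["app", "cli"])
for _check in ("deps-unused", "deps-undeclared",
               "deps-runtime-test-only", "deps-dev-in-lib"):
    cache = build_dep_import_cache(ctx, fake_scan)
```

After the loop, `calls` holds one entry per project even though four checks requested the data.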

The intra-package checks (dead-modules, circular-deps) build their own file-level graphs using the same underlying walk_source_files() and AST infrastructure, but do not share the workspace cache since they operate at a different granularity (individual files rather than workspace package names).

#Entry point detection

Dead-module analysis requires knowing which files serve as roots for BFS reachability traversal. A file is considered an entry point if it is part of the package's public API or an executable script that users invoke directly. Entry point detection varies by language because each ecosystem has different conventions for declaring public surfaces:

| Language | Entry points |
| --- | --- |
| Python | __init__.py files (package entry points); all production modules cross-reference each other via import prefix matching |
| Go | Internal packages only -- checks whether any non-test file outside the package directory imports the package path |
| npm | package.json fields: exports (recursive path collection), main, bin (string or dict of paths) |
| Dart | lib/<package_name>.dart (barrel file from pubspec.yaml name field) + all bin/*.dart scripts |
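Once entry points are known, dead-module detection reduces to a reachability search. The sketch below assumes a dict-of-lists import graph. `find_dead` is a name invented for this example.

```python
from collections import deque


def find_dead(graph, entry_points):
    """Return files that no entry point reaches through imports."""
    reachable = set(entry_points)
    queue = deque(entry_points)
    while queue:
        current = queue.popleft()
        for target in graph.get(current, ()):
            if target not in reachable:
                reachable.add(target)
                queue.append(target)
    # Include files that only ever appear as import targets.
    all_files = set(graph) | {t for targets in graph.values() for t in targets}
    return sorted(all_files - reachable)


graph = {
    "index.ts": ["api.ts"],
    "api.ts": ["util.ts"],
    "util.ts": [],
    "legacy.ts": ["util.ts"],   # imports others but nothing imports it
}
dead = find_dead(graph, ["index.ts"])
```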

#Source modules

The import scanning implementation spans 3 modules: import_scanners provides the per-language workspace-level scanners, lint.protocol defines the abstract interfaces for per-language linters and import scanners, and lint.utils provides the shared file-walking utilities.

#rlsbl.import_scanners

Python, Dart, npm, Go, Java, and Kotlin import scanners for dependency-import validation.

Filters raw import data to workspace-relevant imports, handles language-specific edge cases, and distinguishes lib/ vs test/ contexts.

#ImportInfo

A single workspace-relevant import detected in a source file.

#_is_test_context

python
def _is_test_context(filepath: str, project_path: str) -> bool

Determine whether a file is in a non-production context.

Uses a layered approach to avoid false positives for production paths that happen to contain directory names like "test":

Layer 1 -- Unconditional directories (match at any depth): __tests__/, testdata/

Layer 2 -- Root-relative directories (match only as first component): test/, tests/, example/, examples/, integration_test/

Layer 3 -- File name patterns (checked against basename): test_*.py, *_test.py, *_test.go, *_test.dart, *.test.[jt]sx?, *.spec.[jt]sx?, conftest.py
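The three layers can be sketched as follows. This approximates the documented behavior and is not the actual function body. The constant names are invented, and the JS/TS patterns are treated as a regular expression because `[jt]sx?` is not glob syntax.

```python
import fnmatch
import os
import re

ANY_DEPTH_DIRS = {"__tests__", "testdata"}
ROOT_ONLY_DIRS = {"test", "tests", "example", "examples", "integration_test"}
GLOB_PATTERNS = ("test_*.py", "*_test.py", "*_test.go", "*_test.dart", "conftest.py")
JS_TEST_RE = re.compile(r".+\.(test|spec)\.[jt]sx?$")


def is_test_context(filepath, project_path):
    """Classify a file as non-production using the three layers."""
    rel = os.path.relpath(filepath, project_path)
    parts = rel.split(os.sep)
    dirs, basename = parts[:-1], parts[-1]
    # Layer 1: directories that mark test code at any depth.
    if any(d in ANY_DEPTH_DIRS for d in dirs):
        return True
    # Layer 2: directories that count only as the first path component.
    if dirs and dirs[0] in ROOT_ONLY_DIRS:
        return True
    # Layer 3: file name conventions, checked against the basename.
    if any(fnmatch.fnmatchcase(basename, pattern) for pattern in GLOB_PATTERNS):
        return True
    return JS_TEST_RE.match(basename) is not None
```

Restricting layer 2 to the first component is what keeps a production path such as src/test/helpers.py from being misclassified.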

#build_namespace_map

python
def build_namespace_map(projects, workspace_root: str) -> dict[str, str]

Map namespace-qualified import paths to workspace project names.

For a project named 'protocols' at 'protocols/src/orxt/protocols/', returns {'orxt.protocols': 'protocols'}.

Algorithm:

  1. For each project, call detect_python_package_root() to get the package root (e.g., 'src/orxt')
  2. The namespace is the package root's leaf directory name (e.g., 'orxt')
  3. Walk subdirectories of the package root looking for the project's directory name
  4. If src/orxt/protocols/ exists and project name is 'protocols', map 'orxt.protocols' -> 'protocols'
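A simplified sketch of steps 2 to 4 follows. It assumes the package roots have already been detected and are passed in directly, and it checks only immediate subdirectories. `namespace_map_sketch` is a hypothetical name, and detect_python_package_root() is not reimplemented here.

```python
import os


def namespace_map_sketch(package_roots):
    """Map 'namespace.project' import paths to project names.

    package_roots: {project_name: path to its package root, e.g. '.../src/orxt'}.
    """
    result = {}
    for name, root in package_roots.items():
        namespace = os.path.basename(os.path.normpath(root))
        # Map only when a subdirectory named after the project exists.
        if os.path.isdir(os.path.join(root, name)):
            result[f"{namespace}.{name}"] = name
    return result
```

For the layout in the example above, passing `{'protocols': '<workspace>/protocols/src/orxt'}` would yield `{'orxt.protocols': 'protocols'}`, provided the `protocols` subdirectory exists on disk.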

#PythonImportScanner

Scan Python source files for workspace-relevant imports.

Uses the AST-based scanner from the lint system, then post-processes to filter out stdlib, relative imports, and non-workspace packages. Supports namespace package detection via namespace_map and import_names.

#scan

python
def scan(self, project_path: str, workspace_names: set[str], exclude_dirs: list[str] | None=None, *, namespace_map: dict[str, str] | None=None, import_names: dict[str, str] | None=None) -> list[ImportInfo]

Scan project_path for Python imports matching workspace members.

Args:

  • project_path: absolute path to the project root.
  • workspace_names: set of workspace member package names (as they appear in pyproject.toml, e.g. "my-lib").
  • exclude_dirs: directory paths to skip during the walk (relative to project_path or absolute).
  • namespace_map: mapping of namespace-qualified import paths to workspace project names (e.g., {'orxt.protocols': 'protocols'}). Built by build_namespace_map().
  • import_names: mapping of project_name -> import_name from workspace config. Used for explicit import_name overrides.

Returns:

  • list of ImportInfo for imports that match workspace members.

#DartImportScanner

Scan Dart source files for workspace-relevant package imports.

Uses regex to extract package names from import/export statements. Checks for missing generated (.g.dart) files when build_runner is configured.
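A regex of roughly the following shape would cover the common import and export forms. The pattern and the helper `dart_package_imports` are assumptions for illustration, and the actual expression used by DartImportScanner may differ.

```python
import re

# Captures the package name from import/export 'package:<name>/...' directives.
_DART_PACKAGE_RE = re.compile(
    r"""^\s*(?:import|export)\s+['"]package:([A-Za-z0-9_]+)/""",
    re.MULTILINE,
)


def dart_package_imports(source):
    """Return (package_name, line_number) pairs found in Dart source text."""
    results = []
    for match in _DART_PACKAGE_RE.finditer(source):
        # Count newlines before the captured name to get a 1-based line number.
        line = source.count("\n", 0, match.start(1)) + 1
        results.append((match.group(1), line))
    return results


source = """import 'dart:async';
import 'package:core/core.dart';
export "package:shared_ui/widgets.dart";
import '../local.dart';
"""
found = dart_package_imports(source)
```

The dart: and relative imports fall through because the pattern requires the package: scheme.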

#scan

python
def scan(self, project_path: str, workspace_names: set[str], exclude_dirs: list[str] | None=None) -> list[ImportInfo]

Scan project_path for Dart imports matching workspace members.

Args:

  • project_path: absolute path to the project root.
  • workspace_names: set of workspace member package names (as they appear in pubspec.yaml).
  • exclude_dirs: directory paths to skip during the walk (relative to project_path or absolute).

Returns:

  • list of ImportInfo for imports that match workspace members.

Raises:

  • RuntimeError: if build.yaml exists but no .g.dart files are found in the project (missing code generation).

#_check_generated_files

python
def _check_generated_files(self, project_path: str) -> None

Raise RuntimeError if build_runner is configured but no .g.dart files exist.

#_extract_npm_bare_name

python
def _extract_npm_bare_name(specifier: str) -> str | None

Extract bare package name from an npm import specifier.

Returns None for relative imports, Node.js builtins, and node:-prefixed builtins. For scoped packages (@scope/pkg/foo), returns @scope/pkg. For unscoped (pkg/foo), returns pkg.
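The rules above translate into a short function. This is a sketch: `extract_bare_name` is a stand-in name, `NODE_BUILTINS` is a small illustrative subset instead of the full builtin list, and treating absolute paths like relative ones is an assumption.

```python
NODE_BUILTINS = {"fs", "path", "os", "http", "crypto"}  # illustrative subset


def extract_bare_name(specifier):
    """Return the bare package name for an import specifier, or None."""
    if not specifier or specifier.startswith((".", "/", "node:")):
        return None
    parts = specifier.split("/")
    if specifier.startswith("@"):
        # Scoped package: the name is the scope plus the next segment.
        if len(parts) >= 2 and parts[1]:
            return "/".join(parts[:2])
        return None
    name = parts[0]
    return None if name in NODE_BUILTINS else name


examples = {
    spec: extract_bare_name(spec)
    for spec in ("@scope/pkg/foo", "pkg/foo", "./local", "node:fs", "fs/promises")
}
```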

#NpmImportScanner

Scan JS/TS source files for workspace-relevant imports.

Uses the AST-based scanner from the npm lint system, then post-processes to filter out relative imports, Node.js builtins, and non-workspace packages.

#scan

python
def scan(self, project_path: str, workspace_names: set[str], exclude_dirs: list[str] | None=None) -> list[ImportInfo]

Scan project_path for JS/TS imports matching workspace members.

Args:

  • project_path: absolute path to the project root.
  • workspace_names: set of workspace member package names (as they appear in package.json, e.g. "@scope/my-lib").
  • exclude_dirs: directory paths to skip during the walk (relative to project_path or absolute).

Returns:

  • list of ImportInfo for imports that match workspace members.

#GoImportScanner

Scan Go source files for workspace-relevant imports.

Uses the tree-sitter-based scanner from the Go lint system, then post-processes to filter to imports matching other workspace projects' Go module paths.

#scan

python
def scan(self, project_path: str, workspace_names: set[str], exclude_dirs: list[str] | None=None, *, module_path_map: dict[str, str] | None=None) -> list[ImportInfo]

Scan project_path for Go imports matching workspace members.

Args:

  • project_path: absolute path to the project root.
  • workspace_names: set of workspace member package names.
  • exclude_dirs: directory paths to skip during the walk (relative to project_path or absolute).
  • module_path_map: mapping of workspace project name to its Go module path (from go.mod). Only Go projects appear in this map. Required for Go import detection.

Returns:

  • list of ImportInfo for imports that match workspace members.

#_match_workspace_import

python
def _match_workspace_import(import_path: str, module_to_name: dict[str, str]) -> str | None

Check if an import path belongs to a workspace sibling.

An import matches a workspace module if the import path equals the module path or starts with it followed by '/'.

#build_jvm_package_map

python
def build_jvm_package_map(projects: list, workspace_root: str) -> dict[str, str]

Map Java/Kotlin package prefixes to workspace project names.

For each workspace project with a pom.xml or build.gradle(.kts), reads the groupId (from POM) or group (from Gradle) and maps it to the project name. This allows import scanning to determine which workspace project an import like com.example.foo.Bar belongs to.

Args:

  • projects: list of workspace project dicts/objects with name and path attributes.

  • workspace_root: absolute path to the workspace root.

Returns:

  • dict mapping dotted package prefix to workspace project name, e.g. {"com.example.foo": "foo-lib"}

#_JvmImportScannerBase

Base class for Java and Kotlin import scanners.

Scans source files for import statements matching workspace projects via a package prefix map. Subclasses specify which file extensions to scan.
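One plausible way to match imports against the prefix map is shown below. The regex, the helper name `match_package_prefix`, and the longest-prefix tie-break are all assumptions for illustration. The source does not state how overlapping prefixes are resolved.

```python
import re

# Captures the dotted path of Java/Kotlin import statements, including static imports.
_JVM_IMPORT_RE = re.compile(r"^\s*import\s+(?:static\s+)?([\w.]+)", re.MULTILINE)


def match_package_prefix(import_path, package_map):
    """Return the project for the longest matching dotted prefix, or None."""
    best = None
    for prefix in package_map:
        # The "." suffix prevents com.example.foo from matching com.example.foobar.
        if import_path == prefix or import_path.startswith(prefix + "."):
            if best is None or len(prefix) > len(best):
                best = prefix
    return package_map[best] if best is not None else None


source = (
    "package com.example.app;\n"
    "import com.example.foo.Bar;\n"
    "import static com.example.foo.util.Helpers.join;\n"
    "import java.util.List;\n"
)
package_map = {"com.example.foo": "foo-lib", "com.example.foo.util": "foo-util"}
imports = _JVM_IMPORT_RE.findall(source)
matched = [match_package_prefix(path, package_map) for path in imports]
```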

#scan

python
def scan(self, project_path: str, workspace_names: set[str], exclude_dirs: list[str] | None=None, *, package_map: dict[str, str] | None=None) -> list[ImportInfo]

Scan project_path for JVM imports matching workspace members.

Args:

  • project_path: absolute path to the project root.
  • workspace_names: set of workspace member package names.
  • exclude_dirs: directory paths to skip during the walk (relative to project_path or absolute).
  • package_map: mapping of dotted package prefix to workspace project name. Built by build_jvm_package_map(). Required for JVM import detection.

Returns:

  • list of ImportInfo for imports that match workspace members.

#JavaImportScanner

Scan Java source files for workspace-relevant imports.

Uses regex to extract import statements from .java files, then matches against the workspace package prefix map.

#KotlinImportScanner

Scan Kotlin source files for workspace-relevant imports.

Uses regex to extract import statements from .kt and .kts files, then matches against the workspace package prefix map.

#rlsbl.lint.protocol

Abstract protocols defining interfaces for per-language linters and import scanners.

#LanguageLinter

#lint

python
def lint(self, project_path: str, config: LanguageLintConfig) -> list[LintResult]

#ImportScanner

#scan_imports

python
def scan_imports(self, project_path: str) -> set[tuple[str, str, int, bool]]

Collect all imports from source files in a project.

Returns a set of (package_name, file_path, line_number, guarded) tuples. Guarded imports are those inside try/except ImportError blocks.
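To show what the guarded flag means in practice, here is a sketch for a single Python file. It uses the standard-library ast module in place of the tree-sitter parser the project uses. The function names are invented, and treating ModuleNotFoundError like ImportError is an assumption.

```python
import ast


def _catches_import_error(handler):
    """True if an except clause names ImportError (or its subclass)."""
    names = {"ImportError", "ModuleNotFoundError"}
    exc = handler.type
    if exc is None:
        return False
    candidates = exc.elts if isinstance(exc, ast.Tuple) else [exc]
    return any(isinstance(c, ast.Name) and c.id in names for c in candidates)


def scan_imports_with_guard(source, file_path):
    """Return {(package_name, file_path, line_number, guarded)} for one file."""
    tree = ast.parse(source)
    # First pass: record line numbers of imports inside guarded try bodies.
    guarded_lines = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Try) and any(
            _catches_import_error(h) for h in node.handlers
        ):
            for stmt in node.body:
                for inner in ast.walk(stmt):
                    if isinstance(inner, (ast.Import, ast.ImportFrom)):
                        guarded_lines.add(inner.lineno)
    # Second pass: collect absolute imports, keyed by top-level package.
    results = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            results.add((top_level, file_path, node.lineno, node.lineno in guarded_lines))
    return results


source = (
    "import os\n"
    "try:\n"
    "    import yaml\n"
    "except ImportError:\n"
    "    yaml = None\n"
    "from . import sibling\n"
)
found = scan_imports_with_guard(source, "a.py")
```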

#rlsbl.lint.utils

Shared file-walking utilities for linters providing recursive directory traversal with gitignore-aware filtering and extension matching.

#walk_source_files

python
def walk_source_files(project_path: str, extensions: tuple[str, ...], exclude_patterns: list[str], exclude_dirs: list[str] | None=None) -> list[str]

Walk project directory, return source files matching extensions.

Excludes directories in _EXCLUDED_DIRS and .egg-info dirs. Applies exclude_patterns (fnmatch) against relative paths. Skips directories whose normalized absolute path matches any entry in exclude_dirs (used to exclude sibling workspace project directories). By default (empty exclude_patterns), all files including tests are included.