Program Inventory

Before refactoring or migrating SAS code, teams need a high-level map of each program: how many DATA steps and procedures it contains, where they appear in the source, and which kinds of constructs dominate. The parser turns source files into a structured inventory that you can print, export to CSV, or feed into a dashboard.

Add dependencies

repositories {
    mavenLocal()
    mavenCentral()
    flatDir {
        dirs("deps")
    }
}

dependencies {
    implementation(files("deps/sas-parser-with-dependencies-1.6.5-all.jar"))
}

Collect top-level steps

SourceFile exposes statementsAndDeclarations, the top-level elements of the program. Classify each node by its type, falling back to simpleNodeType (getSimpleNodeType() in Java) for constructs you do not handle explicitly, and record its source position.

import com.strumenta.sas.ast.Identifier
import com.strumenta.sas.ast.SourceFile
import com.strumenta.sas.ast.VariableExpression
import com.strumenta.sas.ast.datastep.DataStep
import com.strumenta.sas.ast.macro.MacroDefinition
import com.strumenta.sas.ast.other.DatasetSpec
import com.strumenta.sas.ast.sql.SqlProcedure
import com.strumenta.kolasu.commercial.LicenseManager
import com.strumenta.sas.parser.SASLanguage
import java.io.File

data class StepEntry(val kind: String, val detail: String, val line: Int)

// Render a dataset reference as "library.name", or just "name" when no library is set.
fun formatDatasetSpec(spec: DatasetSpec): String {
    val name = spec.name
    val textName = when (name) {
        is VariableExpression -> name.variable
        is Identifier -> name.name
        else -> name?.toString() ?: "?"
    }
    return if (spec.library != null) "${spec.library}.$textName" else textName
}

// Build one entry per top-level node; a line of 0 means the node has no position.
fun inventory(root: SourceFile): List<StepEntry> {
    return root.statementsAndDeclarations.map { node ->
        val line = node.position?.start?.line ?: 0
        when (node) {
            is DataStep -> {
                val outs = node.datasets.joinToString(", ") { formatDatasetSpec(it) }
                StepEntry("DATA", outs.ifEmpty { "?" }, line)
            }
            is SqlProcedure -> StepEntry("PROC SQL", "${node.sqlStatements.size} statement(s)", line)
            is MacroDefinition -> StepEntry("MACRO", node.name, line)
            // Anything else is labelled with its generic node type.
            else -> StepEntry(node.simpleNodeType, "", line)
        }
    }
}

fun printMarkdownInventory(file: File, entries: List<StepEntry>) {
    println("# Inventory: ${file.name}")
    println()
    println("| Kind | Detail | Line |")
    println("|------|--------|------|")
    entries.forEach { e ->
        println("| ${e.kind} | ${e.detail} | ${e.line} |")
    }
}

fun main() {
    val sasFile = File("examples/SAS/all-the-code.sas")
    // Register the license before parsing.
    LicenseManager.registerLicense(File("licenses/strumenta.SAS.license"))
    val sas = SASLanguage()
    sas.parseNativeSQL = true
    // Stop quietly if the parser produced no AST root.
    val root = sas.parse(sasFile).root ?: return
    printMarkdownInventory(sasFile, inventory(root))
}
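
To see which constructs dominate a program, the entries can be tallied by kind. The following is a minimal sketch that works only on the StepEntry list and does not touch the parser; countByKind and printSummary are illustrative names, not part of the library, and StepEntry is repeated here only so the snippet stands on its own.

```kotlin
// Same shape as the StepEntry class used in the inventory example.
data class StepEntry(val kind: String, val detail: String, val line: Int)

// Count entries per kind, most frequent first; ties are ordered alphabetically.
// (Hypothetical helper, not provided by the parser.)
fun countByKind(entries: List<StepEntry>): List<Pair<String, Int>> =
    entries.groupingBy { it.kind }
        .eachCount()
        .toList()
        .sortedWith(compareByDescending<Pair<String, Int>> { it.second }.thenBy { it.first })

// Print the tally as a Markdown table, matching the style of printMarkdownInventory.
fun printSummary(entries: List<StepEntry>) {
    println("| Kind | Count |")
    println("|------|-------|")
    countByKind(entries).forEach { (kind, count) -> println("| $kind | $count |") }
}
```

Calling printSummary(inventory(root)) after the detailed table would give a per-kind overview of the same file.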

Batch a directory

Wrap the same logic in a directory walk (mirroring the jsonast CLI layout) to produce one inventory per .sas file or a consolidated CSV for the whole codebase.
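
One way to sketch the consolidated CSV variant is shown below. It is an assumption-laden outline, not the layout of any existing CLI: findSasFiles, csvField, and consolidatedCsv are hypothetical helpers, the column order is arbitrary, and the analyze parameter stands in for "parse the file and call inventory", so the snippet runs without the parser on the classpath.

```kotlin
import java.io.File

// Same shape as the StepEntry class used in the inventory example.
data class StepEntry(val kind: String, val detail: String, val line: Int)

// Quote a CSV field when it contains a comma, a double quote, or a line break.
fun csvField(value: String): String =
    if (value.any { it == ',' || it == '"' || it == '\n' || it == '\r' })
        "\"" + value.replace("\"", "\"\"") + "\""
    else value

// Recursively collect every .sas file under dir, sorted by path for stable output.
fun findSasFiles(dir: File): List<File> =
    dir.walkTopDown()
        .filter { it.isFile && it.extension.equals("sas", ignoreCase = true) }
        .sortedBy { it.path }
        .toList()

// Build one CSV for the whole directory. `analyze` is a hypothetical hook that
// returns the inventory of a single file (an empty list if it cannot be parsed).
fun consolidatedCsv(dir: File, analyze: (File) -> List<StepEntry>): String {
    val rows = mutableListOf("file,kind,detail,line")
    for (file in findSasFiles(dir)) {
        val relative = file.relativeTo(dir).invariantSeparatorsPath
        for (e in analyze(file)) {
            rows += listOf(relative, e.kind, e.detail, e.line.toString())
                .joinToString(",") { csvField(it) }
        }
    }
    return rows.joinToString("\n")
}
```

With the parser set up as in main above, the hook could be a lambda along the lines of { f -> sas.parse(f).root?.let { inventory(it) } ?: emptyList() }. For one inventory per file instead, the same loop could call printMarkdownInventory for each result.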

Combine with deeper analysis

An inventory answers what is in a file. Pair it with SQL lineage and DATA step I/O to answer how data flows through those steps.