Version: Next

Transform the AST

Parsers produce ASTs so that you can analyze and rewrite code. Starlasu (Kolasu) supports tree transformation: the transformer visits each node, replaces or transforms it, and builds a new root.

Use transforms for refactoring helpers, normalization passes, and the first stage of a transpiler. Transformers are the foundational feature we at Strumenta use to build our transpilers.

When to use transforms

Tool | Typical use
Read-only walk / visitor | Inventory, lineage, metrics
Transform | Rename identifiers, strip constructs, transpile to target language
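
To make the distinction in the table concrete, here is a minimal sketch in plain Kotlin. It does not use the Kolasu API: Expr, Num, Ref, Add, referencedNames, and rename are hypothetical names introduced only for illustration. The read-only walk gathers information and leaves the tree alone, while the transform rebuilds the tree and returns a new root.

```kotlin
// Toy AST for illustration only; these are not Kolasu types.
sealed class Expr
data class Num(val value: Int) : Expr()
data class Ref(val name: String) : Expr()
data class Add(val left: Expr, val right: Expr) : Expr()

// Read-only walk: collects the referenced names and leaves the tree untouched.
fun referencedNames(e: Expr): List<String> = when (e) {
    is Num -> emptyList()
    is Ref -> listOf(e.name)
    is Add -> referencedNames(e.left) + referencedNames(e.right)
}

// Transform: rebuilds the tree bottom-up and returns a new root.
fun rename(e: Expr, from: String, to: String): Expr = when (e) {
    is Num -> e
    is Ref -> if (e.name == from) Ref(to) else e
    is Add -> Add(rename(e.left, from, to), rename(e.right, from, to))
}

fun main() {
    val tree = Add(Ref("x"), Add(Num(1), Ref("y")))
    println(referencedNames(tree))      // [x, y]
    println(rename(tree, "x", "total")) // Add(left=Ref(name=total), right=Add(left=Num(value=1), right=Ref(name=y)))
}
```

The original tree is unchanged after rename runs, which is what "create a new root" means in practice.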

Add dependencies

repositories {
    mavenLocal()
    mavenCentral()
    flatDir {
        dirs("deps")
    }
}

dependencies {
    implementation(files("deps/sas-parser-with-dependencies-1.6.5-all.jar"))
}

Collect DatasetSpec

This example collects the name of every DatasetSpec. ASTTransformer.transform gives you complete control over every transformation. The consequence is that you must define a transformation not only for each node you want to transform, but also for every node that can contain it. For any other node, the transformer applies the default transformation you chose when creating the ASTTransformer.

import com.strumenta.kolasu.commercial.LicenseManager
import com.strumenta.kolasu.model.Node
import com.strumenta.kolasu.transformation.ASTTransformer
import com.strumenta.kolasu.transformation.IDENTTITY_TRANSFORMATION
import com.strumenta.kolasu.validation.Issue
import com.strumenta.sas.ast.Identifier
import com.strumenta.sas.ast.SourceFile
import com.strumenta.sas.ast.VariableExpression
import com.strumenta.sas.ast.datastep.DataStep
import com.strumenta.sas.ast.other.DatasetSpec
import com.strumenta.sas.parser.SASLanguage
import java.io.File

// Renders a DatasetSpec as "library.name", or just "name" when no library is set.
fun formatDatasetSpec(spec: DatasetSpec): String {
    val name = spec.name
    val textName = when (name) {
        is VariableExpression -> name.variable
        is Identifier -> name.name
        else -> name?.toString() ?: "?"
    }
    return if (spec.library != null) "${spec.library}.$textName" else textName
}

// Target nodes produced by the transformation.
data class ActualName(val name: String) : Node()

data class ActualNames(val names: List<ActualName>) : Node()

data class DataSteps(val names: List<String>) : Node()

fun collectDatasetSpecNames(sasFile: File, license: File) {
    LicenseManager.registerLicense(license)
    val sas = SASLanguage()
    val result = sas.parse(sasFile)
    val root = result.root

    // Start from the parsing issues so that transformation issues are added to the same list.
    val issues: MutableList<Issue> = result.issues.toMutableList()
    val documenter = ASTTransformer(
        issues = issues,
        allowGenericNode = true,
        defaultTransformation = IDENTTITY_TRANSFORMATION
    )

    // SourceFile contains the DataSteps: transform each one and flatten the names.
    documenter.registerNodeFactory(SourceFile::class) { sf: SourceFile, transformer ->
        val dataSteps = sf.statementsAndDeclarations.filterIsInstance<DataStep>()
        DataSteps(
            dataSteps.flatMap { step ->
                (transformer.transform(step) as ActualNames).names.map { it.name }
            }
        )
    }

    // DataStep contains the DatasetSpecs: transform each one into an ActualName.
    documenter.registerNodeFactory(DataStep::class) { dt: DataStep, transformer ->
        ActualNames(
            dt.datasets.map { transformer.transform(it) as ActualName }
        )
    }

    // DatasetSpec is the node we are interested in.
    documenter.registerNodeFactory(DatasetSpec::class) { dt: DatasetSpec, _ ->
        ActualName(formatDatasetSpec(dt))
    }

    val newAST = documenter.transform(root)
    println(newAST)
}

Typically you define a transformation for every node type, unless you have a well-defined and specific need, such as transforming all Data Steps into SQL statements.
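
The need to register every containing node can be illustrated with a toy transformer written in plain Kotlin. This is a sketch under stated assumptions, not the Kolasu implementation: ToyTransformer, Item, Leaf, Box, and Renamed are hypothetical names, and the fallback here is a plain identity function that returns the node without descending into its children. The default you select in Kolasu may treat children differently.

```kotlin
import kotlin.reflect.KClass

// Hypothetical toy tree; none of these are Kolasu or SAS parser types.
sealed class Item
data class Leaf(val name: String) : Item()
data class Box(val children: List<Item>) : Item()
data class Renamed(val name: String) : Item()

// A tiny registry-based transformer: one rule per node class, plus a fallback.
class ToyTransformer(private val default: (Item) -> Item) {
    private val rules = mutableMapOf<KClass<out Item>, (Item, ToyTransformer) -> Item>()

    @Suppress("UNCHECKED_CAST")
    fun <T : Item> register(kclass: KClass<T>, rule: (T, ToyTransformer) -> Item) {
        rules[kclass] = { node, t -> rule(node as T, t) }
    }

    // Uses the registered rule when there is one, otherwise the fallback.
    fun transform(node: Item): Item =
        rules[node::class]?.invoke(node, this) ?: default(node)
}

fun main() {
    val tree = Box(listOf(Leaf("a"), Leaf("b")))

    // Only the leaf rule is registered: the Box goes to the identity fallback,
    // so its leaves are never visited.
    val leafOnly = ToyTransformer(default = { it })
    leafOnly.register(Leaf::class) { leaf, _ -> Renamed(leaf.name.uppercase()) }
    println(leafOnly.transform(tree)) // Box(children=[Leaf(name=a), Leaf(name=b)])

    // Registering the container as well lets the transformation reach the leaves.
    val full = ToyTransformer(default = { it })
    full.register(Leaf::class) { leaf, _ -> Renamed(leaf.name.uppercase()) }
    full.register(Box::class) { box, t -> Box(box.children.map { t.transform(it) }) }
    println(full.transform(tree)) // Box(children=[Renamed(name=A), Renamed(name=B)])
}
```

In the first run the leaf rule is never called because nothing tells the transformer how to get from the Box to its children. That is the same reason the DatasetSpec example registers factories for SourceFile and DataStep as well.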