Version: 1.6.5

Build a Typed Visitor

Walking the AST with walk() (Kotlin) or Traversing (Java) works well for simple filters. When an analysis spans many node types or needs to accumulate context, a typed visitor is often the better fit. The visitor pattern is a common design pattern that will be familiar to many developers.

The SAS parser ships with SASBaseVisitor, a visitor base class with one visit method per AST node type.
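To make "one visit method per AST node type" concrete, here is a minimal sketch of the pattern on a toy AST. Every name in it (ToyNode, ToyIdentifier, ToySetStatement, ToyDataStep, ToyBaseVisitor, ToySetCollector) is a hypothetical stand-in invented for illustration; the real SASBaseVisitor and the real node classes may be organized differently.

```kotlin
// Toy stand-ins for AST nodes; these are NOT the real com.strumenta.sas classes.
sealed class ToyNode
data class ToyIdentifier(val name: String) : ToyNode()
data class ToySetStatement(val dataset: ToyIdentifier) : ToyNode()
data class ToyDataStep(val statements: List<ToyNode>) : ToyNode()

// Hypothetical base visitor: one typed overload per node type plus a dispatcher.
abstract class ToyBaseVisitor<T> {
    abstract fun defaultResult(): T

    // Dispatch on the runtime type to the matching typed overload.
    fun visit(node: ToyNode): T = when (node) {
        is ToyIdentifier -> visit(node)
        is ToySetStatement -> visit(node)
        is ToyDataStep -> visit(node)
    }

    // Defaults: leaves return the default result, containers descend into their children.
    open fun visit(node: ToyIdentifier): T = defaultResult()
    open fun visit(node: ToySetStatement): T = visit(node.dataset)
    open fun visit(node: ToyDataStep): T {
        node.statements.forEach { visit(it) }
        return defaultResult()
    }
}

// Overrides only the node types it cares about: collects dataset names read by SET statements.
class ToySetCollector : ToyBaseVisitor<List<String>>() {
    override fun defaultResult(): List<String> = emptyList()
    override fun visit(node: ToySetStatement): List<String> = listOf(node.dataset.name)
    override fun visit(node: ToyDataStep): List<String> = node.statements.flatMap { visit(it) }
}
```

Calling ToySetCollector().visit(...) on a ToyDataStep returns the names from its SET statements in source order, while every other node type falls back to the base-class default.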

When to use a visitor

| Approach | Good for |
| --- | --- |
| walk() / filterIsInstance (Kotlin) or Traversing.walk + instanceof (Java) | Working on single types, quick scripts |
| SASBaseVisitor | Multi-pass analysis, lint rules, basic transpiler back-ends |
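For comparison, the walk-and-filter approach can be sketched on a toy tree as follows. WNode, WIdentifier, WStatement, walkAll, and identifierNames are hypothetical names used only for this illustration; walkAll stands in for the library's walk() and is assumed here to be a depth-first, pre-order traversal.

```kotlin
// Toy tree; hypothetical stand-ins, not the real parser AST.
open class WNode(val children: List<WNode> = emptyList())
class WIdentifier(val name: String) : WNode()
class WStatement(children: List<WNode>) : WNode(children)

// Depth-first, pre-order walk over the node and all of its descendants.
fun WNode.walkAll(): Sequence<WNode> = sequence {
    yield(this@walkAll)
    for (child in children) yieldAll(child.walkAll())
}

// Single-type query: keep only identifiers and read their names.
fun identifierNames(root: WNode): List<String> =
    root.walkAll().filterIsInstance<WIdentifier>().map { it.name }.toList()
```

This style stays compact as long as the query concerns one node type. Once several node types need different handling, the chain of type checks grows and a typed visitor tends to be easier to maintain.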

Add dependencies

repositories {
    mavenLocal()
    mavenCentral()
    flatDir {
        dirs("deps")
    }
}

dependencies {
    implementation(files("deps/sas-parser-with-dependencies-1.6.5-all.jar"))
}

A simple max length visitor

This visitor finds the length of the longest identifier in a source file. The default SASBaseVisitor implementation dispatches to the correct visit overload for each node type. The base class also declares two hooks: aggregateResult, which combines the results of visiting several nodes, and defaultResult, which supplies the result for nodes that have no specific handling. We override both, because no default can suit every result type (for example, how would you aggregate two instances of a custom class?).

import com.strumenta.kolasu.parsing.ParsingResult
import com.strumenta.sas.ast.SourceFile
import com.strumenta.sas.ast.Identifier
import com.strumenta.sas.ast.VariableRangeByName
import com.strumenta.sas.ast.VariableRangeByNumber
import com.strumenta.sas.ast.VariableRangeWithPrefix
import com.strumenta.sas.ast.Variables
import com.strumenta.sas.ast.datastep.DataStep
import com.strumenta.sas.ast.sql.ColumnRef
import com.strumenta.sas.ast.sql.TableRef
import com.strumenta.sas.traversing.SASBaseVisitor

// Computes the length of the longest identifier found in a SAS source file.
class CountVisitor : SASBaseVisitor<Int>() {
    override fun visit(node: Identifier): Int {
        return node.name.length
    }

    // A qualified table name counts the schema part plus the table name.
    override fun visit(node: TableRef): Int {
        return (node.schema?.let { visit(it) } ?: 0) + node.table.length
    }

    // A qualified column name counts the table part plus the column name.
    override fun visit(node: ColumnRef): Int {
        return (node.table?.let { visit(it) } ?: 0) + node.column.length
    }

    // Fall back to 0 for an empty list instead of throwing.
    override fun visit(node: Variables): Int {
        return node.variables.maxOfOrNull { visit(it) } ?: 0
    }

    override fun visit(node: VariableRangeByName): Int {
        return maxOf(visit(node.from), visit(node.to))
    }

    override fun visit(node: VariableRangeByNumber): Int {
        return maxOf(visit(node.from), visit(node.to))
    }

    override fun visit(node: VariableRangeWithPrefix): Int {
        return visit(node.prefix)
    }

    override fun visit(node: DataStep): Int {
        return node.statements.maxOfOrNull { visit(it) } ?: 0
    }

    // Nodes without identifiers contribute a length of 0.
    override fun defaultResult(): Int {
        return 0
    }

    // Keep the larger of the two results, since we track a maximum rather than a sum.
    override fun aggregateResult(aggregate: Int?, nextResult: Int?): Int? {
        return if (aggregate != null && nextResult != null)
            maxOf(aggregate, nextResult)
        else
            null
    }
}

fun countIdentifierLength(result: ParsingResult<SourceFile>): Int {
    val visitor = CountVisitor()
    val maxLength = visitor.visit(result.root!!)
    return maxLength
}
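The roles of defaultResult and aggregateResult are easier to see in isolation. The sketch below assumes a base class that folds child results together, starting from the default; this mirrors how visitor base classes commonly work, but it is an assumption about SASBaseVisitor rather than a description of its internals. AggNode, AggVisitor, visitChildren, and MaxNameLength are hypothetical names.

```kotlin
// Toy node with generic children; a stand-in, not the real AST.
class AggNode(val name: String, val children: List<AggNode> = emptyList())

// Hypothetical base: folds child results, starting from defaultResult().
abstract class AggVisitor<T> {
    abstract fun defaultResult(): T
    abstract fun aggregateResult(aggregate: T, nextResult: T): T

    open fun visit(node: AggNode): T = visitChildren(node)

    // Visit each child in order and combine its result with the running aggregate.
    fun visitChildren(node: AggNode): T {
        var result = defaultResult()
        for (child in node.children) {
            result = aggregateResult(result, visit(child))
        }
        return result
    }
}

// Longest name in the tree: the default is 0 and aggregation keeps the larger value.
class MaxNameLength : AggVisitor<Int>() {
    override fun defaultResult(): Int = 0
    override fun aggregateResult(aggregate: Int, nextResult: Int): Int =
        maxOf(aggregate, nextResult)
    override fun visit(node: AggNode): Int =
        maxOf(node.name.length, visitChildren(node))
}
```

Swapping maxOf for + in aggregateResult would turn the same visitor into one that sums name lengths, which shows why the aggregation function has to match the question being asked.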

Bottom-up processing with a child index

Some analyses need children before parents, for example when aggregating dependencies from inner statements before handling the enclosing step. Build a child index once, then visit children explicitly inside each visit method:

import com.strumenta.kolasu.model.Node
import com.strumenta.kolasu.parsing.ParsingResult
import com.strumenta.kolasu.traversing.walk
import com.strumenta.sas.ast.SourceFile
import com.strumenta.sas.ast.datastep.DataStep
import com.strumenta.sas.ast.datastep.SetStatement
import com.strumenta.sas.traversing.SASBaseVisitor

class BottomUpSetCounter(result: ParsingResult<SourceFile>) :
    SASBaseVisitor<Int?>() {
    // Maps every node to its direct children, built once from the parent links.
    private val children = HashMap<Node, MutableSet<Node>>()

    init {
        result.root!!.walk().forEach { node ->
            children.getOrPut(node) { HashSet() }
            val parent = node.parent ?: return@forEach
            children.getOrPut(parent) { HashSet() }.add(node)
        }
        // Run the visitor once over the whole tree; the returned count is not stored here.
        visit(result.root!!)
    }

    fun getChildren(): HashMap<Node, MutableSet<Node>> {
        return children
    }

    // Visit the children first, then combine their counts for the enclosing step.
    override fun visit(node: DataStep): Int {
        var setCount = 0
        children[node]?.forEach { child ->
            setCount += visit(child) ?: 0
        }
        return setCount
    }

    // Each SET statement contributes one to the count.
    override fun visit(node: SetStatement): Int {
        return 1
    }

    override fun defaultResult(): Int {
        return 0
    }
}

Override only the node types you care about; the generated base class provides default implementations for everything else.
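The child-index idea does not depend on the parser, so it can be tried on a toy tree first. The sketch below assumes nodes that only know their parent; PNode, buildChildIndex, and countKind are hypothetical names, and node kinds are plain strings purely for illustration.

```kotlin
// Toy node with a parent pointer; a stand-in, not Kolasu's Node.
class PNode(val kind: String, val parent: PNode? = null)

// Build a parent -> children index in one pass over all nodes.
fun buildChildIndex(nodes: List<PNode>): Map<PNode, List<PNode>> {
    val index = LinkedHashMap<PNode, MutableList<PNode>>()
    for (node in nodes) {
        // Register the node itself so that leaves also appear in the index.
        index.getOrPut(node) { mutableListOf() }
        val parent = node.parent ?: continue
        index.getOrPut(parent) { mutableListOf() }.add(node)
    }
    return index
}

// Bottom-up count: children are resolved before the node's own contribution is added.
fun countKind(node: PNode, kind: String, index: Map<PNode, List<PNode>>): Int {
    val fromChildren = index[node].orEmpty().sumOf { countKind(it, kind, index) }
    val own = if (node.kind == kind) 1 else 0
    return fromChildren + own
}
```

This sketch stores children in lists rather than hash sets so that they keep the order in which they were encountered. That matters if a visit method depends on statement order, whereas a set is enough for order-independent results such as counts.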

Next steps

Visitors pair naturally with DATA step I/O analysis and lint-style rules that emit structured findings per node type.