Kyle Loomis · October 16, 2024
Earlier this year, I set out to build a new data query system. Previously, I had developed functionality to select fields, filter, and sort, but I needed a mechanism to apply functions to a table dynamically. The key requirements were:
- Serialization and deserialization between JSON and Kotlin data classes.
- Basic SQL functions like SUM and COUNT.
- Column aliasing: AS field_name.
- Nested data structures: SUM(COUNT(*)).
- An extensible and flexible design to handle additional functionality in the future.
These requirements might seem straightforward, but encapsulating core SQL functions in a serializable data class is challenging. Fortunately, Jackson provides a class annotation that handles the serialization/deserialization to and from JSON. The rest of the code was implemented using Kotlin data classes—no external libraries required.
Data Classes
Let's start with a sealed class called Expression that will be inherited by all child classes. The Jackson annotation JsonTypeInfo automatically creates and populates the property field matching the name of the class.
Copied!import com.fasterxml.jackson.annotation.JsonTypeInfo
@JsonTypeInfo(
use = JsonTypeInfo.Id.NAME,
include = JsonTypeInfo.As.PROPERTY,
property = "descriptor"
)
sealed class Expression
Next, we create two child classes to represent a field and function, respectively. These are useful to organize and segment the type of sub-expressions.
Copied!sealed class FieldExpression : Expression()
sealed class FunctionExpression : FieldExpression()
Core Expressions
Using these sealed classes, we can build the core data classes to represent fields and functions. Let's create the straightforward Field and Value data classes.
Copied!// `SELECT field_name ...`
data class Field(
val name: String,
val alias: String? = null
) : FieldExpression()
// `SELECT 'Hello' AS field_name ...`
data class Value(
val value: Any?,
val alias: String? = null
) : FieldExpression()
Nested Expressions
Leveraging the basic FieldExpression child classes, we can build a SingleExpression class that accepts any FieldExpression, enabling the composition of nested expressions.
Copied!data class SingleExpression(
val parameter: FieldExpression,
val operator: SqlFunction,
val alias: String? = null
) : FunctionExpression()
SqlFunction is an enum class that contains the core set of enums representing the type of functions to apply to the FieldExpression.
Copied!enum class SqlFunction {
SUM,
AVG,
ABS,
COUNT,
COUNT_DISTINCT,
MIN,
MAX,
MEDIAN,
COALESCE
}
This enables us to build a nested expression representing something like SUM(COUNT(id)):
Copied!SingleExpression(
parameter = SingleExpression(
parameter = Field(
name = "id"
),
operator = SqlFunction.COUNT
),
operator = SqlFunction.SUM
)
Serialization and Deserialization
Serializing the above into JSON yields the following:
Copied!{
"descriptor": "SingleExpression",
"parameter": {
"descriptor": "SingleExpression",
"parameter": {
"descriptor": "Field",
"name": "id",
"alias": null
},
"operator": "COUNT",
"alias": null
},
"operator": "SUM",
"alias": null
}
SQL Engine
The SQL Engine builds on the above to translate the data classes into the SQL query. There are a number of SQL frameworks, but I opted to use jOOQ for its broad support and ease of use. Let's start by modifying the FieldExpression to incorporate a toField method which constructs a jOOQ Field.
Copied!import org.jooq.Field as JooqField
sealed class FieldExpression : Expression() {
abstract fun <T : Any> toField(): JooqField<T>
}
Now can modify the Field and Value classes to add the toField logic.
Copied!data class Field(
val name: String,
val alias: String? = null
) : FieldExpression() {
override fun <T : Any> toField(): JooqField<T> {
val field: JooqField<Any> = alias?.let {
DSL.field(name).`as`(it)
} ?: DSL.field(name)
@Suppress("UNCHECKED_CAST")
return field as JooqField<T>
}
}
data class Value(
val value: Any?,
val alias: String? = null
) : FieldExpression() {
override fun <T : Any> toField(): JooqField<T> {
val field: JooqField<Any?> = alias?.let {
DSL.inline(value).`as`(it)
} ?: DSL.inline(value)
@Suppress("UNCHECKED_CAST")
return field as JooqField<T>
}
}
This logic handles both the construction of the field and the field alias, if applicable. The SingleExpression matches on the operator enum to apply the correct SQL function. Notice how parameter.toField() is executed before the SQL function.
Copied!import org.jooq.impl.DSL
data class SingleExpression(
val parameter: FieldExpression,
val operator: SqlFunction,
val alias: String? = null
) : FunctionExpression() {
override fun <T : Any> toField(): JooqField<T> {
val functionField = when (operator) {
SqlFunction.SUM -> DSL.sum(parameter.toField())
SqlFunction.AVG -> DSL.avg(parameter.toField())
SqlFunction.ABS -> DSL.abs(parameter.toField())
SqlFunction.COUNT -> DSL.count(parameter.toField<Any>())
SqlFunction.COUNT_DISTINCT -> DSL.countDistinct(parameter.toField<Any>())
SqlFunction.MIN -> DSL.min(parameter.toField())
SqlFunction.MAX -> DSL.max(parameter.toField())
SqlFunction.MEDIAN -> DSL.median(parameter.toField())
else -> throw ValidationException("SingleExpression does not support operator: ${operator.name}")
}
val field = alias?.let { functionField.`as`(it) } ?: functionField
@Suppress("UNCHECKED_CAST")
return field as JooqField<T>
}
}
Conclusion
This framework provides a simple yet powerful mechanism to store SQL functions in a serializable structure. While this is ready to use, I recommend implementing input/output validation to ensure type safety and prevent SQL injection.