Serializing SQL: Building Serializable Data Classes in Kotlin

    Kyle Loomis · October 16, 2024

    Serializing SQL: Building Serializable Data Classes in Kotlin

    Earlier this year, I set out to build a new data query system. Previously, I had developed functionality to select fields, filter, and sort, but I needed a mechanism to apply functions to a table dynamically. The key requirements were:

    • Serialization and deserialization between JSON and Kotlin data classes.
    • Basic SQL functions like SUM and COUNT.
    • Column aliasing: AS field_name.
    • Nested data structures: SUM(COUNT(*)).
    • An extensible and flexible design to handle additional functionality in the future.

    These requirements might seem straightforward, but encapsulating core SQL functions in a serializable data class is challenging. Fortunately, Jackson provides a class annotation that handles the serialization/deserialization to and from JSON. The rest of the code was implemented using Kotlin data classes—no external libraries required.

    Data Classes

    Let's start with a sealed class called Expression that will be inherited by all child classes. The Jackson annotation JsonTypeInfo automatically creates and populates the property field matching the name of the class.

    import com.fasterxml.jackson.annotation.JsonTypeInfo @JsonTypeInfo( use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY, property = "descriptor" ) sealed class Expression

    Next, we create two child classes to represent a field and function, respectively. These are useful to organize and segment the type of sub-expressions.

    sealed class FieldExpression : Expression() sealed class FunctionExpression : FieldExpression()

    Core Expressions

    Using these sealed classes, we can build the core data classes to represent fields and functions. Let's create the straightforward Field and Value data classes.

    // `SELECT field_name ...` data class Field( val name: String, val alias: String? = null ) : FieldExpression() // `SELECT 'Hello' AS field_name ...` data class Value( val value: Any?, val alias: String? = null ) : FieldExpression()

    Nested Expressions

    Leveraging the basic FieldExpression child classes, we can build a SingleExpression class that accepts any FieldExpression, enabling the composition of nested expressions.

    data class SingleExpression( val parameter: FieldExpression, val operator: SqlFunction, val alias: String? = null ) : FunctionExpression()

    SqlFunction is an enum class that contains the core set of enums representing the type of functions to apply to the FieldExpression.

    enum class SqlFunction { SUM, AVG, ABS, COUNT, COUNT_DISTINCT, MIN, MAX, MEDIAN, COALESCE }

    This enables us to build a nested expression representing something like SUM(COUNT(id)):

    SingleExpression( parameter = SingleExpression( parameter = Field( name = "id" ), operator = SqlFunction.COUNT ), operator = SqlFunction.SUM )

    Serialization and Deserialization

    Serializing the above into JSON yields the following:

    { "descriptor": "SingleExpression", "parameter": { "descriptor": "SingleExpression", "parameter": { "descriptor": "Field", "name": "id", "alias": null }, "operator": "COUNT", "alias": null }, "operator": "SUM", "alias": null }

    SQL Engine

    The SQL Engine builds on the above to translate the data classes into the SQL query. There are a number of SQL frameworks, but I opted to use jOOQ for its broad support and ease of use. Let's start by modifying the FieldExpression to incorporate a toField method which constructs a jOOQ Field.

    import org.jooq.Field as JooqField sealed class FieldExpression : Expression() { abstract fun <T : Any> toField(): JooqField<T> }

    Now can modify the Field and Value classes to add the toField logic.

    data class Field( val name: String, val alias: String? = null ) : FieldExpression() { override fun <T : Any> toField(): JooqField<T> { val field: JooqField<Any> = alias?.let { DSL.field(name).`as`(it) } ?: DSL.field(name) @Suppress("UNCHECKED_CAST") return field as JooqField<T> } } data class Value( val value: Any?, val alias: String? = null ) : FieldExpression() { override fun <T : Any> toField(): JooqField<T> { val field: JooqField<Any?> = alias?.let { DSL.inline(value).`as`(it) } ?: DSL.inline(value) @Suppress("UNCHECKED_CAST") return field as JooqField<T> } }

    This logic handles both the construction of the field and the field alias, if applicable. The SingleExpression matches on the operator enum to apply the correct SQL function. Notice how parameter.toField() is executed before the SQL function.

    import org.jooq.impl.DSL data class SingleExpression( val parameter: FieldExpression, val operator: SqlFunction, val alias: String? = null ) : FunctionExpression() { override fun <T : Any> toField(): JooqField<T> { val functionField = when (operator) { SqlFunction.SUM -> DSL.sum(parameter.toField()) SqlFunction.AVG -> DSL.avg(parameter.toField()) SqlFunction.ABS -> DSL.abs(parameter.toField()) SqlFunction.COUNT -> DSL.count(parameter.toField<Any>()) SqlFunction.COUNT_DISTINCT -> DSL.countDistinct(parameter.toField<Any>()) SqlFunction.MIN -> DSL.min(parameter.toField()) SqlFunction.MAX -> DSL.max(parameter.toField()) SqlFunction.MEDIAN -> DSL.median(parameter.toField()) else -> throw ValidationException("SingleExpression does not support operator: ${operator.name}") } val field = alias?.let { functionField.`as`(it) } ?: functionField @Suppress("UNCHECKED_CAST") return field as JooqField<T> } }

    Conclusion

    This framework provides a simple yet powerful mechanism to store SQL functions in a serializable structure. While this is ready to use, I recommend implementing input/output validation to ensure type safety and prevent SQL injection.