A simple query language parser that converts user queries to SQL conditions (using Antlr4 and JOOQ) with an aim for least surprise.
Example query language:
this "and that", "or these and those and" someField:"is this"
Given a defined dataset schema, users can input freeform query language to produce filtered result sets from a provided dataset.
All you have to do is:
- Define a TypedQuery from a JOOQ table
- Configure fields as searchable/orderable
- Send a JSON DataSet.Request to an endpoint
- Render produced records and sizing information
No more custom SQL!
- Unquoted terms are converted into tokens.
- Spaced terms are considered AND conditions.
- Commaed terms are considered OR conditions.
- Terms can be grouped with parentheses.
- Targets can be specialized with colons
target:value
Given the following query:
from:"some name" attachment:true (content:(user_input), user_input)
It is expected to be converted into:
WHERE
from:"some name"
AND attachment:true
AND (
content:user_input
OR any:user_input
)
And given the schema:
val query = DataSet.forTable(EMAIL) {
search("daysAgo") { s, _ ->
if (s.startsWith("<")) {
val ltDaysAgo = s.drop(1).toLongOrNull()
if (ltDaysAgo == null) null
else CREATED_AT.lessOrEqual(OffsetDateTime.now().minusDays(ltDaysAgo))
} else if (s.startsWith(">")) {
val gtDaysAgo = s.drop(1).toLongOrNull()
if (gtDaysAgo == null) null
else CREATED_AT.greaterOrEqual(OffsetDateTime.now().minusDays(gtDaysAgo))
} else {
val daysAgo = s.toLongOrNull()
if (daysAgo == null) null
else CREATED_AT.greaterOrEqual(OffsetDateTime.now().minusDays(daysAgo))
}
}
search("is_x", open = true) { s, _ ->
if (s == "x") DSL.trueCondition()
else null
}
field(ATTACHMENT) { f ->
search = { s ->
if (s.lowercase() == "true") {
f.isNull
} else {
null
}
}
}
field(CONTENT) { f ->
search = { s ->
f.containsIgnoreCase(s)
}
}
field(FROM) { f ->
search = { s ->
f.containsIgnoreCase(s)
}
}
autoDetectFields(db)
}
val req = DataSet.Request(
showColumns = true, showCounts = true, search = "x", partition = ""
)
val rsp = DataSet.Response.fromRequest(db, query, req)
Because this is a regular language, and users may supply bad data, there needs to be a method of rescuing parse errors.
Any time there is a parse error, the offending token is escaped.
This works out in the grammar to convert:
from:((, (this is parser torture""\")
to
from:(\(, \(this is parser torture\"\"\")
While the user may not end up rendering what they think, the (last applied) search is returned in the dataset result.
In the case of partitioning or sub querying, we simply send back an empty string.
Make sure to update the user's search input with searchRendered
return value to ensure that the user knows what produced their result.