Introduce a SqlDialect abstraction to make per-database logic a single source of truth
Per-database knowledge is currently scattered across many files and keyed on two parallel enums that nothing forces to agree: SupportedDatabase (8 variants: Sqlite, Duckdb, Oracle, Postgres, MySql, Mssql, Snowflake, Generic) and sqlx's AnyKind (5 variants). Because the dialect-specific data is spread over ~6 independent match sites, several keyed on the wrong enum, the compiler cannot enforce completeness: each site silently falls through to a _ default. Adding or fixing a dialect today means hunting down every site and hoping none was missed, and a missed arm is a silent correctness/data bug, not a compile error.
Proof: the references are real and scattered
$ grep -rcn -E "SupportedDatabase::(Sqlite|Postgres|MySql|Mssql|Duckdb|Oracle|Snowflake|Generic)" src | grep -v ':0' ; echo "total:" ; grep -rn -E "SupportedDatabase::(Sqlite|Postgres|MySql|Mssql|Duckdb|Oracle|Snowflake|Generic)" src | wc -l
src/webserver/database/sql.rs:37
src/webserver/database/sql/parameter_extraction.rs:9
src/filesystem.rs:3
total: 49
$ grep -rn -E "AnyKind::" src | wc -l
24
49 per-variant SupportedDatabase references (37 in sql.rs alone) plus 24 AnyKind references, all to be kept consistent by hand.
The scattered sites
The enum mismatch directly produces fragile fallthroughs that only exist because the key type is wrong:
Proposed change
Introduce a single trait SqlDialect (or a method-rich impl on SupportedDatabase) that centralizes everything per-DB:
- placeholder style and placeholder cast type
- sqlparser
Dialect
- otel
db.system.name
- default pool size
- capability flags: native JSON,
COPY FROM STDIN, trace propagation, rollback statement, :: cast support, MSSQL concat rewriting
Crucially, key the placeholder data on SupportedDatabase and make DB_PLACEHOLDERS (and the other tables) an exhaustive match, so the compiler forces every dialect to be handled and the unreachable!/_ => Odbc/_ => Generic fallthroughs disappear. Adding or fixing a dialect then becomes "implement one trait / add the arm the compiler demands" instead of editing ~6 scattered sites that nothing forces to agree.
Risks and mitigations
- Large but mechanical refactor. Mitigate by leaning on the existing dialect/placeholder tests in
sql.rs (the ALL_DIALECTS table drives many of them) which already pin per-dialect behavior.
- Land it incrementally, one capability at a time, so each step stays reviewable and bisectable.
Expected win
- Single source of truth for per-DB behavior.
- Compiler-enforced dialect completeness instead of hand-synced matches across two enums.
- Removes the silent
_ => Odbc / _ => Generic fallthroughs that can quietly mishandle (or corrupt data for) an unlisted dialect.
This refines the "split sql.rs by responsibility" suggestion in #1249: a SqlDialect abstraction is the natural seam to split along.
Introduce a
SqlDialectabstraction to make per-database logic a single source of truthPer-database knowledge is currently scattered across many files and keyed on two parallel enums that nothing forces to agree:
SupportedDatabase(8 variants: Sqlite, Duckdb, Oracle, Postgres, MySql, Mssql, Snowflake, Generic) and sqlx'sAnyKind(5 variants). Because the dialect-specific data is spread over ~6 independentmatchsites, several keyed on the wrong enum, the compiler cannot enforce completeness: each site silently falls through to a_default. Adding or fixing a dialect today means hunting down every site and hoping none was missed, and a missed arm is a silent correctness/data bug, not a compile error.Proof: the references are real and scattered
49 per-variant
SupportedDatabasereferences (37 insql.rsalone) plus 24AnyKindreferences, all to be kept consistent by hand.The scattered sites
dialect_for_dbreturns the sqlparser dialect via an 8-arm match: sql.rs#L293-L304AnyKind(only 5 entries), with a silent fallback to a temp prefix when a kind is absent: parameter_extraction.rs#L25-L46SupportedDatabase(different enum, with a_ =>default): parameter_extraction.rs#L96-L115AnyConnectionKindto special-case the PostgresCOPYpath: csv_import.rs#L172-L177CREATE TABLEper dialect: filesystem.rs#L265-L280from_dbms_name, plus its inverse mapping duplicated inside the tests: sql.rs#L786-L792The enum mismatch directly produces fragile fallthroughs that only exist because the key type is wrong:
_ => Self::Genericinfrom_dbms_name_ => run_csv_import_insert(any non-Postgres connection)unreachable!(...)when a kind is missing fromDB_PLACEHOLDERS, and the silent temp-prefix fallback for an unlisted kind (parameter_extraction.rs#L58-L60)Proposed change
Introduce a single
trait SqlDialect(or a method-rich impl onSupportedDatabase) that centralizes everything per-DB:Dialectdb.system.nameCOPY FROM STDIN, trace propagation, rollback statement,::cast support, MSSQL concat rewritingCrucially, key the placeholder data on
SupportedDatabaseand makeDB_PLACEHOLDERS(and the other tables) an exhaustivematch, so the compiler forces every dialect to be handled and theunreachable!/_ => Odbc/_ => Genericfallthroughs disappear. Adding or fixing a dialect then becomes "implement one trait / add the arm the compiler demands" instead of editing ~6 scattered sites that nothing forces to agree.Risks and mitigations
sql.rs(theALL_DIALECTStable drives many of them) which already pin per-dialect behavior.Expected win
_ => Odbc/_ => Genericfallthroughs that can quietly mishandle (or corrupt data for) an unlisted dialect.This refines the "split
sql.rsby responsibility" suggestion in #1249: aSqlDialectabstraction is the natural seam to split along.