Add support for converting pyformat SQL to positional markers with bulk_parameters#812
Add support for converting pyformat SQL to positional markers with bulk_parameters#812bgunebakan wants to merge 5 commits into
Conversation
| elif bulk_parameters is not None and _NAMED_PARAM_RE.search(sql): | ||
| sql = _rewrite_pyformat_sql(sql) |
There was a problem hiding this comment.
Wouldn't the bulk_parameters also need some sort of conversion and be in dict format for this to work reliably?
Or does pyformat imply that the bulk parameters order matches the order of named placeholders? If so - is that documented somewhere?
Do other clients allow this?
There was a problem hiding this comment.
Wouldn't the bulk_parameters also need some sort of conversion and be in dict format for this to work reliably?
I did more tests and you're right that the original implementation was fragile so I added new check for dict format.
Or does pyformat imply that the bulk parameters order matches the order of named placeholders? If so - is that documented somewhere?
This isn't specified in DB-API 2.0, bulk_parameters is a CrateDB specific and not covered by the spec.
Do other clients allow this?
Other clients does not use this, psycopg3 don't expose a bulk_parameters parameter, it uses executemany() or COPY methods.
I'm extending the docs to cover this.
| >>> connection.client.set_next_response({ | ||
| ... "results": [ | ||
| ... {"rowcount": 1}, | ||
| ... {"rowcount": 1} | ||
| ... ], | ||
| ... "duration": 123, | ||
| ... "cols": [], | ||
| ... }) |
There was a problem hiding this comment.
Shouldn't this be hidden from the public docs given that it's mocking the response and isn't a public API that should/can be used?
There was a problem hiding this comment.
I added hidden keyword.
| For advanced use cases (such as SQLAlchemy integrations) you can call | ||
| ``execute()`` directly with the ``bulk_parameters`` keyword argument, | ||
| bypassing ``executemany()``: |
There was a problem hiding this comment.
Just curios: Why is this relevant for SQLAlchemy?
To me it is generally not clear what advantage this offers over using executemany?
Judging from d4223cd it appears to me as if this was more of a side effect of re-using some logic.
There was a problem hiding this comment.
There is no server level performance improvements compared to executemany(). I didn't do any performance tests to both methods but they use same http level, just executemany() does iteration on python level so they could be a difference based on data size. probably relatively small difference.
Do you think should we remove "Using bulk_parameters directly" part from public docs?
There was a problem hiding this comment.
Do you think should we remove "Using bulk_parameters directly" part from public docs?
Given the type signature is public API I think it is fine to keep some docs, but I'd tend to nudge users to executemany() because that's DB API and merely mention that execute(..., bulk_parameters) exists as an alternative.
There was a problem hiding this comment.
| For advanced use cases (such as SQLAlchemy integrations) you can call | |
| ``execute()`` directly with the ``bulk_parameters`` keyword argument, | |
| bypassing ``executemany()``: | |
| ``execute()`` accepts a ``bulk_parameters`` keyword argument directly: | |
| .. NOTE:: | |
| Please prefer ``executemany()`` for bulk inserts, it is the standard DB-API 2.0 | |
| interface. The ``bulk_parameters`` argument is a lower-level alternative. |
maybe note like that?
Co-authored-by: Mathias Fußenegger <mfussenegger@users.noreply.github.com>
Co-authored-by: Bilal Tonga <bilaltonga@gmail.com>
Summary of the changes / Why this is an improvement
Fixes a silent runtime error when
cursor.execute()is called withbulk_parametersand a SQL statement that uses%(name)s(pyformat) placeholders.When
bulk_parametersis passed alongside pyformat SQL, the driver was sending the raw%(name)smarkers verbatim to CrateDB, which rejected them with:SQLParseException[no viable alternative at input 'VALUES (%']The fix adds an elif branch in
execute()that rewrites the SQL template to $N positional markers before the request is sent.The bug was introduced indirectly by PR #810 which declared
paramstyle = "pyformat"for the driver. Before that change, SQLAlchemy generated?style SQL, so thebulk_parameterspath was never fed pyformat placeholders so the broken code path simply didn't exist.It fixes implementation in sqlalchemy-crate package crate/sqlalchemy-cratedb#266
Checklist