Add support for MySQL DELIMITER directive #91
Recognize the MySQL/MariaDB `DELIMITER <value>` client directive as its own statement and use the new value as the terminator for subsequent statements. This unblocks editing and splitting stored-procedure scripts in tools like Beekeeper Studio where DELIMITER is the standard idiom for CREATE PROCEDURE/FUNCTION/TRIGGER bodies that contain inner `;` terminators. The tokenizer now accepts a `delimiter` parameter and emits a `semicolon` token for arbitrary symbol delimiters (`$`, `$$`, `//`, etc.). The parser tracks the current delimiter and applies a new one after a DELIMITER statement flushes. Only enabled for the mysql dialect.
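To make the delimiter-tracking idea concrete, here is a deliberately naive, line-based sketch. It is not the tokenizer or parser from this PR: `naiveSplit` and `Piece` are hypothetical names, and the sketch ignores strings and comments, which the real tokenizer has to handle. It only illustrates that a `DELIMITER` line is its own statement and that its argument becomes the terminator for what follows.

```typescript
// Hypothetical result shape for this sketch only.
interface Piece {
  text: string;
  delimiter?: string;    // terminator that ended this piece
  newDelimiter?: string; // set only on DELIMITER directives
}

function naiveSplit(input: string): Piece[] {
  const pieces: Piece[] = [];
  let current = ';'; // the default terminator
  let buffer = '';

  for (const line of input.split('\n')) {
    // A DELIMITER directive occupies a whole line and ends at end-of-line.
    const match = /^\s*DELIMITER\s+(\S+)\s*$/i.exec(line);
    const value = match ? match[1] : undefined;
    if (value !== undefined && buffer.trim() === '') {
      current = value;
      pieces.push({ text: line.trim(), newDelimiter: value });
      buffer = '';
      continue;
    }

    // Otherwise accumulate text and cut it at every occurrence of the
    // current terminator (naively: strings and comments are not skipped).
    buffer += line + '\n';
    let idx = buffer.indexOf(current);
    while (idx !== -1) {
      const end = idx + current.length;
      pieces.push({ text: buffer.slice(0, end).trim(), delimiter: current });
      buffer = buffer.slice(end);
      idx = buffer.indexOf(current);
    }
  }

  // Trailing text without a terminator is kept as a final piece.
  if (buffer.trim() !== '') {
    pieces.push({ text: buffer.trim() });
  }
  return pieces;
}

const script = [
  'DELIMITER $$',
  'CREATE PROCEDURE p() BEGIN SELECT 1; END$$',
  'DELIMITER ;',
  'SELECT 2;',
].join('\n');

const pieces = naiveSplit(script);
```

The inner `;` in the procedure body does not split the statement because the active terminator at that point is `$$`.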
| { | ||
| start: 25, | ||
| end: 33, | ||
| text: 'SELECT 2$', |
How is a user of the library meant to know that $ represents the end of the query after splitting it like this?
Maybe we should be providing statements either without the terminator (;, $), or include an optional extra field specifying the terminating character. Probably the latter and we can introduce it as a new field, rather than changing the format of the text field, unless we feel like the text change is better?
Added an endStatement field on every IdentifyResult that carries the terminator string (;, $, $$, //, etc.). Kept the text field unchanged (it still includes the terminator), so this is additive.
Consumers can now do text.slice(0, -endStatement.length) to get the statement without its terminator, or just inspect endStatement to know what the delimiter was at that point. For DELIMITER statements themselves, endStatement is omitted (they're terminated by end-of-line, not a delimiter) — the new delimiter is on the statement's newDelimiter field instead.
Documented in the README API section and with a worked example in the new "Working with MySQL DELIMITER" section.
Generated by Claude Code
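A minimal sketch of the stripping step described in the reply above. `ResultLike` and `stripTerminator` are hypothetical names, and the shape is trimmed down to the two fields under discussion rather than the library's full result type. The guard matters because `text.slice(0, -endStatement.length)` returns an empty string when the terminator is empty, and the field is omitted on `DELIMITER` statements.

```typescript
// Hypothetical, trimmed-down result shape: only the fields discussed here.
interface ResultLike {
  text: string;
  endStatement?: string; // omitted for DELIMITER statements
}

function stripTerminator(result: ResultLike): string {
  const end = result.endStatement;
  // No terminator recorded, or the text does not actually end with it:
  // return the text untouched instead of slicing blindly.
  if (!end || result.text.slice(-end.length) !== end) {
    return result.text;
  }
  return result.text.slice(0, -end.length);
}

const sampleResults: ResultLike[] = [
  { text: 'SELECT 1;', endStatement: ';' },
  { text: 'SELECT 2$', endStatement: '$' },
  { text: 'DELIMITER $$' }, // no endStatement on the directive itself
];

const strippedTexts = sampleResults.map(stripTerminator);
```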
| tables: [], | ||
| columns: [], | ||
| }, | ||
| { |
Probably the delimiter type should include a new field specifying what the new delimiter is from this point in time?
Done. DELIMITER statements now carry a newDelimiter field with the new terminator value (e.g. "$$", "//", or ";" when resetting). Quote stripping works: DELIMITER "//" yields newDelimiter: "//".
Generated by Claude Code
| }); | ||
|
|
||
| ['UPDATE', 'DELETE', 'INSERT', 'TRUNCATE'].forEach((type) => { | ||
| ['UPDATE', 'DELETE', 'INSERT', 'TRUNCATE', 'DELIMITER'].forEach((type) => { |
DELIMITER isn't really a modification query -- it's not modifying the database, it's more of a structural change to the query. Maybe we need a new or different category.
Suggest a better category. Add a new one if we need one.
Why?
- Modification == changes the database in some way. Delimiter does not do that.
- If we label as modification, then folks using this category to exclude write operations won't be able to run `DELIMITER` queries.
Good point. Introduced a new NO_OP execution type and switched DELIMITER to it. Rationale: DELIMITER is a client-side parsing directive — the server never sees it — so it doesn't fit any existing category and shouldn't be filtered alongside write operations.
NO_OP is added to the ExecutionType union and documented in the README execution types list as "the statement has no effect on the database server; currently used for DELIMITER".
Generated by Claude Code
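The reviewer's concern can be illustrated with a small read-only filter. This is a sketch under stated assumptions: `ExecTypeLike` is a hypothetical subset of the execution types (only the three needed here), and `planReadOnlyRun` is not part of the library. The point is that a consumer blocking `MODIFICATION` statements no longer blocks `DELIMITER`, and can treat `NO_OP` as something that never goes to the server.

```typescript
// Hypothetical subset of execution types, for illustration only.
type ExecTypeLike = 'LISTING' | 'MODIFICATION' | 'NO_OP';

interface StatementLike {
  type: string;
  executionType: ExecTypeLike;
  text: string;
}

interface Plan {
  toServer: StatementLike[];   // safe to run in a read-only session
  blocked: StatementLike[];    // write operations the consumer refuses
  clientOnly: StatementLike[]; // directives the server never sees
}

function planReadOnlyRun(statements: StatementLike[]): Plan {
  const plan: Plan = { toServer: [], blocked: [], clientOnly: [] };
  for (const statement of statements) {
    if (statement.executionType === 'NO_OP') {
      plan.clientOnly.push(statement);
    } else if (statement.executionType === 'MODIFICATION') {
      plan.blocked.push(statement);
    } else {
      plan.toServer.push(statement);
    }
  }
  return plan;
}

const samplePlan = planReadOnlyRun([
  { type: 'DELIMITER', executionType: 'NO_OP', text: 'DELIMITER $$' },
  { type: 'SELECT', executionType: 'LISTING', text: 'SELECT 1$$' },
  { type: 'UPDATE', executionType: 'MODIFICATION', text: 'UPDATE t SET a = 1$$' },
]);
```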
| * ALTER_INDEX | ||
| * ALTER_PROCEDURE | ||
| * ANON_BLOCK (BigQuery and Oracle dialects only) | ||
| * DELIMITER (MySQL dialect only — sets the statement terminator used by the |
Add an example to the readme of identifying a set of queries after a DELIMITER change and how the user should use the information provided by the library to interpret each query.
Added a "Working with MySQL DELIMITER" section to the README. It shows:
- A DELIMITER-using input with a `CREATE PROCEDURE` and a reset to `;`.
- The resulting `identify()` output with `newDelimiter`/`endStatement` populated on the relevant statements.
- A short code snippet showing how a consumer should interpret the results: skip `type === 'DELIMITER'` (client-side only, `NO_OP`) and strip `endStatement` from each remaining statement's `text` before sending it to the server.
Also added an "Each returned statement has…" bullet list in the API section documenting every field, including the new endStatement and newDelimiter.
Generated by Claude Code
…DME example
- Expose `endStatement` on every IdentifyResult (the terminator string that ended the statement), so consumers can reliably strip or interpret it.
- Expose `newDelimiter` on DELIMITER statements (the new terminator for the following statements).
- Add a new `NO_OP` executionType and use it for DELIMITER instead of MODIFICATION, since DELIMITER is a client-side directive that does not modify the database and should not be filtered with write operations.
- Document DELIMITER handling in the README with a worked example showing how a consumer should interpret the output to execute statements against a MySQL server.
- Also: set `endStatement` on the CTE-termination UNKNOWN statement path for consistency with the rest of the parser.
Addresses PR feedback from @not-night-but:
- Rename Token type `'semicolon'` to `'delimiter'`. Calling a custom terminator like `$$` or `//` a "semicolon" was confusing; the new name matches what the token actually represents.
- Validate DELIMITER arguments. Referencing mysql-shell's `Sql_splitter::set_delimiter` (mysqlshdk/libs/utils/utils_mysql_parsing.cc), which only rejects empty and backslash, we are a little stricter because several other characters silently break our tokenizer:
  * empty argument
  * backslash (`\`)
  * string/identifier quotes (`'`, `"`, `` ` ``)
  * inline comment markers (`--`, `#`)
  * block-comment characters (`/`, `*`)

  In strict mode the parser throws; in non-strict mode the DELIMITER statement is returned without `newDelimiter` and the previous delimiter is kept in effect, matching mysql-shell's behaviour.
- Handle DELIMITER lines via raw input scanning instead of token consumption. A malformed argument such as `DELIMITER '` used to tokenise as an unterminated string that ate the rest of the input, hiding all subsequent statements. Raw scanning also drops the previous quote-stripping convenience, matching mysql-shell which treats the argument as a whitespace-delimited raw word.
- Add comprehensive tests for every rejection case plus a regression test confirming that a malformed DELIMITER does not swallow the rest of the script in non-strict mode.
- Document the validation rules in the README.
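A rough sketch of how the strict and non-strict behaviour described in this commit message could fit together. The names `rejectionReason` and `applyDelimiter` are hypothetical, and the checks are assumed to be "contains" checks, which the commit message does not confirm. The block-comment rule is left out on purpose, because the message does not spell out whether it applies to the single characters or to the `/*` and `*/` sequences.

```typescript
// Returns a reason string when the argument is rejected, or null when accepted.
// Assumption: each rule is a substring check on the raw argument.
function rejectionReason(arg: string): string | null {
  if (arg.length === 0) return 'empty argument';
  if (arg.indexOf('\\') !== -1) return 'backslash';
  if (/['"`]/.test(arg)) return 'string/identifier quote';
  if (arg.indexOf('--') !== -1 || arg.indexOf('#') !== -1) {
    return 'inline comment marker';
  }
  return null; // block-comment rule intentionally omitted, see note above
}

interface DelimiterOutcome {
  delimiter: string;     // terminator in effect after the directive
  newDelimiter?: string; // present only when the argument was accepted
}

function applyDelimiter(
  current: string,
  arg: string,
  strict: boolean,
): DelimiterOutcome {
  const reason = rejectionReason(arg);
  if (reason === null) {
    return { delimiter: arg, newDelimiter: arg };
  }
  if (strict) {
    throw new Error(`Invalid DELIMITER argument (${reason})`);
  }
  // Non-strict: keep the previous delimiter and report no newDelimiter.
  return { delimiter: current };
}

const accepted = applyDelimiter(';', '$$', true);
const ignored = applyDelimiter(';', "'", false);
```

Keeping the previous delimiter in non-strict mode is what lets the statements after a malformed `DELIMITER` line still be split normally.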
@not-night-but thanks — both addressed in 67783ff. Token type rename. Argument validation. Referenced mysql-shell's
Behaviour on rejection matches mysql-shell: strict mode throws; non-strict mode emits the Bonus regression fix. Tests added in
Also dropped the quote-stripping convenience ( Full suite: 680 passing, lint + tsc clean. Generated by Claude Code |
| }; | ||
| // Delimiter-typed tokens (including `;`) are emitted by the delimiter-match | ||
| // path in scanToken, so it can handle arbitrary terminators like '$$' or '//'. | ||
| const INDIVIDUALS: Record<string, Token['type']> = {}; |
Not sure worth keeping this around given it does nothing, and not sure what other characters might be added to this. Easy enough to re-introduce if we do come up with a test case for it.
Removed in f50a6c3. The empty INDIVIDUALS map, resolveIndividualTokenType, and scanIndividualCharacter are all gone — the delimiter-match path is the sole producer of 'delimiter' tokens now. Easy to re-introduce the individual-character layer if a new single-char token type ever comes up.
Generated by Claude Code
| * `DELIMITER` statements (which are terminated by end-of-line, not a | ||
| * delimiter). | ||
| */ | ||
| endStatement?: string; |
Why not call this delimiter?
Good call — renamed in f50a6c3. The field is now delimiter on both IdentifyResult and the internal Statement interface. Pairs nicely with newDelimiter on DELIMITER statements:
- `delimiter`: the terminator that ended this statement
- `newDelimiter`: the new terminator this `DELIMITER` directive sets for the statements that follow
All tests, docs, and the README consumer-example snippet updated accordingly.
Generated by Claude Code
…elimiter
Addresses @MasterOdin's feedback:
- Remove the now-empty `INDIVIDUALS` map in src/tokenizer.ts along with `scanIndividualCharacter` and `resolveIndividualTokenType`. Since the delimiter-match path is the only producer of `'delimiter'` tokens, the individual-character code path was dead. Easy to re-add if a new single-char token type comes up.
- Rename the public/internal field `endStatement` to `delimiter` on both `IdentifyResult` and the internal `Statement` interface. The shorter name matches the token type and the domain vocabulary. All tests and README examples updated accordingly.
I think this is ready now. I've spent a lot of time iterating on this with Claude.
@MasterOdin thoughts on the updated version? You ok for this to be merged?
Summary
This PR adds support for parsing and handling the MySQL `DELIMITER` directive, which allows users to change the statement terminator character(s) in MySQL clients. This enables proper parsing of multi-statement scripts that use custom delimiters (e.g., `$$` or `//`) commonly used in stored procedure definitions.
Key Changes
- `createDelimiterStatementParser()` to handle the `DELIMITER` keyword and extract the new delimiter value, with support for quoted delimiters and inline comments
- `parse()` function to track the current delimiter and pass it through the tokenization pipeline, updating it whenever a `DELIMITER` statement is encountered
- `scanDelimiter()` function to match arbitrary statement terminators instead of hardcoding `;`
- … `$$` delimiters
- `scanToken()` signature to accept a `delimiter` parameter
- `DELIMITER` as a new statement type with `MODIFICATION` execution type
- … `mysql` dialect; other dialects treat it as an `UNKNOWN` statement
- … `DELIMITER "$$"` → `$$`)
Notable Implementation Details
- `DELIMITER` statement's `end` position excludes the trailing newline, consistent with how the directive is typically used
- … `DELIMITER` statement even without a trailing newline
- … `DELIMITER` line
https://claude.ai/code/session_01U9CCUpZPbkyZmqj2jDBkXZ