MDEV-37365: Crash on concurrent ALTER TABLE parent + INSERT on FK child #5085
arcivanov wants to merge 2 commits into
Code Review
This pull request addresses MDEV-37365 by implementing a prelocking mechanism that acquires MDL_SHARED_READ on foreign key parent tables during DML operations on child tables, preventing crashes during concurrent DDL. Key changes include the addition of prepare_fk_referenced_prelocking_list in sql_base.cc and the references_foreign_key method in the handler interface. Review feedback points out a high-severity issue where a raw return is used instead of DBUG_RETURN, potentially corrupting the debug stack, and a compilation error caused by passing LEX_CSTRING objects by value instead of by address to table_already_fk_prelocked.
When `ALTER TABLE` runs on a parent table with FK children and concurrent `INSERT` runs on a child table, the server crashes in `innobase_reload_table()` → `dict_sys.remove()` with assertion `table->n_rec_locks == 0`.

The root cause is that `INSERT INTO child` performs its FK constraint check inside InnoDB, acquiring InnoDB-internal locks (LOCK_IS + record locks) on the parent table without any corresponding MDL on the parent. When ALTER's commit phase tears down and recreates the parent's `dict_table_t`, it hits those still-held locks.

The fix closes the gap by extending the DML prelocking strategy: when a child table with foreign keys is opened for DML, the SQL layer now also prelocks the FK parent table(s) with `TL_READ` (→ `MDL_SHARED_READ`). This properly declares the FK dependency at the MDL layer, so DDL on the parent (which holds `MDL_EXCLUSIVE`) will wait for child DML transactions to complete before proceeding.

Implementation:

- New function `prepare_fk_referenced_prelocking_list()` in `sql/sql_base.cc`, symmetric to the existing `prepare_fk_prelocking_list()` (which handles the parent→children direction for cascading FK actions). Uses `get_foreign_key_list()` to find referenced parent tables and prelocks them with `TL_READ` + `PRELOCK_FK` (→ `OPEN_STUB`, so only MDL is acquired, no table open).
- New `handler::references_foreign_key()` virtual (+ InnoDB override) as a lightweight early-exit check, symmetric to the existing `referenced_by_foreign_key()`. Uses `dict_sys.freeze()` (shared latch) to check `foreign_set.empty()`, avoiding the heavier `get_foreign_key_list()` (exclusive latch) for tables without FKs.
- Called from `DML_prelocking_strategy::handle_table()` in both the `trg_event_map` and `slave_fk_event_map` branches.

Behavioral change: DDL on a parent table (`ALTER`, `DROP`, `TRUNCATE`, `RENAME`) now blocks at the MDL layer while any child table has an open transaction that touched FK columns (even if the DML statement failed).

Previously, DDL could proceed and return FK-specific errors (`ER_TRUNCATE_ILLEGAL_FK`, `ER_ROW_IS_REFERENCED_2`), but InnoDB-internal locks were still held by the child, leading to crashes on concurrent `ALTER TABLE`. With this fix, DDL gets `ER_LOCK_WAIT_TIMEOUT` instead, controlled by `lock_wait_timeout` (not `innodb_lock_wait_timeout`, since the conflict is at the MDL layer). Once the child transaction ends, DDL returns the same FK-specific errors as before.

The regression is narrow: the DDL was never going to succeed anyway (FK constraints prevent it regardless of MDL), so only the error code changes, not the outcome. The old behavior was a crash waiting to happen. `innodb.foreign_key` test (MDEV-26554 section) updated accordingly.

Galera/WSREP: no `wsrep_foreign_key_append()` needed in the new function: the child→parent FK check is read-only and doesn't require writeset certification keys for the parent table.
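The locking gap described above can be pictured with a small toy model. This is only a sketch: `ParentTable`, `child_insert()`, `child_transaction_end()` and `alter_parent()` are hypothetical names invented for illustration and do not correspond to MariaDB code. The model just counts the two kinds of locks (MDL tickets at the SQL layer, InnoDB-internal locks) and reports what DDL on the parent would run into.

```cpp
// Toy model only: every type and function here is hypothetical, not MariaDB code.
struct ParentTable {
  int mdl_shared_read = 0;  // MDL_SHARED_READ tickets held at the SQL layer
  int n_rec_locks = 0;      // InnoDB-internal locks taken by FK checks
};

enum class DdlOutcome { Proceeds, WaitsOnMdl, HitsAssertion };

// INSERT INTO child: the FK check always takes InnoDB locks on the parent.
// With the fix, the SQL layer also prelocks the parent with MDL_SHARED_READ.
inline void child_insert(ParentTable &parent, bool prelock_parent) {
  if (prelock_parent) ++parent.mdl_shared_read;
  ++parent.n_rec_locks;
}

// Ending the child transaction releases both kinds of lock.
inline void child_transaction_end(ParentTable &parent) {
  parent.mdl_shared_read = 0;
  parent.n_rec_locks = 0;
}

// DDL on the parent needs an exclusive MDL. If nothing blocks it at the MDL
// layer but InnoDB locks are still held, it reaches the n_rec_locks == 0 check.
inline DdlOutcome alter_parent(const ParentTable &parent) {
  if (parent.mdl_shared_read > 0) return DdlOutcome::WaitsOnMdl;
  if (parent.n_rec_locks != 0) return DdlOutcome::HitsAssertion;
  return DdlOutcome::Proceeds;
}
```

In this model, `child_insert(p, false)` followed by `alter_parent(p)` yields `HitsAssertion` (the old behavior), while `child_insert(p, true)` yields `WaitsOnMdl` until `child_transaction_end(p)` is called.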
@gemini-code-assist rereview
Code Review
This pull request addresses MDEV-37365 by implementing metadata lock (MDL) prelocking for foreign key parent tables during DML operations on child tables, preventing crashes during concurrent DDL. It also introduces early foreign key checks for DROP and TRUNCATE operations using upgradable shared locks. Review feedback identifies several critical issues in the prelocking implementation, including incorrect iterator initialization, type mismatches between LEX_CSTRING and character pointers, and the need for proper object construction. Improvements were also suggested for error reporting consistency and optimizing redundant schema lock acquisitions.
DML-side FK prelocking (previous commit) made child DML hold `MDL_SHARED_READ` on FK parent tables. This blocks DDL's `MDL_EXCLUSIVE`, causing TRUNCATE and DROP to return `ER_LOCK_WAIT_TIMEOUT` instead of FK-specific errors (`ER_TRUNCATE_ILLEGAL_FK`, `ER_ROW_IS_REFERENCED_2`).

Perform the FK constraint check early, before acquiring `MDL_EXCLUSIVE`:

1. Acquire schema `MDL_INTENTION_EXCLUSIVE` (matching `lock_table_names()` ordering)
2. Acquire table `MDL_SHARED_UPGRADABLE` (compatible with child DML's SR; blocks `MDL_SHARED_NO_WRITE` needed by FK creation, preventing TOCTOU)
3. Open handler via `tdc_acquire_share` + `open_table_from_share`
4. Run FK check (`fk_truncate_illegal_if_parent` / `fk_drop_illegal_if_parent`)
5. On FK error: rollback MDL savepoint, return FK-specific error
6. On success: `upgrade_shared_lock(SU -> X)`
7. `lock_table_names()` finds existing IX + X tickets via `find_ticket()`, only acquires `BACKUP_DDL`

For DROP, `fk_drop_illegal_if_parent()` additionally skips FKs whose child table is in the DROP list (e.g. `DROP TABLE child, parent`).

The early check is skipped when `foreign_key_checks=0` (all DDL falls through to `lock_table_names` which blocks on child DML's SR as before) and when in `locked_tables_mode` (TRUNCATE only).
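The ordering of the steps above can be sketched with a reduced compatibility check over four lock types. This is a simplified model, not the server's code: `Mdl`, `compatible()`, `Outcome` and `truncate_parent()` are hypothetical names. The commit message states three of the pairings (SU is compatible with SR, SU blocks `MDL_SHARED_NO_WRITE`, SR blocks X); the remaining pairings in `compatible()` are an assumption about the MDL compatibility rules and should be checked against `sql/mdl.cc`.

```cpp
#include <vector>

// Hypothetical, reduced set of MDL types: SR = MDL_SHARED_READ,
// SU = MDL_SHARED_UPGRADABLE, SNW = MDL_SHARED_NO_WRITE, X = MDL_EXCLUSIVE.
enum class Mdl { SR, SU, SNW, X };

// Can `requested` be granted while another connection holds `granted`?
// Stated in the commit message: SU/SR compatible, SU blocks SNW, SR blocks X.
// All other pairings are assumptions made for this sketch.
constexpr bool compatible(Mdl requested, Mdl granted) {
  if (requested == Mdl::X || granted == Mdl::X) return false;
  if (requested == Mdl::SR || granted == Mdl::SR) return true;
  return false;  // SU and SNW conflict with themselves and with each other
}

enum class Outcome { SuBlocked, FkError, UpgradeWaits, Upgraded };

// Models steps 2 to 6: take SU, run the FK check, then upgrade SU -> X.
inline Outcome truncate_parent(const std::vector<Mdl> &granted_to_others,
                               bool parent_has_fk_children) {
  for (Mdl g : granted_to_others)
    if (!compatible(Mdl::SU, g)) return Outcome::SuBlocked;
  // The FK check runs while only SU is held, so child DML's SR does not block it.
  if (parent_has_fk_children) return Outcome::FkError;
  for (Mdl g : granted_to_others)
    if (!compatible(Mdl::X, g)) return Outcome::UpgradeWaits;
  return Outcome::Upgraded;
}
```

With a child transaction holding SR, `truncate_parent({Mdl::SR}, true)` returns `FkError` immediately rather than waiting, which is the point of checking before the upgrade; the wait only happens when the FK check passes.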
gkodinov left a comment
Thank you for your contribution! This is a preliminary review.
The diff answers the formal criteria. I'm approving it for that. Please stay tuned for the final review.
    if (prepare_fk_prelocking_list(thd, prelocking_ctx, table_list,
                                   need_prelocking,
                                   table_list->trg_event_map))
      return TRUE;

    if (prepare_fk_referenced_prelocking_list(thd, prelocking_ctx, table_list,
                                              need_prelocking))
      return TRUE;
As far as I understand, both @vuvova and @svoj have been against acquiring more locks during DML operations. I can imagine that this could introduce a significant performance regression.
I believe that the DDL/DML races can be fixed by extending the locking during DDL statements. Did you try implementing the following? Any DDL operation that drops or renames a table, or drops or adds foreign key constraints, needs to exclusively lock all child and parent table names, in addition to locking the current table name.
> I can imagine that this could introduce a significant performance regression.
The whole point of this approach is that `MDL_SHARED_READ` is basically never contended, except by `MDL_EXCLUSIVE`.
Every SELECT anywhere takes an MDL_SHARED_READ on every table involved.
Every INSERT used to take MDL_SHARED_WRITE on the table being inserted into; after this patch it will also take an MDL_SHARED_READ on every table referenced by an FK (which will be uncontended by virtually everybody else).
I haven't looked at MariaDB's internal lock implementation, but acquiring a shared, uncontended lock should be virtually performance-neutral (since the lock is not distributed), especially in comparison to the other parts of the queries.
Exclusively locking all child tables and the parent table for DDL, on the other hand, stops everything that is happening on those child tables. Acquiring the locks will itself take a very long time on a heavily loaded database when the table has a large number of FKs. Depending on the locking fairness (I don't know the locking implementation for MDL), the MDL_EXCLUSIVE request for each table may end up queued behind literally thousands of MDL_SHARED_READ/WRITE locks. Those locks could be held for SECONDS per query (large SELECT queries), which means that if you're locking, for example, 1 parent and 5 children, then by the time you are locking the 5th child the first child to be locked has been sitting in MDL_EXCLUSIVE for seconds to tens of seconds or worse (it very much depends on how many queries per second there are), with ALL operations halted on those tables, including SELECTs (defeating MVCC!).
Unless I'm extremely confused about the nature of MDL_EXCLUSIVE, the idea of locking the parent and children exclusively sounds to me like the worst possible approach, behind only maybe acquiring an exclusive lock on the whole database (if that were a thing).
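To put rough numbers on that argument, here is a back-of-the-envelope sketch. It assumes the exclusive locks are acquired strictly one after another and that each acquisition waits a given number of seconds behind running queries. The function name `exclusive_hold_before_ddl()` and the wait times in the usage note are made up for illustration; they are not measurements.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: wait_seconds[i] is how long the i-th exclusive lock
// waits behind running queries before it is granted. The result[i] is how long
// table i then sits exclusively locked while the remaining locks are acquired,
// i.e. before the DDL itself can even start.
inline std::vector<double> exclusive_hold_before_ddl(
    const std::vector<double> &wait_seconds) {
  std::vector<double> hold(wait_seconds.size(), 0.0);
  double later_waits = 0.0;  // sum of waits for tables locked after table i
  for (std::size_t i = wait_seconds.size(); i-- > 0;) {
    hold[i] = later_waits;
    later_waits += wait_seconds[i];
  }
  return hold;
}
```

For one parent plus five children with assumed waits of 2, 3, 1, 4, 2 and 3 seconds, the first table locked stays exclusively locked for 3 + 1 + 4 + 2 + 3 = 13 seconds before the DDL begins, and the last one for 0 seconds.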
Summary
- DML on an FK child table acquires `MDL_SHARED_READ` on parent table(s) via prelocking, preventing crashes when concurrent DDL on the parent tears down `dict_table_t` while InnoDB-internal FK locks are held
- `prepare_fk_referenced_prelocking_list()` in `sql/sql_base.cc`, symmetric to existing `prepare_fk_prelocking_list()` (parent→children direction)
- `handler::references_foreign_key()` virtual + InnoDB override for lightweight early-exit check
- The `slave_fk_event_map` path also gets FK parent prelocking

Behavioral change
DDL on parent now gets `ER_LOCK_WAIT_TIMEOUT` instead of FK-specific errors while a child has an open transaction. The DDL was never going to succeed anyway (FK constraints prevent it). Once the child transaction ends, behavior is identical to before.