Atomic DDL
From MariaDB 10.6.1, we have improved the reliability of DDL (Data Definition Language) operations by making most of them atomic and the rest crash-safe, so that the server stays consistent even if it crashes in the middle of an operation.
The design of Atomic/Crash-safe DDL (MDEV-17567) allows it to work with all storage engines.
Definitions
- Atomic means that either the operation succeeds (and is logged to the binary log) or is completely reversed.
- Crash-safe means that in case of a crash, after the server has restarted, all tables are consistent, there are no temporary files or tables on disk and the binary log matches the status of the server.
- DDL: Data definition language.
- DML: Data manipulation language.
- 'DDL recovery log', or 'DDL log' for short, is the new log file (ddl-recovery.log by default) that stores all DDL operations in progress. It is used to recover the state of the server after a sudden crash.
Background
Before 10.6, in case of a crash, there was a small possibility that one of the following things could happen:
- There could be temporary tables starting with #sql-alter or #sql-shadow, or temporary files ending with '' left.
- The table in the storage engine and the table's .frm file could be out of sync.
- During a multi-table rename, only some of the tables were renamed.
Which DDL Operations are Now Atomic
- CREATE TABLE, except when used with CREATE OR REPLACE, which is only crash-safe.
- RENAME TABLE and RENAME TABLES.
- CREATE VIEW
- CREATE SEQUENCE
- CREATE TRIGGER
- DROP TRIGGER
- DROP TABLE and DROP VIEW. Dropping multiple tables is only crash-safe.
- ALTER TABLE
Note: ALTER SEQUENCE is not listed above because it is internally implemented as DML.
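To illustrate what "atomic" means for a multi-table RENAME TABLE, here is a minimal Python model. It is not MariaDB code: the function name, the `fail_at` hook and the in-memory list standing in for the DDL log are all hypothetical. It only shows the all-or-nothing outcome described in the definitions above.

```python
class SimulatedCrash(Exception):
    """Stands in for a failure in the middle of a DDL statement."""


def rename_tables(tables, renames, fail_at=None):
    """Apply (old, new) renames to the set `tables`, all or nothing.

    `fail_at` is a hypothetical hook: the index of the rename at which
    a failure is injected.
    """
    done = []  # in-memory stand-in for the DDL log entries
    try:
        for i, (old, new) in enumerate(renames):
            if i == fail_at:
                raise SimulatedCrash(f"failure before renaming {old}")
            tables.remove(old)
            tables.add(new)
            done.append((old, new))
    except SimulatedCrash:
        # Revert the completed renames in reverse order.
        for old, new in reversed(done):
            tables.remove(new)
            tables.add(old)
        return False
    return True


tables = {"t1", "t2", "t3"}
ok = rename_tables(tables, [("t1", "a1"), ("t2", "a2"), ("t3", "a3")], fail_at=2)
print(ok, sorted(tables))  # False ['t1', 't2', 't3']
```

Before 10.6, the equivalent failure could leave t1 and t2 renamed and t3 not, which is the "only some of the tables were renamed" case from the Background section.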
Which DDL Operations are Now Crash Safe
DROP TABLE of Multiple Tables.
DROP TABLE over multiple tables is treated as if every DROP is a separate, atomic operation. This means that after a crash, all fully, or partly, dropped tables will be dropped and logged to the binary log. The undropped tables will be left untouched.
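The recovery rule for a multi-table DROP can be sketched as a small Python model. The function name, the three state labels and the text of the logged statements are assumptions made for illustration; the real binary log entries may look different.

```python
def recover_multi_drop(tables, drop_list, state):
    """Return (tables_after_recovery, binlog_entries).

    `tables` is the set of tables still present at crash time.
    `state` maps each table in `drop_list` to "dropped", "partial" or
    "untouched", describing how far its DROP had progressed.
    """
    remaining = set(tables)
    binlog = []
    for name in drop_list:
        if state[name] in ("dropped", "partial"):
            remaining.discard(name)              # finish the drop if needed
            binlog.append(f"DROP TABLE {name}")  # hypothetical log text
        # "untouched" tables are left as they are and not logged
    return remaining, binlog


tables_at_crash = {"t2", "t3", "other"}
state = {"t1": "dropped", "t2": "partial", "t3": "untouched"}
remaining, binlog = recover_multi_drop(tables_at_crash, ["t1", "t2", "t3"], state)
print(sorted(remaining), binlog)
# ['other', 't3'] ['DROP TABLE t1', 'DROP TABLE t2']
```

The statement as a whole is therefore not atomic (t3 survives), but every individual table ends up either fully dropped and logged, or untouched.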
CREATE OR REPLACE TABLE
CREATE OR REPLACE TABLE foo is implemented as:
DROP TABLE IF EXISTS foo; CREATE TABLE foo ...
This means that if there is a crash during CREATE TABLE, the original table 'foo' will be dropped even if the new table was not created. If the table was not re-created, the binary log will contain the DROP TABLE.
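The following Python sketch models that two-step behaviour. The function, the `crash_during_create` flag and the exact statement text written to the list standing in for the binary log are hypothetical. It also assumes that a DROP is logged only when a table was actually dropped, which the text above does not specify.

```python
def create_or_replace(tables, binlog, name, crash_during_create=False):
    """Model CREATE OR REPLACE TABLE as DROP TABLE IF EXISTS + CREATE TABLE.

    `tables` is a set of table names, `binlog` a list of logged statements.
    Returns True if the new table was created.
    """
    existed = name in tables
    tables.discard(name)                     # DROP TABLE IF EXISTS name
    if crash_during_create:
        # The new table never appears; only the drop is recorded
        # (assumption: only when a table was actually dropped).
        if existed:
            binlog.append(f"DROP TABLE IF EXISTS {name}")
        return False
    tables.add(name)                         # CREATE TABLE name ...
    binlog.append(f"CREATE OR REPLACE TABLE {name}")  # assumed log text
    return True


tables, binlog = {"foo", "bar"}, []
created = create_or_replace(tables, binlog, "foo", crash_during_create=True)
print(created, sorted(tables), binlog)
# False ['bar'] ['DROP TABLE IF EXISTS foo']
```

The state is consistent (the table is gone and the log says so), but the statement was not reversed, which is why this operation is classed as crash-safe instead of atomic.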
DROP DATABASE
DROP DATABASE is implemented as:
loop over all tables
    DROP TABLE table
Each DROP TABLE is atomic, but in case of a crash, things will work the same way as DROP TABLE with multiple tables.
Atomic with Different Storage Engines
Atomic/Crash-safe DDL works with all storage engines that either have atomic DDLs internally or are able to re-execute DROP or RENAME in case of failure.
This should be true for most storage engines. The ones that still need some work are:
- The S3 storage engine.
- The partitioning engine. Partitioning should be atomic for most cases, but there are still some known issues that need to be tested and fixed.
The DDL Log Recovery File
The new startup option --log-ddl-recovery=path (ddl-recovery.log by default) can be used to specify the location of the DDL log file. This is mainly useful when a filesystem on persistent memory is available, as this file is synced very frequently during DDL operations.
This file contains all DDL operations that are in progress.
At MariaDB server startup, the DDL log file is copied to a file with the same base name but with a -backup.log suffix. This is mainly done to be able to find out what went wrong if recovery fails.
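As a rough illustration of the naming rule, the Python helper below derives the backup file name from the DDL log path. It assumes that "base name" means the file name without its .log extension; the helper name and the example directory are hypothetical.

```python
from pathlib import PurePosixPath


def ddl_log_backup_path(ddl_log):
    """Return the assumed backup path for a DDL log file."""
    p = PurePosixPath(ddl_log)
    # Keep the directory, drop the extension, append the -backup.log suffix.
    return p.with_name(p.stem + "-backup.log")


print(ddl_log_backup_path("/var/lib/mysql/ddl-recovery.log"))
# /var/lib/mysql/ddl-recovery-backup.log
```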
If the server crashes during recovery (unlikely but possible), recovery will resume where it left off after the next restart. The recovery will retry each entry up to 3 times before giving up and proceeding with the next entry.
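The retry rule can be modelled in a few lines of Python. This is a sketch under assumptions, not the server's implementation: the `attempts` dictionary and `done` set stand in for whatever state the DDL log keeps across restarts, and a crash is simulated with an exception.

```python
MAX_ATTEMPTS = 3  # per the text: each entry is retried up to 3 times


class SimulatedCrash(Exception):
    """Stands in for the server crashing while recovering an entry."""


def run_recovery(entries, attempts, done, execute):
    """One recovery pass, as it would run at a server start."""
    for entry in entries:
        if entry in done:
            continue                      # recovered in an earlier pass
        if attempts.get(entry, 0) >= MAX_ATTEMPTS:
            continue                      # gave up on this entry
        attempts[entry] = attempts.get(entry, 0) + 1  # count before trying
        execute(entry)                    # may raise SimulatedCrash
        done.add(entry)


def recover_with_restarts(entries, execute):
    """Re-run recovery after every simulated crash until a pass completes."""
    attempts, done = {}, set()
    while True:
        try:
            run_recovery(entries, attempts, done, execute)
            return done, attempts
        except SimulatedCrash:
            continue  # "restart" the server and resume recovery


def execute(entry):
    # Hypothetical entry that crashes the server every time it is tried.
    if entry == "drop t2":
        raise SimulatedCrash(entry)


done, attempts = recover_with_restarts(["rename t1", "drop t2", "create t3"], execute)
print(sorted(done), attempts)
# ['create t3', 'rename t1'] {'rename t1': 1, 'drop t2': 3, 'create t3': 1}
```

Counting the attempt before executing the entry is what guarantees progress: an entry that always crashes the server is tried three times and then skipped, so the remaining entries are still recovered.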
Conclusions
- We believe that a clean separation of layers leads to an easier-to-maintain solution. The Atomic DDL implementation in MariaDB 10.6 introduced minimal changes to the storage engine API, mainly for native ALTER TABLE.
- In our InnoDB implementation, no file format changes were needed on top of the RENAME undo log that was introduced in MariaDB 10.2.19 for a backup-safe TRUNCATE re-implementation. Correct use of sound design principles (write-ahead logging and transactions; also file creation now follows the ARIES protocol) is sufficient. We removed the hacks (at most one CREATE or DROP per transaction) and correctly implemented rollback and purge triggers for the InnoDB SYS_INDEXES table.
- Numerous DDL recovery bugs in InnoDB were found and fixed quickly thanks to https://rr-project.org. We are still working on one: data files must not be deleted before the DDL transaction is committed.
Thanks to Atomic/Crash-safe DDL, the MariaDB server is now much more stable and reliable in unstable environments. There is still ongoing work to fix the few remaining issues mentioned above to make all DDL operations Atomic. The target for these is MariaDB 10.7.
See Also
- MDEV-17567 Atomic DDL. This MDEV entry links to all other entries related to atomic operations, which contain a lot of information about how things are implemented.
© 2021 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.
https://mariadb.com/kb/en/atomic-ddl/