InnoDB VS MyISAM - comparison of two MySQL Database Storage Engines

in #programming7 years ago (edited)

inno-db-my-isam-vs.png
This articles aims to compare two MySQL Database engines by providing a thorough description of their key advantages and disadvantages. We begin by providing a high level overview of key concepts. If you have noticed any inconsistencies, please feel free to leave a comment.

High level overview

What is a Database

A database is an organized collection of data. It is the collection of schemas, tables, queries, reports, views, and other objects. The data are typically organized to model aspects of reality in a way that supports processes requiring information, such as modelling the availability of rooms in hotels in a way that supports finding a hotel with vacancies.

What is a database management system

DBMS is a computer software application that interacts with the user, other applications, and the database itself to capture and analyze data. A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administration of databases

What is a database storage engine

A database engine (or storage engine) is the underlying software component that a database management system (DBMS) uses to create, read, update and delete (CRUD) data from a database. Most database management systems include their own application programming interface (API) that allows the user to interact with their underlying engine without going through the user interface of the DBMS.

MySQL DBMS and its Storage Engines

MySQL supports two storage engines InnoDB and MyISAM. Both engines can be used within the same schema, you do not have to choose one engine over another for the entire schema. However it is not recommended to do so without a very specific reason, as different engines will accept different operations.

InnoDB vs MyISAM

Transactions & Atomicity

Data in a table is managed using Data Manipulation Language (DML) statements, such as SELECT, INSERT, UPDATE and DELETE. A transaction group two or more DML statements together into a single logical operation such that either the entire unit is applied, or none of it is. For instance, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.

MyISAM do not support transactions whereas InnoDB does.

If an operation is interrupted while using a MyISAM table, the operation is aborted immediately, and the rows (or even data within each row) that are affected remains affected, even if the operation did not go to completion.

If an operation is interrupted while using an InnoDB table, because it using transactions, which has atomicity, any transaction which did not go to completion will not take effect, since no commit is made.

Referential Integrity

The concept of referential integrity ensures that relationships between tables remains consistent. More specifically, this means when a table (e.g. Orders) has a foreign key (e.g. Product ID) pointing to a different table (e.g. Products), when updates or deletes occur to the pointed-to table, these changes are cascaded to the linking table.

InnoDB is a relational DBMS (RDBMS) and thus has referential integrity, while MyISAM does not.

Table-locking vs Row-locking

When a query runs against a MyISAMtable, the entire table in which it is querying will be locked. This means subsequent queries will only be executed after the current one is finished. If you are reading a large table, and/or there are frequent read and write operations, this can mean a huge backlog of queries.

When a query runs against an InnoDB table, only the row(s) which are involved are locked, the rest of the table remains available for CRUD operations. This means queries can run simultaneously on the same table, provided they do not use the same row.

This feature in InnoDB is known as concurrency. As great as concurrency is, there is a major drawback that applies to a select range of tables, in that there is an overhead in switching between kernel threads, and you should set a limit on the kernel threads to prevent the server coming to a halt.

Transactions & Rollbacks

When you run an operation in MyISAM, the changes are set; in InnoDB, those changes can be rolled back. The most common commands used to control transactions are COMMIT, ROLLBACK and SAVEPOINT.

  1. COMMIT - you can write multiple DML operations, but the changes will only be saved when a COMMIT is made
  2. ROLLBACK - you can discard any operations that have not yet been committed yet
  3. SAVEPOINT - sets a point in the list of operations to which a ROLLBACK operation can rollback to

Reliability

MyISAM offers no data integrity - Hardware failures, unclean shutdowns and canceled operations can cause the data to become corrupt. This would require full repair or rebuilds of the indexes and tables.

InnoDB, on the other hand, uses a transactional log, a double-write buffer and automatic checksumming and validation to prevent corruption. Before InnoDB makes any changes, it records the data before the transactions into a system tablespace file called ibdata1. If there is a crash, InnoDB would autorecover through the reply of those logs.

Indexes

InnoDB is built on clustered indexes and uses MVCC to achieve high concurrency. This provides very fast primary key lookups. MyISAM doesn't support transactions, FK contraints, or row-level locks. MyISAM uses shared and exclusive locks on the entire table. However, concurrent reads & inserts for new rows are allowed.
FULLTEXT Indexing

InnoDB does not support FULLTEXT indexing until MySQL version 5.6.4. As of the writing of this post, many shared hosting providers’ MySQL version is still below 5.6.4, which means FULLTEXT indexing is not supported for InnoDB tables.

However, this is not a valid reason to use MyISAM. It’s best to change to a hosting provider that supports up-to-date versions of MySQL. Not that a MyISAM table that uses FULLTEXT indexing cannot be converted to an InnoDB table.

Memory consumption

InnoDB requires a lot of memory (buffer pool). The data and indexes are cached in memory. Changes are written to the log buffer (physical memory) and are flushed every second to the log files (method depends on innodb_flush_log_at_trx_commit value). Having the data in memory is a huge performance boost. MyISAM only caches indexes (key_buffer_size) so that's where you would allocate most of your memory if you're only using MyISAM

Summary

InnoDB is newer, more complex and feature rich storage engine. InnoDB should be the default choice when it comes down to MySQL storage engines as it has many advantages over MyISAM. Choose MyISAMonly if there is a very specific need for it.

Sources