MemSQL Architecture: Technology Innovations Power Convergence of Transactions and Analytics
MemSQL is the real-time database for transactions and analytics with an in-memory, distributed, relational architecture. MemSQL enables high volume, high velocity Big Data processing so organizations can extract more value, more quickly from their data. MemSQL opens the doors for real-time operational analytics by enabling concurrent transactional and analytical workloads in a single database on commodity hardware, deployed in your data center or in the cloud.
The Need for In-Memory Distributed Databases
Data volumes are growing too large for traditional databases that reside on a single server. Modern applications require better performance than legacy disk-based technologies allow. IT organizations use multiple solutions to address these problems. Some companies manually shard a single box database, but the resulting cluster is difficult to manage and the IT department sees decreasing marginal returns on performance as the cluster grows. Other companies choose to purchase a high performance appliance, but this solution is expensive and inflexible, requiring a "forklift upgrade" when the system needs more storage or compute power. MemSQL is purpose-built for high performance, horizontal scalability, and ease of maintenance on large, high-velocity data. This paper provides an overview of the architecture, deployment considerations, core technology, and administration of MemSQL.
Primer on Distributed Relational Databases
MemSQL automatically distributes data through sharding. Sharding is a method of horizontal, or row-wise, data partitioning that splits tables into smaller chunks, known as shards or partitions, across multiple physical servers. In addition to dividing data into smaller chunks, sharding requires additional metadata to inform the distributed query optimizer where particular data resides. In MemSQL, this is accomplished with a shard key.
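The role of a shard key can be illustrated with a minimal sketch. The hash function (CRC32), the partition count, and the names `partition_for` and `distribute` are all assumptions chosen for illustration; they are not MemSQL's internal implementation.

```python
import zlib

NUM_PARTITIONS = 8  # hypothetical partition count for this sketch


def partition_for(shard_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a shard key value to a partition index with a deterministic hash."""
    # CRC32 is used only because it is stable across processes and machines.
    # It is a stand-in, not the hash function MemSQL actually uses.
    return zlib.crc32(shard_key.encode("utf-8")) % num_partitions


def distribute(rows, key_column, num_partitions=NUM_PARTITIONS):
    """Split rows (dicts) into per-partition lists based on the shard key column."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        index = partition_for(str(row[key_column]), num_partitions)
        partitions[index].append(row)
    return partitions
```

Because the mapping depends only on the key value, any node that knows the shard key and the partition count can compute where a row lives without consulting other nodes.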
Traditionally, sharding is done manually, which requires constant management by a database administrator (DBA) and necessitates application-level query routing and aggregation. Furthermore, manual sharding destroys the separation of responsibility between a DBA and application developers, because every time a change is made to the number of machines or their configuration, the application itself needs to be adjusted. MemSQL is designed to eliminate this complexity, resulting in a distributed database that automatically performs sharding and can be used as though it resides on a single server.
The practice of sharding is related to shared nothing architectures, a design paradigm in distributed computing. A shared nothing architecture mandates no two nodes share resources, including CPU, memory, or disk. Shared nothing contrasts with systems sharing a central data repository or peer-to-peer systems where nodes share compute and storage resources. MemSQL uses a shared nothing architecture.
Another important consideration for high performance databases is concurrency control. Multi-threading enables applications to run orders of magnitude faster than single-threaded applications through parallel data processing. However, multi-threading requires a mechanism to maintain data integrity. For instance, suppose separate users submit concurrent SELECT and UPDATE queries. If the SELECT query tries to read data that the UPDATE query is currently modifying, the SELECT query may return only a partial result, or a result that is half updated and half the original data. To avoid these situations, databases must have a mechanism to ensure thread safety, a guarantee that shared data structures are only manipulated in a consistent, predictable manner and by only one thread at a time.
The most common conventional way of dealing with contentious queries is through locking. For the SELECT and UPDATE example above, the UPDATE query locks the table in order to manipulate the table. Locks can be secured on other levels as well, such as locking a row or an entire database, but this example assumes the database locks a table. The lock prevents other queries, such as the SELECT, from accessing data stored in that table. While this prevents inconsistent and unpredictable query execution, it degrades performance because concurrent queries must be executed in series, each waiting for the preceding query to complete. This can have a severe performance impact when the database is loading streaming data or trying to perform a mixed read and write workload.
Real-time applications such as those processing large volumes of high-velocity data cannot tolerate the performance degradation of locking. The best transactional databases implement multiversion concurrency control (MVCC), a design to prevent queries from blocking one another by reducing the amount of required database locking. However, most databases, even those with MVCC, require some degree of locking, which can degrade performance, especially during mixed read and write workloads.
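The idea behind MVCC can be sketched with a toy versioned store: writers append new versions instead of overwriting, and each reader sees the data as of the moment it started. This is a single-threaded illustration under simplifying assumptions (one global counter as the commit timestamp, no garbage collection of old versions); the class and method names are hypothetical and do not describe MemSQL's implementation.

```python
import itertools


class VersionedStore:
    """Toy multiversion store: writers append versions, readers see a snapshot."""

    def __init__(self):
        self._versions = {}               # key -> list of (commit_ts, value)
        self._clock = itertools.count(1)  # monotonically increasing timestamps
        self._last_commit = 0

    def write(self, key, value):
        """Commit a new version of key; older versions stay readable."""
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts

    def snapshot(self):
        """Return the timestamp a new reader should read as of."""
        return self._last_commit

    def read(self, key, as_of):
        """Return the newest value committed at or before as_of, else None."""
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= as_of:
                return value
        return None
```

A reader that took its snapshot before an UPDATE committed keeps seeing the old value, so it never observes a half-updated result and never has to wait for the writer.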
MemSQL takes a new approach, using state-of-the-art technology to eliminate the need for locks such that writes never block reads and vice versa. In addition to MVCC, MemSQL uses various lock-free data structures that always allow one thread to make progress, ensuring system-wide throughput. This allows MemSQL to handle concurrency more efficiently than databases with even the most sophisticated locking mechanisms. In particular, this is where MemSQL outperforms many NoSQL solutions that are designed for fast execution of single or queued queries, but lack methods for gracefully managing concurrency.
MemSQL In-Memory Distributed Architecture
MemSQL delivers blazing performance on commodity hardware by using a distributed computing model that makes use of all available system resources. The MemSQL architecture is simple by design, making it easy to set up, maintain, and scale, reducing both upfront and long-term maintenance costs. In addition to performance and ease of use, MemSQL is designed for fault tolerance and horizontal scalability on commodity hardware in your data center or the cloud.
Components of a MemSQL Cluster
MemSQL uses a two-tiered architecture consisting of aggregator nodes and leaf nodes. Aggregator nodes are cluster-aware query routers that act as a gateway into the distributed system. They store only metadata and reference data. Aggregators intelligently distribute queries across the leaf nodes and aggregate results that are sent to the client. Increasing the number of aggregators will improve operations like data loading and will allow MemSQL to process more client requests concurrently.
Application workloads can be divided amongst groups of aggregators, where one group of aggregators serves Application A, another group serves Application B, and so forth.
Figure 1: MemSQL Distributed Architecture
Leaf nodes function as storage and compute nodes. Data is automatically distributed across leaf nodes into partitions to enable parallelized query execution. Increasing the number of leaf nodes will increase the overall capacity of the cluster and speed up query execution, especially queries that require large table scans and aggregations. Additional leaf nodes also allow the cluster to process more queries in parallel.
The number of aggregator and leaf nodes deployed determines cluster capacity and performance. Typical deployments will have a 5:1 ratio of leaf nodes to aggregator nodes, but this ratio may vary depending on the workload. When considering your cluster design, keep in mind that applications serving many clients should have a higher aggregator to leaf node ratio, whereas applications with larger capacity requirements should have a lower aggregator to leaf node ratio.
Replication is the only time a node communicates with another node in its own tier (aggregator to aggregator or leaf to leaf). Other than replication, all communication between MemSQL nodes is implemented with SQL commands executed over the MySQL protocol. For example, cluster heartbeats are implemented as SELECT 1 queries, rather than with a special-purpose interface. Both tiers offer automatic failover in case of server failure to provide fault tolerance. For more information, see the "Planning Your MemSQL Cluster" section, which discusses MemSQL and High Availability.
Figure 2: Deployment Ratios
Shared Nothing Architecture
MemSQL has a shared nothing architecture, meaning no two nodes share storage or compute resources. MemSQL uses this architecture to facilitate massively parallel data processing. This design reduces latency by minimizing network communication and data transfer. In addition, MemSQL manages sharding with deterministic hashing so aggregators always know where data resides without the trial and error of checking multiple nodes, as is sometimes required with a peer-to-peer architecture.
Distributed Query Optimizer
When a client machine issues a query to an aggregator, the aggregator breaks the query down into partial-result statements and distributes them among leaf nodes. The distributed query optimizer ensures consistent workload utilization and balancing across the cluster.
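The scatter-and-gather pattern described here can be sketched for a simple average. In this illustration each "leaf" computes a partial (sum, count) pair over its own rows and the "aggregator" combines the partials. The function names and the in-process lists standing in for leaf nodes are assumptions; real query plans are far more general.

```python
def leaf_partial(rows, column):
    """What each leaf would compute locally: a partial (sum, count) pair."""
    values = [row[column] for row in rows]
    return sum(values), len(values)


def aggregator_avg(partitions, column):
    """Combine per-leaf partials into a global average."""
    partials = [leaf_partial(rows, column) for rows in partitions]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None
```

Note that the leaves return sums and counts rather than local averages: averaging the per-leaf averages would give the wrong answer whenever partitions hold different numbers of rows.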
MemSQL performance scales linearly as nodes are added to the cluster. Nodes can be added "just in time," as more performance or capacity becomes necessary, while keeping the cluster online. Most distributed databases require the steps of making a backup, taking the cluster offline, configuring additional nodes, and then bringing the cluster back online. With MemSQL, all of these activities can be performed with the cluster online and running a normal workload.
MemSQL manages sharding automatically, and adding new nodes is as simple as making the cluster aware of the new machines and running a single REBALANCE PARTITIONS command.
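To give a feel for what a rebalance has to accomplish, the following sketch spreads partitions evenly over an enlarged set of nodes while moving as few partitions as possible. It is a hypothetical illustration of the goal, not the algorithm behind REBALANCE PARTITIONS; the function name and data layout are assumptions.

```python
def rebalance(assignment, nodes):
    """Return a new partition->node mapping with partitions spread evenly.

    assignment: dict of partition id -> current node name
    nodes: list of all node names, including newly added ones
    """
    ordered = sorted(nodes)
    target, extra = divmod(len(assignment), len(ordered))
    # The first `extra` nodes (in sorted order) may hold one additional partition.
    quota = {n: target + (1 if i < extra else 0) for i, n in enumerate(ordered)}
    held = {n: 0 for n in ordered}
    result, overflow = {}, []
    for part in sorted(assignment):
        node = assignment[part]
        if node in held and held[node] < quota[node]:
            result[part] = node        # partition stays where it is
            held[node] += 1
        else:
            overflow.append(part)      # partition must move
    for part in overflow:
        node = next(n for n in ordered if held[n] < quota[n])
        result[part] = node
        held[node] += 1
    return result
```

For example, eight partitions split evenly across two nodes end up two per node after two more nodes are added, with only four partitions changing location.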
Query Language and Integration
Interaction with MemSQL is similar to a traditional single machine RDBMS, even though the underlying architecture is distributed. Analysts and applications query the database by sending SQL statements through a single interface. MemSQL is ANSI SQL-92 compliant.
MemSQL is MySQL wire compatible, which means compatibility with the suite of MySQL connectors (ODBC and JDBC) and corresponding MySQL client libraries for every major programming language.
While planning your database schema is crucial, invariably there come times when it needs to be changed to better model data or accelerate computation. MemSQL facilitates this process with features like online ALTER TABLE which can be run during a normal workload while the cluster is online.
In addition, MemSQL offers native SQL support for JSON storage and manipulation. This allows users to store and query JSON and relational data together through a single interface, which provides flexibility in data modeling and an easy way to efficiently handle sparse data.
MemSQL is built with technology specifically designed for a distributed in-memory architecture. In terms of performance, these features set MemSQL apart from other in-memory offerings.
Code Generation and Compiled Query Plans
With disk I/O bottlenecks removed from an in-memory system, queries execute so quickly that dynamic SQL interpretation can impact peak performance. MemSQL addresses this by translating SQL statements into a compiled query execution plan.
With each new query, MemSQL automatically removes the parameters and generates an execution plan, written in C++, which is compiled to machine code. MemSQL stores compiled query plans in a repository called the plan cache. When future queries match an existing parameterized query plan template, MemSQL bypasses code generation and executes the query immediately using the cached plan. Executing a compiled query plan is much faster than interpreting SQL thanks to low level optimizations and the inherent performance advantage of executing compiled versus interpreted code.
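The parameterize-then-cache flow can be sketched as follows. The regular expression is a deliberate simplification (it only recognizes single-quoted strings and plain numeric literals), and `compile_fn` stands in for the expensive code generation step; the names `parameterize` and `PlanCache` are hypothetical and not part of MemSQL.

```python
import re

# Matches single-quoted strings and integer or decimal literals (a simplification).
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Replace literals with '?' and return (template, extracted parameters)."""
    params = []

    def _swap(match):
        params.append(match.group(0))
        return "?"

    return _LITERAL.sub(_swap, sql), params


class PlanCache:
    """Cache one 'compiled plan' per parameterized query template."""

    def __init__(self, compile_fn):
        self._compile = compile_fn   # stand-in for code generation + compilation
        self._plans = {}
        self.compilations = 0

    def execute(self, sql):
        template, params = parameterize(sql)
        plan = self._plans.get(template)
        if plan is None:             # first time: pay the compile cost
            plan = self._compile(template)
            self._plans[template] = plan
            self.compilations += 1
        return plan(params)          # later: reuse the cached plan
```

Two queries that differ only in their literal values map to the same template, so the second one skips compilation entirely.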
Compiled query plans provide performance advantages during mixed read and write workloads. Some companies use a caching layer on top of their RDBMS, usually a key-value store with SQL statements mapped to query results. This strategy may improve performance for queries on immutable datasets, but this approach runs into problems with frequently updated data. When the dataset changes, the cache must be repopulated with updated query results, a process rate-limited by the underlying database. In addition to the performance degradation, synchronizing the state across multiple data stores is invariably a difficult engineering problem. Query planning with MemSQL provides an advantage by executing a query on the in-memory database directly, rather than fetching cached results. This helps MemSQL maintain remarkable query performance even with frequently changing data.
Lock-free Data Structures and Multiversion Concurrency Control
MemSQL achieves high throughput using lock-free data structures and multiversion concurrency control (MVCC), which allows the database to avoid locking on both reads and writes. Traditional databases manage concurrency with locks, which results in some processes blocking others until they complete and release the lock. In MemSQL, writes never block reads and vice versa.
MemSQL uses a lock-free skiplist as the primary index-backing data structure. Lock-free skiplists are an efficient technique for searching and manipulating data, and they deliver both concurrency and performance benefits. This is in marked contrast to disk-based databases, which typically use B-trees to store indexes.
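For readers unfamiliar with the data structure, here is a minimal skiplist sketch. It is single-threaded and therefore not lock-free; a lock-free variant would replace the pointer updates in `insert` with atomic compare-and-swap operations, which this sketch does not attempt. The class names, the maximum level of 16, and the promotion probability of 0.5 are illustrative assumptions.

```python
import random


class _Node:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        self.forward = [None] * level   # forward[i] is the next node on level i


class SkipList:
    """Single-threaded skiplist mapping keys to values, kept in key order."""

    MAX_LEVEL = 16

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, None, self.MAX_LEVEL)
        self._level = 1                 # highest level currently in use

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < 0.5:
            level += 1
        return level

    def _predecessors(self, key):
        """Return, per level, the last node whose key is less than key."""
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key, value):
        update = self._predecessors(key)
        candidate = update[0].forward[0]
        if candidate is not None and candidate.key == key:
            candidate.value = value     # overwrite an existing key
            return
        level = self._random_level()
        self._level = max(self._level, level)
        node = _Node(key, value, level)
        for i in range(level):          # splice the node into each level
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def get(self, key, default=None):
        candidate = self._predecessors(key)[0].forward[0]
        if candidate is not None and candidate.key == key:
            return candidate.value
        return default

    def items(self):
        node = self._head.forward[0]
        while node is not None:
            yield node.key, node.value
            node = node.forward[0]
```

An insert only touches a handful of forward pointers near the new node, which is part of why skiplists lend themselves to lock-free implementations more readily than structures that need rebalancing.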
In the past, dealing with "Big Data"-sized datasets either required purchasing a monolithic appliance or expertly managing a fickle, manually sharded cluster. Neither is an appealing option. In addition to speed, MemSQL eliminates the complexity of managing and developing applications on a distributed database.
MemSQL supports automatic High Availability by setting the cluster redundancy level to two. The database will automatically provision and synchronize paired leaf nodes, creating partition-level master and slave replicas. Each node has roughly half of the master partitions and half of the slave partitions to make the most efficient use of CPU resources, rather than keeping a passive backup. In the event that a leaf node goes down, the aggregators automatically failover to the node's replica and promote the slave partitions to master with no noticeable performance degradation.
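The pairing and promotion behavior can be sketched with plain dictionaries. This is an illustration of the idea only: the function names, the alternating placement rule, and the use of "replica" for the secondary copy are assumptions, and a real cluster would also re-provision new replicas after a failure.

```python
def build_pairs(partitions, node_a, node_b):
    """Assign each partition a master and a replica across a pair of leaves.

    Masters alternate between the two nodes so both serve live traffic.
    """
    layout = {}
    for i, part in enumerate(partitions):
        master, replica = (node_a, node_b) if i % 2 == 0 else (node_b, node_a)
        layout[part] = {"master": master, "replica": replica}
    return layout


def fail_over(layout, failed_node):
    """Promote the replica of every partition whose master was on failed_node."""
    promoted = {}
    for part, roles in layout.items():
        if roles["master"] == failed_node:
            promoted[part] = {"master": roles["replica"], "replica": None}
        elif roles["replica"] == failed_node:
            promoted[part] = {"master": roles["master"], "replica": None}
        else:
            promoted[part] = dict(roles)
    return promoted
```

With four partitions on a pair of leaves, each leaf starts out as master for two of them; after one leaf fails, the survivor is master for all four.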
MemSQL shards data automatically, exposing a setting for the number of partitions per node (one partition per CPU core by default) and a REBALANCE PARTITIONS command. Traditionally, sharding data was a labor-intensive process and required the full attention of expert DBAs. Unlike legacy relational databases, which were designed to run on a single server, MemSQL is a distributed database with transparent, low maintenance sharding.
Durable by Default
In addition to redundancy, MemSQL ensures durability with logging and full database snapshots. If a node goes down, its state can be recreated by replaying the most recent snapshot and log files. Durability is optional and can be configured. In addition, users can specify the log size at which a snapshot is taken, as well as the size of the in-memory transaction buffer. Together, these variables allow administrators to tune the right combination of durability and performance for their organization.
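Recovery from a snapshot plus a log can be sketched in a few lines. The record format (a log sequence number, an operation, a key, and a value) and the function name `recover` are assumptions made for illustration; MemSQL's on-disk formats are not described in this paper.

```python
def recover(snapshot, log):
    """Rebuild in-memory state from a snapshot plus the log written after it.

    snapshot: dict with "lsn" (last log sequence number it covers) and "data"
    log: iterable of (lsn, op, key, value) records in increasing lsn order
    """
    state = dict(snapshot["data"])      # start from the snapshot contents
    for lsn, op, key, value in log:
        if lsn <= snapshot["lsn"]:
            continue                    # already reflected in the snapshot
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
        else:
            raise ValueError(f"unknown log operation: {op!r}")
    return state
```

Taking snapshots more often shortens the log that has to be replayed, which is the trade-off behind the configurable snapshot threshold mentioned above.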
MemSQL supports online backup and restore. Both operations can be accomplished with single commands and can run in the background during a normal workload. MemSQL writes logs on a per-partition basis. This eliminates disk contention as the logs are being written, a common issue for single server databases and distributed databases with a single consolidated log. The feature dramatically accelerates recovery time through parallelized recovery.
Deploying MemSQL is remarkably simple. MemSQL is a software solution that can be deployed on premises, on a private or public cloud, or on virtual machines or containers.
Planning Your MemSQL Cluster
When designing a MemSQL cluster, you should consider your use case and its requirements with respect to memory and flash or disk capacity, CPU, and network capabilities. Each of these resources affects database performance and should be tailored to your particular application. However, note that MemSQL is designed for simple, flexible administration and many administrative tasks can be accomplished with the cluster online.
When planning for the total capacity of your MemSQL cluster, consider the following factors:
- Primary data: The amount of RAM storage required for your application data.
- Indexes: Indexes have an initial fixed capacity cost per-row. After that, the total RAM required for each index will depend on the number of rows in the table, and will increase over time as the volume of data grows.
- Redundancy level: Redundancy Level 1 only stores a single copy of data, whereas redundancy Level 2 stores two copies of the data and requires twice the amount of primary data capacity.
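These factors combine into a rough sizing estimate, sketched below. The formula assumes that redundancy level 2 duplicates indexes along with primary data, that the usable RAM per leaf is known in advance, and that the typical 5:1 leaf-to-aggregator ratio applies; the function names and these assumptions are illustrative and should be checked against your own workload.

```python
import math


def required_ram_gb(primary_gb, index_gb, redundancy_level=1):
    """Estimate cluster RAM: (primary data + indexes) times copies stored."""
    if redundancy_level not in (1, 2):
        raise ValueError("redundancy_level must be 1 or 2")
    return (primary_gb + index_gb) * redundancy_level


def leaves_needed(total_ram_gb, usable_ram_per_leaf_gb):
    """Smallest number of leaf nodes whose usable RAM covers the total."""
    return math.ceil(total_ram_gb / usable_ram_per_leaf_gb)


def aggregators_needed(leaves, leaves_per_aggregator=5):
    """Aggregator count for a given leaf-to-aggregator ratio (at least one)."""
    return max(1, math.ceil(leaves / leaves_per_aggregator))
```

For instance, 200 GB of primary data with 50 GB of indexes at redundancy level 2 comes to 500 GB, which needs 8 leaves at 64 GB of usable RAM each and, at a 5:1 ratio, 2 aggregators.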
More CPU cores allow MemSQL to more broadly parallelize query execution. The number of CPU cores across the cluster will affect query performance more than any other single factor. By default, MemSQL creates one partition per CPU core on each node. Note, this ratio is recommended but can be modified.
In a distributed database environment, query execution time is a function of network performance. To ensure high performance levels with MemSQL, the network should be at least a gigabit network to provide sufficient bandwidth for sending queries and results between the aggregators and leaf nodes.
Traditionally, configuring a distributed computing environment has been a time and labor-intensive process. However, MemSQL is easy to install and configure. There are three options for installing your MemSQL cluster:
a. Easy Installation: Use the MemSQL Cluster Installer to deploy on premises or on any cloud service provider. The Cluster Installer is a command line tool that installs and configures an entire MemSQL cluster from a single machine.
b. Spin up Cluster in Cloud: Deploy MemSQL on Amazon EC2 using cloud.memsql.com, which provides a graphical installation wizard that provisions and configures an entire cluster automatically. This is the easiest way to get started with MemSQL.
c. Manual Installation: Install and configure your MemSQL cluster manually. RPM, DEB, and direct binary installation (tar.gz) options are available. This option requires installing MemSQL on each node in the cluster and configuring aggregator and leaf relationships from the command line.
Prerequisites for installing MemSQL
- Software Requirements
|Operating System||64-bit Linux-based operating system. MemSQL will not run on 32-bit Linux|
|Client Utilities||MySQL client tools must be installed to connect to MemSQL|
|Compiler||g++ compiler (if needed for your Linux distribution). If it is not already installed, the MemSQL installation process will attempt to install it for you|
The following Linux distributions are officially supported.
|Distribution||Minimum Version||Appropriate g++ Command|
|Amazon AMI||2012.03||sudo yum install gcc-c++|
|CentOS||6.0||sudo yum install gcc-c++|
|Debian||6.0||sudo apt-get install g++|
|Fedora||14||sudo yum install gcc-c++|
|OpenSUSE||11.0||sudo zypper install gcc-c++|
|Red Hat||6.0||sudo yum install gcc-c++|
|Ubuntu||8.04||sudo apt-get install g++|
- Hardware Requirements
The following are the hardware requirements for installing MemSQL.
|CPU||x86-based server with at least 8 cores. At a minimum, Intel Core i3 or better processor|
|Memory||Recommended: 64-96GB. Minimum: 16GB|
|Hard Drive||SSD or SATA 7200RPM hard drive with storage capacity at least 3x the amount of system memory|
In addition, note the following hardware considerations:
- MemSQL will run on machines with less than 16GB RAM, but this is not recommended.
- MemSQL capacity is limited by the amount of RAM on the host machine. Increasing RAM increases the amount of available data capacity.
- MemSQL is optimized for architectures supporting SSE4.2, but it will run on most earlier architectures.
MemSQL in the Cloud
MemSQL administrative features lend themselves to simple deployment in the cloud. In particular, data loading and query execution performance improve linearly as the cluster scales and administrators can add and remove nodes while keeping the cluster online. These features take advantage of cloud services that allow users to spin up servers at will, offering a company the ability to tailor a MemSQL cluster to their performance and storage needs.
In a large database market, MemSQL distinguishes itself by offering customers incomparable speed, scalability, and simplicity. While many databases offer some subset of these features, all three are necessary for a reliable, future-proofed Big Data solution. MemSQL offers performance out of the box with minimal tuning required, and the flexibility to make schema changes and add or remove nodes while the cluster remains online, even running a normal workload. This makes MemSQL not only the most powerful, but also the most agile in-memory database on the market.