Photo: Martin Taylor
We often hear “How can I use MemSQL together with my Oracle database?”
As a relational database, MemSQL is similar to an Oracle database, and can serve as an alternative to Oracle in certain scenarios. Here is what sets MemSQL apart:
- MemSQL is a distributed system, designed to run on multiple machines with a massively parallel processing architecture. An Oracle database, by contrast, runs on a single large machine or on a smaller cluster of fixed size.
- MemSQL has two primary data stores: an in-memory rowstore and a disk-based columnstore. An Oracle database has one primary data store: a disk-based rowstore. Oracle does offer an in-memory option that lets users keep a columnar copy of the disk-based rowstore data in memory, but even then, all in-memory data must first be created on disk.
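To make the rowstore and columnstore distinction concrete, here is a minimal Python sketch of the two layouts. It is only an illustration of the general idea, not how MemSQL or Oracle store data internally; the table, column names, and helper functions are all hypothetical.

```python
# Hypothetical three-row table; the column names are illustrative only.
rows = [
    {"id": 1, "region": "east", "amount": 40},
    {"id": 2, "region": "west", "amount": 25},
    {"id": 3, "region": "east", "amount": 35},
]

# Rowstore layout: each record is kept together, so fetching or
# updating one whole row by key touches a single entry.
rowstore = {row["id"]: row for row in rows}

# Columnstore layout: each column is kept together, so an aggregate
# over one column scans a single list.
columnstore = {name: [row[name] for row in rows] for name in rows[0]}


def lookup(row_id):
    """Point lookup, the access pattern a rowstore favors."""
    return rowstore[row_id]


def total_amount():
    """Column scan, the access pattern a columnstore favors."""
    return sum(columnstore["amount"])
```

Roughly speaking, the row layout suits transactional reads and writes of individual records, while the column layout suits analytical scans over a few columns of many records.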
With its distinct architecture, MemSQL complements Oracle in several cases where users can deploy the two databases side by side. These include:
- MemSQL as the real-time analytics engine for an Oracle database
- MemSQL as the ingest layer for an Oracle database
- MemSQL as the stream processing layer for an Oracle database
MemSQL as the Real-Time Analytics Engine for an Oracle Database
Enterprises typically have Oracle databases in place for transactional (OLTP) workloads. In these cases, batch processes (ETL) are typically run at the end of each day to transfer data into a separate Oracle data warehouse for analytical (OLAP) workloads. Within the Oracle data warehouse, data is then aggregated and rolled up for efficient querying.
MemSQL is fast enough to eliminate the need for this batch processing. Data can be copied from the OLTP Oracle database into MemSQL as it changes, through Oracle GoldenGate or another change data capture tool, and analytical queries can be run in real time.
By eliminating ETL, MemSQL minimizes the time between data entering the system and insight being drawn from that data. Ultimately, this improves an enterprise’s ability to make decisions in real time.
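The change data capture flow described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the event shape (`op`, `key`, `row`) is invented for illustration and is not the record format of Oracle GoldenGate or any other tool, and a plain dictionary stands in for the analytics database.

```python
from collections import defaultdict

# Hypothetical change events, in commit order. Real change data capture
# tools have their own record formats; this shape is assumed.
changes = [
    {"op": "insert", "key": 1, "row": {"region": "east", "amount": 40}},
    {"op": "insert", "key": 2, "row": {"region": "west", "amount": 25}},
    {"op": "update", "key": 1, "row": {"region": "east", "amount": 55}},
    {"op": "delete", "key": 2, "row": None},
    {"op": "insert", "key": 3, "row": {"region": "west", "amount": 10}},
]


def apply_change(replica, change):
    """Apply one change event to the replica, keyed by primary key."""
    if change["op"] == "delete":
        replica.pop(change["key"], None)
    else:
        # Inserts and updates both leave the latest row image in place.
        replica[change["key"]] = change["row"]


def amount_by_region(replica):
    """Analytical query that runs directly against the replica."""
    totals = defaultdict(int)
    for row in replica.values():
        totals[row["region"]] += row["amount"]
    return dict(totals)


replica = {}
for change in changes:
    apply_change(replica, change)
```

Because each change is applied as it arrives, a query such as `amount_by_region(replica)` can run at any point and see the current state, with no end-of-day batch step in between.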
MemSQL as the Ingest Layer for an Oracle Database
For many enterprises using Oracle databases, the rate at which data is inserted into the database can be too high for an affordable Oracle system to handle. Ingest performance for an Oracle database is limited by its disk: every insert into Oracle must be persisted to disk at each database commit. The Oracle in-memory option does not help with inserts, as “in-memory” data is just a copy of the disk-resident data. As such, any insert into an Oracle database is limited by disk speeds.
For optimal ingest that takes full advantage of in-memory processing, you need a memory-first database like MemSQL. Using MemSQL as the data ingest layer in front of an Oracle database allows ingest at in-memory speed.
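One way to picture an ingest layer is as an in-memory buffer that accepts writes immediately and forwards them to the slower store in batches. The Python sketch below assumes that reading of the pattern. The `IngestBuffer` class and its `sink` callback are hypothetical names, and the sketch ignores durability and failure handling, which a real database has to provide.

```python
class IngestBuffer:
    """Accepts writes in memory and forwards them downstream in batches."""

    def __init__(self, sink, batch_size):
        self.sink = sink              # callable that persists a list of rows
        self.batch_size = batch_size
        self.pending = []

    def insert(self, row):
        # The caller returns as soon as the row is buffered in memory.
        self.pending.append(row)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # One downstream write covers the whole batch.
        if self.pending:
            self.sink(self.pending)
            self.pending = []


# Stand-in for the slower, disk-bound store: it records each batch it receives.
batches = []
buffer = IngestBuffer(sink=batches.append, batch_size=3)
for i in range(7):
    buffer.insert({"id": i})
buffer.flush()  # forward the final partial batch
```

In this sketch the downstream store sees a few batched writes instead of one write per insert, which is the effect an ingest layer aims for.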
MemSQL as the Stream Processing Layer for an Oracle Database
Streaming data has become quite popular, yet the Oracle database was designed long ago, well before data streams from sources like Apache Kafka came about. These streams often carry unstructured, high-volume data that requires real-time transformation, and processing such a stream requires a system designed for the purpose. To that end, MemSQL provides an integrated Apache Spark solution called Streamliner. Streamliner makes it easy to deploy Spark within MemSQL for ingesting and enriching data streams. With Streamliner, MemSQL can serve as the stream processing layer in front of an Oracle database.
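The kind of transformation and enrichment described above can be sketched with plain Python generators. This is not the Streamliner or Spark API. The message format, the `DEVICE_REGION` lookup table, and the function names are all assumptions made for illustration.

```python
import json

# Hypothetical reference data used to enrich each event.
DEVICE_REGION = {"d1": "east", "d2": "west"}


def parse(lines):
    """Turn raw text lines into dicts, skipping anything that is not a JSON object."""
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict):
            yield event


def enrich(events):
    """Attach a region to each event, falling back to "unknown"."""
    for event in events:
        region = DEVICE_REGION.get(event.get("device"), "unknown")
        yield {**event, "region": region}


# Stand-in for messages read from a stream such as a Kafka topic.
raw_stream = [
    '{"device": "d1", "temp": 21.5}',
    'not json at all',
    '{"device": "d9", "temp": 19.0}',
]
loaded = list(enrich(parse(raw_stream)))
```

Each stage consumes records one at a time, so malformed input is dropped and the remaining records are enriched before they reach the database.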
The Avant-Garde of New Relational Databases
In Meet the Avant-Garde of New Relational Databases, Alex Woodie cites a recent Gartner research report from Adam Ronthal. The report states that “over the next three years, 70 percent of new projects requiring ‘scale-out elasticity, distributed processing and hybrid cloud capabilities for relational applications, as well as multi-data-center transactional consistency’, will prefer an ‘emerging’ database vendor over traditional vendors.”
Enterprises with existing Oracle databases should consider adding an ‘emerging’ database like MemSQL into the mix for the benefits of scale-out, distributed processing, and memory-first technology.