MemSQL 5 Ships with LLVM-based Code Generation for SQL Queries

Nikita Shamgunov

We are proud to announce the general availability of MemSQL 5 today. A key milestone in this release is a full-fledged SQL compiler, which results in faster query processing across the board. Making this happen took several months of hard work, including a substantial rework of our existing database execution engine.

The new SQL compiler uses LLVM for code generation. LLVM is a modern compiler infrastructure that supports dynamic compilation of programming languages.

In addition to performance, MemSQL 5 has many new features and smarter query optimization.

Building the Fastest SQL Compiler

Query processing in database engines such as Oracle, SQL Server, or Postgres can benefit significantly from code generation, especially for complex analytical workloads. Too often, however, code generation has not received enough attention, teams have lacked compiler experts, or the need has not been acute. Today, with the rise of big data, proper code generation is critical, especially in memory-optimized systems where I/O is no longer a bottleneck.

Building code generation is a compilers project. We knew it would be an extremely ambitious engineering undertaking, but we also knew that the competitive advantages gained would greatly outweigh the cost.

Building a new SQL compiler within months requires a world-class team. For us, it started with Drew Paroski, who came to MemSQL as an architect specifically to design and lead our code generation efforts. He spent his initial weeks on the project understanding the current code and its shortcomings and prototyping new designs, working closely with MemSQL engineers Michael Andrews and David Stolp (aka Pieguy).

One big design decision was how far we should push code generation away from the classic volcano model for query processing. The trade-off was between using the volcano model and generating code for just parts of the SQL query, or abandoning that model to generate code for the whole query, including error handling and corner cases. We chose the latter approach for its composability and precise control over performance. This required more work, but ultimately it provides a significantly better experience for our customers.
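To make the trade-off concrete, here is a minimal sketch of the two styles for a toy query, SELECT SUM(amount) FROM t WHERE amount > threshold. It is not MemSQL's implementation: the table layout, the function names, and the use of Python's exec as a stand-in for an LLVM backend are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List, Tuple

Row = Tuple[int, int]  # (id, amount): a hypothetical two-column table


def scan(rows: List[Row]) -> Iterator[Row]:
    """Volcano-style leaf operator: yields one row per pull."""
    for row in rows:
        yield row


def select(child: Iterator[Row], pred: Callable[[Row], bool]) -> Iterator[Row]:
    """Volcano-style filter: pulls from its child and passes matching rows up."""
    for row in child:
        if pred(row):
            yield row


def sum_amount(child: Iterator[Row]) -> int:
    """Volcano-style aggregate: drains its child and sums the second column."""
    total = 0
    for _, amount in child:
        total += amount
    return total


def volcano_query(rows: List[Row], threshold: int) -> int:
    # The query is an operator tree; every row crosses each operator boundary.
    return sum_amount(select(scan(rows), lambda r: r[1] > threshold))


def compile_query(threshold: int) -> Callable[[List[Row]], int]:
    """Whole-query code generation: emit one fused loop as source, then compile it."""
    source = (
        "def generated(rows):\n"
        "    total = 0\n"
        "    for _, amount in rows:\n"
        f"        if amount > {threshold}:\n"
        "            total += amount\n"
        "    return total\n"
    )
    namespace: dict = {}
    # exec is only a stand-in here for handing generated code to a real compiler.
    exec(source, namespace)
    return namespace["generated"]


rows = [(1, 5), (2, 20), (3, 35)]
assert volcano_query(rows, 10) == compile_query(10)(rows)
```

In the operator-tree version, each row passes through several generic function boundaries. In the generated version, the scan, filter, and aggregate collapse into a single loop specialized to the query, with the constant inlined, which is the kind of control over the final code that whole-query generation makes possible.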

Download MemSQL 5 Today

MemSQL 5 is generally available. Download it now and enjoy its real-time capabilities. You can build real-time data pipelines and take them to unprecedented levels of scale and sophistication. Our customers solve their hardest data problems with MemSQL, and we are thrilled to continue driving database innovation for the industry.
