Today we’re announcing the general availability of MemSQL 6. This is a big milestone for the product, which comes with new features to help customers get even more value out of MemSQL. The latest release includes breakthrough query performance, enhanced online operations, and extensibility. In this blog, we’ll take a deeper look at the new Extensibility features.
Why did you add Extensibility to MemSQL 6?
The Extensibility feature was built based on market demand, and enables people to move database workloads that require stored procedure functionality into MemSQL.
What are the main features that have been added with Extensibility?
Extensibility is primarily split into four newly added features: table-valued functions, user-defined functions, user-defined aggregate functions, and stored procedures. All four are written in the Extensibility language, but they have different execution characteristics and different application use cases.
For example, user-defined functions can be used in the MemSQL Pipelines code, so when data comes in from Pipelines the user-defined function can process or sanitize the incoming data.
Additionally, one of our engineers tested Extensibility by building a Mandelbrot set demo (code here) using table-valued functions. Also known as parameterized views, table-valued functions offer more flexibility than normal views because they can be called like functions and passed variables.
With Extensibility added in MemSQL 6, what can customers do now that they couldn’t before?
Extensibility gives customers the ability to write custom logic in the database for the manipulation and processing of data.
With Extensibility, more business logic can be moved into the database to take advantage of what the database already does well. Two examples of this are transactions and role-based security. With transactions, customers can be confident that every operation in a unit of business logic written with MemSQL Extensibility will either be totally complete or be fully rolled back.
Customers can also use the existing security systems from MemSQL to give specific users the ability to create, alter, or execute Extensibility functions and procedures. This means that database administrators can be more precise about which users are able to access or modify the underlying tables.
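The all-or-nothing transaction behavior described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, not MemSQL or MPSQL, and the `accounts` table, the `transfer` function, and the `make_demo_db` helper are all hypothetical names chosen for the example. The point is only the shape of the guarantee: either both updates in the unit of business logic are applied, or neither is.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing unit."""
    try:
        # `with conn` commits if the block succeeds and rolls back
        # every statement in it if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        # Both updates were rolled back; the table is unchanged.
        return False
```

In a database with stored procedures, the same unit of logic would live inside the database rather than in application code, so every caller gets the same guarantee.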
What is MPSQL?
MPSQL stands for Massively Parallel SQL, and it is the custom language introduced in MemSQL 6 for writing Extensibility functions and procedures. MPSQL syntax is inspired by Postgres PL/pgSQL, but it also includes some MemSQL-specific additions.
What added benefits are customers getting from MPSQL that they didn’t get before?
The functions and procedures use the same compilation system that MemSQL uses for normal queries. This means that they are compiled down to machine code and stored in the plan cache, which makes repeated execution faster.
MPSQL, like the rest of MemSQL, is distributed, so execution won’t be bottlenecked on a single machine in the cluster, and Extensibility code can run on multiple machines simultaneously.
Why did you create MPSQL?
Extensibility functions could not be expressed with only the syntax available in MemSQL 5. MPSQL is a Turing-complete language that supports all the usual control flow operations. It needed to be introduced into MemSQL so customers could take advantage of the new functionality.
Does Extensibility enable any other larger tech trends, such as machine learning?
Yes, machine learning is one of the primary use cases for Extensibility in MemSQL and in other database systems. It gives customers flexibility in processing data required for ML applications. We also added the built-in functions DOT_PRODUCT, VECTOR_SUB, and EUCLIDEAN_DISTANCE.
What do those functions allow customers to do?
These functions use AVX2 instructions to execute common linear algebra operations used in machine learning at memory-bandwidth speed.
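For readers less familiar with these operations, here is a plain Python sketch of the math that the three function names refer to. It is a reference for the arithmetic only: it does not model how MemSQL stores vectors or the AVX2 implementation, and the Python function names are simply lowercase stand-ins chosen for the example.

```python
import math


def dot_product(a, b):
    """Sum of the element-wise products of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def vector_sub(a, b):
    """Element-wise difference a - b of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x - y for x, y in zip(a, b)]


def euclidean_distance(a, b):
    """Straight-line distance: the square root of (a - b) dotted with itself."""
    diff = vector_sub(a, b)
    return math.sqrt(dot_product(diff, diff))
```

In a similarity search, for example, a query vector would be compared against every stored feature vector with one of these measures, which is why doing the arithmetic inside the database at memory-bandwidth speed matters.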
How would customers take advantage of these new features today?
MemSQL is a great addition to any machine learning system, especially one that has performance requirements, or one where the amount of data cannot fit on a single machine. By using Extensibility, data flows can be customized and processed to do a vast number of machine learning operations.
To learn more about how customers are taking advantage of DOT_PRODUCT, read more on this blog covering the real-time image recognition use case with Thorn.