Many organizations find they need to support a database-as-a-service workload. This is a model where multiple organizations have data living side by side to reduce management costs, but each organization requires a strong namespace and security boundary around its data. Organizations also often want to leverage the existing ecosystem of tooling available on top of the data layer. This is hard to do with legacy databases because they do not scale out, and hard to do with NoSQL systems because they lack the tooling ecosystem and the structure that users want.
This pattern appears in two different scenarios.
The “Enterprise Database-as-a-Service”
This is where a large enterprise has an IT team that wants to enable self-service for departments that build their own applications or manage their own data for analytics. These departments need a database but don’t want to be responsible for managing the hardware or system software. The data owners own the logical management (defining their schema, tuning their queries, creating indexes, etc.), while the IT department manages the physical aspects (hardware, system software, and capacity management). The number of databases is usually a multiple of the number of departments that need this functionality, which means there could be hundreds or thousands of databases in a large organization. Most databases are small (tens or hundreds of gigabytes), with a few that are large (multiple terabytes). The activity on each database (i.e. how many users or applications are querying it at the same time) also varies widely depending on the use case, and data owners have differing SLAs for availability and durability of the data. Together, these factors make resource consumption highly variable. Given these requirements, it is a challenge for IT to operate and manage the databases and maintain the SLAs required by the data owners.
The “Multi-tenant SaaS Service”
This is where a company is building a multi-tenant service that is sold to organizations, and each organization owns its data. An example would be a marketing analytics service that takes in data about how a marketing campaign performed (with data coming from many sources), then offers canned and/or ad-hoc analytics over the campaign results. In this case, the service owner wants to easily separate the data for each of its customers for security, namespace, and performance reasons while still retaining a single control point for management (i.e. a single cluster to manage). Each database likely has the same schema, perhaps with small customizations, and schema updates need to be applied to every database to keep them in sync. Managing everything from a single control point amortizes the cost of management so that the service owner can maintain profitability as it acquires more customers. In addition, the service owner wants to support both very large and very small customer databases without worrying about over-provisioning, under-provisioning, or hitting the scale limits of a single node.
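To make the "same schema, kept in sync" requirement concrete, here is a minimal Python sketch of fanning one schema change out to every tenant database. It is an illustration only: the function name `apply_to_all_tenants`, the `execute` callable, and the tenant names are hypothetical, and `execute` stands in for whatever database client the service actually uses.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Conservative identifier check so tenant names can be safely quoted.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def apply_to_all_tenants(
    tenant_dbs: Iterable[str],
    migration_sql: str,
    execute: Callable[[str], object],
) -> List[Tuple[str, str]]:
    """Run one schema change against every tenant database.

    `execute` is a hypothetical callable that sends a single SQL
    statement to the cluster. Returns (database, error) pairs for
    tenants where the change failed, so they can be retried.
    """
    dbs = list(tenant_dbs)
    # Validate every name up front, before touching any database.
    for db in dbs:
        if not _IDENT.match(db):
            raise ValueError(f"unsafe database name: {db!r}")

    failed: List[Tuple[str, str]] = []
    for db in dbs:
        try:
            execute(f"USE `{db}`")
            execute(migration_sql)
        except Exception as exc:  # keep going; report failures at the end
            failed.append((db, str(exc)))
    return failed
```

Collecting failures instead of stopping at the first error is one possible design choice: it lets the remaining tenants stay in sync while the failed ones are investigated and retried.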
MemSQL is a great platform for building such a system. A database in a MemSQL cluster is a natural namespace and security boundary, and the cluster is a single control point for managing the system. Additionally, MemSQL is a distributed database, so you don’t have to worry about one of your customers hitting a scalability limit, as you would with single-box databases such as Azure SQL DB or Aurora. In legacy database systems, the largest databases have to be manually sharded in the application layer because they outgrow a single node. Manual sharding like this is difficult and error prone. MemSQL handles it naturally and transparently.
Customers can grow their usage as needed and simply use more resources in the cluster. These operations are online and therefore transparent to the data owners. Customers, especially smaller ones, share resources, which keeps costs low. Because of the sharding model of MemSQL, the workload is naturally spread evenly across the cluster. When your aggregate usage grows beyond what the cluster can handle, the cluster can be expanded by adding more nodes. MemSQL also has a resource governance mechanism to prevent one database from using more than its fair share of resources. Last, MemSQL supports both transactional and analytical workloads, making it appealing regardless of the workload type you need to support.
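As a rough illustration of using a database as the per-tenant namespace and security boundary, the Python sketch below builds the statements a provisioning script might send for a new tenant. The helper name `tenant_provisioning_sql`, the `tenant_` and `_app` naming scheme, and the chosen privileges are assumptions made for this example. The statements use MySQL-style syntax, and the exact form should be checked against the MemSQL documentation for your version.

```python
import re
from typing import List

# Hypothetical naming rule: short, lowercase tenant identifiers only.
_TENANT = re.compile(r"^[a-z][a-z0-9_]{0,31}$")


def tenant_provisioning_sql(tenant: str) -> List[str]:
    """Return SQL statements that give one tenant its own database.

    Assumes the application user `<tenant>_app` already exists;
    creating users and handling credentials is deliberately left out.
    """
    if not _TENANT.match(tenant):
        raise ValueError(f"unsafe tenant name: {tenant!r}")

    db = f"tenant_{tenant}"   # one database per tenant (namespace)
    user = f"{tenant}_app"    # user limited to that database (security)
    return [
        f"CREATE DATABASE IF NOT EXISTS `{db}`",
        # Privileges are scoped to this tenant's database only.
        f"GRANT SELECT, INSERT, UPDATE, DELETE ON `{db}`.* TO '{user}'@'%'",
    ]


if __name__ == "__main__":
    for statement in tenant_provisioning_sql("acme"):
        print(statement)
```

Because each grant is scoped to a single database, a tenant's application user cannot read or write another tenant's data, while the service owner still administers every database from the one cluster.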
So if you are an enterprise architect tasked with building a database-as-a-service model or a company building a new software-as-a-service offering, then MemSQL is a great option for your data layer.