
Spin Up a MemSQL Cluster on Kubernetes in 10 Minutes

Rob Richardson

Even though MemSQL is a distributed system, you can run a minimal version of MemSQL on your laptop in Kubernetes, and this blog post shows you how. The combination of free access and the ability to run MemSQL on your laptop can be extremely convenient for demos, software testing, developer productivity, and general fooling around.

In this post we’ll quickly build a single-instance MemSQL cluster running inside Kubernetes on Docker Desktop, on a laptop computer, for free. You’ll need a machine with at least 8 GB RAM and four CPUs. This is ideal for quickly provisioning a system to understand the capabilities of the SQL engine. Everything we build today will be running on your machine, and with the magic of containers, we won’t need to install or configure much of MemSQL to get it running.


The steps here are: get a free MemSQL license; install Docker Desktop; create a Kubernetes yaml file; start the MemSQL cluster through Kubernetes; and browse to MemSQL Studio. You’ll have a bare-bones MemSQL cluster running on your laptop in no time.

Why Kubernetes, and Why a Single Container?

Containers are a great way to run software in a protected sandbox and easy-to-manage environment, with less overhead than a virtual machine (VM) – and much less than a dedicated server. You can use containers to spin up applications and systems to try out software, or to quickly spin up a database to support local application development. We’ll use Docker Desktop in Kubernetes mode to provision and spin up a free MemSQL cluster, and just as easily, destroy it when we’re done.

Using Kubernetes makes it much easier to run a small MemSQL cluster without interfering with other software running on the machine. The cluster-in-a-box container image that we’ll use here includes an aggregator node, a leaf node, and MemSQL Studio (our browser-based SQL editor and database maintenance tool), all running in one place, all pre-configured to work together. The minimal hardware footprint wouldn’t be nearly enough for production workloads, but it allows us to quickly spin up a cluster, connect it to our project, and try things out.

You also have the option of using Docker containers without Kubernetes. We believe that having Kubernetes in the mix makes it easier to manage your cluster and introduces you to a powerful modus operandi for running MemSQL. However, if you don’t already have Kubernetes in your production environment, or much experience running MemSQL, you may want to consider running MemSQL in Docker containers without Kubernetes. The steps to do that are very similar to the steps described in this blog post, and you can view them here.

We could also use a virtual machine (VM), but containers are lighter-weight. Like virtual machines, containers provide a sandbox between processes. But unlike VMs, containers virtualize the operating system instead of the hardware, and the configuration-as-code mindset shared by Docker and Kubernetes ensures that we can quickly provision a complete virtual system from a small text file stored in Git.

With the single-container MemSQL cluster described here, you can craft the simplest of tables all the way up to running a complex app, a dashboard, a machine learning model, streaming ingest from Kafka or Spark, or anything else you can think of against MemSQL. You’ll quickly understand the methodology and features of MemSQL, and can plan accordingly.

The MemSQL cluster-in-a-box container has minimum hardware specs disabled, but you’ll still want a machine with at least 8 GB RAM and four CPUs. With specs well below MemSQL’s limits, you’ll see poor performance, so this system is definitely not the right setup for a proof of concept (PoC). But you can use this setup to experience the full features of MemSQL, and understand how it applies to your business problems.

Once you know how MemSQL works, you can take these experiments and use what you learn to help you achieve your service-level agreements (SLAs) on distributed clusters that meet MemSQL’s minimum requirements and your high-availability needs. That’s when you can really open up the throttle, learn how your data performs on MemSQL, and dial in system performance for your production workloads.

                 Cluster-in-a-box              Multi-node Cluster
Hardware         Laptop computer               Many hefty servers
Best use-case    * Try out MemSQL              * Proof of concept (PoC)
                 * Test MemSQL capabilities    * Production workloads
                 * Prototyping                 * High availability
Cost             Free up to four nodes with    Free up to four nodes with
                 32GB RAM each, and with       32GB RAM each, and with
                 community support             community support

Sign Up For MemSQL

To get a free license for MemSQL, register at memsql.com/download and click the link in the confirmation email. Then go to the MemSQL customer portal at portal.memsql.com and log in. Click “Licenses” and you’ll see your license for running MemSQL for free. This license never expires, and is good for clusters of up to four machines and up to 128GB of combined RAM. This is not the license you’ll want for a production cluster, but it’s great for these “kick the tires” scenarios. Note this license key. We’ll need to copy/paste it into place next.

Install Docker Desktop

The first step in getting our MemSQL cluster running in Kubernetes (k8s) is to get Docker Desktop installed. We’ll use Docker Desktop’s Kubernetes mode as the simplest way to get a Kubernetes cluster. Though beyond the scope of this article, you can also use another k8s cluster such as Minikube, K3s, MicroK8s, or kind.

Docker’s install requirements are quite specific, though most modern mid-range systems will do. Docker Desktop for Windows runs a Linux VM in Hyper-V, and Hyper-V requires Windows 10 Pro or Enterprise. Docker Desktop for Mac runs a Linux VM in xhyve, and requires a 2010 or newer model with macOS 10.13 or better.

To install Docker Desktop, go to Docker Hub and choose the Docker Desktop version for your operating system. The download will require you to create a free account. Run the downloaded installer and accept all the defaults.

Note for Windows users: If you are doing a fresh install, ensure you choose “Linux Containers” mode. If you installed Docker previously, ensure you’re running in Linux containers mode. Right-click on the Docker whale in the system tray (bottom-right by the clock), and choose “Switch to Linux Containers”. If it says “Switch to Windows Containers”, you’re already in the right place – that is, in Linux Containers mode.

Note – adding more RAM: Though not required, MemSQL will definitely behave better when Docker Desktop has more capacity. Click on the Docker whale, choose “Settings…” on Windows or “Preferences…” on Mac, and click on the “Advanced” tab. If your machine has more than 8 GB RAM, set the memory to 8192 MB. If your machine has 8 GB RAM or less, set it as high as you can. Then change the CPU count from 2 to 4.

To turn on Kubernetes, open the Docker whale, choose “Settings…” on Windows or “Preferences…” on Mac, click the Kubernetes tab, and check “Enable Kubernetes”. If you don’t see this option, ensure you’re running in Linux containers mode or upgrade Docker Desktop. The first time you enable Kubernetes mode, it’ll take quite a while to download all the K8s control plane containers and start the cluster. Next time you start Docker, it’ll start much faster.
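Once Docker Desktop reports that Kubernetes is running, a quick check from a terminal can confirm that kubectl is talking to the right cluster. The sketch below assumes a bash-compatible shell (on Windows, something like Git Bash or WSL) and assumes the kubectl context is named docker-desktop, which is what recent Docker Desktop releases register; older releases may use a different name. The function name check_kubernetes is made up for this example.

```shell
# Hypothetical helper: point kubectl at Docker Desktop's cluster and list its nodes.
check_kubernetes() {
  # "docker-desktop" is an assumed default; pass another context name to override it.
  context="${1:-docker-desktop}"
  kubectl config use-context "$context" && kubectl get nodes
}
```

Run check_kubernetes after enabling Kubernetes. A single node listed with a status of Ready suggests the cluster is up.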

Kubernetes Configuration Files

Kubernetes stores configuration details in yaml files. (A yaml file is a text file that’s great for capturing our architecture setup.) Typically each yaml file contains a single resource. For simplicity, we’ll create one yaml file that includes both a deployment and a service.

We’ll connect to the service, the service will proxy to the pod, and the pod will route the request into the container.

We’ll use the memsql/cluster-in-a-box image built by MemSQL and available on Docker Hub. This image comes with the MemSQL database engine and MemSQL Studio preinstalled. The minimum system requirements are disabled in this “cluster-in-a-box” configuration.

Create an empty directory, and create a file named kubernetes-memsql.yaml inside. Open this file in your favorite code editor and paste in this content. As with Python source code, white space is significant.
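A minimal sketch of what that file could contain, assembled from the section-by-section description in this post and written as a shell heredoc so it can be created from a bash-compatible terminal. The app: memsql label, the "Y" value for START_AFTER_INIT, the port names database and studio, and the license placeholder are assumptions for illustration; the image name, container name, replica count, and port numbers come from the text.

```shell
# Write a sketch of kubernetes-memsql.yaml into the current directory.
# The quoted 'EOF' keeps the shell from expanding anything inside the file.
cat > kubernetes-memsql.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: memsql
  labels:
    app: memsql            # assumed label; must match the selectors below
spec:
  replicas: 1
  selector:
    matchLabels:
      app: memsql
  template:
    metadata:
      labels:
        app: memsql
    spec:
      containers:
        - name: memsql
          image: memsql/cluster-in-a-box
          ports:
            - containerPort: 3306   # database engine
            - containerPort: 8080   # MemSQL Studio
          env:
            - name: START_AFTER_INIT
              value: "Y"            # assumed value
            - name: LICENSE_KEY
              value: "paste-your-license-key-here"
---
apiVersion: v1
kind: Service
metadata:
  name: memsql
spec:
  type: NodePort
  selector:
    app: memsql
  ports:
    - name: database
      port: 3306
      targetPort: 3306
      nodePort: 30306
    - name: studio
      port: 8080
      targetPort: 8080
      nodePort: 30080
EOF
```

Replace the placeholder with the license key from portal.memsql.com before applying the file.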

Yaml indents with spaces, never tabs, and this file uses two spaces per level. Double-check the yaml file to ensure each nested section is indented two spaces further than its parent. If the indentation is inconsistent, or if you’re using tabs, you’ll get an error on startup.

Here are the sections in the kubernetes-memsql.yaml file:

--- designates the break between the two resources. The content above it is the deployment, the content below it is the service. The deployment manages and restarts pods on failure, and the service load-balances traffic across all matching pods. A pod is a Kubernetes wrapper around one or more containers.

Deployment:

  1. replicas: 1 notes that we only want one pod (container) to spin up.
  2. The metadata section is a list of key/value pairs. There’s nothing magic about the names and values here, but they must match between both references in the deployment and the service.
  3. In the containers list, we list only one container. We pull the memsql/cluster-in-a-box image built by MemSQL, and name the container memsql.
  4. In the ports section, we identify inbound traffic that’ll route into the container. Open port 3306 for the database engine and port 8080 for MemSQL Studio.
  5. The first environment variable, START_AFTER_INIT, exists for legacy reasons. Without this environment variable, the container will spin up, initialize the MemSQL cluster, and immediately stop. This is great for debugging, but not the behavior we want here.
  6. The second environment variable, LICENSE_KEY, holds the MemSQL license. In production scenarios, we’d pull this value from a k8s secret, but for this demo, paste your license key from portal.memsql.com into place in this file.
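For readers curious about the production-style alternative mentioned in the last item, here is a rough sketch of keeping the license in a Kubernetes Secret instead of inline. The Secret name memsql-license and the file name memsql-license-secret.yaml are hypothetical, and this demo does not need any of it.

```shell
# Write a Secret manifest holding the license key (names are hypothetical).
cat > memsql-license-secret.yaml <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: memsql-license
type: Opaque
stringData:
  LICENSE_KEY: paste-your-license-key-here
EOF

# In the deployment's env list, the LICENSE_KEY entry would then reference
# the Secret rather than carrying the key as a literal value:
#
#   - name: LICENSE_KEY
#     valueFrom:
#       secretKeyRef:
#         name: memsql-license
#         key: LICENSE_KEY
```

The Secret would need to be applied with kubectl apply -f memsql-license-secret.yaml before the deployment, so the pod can read it at startup.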

Service:

  1. The selector section matches the metadata from the deployment. This is how the service knows which pods to use to load-balance incoming traffic.
  2. Ports of type NodePort are exposed between 30,000 and 32,767, so we adjust the port numbers into this range. In the service, we route database traffic into k8s from port 30306 to the container on port 3306, and we route MemSQL Studio traffic into k8s from port 30080 to the container on port 8080. Through the magic of Kubernetes, only traffic on these two ports routes from the WAN side of the k8s router to the LAN side of the container. All other traffic is blocked. If either 30306 or 30080 is in use on your machine, change it to an open port between 30,000 and 32,767. It’s also possible to get Kubernetes to randomly assign a port, though that’s outside the scope of this article.

Save the file, and we’re ready to launch the resources in Kubernetes.

Starting the MemSQL Cluster

Open a new terminal window in the same directory as the kubernetes-memsql.yaml file. This could be PowerShell, a command prompt, or a regular terminal.

Type this in the shell:

kubectl apply -f kubernetes-memsql.yaml

This tells Kubernetes to create (or adjust) the service and deployment definitions, and to start up the container. The output from the container isn’t streamed to the console.

To see the status of the pod as it starts up, type:

kubectl get all

The results look like this:

Results as part of running MemSQL in Kubernetes on your laptop.
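Rather than re-running kubectl get all until the pod is up, one option is to let kubectl block until the deployment finishes rolling out. This is a small sketch for a bash-compatible shell; the function name wait_for_memsql is made up, and the deployment name memsql is inferred from the pod name shown in this post.

```shell
# Hypothetical helper: wait until the memsql deployment's pod is ready.
wait_for_memsql() {
  # Optional first argument is a timeout such as 120s or 5m (default 300s).
  kubectl rollout status deployment/memsql --timeout="${1:-300s}"
}
```

Calling wait_for_memsql right after kubectl apply returns once the pod is ready, or exits with an error when the timeout passes.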

If the pod status doesn’t say Ready then we need to look at the container’s logs. Grab the pod name. (In this case it’s pod/memsql-6cfd48586b-8b2fj.) Then type:

kubectl logs pod/memsql-YOUR_POD_HERE

Substitute your pod name into place.

If you get an error starting the cluster, double-check that the license key is correct. Then, from the Docker whale icon, ensure that both Docker and Kubernetes mode are running. If you get an image pull failure, ensure your network connection is working as expected.
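When it isn’t obvious what went wrong, a few kubectl commands gathered into one helper can save some typing. This is a sketch for a bash-compatible shell, not an official troubleshooting procedure; the function name memsql_diagnose is hypothetical, and the deployment name memsql is inferred from the pod name shown earlier.

```shell
# Hypothetical helper: collect the usual clues when the pod won't start.
memsql_diagnose() {
  # Pod status at a glance, including the READY column and restart count.
  kubectl get pods
  # Recent cluster events, oldest first; image pull failures usually show up here.
  kubectl get events --sort-by=.metadata.creationTimestamp
  # Tail of the container log, addressed through the deployment so the
  # generated pod name does not need to be copied by hand.
  kubectl logs deployment/memsql --tail="${1:-50}"
}
```

Run memsql_diagnose for the last 50 log lines, or pass a number such as memsql_diagnose 200 for more.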

To relaunch the Kubernetes content, type:

kubectl delete -f kubernetes-memsql.yaml
kubectl apply -f kubernetes-memsql.yaml

Congratulations! We’ve launched a MemSQL cluster. Let’s dive in and start using it.

Start MemSQL Studio

Now that MemSQL is running in Kubernetes, open the browser to http://localhost:30080 to launch MemSQL Studio. Click on local cluster, enter a username of root, leave the password blank, and log in.

Note for alternate Kubernetes environments: With slight modifications, these instructions also work with Minikube, MicroK8s, K3s, and other Kubernetes runtimes. Rather than browsing to http://localhost:30080, browse to the cluster’s IP address. For Minikube, open http://minikube:30080. Though it’s outside the scope of this article, you may need to switch the service to type LoadBalancer or adjust firewall rules if the cluster is not running locally. There are security implications to these changes.
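If the minikube hostname doesn’t resolve on your machine, the minikube ip command prints the node’s address, which can be combined with the NodePort. This is a sketch for a bash-compatible shell with a made-up function name, studio_url; depending on the Minikube driver in use, that address may not be reachable directly from the host.

```shell
# Hypothetical helper: build the MemSQL Studio URL for a Minikube cluster.
studio_url() {
  # minikube ip prints the IP address of the Minikube node.
  # Optional first argument overrides the Studio NodePort (default 30080).
  echo "http://$(minikube ip):${1:-30080}"
}
```

Running studio_url prints a URL to paste into the browser in place of http://localhost:30080.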

On this main dashboard screen, we can see the health of the cluster. Note that this is a two-node cluster. Clicking on Nodes on the left, we see one node is a leaf node, one is an aggregator node, and they’re both running in the same container. In production, we’d want more machines running together to support production-level workloads and to provide high availability.

Click on the SQL Editor page, and we see the query window. In the query window, type each command, select the line, then push the execute button on the top-right.


For more details on MemSQL Studio, check out the docs or watch the MemSQL Studio tour video.

Where Can We Go From Here?

MemSQL is now ready for all the “kick the tires” tasks we need. You could:

  1. Hook up your analytics dashboard to MemSQL, connecting to localhost:30306.
  2. Start your application and connect it to MemSQL using any MySQL connector.
  3. Create an ingest pipeline from Kafka, S3, or other data source.
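As one illustration of the MySQL-compatible connection, the stock mysql command-line client can reach the cluster on the database NodePort. This sketch assumes that client is installed separately (it is not part of this setup) and a bash-compatible shell; the function name memsql_query and the demo database and table in the commented example are made up.

```shell
# Hypothetical helper: run a SQL statement against the local cluster.
memsql_query() {
  # 127.0.0.1 rather than localhost, so the client connects over TCP
  # instead of looking for a local socket file.
  # root with no password matches the Studio login used above.
  mysql --host=127.0.0.1 --port=30306 --user=root --execute="$1"
}

# Example call (the demo database and greetings table are made up):
# memsql_query "CREATE DATABASE IF NOT EXISTS demo;
#               CREATE TABLE IF NOT EXISTS demo.greetings (id INT PRIMARY KEY, message VARCHAR(100));
#               REPLACE INTO demo.greetings VALUES (1, 'hello from Kubernetes');
#               SELECT * FROM demo.greetings;"
```

The same host, port, and credentials should carry over to a dashboard tool or an application’s MySQL connector.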

Being able to run these tasks from such a simple setup is a real time-saver. However, don’t expect the same robustness or performance as you would have with a full install on a full hardware configuration.

Cleanup

We’ve finished our experiment today. To stop the database, run this command:

kubectl delete -f kubernetes-memsql.yaml

This will delete the service and the deployment, which will delete the pod and stop the container. If you’re done experimenting with MemSQL, delete the image by running docker image rm memsql/cluster-in-a-box in the terminal. Or better yet, leave this image in place to quickly start up your next experiment. Run docker system prune to remove any stored data or dangling containers left in Docker Desktop and free up this space on your hard disk.

Conclusion

With the MemSQL cluster-in-a-box container and Kubernetes, we quickly provisioned a “kick the tires” MemSQL cluster. Aside from Docker itself, there was nothing to install, and thus cleanup is a breeze. We saw how easy it is to spin up a cluster, connect to it with MemSQL Studio, and start being productive. Now go build great things!
