
Spin up a MemSQL Cluster on Linux in 10 Minutes

Rob Richardson

This blog post is part of a series describing how to run an entire MemSQL cluster – which usually requires at least five servers or machine instances – on your laptop, in a Linux virtual machine (VM). That’s right – even though MemSQL is a distributed system, you can run a minimal version of MemSQL on your laptop, in a single Linux VM. We tell you how in this blog post. 

You also have the option of running MemSQL on your laptop in other environments; see our blog posts for Docker Desktop, Kubernetes, and Vagrant. Use whichever one you have the most experience with, or whichever is most compatible with your work environment. Whichever method you use, the combination of free access and being able to run MemSQL on your laptop can be extremely convenient for demos, software testing, developer productivity, and general fooling around.

In this post we’ll quickly build a single-instance MemSQL cluster running on Linux, in a VM, on a laptop computer, for free. You’ll need a machine with at least 8GB RAM and four CPUs. This is ideal for quickly provisioning a system that will help you test and understand the capabilities of the SQL engine. Everything we build today will be running on your machine, and we’ll not need to install or configure much to get it running.

The steps here are: get a free MemSQL license; install and boot a VM; install the MemSQL engine and tools; provision MemSQL as a cluster-in-a-box; browse to MemSQL Studio; and create a database. You’ll have a bare-bones MemSQL cluster running on your laptop machine in no time.

Why a Single VM?

Virtual Machines (VMs) are a great way to run software in a protected sandbox and easy-to-manage environment, with less overhead than a dedicated server, and less ceremony than with containers. You can use VMs to spin up applications and systems to try out software — even from a Mac or Windows machine, or to quickly spin up a database to support local application development. We’ll use a Linux VM to provision and spin up a free MemSQL cluster – and, just as easily, to destroy it when we’re done.

Using a Linux terminal makes it much easier to explore the pieces in the MemSQL ecosystem and provision a small cluster. MemSQL’s cluster-in-a-box configuration that we’ll use here includes an aggregator node and a leaf node running on a single machine. We’ll add MemSQL Studio (our browser-based SQL editor and database maintenance tool), all running in one place, all configured to work together. The minimal hardware footprint wouldn’t be nearly enough for production workloads, but it allows us to quickly spin up a cluster, connect it to our project, and try things out.

There are other options for running MemSQL’s cluster-in-a-box setup. We could use containers running on Kubernetes or Docker Desktop. If your production cluster runs on Linux machines or VMs, taking the approach described in this post helps add parity between production and development environments. If production is running in Kubernetes, you may prefer running MemSQL on Kubernetes. If you use containers without Kubernetes, you may find running the MemSQL container with a docker-compose.yaml file easier. You also have the option of using Vagrant, which, unlike the Docker containers used in the other two options, has its own separate instance of the operating system.

With the single-machine MemSQL cluster described here, you can craft the simplest of tables all the way up to running a complex app, a dashboard, a machine learning model, streaming ingest from Kafka or Spark, or anything else you can think of, against MemSQL. You’ll quickly understand the methodology and features of MemSQL, and you can plan accordingly for real-world deployments.

In this post, we’ll disable MemSQL’s minimum hardware specs, but you’ll still want a machine with at least 8GB RAM and four CPUs. With specs well below MemSQL’s limits, you’ll see poor performance, so this system is definitely not the right setup for a proof of concept (PoC). But you can use this setup to experience the full features of MemSQL, and understand how it applies to your business problems.

Once you know how MemSQL works, you can take these experiments and use what you learn to help you achieve your service-level agreements (SLAs) on distributed clusters that meet MemSQL’s minimum requirements and your high-availability needs. That’s when you can really open up the throttle, learn how your data performs on MemSQL, and dial in system performance for your production workloads.

| | Cluster-in-a-box | Multi-node Cluster |
| Hardware | 💻 Laptop computer | 🖥️🖥️🖥️ Many hefty machines |
| Best use-case | Try out MemSQL; test MemSQL capabilities; prototypes | Proof of concept (PoC); production workloads; high availability; performance testing |
| Cost | Free up to four nodes with 32GB RAM each, and with community support | Free up to four nodes with 32GB RAM each, and with community support |

 

Sign Up For MemSQL

To get a free license for MemSQL, register at memsql.com/download and click the link in the confirmation email. Then go to the MemSQL customer portal at portal.memsql.com and log in. Click “Licenses” and you’ll see your license for running MemSQL for free. This license never expires, and is good for clusters of up to four machines and up to 128GB of combined RAM. This is not the license you’ll want for a production cluster, but it’s great for these “kick the tires” scenarios. Note this license key. We’ll need to copy/paste it into place later.

Install VirtualBox

The first step in getting our MemSQL cluster running is to get a Linux Virtual Machine. You can use VirtualBox, Xen, VMware, Hyper-V, or any other virtualization technology to craft a Linux VM. In this example we’ll use VirtualBox because it’s free and available on all platforms. If you’re already using virtualization technology, you can skip this step.

Go to https://www.virtualbox.org/wiki/Downloads and download VirtualBox for your platform. Open the installer and follow the prompts. If you’re running on Windows, this will disable Hyper-V.

Start a Linux VM

Download a modern version of an Ubuntu VM from https://www.osboxes.org/ubuntu-server/, or from https://virtualboxes.org/images/ubuntu/, or from your corporate VM catalog. We’ll exclusively use the terminal, so choose a Server version if prompted. A server version removes the Desktop shell. You won’t have a mouse or VM tools installed in this configuration.

The MemSQL database is tested to run on RHEL-based and Debian-based Linux Operating Systems, but with minor variation, you may be able to run this tutorial on other systems as well.

Start up your virtualization software, and create a new virtual machine. In VirtualBox, the “New VM” button is on the toolbar. As you create the new VM, configure these settings:

  1. Set the VM to 64-bit mode. MemSQL doesn’t run on 32-bit systems.
  2. Set the memory to 8GB RAM. If your laptop doesn’t have 8GB of RAM free, set it as high as you can.
  3. Don’t create a new virtual hard drive. Instead, choose the Linux machine you downloaded.
  4. Ensure the VM has a network card.

With the VM created, click Start, and boot the VM. Now let’s get to work installing MemSQL.
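Once the VM is up, it can be worth confirming that it really has the memory and CPU count we configured. Here is a minimal sketch, assuming a Linux guest with /proc/meminfo and nproc available. The function names are made up for illustration, and the 7.5GB threshold is my own allowance for the fact that Linux reports slightly less than the configured RAM.

```shell
#!/bin/sh
# Suggested minimums from this post: 8GB RAM and four CPUs.
# Assumption: allow ~7.5GB, because MemTotal is a little below the configured RAM.
MIN_MEM_KB=$((7680 * 1024))
MIN_CPUS=4

# meets_minimum MEM_KB CPUS -> exit status 0 only if both values meet the minimums.
meets_minimum() {
    [ "$1" -ge "$MIN_MEM_KB" ] && [ "$2" -ge "$MIN_CPUS" ]
}

# Helpers that read the real values inside the VM (hypothetical names).
vm_mem_kb() { awk '/^MemTotal:/ {print $2}' /proc/meminfo; }
vm_cpus()   { nproc; }

# Usage inside the VM:
#   if meets_minimum "$(vm_mem_kb)" "$(vm_cpus)"; then
#       echo "VM meets the suggested minimum"
#   else
#       echo "VM is below the suggested minimum; expect slower performance"
#   fi
```

If the check fails, the tutorial still works, but you may need the hardware-check overrides described later in this post.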

Installing MemSQL

We’re following the install steps for Debian-based systems to match the Ubuntu VM. If you chose a RedHat-based Linux OS, switch to the RedHat setup steps in the MemSQL docs for this section.

In the Linux VM, open a terminal if necessary, and run these commands. We won’t be saving any files into the directory, so we don’t need to change directory into a specific folder. Each command in this section is run as root because we’re configuring the system here. The output of these commands is verbose and roughly amounts to console barf. As long as the commands return without error, we’re good.

  1. Become root. Though we could run each command with sudo, it’s easier to do it all at once.
sudo su
  2. Allow apt to install packages over HTTPS:
apt update
apt install -y apt-transport-https
  3. Add MemSQL’s apt repository to the system:
wget -O - 'https://release.memsql.com/release-aug2018.gpg' 2>/dev/null | apt-key add -
apt-key list
echo "deb [arch=amd64] https://release.memsql.com/production/debian memsql main" | tee /etc/apt/sources.list.d/memsql.list

Note that there are three commands here. If the lines wrap on your monitor, you may need to remove line breaks as you type the commands.

  4. Install the latest version of MemSQL Toolbox, the MemSQL command-line client, and MemSQL Studio:
apt update
apt install -y memsql-toolbox memsql-client memsql-studio
  5. Start MemSQL Studio, the browser-based administration tool, as a background service:
systemctl start memsql-studio
  6. Stop being root:
exit
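Because the install output is so verbose, a quick sanity check afterwards can be reassuring. The sketch below reports which of the expected commands are not on the PATH. The function name missing_tools is hypothetical; the command names are the ones that appear elsewhere in this post.

```shell
#!/bin/sh
# missing_tools NAME... -> prints each NAME that is not found on the PATH, one per line.
# Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage after the install steps:
#   missing_tools memsql-deploy memsql memsql-studio
#   systemctl is-active memsql-studio   # should print "active" once the service is running
```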

MemSQL Components

What did we install? Let’s look at the three tools and understand what each does.

  1. memsql-toolbox adds the admin tools for provisioning a cluster. We’ll use memsql-deploy from this toolbox to start our cluster.
  2. memsql-client adds memsql, a lightweight client application that allows you to run SQL queries against your database from a terminal window. We won’t use it in this tutorial, but it’s really handy to have when you just need to pull some data real fast.
  3. memsql-studio is the browser-based cluster administration tool and SQL query executor. We ran this one as a service to keep it running in the background.

In a production environment, we may choose to install only memsql-toolbox on our main machines, and leave the other tools for ancillary machines. In this developer-focused, single-machine setup, we installed all three onto the same machine.

Starting the MemSQL Cluster

We’ll start a “cluster-in-a-box” cluster. This cluster configuration has a single aggregator node and a single leaf node, both running on the same machine. In a production scenario, we’d want these on different machines, and many more than one.

Go to portal.memsql.com, switch to the license tab, and copy your license key. It’s really long, and likely ends in ==.

Reusing the terminal from above, run this command as a regular (non-root) user, from any directory:

memsql-deploy cluster-in-a-box --license "YOUR_LICENSE_HERE" -y
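If you would rather not paste the license key into the command line each time, one option is to keep it in an environment variable. This is only a sketch: MEMSQL_LICENSE and DEPLOY_CMD are names I made up for illustration, and the memsql-deploy arguments are exactly the ones shown above.

```shell
#!/bin/sh
# deploy_cluster_in_a_box: runs the same memsql-deploy command as above, but reads
# the key from MEMSQL_LICENSE (hypothetical variable name) instead of a pasted literal.
# DEPLOY_CMD (also hypothetical) lets us swap in another command for a dry run.
deploy_cluster_in_a_box() {
    if [ -z "${MEMSQL_LICENSE:-}" ] || [ "$MEMSQL_LICENSE" = "YOUR_LICENSE_HERE" ]; then
        echo "Set MEMSQL_LICENSE to the key from portal.memsql.com first" >&2
        return 1
    fi
    "${DEPLOY_CMD:-memsql-deploy}" cluster-in-a-box --license "$MEMSQL_LICENSE" -y
}

# Usage:
#   export MEMSQL_LICENSE='...your key...'
#   DEPLOY_CMD=echo deploy_cluster_in_a_box   # dry run: prints the arguments only
#   deploy_cluster_in_a_box                   # real run
```

The guard simply refuses to run with an empty key or with the placeholder text, which are the two easiest mistakes to make here.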

You can verify it’s running by running:

ps -ef | grep memsql

My output looks like this:

root      3597     1  0 00:17 ?        00:00:03 /usr/bin/memsql-studio
memsql    3779     1  0 00:18 ?        00:00:00 /opt/memsql-server-7.0.11-df50c6ab30/memsqld_safe --defaults-file /var/lib/memsql/6c659cc1-2c1a-4c7c-95fb-f2b96c53d0d8/memsql.cnf --user 110 --auto-restart StagedEnable
memsql    3785  3779 12 00:18 ?        00:00:55 /opt/memsql-server-7.0.11-df50c6ab30/memsqld --defaults-file /var/lib/memsql/6c659cc1-2c1a-4c7c-95fb-f2b96c53d0d8/memsql.cnf --user 110
memsql    3787  3785  0 00:18 ?        00:00:00 /opt/memsql-server-7.0.11-df50c6ab30/memsqld --defaults-file /var/lib/memsql/6c659cc1-2c1a-4c7c-95fb-f2b96c53d0d8/memsql.cnf --user 110
memsql    4038     1  0 00:18 ?        00:00:00 /opt/memsql-server-7.0.11-df50c6ab30/memsqld_safe --defaults-file /var/lib/memsql/aeebcd10-7103-4b81-aef8-49f1308a0bde/memsql.cnf --user 110 --auto-restart StagedEnable
memsql    4044  4038 12 00:18 ?        00:00:53 /opt/memsql-server-7.0.11-df50c6ab30/memsqld --defaults-file /var/lib/memsql/aeebcd10-7103-4b81-aef8-49f1308a0bde/memsql.cnf --user 110
memsql    4046  4044  0 00:18 ?        00:00:00 /opt/memsql-server-7.0.11-df50c6ab30/memsqld --defaults-file /var/lib/memsql/aeebcd10-7103-4b81-aef8-49f1308a0bde/memsql.cnf --user 110
rob       4399  4384  0 00:25 pts/0    00:00:00 grep memsql
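The process list is long, so here is a rough way to summarize it. In the output above, each node appears to have one memsqld_safe supervisor process with its own configuration directory, so counting those lines should give the node count. That one-supervisor-per-node reading is my assumption from this sample output, and count_nodes is a hypothetical helper name.

```shell
#!/bin/sh
# count_nodes: reads `ps -ef` style text on stdin and counts the lines whose
# command field (column 8) ends in /memsqld_safe.
# Assumption: one memsqld_safe process per MemSQL node, as in the sample output.
count_nodes() {
    awk '$8 ~ /\/memsqld_safe$/ {n++} END {print n + 0}'
}

# Usage:
#   ps -ef | count_nodes
```

For the cluster-in-a-box described here, I would expect this to print 2: one aggregator and one leaf.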

If your cluster doesn’t start correctly, you can run these additional commands as root to disable hardware capacity checks:

sudo memsqlctl -yj update-config --all --key minimum_core_count --value 0
sudo memsqlctl -yj update-config --all --key minimum_memory_mb --value 0

After running these commands, re-run the memsql-deploy command as a regular (non-root) user.

Congratulations! We’ve launched a MemSQL cluster. Let’s dive in and start using it.

Start MemSQL Studio

As we installed the software, we started MemSQL Studio as a service. Then we launched the MemSQL database. Now let’s use them.

First we need to find the IP address of the Linux VM. In the terminal on the VM, run:

ifconfig

The results will look like this:

lo: flags=73 mtu 16384
    inet 127.0.0.1 netmask 0xff000000 
    inet6 ::1 prefixlen 128 
    … snip ...
en0: flags=4163 mtu 1500
    inet 10.0.2.15 netmask 0xffffff00 broadcast 192.168.102.255
    inet6 0102::0102:0102:0102:0102 prefixlen 64 scopeid 0x20 
    ether 01:02:03:04:05:06 
    … snip ...

We’re looking for the IP address of the VM, so we’ll ignore the loopback interface (127.0.0.1). Look for the inet line, and grab the first set of numbers. My VM’s IP is 10.0.2.15.
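If you would rather not pick the address out by eye, the same rule (take the inet lines, skip loopback) can be sketched with awk. The function name vm_ipv4 is hypothetical, and the sketch assumes ifconfig output shaped like the sample above, or the older "inet addr:" style.

```shell
#!/bin/sh
# vm_ipv4: reads ifconfig output on stdin and prints each IPv4 address
# that is not a loopback (127.x.x.x) address, one per line.
vm_ipv4() {
    awk '$1 == "inet" {
        sub(/^addr:/, "", $2)          # older ifconfig prints "inet addr:10.0.2.15"
        if ($2 !~ /^127\./) print $2
    }'
}

# Usage inside the VM:
#   ifconfig | vm_ipv4
```

On many Linux systems, hostname -I prints similar information with less typing.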

Next, let’s launch MemSQL Studio. It runs on port 8080. Open your favorite browser on your laptop, and browse to http://YOUR_IP:8080/, substituting your VM’s IP. So I would browse to http://10.0.2.15:8080/.
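Before opening the browser, you can check from a terminal on your laptop that something is answering on that address. This is a small sketch, assuming curl is installed on the laptop. studio_url is a hypothetical helper that builds the address from the VM’s IP and the port named above.

```shell
#!/bin/sh
# studio_url IP [PORT]: prints the MemSQL Studio address.
# The port defaults to 8080, the port named in this post.
studio_url() {
    printf 'http://%s:%s/\n' "$1" "${2:-8080}"
}

# Usage from the laptop (prints only the HTTP status code of the response):
#   curl -s -o /dev/null -w '%{http_code}\n' "$(studio_url 10.0.2.15)"
```

If nothing answers, check that the memsql-studio service is running in the VM and that your virtualization software’s network settings let the laptop reach the VM.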

  1. Click “Add New Cluster”
  2. The host name is localhost because MemSQL Studio and the MemSQL cluster are both running on the same machine.
  3. The port is 3306, the port of the master aggregator.
  4. The username is root, and the password is blank.
  5. Mark the cluster as Development.
  6. Give the cluster a name and description, such as “MemSQL dev cluster”.
  7. Click “Create Cluster”.
    Note: this button is disabled until you’ve filled in all the required fields.

We’re now directed to the main dashboard screen for our cluster.

On this main dashboard screen, we can see the health of the cluster. Note that this is a two-node cluster. Clicking on Nodes on the left, we see one node is a leaf node, one is an aggregator node, and they’re both running on the same machine. In production, we’d want more machines running together to support production-level workloads and to provide high availability.

Click on the SQL Editor page, and we see the query window. In the query window, type each command, select the line, then push the execute button on the top-right. 

For more details on MemSQL Studio, check out the docs or watch the MemSQL Studio tour video.

Where Can We Go From Here?

MemSQL is now ready for all the “kick the tires” tasks we need. You could:

  1. Hook up your analytics dashboard to MemSQL, connecting to the machine’s IP on port 3306.
  2. Start your application and connect it to MemSQL using any MySQL connector.
  3. Create an ingest pipeline from Kafka, S3, or other data source. 
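For a quick connection test before wiring up a dashboard or application, the memsql command-line client we installed earlier can send a statement to the same port. The sketch below assumes the client accepts mysql-style -h, -P, -u, and -e flags, and uses the same connection details we gave Studio (root user, blank password, port 3306). run_sql, MEMSQL_CLIENT, and MEMSQL_HOST are hypothetical names for illustration.

```shell
#!/bin/sh
# run_sql "STATEMENT": sends one SQL statement to the master aggregator on port 3306
# as root with a blank password.
# Assumption: the memsql client takes mysql-style flags (-h, -P, -u, -e).
# MEMSQL_CLIENT and MEMSQL_HOST are hypothetical overrides, mainly for dry runs.
run_sql() {
    "${MEMSQL_CLIENT:-memsql}" -h "${MEMSQL_HOST:-127.0.0.1}" -P 3306 -u root -e "$1"
}

# Usage inside the VM:
#   run_sql "SHOW DATABASES"
# Usage from another machine, substituting your VM's IP:
#   MEMSQL_HOST=10.0.2.15 run_sql "SHOW DATABASES"
```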

Being able to run these tasks from such a simple setup is a real time-saver. However, don’t expect the same robustness or performance as you would have with a full install on a full hardware configuration.

Cleanup

We’ve finished our experiment today. To stop the VM, run this command:

sudo shutdown

This will stop the VM. As the VM powers down, so too will the MemSQL Studio service and the MemSQL nodes. If you’re done experimenting with MemSQL, delete the VM. Or better yet, leave this machine in place, to quickly start up your next experiment.

Conclusion

With the MemSQL cluster-in-a-box configuration and a Linux virtual machine, we quickly provisioned a “kick the tires” MemSQL cluster. We got to experience more of the components and configuration of a MemSQL cluster than we would have with a pre-provisioned system. We saw how easy it is to spin up a cluster, connect to it with MemSQL Studio, and start being productive. Now go build great things!
