
Spin up a MemSQL Cluster on Vagrant in 10 Minutes

Rob Richardson

This blog post is part of a series describing how to run an entire MemSQL cluster – which usually requires at least five servers or machine instances – on your laptop, in a Linux virtual machine (VM) provisioned by Vagrant, a tool for managing virtual machines. That’s right – even though MemSQL is a distributed system, you can run a minimal version of MemSQL on your laptop, in a single Linux VM, with a tool you may already know how to use. We tell you how in this blog post.

You also have the option of running MemSQL on your laptop in other environments – see our blog posts for Docker Desktop, Kubernetes, and Linux (without Vagrant). Use whichever one you have the most experience with, or whichever is most compatible with your work environment. Whichever method you use, the combination of free access and the ability to run MemSQL on your laptop can be extremely convenient for demos, software testing, developer productivity, and general fooling around.

In this post we’ll quickly build a single-instance MemSQL cluster running on Linux, in a VM provisioned by Vagrant, on a laptop computer, for free. You’ll need a machine with at least 8GB RAM and four CPUs. This is ideal for quickly provisioning a system that will help you test and understand the capabilities of the SQL engine. Everything we build today will be running on your machine, and we’ll not need to install or configure much to get it running.

The steps here are: get a free MemSQL license; install Vagrant and virtualization software; install MemSQL engine and tools; provision MemSQL as a cluster-in-a-box; browse to MemSQL Studio; and create a database. You’ll have a bare-bones MemSQL cluster running on your laptop machine in no time.

Why Vagrant and Why a Single VM?

Virtual machines (VMs) are a great way to run software in a protected, easy-to-manage sandbox, with less overhead than a dedicated server and less ceremony than containers. You can use VMs to try out applications and systems, even from a Mac or Windows machine, or to quickly spin up a database to support local application development. We’ll use a Linux VM to provision and spin up a free MemSQL cluster and, just as easily, destroy it when we’re done.

Vagrant makes it easy to create a fresh VM provisioned exactly the way you want. A Vagrantfile specifies the exact details of the VM and the initialization scripts to run. When Vagrant finishes instantiating the machine, everything is set up exactly as you expect. We can pause (halt) and resume machines to continue work, or destroy and recreate VMs to ensure we have the latest versions running in the VM. In production we would likely use Terraform for this level of automation, but when running locally, Vagrant is a simpler, developer-friendly tool.

MemSQL’s cluster-in-a-box configuration that we’ll use here includes an aggregator node and a leaf node running on a single machine. We’ll add MemSQL Studio (our browser-based SQL editor and database maintenance tool), all running in one place, all configured to work together. The minimal hardware footprint wouldn’t be nearly enough for production workloads, but it allows us to quickly spin up a cluster, connect it to our project, and try things out.

There are other options for running MemSQL’s cluster-in-a-box setup. We could use containers running on Kubernetes or Docker Desktop. If your production cluster runs on Linux machines or VMs, taking the approach in this post helps add parity between production and development environments. If production is running in Kubernetes, you may prefer running MemSQL on Kubernetes. If you use containers without Kubernetes, you may find running the MemSQL container with a docker-compose.yaml file easier. You can also run MemSQL in a Linux virtual machine without Vagrant.

With the single-machine MemSQL cluster described here, you can craft the simplest of tables all the way up to running a complex app, a dashboard, a machine learning model, streaming ingest from Kafka or Spark, or anything else you can think of against MemSQL. You’ll quickly understand the methodology and features of MemSQL, and you can plan accordingly for real-world deployments.

In this post, we’ll disable MemSQL’s minimum hardware specs, but you’ll still want a machine with at least 8 GB RAM and four CPUs. With specs well below MemSQL’s limits, you’ll see poor performance, so this system is definitely not the right setup for a proof of concept (PoC). But you can use this setup to experience the full features of MemSQL, and understand how it applies to your business problems.

Once you know how MemSQL works, you can take these experiments and use what you learn to help you achieve your service-level agreements (SLAs) on distributed clusters that meet MemSQL’s minimum requirements and your high-availability needs. That’s when you can really open up the throttle, learn how your data performs on MemSQL, and dial in system performance for your production workloads.

Cluster-in-a-box:
  Hardware: 💻 Laptop computer
  Best use-case:
  • Try out MemSQL
  • Test MemSQL capabilities
  • Prototypes
  Cost: Free up to four nodes with 32GB RAM each, and with community support

Multi-node Cluster:
  Hardware: 🖥️🖥️🖥️ Many hefty machines
  Best use-case:
  • Proof of concept (PoC)
  • Production workloads
  • High availability
  • Performance testing
  Cost: Free up to four nodes with 32GB RAM each, and with community support

 

Sign Up For MemSQL

To get a free license for MemSQL, register at memsql.com/download and click the link in the confirmation email. Then go to the MemSQL customer portal at portal.memsql.com and log in. Click “Licenses” and you’ll see your license for running MemSQL for free. This license never expires, and is good for clusters up to four machines and up to 128GB of combined RAM. This is not the license you’ll want for a production cluster, but it’s great for these “kick the tires” scenarios. Note this license key. We’ll need to copy/paste it into place later.

Install Vagrant and VirtualBox

The first step in getting our MemSQL cluster running is to get Vagrant to provision a Linux Virtual Machine. Vagrant uses VirtualBox by default, but can easily be configured to use other hypervisors as well.

On Windows, Hyper-V conflicts with other virtualization technology, so if Hyper-V is enabled, use it as the Vagrant provider and don’t install VirtualBox.

Head to Vagrant’s downloads page, download the version of Vagrant for your system, and install it. If you’ve chosen a VM provider not included in the install, install the third-party provider too.

If you’re on Mac or don’t have Hyper-V, go to https://www.virtualbox.org/wiki/Downloads and download VirtualBox for your platform. Open the installer and follow the prompts. If you’re on Windows and already have Hyper-V installed, you can skip this step.

Build the Vagrantfile

The Vagrantfile tells Vagrant how to provision the virtual machine. Though we could create the VM and type all these commands by hand, putting them in a Vagrantfile allows the tool to do this for us.

Create an empty directory, and create a file named Vagrantfile (not Vagrantfile.txt) in the empty folder. Add this content to the file:
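A minimal Vagrantfile consistent with the description in this section might look like the sketch below. The box name, the CPU and memory settings, and the two script names come from this post; the VirtualBox provider and the exact layout are assumptions for illustration, not the only way to write it.

```ruby
# Vagrantfile (sketch): one Ubuntu VM provisioned by two shell scripts.
Vagrant.configure("2") do |config|
  # base box, published for many providers
  config.vm.box = "generic/ubuntu1904"

  # set the provider (assumed here: VirtualBox)
  config.vm.provider "virtualbox"
  # configure the provider
  config.vm.provider "virtualbox" do |v|
    v.cpus = 4
    v.memory = 4096
  end

  # runs as root (the shell provisioner's default)
  config.vm.provision "shell", path: "provision.sh"
  # runs as the regular vagrant user
  config.vm.provision "shell", path: "start.sh", privileged: false
end
```

The `privileged: false` option is what makes the second script run as a regular user rather than as root.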

In this file, we’re using the generic/ubuntu1904 box available from https://app.vagrantup.com/boxes/search. I’ve chosen this box because there’s a version of it for many providers. We’ve also configured the VM to use 4 CPUs and 4 GB of RAM. This definitely isn’t enough for a production cluster, but will be great for a developer setup.

If you’re on Windows and using Hyper-V, change the provider section to this:

  # set the provider
  config.vm.provider "hyperv"
  # configure the provider
  config.vm.provider "hyperv" do |v|
    v.cpus = 4
    v.memory = 4096
    v.maxmemory = 4096
    v.enable_virtualization_extensions = true # hyperv only
  end

The Vagrantfile references two additional configuration files that will do the heavy lifting of installing MemSQL and starting the cluster.  One is run as root, the other is run as a regular user. Let’s create these two files in the same folder as the Vagrantfile:
provision.sh:

This file runs as root. It configures apt to connect to the MemSQL repository, then installs memsql-toolbox, memsql-client, and memsql-studio. Finally it starts memsql-studio as a service. We’ll come back to what’s in each of these packages.

For the next file, we’ll need the license key. Go to portal.memsql.com, switch to the license tab, and copy your license key. It’s really long, and likely ends in ==.
start.sh:

Paste your license key into place in this file. Because we’ve embedded the license key in this file, it’s not appropriate to check this file into source control. Though it is possible to pass parameters into Vagrant as you provision a VM, and then use the parameters in a Vagrantfile, in this developer-centric workflow, that’s unnecessary complexity.

With Vagrant installed, and the Vagrantfile and supporting scripts in place, we’re ready to let Vagrant provision our VM.

Open a terminal in the directory with all the files. On Windows, run the command prompt as Administrator.

Then run this:

vagrant up

On Windows using Hyper-V, it will first prompt you for the network interface to use. Choose a network interface with internet access so that it can install the MemSQL packages. It’ll also ask for your Windows credentials so it can create a file share to get the provision.sh and start.sh files into the VM.

The output will begin with details like this:

Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing a Virtualbox instance
    default: Creating and registering the VM...
    default: Successfully imported VM
    default: Configuring the VM...
==> default: Starting the machine...
==> default: Waiting for the machine to report its IP address...
    default: Timeout: 120 seconds
    default: IP: 192.168.184.164
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 192.168.184.164:22
    default: SSH username: vagrant

Take note of the machine’s IP address here. We’ll need it later as we connect to the machine.

The first time this launches, it’ll download the VM from HashiCorp’s site, so it may look frozen for a time. Not to worry: it’s still working.

From here you’ll see all the console output from each of the provisioning commands. I find it wonderfully fascinating to watch the console output roll by. In time, Vagrant will finish provisioning the machine, and return the command prompt to you.

It’s running!

Note: By convention, the username for Vagrant machines is vagrant, and the password is also vagrant.

Install MemSQL Components

What did we install? Let’s look at the three tools and understand what each does.

  1. memsql-toolbox adds the admin tools for provisioning a cluster. We used memsql-deploy from this toolbox to start our cluster.
  2. memsql-client adds memsql, a lightweight client application that allows you to run SQL queries against your database from a terminal window. We won’t use it in this tutorial, but it’s really handy to have when you just need to pull some data real fast.
  3. memsql-studio is the browser-based cluster administration tool and SQL query executor. Vagrant ran this one as a service to keep it running in the background.

In a production environment, we may choose to install only memsql-toolbox on our main machines, and leave the other tools for ancillary machines. In this developer-focused single machine setup, we installed all three onto the same machine.

Start MemSQL Studio

As Vagrant installed the software, it started MemSQL Studio as a service and launched the MemSQL database. Now let’s use them.

We grabbed the IP address of the Vagrant-provisioned VM at the beginning of the output from vagrant up. If you missed it, scroll back up in the console.

MemSQL Studio runs on port 8080. Open your favorite browser on your laptop, and browse to http://YOUR_IP:8080/ substituting your VM’s IP. Since my VM is on 192.168.184.164 I would browse to http://192.168.184.164:8080/.

  1. Click “Add New Cluster”
  2. The host name is localhost because MemSQL Studio and the MemSQL cluster are both running on the same machine.
  3. The port is 3306, the port of the master aggregator.
  4. The username is root, and the password is blank.
  5. Mark the cluster as Development.
  6. Give the cluster a name and description such as “MemSQL dev cluster”.
  7. Click “Create Cluster”.
    Note: this button is disabled until you’ve filled in all the required fields.

We’re now directed to the main dashboard screen for our cluster.

On this main dashboard screen, we can see the health of the cluster. Note that this is a two-node cluster. Clicking on Nodes on the left, we see one node is a leaf node, one is an aggregator node, and they’re both running in the same VM. In production, we’d want more machines running together to support production-level workloads and to provide high availability.

Click on the SQL Editor page, and we see the query window. In the query window, type each command, select the line, then push the execute button on the top-right. 


For more details on MemSQL Studio, check out the docs or watch the MemSQL Studio tour video.

Where Can We Go From Here?

MemSQL is now ready for all the “kick the tires” tasks we need. You could:

  1. Hook up your analytics dashboard to MemSQL, connecting to the machine’s IP on port 3306.
  2. Start your application and connect it to MemSQL using any MySQL connector.
  3. Create an ingest pipeline from Kafka, S3, or other data source. 
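As one illustration of connecting an application, here is a minimal Ruby sketch that lists the databases on the dev cluster. It assumes the mysql2 gem is installed (it is only one of many MySQL connectors that should work), and it reuses the root user, blank password, and port 3306 from the Studio setup above. The helper names `memsql_options` and `list_databases` are hypothetical, chosen for this example.

```ruby
# Sketch: list databases on the dev cluster from Ruby.
# Assumes the mysql2 gem is available (gem install mysql2).

# Build connection settings for the master aggregator.
# host is the VM's IP from the `vagrant up` output.
def memsql_options(host, password: "")
  { host: host, port: 3306, username: "root", password: password }
end

# Connect, run SHOW DATABASES, and return the database names.
def list_databases(options)
  require "mysql2" # loaded lazily so memsql_options works without the gem
  client = Mysql2::Client.new(options)
  client.query("SHOW DATABASES").map { |row| row["Database"] }
ensure
  client&.close
end

# Example call, using the IP shown earlier in this post:
#   puts list_databases(memsql_options("192.168.184.164"))
```

Swap in your own VM’s IP when you call it. A blank root password is only reasonable for a throwaway development cluster like this one.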

Being able to run these tasks from such a simple setup is a real time-saver. However, don’t expect the same robustness or performance as you would have with a full install on a full hardware configuration.

Cleanup

We’ve finished our experiment today. To stop and delete the VM, run this command:

vagrant destroy

This will stop and delete the VM. If you’re done experimenting with MemSQL, you can also remove the downloaded Ubuntu box by running:

vagrant box remove generic/ubuntu1904

Or better yet, leave this machine in place to quickly start up your next experiment.

Conclusion

With the MemSQL cluster-in-a-box configuration and a Vagrant-provisioned Linux virtual machine, we quickly stood up a “kick the tires” MemSQL cluster. Vagrant took all the work out of provisioning the cluster, and much like a container, handed us a fully running system. We saw how easy it is to spin up a cluster, connect to it with MemSQL Studio, and start being productive. Now go build great things!
