ANALYTICS

Using MemSQL and Looker for Real-Time Data Analytics

Floyd Smith

MemSQL is a fast, scalable SQL database. Looker is a fast, scalable analytics platform. You can use MemSQL and Looker to create a fast, scalable – yes, those words again – analytics solution that works well across a wide range of data ingest, transaction processing, and analytics needs.

Both MemSQL and Looker are flexible and powerful tools. With the ability to provide full ANSI SQL support, MemSQL has the ability to works with a wide range of analytics tools. For Looker, its ability to connect to any SQL data source allows it to work well with a vast number of databases. Looker also optimizes its database interface to take advantage of specific database features, as you will see below.

When paired together, MemSQL and Looker combine these areas of strength to deliver consistent and concrete results. For instance, one of the most popular applications for real-time analytics is to create a real-time dashboard. There may not be an easier or more effective way to create such dashboards than to first implement MemSQL and Looker together atop your existing architecture. Use Looker to make creating your dashboard easy, and use MemSQL to make it fast.

Speeding Up Analytics with MemSQL and Looker

You can use the combination of Looker and MemSQL atop an existing data architecture to make data much easier to access and greatly speed up performance. MemSQL is faster than competing solutions; often twice as fast, at half the cost. You can also use MemSQL to take over some or all of the work currently done by an existing SQL or NoSQL database, further improving performance.

A solid example of an organization using MemSQL to speed up analytics performance is the online retail company Fanatics. Fanatics sources and sells branded merchandise for some of the world’s leading sports teams, including the NBA and the NFL, along with global brands such as Manchester United. Fanatics uses MemSQL to create a fast and reliable data architecture for all their analytics needs – including apps, business intelligence (BI) tools, and ad hoc SQL queries.

Data from all sources powers streaming analytics in Fanflow using MemSQL
Fanatics’ Fanflow feeds into robust analytics capabilities powered by MemSQL

Looker can also be used alongside existing BI and analytics tools. Where you use Looker, you’ll gain high-performance SQL query performance and ease of use, thanks to the LookML modeling layer. By implementing Looker, you can begin to create and foster a true data culture at your organization.

One company that has done this is Kollective, which has the demanding job of distributing video content for a wide range of customers, including Fortune 500 companies like ExxonMobil, HSBC, and T-Mobile. Kollective chose MemSQL and Looker and uses the tools together for real-time analytics. You can read the case study or watch a joint webinar presented by people from both companies.

You can also use MemSQL to replace both your existing database types – transactional and analytical – with a single, converged database, MemSQL. As you do so, you’re removing existing batch loading and extract, transform, and load (ETL) processes from your overall data flow. This change brings apps and analytics tools – including Looker – closer to your source data. With a properly architected solution, you can achieve near-real-time or real-time analytics.

Together, MemSQL and Looker support much broader access to your data. By making data much easier to access, Looker increases the number of people who want to dig into data and the frequency with which they access it. MemSQL contributes, with its outstanding performance and its high degree of concurrency. As a scalable SQL database, MemSQL lets you power your data with the amount of hardware that you need to get the performance that you want, for all the users who need it.

MemSQL’s unique ability to offer this kind of solution is mentioned by Looker in Looker’s Pocket Guide to Databases. Looker describes MemSQL as a database that is:

  • Powered by both rowstore functionality, traditionally used mostly for transactions, and columnstore functionality, traditionally used mostly for analytics.
  • A massively parallel processing (MPP) database, capable of smoothly scaling out across multiple nodes.
  • Both a self-managed (aka on-premises) MPP database and an on-demand (aka cloud) MPP database.
Looker's guide describes a wide range of databases, including MemSQL, and how best to use them with Looker
Looker’s guide to databases features MemSQL

Setting Up MemSQL to Work Well with Looker

A typical “small” MemSQL implementation has two aggregator nodes, four leaf nodes, 128GB of RAM, and – through the use of mixed rowstore and columnstore data – up to perhaps a terabyte of total data. You add nodes to support larger and larger amounts of data.

In its early implementations, MemSQL worked as a very fast, rowstore, in-memory database. Several years ago, MemSQL added columnstore functionality, which keeps data – including strongly compressed data – on disk, with a solid chunk of RAM dedicated for use as a cache over the columnstore.

Because MemSQL functions as both a rowstore and columnstore database in one, most operations proceed at or near in-memory speed. This allows data that’s presented in Looker to appear as near-real-time analytics, at a cost closer to that of a disk-based system.

More recently, MemSQL has added support for semi-structured data. Geospatial data, JSON data, and AVRO data (a specialized, compressed format based on JSON) are all supported, easy to manage, and with performance very close to fully structured data.

You don’t really need to do anything special to MemSQL to make it work well with Looker. In fact, Looker is designed to take full advantage of MemSQL’s capabilities.

Looker supports MemSQL’s semi-structured data formats. You don’t have to limit their use in order to keep your data available for analytics, and you don’t have to worry that Looker, as your analytics tool, will bog down on semi-structured data. (In order to take advantage of the data, you first need to schematize it into a relational format.) You can store and manage your data in the way that makes sense for the specific data you’re storing and the queries you’ll be making against it.

Second, Looker can flexibly use rowstore or columnstore data. This allows you to maximize your use of either, or both, without worrying about the needs of your analytics program. For example, if you really feel the need for speed, you can keep more of your MemSQL data in memory, assumedly in rowstore format. You can then let Looker do the work needed to efficiently run queries that would normally only work well against columnstore data.

Setting Up Looker to Work Well with MemSQL

One of the advantages of using Looker with MemSQL is that Looker “gets” MemSQL. Looker works smoothly and well across rowstore and columnstore tables, hiding the implementation details from the people and applications generating the queries, with excellent performance.

Looker has specific setup instructions for use with a set of databases that are deeply MySQL-compatible: MySQL, Clustrix, MariaDB, and MemSQL. For all of these databases, you can enable either persistent or regular derived tables. Derived tables are powerful tools that can give you more capability in LookML and more performance from your SQL queries.

Looker and MemSQL together also help users to resolve a challenge in using any database that supports SQL, including MemSQL. The challenge is to easily generate SQL that’s optimized for the database in question.

With Looker and MemSQL together, you have four options:

  1. Write your own SQL. Many people are so SQL-conversant that this is an easy option for them, for simple queries.
  2. Let LookML generate SQL for you. Looker generates highly-performant SQL queries that query your database, and are optimized for it, directly from LookML’s modeling layer.
  3. Use Looker’s SQL Runner to optimize your query. SQL Runner has a wide range of capabilities, including the ability to test derived tables.
  4. Use MemSQL Studio and MemSQL’s command-line tools. With these tools, you can profile and optimize your database for maximum performance against the queries generated as ad hoc SQL queries (#1 above), from Looker (#2 and #3 above), and from other sources, including machine learning and AI programs.

Note: Code generated from LookML by Looker (see #2 above) is likely to run faster than handwritten SQL. For instance, Looker takes advantage of the MemSQL Persistent Derived Tables capability to generate optimized tables – in MemSQL – for extremely fast performance of one-time or repeated queries.

You can further speed up analytics by optimizing your data storage choices in many ways, taking advantage of MemSQL’s flexible use of rowstore and columnstore tables. For instance, you can construct a dashboard in Looker that’s backed entirely by rowstore tables in SQL for optimal performance. Or you can mix columnstore and rowstore data flexibly to target the price-performance that you need.

You can use Looker and MemSQL together to iteratively optimize your database structure and analytics needs. The two companies have been working together for years. For a quick demo of building an analytics app with MemSQL, using Looker as the analytics tool, please view our webinar on the topic.

Looker is one of scores of BI tools you can easily use with MemSQL.
Our MemSQL webinar shows how to set up MemSQL,
connect to it from Looker, and quickly create an analytics app.

Ready To Get Started?

Want to learn more and get started with MemSQL? You can get started for free. Or reach out to our team to learn more about how MemSQL can work for you.

And if you’re ready to find out how Looker can enable data-driven insights at your organization, contact the Looker team to request a demo and connect with their experts.

MemSQL Helios eclipse
Introducing
MemSQL Helios
The World’s Fastest Cloud Database