
MemSQL Geospatial Operations Perform 2x – 24x Faster Than Alternatives

Shekhar Bapat

The mobile revolution, epitomized by the rise of GPS-enabled smartphones, is transforming our lives – the way we travel, connect with like-minded people, track our goods and shipments, manage traffic congestion, get threat alerts, and hunt for Pokemon.

Many estimates place the number of smartphone subscribers at 2 billion, a number expected to grow to 6.1 billion by 2020. Each of these devices serves as a mobile sensor that emits GPS data, providing geospatial data that can be used to track the movement of people, vehicles, and merchandise. This data presents rich opportunities for businesses to extract valuable insights and operational efficiencies, but its sheer volume requires a modern scale-out analytics solution that can process it without delay and deliver timely, actionable intelligence.

MemSQL is designed from the ground up as a massively parallel scale-out solution for real-time analytics with built-in support for geospatial queries. While geospatial capabilities are also available through other solutions, only MemSQL offers them in the context of an ACID-compliant in-memory scale-out operational data warehousing solution.

MemSQL offers strong transactional semantics, while ElasticSearch as a datastore offers no support for transactions at all. Both, however, offer powerful geospatial analytics, so it seems appropriate to benchmark MemSQL against ElasticSearch geolocation capabilities. In this analysis, we compare MemSQL geospatial query performance with ElasticSearch geolocation for tracking vehicles and creating alerts when the vehicles enter certain geofences. Such alerts can be used to improve logistics and security, reduce congestion, and monitor vehicle fleets. We simulate an urban scenario with 10M to 100M vehicles generating geospatial data points and 2,000 geofences. The system must ingest updated geospatial data points from vehicles and identify vehicles that show up within specified geofences.

MemSQL queries for creating schema for geofences and vehicles

CREATE REFERENCE TABLE IF NOT EXISTS locations (id integer primary key,
                                                name varchar(128),
                                                polygon geography DEFAULT NULL);

CREATE TABLE IF NOT EXISTS records (id integer primary key,
                                    location geographypoint,
                                    key(location));

MemSQL query for updating vehicle geolocations

INSERT INTO records (id, location)
VALUES (id, location pairs)
ON DUPLICATE KEY UPDATE location = VALUES(location);
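As a rough illustration of how the "(id, location pairs)" placeholder above might be filled in, the Python sketch below builds one multi-row upsert with driver-style `%s` placeholders. The helper name `build_upsert`, the sample coordinates, and the choice to pass each point as a WKT string of the form `POINT(longitude latitude)` are assumptions made for this example; they are not taken from the benchmark scripts.

```python
from typing import List, Sequence, Tuple


def build_upsert(rows: Sequence[Tuple[int, float, float]]) -> Tuple[str, List[object]]:
    """Build one multi-row upsert for (vehicle_id, longitude, latitude) rows.

    Hypothetical helper: returns the SQL text and a flat parameter list
    suitable for a DB-API cursor.execute(sql, params) call.
    """
    if not rows:
        raise ValueError("need at least one row")
    # One "(%s, %s)" group per vehicle: id, then location.
    placeholders = ", ".join(["(%s, %s)"] * len(rows))
    sql = (
        "INSERT INTO records (id, location) VALUES "
        + placeholders
        + " ON DUPLICATE KEY UPDATE location = VALUES(location)"
    )
    params: List[object] = []
    for vehicle_id, lon, lat in rows:
        # Assumption: points are sent as WKT text, longitude first.
        params.extend([vehicle_id, f"POINT({lon} {lat})"])
    return sql, params


# Made-up sample positions for two vehicles.
sql, params = build_upsert([(1, -122.4194, 37.7749), (2, -122.2712, 37.8044)])
```

Batching many rows into one statement is the usual way to keep ingest throughput high, since each round trip then updates many vehicles at once.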

MemSQL query for checking intersection of vehicles with geofences

-- shape_id is a placeholder for the id of the geofence being checked
SELECT r.id,
       r.location,
       l.id
FROM   records r,
       locations l
WHERE  l.id = shape_id AND geography_intersects(r.location, l.polygon);
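To make the geofence check above concrete, here is a minimal Python sketch of what "a vehicle point intersects a geofence polygon" means, using the classic ray-casting test on flat coordinates. This is only a conceptual illustration: it is not how `geography_intersects` is implemented in MemSQL, it ignores the curvature of the Earth, and the function name and sample data are invented for the example.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many polygon edges a ray going east crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the ring
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside


# A square geofence and two made-up vehicle positions.
geofence: List[Point] = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
vehicles: Dict[int, Point] = {1: (2.0, 2.0), 2: (5.0, 1.0)}
alerts = sorted(vid for vid, loc in vehicles.items() if point_in_polygon(loc, geofence))
```

Here only vehicle 1 falls inside the square, so it is the only one that would raise an alert. A database with a geospatial index avoids running a test like this against every row, which is where the query-rate differences in a benchmark come from.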

ElasticSearch geolocation mapping (schema) for geofences and vehicles

{
    "locations": {
        "properties": {
            "name" : {
                "type":"string"
            },
            "polygon": {
                "type": "geo_shape",
                "tree": "quadtree",
                "precision": "1m"
            }
        }
    }
}

{
    "driver": {
        "dynamic": "strict",
        "properties": {
            "location": {
                "type": "geo_shape",
                "tree": "quadtree",
                "points_only": "true",
                "precision": "1m"
            }
        }
    }
}

ElasticSearch query for inserting vehicle geolocation

{"location":{"type": "point","coordinates":[longitude,latitude]}}
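The sketch below shows one way such documents could be assembled in Python and batched into a newline-delimited body for the ElasticSearch bulk API. It assumes GeoJSON coordinate ordering (longitude first, then latitude). The index name `records`, the helper names, and the sample coordinates are hypothetical; the `driver` type comes from the mapping above.

```python
import json
from typing import Iterable, Tuple


def point_doc(lon: float, lat: float) -> dict:
    """Document body for one vehicle position."""
    # Assumption: GeoJSON order, longitude first and latitude second.
    return {"location": {"type": "point", "coordinates": [lon, lat]}}


def bulk_body(rows: Iterable[Tuple[int, float, float]],
              index: str = "records",      # hypothetical index name
              doc_type: str = "driver") -> str:
    """Build a newline-delimited bulk payload: an action line, then a document line."""
    lines = []
    for vehicle_id, lon, lat in rows:
        # Reusing the vehicle id as the document id makes each write an overwrite.
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(vehicle_id)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(point_doc(lon, lat)))
    # The bulk format expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"


body = bulk_body([(1, -122.4194, 37.7749), (2, -122.2712, 37.8044)])
```

The resulting string would be sent as the request body of a bulk call, so that many vehicle positions are written per request.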

ElasticSearch query for checking intersection of vehicles with geofences

{
    "query":
    {
        "bool":{
            "must":{
                "match_all":{}
            },
            "filter":{
                "geo_shape":{
                    "location":{
                        "indexed_shape":{
                            "id":shape_id,
                            "type":"locations",
                            "index":"locations",
                            "path": "polygon"
                        }
                    }
                }
            }
        }
    }
}
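Because `shape_id` in the query above is a placeholder, a client has to substitute a concrete geofence id for each request. A minimal Python sketch of that substitution follows. The function name `geofence_query` and the sample ids are invented for illustration, and the ids are assumed to be strings.

```python
import json
from typing import List


def geofence_query(shape_id: str) -> dict:
    """Return the geo_shape filter query for one pre-indexed geofence."""
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_shape": {
                        "location": {
                            # Points at the polygon already stored in the locations index.
                            "indexed_shape": {
                                "id": shape_id,
                                "type": "locations",
                                "index": "locations",
                                "path": "polygon",
                            }
                        }
                    }
                },
            }
        }
    }


# One serialized request body per geofence id (three made-up ids here).
bodies: List[str] = [json.dumps(geofence_query(str(i))) for i in range(1, 4)]
```

Referencing the geofence through `indexed_shape` keeps each request small, because the polygon itself is not sent with every query.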

We ran MemSQL 5.1 and ElasticSearch 2.3.5 on 1-, 5-, and 10-node clusters of m4.10xlarge instances on AWS. Our benchmarking results show that on 10 nodes, MemSQL is 24x faster than ElasticSearch, measured in rows per second, for queries that update vehicle geolocations.

We found MemSQL to be more than 2x faster than ElasticSearch, measured in queries per second, for queries that find intersections of vehicle geolocations with geofences.

MemSQL presents a compelling option because it offers superior geolocation query performance in addition to strong transactional support and the full feature set of a leading operational data warehouse. MemSQL is the preferred option for enterprises that need to monitor geolocations for people, vehicles, and goods in real time.

For complete details of the benchmark including scripts, please visit https://github.com/memsql/geo-benchmark

Feel free to download MemSQL if you’d like to try it yourself: http://www.memsql.com/download


