How Manage Accelerated Data Freshness by 10x

Success in the mobile advertising industry is achieved by delivering contextual ads in the moment. The faster and more personalized a display ad, the better. Any delay in ad delivery means lost bids, revenue, and ultimately, customers.

Manage, a technology company specializing in programmatic mobile marketing and advertising, helps drive mobile application adoption for companies like Uber, Wish, and Amazon. In a single day, Manage generates more than a terabyte of data and processes more than 30 billion bid requests. Manage analyzes this data to know which impressions to buy on behalf of advertisers and uses machine learning models to predict the probability of clicks, app installs, and purchases.

Managing Data at Scale

At the start, Manage used MySQL to power their underlying statistics pipeline, but quickly ran into scaling issues as data volume grew. Manage then turned to Hadoop coupled with Apache Hive and Kafka for data management, analysis, and real-time data feeds. However, even with this optimized data architecture, Manage found that Hive was slow and caused hours of delay in data pipelines.

To meet customer expectations, Manage needed a solution that could deliver fresh data for reporting while allowing their analytics team to run ad hoc queries at the same time. Kai Sung, Manage CTO and co-founder, began the search for a faster database platform and found SingleStore. The Manage team quickly started prototyping on SingleStore and was in production within a few months.

Streaming Log Data from Apache Kafka

Manage uses SingleStore Streamliner, an Apache Spark solution, to stream log data from Apache Kafka and store it in the SingleStore columnstore for further processing. As new data arrives, the pipeline de-duplicates it and aggregates it into various summary tables within SingleStore. This data is then made available to an external reporting dashboard and reporting API. With this architecture, Manage has a highly scalable, real-time data pipeline that ingests and summarizes data as fast as it is produced.
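To make the de-duplicate-then-aggregate step concrete, here is a minimal in-memory sketch in Python. It is not Manage's code and does not use the Streamliner, Spark, or Kafka APIs. The field names (`event_id`, `campaign_id`, `type`, `ts`) and the per-minute summary key are assumptions chosen for illustration only.

```python
from collections import defaultdict


def new_summary():
    """Create an empty summary table: (campaign_id, minute) -> event type -> count."""
    return defaultdict(lambda: defaultdict(int))


def summarize(events, seen_ids, summary):
    """Fold a batch of log events into a running summary, skipping duplicates.

    All field names here are hypothetical; a real pipeline would use its own schema.
    """
    for event in events:
        event_id = event["event_id"]
        if event_id in seen_ids:
            # Already counted (for example, a redelivered message): skip it.
            continue
        seen_ids.add(event_id)
        # Truncate the Unix timestamp (seconds) to the start of its minute.
        minute = event["ts"] - event["ts"] % 60
        summary[(event["campaign_id"], minute)][event["type"]] += 1
    return summary


# A small batch containing one duplicate event ("a1" appears twice).
batch = [
    {"event_id": "a1", "campaign_id": 7, "type": "impression", "ts": 1000},
    {"event_id": "a1", "campaign_id": 7, "type": "impression", "ts": 1000},
    {"event_id": "a2", "campaign_id": 7, "type": "click", "ts": 1010},
    {"event_id": "a3", "campaign_id": 7, "type": "impression", "ts": 1030},
]

seen = set()
summary = summarize(batch, seen, new_summary())
```

Because the set of seen IDs is kept between batches, replaying the same batch leaves the counts unchanged, which is the property that lets a summary table stay correct when messages are delivered more than once. In a production system the seen-ID state and the summary rows would live in the database rather than in process memory.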

10x Faster Data

After implementing SingleStore, Manage reduced the delay in data freshness from two hours to 10 to 15 minutes. The Manage team can now run analytics much faster and react to marketplace changes in the moment.

In an EnterpriseTech article, Kai Sung said, “We’ve built a highly scalable, real-time data pipeline that ingests and summarizes data as fast as we produce it. Our analytics team is able to run ad-hoc queries on log-level data within seconds.”

For more details, check out the EnterpriseTech article: Managing 30B Bid Requests, 1.5B Users per Day in (near) Real Time

If you are interested in SingleStore, you can download it at www.singlestore.com/cloud-trial/



