Serving Investment Managers
Novus is a portfolio intelligence platform that helps the world’s top investors generate higher returns. The company works with 100+ of the world’s top investment managers and institutional investors, who together manage approximately $2 trillion. The platform helps these investors collectively innovate and gain valuable insights into their investments.
Changing How the World Invests
(An edited transcript of this blog post, with updates, can be found here. Ed.)
Novus aims to change how the world invests by letting its users explore investments using publicly sourced data and understand their risks and potential returns. Users can log in to the Novus platform to analyze individual hedge fund portfolios, aggregate industry trends, or analyze their own private data.
According to Noah Zucker, Vice President of Tactical Engineering at Novus, “Our mission is to help investors discover their true investment acumen, identify their skill-sets, and understand potential risks.”
Initially Facing an ETL Barrier
As the Novus team began to scale their platform, they hit a barrier with ETL (Extract, Transform, Load): the process was delicate and required monitoring and handholding nearly 24/7. If there was an issue, they had to spring into action immediately. In the worst cases, load jobs that failed overnight had to be rerun during the day, which slowed the application for all users.
They had tried scaling up with larger servers, but the prior database could not keep up. The team began investigating options to scale out in the cloud. In particular, they wanted an option that would not require heavy rework of their application model: introducing a sharding strategy on their own would have required changes to business logic.
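To illustrate why do-it-yourself sharding reaches into business logic, here is a minimal sketch of application-level hash sharding. The shard names, the `portfolio_id` key, and both functions are hypothetical; the source does not describe the scheme Novus considered.

```python
import hashlib

# Hypothetical shard names, for illustration only.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]

def shard_for(portfolio_id: str, shards=SHARDS) -> str:
    """Pick a shard by hashing the key; the result is stable across processes."""
    digest = hashlib.sha256(portfolio_id.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]

def group_by_shard(portfolio_ids, shards=SHARDS):
    """Split a multi-portfolio request into one batch per shard.

    Every query that touches several portfolios needs this extra routing
    step, which is the kind of change that leaks into business logic.
    """
    groups = {}
    for pid in portfolio_ids:
        groups.setdefault(shard_for(pid, shards), []).append(pid)
    return groups
```

With a database that distributes data itself, the application sends a single query and this routing code is unnecessary.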
24/7 ETL Handholding
Overnight Failure = Business Hours Slowdown
Scala worker pool limited by the database
Non-trivial code changes needed to shard and scale
Investing in New Technologies
The team came across MemSQL and began to rework their pipeline. Clients provide data in all sorts of formats, and that data is loaded through a cleanup process into a persistent store.
The Novus platform then sends that data into a distributed compute layer of Scala nodes, which places the results in MemSQL so that clients have all of the data available to them. High-intensity computations are now available to clients immediately, rather than after minutes or hours.
The Bottom Line for the ETL Team
Typical load went down from 90 minutes to 2 minutes
Client team focuses on service, not ETL
Predictable application performance
Scala workers: 12 → 126
Add servers to scale – No code changes needed
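As a quick check on the figures in this list, the improvement factors can be worked out directly. The variable names are illustrative; the numbers are the ones quoted above.

```python
# Figures quoted in the list above.
load_minutes_before, load_minutes_after = 90, 2
workers_before, workers_after = 12, 126

load_speedup = load_minutes_before / load_minutes_after  # how many times faster
worker_growth = workers_after / workers_before           # how many times more workers

print(f"Load time: {load_speedup:.0f}x faster")      # Load time: 45x faster
print(f"Scala workers: {worker_growth:.1f}x more")   # Scala workers: 10.5x more
```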
Keeping Developers Happy
Any database change has the potential to create extra work for developers, but MemSQL uses the MySQL protocol for access, which makes the experience familiar and comfortable. An entire tool chain is already available, and the learning curve is gentle.
In addition, MemSQL has first class JSON support and the Novus team was able to map their JSON format from MongoDB to MemSQL. They also wrote a blog post about using the MemSQL JSON Column Type: https://tech.novus.com/learning-the-memsql-json-column-type/
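A rough sketch of what such a mapping could look like: a MongoDB-style document is split into an id and a JSON body, ready to insert into a table with a JSON column. The `positions` table, its columns, and the `to_row` helper are hypothetical and are not taken from the Novus post. The `%s` placeholders follow the parameter style used by Python MySQL client libraries, which should apply here since MemSQL speaks the MySQL protocol.

```python
import json

# Hypothetical table definition using a JSON column.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS positions (
    id VARCHAR(64) PRIMARY KEY,
    doc JSON NOT NULL
)
"""

# Parameterized insert; a MySQL-protocol client would fill in the %s values.
INSERT_SQL = "INSERT INTO positions (id, doc) VALUES (%s, %s)"

def to_row(mongo_doc: dict) -> tuple:
    """Split a MongoDB-style document into (id, JSON text) for INSERT_SQL."""
    body = {k: v for k, v in mongo_doc.items() if k != "_id"}
    return str(mongo_doc["_id"]), json.dumps(body, sort_keys=True)

# Example document (made-up values).
doc = {"_id": "abc123", "ticker": "XYZ", "shares": 100}
params = to_row(doc)
print(params)
```

The rest of the document stays schemaless inside the `doc` column, so the application's existing JSON shape can be kept largely intact.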
For Novus, the bottom line was that the client data team could focus on delivering value to customers and helping them understand data and investments instead of tending to the ETL process during the day. In the event of any kind of ETL failure during the day, there is no impact to end users.
Further, the Novus architecture is no longer limited by the prior database, and the team was able to scale from 12 to 126 Scala workers, a more than 10x increase. For further scale, they can simply add more servers to the database.
For Novus, MemSQL fits the operational model. Application developers do not need to change code to increase scale. With well-written SQL in place, the ops team can scale just by adding more servers.
With MemSQL, Novus can manage the installation with just one DBA and architect for a few hours a week, and they do not have to devote a whole team to the care and feeding of the data platform.
“It pretty much takes care of itself once we have it set up” – Noah Zucker, VP, Engineering – Novus
Hear directly from Novus at Strata + Hadoop World New York September 2015: https://youtu.be/RvyB_ogIDSE