Materialized views in the future?

Hi everyone,

We are currently working on a BI use case with near-real-time requirements.
Essentially, we will get the main entities (e.g. employee) pushed into MemSQL in real time and then need to do analytics on those entities.
This will however require us to join multiple tables.
A view would make this easier for the BI folks but we’ll still have the join costs at run time.
Are there any plans to support materialized views?

Thanks!
Christoph

Christoph, thank you for suggesting this feature. We have it on the roadmap, but can’t commit to a date yet. I’m wondering if something can be done with the secondary indexes and vectorized joins that we already support. Can you share your schema, queries, table sizes, and how selective the predicates are?

Thanks for reaching out, Nikita!
Right now I cannot share the exact schemas but here are some more details:

  • There is going to be an “Employee” table with roughly 350k entries
  • There is also going to be an “EmployeeHistory” table with several million rows
  • Employee salaries are going to be increased in “seasons”
  • There will be a large number of seasons, as they will also be used for simulations

Once we have the exact requirements we will have a tool with 350k+ employees, millions of historical entries, thousands of seasons, and in each season different salary increases per employee (e.g. your monthly fixed salary may change, your bonus may change, etc.).

Does this help you get a very rough understanding of what we are trying to implement?
Thanks,
Christoph

Yes, and how fast do you want the query to work? Is this a three way join?

Our goal is that every request is answered in <250ms, as 250ms is perceived as instant by humans. Since there is latency and the backend needs some time, we are trying to keep all queries below 100ms.
Does this help and make sense?

And how many joins? BTW, I think we can meet your SLA without MVs.

I don’t know yet for sure but I assume between 3 and 6 joins.
MVs would probably have the added benefit of lower CPU usage, as we wouldn’t pay the cost of the join every time the statement is executed. We are looking at a self-service use case with a bunch of Tableau users, so it is difficult to forecast the concurrency.
Thank you for your fast responses, Nikita!
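
For readers following the thread, here is a minimal sketch of the trade-off being discussed: a plain view that runs the join on every query, versus a precomputed flat table that stands in for a materialized view. It uses Python's built-in sqlite3 module purely as a stand-in and is not MemSQL syntax. All table, column, and function names (employee, season, salary_change, employee_season_v, employee_season_flat, refresh_flat_table) are hypothetical, since the real schema was not shared.

```python
import sqlite3

# In-memory SQLite database as a stand-in; this is NOT MemSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema, loosely modelled on the thread
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE season (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE salary_change (
    employee_id INTEGER REFERENCES employee(id),
    season_id INTEGER REFERENCES season(id),
    fixed_salary REAL,
    bonus REAL
);

-- Secondary indexes on the join keys
CREATE INDEX idx_change_employee ON salary_change(employee_id);
CREATE INDEX idx_change_season ON salary_change(season_id);

-- Plain view: convenient for BI users, but the join runs on every query
CREATE VIEW employee_season_v AS
SELECT e.id AS employee_id, e.name, s.label AS season,
       c.fixed_salary, c.bonus
FROM salary_change c
JOIN employee e ON e.id = c.employee_id
JOIN season s ON s.id = c.season_id;
""")

conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(1, "Ada"), (2, "Grace")])
conn.executemany("INSERT INTO season VALUES (?, ?)",
                 [(1, "2019-H1"), (2, "2019-H2")])
conn.executemany("INSERT INTO salary_change VALUES (?, ?, ?, ?)",
                 [(1, 1, 5000.0, 500.0),
                  (1, 2, 5200.0, 600.0),
                  (2, 1, 6000.0, 0.0)])
conn.commit()


def refresh_flat_table(conn):
    """Rebuild the flat table from the view (full refresh, no incremental logic)."""
    conn.execute("DROP TABLE IF EXISTS employee_season_flat")
    conn.execute(
        "CREATE TABLE employee_season_flat AS SELECT * FROM employee_season_v"
    )
    conn.commit()


refresh_flat_table(conn)

order_by = " ORDER BY employee_id, season"
# Join executed at query time
view_rows = conn.execute("SELECT * FROM employee_season_v" + order_by).fetchall()
# Join already paid for at refresh time
flat_rows = conn.execute("SELECT * FROM employee_season_flat" + order_by).fetchall()
```

Reads against the flat table skip the join, which is the CPU saving mentioned above. The price is staleness: rows inserted after a refresh only show up in the flat table once refresh_flat_table runs again, which is a real concern for a near-real-time use case and is exactly the bookkeeping a built-in materialized view would take over.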