On-Premise VM Sizing Recommendations for Free License


#1

This question is a bit complicated, since I was told that the free license will change in the near future from a 128 GB limit to more of a node-based limit, and I don’t know whether an answer can cover that upcoming change as well. Yesterday I was trying to spin up a new cluster and didn’t really know how to size things to maximize the memory store capacity I could have available.

I may be incorrect, but I seem to remember that when I first experimented with MemSQL about two years ago and set up a cluster, the aggregators didn’t really count towards the available memory store capacity and only the leaves contributed to it. However, you have to have a master aggregator at a minimum, and a child aggregator is also recommended. If each of those has 4 vCPUs and 32 GB RAM, then you’re using up 64 GB of the license (I’m assuming), leaving only 64 GB RAM for the leaf nodes (I ended up with 2 leaf nodes).

However, I am still not sure how much memory store capacity I have in total (or how much remains after inserting a good 40 GB of data yesterday), since the MemSQL Studio interface doesn’t spell that out at all. I seem to recall that the older web interface gave the user some indication of this, but I’m not seeing anything like that anymore.

I was hoping to make the following suggestions:

  • Add some indicators into MemSQL Studio to provide the available memory store total / remaining capacity figures (I guess it wouldn’t hurt to share the amount of disk capacity either for the column store option)
  • If you are going to switch from a memory-based free license to a node-based free license in the near future, it would be great to have that limit set to 6 nodes to allow for a Master Aggregator, Child Aggregator, and 4 Leaf Nodes since I think a limit of 4 nodes would potentially be too limiting.

Looking forward to playing with the new cluster I set up some more in the coming days/weeks! (I did have some issues completing the install using the detailed instructions once I got to the add-leaf steps, and I’m not sure why those occurred. The next day I discovered the memsql-deploy setup-cluster command, which seemed to work a lot better. It actually gave me error output saying that the configuration I was initially trying, with 6 nodes, exceeded the memory limits of the free license, which was helpful.)


#2

Omar, thank you for your detailed question and feature requests. It is much appreciated!

For cluster sizing it will hopefully become a bit easier to figure this out once we switch to unit licenses. If you like I would be happy to add a unit-based free license in Portal for you to try it out. PM me your portal account email address if you want that to happen.

First, to clarify how we compute license usage:

  • Non-unit licenses - the license is evaluated against the sum of the maximum_memory setting across all nodes in the cluster. This includes aggregators.
  • Unit-based licenses - the license is evaluated against the number of consumed units across only the leaf nodes in the cluster.
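To make the two rules above concrete, here is a minimal Python sketch applied to the cluster described in the question (2 aggregators and 2 leaves at 32 GB each). The `Node` class and both function names are hypothetical, memory is written in GB for readability, and the per-leaf `units` values are an assumption (one unit for a 32 GB leaf); this is not how MemSQL itself evaluates a license.

```python
from dataclasses import dataclass


@dataclass
class Node:
    role: str               # "aggregator" or "leaf"
    maximum_memory_gb: int  # the node's maximum_memory setting, written in GB here
    units: int = 0          # assumed units this node consumes under a unit license


def non_unit_usage_gb(nodes):
    """Non-unit license: sum maximum_memory over every node, aggregators included."""
    return sum(n.maximum_memory_gb for n in nodes)


def unit_usage(nodes):
    """Unit-based license: sum consumed units over leaf nodes only."""
    return sum(n.units for n in nodes if n.role == "leaf")


# The cluster from the question: 2 aggregators and 2 leaves, 32 GB each.
cluster = [
    Node("aggregator", 32),
    Node("aggregator", 32),
    Node("leaf", 32, units=1),  # assumption: a 32 GB leaf consumes one unit
    Node("leaf", 32, units=1),
]

print(non_unit_usage_gb(cluster))  # 128
print(unit_usage(cluster))         # 2
```

Under the non-unit rule this cluster already consumes the full 128 GB, while under the unit rule the aggregators do not count at all.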

For cluster sizing, it’s recommended to max out the size of the leaf nodes and minimize the size of the aggregators. So for unit-based licenses the recommendation is to run 4 units’ worth of leaf nodes. This means you can run 4 leaf nodes with 32 GB of RAM and access to 8 cores each.

For non-unit-based free licenses you will need to set aside some of the 128 GB of RAM for the aggregator layer. Depending on your availability needs, you may want more than one aggregator, which will reduce the amount of memory left for your leaf nodes. Each aggregator should have a minimum of 8 GB; however, something between 12 and 20 GB is recommended. Two uses of aggregator memory that should be considered are:

  • operations which buffer on the aggregator such as certain types of group by queries
  • reference tables, which are fully replicated to every node in the cluster, including aggregators
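As a rough illustration of the non-unit trade-off, the sketch below subtracts the aggregator memory from the license capacity and splits the rest across the leaves. The function name and the example figures (2 aggregators at 16 GB, 4 leaves) are hypothetical choices within the range suggested above, not an official sizing formula.

```python
def leaf_memory_budget(license_gb, aggregator_count, aggregator_gb, leaf_count):
    """Return (total GB left for leaves, GB per leaf) under a non-unit license."""
    if leaf_count <= 0:
        raise ValueError("need at least one leaf node")
    reserved = aggregator_count * aggregator_gb  # memory taken by the aggregator layer
    remaining = license_gb - reserved            # what the leaves can share
    if remaining <= 0:
        raise ValueError("aggregators consume the whole license")
    return remaining, remaining / leaf_count


# Example: 128 GB license, 2 aggregators at 16 GB each, 4 leaf nodes.
total, per_leaf = leaf_memory_budget(128, 2, 16, 4)
print(total, per_leaf)  # 96 24.0
```

Compared with 32 GB aggregators, which leave only 64 GB for leaves, trimming each aggregator to 16 GB frees another 32 GB of leaf capacity under the same license.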

Another thing to note about licensing is that we don’t license based on your usage; rather, we license based on the maximum capacity of your cluster. So if your license is 128 GB and you have 2 nodes, each with maximum_memory set to 64 GB, then you are already consuming your entire license. Such a cluster will stop accepting writes once the nodes run out of memory, but otherwise availability won’t be impacted.

As for your Studio feature request, that will be coming along with a lot of similar metrics over the next few months. We are super excited to get these metrics into our users’ hands.

If you have a moment, could you provide more context regarding the issues you hit with add-leaf during your manual cluster deployment experience?

Thanks for using MemSQL! Hope you enjoy it!