Load balancing between Aggregators

I’m copying a large set of records from some DB to MemSQL (6 nodes: 1 MA, 1 CA, 4 leaves)
My app connects to the master aggregator only, via MariaDB driver. Using multithreading and batch-insertions, I’m getting speeds of up to 1.2 million insertions / minute.

Upon profiling, it turns out that master aggregator was waiting on network I/O 95% of the time with total CPU at > 200%. And CA wasn’t being used at all.
Does that mean that load balancing between MA / CA would give me a performance boost? Also, the leaves’ CPU isn’t fully utilised (every leaf was at 12-18% CPU in that 30 second profiling period).

If load balancing makes sense here, what’s the standard way of doing it in MemSQL?

It certainly seems like you could get better insertion rates by also using the CA. The most direct way would be to have half your application threads that are doing the inserts talk to the MA and the other half talk to the CA.

I’ve heard that:

  • haproxy works well
  • you can use a connection pool and the mysql driver supports load balancing between multiple targets as well
  • one of our larger customers uses DNS round robin

DNS based round robin seems like a good option. I will try that out and get back to you.