Ingestion from MongoDB


#1

Is there a popular MongoDB adapter for replicating data from a MongoDB cluster to a MemSQL cluster? I have a use case where I would like to listen to MongoDB collection updates and replicate them for later analytics. Is there an out-of-the-box solution, or should I go for a custom implementation?


#2

Currently we don’t have native replication from MongoDB into MemSQL. That said, if I were to build such a thing, I would look into the following approach:

  1. Find a way to replicate a Mongo replica set into Kafka (from a brief Google search, there appear to be some options here)
  2. Consume the Kafka stream of changes using MemSQL Pipelines (docs here: https://docs.memsql.com/memsql-pipelines/v6.7/kafka-extractor/)
  3. Stream the data into a Columnstore table, storing each of the raw updates using our native JSON column type: https://docs.memsql.com/concepts/v6.7/json-guide/#ddl-defining-tables-with-json-columns
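
To make step 1 a bit more concrete, here is a minimal Python sketch of how a single MongoDB change event could be flattened into one JSON line before it is published to Kafka and stored as a raw JSON value. The field names `operationType`, `ns`, `documentKey` and `fullDocument` come from MongoDB change stream events; the function name `change_event_to_record` and the output keys (`op`, `db`, `coll`, `doc_id`, `doc`) are hypothetical and chosen only for illustration. Connecting to MongoDB and producing to Kafka are left out on purpose, since that depends on which connector you pick.

```python
import json


def change_event_to_record(event):
    """Flatten one MongoDB change-stream event (a dict) into a JSON line.

    The output keys are illustrative, not a MemSQL or MongoDB convention.
    """
    ns = event.get("ns", {})
    doc_id = event.get("documentKey", {}).get("_id")
    record = {
        "op": event["operationType"],
        "db": ns.get("db"),
        "coll": ns.get("coll"),
        # Stringify the id so ObjectId-like values survive JSON encoding.
        "doc_id": None if doc_id is None else str(doc_id),
        # fullDocument may be absent, e.g. for delete events.
        "doc": event.get("fullDocument"),
    }
    # default=str is a blunt fallback for non-JSON types such as datetimes.
    return json.dumps(record, default=str, sort_keys=True)


# Hand-written sample event, shaped like an insert on a made-up collection.
sample_event = {
    "operationType": "insert",
    "ns": {"db": "shop", "coll": "orders"},
    "documentKey": {"_id": "5c8f8f8f8f8f8f8f8f8f8f8f"},
    "fullDocument": {"_id": "5c8f8f8f8f8f8f8f8f8f8f8f", "total": 42},
}

line = change_event_to_record(sample_event)
print(line)
```

Each such line would become one Kafka message, and the pipeline in step 2 would load it into the JSON column from step 3. Keeping the raw document intact means you can decide later which fields to extract for analytics.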

I would love to hear whether this solution works for you. If it does, it would be great if you could report back on your experience, including which pieces of software you ended up using.