Stream data out from MemSQL to a Kafka topic

sangrambhojane · November 7, 2019, 8:17am

Hey Team,
I am new to using MemSQL with Kafka. I understand how to ingest data from a Kafka topic via a MemSQL pipeline, but I would like to know whether we can do the reverse (i.e. stream data out from MemSQL to a Kafka topic, via a pipeline or any other way).

sasha · November 7, 2019, 6:43pm

The upcoming 7.0 release will add support for writing the result of a SELECT query as CSV to a Kafka topic. Until then, the best workaround is likely to implement that by hand on top of SELECT ... INTO OUTFILE.
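
A minimal sketch of such a hand-rolled export, in Python, assuming the outfile was written comma-delimited (for example with FIELDS TERMINATED BY ',') and that `send` is any callable taking a topic name and a bytes payload. The names `publish_outfile`, `row_to_csv` and `send` are illustrative only; they are not part of MemSQL or of any Kafka client.

```python
import csv
import io
from typing import Callable, List


def row_to_csv(row: List[str]) -> str:
    """Re-encode one parsed row as a single CSV line (no trailing newline)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="").writerow(row)
    return buf.getvalue()


def publish_outfile(
    path: str,
    topic: str,
    send: Callable[[str, bytes], object],
    delimiter: str = ",",
) -> int:
    """Read a delimited outfile and pass each row to `send` as one message.

    `send` is a hypothetical hook: plug in whatever producer call your
    Kafka client provides. Returns the number of rows handed to it.
    """
    count = 0
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter=delimiter):
            if not row:  # skip blank lines
                continue
            send(topic, row_to_csv(row).encode("utf-8"))
            count += 1
    return count
```

With a real client, `send` would wrap that client's produce call (with kafka-python, for instance, something along the lines of `producer.send(topic, value)`). Running the export query, scheduling, and cleaning up old outfiles are left out of the sketch.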

sangrambhojane · November 8, 2019, 10:20am

Hey Sasha,

Thanks for the response and your suggestion. I have implemented a similar workaround for now (MemSQL -> outfile -> Java utility -> Kafka topic), but I am looking forward to the upcoming release to avoid the overhead.

I also have one more question, if you could please help. Since the MemSQL cluster’s master aggregator metadata includes information about the Kafka cluster’s brokers, topics, and partitions once a pipeline is started:

Can we ingest data from multiple Kafka topics into one table using the same pipeline?
Something like:
CREATE PIPELINE test_multiple_kafka_topic AS LOAD DATA KAFKA 'kafka-host:port/kafka_topic1, kafka_topic2, kafka_topic3' INTO TABLE events;

Or, if the topics are generated dynamically but follow a naming convention such as 'kafka_topic*', where * can change, can a pipeline be created like this:

CREATE PIPELINE test_multiple_kafka_topic AS LOAD DATA KAFKA 'kafka-host:port/%kafka_topic%' INTO TABLE events;

sasha · November 8, 2019, 8:19pm

No, we don’t currently support multiple topics in a single pipeline. I agree that it’s an important feature, but we don’t have an official estimate for it. One workaround is in fact a pipeline per topic. But note that, because each pipeline writes in a separate transaction, you may begin to hit row lock contention if these pipelines write to rows with the same primary key.
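
A rough sketch of the pipeline-per-topic workaround, in Python: it only builds one CREATE PIPELINE statement per topic, using the same statement shape as in the question. The function name `pipeline_statements`, the `load_` prefix for pipeline names, and the broker address in the usage note are placeholders. The topic list has to come from somewhere else (for dynamically created topics, e.g. by listing topics with a Kafka client and filtering by prefix), which is not shown here.

```python
import re
from typing import Iterable, List


def pipeline_statements(broker: str, topics: Iterable[str], table: str) -> List[str]:
    """Build one CREATE PIPELINE statement per Kafka topic.

    Values are interpolated directly into the SQL text, so only use this
    with trusted, known-good broker, topic and table names.
    """
    statements = []
    for topic in topics:
        # Topic names may contain '.' or '-', which are awkward in an
        # unquoted pipeline name, so replace non-word characters with '_'.
        name = "load_" + re.sub(r"\W", "_", topic)
        statements.append(
            f"CREATE PIPELINE {name} AS LOAD DATA KAFKA "
            f"'{broker}/{topic}' INTO TABLE {table};"
        )
    return statements
```

For example, `pipeline_statements("kafka-host:9092", ["kafka_topic1", "kafka_topic2"], "events")` yields two statements, one per topic, both loading into `events`. Each resulting pipeline still writes in its own transaction, so the row lock contention caveat above applies.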

sangrambhojane · November 11, 2019, 7:07am

Thank you Sasha for answering my queries.

adel · May 19, 2020, 7:06pm

Dear Sasha, version 7.0 is out. Can you confirm whether this feature is available now, and can you please provide the URL of its documentation?

Thanks