I am new to using MemSQL with Kafka. I understand how to ingest data from a Kafka topic via a MemSQL pipeline, but I would like to know whether we can do the reverse (i.e. stream data out of MemSQL to a Kafka topic, via a pipeline or any other way)?
The upcoming 7.0 release will add support for writing the result of a SELECT query as CSV to a Kafka topic. Until then, the best workaround is likely to implement that by hand on top of SELECT ... INTO OUTFILE.
Thanks for the response and the suggestion. I have implemented a similar workaround for now (MemSQL -> outfile -> Java utility -> Kafka topic), but I am looking forward to the upcoming release to avoid the overhead.
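For anyone taking the same route, the "Java utility" step of that chain can be sketched roughly as below. This is only an illustration under stated assumptions, not the utility actually used here: the class OutfileForwarder and its forward method are hypothetical names, each exported row is assumed to occupy exactly one line of the outfile, and the sink is a plain Consumer so the sketch runs without a Kafka dependency. In a real utility the sink would presumably wrap KafkaProducer.send(new ProducerRecord<>(topic, line)) from the kafka-clients library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.function.Consumer;

// Hypothetical helper: forwards each row of a SELECT ... INTO OUTFILE export to a sink.
class OutfileForwarder {

    // Sends every non-empty line to the sink and returns how many lines were sent.
    static int forward(Iterable<String> lines, Consumer<String> sink) {
        int sent = 0;
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines, e.g. one left by a trailing newline
            }
            sink.accept(line);
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: OutfileForwarder <path-to-outfile>");
            return;
        }
        // Printing to stdout stands in for the Kafka send (assumption, see the note above).
        try (BufferedReader reader =
                     Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            Iterable<String> lines = reader.lines()::iterator; // stream rows lazily
            int sent = forward(lines, System.out::println);
            System.err.println("forwarded " + sent + " rows");
        }
    }
}
```

Keeping the sink as a parameter means the file-reading part can be checked on its own before a real producer is plugged in.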
I also have one more question, if you could please help. Since the MemSQL cluster’s master aggregator metadata includes information about the Kafka cluster’s brokers, topics, and partitions once a pipeline is started:
Can we ingest data from multiple Kafka topics into one table using the same pipeline?
Something like:
test_multiple_kafka_topic AS LOAD DATA KAFKA 'kafka-host:port/kafka_topic1, kafka_topic2, kafka_topic3' INTO TABLE
Or, if the topics are generated dynamically but follow a naming convention such as ‘kafka_topic*’, where * can change, can a pipeline be defined as:
test_multiple_kafka_topic AS LOAD DATA KAFKA 'kafka-host:port/%kafka_topic%' INTO TABLE
No, we don’t currently support multiple topics in a single pipeline. I agree that it’s an important feature, but we don’t have an official estimate for it. One workaround is to create a pipeline per topic. Note, however, that because each pipeline writes in a separate transaction, you may start to hit row lock contention if these pipelines write to rows with the same primary key.
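As a rough illustration of the pipeline-per-topic workaround, the statements could be generated from a list of topic names, for example by a small Java helper like the one below. Everything specific in it is an assumption for the sake of the sketch: the class PipelinePerTopic, the pipeline_ naming scheme, the broker address kafka-host:9092, and the table name events are placeholders, and the helper does no escaping of topic or table names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical generator for the pipeline-per-topic workaround.
class PipelinePerTopic {

    // Builds one CREATE PIPELINE statement per topic, all loading into the same table.
    static List<String> statements(String broker, List<String> topics, String table) {
        List<String> out = new ArrayList<>();
        for (String topic : topics) {
            // Derive a pipeline name from the topic; characters other than
            // letters, digits, and underscores are replaced with underscores.
            String name = "pipeline_" + topic.replaceAll("[^A-Za-z0-9_]", "_");
            out.add("CREATE PIPELINE " + name
                    + " AS LOAD DATA KAFKA '" + broker + "/" + topic + "'"
                    + " INTO TABLE " + table + ";");
        }
        return out;
    }

    public static void main(String[] args) {
        // Placeholder broker, topics, and table (assumptions); substitute real values.
        List<String> topics = Arrays.asList("kafka_topic1", "kafka_topic2", "kafka_topic3");
        for (String sql : statements("kafka-host:9092", topics, "events")) {
            System.out.println(sql);
        }
    }
}
```

Each generated pipeline would still need to be started on its own (START PIPELINE), and the row lock caveat above applies as soon as more than one of them writes to rows with the same primary key.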
Thank you Sasha for answering my queries.