MemSQL Pipeline to Kafka on another AWS Region fails: Converts public IP to private DNS

solved

#1

Hi guys,
I am trying to create a MemSQL pipeline to a Kafka broker in another AWS region. I used the public IP of my Kafka broker to create the pipeline, but it fails with an error indicating that it cannot connect to the private DNS name of the EC2 machine where the broker is installed:

Create Pipeline Command:
CREATE PIPELINE sales_date_kafka AS LOAD DATA KAFKA '[PUBLIC_IP]:9092/sales_data' INTO TABLE sales FIELDS TERMINATED BY ',' (Region,Country,Item_Type,Sales_Channel,Order_Priority,Order_Date,Order_ID,Ship_Date,Units_Sold,Unit_Price,Unit_Cost,Total_Revenue,Total_Cost,Total_Profit);

Error Message:
ERROR 1970 (HY000): Subprocess timed out. Truncated stderr:

%3|1542413852.475|FAIL|rdkafka#consumer-1| ip-[PRIVATE_IP].ap-southeast-2.compute.internal:9092/0: Failed to resolve 'ip-[PRIVATE_IP].ap-southeast-2.compute.internal:9092': Name or service not known

%3|1542413852.475|ERROR|rdkafka#consumer-1| ip-[PRIVATE_IP].ap-southeast-2.compute.internal:9092/0: Failed to resolve 'ip-[PRIVATE_IP].ap-southeast-2.compute.internal:9092': Name or service not known

It seems that CREATE PIPELINE somehow translates the public IP to the private DNS name and then fails to connect to it, since the MemSQL machine and the Kafka broker are in different regions and therefore can’t see each other’s private DNS/IP.

Has anyone else seen this behaviour? Any suggestions on how to get it to work?

Thanks for your help in advance.


#2

Can you check your Kafka broker configuration? Specifically, look at the advertised.listeners setting and switch it from a local address to the public IP on every broker. For more information, see https://kafka.apache.org/documentation/#brokerconfigs .

In the Kafka protocol, the bootstrap server responds with the network names of all known brokers, and the client then connects to those names. Since the pipeline was only given the public IP of the broker, we can conclude that ip-[PRIVATE_IP].ap-southeast-2.compute.internal came back in that response.
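
As a rough sketch of the change, assuming a single unencrypted listener on port 9092 (the PLAINTEXT listener name and the 0.0.0.0 bind address are assumptions on my part, not taken from your setup), the relevant lines in each broker's server.properties might look something like this:

```properties
# Address the broker binds to locally (assumed: all interfaces, port 9092)
listeners=PLAINTEXT://0.0.0.0:9092

# Address the broker hands back to clients in metadata responses.
# Replace [PUBLIC_IP] with that broker's own public IP.
advertised.listeners=PLAINTEXT://[PUBLIC_IP]:9092
```

Each broker needs its own public address here, and the brokers will most likely need a restart to pick up the change. Keep in mind that clients inside the broker's own VPC will then be handed the public address as well.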


#3

It’s worth noting that your pipeline traffic does not appear to be secure, which is usually fine for internal VPC purposes, but not recommended for traffic through the internet.

You can also explore Inter-Region VPC peering, now offered by AWS.
https://aws.amazon.com/about-aws/whats-new/2018/02/inter-region-vpc-peering-is-now-available-in-nine-additional-aws-regions/


#4

Thanks very much, setting “advertised.listeners” solved the issue.
Cheers.