Template transform for S3 and Python?

christoph · November 21, 2019, 7:27pm

Hi,

I looked through the documentation (SingleStoreDB Cloud · SingleStore Documentation), but I couldn’t find an example of a transform written in Python that takes a .csv file from S3 as input and simply outputs the same data unchanged.
With a template like that, I could see how the input and output need to be structured, and then add my own adaptations.

Maybe someone in the community has such a template handy and wouldn’t mind sharing it?

Thanks!
Christoph

JoYo · November 26, 2019, 1:04am

import sys

for line in sys.stdin:
    sys.stdout.write(line)

christoph · November 26, 2019, 8:42am

Awesome!

So just to make sure we are on the same page, and in case someone else comes across this thread, here is the entire Python script. It should:

- return the data being ingested without touching it
- once done returning the data, wait for 10 seconds and then call an AWS Lambda function

#!/usr/bin/python

import sys
import time
import boto3
import json

# Return input unchanged
for line in sys.stdin:
    sys.stdout.write(line)

# Flush so the buffered output is actually written before the wait starts
sys.stdout.flush()

# Call lambda function after 10 secs once done writing everything back to MemSQL
time.sleep(10)
client = boto3.client('lambda')
# Note: boto3 documents ClientContext as base64-encoded data
response = client.invoke(
    ClientContext='MemSQLTransform',
    FunctionName='arn:aws:lambda:eu-west-1:554654745077:function:PulseETL-dev-startPackageExecution',
    InvocationType='Event',
    LogType='None',
    Payload=json.dumps({"id": 8000000}),
    Qualifier='$LATEST',
)
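
For anyone who wants to adapt the rows rather than just pass them through, here is a minimal sketch of the same stdin-to-stdout pattern using Python's standard csv module. It assumes the transform receives plain comma-separated text on stdin, as in the script above. The function names transform and run are my own, not part of any MemSQL API. Note that csv.writer may quote fields differently from the input, so the output is equivalent CSV but not guaranteed to be byte-identical.

```python
#!/usr/bin/python
import csv
import sys


def transform(reader, writer):
    """Copy CSV rows from reader to writer, changing nothing by default."""
    for row in reader:
        # Hypothetical hook: adapt the row here, e.g. row[0] = row[0].strip()
        writer.writerow(row)


def run(in_stream, out_stream):
    """Parse CSV from in_stream and write the rows back to out_stream."""
    transform(csv.reader(in_stream), csv.writer(out_stream, lineterminator="\n"))
    out_stream.flush()


if __name__ == "__main__":
    run(sys.stdin, sys.stdout)
```

Keeping the row logic in its own function means it can be tested locally with in-memory streams before the script is attached to a pipeline.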