Template transform for S3 and Python?

Hi,

I was looking at the documentation (https://docs.memsql.com/v6.8/concepts/pipelines/transforms/), but I couldn't find an example of a transform written in Python that takes a .csv from S3 as input and simply outputs the same data again.
With a template like that, I could see how the input and output need to be structured and then add my own adaptations.

Maybe someone in the community has such a template handy and wouldn't mind sharing it?

Thanks!
Christoph

    import sys

    # Pass every input line through unchanged
    for line in sys.stdin:
        sys.stdout.write(line)
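If you want to adapt individual fields rather than whole lines, the same idea can be sketched with the standard `csv` module. This assumes Python 3 and that the transform receives the raw CSV text on stdin and is expected to write CSV to stdout, as the passthrough above implies. The names `transform_row` and `run` are made up for illustration and are not part of any MemSQL API.

```python
import csv


def transform_row(row):
    # Hypothetical hook: returns the row (a list of strings) unchanged.
    # Replace the body with your own per-row adaptations.
    return row


def run(infile, outfile):
    # Parse CSV rows from infile and write them back out as CSV to outfile.
    reader = csv.reader(infile)
    writer = csv.writer(outfile, lineterminator="\n")
    for row in reader:
        writer.writerow(transform_row(row))
```

To use it as a transform script, add `import sys` and finish the file with `run(sys.stdin, sys.stdout)`. Note that the writer re-quotes fields only where needed, so the output can differ from the input in quoting even when the values are identical.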

Awesome!

So, just to make sure we are on the same page, and in case someone else comes across this thread, here is the entire Python script. It should:

  1. return the data being ingested without touching it

  2. once all the data has been returned, wait 10 seconds and then call an AWS Lambda function

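    #!/usr/bin/python

    import base64
    import json
    import sys
    import time

    import boto3

    # Return input unchanged
    for line in sys.stdin:
        sys.stdout.write(line)
    # Flush so everything is handed back to MemSQL before waiting
    sys.stdout.flush()

    # Call the Lambda function 10 seconds after everything has been written back to MemSQL
    time.sleep(10)
    client = boto3.client('lambda')
    response = client.invoke(
        # boto3 documents ClientContext as base64-encoded data, so the original
        # 'MemSQLTransform' label is wrapped in JSON (illustrative 'custom'/'source'
        # keys) and encoded here
        ClientContext=base64.b64encode(
            json.dumps({'custom': {'source': 'MemSQLTransform'}}).encode('utf-8')
        ).decode('ascii'),
        FunctionName='arn:aws:lambda:eu-west-1:554654745077:function:PulseETL-dev-startPackageExecution',
        InvocationType='Event',
        LogType='None',
        Payload=json.dumps({"id": 8000000}),
        Qualifier='$LATEST',
    )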