File system pipeline into SP for multiple tables

I currently have a File System Pipeline which uses stored procedure to do upsert in a table. After the upsert is done , i am deleting the old entries in the table which is not available in the new file based on a timestamp set at the beginning of the SP. Also i am inserting few columns to another table.

  1. What happens if the pipeline fails after the upsert is successful and before delete is executed. Would only the upsert be available in memsql.

  2. If i insert values into another table after the first table upsert, what will happen if the pipeline fails between table 1 and table 2.

I guess my basic question is whether the whole SP is considered as single unit of work ? Or do we need to handle the failure scenarios.

By default, all queries in a pipeline’s stored procedure run in the context of an implicit multi-statement transaction managed by MemSQL. We manage it for the sake of exactly once semantics - it’s the same transaction which we use to update the pipeline’s internal metadata about which files are loaded vs unloaded. Writes in the stored procedure won’t become visible outside the SP until the whole batch completes successfully and we commit the transaction. If any part of the SP fails, they’ll all be rolled back and the file will remain in an unloaded state.

So yes, a single unit of work with no need for manual cleanup.


Thanks for the response. Just a quick follow up.

Which data is returned to the queries during the pipeline stored procedure execution. Since you mentioned that the changes are committed only after the entire batch is successful, is the updated data available to query only after the pipeline finishes?

Or the new updated data is visible to the queries during the SP process but will be rolled back if the pipeline fails ?