Transactions and pipelines

buddha · April 2, 2019, 2:57pm

i have a pipeline that calls a stored proc.
the stored proc has 3 lines:
START TRANSACTION;
REPLACE

(<fields…>) SELECT ;
COMMIT;

i am getting this error:
ERROR 2307 ER_INVALID_STORED_PROCEDURE_FOR_PIPELINES: Compilation error in function FixInc.upsrt_TW_Quote near line 3: Stored procedure containing ‘start transaction’ cannot be run inside a pipeline.

i have a kafka stream that has real time prices for Bonds, but i only have a few hundred Bonds that i am monitoring and get tens-of-thousands of updates daily - so i’m doing a ‘replace’ (because i only need the latest price for each Bond) - but while the table only has a few hundred rows, it is showing that it is taking up 10s of MBs of memory - so i’m wondering how to keep the memory usage minimal…

jack · April 2, 2019, 4:42pm

If your table has a unique key and you want to replace rows with the same unique key value, which is what REPLACE does, you don’t even need a stored procedure, just use a pipeline with the REPLACE option. There’s also no need to do START TRANSACTION/COMMIT in the example above, since it’s just a single statement inside.

buddha · April 2, 2019, 6:46pm

yes, that is what i am doing (ie REPLACE INTO TABLE from the pipeline)
but what the heart of my Q is -

on a table that might have 100,000 ‘replaces’ during the day, but never more than 500 rows in it,
why does it’s In Memory Size grow to be enormous?
i was thinking that perhaps it was because a transaction was growing and never being committed/shrunk.

Can someone shed some light on why the memory usage continues to grow in a scenario where i’m constantly updating (via REPLACE/pipeline)…