We are building a pipeline that ingests batch data into our tables, and we use REPLACE INTO syntax within it. For every row, we want to update one of its columns, for instance a count for the row, while replacing it. Is there any way to get the row's current column values before the replace? Since this is a huge batch (30M rows) going into a huge table (>150B rows), what would be a good way to update the count on each batch replace/insert? Thanks
Consider using pipelines to stored procedures, which will let you use arbitrary logic during the load.
We are using a pipeline, and in it we use a REPLACE INTO query. We have the following columns in it.
Our pipeline ingests more than 20M rows with area_code and user_id. We have to increase the count while replacing. The simple approach is to fetch the existing count and add one to it. My question is whether we can replace the row and set its count to the existing value plus one in the same step.
Maybe CREATE PIPELINE … ON DUPLICATE KEY UPDATE can help solve your issue?
REPLACE INTO deletes the conflicting row before inserting the new one; as far as I know, there's no way to read the old values with REPLACE.
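To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in, not the actual pipeline. The table name `visits` and the `cnt` column are hypothetical; only area_code and user_id come from the thread. SQLite spells the upsert as `ON CONFLICT ... DO UPDATE` rather than the MySQL-style `ON DUPLICATE KEY UPDATE`, so treat this as an illustration of the semantics only (it assumes SQLite 3.24 or newer).

```python
import sqlite3


def make_table():
    # In-memory table keyed on (area_code, user_id); `cnt` is a hypothetical counter column.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE visits ("
        " area_code TEXT NOT NULL,"
        " user_id INTEGER NOT NULL,"
        " cnt INTEGER NOT NULL DEFAULT 1,"
        " PRIMARY KEY (area_code, user_id))"
    )
    return conn


def dump(conn):
    # Return all rows in a stable order so results are easy to compare.
    return conn.execute(
        "SELECT area_code, user_id, cnt FROM visits ORDER BY area_code, user_id"
    ).fetchall()


def load_with_replace(rows):
    # REPLACE deletes the conflicting row first, so the old cnt is gone.
    conn = make_table()
    conn.executemany(
        "REPLACE INTO visits (area_code, user_id, cnt) VALUES (?, ?, 1)", rows
    )
    return dump(conn)


def load_with_upsert(rows):
    # The upsert keeps the existing row and can reference its current cnt.
    conn = make_table()
    conn.executemany(
        "INSERT INTO visits (area_code, user_id, cnt) VALUES (?, ?, 1) "
        "ON CONFLICT (area_code, user_id) DO UPDATE SET cnt = cnt + 1",
        rows,
    )
    return dump(conn)


# Hypothetical batch in which one key appears twice.
batch = [("415", 1), ("415", 1), ("212", 2)]
print(load_with_replace(batch))  # [('212', 2, 1), ('415', 1, 1)]
print(load_with_upsert(batch))   # [('212', 2, 1), ('415', 1, 2)]
```

With REPLACE the repeated key ends up with a count of 1 again, while the upsert form increments it, which is the behavior being asked for.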
@evan what if we aren't using batching? It's just a procedure that runs against a table at intervals.
The ON DUPLICATE KEY UPDATE pipeline feature should be orthogonal to the batching.