Delete with join performance

aaron.pfeifer · February 21, 2019, 4:34pm

Hey there!

I have a columnstore table where I’m trying to delete a set of rows based on all of the columns in the columnstore table matching the rows in my batch.

In my scenario, the columnstore table is 25B rows and the batch I want to delete consists of 1,000 rows. I thought the fastest way to do this would be to load the batch into a temporary table (with the same shard key as the columnstore table) and do a delete via a join with that table.

For example:

delete f
from facts f
inner join facts_to_delete d on
  f.partner_id = d.partner_id and
  f.app_id = d.app_id and
  ...
  f.epoch = d.epoch;

Running this query takes approximately 7s to complete. I tried to introduce some filters (e.g. on epoch) knowing what the values on that column would be and that seemed to have somewhat of an impact:

delete f
from facts f
inner join facts_to_delete d on
  f.partner_id = d.partner_id and
  f.app_id = d.app_id and
  ...
  f.epoch = d.epoch
where
  f.epoch in (1548493200);

This query takes 5s to complete instead of 7s. What’s curious, though, is that a select statement with the same join only takes 500ms:

select *
from facts f
inner join facts_to_delete d on
  f.partner_id = d.partner_id and
  f.app_id = d.app_id and
  ...
  f.epoch = d.epoch
where
  f.epoch in (1548493200);

In this case, there were no actual records matching in the facts table.

Our tables are structured like so:

CREATE TABLE `facts` (
  `partner_id` bigint(20) NOT NULL,
  `app_id` bigint(20) NOT NULL,
  ...
  `epoch` bigint(20) NOT NULL,
  `value` decimal(20,5) NOT NULL,
  KEY `app_id` (`partner_id`,`app_id`,`epoch`,...) /*!90619 USING CLUSTERED COLUMNSTORE */ /*!90621 WITH(COLUMNSTORE_SEGMENT_ROWS=102400, COLUMNSTORE_FLUSH_BYTES=2097152) */,
  /*!90618 SHARD */ KEY `foreign_key_shard_key` (`partner_id`,`app_id`,`epoch`)
) /*!90621 AUTOSTATS_ENABLED=TRUE */ |

The temporary table is structured identically in terms of the shard key.

As an alternative to the temporary table, I also tried to run an individual delete statement for each row. For 1,000 rows it took approximately the same amount of time as using a join. It appears as though running individual statements is faster for fewer rows and slower when deleting a lot of rows.

So I’d love to get thoughts on how I could optimize the delete. I’m also curious why the select performance is an order of magnitude faster than the delete performance even though it’s presumably looking at the same rows.

Thanks for any advice!

aaron.pfeifer · February 21, 2019, 8:30pm

Interestingly enough if I switch the query to:

delete f
from facts f
inner join facts_to_delete d on
  f.partner_id = d.partner_id and
  f.app_id = d.app_id and
  ...
  f.epoch = d.epoch
where
  d.epoch in (1548493200);

(notice the filter is now on the delete table instead of the fact table)

…we see an order of magnitude improvement in performance and the query now only takes 500ms.

That’s great, though I feel like I don’t really understand what’s happening in the query execution that a filter like this impacts the performance so significantly.

aaron.pfeifer · February 21, 2019, 9:05pm

The other surprising aspect with this is that the performance remains relatively constant – so performance is 500ms whether I have 1,000 records I’m deleting or just 1. I suspect this has to do with how the filter impacts the segments that get scanned.

Jacky · February 21, 2019, 9:09pm

Hey Aaron.

Thanks for posting, I am forwarding this to the relevant parties to help you out.

aaron.pfeifer · February 21, 2019, 10:12pm

I actually didn’t realize you could profile delete queries. These are the different profiles:

Without a condition:

| Delete partitions:all actual_rows: 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| HashJoin [f.partner_id = d.partner_id AND f.app_id = d.app_id AND f.device_country_id = d.device_country_id AND f.metric_id = d.metric_id AND f.store_id = d.store_id AND f.epoch = d.epoch] actual_rows: 0 exec_time: 1,388ms start_time: [00:00:00.251, 00:00:00.312]                                                                                                                                                                                                                                 |
| |---HashTableBuild alias:d actual_rows: 1,000 exec_time: 0ms start_time: [00:00:00.251, 00:00:00.311] memory_usage: 12,582.912109 KB                                                                                                                                                                                                                                                                                                                                                                    |
| |   Project [d_0.partner_id, d_0.app_id, d_0.device_country_id, d_0.metric_id, d_0.store_id, d_0.epoch] est_rows:1,248 actual_rows: 1,000 exec_time: 0ms start_time: [00:00:00.251, 00:00:00.311]                                                                                                                                                                                                                                                                                                       |
| |   TableScan report_service.facts_app_metrics_to_delete1 AS d_0 est_table_rows:1,248 est_filtered:1,248 actual_rows: 1,000 exec_time: 0ms start_time: [00:00:00.251, 00:00:00.311]                                                                                                                                                                                                                                                                                                                     |
| ColumnStoreScan report_service.facts_app_metrics2 AS f, KEY app_id (partner_id, app_id, epoch, metric_id, device_country_id, store_id) USING CLUSTERED COLUMNSTORE WITH(COLUMNSTORE_SEGMENT_ROWS=102400, COLUMNSTORE_FLUSH_BYTES=2097152) est_table_rows:14,085,295,956 est_filtered:14,085,295,956 actual_rows: 7,685,287,661 exec_time: 3,811ms start_time: [00:00:00.251, 00:00:00.311] memory_usage: 69,206.015625 KB segments_scanned: 83,174 segments_skipped: 71,113 segments_fully_contained: 0 |

With a condition:

| Delete partitions:all actual_rows: 0                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| NestedLoopJoin actual_rows: 0 exec_time: 0ms                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| |---Filter [f.device_country_id = d.device_country_id AND f.metric_id = d.metric_id AND f.store_id = d.store_id AND f.epoch = d.epoch] actual_rows: 0 exec_time: 0ms                                                                                                                                                                                                                                                                                                                        |
| |   IndexRangeScan report_service.facts_app_metrics_to_delete1 AS d, SHARD KEY foreign_key_shard_key (partner_id, app_id, epoch) scan:[partner_id = f.partner_id AND app_id = f.app_id AND epoch IN (...)] est_table_rows:1,248 est_filtered:1,248 actual_rows: 473,430 exec_time: 0ms start_time: [00:00:00.001, 00:00:00.022]                                                                                                                                                             |
| |    [actual_rows: 473,430 | max:26,667 at partition_27, average: 4,931.562500, std dev: 5,350.254406]                                                                                                                                                                                                                                                                                                                                                                                      |
| Filter [f.epoch IN (...)] actual_rows: 1,492,288 exec_time: 25ms start_time: [00:00:00.001, 00:00:00.003]                                                                                                                                                                                                                                                                                                                                                                                   |
| ColumnStoreScan report_service.facts_app_metrics2 AS f, KEY app_id (partner_id, app_id, epoch, metric_id, device_country_id, store_id) USING CLUSTERED COLUMNSTORE WITH(COLUMNSTORE_SEGMENT_ROWS=102400, COLUMNSTORE_FLUSH_BYTES=2097152) est_table_rows:14,085,296,955 est_filtered:471 actual_rows: 13,509,726,390 exec_time: 273ms start_time: [00:00:00.001, 00:00:00.002] memory_usage: 88,080.382812 KB segments_scanned: 147,769 segments_skipped: 6,518 segments_fully_contained: 0 |

I actually wonder if you’re generally better off deleting a single record at a time most of the time.

rob · February 21, 2019, 11:14pm

Do you have a profile for when the filter is on the fact table?

aaron.pfeifer · February 22, 2019, 2:06am

I was trying to pull up the profile for using f.epoch as a condition instead of d.epoch and it seems I got bit in my initial evaluation by the query being previously compiled in a different way that was causing it to perform poorly. When I cleared my plancache, filtering on f.epoch and d.epoch have the same effect.

Alright that being put to the side, what’s still interesting is that when I only have 1 record in my temporary table, MemSQL is still being forced to scan all the rows in the facts table. I can add the epoch condition and that improves performance, but it still takes 500ms to complete. I think I naively assumed that MemSQL would be able to skip more segments based on the contents in the temporary table but I think I’m make some false assumptions about how that’s working under the hood.

My current evaluation is that using the join table can be effective when there are enough records in your batch to delete. However, if you only have a few records then it’s more efficient to delete them one at a time.

I wonder if that would be any different if MemSQL supported this MySQL syntax:

DELETE FROM table WHERE (col1,col2) IN ((1,2),(3,4),(5,6))

Any other thoughts on how to optimize batched deletes or does it seem like I’m on the right track?

rob · February 22, 2019, 8:26pm

MemSQL can get segment elimination through the join but only for hash join. Do you have an example profile where the temporary table has 1 record?