Columnstore ordering by column performance

I have a question regarding columnstore sorting performance.

Assume I have a table with 10 columns and this columnstore definition:
KEY(A DESC, B DESC, C, D, E) using clustered columnstore
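For concreteness, a minimal sketch of such a table (only columns A and B's types are stated below; the remaining columns are illustrative):

    create table t (
        A bigint,
        B bigint,
        C bigint,
        D bigint,
        E bigint,
        -- ... the remaining columns ...
        key (A desc, B desc, C, D, E) using clustered columnstore
    );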

How come sorting on A DESC with LIMIT 100 takes 0.02 ms, while sorting on B DESC takes 24 seconds?
The table has 500M rows, 16 partitions, and 8 CPU cores. When sorting by B it uses all 8 cores.

For sorting on B: segments_scanned: 1,888, segments_skipped: 0, segments_fully_contained: 0
For sorting on A: segments_scanned: 32, segments_skipped: 0, segments_fully_contained: 0

For reference, the query is a basic select C from the table order by B (or A) desc limit 100.
Both columns A and B are of type bigint.
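Spelled out (with t standing in for the actual table name), the fast and slow variants are:

    select C from t order by A desc limit 100;  -- ~0.02 ms
    select C from t order by B desc limit 100;  -- ~24 s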


I’m fetching some people to help you out on this question 🙂

When the ORDER BY is on A, the query can take advantage of the sort key to scan the table in order, and therefore it only has to read enough rows to satisfy the LIMIT 100. This is why you see a much lower runtime and segments_scanned.

When the ORDER BY is on B, the sort key cannot be used, because the key ordering is lexicographic: the top rows by B can be anywhere in the table, so the query must scan the entire table.

For the purposes of optimizing ORDER BY, a columnstore sort key functions much the same as a typical B-tree index: the ORDER BY has to match a prefix of the key.
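For example, with the key above (and t as the table name), these queries can use the sort key, while ordering by B alone cannot:

    -- matches a prefix of (A desc, B desc, C, D, E): scan in order, stop after 100 rows
    select C from t order by A desc limit 100;
    select C from t order by A desc, B desc limit 100;

    -- does not match a prefix: full scan required
    select C from t order by B desc limit 100;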


I was afraid this was the reason. I was hoping that at least per-segment min/max values of the key columns were kept so they could somehow be leveraged.

So what would be a good way to ORDER BY something other than the first key column, assuming I don’t have a specific value of A to filter on…

Assume I have a table with bank transaction history: transaction ID, timestamp, amount, place, ATM ID, etc.

Assuming I want to display a paginated table of all of those details and allow sorting by every column, how could such a table be structured? Consider also hundreds of millions of rows.

There isn’t necessarily a good answer for that, but some ideas are:

  • You could try a rowstore table, where you can create an index on each column for efficient ORDER BYs (see the sketch after this list).
  • You could try a query like select ... from t where B >= (select B from t order by B desc limit 100,1) order by B desc limit 100 [edited with the correction discussed below], where a subquery finds only the top-n rows and the outer query then fetches the rest of the data for those rows. This can help by reducing the amount of data that has to be accessed: it still has to read all of column B, but not all the data for the rest of the columns.
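For the first idea, here is a minimal sketch of what the rowstore variant could look like for the transaction-history example above. All names and types are hypothetical, and depending on your version rowstore may already be the default table type (making the ROWSTORE keyword unnecessary):

    create rowstore table transactions (
        transaction_id bigint not null,
        ts datetime not null,
        amount decimal(18, 2),
        place varchar(128),
        atm_id bigint,
        primary key (transaction_id),
        key (ts),      -- one secondary index per column you want to sort by
        key (amount),
        key (place),
        key (atm_id)
    );

An ORDER BY on any indexed column with LIMIT 100 can then walk the matching index instead of sorting.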

Let’s assume the table is too big to fit in RAM.

Wouldn’t this be only as fast as the subquery select min(B) from t order by B desc limit 100, which is destined to be slow to begin with?

This is a little bit tricky. For the query

select * from t order by B desc limit 100

what happens in our internal query execution is: (1) since B does not match the sort key, we have to do a full scan of the table; (2) since the projection is select *, all columns of the table are extracted. Internally we keep a priority queue holding the 100 rows with the largest B values and update it as we scan the whole table. Note that those 100 rows contain all the columns of the table, and we also extract all columns during the scan, because the projection is select *.

select ... from t where B >= (select min(B) from t order by B desc limit 100) order by B desc limit 100

In this query, we execute the subselect first. In the subselect, only column B is used. The execution of the subselect is the same as in the previous case, except that now we only need to extract column B, since no other columns are involved, which is a lot faster than extracting all columns.

After we get the threshold B value, only approximately 100 rows (assuming there are not too many duplicates of the threshold value) pass the filter B >= threshold_B. We then extract the other columns of those ~100 rows using subsegment access, which is a very small cost compared with extracting the whole table.

In theory, for select * from t order by B desc limit 100, our optimizer could reach a better execution plan by extracting only column B to get the top 100 row IDs and then filling in the other projection columns via subsegment access. But this is not what happens today (the primary reason is that we did not have the concepts of row IDs and subsegment access before 7.0), so you would have to use @jack's subselect workaround for now.

Thanks for the detailed response, @haoran.

So although the sort key is (A, B, C, D, E) and B alone does not match it, ordering by A, B would be fast, right?

BTW, sorting by A is super fast, but sorting by A in reverse order is slow. Shouldn’t it be just as fast?

I’m using the 7.0 beta. Should this feature be enabled? If so, I still see poor performance(?)

I’ve tested this query. The inner subselect is actually wrong: the order by B desc limit 100 has no effect, and min(B) is actually the global minimum of the column.

You’re correct, sorry. Here’s a corrected version of that query:

select ... from t where B >= (
    select B from t order by B desc limit 100,1
) order by B desc limit 100

where the subquery gets the 101st largest value of B; filtering with B >= that value still admits at least the top 100 rows, and the outer LIMIT 100 then keeps exactly the top 100.

Or, equivalently:

select ... from t where B >= (
    select min(B) from (
        select B from t order by B desc limit 100
    ) as top_b
) order by B desc limit 100
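Applied to the query from the top of the thread (select C ... order by B desc on table t), the workaround would read:

    select C from t where B >= (
        select B from t order by B desc limit 100, 1
    ) order by B desc limit 100;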

I’m using the 7.0 beta. Should this feature be enabled? If so, I still see poor performance(?)

No, we were not clear: the ORDER BY optimization we are discussing here is hypothetical and doesn’t exist yet. There are other reasons it is not currently possible. However, subsegment access, which is in 7.0, helps the workaround run more efficiently; the original query still must scan the entire table.

Currently, the best and fastest way I’ve found to select items by some field in some ordering is to fetch the unique IDs from the table with an ORDER BY ... LIMIT 100, then concurrently fetch each row by ID using the hash index on the columnstore table:

    select unique_id from my_table where (…) order by somefield limit 100;

(where somefield is not the leading column of the sort key) and then, concurrently for each returned ID:

    select * from my_table where unique_id = some_unique_id;
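If it helps, the same two-step pattern can also be expressed in a single statement with a derived table, in the spirit of the subselect workaround above (a sketch using the same hypothetical names, filter omitted):

    select t.*
    from my_table t
    join (
        select unique_id from my_table
        order by somefield limit 100
    ) top100 on t.unique_id = top100.unique_id
    order by t.somefield;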
