Show slow performance response using order by

Idan,
If you don’t have good data spread between game ID, then you’ll have data skewing in partitions. I do believe data is still distributed in different leaf nodes. If your query will also involve date, and the amount of data per day is similar, i would consider using date(event_arrival_time) as a shard key, and instead of using event_arrival_time between x and y, you can just use event_arrival_date=‘date’. You can still keep event_arrival_time and index it for faster sort and if you need a specific time slot.
I think your case is a good candidate to create a fact table about your event. I’m not sure if cont_event is your raw data, but it’s good practice for scalability and reporting speed to aggregate your data in fact and dim. You just need to define your grain properly.

1 Like

Idan,
Yes Select * is usually not a good idea. You want to specify the column to return. I’m not sure what is stored inside your table. If your table has json data, it will slowdown your queries when included in Select. (especially if you return lost of data). Which points back to what I was saying in previous post. You should use your raw data and aggregate it for fast reporting capabilities. I’ve had Memsql tables with billions of rows aggregated return results in sub-seconds. Any query over 3 seconds in MemSQL is a good candidate for optimization.

1 Like

It’s difficult for me to say why your issue no longer reproduces. One guess I have is that your original order by query was not DESC. You can try seeing if removing DESC in the order by reproduces your slowdown.

instead of doing select * I specified specific fields and it got better aswell (6 seconds)

In older releases, we sometimes unnecessarily materialize projected non-filter columns. If you upgrade to 7.0, I expect that the select * query will be about as fast as your query that projects specific fields.

Regarding shard key tuning, I don’t think this is the right path to take for improving this query’s latency. I suggest sticking with keyless sharding for now. You may see a benefit on this specific query by placing game_id first in your clustered columnstore key. Alternatively, upgrading to 7.0 will also allow you to create a hash index on game_id, which I suggest experimenting with.

1 Like

@cwatts,

You can try seeing if removing DESC in the order by reproduces your slowdown.

I truly apologize for this inaccuracy it wasnt that I missed DESC it was the dates range which I changed. On the first example had much higher range in the BETWEEN condition(29 days) compared to the last one (1 day). Now that I switched it back to 29 days I can reproduce easily the issue and it shows again ~25 seconds

so thats the query with the order by:
SELECT * FROM contribution.cont_event cont WHERE game_id = 'RUUvr1574773090344' AND event_arrival_time BETWEEN '2019-12-02 00:00:00' AND '2019-12-31 23:59:59' ORDER BY event_arrival_time

and iam also adding here the corresponding PROFILE text json:

{
    "profile":[
        {
            "executor":"Top",
            "limit":"@@SESSION.`sql_select_limit`",
            "actual_row_count":{ "value":300 },
            "actual_total_time":{ "value":0 },
            "inputs":[
                {
                    "executor":"GatherMerge",
                    "order":[
                        "remote_0.event_arrival_time"
                    ],
                    "partitions":"all",
                    "query":"SELECT `cont`.`event_id` AS `event_id`, `cont`.`action` AS `action`, `cont`.`correlation_id` AS `correlation_id`, `cont`.`status` AS `status`, `cont`.`event_arrival_time` AS `event_arrival_time`, `cont`.`create_time` AS `create_time`, `cont`.`create_ts` AS `create_ts`, `cont`.`operator_id` AS `operator_id`, `cont`.`game_id` AS `game_id`, `cont`.`player_id` AS `player_id`, `cont`.`segment_code` AS `segment_code`, `cont`.`bet_amount_original` AS `bet_amount_original`, `cont`.`bet_amount_converted` AS `bet_amount_converted`, `cont`.`cont_amount_player` AS `cont_amount_player`, `cont`.`cont_amount_operator` AS `cont_amount_operator`, `cont`.`cont_amount_total` AS `cont_amount_total`, `cont`.`operator_income` AS `operator_income`, `cont`.`cont_amount_jackpot` AS `cont_amount_jackpot`, `cont`.`original_currency` AS `original_currency`, `cont`.`base_currency` AS `base_currency`, `cont`.`currency_rate` AS `currency_rate`, `cont`.`operator_game_code` AS `operator_game_code`, `cont`.`funnel_id` AS `funnel_id`, `cont`.`segment_name` AS `segment_name`, `cont`.`operator_game_name` AS `operator_game_name`, `cont`.`description` AS `description`, `cont`.`extra_fields` AS `extra_fields`, `cont`.`jackpot_game_name` AS `jackpot_game_name`, `cont`.`game_version` AS `game_version`, `cont`.`event_type` AS `event_type`, `cont`.`event` AS `event` FROM `contribution_0`.`cont_event` as `cont`  WHERE ((`cont`.`game_id` = 'RUUvr1574773090344') AND (`cont`.`event_arrival_time` BETWEEN '2019-12-02 00:00:00' AND '2019-12-31 23:59:59')) ORDER BY 5 LIMIT NULL OPTION(NO_QUERY_REWRITE=1, INTERPRETER_MODE=INTERPRET_FIRST)",
                    "alias":"remote_0",
                    "actual_row_count":{ "value":4800, "avg":0.000000, "stddev":0.000000, "max":0, "maxPartition":0 },
                    "actual_total_time":{ "value":0 },
                    "start_time":{ "value":0 },
                    "end_time":{ "value":21008 },
                    "inputs":[
                        {
                            "executor":"Project",
                            "out":[
                                {
                                    "alias":"",
                                    "projection":"cont.event_id"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.action"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.correlation_id"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.status"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.event_arrival_time"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.create_time"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.create_ts"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.operator_id"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.game_id"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.player_id"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.segment_code"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.bet_amount_original"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.bet_amount_converted"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.cont_amount_player"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.cont_amount_operator"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.cont_amount_total"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.operator_income"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.cont_amount_jackpot"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.original_currency"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.base_currency"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.currency_rate"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.operator_game_code"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.funnel_id"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.segment_name"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.operator_game_name"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.description"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.extra_fields"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.jackpot_game_name"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.game_version"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.event_type"
                                },
                                {
                                    "alias":"",
                                    "projection":"cont.event"
                                }
                            ],
                            "subselects":[],
                            "actual_row_count":{ "value":4800, "avg":300.000000, "stddev":0.000000, "max":300, "maxPartition":0 },
                            "actual_total_time":{ "value":4, "avg":3.500000, "stddev":0.500000, "max":4, "maxPartition":8 },
                            "start_time":{ "value":19776, "avg":20554.437500, "stddev":0.000000, "max":21003, "maxPartition":4 },
                            "network_traffic":{ "value":9215912, "avg":575994.500000, "stddev":68.802253, "max":576170, "maxPartition":6 },
                            "network_time":{ "value":19, "avg":10.000000, "stddev":9.000000, "max":19, "maxPartition":8 },
                            "inputs":[
                                {
                                    "executor":"Top",
                                    "limit":"?",
                                    "actual_row_count":{ "value":4800, "avg":300.000000, "stddev":0.000000, "max":300, "maxPartition":0 },
                                    "actual_total_time":{ "value":0, "avg":0.000000, "stddev":0.000000, "max":0, "maxPartition":0 },
                                    "inputs":[
                                        {
                                            "executor":"Filter",
                                            "condition":[
                                                "cont.game_id = ? AND cont.event_arrival_time >= ? AND cont.event_arrival_time <= ?"
                                            ],
                                            "subselects":[],
                                            "actual_row_count":{ "value":65914, "avg":4119.625000, "stddev":26.506780, "max":4189, "maxPartition":11 },
                                            "actual_total_time":{ "value":194, "avg":192.000000, "stddev":2.000000, "max":194, "maxPartition":8 },
                                            "start_time":{ "value":27, "avg":37.750000, "stddev":0.000000, "max":50, "maxPartition":15 },
                                            "inputs":[
                                                {
                                                    "executor":"OrderedColumnStoreScan",
                                                    "db":"contribution",
                                                    "table":"cont_event",
                                                    "alias":"cont",
                                                    "index":"KEY operator_id (event_arrival_time, action, operator_id, game_id, correlation_id) USING CLUSTERED COLUMNSTORE",
                                                    "storage":"columnar",
                                                    "est_table_rows":"259183530",
                                                    "est_filtered":"1",
                                                    "actual_row_count":{ "value":30011346, "avg":1875709.125000, "stddev":236285.090670, "max":2787664, "maxPartition":3 },
                                                    "actual_total_time":{ "value":20777, "avg":20671.500000, "stddev":105.500000, "max":20777, "maxPartition":0 },
                                                    "start_time":{ "value":26, "avg":36.750000, "stddev":0.000000, "max":49, "maxPartition":15 },
                                                    "memory_usage":{ "value":182452224, "avg":11403264.000000, "stddev":2695753.079351, "max":15728640, "maxPartition":2 },
                                                    "segments_scanned":{ "value":55, "avg":3.437500, "stddev":0.496078, "max":4, "maxPartition":2 },
                                                    "segments_skipped":{ "value":217, "avg":13.562500, "stddev":0.496078, "max":14, "maxPartition":0 },
                                                    "segments_fully_contained":{ "value":0, "avg":0.000000, "stddev":0.000000, "max":0, "maxPartition":0 },
                                                    "segments_filter_encoded_data":{ "value":42, "avg":2.625000, "stddev":0.780625, "max":4, "maxPartition":2 },
                                                    "inputs":[]
                                                }
                                            ]
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ],
    "version":"2",
    "warning":"requires compile",
    "info":{
        "memsql_version":"6.8.11",
        "memsql_version_hash":"e973c625ae6d372c2d41d39b19612202c244fd7a",
        "num_online_leaves":"2",
        "num_online_aggs":"1",
        "context_database":"(null)"
    }
}

btw: I havnt applied yet your first suggestion:
optimize table cont_event full

I suggest sticking with keyless sharding for now.

regarding the shard-key. dont you think as best practice would be better to use the correlation_id (instead of default) as it has high cardinality ?

Thank you

1 Like

Can you run the query with and without order by, one after another, and again paste the profiles? It looks like the data may have changed: in the most recent profile, 65914 rows passed the filter, but in Post #5, only 405 rows passed the filter. Sorry for the trouble.

1 Like

ofcourse. it’s not trouble at all!

without order-by

PROFILE SELECT * FROM contribution.cont_event cont WHERE game_id = 'RUUvr1574773090344' AND event_arrival_time BETWEEN '2019-12-02 00:00:00' AND '2019-12-31 23:59:59';

{
"profile":[
    {
        "executor":"Top",
        "limit":"@@SESSION.`sql_select_limit`",
        "actual_row_count":{ "value":300 },
        "actual_total_time":{ "value":0 },
        "inputs":[
            {
                "executor":"Gather",
                "partitions":"all",
                "query":"SELECT `cont`.`event_id` AS `event_id`, `cont`.`action` AS `action`, `cont`.`correlation_id` AS `correlation_id`, `cont`.`status` AS `status`, `cont`.`event_arrival_time` AS `event_arrival_time`, `cont`.`create_time` AS `create_time`, `cont`.`create_ts` AS `create_ts`, `cont`.`operator_id` AS `operator_id`, `cont`.`game_id` AS `game_id`, `cont`.`player_id` AS `player_id`, `cont`.`segment_code` AS `segment_code`, `cont`.`bet_amount_original` AS `bet_amount_original`, `cont`.`bet_amount_converted` AS `bet_amount_converted`, `cont`.`cont_amount_player` AS `cont_amount_player`, `cont`.`cont_amount_operator` AS `cont_amount_operator`, `cont`.`cont_amount_total` AS `cont_amount_total`, `cont`.`operator_income` AS `operator_income`, `cont`.`cont_amount_jackpot` AS `cont_amount_jackpot`, `cont`.`original_currency` AS `original_currency`, `cont`.`base_currency` AS `base_currency`, `cont`.`currency_rate` AS `currency_rate`, `cont`.`operator_game_code` AS `operator_game_code`, `cont`.`funnel_id` AS `funnel_id`, `cont`.`segment_name` AS `segment_name`, `cont`.`operator_game_name` AS `operator_game_name`, `cont`.`description` AS `description`, `cont`.`extra_fields` AS `extra_fields`, `cont`.`jackpot_game_name` AS `jackpot_game_name`, `cont`.`game_version` AS `game_version`, `cont`.`event_type` AS `event_type`, `cont`.`event` AS `event` FROM `contribution_0`.`cont_event` as `cont`  WHERE ((`cont`.`game_id` = 'RUUvr1574773090344') AND (`cont`.`event_arrival_time` BETWEEN '2019-12-02 00:00:00' AND '2019-12-31 23:59:59')) LIMIT NULL OPTION(NO_QUERY_REWRITE=1, INTERPRETER_MODE=INTERPRET_FIRST)",
                "alias":"remote_0",
                "actual_row_count":{ "value":484, "avg":0.000000, "stddev":0.000000, "max":0, "maxPartition":0 },
                "actual_total_time":{ "value":0 },
                "start_time":{ "value":0 },
                "end_time":{ "value":4226 },
                "inputs":[
                    {
                        "executor":"Project",
                        "out":[
                            {
                                "alias":"",
                                "projection":"cont.event_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.action"
                            },
                            {
                                "alias":"",
                                "projection":"cont.correlation_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.status"
                            },
                            {
                                "alias":"",
                                "projection":"cont.event_arrival_time"
                            },
                            {
                                "alias":"",
                                "projection":"cont.create_time"
                            },
                            {
                                "alias":"",
                                "projection":"cont.create_ts"
                            },
                            {
                                "alias":"",
                                "projection":"cont.operator_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.game_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.player_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.segment_code"
                            },
                            {
                                "alias":"",
                                "projection":"cont.bet_amount_original"
                            },
                            {
                                "alias":"",
                                "projection":"cont.bet_amount_converted"
                            },
                            {
                                "alias":"",
                                "projection":"cont.cont_amount_player"
                            },
                            {
                                "alias":"",
                                "projection":"cont.cont_amount_operator"
                            },
                            {
                                "alias":"",
                                "projection":"cont.cont_amount_total"
                            },
                            {
                                "alias":"",
                                "projection":"cont.operator_income"
                            },
                            {
                                "alias":"",
                                "projection":"cont.cont_amount_jackpot"
                            },
                            {
                                "alias":"",
                                "projection":"cont.original_currency"
                            },
                            {
                                "alias":"",
                                "projection":"cont.base_currency"
                            },
                            {
                                "alias":"",
                                "projection":"cont.currency_rate"
                            },
                            {
                                "alias":"",
                                "projection":"cont.operator_game_code"
                            },
                            {
                                "alias":"",
                                "projection":"cont.funnel_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.segment_name"
                            },
                            {
                                "alias":"",
                                "projection":"cont.operator_game_name"
                            },
                            {
                                "alias":"",
                                "projection":"cont.description"
                            },
                            {
                                "alias":"",
                                "projection":"cont.extra_fields"
                            },
                            {
                                "alias":"",
                                "projection":"cont.jackpot_game_name"
                            },
                            {
                                "alias":"",
                                "projection":"cont.game_version"
                            },
                            {
                                "alias":"",
                                "projection":"cont.event_type"
                            },
                            {
                                "alias":"",
                                "projection":"cont.event"
                            }
                        ],
                        "subselects":[],
                        "actual_row_count":{ "value":484, "avg":30.250000, "stddev":20.550243, "max":109, "maxPartition":6 },
                        "actual_total_time":{ "value":5, "avg":3.000000, "stddev":2.000000, "max":5, "maxPartition":0 },
                        "start_time":{ "value":551, "avg":1434.437500, "stddev":0.000000, "max":1755, "maxPartition":15 },
                        "network_time":{ "value":3, "avg":1.500000, "stddev":1.500000, "max":3, "maxPartition":0 },
                        "inputs":[
                            {
                                "executor":"Top",
                                "limit":"?",
                                "actual_row_count":{ "value":486, "avg":30.375000, "stddev":20.766183, "max":110, "maxPartition":6 },
                                "actual_total_time":{ "value":0, "avg":0.000000, "stddev":0.000000, "max":0, "maxPartition":0 },
                                "inputs":[
                                    {
                                        "executor":"Filter",
                                        "condition":[
                                            "cont.game_id = ? AND cont.event_arrival_time >= ? AND cont.event_arrival_time <= ?"
                                        ],
                                        "subselects":[],
                                        "actual_row_count":{ "value":498, "avg":31.125000, "stddev":23.127027, "max":120, "maxPartition":6 },
                                        "actual_total_time":{ "value":50, "avg":42.000000, "stddev":8.000000, "max":50, "maxPartition":0 },
                                        "start_time":{ "value":15, "avg":36.000000, "stddev":0.000000, "max":60, "maxPartition":15 },
                                        "inputs":[
                                            {
                                                "executor":"ColumnStoreScan",
                                                "db":"contribution",
                                                "table":"cont_event",
                                                "alias":"cont",
                                                "index":"KEY operator_id (event_arrival_time, action, operator_id, game_id, correlation_id) USING CLUSTERED COLUMNSTORE",
                                                "storage":"columnar",
                                                "est_table_rows":"259200577",
                                                "est_filtered":"1",
                                                "actual_row_count":{ "value":7007223, "avg":437951.437500, "stddev":250062.862187, "max":1402339, "maxPartition":3 },
                                                "actual_total_time":{ "value":4187, "avg":4184.500000, "stddev":2.500000, "max":4187, "maxPartition":8 },
                                                "start_time":{ "value":14, "avg":35.000000, "stddev":0.000000, "max":59, "maxPartition":15 },
                                                "memory_usage":{ "value":100663296, "avg":6291456.000000, "stddev":0.000000, "max":6291456, "maxPartition":0 },
                                                "segments_scanned":{ "value":34, "avg":2.125000, "stddev":0.330719, "max":3, "maxPartition":3 },
                                                "segments_skipped":{ "value":217, "avg":13.562500, "stddev":0.496078, "max":14, "maxPartition":0 },
                                                "segments_fully_contained":{ "value":0, "avg":0.000000, "stddev":0.000000, "max":0, "maxPartition":0 },
                                                "segments_filter_encoded_data":{ "value":18, "avg":1.125000, "stddev":0.330719, "max":2, "maxPartition":3 },
                                                "inputs":[]
                                            }
                                        ]
                                    }
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    }
],
"version":"2",
"warning":"requires compile",
"info":{
    "memsql_version":"6.8.11",
    "memsql_version_hash":"e973c625ae6d372c2d41d39b19612202c244fd7a",
    "num_online_leaves":"2",
    "num_online_aggs":"1",
    "context_database":"(null)"
}

}

with order-by

PROFILE: SELECT * FROM contribution.cont_event cont WHERE game_id = 'RUUvr1574773090344' AND event_arrival_time BETWEEN '2019-12-02 00:00:00' AND '2019-12-31 23:59:59' ORDER BY event_arrival_time DESC

{
"profile":[
    {
        "executor":"Top",
        "limit":"@@SESSION.`sql_select_limit`",
        "actual_row_count":{ "value":300 },
        "actual_total_time":{ "value":0 },
        "inputs":[
            {
                "executor":"GatherMerge",
                "order":[
                    "remote_0.event_arrival_time DESC"
                ],
                "partitions":"all",
                "query":"SELECT `cont`.`event_id` AS `event_id`, `cont`.`action` AS `action`, `cont`.`correlation_id` AS `correlation_id`, `cont`.`status` AS `status`, `cont`.`event_arrival_time` AS `event_arrival_time`, `cont`.`create_time` AS `create_time`, `cont`.`create_ts` AS `create_ts`, `cont`.`operator_id` AS `operator_id`, `cont`.`game_id` AS `game_id`, `cont`.`player_id` AS `player_id`, `cont`.`segment_code` AS `segment_code`, `cont`.`bet_amount_original` AS `bet_amount_original`, `cont`.`bet_amount_converted` AS `bet_amount_converted`, `cont`.`cont_amount_player` AS `cont_amount_player`, `cont`.`cont_amount_operator` AS `cont_amount_operator`, `cont`.`cont_amount_total` AS `cont_amount_total`, `cont`.`operator_income` AS `operator_income`, `cont`.`cont_amount_jackpot` AS `cont_amount_jackpot`, `cont`.`original_currency` AS `original_currency`, `cont`.`base_currency` AS `base_currency`, `cont`.`currency_rate` AS `currency_rate`, `cont`.`operator_game_code` AS `operator_game_code`, `cont`.`funnel_id` AS `funnel_id`, `cont`.`segment_name` AS `segment_name`, `cont`.`operator_game_name` AS `operator_game_name`, `cont`.`description` AS `description`, `cont`.`extra_fields` AS `extra_fields`, `cont`.`jackpot_game_name` AS `jackpot_game_name`, `cont`.`game_version` AS `game_version`, `cont`.`event_type` AS `event_type`, `cont`.`event` AS `event` FROM `contribution_0`.`cont_event` as `cont`  WHERE ((`cont`.`game_id` = 'RUUvr1574773090344') AND (`cont`.`event_arrival_time` BETWEEN '2019-12-02 00:00:00' AND '2019-12-03 23:59:59')) ORDER BY 5 DESC LIMIT NULL OPTION(NO_QUERY_REWRITE=1, INTERPRETER_MODE=INTERPRET_FIRST)",
                "alias":"remote_0",
                "actual_row_count":{ "value":4800, "avg":0.000000, "stddev":0.000000, "max":0, "maxPartition":0 },
                "actual_total_time":{ "value":0 },
                "start_time":{ "value":0 },
                "end_time":{ "value":34472 },
                "inputs":[
                    {
                        "executor":"Project",
                        "out":[
                            {
                                "alias":"",
                                "projection":"cont.event_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.action"
                            },
                            {
                                "alias":"",
                                "projection":"cont.correlation_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.status"
                            },
                            {
                                "alias":"",
                                "projection":"cont.event_arrival_time"
                            },
                            {
                                "alias":"",
                                "projection":"cont.create_time"
                            },
                            {
                                "alias":"",
                                "projection":"cont.create_ts"
                            },
                            {
                                "alias":"",
                                "projection":"cont.operator_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.game_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.player_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.segment_code"
                            },
                            {
                                "alias":"",
                                "projection":"cont.bet_amount_original"
                            },
                            {
                                "alias":"",
                                "projection":"cont.bet_amount_converted"
                            },
                            {
                                "alias":"",
                                "projection":"cont.cont_amount_player"
                            },
                            {
                                "alias":"",
                                "projection":"cont.cont_amount_operator"
                            },
                            {
                                "alias":"",
                                "projection":"cont.cont_amount_total"
                            },
                            {
                                "alias":"",
                                "projection":"cont.operator_income"
                            },
                            {
                                "alias":"",
                                "projection":"cont.cont_amount_jackpot"
                            },
                            {
                                "alias":"",
                                "projection":"cont.original_currency"
                            },
                            {
                                "alias":"",
                                "projection":"cont.base_currency"
                            },
                            {
                                "alias":"",
                                "projection":"cont.currency_rate"
                            },
                            {
                                "alias":"",
                                "projection":"cont.operator_game_code"
                            },
                            {
                                "alias":"",
                                "projection":"cont.funnel_id"
                            },
                            {
                                "alias":"",
                                "projection":"cont.segment_name"
                            },
                            {
                                "alias":"",
                                "projection":"cont.operator_game_name"
                            },
                            {
                                "alias":"",
                                "projection":"cont.description"
                            },
                            {
                                "alias":"",
                                "projection":"cont.extra_fields"
                            },
                            {
                                "alias":"",
                                "projection":"cont.jackpot_game_name"
                            },
                            {
                                "alias":"",
                                "projection":"cont.game_version"
                            },
                            {
                                "alias":"",
                                "projection":"cont.event_type"
                            },
                            {
                                "alias":"",
                                "projection":"cont.event"
                            }
                        ],
                        "subselects":[],
                        "actual_row_count":{ "value":4800, "avg":300.000000, "stddev":0.000000, "max":300, "maxPartition":0 },
                        "actual_total_time":{ "value":5, "avg":4.000000, "stddev":1.000000, "max":5, "maxPartition":0 },
                        "start_time":{ "value":33799, "avg":34093.812500, "stddev":0.000000, "max":34466, "maxPartition":6 },
                        "network_traffic":{ "value":9213646, "avg":575852.875000, "stddev":41.005907, "max":575938, "maxPartition":10 },
                        "network_time":{ "value":11, "avg":6.000000, "stddev":5.000000, "max":11, "maxPartition":8 },
                        "inputs":[
                            {
                                "executor":"TopSort",
                                "limit":"?",
                                "order":[
                                    "cont.event_arrival_time DESC"
                                ],
                                "actual_row_count":{ "value":4800, "avg":300.000000, "stddev":0.000000, "max":300, "maxPartition":0 },
                                "actual_total_time":{ "value":24, "avg":22.500000, "stddev":1.500000, "max":24, "maxPartition":8 },
                                "start_time":{ "value":621, "avg":2473.000000, "stddev":0.000000, "max":2828, "maxPartition":13 },
                                "memory_usage":{ "value":33554432, "avg":2097152.000000, "stddev":0.000000, "max":2097152, "maxPartition":0 },
                                "inputs":[
                                    {
                                        "executor":"Filter",
                                        "condition":[
                                            "cont.game_id = ? AND cont.event_arrival_time >= ? AND cont.event_arrival_time <= ?"
                                        ],
                                        "subselects":[],
                                        "actual_row_count":{ "value":110794, "avg":6924.625000, "stddev":79.359841, "max":7068, "maxPartition":14 },
                                        "actual_total_time":{ "value":297, "avg":285.500000, "stddev":11.500000, "max":297, "maxPartition":8 },
                                        "start_time":{ "value":2, "avg":4.937500, "stddev":0.000000, "max":16, "maxPartition":13 },
                                        "inputs":[
                                            {
                                                "executor":"ColumnStoreScan",
                                                "db":"contribution",
                                                "table":"cont_event",
                                                "alias":"cont",
                                                "index":"KEY operator_id (event_arrival_time, action, operator_id, game_id, correlation_id) USING CLUSTERED COLUMNSTORE",
                                                "storage":"columnar",
                                                "est_table_rows":"258396693",
                                                "est_filtered":"1",
                                                "actual_row_count":{ "value":46819110, "avg":2926194.375000, "stddev":239207.291250, "max":3849633, "maxPartition":3 },
                                                "actual_total_time":{ "value":34162, "avg":34029.500000, "stddev":132.500000, "max":34162, "maxPartition":0 },
                                                "start_time":{ "value":1, "avg":3.687500, "stddev":0.000000, "max":15, "maxPartition":13 },
                                                "memory_usage":{ "value":100663296, "avg":6291456.000000, "stddev":0.000000, "max":6291456, "maxPartition":0 },
                                                "segments_scanned":{ "value":55, "avg":3.437500, "stddev":0.496078, "max":4, "maxPartition":2 },
                                                "segments_skipped":{ "value":217, "avg":13.562500, "stddev":0.496078, "max":14, "maxPartition":0 },
                                                "segments_fully_contained":{ "value":0, "avg":0.000000, "stddev":0.000000, "max":0, "maxPartition":0 },
                                                "segments_filter_encoded_data":{ "value":55, "avg":3.437500, "stddev":0.496078, "max":4, "maxPartition":2 },
                                                "inputs":[]
                                            }
                                        ]
                                    }
                                ]
                            }
                        ]
                    }
                ]
            }
        ]
    }
],
"version":"2",
"info":{
    "memsql_version":"6.8.11",
    "memsql_version_hash":"e973c625ae6d372c2d41d39b19612202c244fd7a",
    "num_online_leaves":"2",
    "num_online_aggs":"1",
    "context_database":"(null)"
}

}

additionally:

Dont you think I shall add event_arrival_time to that coulmnar index key as well? all our queries are based on date. wont it be better while sorting?

also attaching here my pref question to you from my prev reply:

I suggest sticking with keyless sharding for now.

regarding the shard-key. dont you think as best practice would be better to use the correlation_id (instead of default) as it has high cardinality ?

Thanks

1 Like

@franck.leveneur I was trying to browse the net for an example of how to use Fact and Dim methodologies. couldnt find single on except this theoretical post: Third Normal Form, Star Schema, and a Performance Centric Data Strategy

Do you have some technical ref (video/blog/etc…) where I can take an example for that?

1 Like

Hi @cwatts, Did you had a chance to look on my last response? I added the traces which you requested

Thanks

1 Like

The issue here involves the implicit LIMIT in your queries. You have the session variable sql_select_limit = 300, which is like appending LIMIT 300 to your queries. When there is no order by, we can stop the scan when we get 300 rows. However, in the presence of order by, we need to look at all rows, then sort and take the top 300. In the profile, you can see that the non-order-by query scans only 7M rows, but the order-by query scans 47M. This is the reason for the slowdown.

regarding the shard-key. dont you think as best practice would be better to use the correlation_id (instead of default) as it has high cardinality ?

Using high-cardinality columns for shard keys is good because it gives you even distribution of your data, yes. Keyless sharding also gives you even distribution of your data. There are other considerations to make when choosing a shard key, like making your common joins partition-local. See here: Optimizing Table Data Structures · SingleStore Documentation
In any case, the choice of shard key is not relevant to the latency of this query, which is why I recommended leaving it unchanged for this performance investigation.

I was trying to browse the net for an example of how to use Fact and Dim methodologies.

You’ll be more successful in your searches if you search for “fact and dimension tables” - “dim” is short for “dimension” here.

1 Like

A clustered columnstore key of event_arrival_time, game_id, action, operator_id, correlation_id is pretty close to what you have now and should give you better segment elimination for this query.

1 Like

In the profile, you can see that the non-order-by query scans only 7M rows, but the order-by query scans 47M. This is the reason for the slowdown

@cwatts Iam not sure regarding the solution. What do you actuall suggesting to boost that query?

You mentioned:

A clustered columnstore key of event_arrival_time, game_id, action, operator_id, correlation_id is pretty close to what you have now and should give you better segment elimination for this query.

Today we have:

event_arrival_time, action, operator_id, game_id, correlation_id

Isnt it almost the same?

btw: action, operatorId, gameId have a very small distinct values

Thanks,
Idan

1 Like

If you want to be able to avoid scanning the whole table to find the top 300 rows, your sort (clustered columnstore) key needs to support that. This means your order by fields (including order i.e. DESC) must be a prefix of your sort key. This would be accomplished by making event_arrival_time in your current sort key event_arrival_time desc.

Today we have:
event_arrival_time, action, operator_id, game_id, correlation_id
Isnt it almost the same?

The order that your fields appear in the key matters. Rows are sorted lexicographically, so putting game_id directly after event_arrival_time will get you better segment elimination. To help understand this, consider a sorted dictionary of words. There is a contiguous range of words starting with “co” (“c” corresponding to an event_arrival_time and “o” corresponding to a game_id), so it is easier to find those words than words containing “c_o” (where “_” is a wildcard representing possible values for action).

1 Like

so we do have event_arrival_time as prefix. so we are almost there.
you saying we shall have something like that:

event_arrival_time, game_id, action, operator_id, correlation_id

thats almost the same isnt it?

quoting my last note:

1 Like

If you want to sort DESC, you need event_arrival_time desc as a prefix, which you do not have. My suggestion is event_arrival_time desc, game_id, action, operator_id, correlation_id. The desc is to support the order by limit without scanning the whole table, and the game_id earlier is for better segment elimination.

thats almost the same isnt it?

Sure, they are similar, but the differences are important.

1 Like

correlation_id has high cardinality are you sure it shall be described in cluster columnstore key ? I thought I understood that only low cardinality coulmns should be added to that key

1 Like

For the purposes of optimizing this query, nothing after game_id matters. I was just trying to make minimal changes to the key you already had.

1 Like

We are planning to recreate this table so if I can get more tips from experienced expert like you(in regarding how shall I design the columnstore cluster index key Ofcourse that is highly appreciated

Would you recommend keeping high cardinally column on that key?

Thanks again for your help

1 Like

Referring to the documentation on sort key guidelines may be useful.

There is nothing inherently wrong with having high-cardinality columns in the sort key. Date/datetimes are commonly found in sort keys, for example. The two primary things to consider in choosing a key are segment elimination and merge cost. Documentation on segment elimination can be found here and on the merger here. You need to estimate if the benefit to read performance given by segment elimination (which of course requires you to be filtering on correlation_id in the first place) is worth the write amplification given by extra merger work.

1 Like

I read that like 20 times:) I didnt know you can add to the columnstore key the variable DESC. we have veriouse queries on that table. this table planned to be huge(billions of transactions) ! perhaps I did mistake by not explaining you the motivation of having order by desc. mybe Iam doing something wrong here by desgin.

We have backoffice where we transactional reports to customers. we use that query to fetch the results to generate this report. We do this by pagination and show it on the screen (using pages). we dont want to fetch all table at once so we paginate it using ordered by desc with the query you already familiar:

SELECT * FROM contribution.cont_event cont WHERE game_id = 'RUUvr1574773090344' AND action IN ('PLACE_BET') AND event_arrival_time BETWEEN '2019-12-02 00:00:00' AND '2019-12-31 23:59:59' ORDER BY event_arrival_time DESC LIMIT 30 OFFSET 0

This way we can show pages to the end customer ordered by date desc and he can scroll to the next ones as he wishes(google-stype). perhaps we overkill memsql for this and we need to think about other way.

i found things like: Third Normal Form, Star Schema, and a Performance Centric Data Strategy

I do understand the concept still dont understand how memsql applies this. but hey iam sure thats for another question which I already posted :slight_smile: (How to leverage memsql for reports that combining multiple tables - #2 by franck.leveneur - Off-Topic - SingleStore Forums)

Ive done: SHOW COLUMNAR MERGE STATUS FOR cont_event; merger showed good results - I think:

Merger,State,Plan,Progress,Partition
(Current groups),NULL,"17,1",NULL,7
(Current groups),NULL,"17,1",NULL,6
(Current groups),NULL,"17,1",NULL,1
(Current groups),NULL,"17,1",NULL,5
(Current groups),NULL,"17,1",NULL,15
(Current groups),NULL,"17,1",NULL,4
(Current groups),NULL,"17,1",NULL,3
(Current groups),NULL,"17,1",NULL,2
(Current groups),NULL,"17,1",NULL,0
(Current groups),NULL,"17,1",NULL,8
(Current groups),NULL,"17,1",NULL,9
(Current groups),NULL,"17,1",NULL,10
(Current groups),NULL,"17,1",NULL,11
(Current groups),NULL,"17,1",NULL,12
(Current groups),NULL,"17,1",NULL,13
(Current groups),NULL,"17,1",NULL,14
1 Like

Hi @cwatts was wondering if you had another chance to look on my last response?? Thank you

1 Like