How to get the total number of successful and failed queries for a recent interval, say 5 mins?


#1

We can use mv_activtites/mv_activities_extended and join with mv_queries to get the total success_count and failure_count of queries over the recent interval configured by session variable activities_delta_sleep_s, which is 1 second by default. But if I am interested to know for the last 5 minutes, can I increase the session variable value from 1sec to 5mins? What will be the performance implication due to that? I am guessing there will be performance overhead due to that.

As an alternative to mv_activtites, can I use mv_finished_tasks? When does mv_finished_tasks table get refreshed? When it gets refreshed, does it refresh for all the activities or some of the activities depending on last_finished_timestamp or something else?

Similarly I wanted information like average query latency for all the queries fired over the last few minutes. Which table can I use for that?


#2

The impact of changing activities_delta_sleep_s is just that all queries on mv_activities etc will sleep for that long to collect statistics over that time period. It doesn’t add any performance overheads.

mv_activities is essentially just the delta of mv_activities_cumulative between two points in time. So you can also collect the _cumulative table periodically (5 minutes apart in your case) and subtract.

mv_finished_tasks is also a cumulative table. For the most common type of activity of interest, queries, their entries in these tables are cleared/reset when the query is evicted from the in-memory plancache - i.e. their statistics get reset to 0. Everything will get reset upon restart.

You can compute average query latency using mv_activities as well, use elapsed_time_ms and the same strategy as above.

EDIT: edited wording above to avoid confusion about what “refresh” we’re talking about.


#3

To avoid potential confusion:

I think Jack’s talking about a different notion of “refresh” than the one you had in mind. But that’s because the stats are up always up to data and there’s no need to periodically update them. information_schema queries read internal in-memory data structures and format the output as a SQL table. In the case of workload profiling tables, these data structures are updated on the fly by queries as they execute, even when no infoschema queries are running.

Though it’s tangential, it’s still worth keeping in mind that, as Jack described, stats for a query can get reset if it’s dropped from the in-memory plancache.

mv_activities with a delta of 5 min would indeed suffice to get finished/success counts. The delta simply controls the length of a sleep inside the query.

However, I wouldn’t recommend using mv_activities to get average latency. The issue is that mv_activities includes metrics for not-yet-finished instances of a query, so its elapsed_time_ms isn’t quite a measure of latency. In particular, if you were to divide elapsed_time_ms by the instance count (success_count + run_count + failure_count), you’d underestimate average latency.

To compute average metrics per execution, I’d recommend mv_finished_tasks instead. That table’s metrics exclude running instances. However, since we don’t currently provide a delta table for mv_finished_tasks - in the sense that mv_activities[_extended] is the delta of two reads of mv_activities[_extended]_cumulative - it can only directly tell you about average latency across all instances of a query since the time it was last added to the in-memory plancache. To compute average latency over just those instances that ran within a certain window of time, you’d have to manually take two snapshots, at the start and end of the window, and then compute a delta manually, as Jack mentioned.