Plan caching is different from the query caching found in many other systems. With query caching, if an identical query is run a second time and the data hasn't changed, the results cached from the first execution are returned and the query doesn't need to run again, so the runtime of the second query is essentially zero.
In MemSQL, on the other hand, we cache only query plans, not query results, like Rob said above. A second run of the same plan is faster because we can reuse the previously compiled plan, but the query is still fully re-executed. This is much more general than query caching - any query with the same shape, but with different parameter values, can reuse the same plan, and of course the same plan stays valid as the data changes. See https://docs.memsql.com/v6.8/introduction/faqs/memsql-faq/#why-do-memsql-queries-typically-run-faster-the-second-time-they-are-executed and https://docs.memsql.com/v6.8/concepts/code-generation/ for more information.
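To make the shape-vs-results distinction concrete, here is a toy sketch of a cache keyed by parameterized query shape. This is only an illustration of the idea, not MemSQL's actual implementation - the literal-stripping regexes and the `PlanCache` class are invented for this example:

```python
import re

class PlanCache:
    """Toy model of a plan cache keyed by query shape, not query text."""

    def __init__(self):
        self.plans = {}
        self.compile_count = 0

    def _shape(self, sql):
        # Parameterize out string and numeric literals, so queries that
        # differ only in constant values map to the same shape.
        sql = re.sub(r"'[^']*'", "?", sql)
        sql = re.sub(r"\b\d+\b", "?", sql)
        return sql

    def get_plan(self, sql):
        shape = self._shape(sql)
        if shape not in self.plans:
            self.compile_count += 1  # the expensive compile happens once per shape
            self.plans[shape] = "compiled:" + shape
        return self.plans[shape]

cache = PlanCache()
cache.get_plan("SELECT * FROM t WHERE id = 1")
cache.get_plan("SELECT * FROM t WHERE id = 42")    # same shape: reuses the plan
cache.get_plan("SELECT * FROM t WHERE name = 'x'") # new shape: compiles again
```

Note that the third query triggers a fresh compile even though it touches the same table, because a different filter column changes the shape - which is exactly why workloads full of unique query shapes care about first-run performance.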
What makes the most sense to performance test depends on what your application workload looks like. If you mostly run a fixed set of query shapes/templates with different parameters substituted in, then those will be able to reuse cached plans. If instead you mostly run unique new query shapes, e.g. with different filter expressions, joins, or groupings being added, then you care about first-run performance, in which case you would want to include plan compilation time in your performance testing. That is tricky to test accurately, especially in a concurrent setting, but a reasonable place to start is to run DROP ALL FROM PLANCACHE on all nodes immediately after each query (note that plan_expiration_minutes must not be set to 0 for this to work).
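Here is a rough sketch of one way to structure such a first-run benchmark. The `run_sql` callable and `benchmark_first_run` helper are stand-ins invented for this example (in practice `run_sql` would be a MySQL-protocol client call, and DROP ALL FROM PLANCACHE would need to be issued on every node):

```python
import time

def benchmark_first_run(run_sql, queries, trials=3):
    """Time first-run (compile + execute) performance of each query.

    Before every timed run we clear the plan cache, so each run pays the
    compilation cost again. Remember: in a real cluster this statement must
    be run on all nodes, and plan_expiration_minutes must not be set to 0.
    """
    timings = {}
    for q in queries:
        samples = []
        for _ in range(trials):
            run_sql("DROP ALL FROM PLANCACHE")  # force recompilation next run
            start = time.perf_counter()
            run_sql(q)
            samples.append(time.perf_counter() - start)
        timings[q] = min(samples)
    return timings

# Stub client for demonstration: records statements instead of executing them.
log = []
timings = benchmark_first_run(log.append, ["SELECT 1", "SELECT 2"], trials=2)
```

Taking the minimum of several trials is one common way to reduce noise; depending on your goals you might prefer the median or full distribution.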
Also, setting plan_expiration_minutes = 0 only expires plans from memory; they remain on disk, so it won't do what you want here.