I’m well aware that the fetch of a JSON column is troublesome and could take longer than a regular column since it is using an optimized and different storage engine, however the numbers I saw do not add up.
I have a JSON column with well over 40 fields in the json data blob.
Fetching 1 field out of the json column:
select payload::columnX from … limit 1;
takes 15-20 ms.
Fetching 4 fields out of the json column:
select payload::column1, payload::column2, payload::column3, payload::column4 from … limit 1;
takes about 70-80ms.
Fetching 40 fields out of the json column - takes about 650-700ms.
So i checked the profiled the query using the memsql studio - during the 650-700ms period, the cpu was only on 6% cpu usage. which would be equivalent to roughly 60 ms of cpu time.
It makes me wonder whether the underlying implementation is using multiple threads to fetch from the JSON storage, or linearly column by column, and the delay is basically disk access that doesn’t get paralleled ?