ERROR 1956 ER_PARAMETER_CAPACITY_EXCEEDED: Query cannot be completed because the parameter array capacity of 1048576 was exceeded

Hi,

I’m working with a large sample of trading data from multiple years, with each row as a single trade. Columns included in this table are the trade_datetime (MM-DD-YYYY HH:mm:ss.SSS), trade_date (MM-DD-YYYY), and trade_hour (HH).

My below query works as intended with a smaller set of data; this retains the latest trade data over a 1hr interval, then deletes everything else.

However with anything larger, 8.55M rows of data, I receive the following error in MemSQL:

ERROR 1956 ER_PARAMETER_CAPACITY_EXCEEDED: Query cannot be completed because the parameter array capacity of 1048576 was exceeded

I have plenty of resources available on this machine, along with 128GB or RAM, yet was wondering if there was a setting within MemSQL placing constraints on this parameter?

Query:

DELETE FROM tbl_trades
WHERE row IN
    (SELECT row
    FROM tbl_trades A
    WHERE trade_datetime <
        (SELECT MAX(trade_datetime)
        FROM tbl_trades
        WHERE ticker = A.ticker
        AND trade_date = A.trade_date
        AND trade_hour = A.trade_hour)
    GROUP BY ticker, trade_price, trade_volume, trade_datetime, trade_date, trade_hour);

Here is how the table was created:

CREATE TABLE IF NOT EXISTS tbl_trades
(
    row INT AUTO_INCREMENT,
    ticker TEXT NOT NULL,
    trade_price DECIMAL(12,2) NOT NULL,
    trade_volume DECIMAL(12,2) NOT NULL,
    trade_datetime DATETIME(6) NOT NULL,
    trade_date DATE NOT NULL,
    trade_hour INT NOT NULL,
    upload_time TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    KEY(`row`) USING CLUSTERED COLUMNSTORE,
    SHARD KEY (`ticker`)
);

Thanks

Yes, this is an internal limit of MemSQL (Hard-coded limit)

The limit is being hit by the result of the IN clause (The query SELECT MAX(trade_datetime) FROM tbl_trades WHERE ticker = A.ticker AND trade_date = A.trade_date AND trade_hour = A.trade_hour) GROUP BY ticker, trade_price, trade_volume, trade_datetime, trade_date, trade_hour)

One solution is

create temporary table tmp as SELECT row
    FROM tbl_trades A
    WHERE trade_datetime <
        (SELECT MAX(trade_datetime)
        FROM tbl_trades
        WHERE ticker = A.ticker
        AND trade_date = A.trade_date
        AND trade_hour = A.trade_hour)
    GROUP BY ticker, trade_price, trade_volume, trade_datetime, trade_date, trade_hour;

DELETE FROM tbl_trades WHERE row IN (select row from tmp order by row limit 1000000 offset 0);
DELETE FROM tbl_trades WHERE row IN (select row from tmp order by row limit 1000000 offset 1000000);
DELETE FROM tbl_trades WHERE row IN (select row from tmp order by row limit 1000000 offset 2000000);
.....
....
DROP TABLE tmp;

This is just one solution on top of my head. You can wrap this inside a stored procedure (SingleStoreDB Cloud · SingleStore Documentation) and do a FOR loop to automate this if you want. You can use begin and end if you want transaction semantics.

1 Like

Thanks, this works well! Makes sense to run this as a stored procedure, during low volume hours.

Much appreciated