Unable to insert a longblob into a table


#1

Hi,

I created a test table:
CREATE TABLE mf_model (
  timestamp int(10) unsigned NOT NULL,
  model longblob NOT NULL,
  /*!90618 SHARD */ KEY timestamp (timestamp),
  KEY timestamp_2 (timestamp) /*!90619 USING CLUSTERED COLUMNSTORE */
) /*!90621 AUTOSTATS_ENABLED=TRUE */

I’m using a simple Python script to insert a 281 MB longblob into this table. This is the script:

import time
import pickle
import traceback
import pymysql.cursors

class DB:
    def __init__(self, customer_id: str):
        self.customer_id = customer_id

    def get_connection(self):
        return pymysql.connect(
            host="localhost",
            port=3306,
            user="root",
            password="<password>",
            database="model_" + self.customer_id)

def save_model(db: DB, model_bin: bytes):
    try:
        connection = None
        connection = db.get_connection()
        with connection.cursor() as cursor:
            timestamp = int(time.time())
            cursor.execute("INSERT INTO mf_model (timestamp, model) VALUES (%s, %s)", (timestamp, model_bin))
            cursor.execute("DELETE FROM mf_model WHERE timestamp != %s", (timestamp,))
        connection.commit()
    except Exception:
        traceback.print_exc()
    finally:
        if connection is not None:
            connection.close()

with open("./model.pickle", "rb") as f:
    model = pickle.load(f)

model_bin = pickle.dumps(model, pickle.HIGHEST_PROTOCOL)
save_model(DB("test"), model_bin)

Smaller blobs of around 50 MB insert successfully, but with larger blobs the insert sometimes works and sometimes the Python script fails with this error:

Traceback (most recent call last):
  File "upload.py", line 24, in save_model
    cursor.execute("INSERT INTO mf_model (timestamp, model) VALUES (%s, %s)", (timestamp, model_bin))
  File "/usr/lib/python3.6/site-packages/pymysql/cursors.py", line 170, in execute
    result = self._query(query)
  File "/usr/lib/python3.6/site-packages/pymysql/cursors.py", line 328, in _query
    conn.query(q)
  File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 516, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 727, in _read_query_result
    result.read()
  File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 1066, in read
    first_packet = self.connection._read_packet()
  File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 683, in _read_packet
    packet.check_error()
  File "/usr/lib/python3.6/site-packages/pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "/usr/lib/python3.6/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.InternalError: (1857, 'An error in this transaction required closing a connection to a leaf. The open transaction has rolled back. Run SHOW WARNINGS for more details.')

When I run SHOW WARNINGS there is nothing.
I cannot find anything useful about this error on the Internet.
I am using MemSQL 6.7 and I cannot upgrade right now.
There is nothing else in the master or leaf logs. I even tried the audit logs, but there is literally nothing about the insert.
I tried it on a small test setup with 1 master and 1 leaf, and also on a cluster with 4 aggregators and 12 leaves, and the behavior is exactly the same.

Can you please help me solve this? According to the documentation, a longblob of up to 4 GB should insert just fine: https://docs.memsql.com/sql-reference/v6.5/datatypes/#string-types

Thank you for any hint!


#2

The first thing is to find out whether that large buffer made it through the protocol stack in Python to MemSQL. My suspicion is that the API has a default maximum size.
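One quick way to check is to see what actually landed on the server side and what packet limit the cluster reports. Something like this untested sketch, reusing the connection details from your script (host, credentials, and the model_test database are assumptions taken from it):

import pymysql

connection = pymysql.connect(host="localhost", port=3306, user="root",
                             password="<password>", database="model_test")
try:
    with connection.cursor() as cursor:
        # Size of the blob that actually arrived, for the inserts that "succeeded".
        cursor.execute("SELECT timestamp, LENGTH(model) FROM mf_model")
        print(cursor.fetchall())

        # Packet limit the server is currently enforcing for a single client packet.
        cursor.execute("SHOW VARIABLES LIKE 'max_allowed_packet'")
        print(cursor.fetchall())
finally:
    connection.close()

If LENGTH(model) matches the 281 MB you sent, the buffer made it through; if the variable is smaller than the blob, that points at a packet-size limit.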

I just looked at the docs for pymysql.cursors and found there is a “max_allowed_packet” setting (https://pymysql.readthedocs.io/en/latest/modules/cursors.html). That page says “Default value of max_allowed_packet is 1048576.”, which is 1 MB, so 281 MB would be well over it.
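If I read the PyMySQL docs correctly, connect() also accepts a max_allowed_packet argument, so you could try raising it on the client side. This is just a sketch of that idea, not a confirmed fix; the 512 MB figure is illustrative, and the server-side max_allowed_packet variable on the cluster would also need to be at least as large as the blob:

import pymysql

# Sketch: raise the client-side packet ceiling above the ~281 MB payload.
connection = pymysql.connect(
    host="localhost",
    port=3306,
    user="root",
    password="<password>",
    database="model_test",
    max_allowed_packet=512 * 1024 * 1024)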