Upgrade failing with Timeout


#1

Hi,

I am trying to upgrade from 5.x to 6.x on a box without internet access. I have managed to copy the tar.gz file to the remote box, but the upgrade is failing.

sudo memsql-ops agent-upgrade --file-path ./memsql-ops-6.7.5.tar.gz
Adding a MemSQL Ops file
Unpacking archive to determine its version
A file of type MemSQL Ops with version 6.7.5 already exists
Overwriting existing file
Copying file into MemSQL Ops data directory
Registering file with MemSQL Ops
Successfully added a MemSQL Ops file with version 6.7.5
Timed out while sending API message
You have mail in /var/spool/mail/root


#2

Are all OPS agents in your cluster able to communicate with each other? Do your network and security settings permit communication between OPS agents? The message "Timed out while sending API message" means that the primary OPS agent tried to send instructions to a follower OPS agent to upgrade itself, but could not reach it and timed out.

Check the current OPS state by running the command below on the primary OPS agent, then SSH into each of the other hosts and run it there.

memsql-ops agent-list

If one of the agents shows itself as offline or fails to respond when you run that command on its host, then start it locally on that host with the command:

memsql-ops start

Once all OPS agents are online, try the upgrade again.
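
As a rough sketch of the check described above, the agent list can be filtered for agents that are not online. This assumes the output has the column layout shown later in this thread (ID, Host, Port, Role, State, Version); the function name offline_agents is made up for illustration and is not part of MemSQL Ops.

```shell
#!/bin/sh
# Hypothetical helper, not a memsql-ops feature.
# Reads `memsql-ops agent-list` output on stdin and prints the Host of every
# agent whose State column is anything other than ONLINE.
# Assumed column order: ID  Host  Port  Role  State  Version
offline_agents() {
    # Skip the header row (first field "ID") and any short or blank lines.
    awk '$1 != "ID" && NF >= 6 && $5 != "ONLINE" { print $2 }'
}

# Usage, on the primary agent host:
#   memsql-ops agent-list | offline_agents
```

If this prints nothing, every agent in the list reports itself as online. Any host it does print is a candidate for running memsql-ops start locally.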

The agent list should be the same on all agents. However, in version 5.8, a loss of connectivity between OPS agents could be hard to diagnose without checking each agent individually, because the primary OPS agent would assume everything was fine until it heard otherwise, which would never happen if a follower was unable to communicate. Connectivity between OPS agents was dramatically improved by the fixes in OPS version 6.5.9. Once your MemSQL cluster is on any 6.X.X version, be sure to upgrade OPS to the most recent version, which is currently 6.7.5. For more information, see the OPS release notes: https://docs.memsql.com/memsql-ops/v6.7/memsql-ops-releases/

Note that if your MemSQL version is earlier than 5.8, you will have to upgrade to 5.8 before you can upgrade to any 6.X version. For more information on upgrading from version 5.8 to any 6.X version, see our docs: https://docs.memsql.com/operational-manual/v5.8/upgrading-memsql/
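
The version rule above can be expressed as a small check. This is only a sketch: the function name needs_58_first is hypothetical, and it assumes version strings of the form MAJOR.MINOR.PATCH, as in the Version column of memsql-ops memsql-list.

```shell
#!/bin/sh
# Hypothetical helper, not part of MemSQL or MemSQL Ops.
# Succeeds (exit status 0) when the given MemSQL version is earlier than 5.8,
# meaning an intermediate upgrade to 5.8 is needed before moving to 6.X.
needs_58_first() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text before the next dot
    [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -lt 8 ]; }
}

# Usage:
#   if needs_58_first 5.7.3; then echo "upgrade to 5.8 first"; fi
```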

As a side note, the message "You have mail in /var/spool/mail/root" has nothing to do with MemSQL or OPS. It is a system message, produced when a system-level task such as a cron job or a system security report generated output but had nowhere else to put it. For more information on reading this message, see this external article: https://askubuntu.com/questions/1058894/you-have-mail-in-var-spool-mail-root


#3

@sgl-memsql I actually don't have a cluster in my test environment. Everything runs on a single system with no internet access, so I am downloading the files and trying to upgrade from them. My current MemSQL Ops agent version is 5.8.2. I tried upgrading to 5.8.4, but I get the same timeout.

[15.5.0.v1:root@fte-shared-linux-export:2 ~]# memsql-ops agent-list
 ID       Host          Port  Role     State   Version 
 A8ef06d  10.237.248.7  9000  PRIMARY  ONLINE  5.8.2  

As you can see, the primary agent is online, and the master and leaf are in a good state:
[15.5.0.v1:root@fte-shared-linux-export:2 database]# memsql-ops memsql-list

 ID       Agent Id  Process State  Cluster State  Role    Host          Port   Version
 3FF9F61  A8ef06d   RUNNING        CONNECTED      MASTER  10.237.248.7  43306  5.8.10
 E1B2ED7  A8ef06d   RUNNING        CONNECTED      LEAF    10.237.248.7  43307  5.8.10

However, the upgrade fails with the same error, as shown below:

[15.5.0.v1:root@fte-shared-linux-export:2 database]# sudo memsql-ops agent-upgrade --file-path ./memsql-ops-5.8.4.tar.gz
Adding a MemSQL Ops file
Unpacking archive to determine its version
Copying file into MemSQL Ops data directory
Registering file with MemSQL Ops
Successfully added a MemSQL Ops file with version 5.8.4
Timed out while sending API message

I am not sure what is really causing this issue. Could it have something to do with the permissions on the folder?


#4

Any update on this? I am still having this issue.