Errors with memsql-admin on a fresh installation of MemSQL 6.7.11

Hi,

We have installed MemSQL 6.7.11 following the latest official guide.
We have created a user “memuser” with sudo privileges and used it to launch the “memsql-deploy setup-cluster” command.

The command completed successfully and returned the following:

Registered hosts
memsql-deploy will perform the following actions:
· Install memsql-server 6.7.11 on hosts
- x.x.x.x
- x.x.x.x
- x.x.x.x
- x.x.x.x
- x.x.x.x
- x.x.x.x
· Deploy a master aggregator on x.x.x.x:3306
- Enable high availability mode
· Deploy a child aggregator on x.x.x.x:3306
· Deploy leaf nodes on
- x.x.x.x:3306
- x.x.x.x:3306
- x.x.x.x:3306
- x.x.x.x:3306
· Set MemSQL root password on all nodes

Would you like to continue? [y/N]: y
Downloaded memsql-server 6.7.11
Installing MemSQL on all hosts…
Installed memsql-server6.7.11-5d2517b77a on host x.x.x.x (1/6)
Installed memsql-server6.7.11-5d2517b77a on host x.x.x.x (2/6)
Installed memsql-server6.7.11-5d2517b77a on host x.x.x.x (3/6)
Installed memsql-server6.7.11-5d2517b77a on host x.x.x.x (4/6)
Installed memsql-server6.7.11-5d2517b77a on host x.x.x.x (5/6)
Installed memsql-server6.7.11-5d2517b77a on host x.x.x.x (6/6)
Successfully installed on 6 hosts
Created master node
Successfully set license
Bootstrapped master aggregator
Enabled high availability mode
Created aggregator nodes
Added aggregators nodes to cluster
Created leaf nodes
Added leaf nodes to cluster

To view your cluster, run ‘memsql-admin list-nodes’

.
.
.
However, whenever we run ‘memsql-admin list-nodes’, it returns the following:

✘ Failed to list nodes on all hosts: failed to list nodes on 6 hosts: x.x.x.x, x.x.x.x, x.x.x.x, x.x.x.x, x.x.x.x, x.x.x.x
No nodes found

.
.
.
Running “sudo memsqlctl …” returns the data successfully and shows the cluster as up and running.

After troubleshooting, it seems memsql-admin is trying to launch memsqlctl as “memuser”, because memsql-report returns the following error:

✘ Failed to collect informationSchemaPipelines: error running memsqlctl: error running command: "/usr/bin/memsqlctl" "--json" "--yes" "list-nodes" "--"Master": exit status 1
stderr: memsqlctl is configured to require that users be in the ‘memsql’ group for read access. Please rerun the command as a user in this group or theuser: open /var/lib/memsql/nodes.hcl.lock: permission denied

.
.
.
If we configure memsql-admin to implicitly run as root, everything works fine. However, we would prefer not to do that, and MemSQL doesn’t require it.

Is the above behavior a bug? Or have we missed something?

Thank you.

One more detail: we can SSH to every host using the identity key of the “memuser” user.
This user has sudo privileges on all hosts. When we call “sudo memsqlctl”, all commands complete successfully, but when we call “memsqlctl” without sudo, we receive errors similar to the above:

memsqlctl failed to load the MemSQL config file ‘/memsql/base/memsql.cnf’ referenced by the node metadata file at ‘/var/lib/memsql/nodes.hcl’.

.
.
.
These are the permissions on that file:

-rw------- 1 memsql memsql 501 Feb 20 15:50 /memsql/base/memsql.cnf

.
.
.
“ps -ef” returns several lines similar to the one below (which I guess is memsql-admin connecting to the different hosts):

memuser 23798 1 0 16:53 pts/0 00:00:00 /usr/bin/ssh -oControlMaster=yes -oControlPath=/run/user/1007/memsql-toolbox522622473/1.socket -N -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=60 -oServerAliveCountMax=5 -oBatchMode=no -oIdentityFile=/home/memuser/.ssh/id_rsa x.x.x.x

Thank you for including so much detail in your post. It really helped me narrow down what might be happening here. To fix your issue, give the memsql group read permissions on all node directories and files.

Detailed notes follow:

As expected, memsql-deploy setup-cluster created a memsql system user and memsql group on each host and added the admin user (memuser in this case) to the group. This should have allowed the memuser user to perform operations such as reading config files, since memsqlctl will create these files owned by memsql:memsql with the permission bits set to 644.

In your case, however, these files were created without read permissions for the memsql group, which would explain the errors from memsql-admin and memsqlctl.
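
If it helps to confirm this on a host, here is a minimal Python sketch that walks a directory and lists every directory and file whose mode lacks the group-read bit. The function names are hypothetical and not part of any MemSQL tool; the /memsql/base path is the one from your post, and the script must be run by a user who can already traverse that tree (for example via sudo).

```python
import os
import stat


def lacks_group_read(mode):
    """True when the group-read bit is not set in a st_mode value."""
    return not (mode & stat.S_IRGRP)


def find_unreadable_by_group(root):
    """Return (path, mode string) for each directory and file under root
    (root included) that the owning group cannot read."""
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, n) for n in filenames]
        for path in candidates:
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # symlink modes are not meaningful here
            if lacks_group_read(mode):
                flagged.append((path, stat.filemode(mode)))
    return flagged


if __name__ == "__main__":
    # Path taken from the post; adjust to your own node directories.
    for path, mode in find_unreadable_by_group("/memsql/base"):
        print(mode, path)
```

A file such as the memsql.cnf shown earlier, with mode -rw-------, would be reported by this check; one with -rw-r----- or -rw-r--r-- would not.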

I could reproduce this behavior by setting my default umask to 077. If you’re using something like a default umask or ACLs to manage permissions, you can avoid these errors in the future by allowing the memsql group to read files in the /memsql/base directory.
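
To illustrate why the umask matters, here is a small Python sketch of the arithmetic. It assumes the files are requested with mode 644 (as described above) and that the process umask is applied when they are created; `effective_mode` is a hypothetical helper, not something memsqlctl exposes.

```python
def effective_mode(requested, umask):
    """Mode bits a new file ends up with: requested bits minus the umask bits."""
    return requested & ~umask


if __name__ == "__main__":
    for umask in (0o022, 0o027, 0o077):
        result = effective_mode(0o644, umask)
        print(f"umask {umask:03o}: 0644 -> {result:04o}")
```

With a umask of 077 the group and other bits are all cleared, leaving 0600, which matches the -rw------- listing in the original post. A umask of 027 leaves 0640, so the memsql group keeps read access.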

Thank you very much. This was indeed our issue.
The umask was set to 077 as part of our hardening procedure.

We tried granting the permissions on all node directories and files as you suggested, and we also tried changing the umask to 027, cleaning up, and reinstalling. Both methods were successful.

Thanks again for your support.