Malloc_active_memory exceeds 10 GB and crashes cluster

We had our first MemSQL crash this weekend, and the logs indicate it was caused by insufficient memory. The cluster became healthy after we restarted our two data nodes, but we’d like to understand the error and prevent it from happening again.

By comparing the server status error dump from the tracelogs with actual SHOW STATUS data, we noticed that Malloc_active_memory stands out.
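For anyone who wants to repeat the comparison, here is a rough sketch of how the two snapshots can be diffed. It is only an illustration: the helper names (`parse_status_mb`, `largest_changes`) are made up for this post, the regular expression assumes the two line formats shown in the dumps further down, and only a few counters are pasted in to keep it short.

```python
import re

# Matches both the tracelog form ("Name :  12.3 (-1.0) MB") and the
# SHOW STATUS table form ("| Name | 12.3 (+1.0) MB |").
# Counters that are not reported in MB are skipped on purpose.
LINE_RE = re.compile(
    r"^\|?\s*(\w+)\s*[:|]\s*([0-9.]+)(?:\s*\([+-][0-9.]+\))?\s*MB\b"
)


def parse_status_mb(text):
    """Return {counter_name: megabytes} for every counter reported in MB."""
    values = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            values[match.group(1)] = float(match.group(2))
    return values


def largest_changes(before, after, top=3):
    """Rank counters present in both snapshots by absolute change in MB."""
    shared = before.keys() & after.keys()
    deltas = {name: after[name] - before[name] for name in shared}
    return sorted(deltas.items(), key=lambda item: abs(item[1]), reverse=True)[:top]


# A few lines copied from the crash dump (tracelog format).
crash_dump = """
Alloc_thread_stacks :  51.000 MB
Malloc_active_memory :  10127.341 (-5.677) MB
Buffer_manager_memory :  3036.6 (-6.0) MB
Alloc_variable :  806.250 MB
"""

# The same counters from SHOW STATUS after the restart (table format).
after_restart = """
| Alloc_thread_stacks   | 31.000 (+5.000) MB   |
| Malloc_active_memory  | 448.609 (+25.688) MB |
| Buffer_manager_memory | 2990.1 (+42.4) MB    |
| Alloc_variable        | 728.375 (+4.625) MB  |
"""

changes = largest_changes(parse_status_mb(crash_dump), parse_status_mb(after_restart))
for name, delta in changes:
    print(f"{name}: {delta:+.1f} MB")
```

With these four counters, Malloc_active_memory ranks first by a wide margin (a drop of roughly 9.7 GB after the restart), while the other counters move by less than 100 MB.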

At almost 10 GB, its usage differs a lot from the documented description:

Tracks memory allocated directly from the Linux OS and managed by the C runtime allocators (not MemSQL’s built-in memory allocators that use the Buffer Manager). The memory use here should be approximately 1-2 GBs for most workloads. Column store tables, open connections, and memory for metadata about tables, columns, etc. are the biggest consumers of memory.

What could be using all this memory?

We’re running MemSQL 6.7.12.

Error message: “Nonfatal buffer manager memory allocation failure. Memory use (13443.625000 MB) has reached the maximum_memory parameter (13539 MB).”
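To put the numbers in that message into perspective, here is a small sketch that pulls the two values out of the error text and works out the remaining headroom, plus the share of total memory taken by Malloc_active_memory. The regular expression is an assumption based on this one message, and the Malloc_active_memory figure is copied by hand from the server dump in this post.

```python
import re

message = (
    "Nonfatal buffer manager memory allocation failure. "
    "Memory use (13443.625000 MB) has reached the maximum_memory "
    "parameter (13539 MB)."
)

# Extract the current memory use and the configured limit, both in MB.
match = re.search(
    r"Memory use \(([0-9.]+) MB\).*maximum_memory parameter \(([0-9.]+) MB\)",
    message,
)
used_mb = float(match.group(1))
limit_mb = float(match.group(2))

# Memory still available below maximum_memory when the error was logged.
headroom_mb = limit_mb - used_mb

# Malloc_active_memory as reported in the server dump at crash time.
malloc_active_mb = 10127.341
malloc_share = malloc_active_mb / used_mb

print(f"headroom: {headroom_mb:.3f} MB")
print(f"Malloc_active_memory share of memory use: {malloc_share:.1%}")
```

By this calculation less than 100 MB was left below maximum_memory, and Malloc_active_memory alone accounted for about three quarters of the reported memory use.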

Server dump:

Threads_cached :  27
Threads_connected :  43
Threads_created :  50
Threads_running :  1
Threads_background :  1
Threads_idle :  20
Ready_queue :  0
Idle_queue :  0
Context_switches :  255657
Context_switch_misses :  5
Columnstore_ingest_management_estimated_segments_to_flush :  0
Columnstore_ingest_management_estimated_memory :  0.000 MB
Threads_waiting_for_disk_space :  0
Total_server_memory :  13443.6 (-99.9) MB
Total_io_pool_memory :  0.1 MB
Free_io_pool_memory :  0.0 MB
Alloc_thread_stacks :  51.000 MB
Malloc_active_memory :  10127.341 (-5.677) MB
Malloc_transaction_cached_memory :  323.758 MB
Buffer_manager_memory :  3036.6 (-6.0) MB
Buffer_manager_cached_memory :  63.5 (-5.6) MB
Buffer_manager_unrecycled_memory :  9.9 (+1.1) MB
Alloc_skiplist_tower :  530.750 MB
Alloc_variable :  806.250 MB
Alloc_table_primary :  1581.125 (-0.250) MB
Alloc_deleted_version :  31.625 MB
Alloc_internal_key_node :  4.000 MB
Alloc_hash_buckets :  19.274 MB
Alloc_table_metadata_cache :  0.375 MB
Alloc_code_generator :  0.125 (+0.125) MB
Alloc_unit_images :  5.288 (-51.088) MB
Alloc_unit_ifn_thunks :  0.293 (-1.933) MB
Alloc_object_code_images :  2.148 (-21.239) MB
Alloc_compiled_unit_sections :  1.352 (-13.769) MB
Alloc_databases_list_entry :  0.125 MB
Alloc_plan_cache :  0.500 MB
Alloc_warnings :  2.500 MB
Alloc_replication_large :  8.000 MB
Alloc_durability_large :  192.251 MB
Alloc_skynet_replication :  0.375 MB
Alloc_sharding_partitions :  0.125 MB
Alloc_log_replay :  0.155 MB
Alloc_mmap_file :  3072.000 MB
Alloc_protocol_packet :  5.250 MB
Alloc_profile_stats :  0.125 MB
Alloc_background_tasks :  0.000 (-1.375) MB
Alloc_table_memory :  2973.024 (-0.250) MB
Alloc_variable_bucket_16 :  allocs:659312  alloc_MB:10.1  buffer_MB:11.6  cached_buffer_MB:0.1
Alloc_variable_bucket_24 :  allocs:1565813  alloc_MB:35.8  buffer_MB:38.2  cached_buffer_MB:0.2
Alloc_variable_bucket_32 :  allocs:1142800  alloc_MB:34.9  buffer_MB:37.0  cached_buffer_MB:0.1
Alloc_variable_bucket_40 :  allocs:471952  alloc_MB:18.0  buffer_MB:19.5  cached_buffer_MB:0.2
Alloc_variable_bucket_48 :  allocs:190551  alloc_MB:8.7  buffer_MB:11.0  cached_buffer_MB:0.0
Alloc_variable_bucket_56 :  allocs:215578  alloc_MB:11.5  buffer_MB:13.9  cached_buffer_MB:0.0
Alloc_variable_bucket_64 :  allocs:199886  alloc_MB:12.2  buffer_MB:16.4  cached_buffer_MB:0.0
Alloc_variable_bucket_72 :  allocs:236094  alloc_MB:16.2  buffer_MB:21.9  cached_buffer_MB:0.0
Alloc_variable_bucket_80 :  allocs:192898  alloc_MB:14.7  buffer_MB:19.6  cached_buffer_MB:0.0
Alloc_variable_bucket_88 :  allocs:169288  alloc_MB:14.2  buffer_MB:18.9  cached_buffer_MB:0.0
Alloc_variable_bucket_104 :  allocs:303789  alloc_MB:30.1  buffer_MB:37.9  cached_buffer_MB:0.0
Alloc_variable_bucket_128 :  allocs:223656  alloc_MB:27.3  buffer_MB:35.2  cached_buffer_MB:0.0
Alloc_variable_bucket_160 :  allocs:104426  alloc_MB:15.9  buffer_MB:24.4  cached_buffer_MB:0.0
Alloc_variable_bucket_200 :  allocs:79602  alloc_MB:15.2  buffer_MB:25.9  cached_buffer_MB:0.0
Alloc_variable_bucket_248 :  allocs:102951  alloc_MB:24.3  buffer_MB:30.4  cached_buffer_MB:0.4
Alloc_variable_bucket_312 :  allocs:56504  alloc_MB:16.8  buffer_MB:18.2  cached_buffer_MB:0.9
Alloc_variable_bucket_384 :  allocs:30152  alloc_MB:11.0  buffer_MB:12.0  cached_buffer_MB:0.1
Alloc_variable_bucket_480 :  allocs:30081  alloc_MB:13.8  buffer_MB:14.6  cached_buffer_MB:0.2
Alloc_variable_bucket_600 :  allocs:15657  alloc_MB:9.0  buffer_MB:11.1  cached_buffer_MB:0.4
Alloc_variable_bucket_752 :  allocs:49068  alloc_MB:35.2  buffer_MB:42.6  cached_buffer_MB:0.5
Alloc_variable_bucket_936 :  allocs:29915  alloc_MB:26.7  buffer_MB:33.1  cached_buffer_MB:1.9
Alloc_variable_bucket_1168 :  allocs:38961  alloc_MB:43.4  buffer_MB:54.0  cached_buffer_MB:0.2
Alloc_variable_bucket_1480 :  allocs:30384  alloc_MB:42.9  buffer_MB:50.9  cached_buffer_MB:3.5
Alloc_variable_bucket_1832 :  allocs:26952  alloc_MB:47.1  buffer_MB:52.1  cached_buffer_MB:0.0
Alloc_variable_bucket_2288 :  allocs:16811  alloc_MB:36.7  buffer_MB:41.8  cached_buffer_MB:1.9
Alloc_variable_bucket_2832 :  allocs:13730  alloc_MB:37.1  buffer_MB:41.4  cached_buffer_MB:1.0
Alloc_variable_bucket_3528 :  allocs:6259  alloc_MB:21.1  buffer_MB:25.1  cached_buffer_MB:2.8
Alloc_variable_bucket_4504 :  allocs:3445  alloc_MB:14.8  buffer_MB:16.9  cached_buffer_MB:1.4
Alloc_variable_bucket_5680 :  allocs:1452  alloc_MB:7.9  buffer_MB:9.6  cached_buffer_MB:0.5
Alloc_variable_bucket_6224 :  allocs:352  alloc_MB:2.1  buffer_MB:2.8  cached_buffer_MB:0.0
Alloc_variable_bucket_7264 :  allocs:454  alloc_MB:3.1  buffer_MB:4.0  cached_buffer_MB:0.0
Alloc_variable_bucket_9344 :  allocs:577  alloc_MB:5.1  buffer_MB:6.1  cached_buffer_MB:0.6
Alloc_variable_bucket_11896 :  allocs:295  alloc_MB:3.3  buffer_MB:3.4  cached_buffer_MB:0.0
Alloc_variable_bucket_14544 :  allocs:32  alloc_MB:0.4  buffer_MB:0.5  cached_buffer_MB:0.0
Alloc_variable_bucket_18696 :  allocs:7  alloc_MB:0.1  buffer_MB:0.2  cached_buffer_MB:0.1
Alloc_variable_bucket_21816 :  allocs:4  alloc_MB:0.1  buffer_MB:0.1  cached_buffer_MB:0.0
Alloc_variable_bucket_26184 :  allocs:4  alloc_MB:0.1  buffer_MB:0.4  cached_buffer_MB:0.2
Alloc_variable_bucket_32728 :  allocs:0  alloc_MB:0.0  buffer_MB:0.1  cached_buffer_MB:0.1
Alloc_variable_bucket_43648 :  allocs:2  alloc_MB:0.1  buffer_MB:0.2  cached_buffer_MB:0.1
Alloc_variable_bucket_65472 :  allocs:2  alloc_MB:0.1  buffer_MB:0.2  cached_buffer_MB:0.1
Alloc_variable_bucket_130960 :  allocs:12  alloc_MB:1.5  buffer_MB:2.9  cached_buffer_MB:1.4
Alloc_variable_cached_buffers :  19.1 MB
Alloc_variable_allocated :  668.8 MB
GCed_versions_last_sweep :  0
Average_garbage_collection_duration :  1 ms

Status after recovering:

+-----------------------------------------------------------+---------------------------------------------------------------------+
| Variable_name                                             | Value                                                               |
+-----------------------------------------------------------+---------------------------------------------------------------------+
| Threads_cached                                            | 12                                                                  |
| Threads_connected                                         | 24                                                                  |
| Threads_created                                           | 30                                                                  |
| Threads_running                                           | 1                                                                   |
| Threads_background                                        | 1                                                                   |
| Threads_shutdown                                          | 0                                                                   |
| Threads_idle                                              | 6                                                                   |
| Ready_queue                                               | 0                                                                   |
| Idle_queue                                                | 0                                                                   |
| Context_switches                                          | 1396                                                                |
| Context_switch_misses                                     | 0                                                                   |
| Columnstore_ingest_management_estimated_segments_to_flush | 0                                                                   |
| Threads_waiting_for_disk_space                            | 0                                                                   |
| Total_io_pool_memory                                      | 0.1 MB                                                              |
| Free_io_pool_memory                                       | 0.0 MB                                                              |
| Alloc_thread_stacks                                       | 31.000 (+5.000) MB                                                  |
| Malloc_active_memory                                      | 448.609 (+25.688) MB                                                |
| Malloc_transaction_cached_memory                          | 323.758 MB                                                          |
| Buffer_manager_memory                                     | 2990.1 (+42.4) MB                                                   |
| Buffer_manager_cached_memory                              | 79.1 (+21.1) MB                                                     |
| Buffer_manager_unrecycled_memory                          | 0.0 MB                                                              |
| Alloc_skiplist_tower                                      | 530.625 MB                                                          |
| Alloc_variable                                            | 728.375 (+4.625) MB                                                 |
| Alloc_table_primary                                       | 1624.125 (+15.750) MB                                               |
| Alloc_deleted_version                                     | 17.875 MB                                                           |
| Alloc_internal_key_node                                   | 4.500 MB                                                            |
| Alloc_hash_buckets                                        | 19.274 MB                                                           |
| Alloc_table_metadata_cache                                | 0.250 MB                                                            |
| Alloc_unit_images                                         | 26.586 (+7.055) MB                                                  |
| Alloc_unit_ifn_thunks                                     | 1.110 (+0.231) MB                                                   |
| Alloc_object_code_images                                  | 11.982 (+2.894) MB                                                  |
| Alloc_compiled_unit_sections                              | 7.778 (+1.790) MB                                                   |
| Alloc_databases_list_entry                                | 0.125 MB                                                            |
| Alloc_plan_cache                                          | 0.125 MB                                                            |
| Alloc_warnings                                            | 1.000 MB                                                            |
| Alloc_replication_large                                   | 8.000 MB                                                            |
| Alloc_durability_large                                    | 192.251 MB                                                          |
| Alloc_skynet_replication                                  | 0.375 MB                                                            |
| Alloc_sharding_partitions                                 | 0.125 MB                                                            |
| Alloc_log_replay                                          | 0.155 MB                                                            |
| Alloc_mmap_file                                           | 3072.000 MB                                                         |
| Alloc_client_connection                                   | 0.625 (+0.500) MB                                                   |
| Alloc_protocol_packet                                     | 2.875 (+0.375) MB                                                   |
| Alloc_table_memory                                        | 2924.774 (+20.375) MB                                               |
| Alloc_variable_bucket_16                                  | allocs:710516  alloc_MB:10.8  buffer_MB:14.4  cached_buffer_MB:1.2  |
| Alloc_variable_bucket_24                                  | allocs:1612615  alloc_MB:36.9  buffer_MB:40.9  cached_buffer_MB:1.8 |
| Alloc_variable_bucket_32                                  | allocs:1161137  alloc_MB:35.4  buffer_MB:40.1  cached_buffer_MB:1.6 |
| Alloc_variable_bucket_40                                  | allocs:479966  alloc_MB:18.3  buffer_MB:20.6  cached_buffer_MB:0.4  |
| Alloc_variable_bucket_48                                  | allocs:196439  alloc_MB:9.0  buffer_MB:11.5  cached_buffer_MB:1.0   |
| Alloc_variable_bucket_56                                  | allocs:221881  alloc_MB:11.8  buffer_MB:14.8  cached_buffer_MB:0.6  |
| Alloc_variable_bucket_64                                  | allocs:204931  alloc_MB:12.5  buffer_MB:15.0  cached_buffer_MB:0.5  |
| Alloc_variable_bucket_72                                  | allocs:239196  alloc_MB:16.4  buffer_MB:18.9  cached_buffer_MB:0.2  |
| Alloc_variable_bucket_80                                  | allocs:194850  alloc_MB:14.9  buffer_MB:16.6  cached_buffer_MB:0.1  |
| Alloc_variable_bucket_88                                  | allocs:169778  alloc_MB:14.2  buffer_MB:16.1  cached_buffer_MB:0.2  |
| Alloc_variable_bucket_104                                 | allocs:304389  alloc_MB:30.2  buffer_MB:33.5  cached_buffer_MB:0.9  |
| Alloc_variable_bucket_128                                 | allocs:224319  alloc_MB:27.4  buffer_MB:30.9  cached_buffer_MB:0.5  |
| Alloc_variable_bucket_160                                 | allocs:104585  alloc_MB:16.0  buffer_MB:19.2  cached_buffer_MB:1.5  |
| Alloc_variable_bucket_200                                 | allocs:79605  alloc_MB:15.2  buffer_MB:17.9  cached_buffer_MB:2.0   |
| Alloc_variable_bucket_248                                 | allocs:103075  alloc_MB:24.4  buffer_MB:27.0  cached_buffer_MB:0.0  |
| Alloc_variable_bucket_312                                 | allocs:56571  alloc_MB:16.8  buffer_MB:18.8  cached_buffer_MB:0.0   |
| Alloc_variable_bucket_384                                 | allocs:30154  alloc_MB:11.0  buffer_MB:11.5  cached_buffer_MB:0.0   |
| Alloc_variable_bucket_480                                 | allocs:30094  alloc_MB:13.8  buffer_MB:14.6  cached_buffer_MB:0.4   |
| Alloc_variable_bucket_600                                 | allocs:15662  alloc_MB:9.0  buffer_MB:9.5  cached_buffer_MB:0.2     |
| Alloc_variable_bucket_752                                 | allocs:49073  alloc_MB:35.2  buffer_MB:35.8  cached_buffer_MB:0.0   |
| Alloc_variable_bucket_936                                 | allocs:29921  alloc_MB:26.7  buffer_MB:27.2  cached_buffer_MB:0.0   |
| Alloc_variable_bucket_1168                                | allocs:38966  alloc_MB:43.4  buffer_MB:44.1  cached_buffer_MB:0.0   |
| Alloc_variable_bucket_1480                                | allocs:30386  alloc_MB:42.9  buffer_MB:43.4  cached_buffer_MB:0.0   |
| Alloc_variable_bucket_1832                                | allocs:26959  alloc_MB:47.1  buffer_MB:47.8  cached_buffer_MB:0.0   |
| Alloc_variable_bucket_2288                                | allocs:16810  alloc_MB:36.7  buffer_MB:37.4  cached_buffer_MB:0.0   |
| Alloc_variable_bucket_2832                                | allocs:13729  alloc_MB:37.1  buffer_MB:38.0  cached_buffer_MB:0.0   |
| Alloc_variable_bucket_3528                                | allocs:6258  alloc_MB:21.1  buffer_MB:21.6  cached_buffer_MB:0.0    |
| Alloc_variable_bucket_4504                                | allocs:3444  alloc_MB:14.8  buffer_MB:15.1  cached_buffer_MB:0.0    |
| Alloc_variable_bucket_5680                                | allocs:1452  alloc_MB:7.9  buffer_MB:8.1  cached_buffer_MB:0.0      |
| Alloc_variable_bucket_6224                                | allocs:352  alloc_MB:2.1  buffer_MB:2.1  cached_buffer_MB:0.0       |
| Alloc_variable_bucket_7264                                | allocs:454  alloc_MB:3.1  buffer_MB:3.2  cached_buffer_MB:0.0       |
| Alloc_variable_bucket_9344                                | allocs:576  alloc_MB:5.1  buffer_MB:5.2  cached_buffer_MB:0.0       |
| Alloc_variable_bucket_11896                               | allocs:295  alloc_MB:3.3  buffer_MB:3.4  cached_buffer_MB:0.0       |
| Alloc_variable_bucket_14544                               | allocs:32  alloc_MB:0.4  buffer_MB:0.5  cached_buffer_MB:0.0        |
| Alloc_variable_bucket_18696                               | allocs:7  alloc_MB:0.1  buffer_MB:0.2  cached_buffer_MB:0.1         |
| Alloc_variable_bucket_21816                               | allocs:4  alloc_MB:0.1  buffer_MB:0.1  cached_buffer_MB:0.0         |
| Alloc_variable_bucket_26184                               | allocs:4  alloc_MB:0.1  buffer_MB:0.1  cached_buffer_MB:0.0         |
| Alloc_variable_bucket_43648                               | allocs:2  alloc_MB:0.1  buffer_MB:0.1  cached_buffer_MB:0.0         |
| Alloc_variable_bucket_65472                               | allocs:2  alloc_MB:0.1  buffer_MB:0.1  cached_buffer_MB:0.0         |
| Alloc_variable_bucket_130960                              | allocs:9  alloc_MB:1.1  buffer_MB:2.9  cached_buffer_MB:1.8         |
| Alloc_variable_cached_buffers                             | 15.1 (+2.8) MB                                                      |
| Alloc_variable_allocated                                  | 672.7 MB                                                            |
| GCed_versions_last_sweep                                  | 165946                                                              |
| Average_garbage_collection_duration                       | 13 ms                                                               |
+-----------------------------------------------------------+---------------------------------------------------------------------+

Can you send us a cluster report (SingleStoreDB Cloud · SingleStore Documentation) to bug-report@memsql.com?

Does your workload make use of fulltext indexes?

-Adam