MySQL crashes when querying certain tables

Problem description

We are running MariaDB 10.3.25:

$ mysql --version
mysql  Ver 15.1 Distrib 10.3.25-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2

Some of our database tables appear to be corrupted in some way.

Exhibit A:

MariaDB [etherpad]> select * from store;
ERROR 2013 (HY000): Lost connection to MySQL server during query

At the same time, this shows up in the log:

Jan 16 19:51:52 hostname mysqld[31236]: 2021-01-16 19:51:52 0x7f0c884b8700  InnoDB: Assertion failure in file /build/mariadb-10.3-RRxkin/mariadb-10.3-10.3.25/storage/innobase/row/row0sel.cc line 2972
Jan 16 19:51:52 hostname mysqld[31236]: InnoDB: Failing assertion: prebuilt->trx->isolation_level == TRX_ISO_READ_UNCOMMITTED
Jan 16 19:51:52 hostname mysqld[31236]: InnoDB: We intentionally generate a memory trap.
Jan 16 19:51:52 hostname mysqld[31236]: InnoDB: [...]
Jan 16 19:51:52 hostname mysqld[31236]: 210116 19:51:52 [ERROR] mysqld got signal 6 ;
Jan 16 19:51:52 hostname mysqld[31236]: This could be because you hit a bug. It is also possible that this binary
Jan 16 19:51:52 hostname mysqld[31236]: or one of the libraries it was linked against is corrupt, improperly built,
Jan 16 19:51:52 hostname mysqld[31236]: or misconfigured. This error can also be caused by malfunctioning hardware.
Jan 16 19:51:52 hostname mysqld[31236]: [...]
Jan 16 19:51:52 hostname mysqld[31236]: We will try our best to scrape up some info that will hopefully help
Jan 16 19:51:52 hostname mysqld[31236]: diagnose the problem, but since we have already crashed,
Jan 16 19:51:52 hostname mysqld[31236]: something is definitely wrong and this may fail.
Jan 16 19:51:52 hostname mysqld[31236]: Server version: 10.3.25-MariaDB-0+deb10u1-log
Jan 16 19:51:52 hostname mysqld[31236]: key_buffer_size=16777216
Jan 16 19:51:52 hostname mysqld[31236]: read_buffer_size=131072
Jan 16 19:51:52 hostname mysqld[31236]: max_used_connections=16
Jan 16 19:51:52 hostname mysqld[31236]: max_threads=153
Jan 16 19:51:52 hostname mysqld[31236]: thread_count=22
Jan 16 19:51:52 hostname mysqld[31236]: It is possible that mysqld could use up to
Jan 16 19:51:52 hostname mysqld[31236]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 352736 K  bytes of memory
Jan 16 19:51:52 hostname mysqld[31236]: Hope that's ok; if not, decrease some variables in the equation.
Jan 16 19:51:52 hostname mysqld[31236]: Thread pointer: 0x7f0c500093b8
Jan 16 19:51:52 hostname mysqld[31236]: Attempting backtrace. You can use the following information to find out
Jan 16 19:51:52 hostname mysqld[31236]: where mysqld died. If you see no messages after this, something went
Jan 16 19:51:52 hostname mysqld[31236]: terribly wrong...
Jan 16 19:51:52 hostname mysqld[31236]: stack_bottom = 0x7f0c884b7dd8 thread_stack 0x30000
Jan 16 19:51:52 hostname mysqld[31236]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x563337b2b05e]
Jan 16 19:51:52 hostname mysqld[31236]: /usr/sbin/mysqld(handle_fatal_signal+0x54d)[0x56333765e09d]
Jan 16 19:51:53 hostname mysqld[31236]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7f0c91ef1730]
Jan 16 19:51:53 hostname mysqld[31236]: /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x10b)[0x7f0c914ae7bb]
Jan 16 19:51:53 hostname mysqld[31236]: /lib/x86_64-linux-gnu/libc.so.6(abort+0x121)[0x7f0c91499535]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x4e3433)[0x5633373a2433]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x4d5d6c)[0x563337394d6c]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x9d8814)[0x563337897814]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x9dcdcf)[0x56333789bdcf]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x918681)[0x5633377d7681]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x127)[0x563337662db7]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z13rr_sequentialP11READ_RECORD+0x1c)[0x56333776a43c]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x1e3)[0x5633374bdf03]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0xaaa)[0x5633374e01ba]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_ZN4JOIN4execEv+0x33)[0x5633374e03d3]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_jP8st_orderS9_S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xef)[0x5633374deaaf]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x14d)[0x5633374df38d]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(+0x5c1d8c)[0x563337480d8c]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x5857)[0x56333748d087]
Jan 16 19:51:53 hostname mysqld[31236]: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x1c9)[0x56333748f879]
Jan 16 19:51:54 hostname mysqld[31236]: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x111d)[0x56333749172d]
Jan 16 19:51:54 hostname mysqld[31236]: /usr/sbin/mysqld(_Z10do_commandP3THD+0x122)[0x563337492e82]
Jan 16 19:51:54 hostname mysqld[31236]: /usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x23a)[0x5633375641ba]
Jan 16 19:51:54 hostname mysqld[31236]: /usr/sbin/mysqld(handle_one_connection+0x3d)[0x56333756433d]
Jan 16 19:51:55 hostname mysqld[31236]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x7fa3)[0x7f0c91ee6fa3]
Jan 16 19:51:55 hostname mysqld[31236]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f0c915704cf]
Jan 16 19:51:55 hostname mysqld[31236]: Trying to get some variables.
Jan 16 19:51:55 hostname mysqld[31236]: Some pointers may be invalid and cause the dump to abort.
Jan 16 19:51:55 hostname mysqld[31236]: Query (0x7f0c50012e20): select * from store
Jan 16 19:51:55 hostname mysqld[31236]: Connection ID (thread ID): 733
Jan 16 19:51:55 hostname mysqld[31236]: Status: NOT_KILLED
Jan 16 19:51:55 hostname mysqld[31236]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
Jan 16 19:51:55 hostname mysqld[31236]: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Jan 16 19:51:55 hostname mysqld[31236]: information that should help you find out what is causing the crash.
Jan 16 19:51:55 hostname mysqld[31236]: Writing a core file...
Jan 16 19:51:55 hostname mysqld[31236]: Working directory at /var/lib/mysql
Jan 16 19:51:55 hostname mysqld[31236]: Resource Limits:
Jan 16 19:51:55 hostname mysqld[31236]: Limit                     Soft Limit           Hard Limit           Units
Jan 16 19:51:55 hostname mysqld[31236]: Max cpu time              unlimited            unlimited            seconds
Jan 16 19:51:55 hostname mysqld[31236]: Max file size             unlimited            unlimited            bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max data size             unlimited            unlimited            bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max stack size            8388608              unlimited            bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max core file size        0                    unlimited            bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max resident set          unlimited            unlimited            bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max processes             15390                15390                processes
Jan 16 19:51:55 hostname mysqld[31236]: Max open files            65536                65536                files
Jan 16 19:51:55 hostname mysqld[31236]: Max locked memory         65536                65536                bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max address space         unlimited            unlimited            bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max file locks            unlimited            unlimited            locks
Jan 16 19:51:55 hostname mysqld[31236]: Max pending signals       15390                15390                signals
Jan 16 19:51:55 hostname mysqld[31236]: Max msgqueue size         819200               819200               bytes
Jan 16 19:51:55 hostname mysqld[31236]: Max nice priority         0                    0
Jan 16 19:51:55 hostname mysqld[31236]: Max realtime priority     0                    0
Jan 16 19:51:55 hostname mysqld[31236]: Max realtime timeout      unlimited            unlimited            us
Jan 16 19:51:55 hostname mysqld[31236]: Core pattern: core
Jan 16 19:52:02 hostname mysqld[6672]: [... innodb crash recovery ...]

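Something very similar happens with a few other tables. What I have tried:

  • I wanted to dump all the data, wipe the entire MariaDB installation, and restore. Unsurprisingly, mysqldump hits the same corruption (?), and the database crashes during the dump.
  • I tried following a guide that suggests creating a MyISAM table and populating it with the data from the InnoDB table, but that failed for the same reason.

Is there any way to fix this? Naturally, we need the data in these tables. It looks like the server crashes as soon as a query hits a certain record/block (I know nothing about the inner workings of MySQL). So how can we salvage the data?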


UPDATE 2021-01-18: As requested, here are the variables and status queries:

MariaDB [(none)]> show global variables like '%thread%';
+-----------------------------------------+---------------------------+
| Variable_name                           | Value                     |
+-----------------------------------------+---------------------------+
| aria_repair_threads                     | 1                         |
| binlog_optimize_thread_scheduling       | ON                        |
| debug_no_thread_alarm                   | OFF                       |
| innodb_encryption_threads               | 0                         |
| innodb_purge_threads                    | 4                         |
| innodb_read_io_threads                  | 4                         |
| innodb_thread_concurrency               | 0                         |
| innodb_thread_sleep_delay               | 10000                     |
| innodb_write_io_threads                 | 4                         |
| max_delayed_threads                     | 20                        |
| max_insert_delayed_threads              | 20                        |
| myisam_repair_threads                   | 1                         |
| performance_schema_max_thread_classes   | 50                        |
| performance_schema_max_thread_instances | -1                        |
| slave_domain_parallel_threads           | 0                         |
| slave_parallel_threads                  | 0                         |
| thread_cache_size                       | 8                         |
| thread_concurrency                      | 10                        |
| thread_handling                         | one-thread-per-connection |
| thread_pool_idle_timeout                | 60                        |
| thread_pool_max_threads                 | 65536                     |
| thread_pool_oversubscribe               | 3                         |
| thread_pool_prio_kickup_timer           | 1000                      |
| thread_pool_priority                    | auto                      |
| thread_pool_size                        | 1                         |
| thread_pool_stall_limit                 | 500                       |
| thread_stack                            | 196608                    |
| wsrep_slave_threads                     | 1                         |
+-----------------------------------------+---------------------------+
28 rows in set (0.001 sec)

MariaDB [(none)]> show global status like '%thread%';
+------------------------------------------+-------+
| Variable_name                            | Value |
+------------------------------------------+-------+
| Delayed_insert_threads                   | 0     |
| Performance_schema_thread_classes_lost   | 0     |
| Performance_schema_thread_instances_lost | 0     |
| Slow_launch_threads                      | 0     |
| Threadpool_idle_threads                  | 0     |
| Threadpool_threads                       | 0     |
| Threads_cached                           | 7     |
| Threads_connected                        | 12    |
| Threads_created                          | 98    |
| Threads_running                          | 6     |
| wsrep_applier_thread_count               | 0     |
| wsrep_rollbacker_thread_count            | 0     |
| wsrep_thread_count                       | 0     |
+------------------------------------------+-------+
13 rows in set (0.001 sec)

MariaDB [(none)]> show global variables like '%timeout%';
+---------------------------------------+----------+
| Variable_name                         | Value    |
+---------------------------------------+----------+
| connect_timeout                       | 10       |
| deadlock_timeout_long                 | 50000000 |
| deadlock_timeout_short                | 10000    |
| delayed_insert_timeout                | 300      |
| idle_readonly_transaction_timeout     | 0        |
| idle_transaction_timeout              | 0        |
| idle_write_transaction_timeout        | 0        |
| innodb_flush_log_at_timeout           | 1        |
| innodb_lock_wait_timeout              | 50       |
| innodb_rollback_on_timeout            | OFF      |
| interactive_timeout                   | 28800    |
| lock_wait_timeout                     | 86400    |
| net_read_timeout                      | 600      |
| net_write_timeout                     | 600      |
| rpl_semi_sync_master_timeout          | 10000    |
| rpl_semi_sync_slave_kill_conn_timeout | 5        |
| slave_net_timeout                     | 60       |
| thread_pool_idle_timeout              | 60       |
| wait_timeout                          | 28800    |
+---------------------------------------+----------+
19 rows in set (0.001 sec)

MariaDB [(none)]> show global status like '%timeout%';
+-------------------------------------+-------+
| Variable_name                       | Value |
+-------------------------------------+-------+
| binlog_group_commit_trigger_timeout | 0     |
| Master_gtid_wait_timeouts           | 0     |
| Ssl_default_timeout                 | 0     |
| Ssl_session_cache_timeouts          | 0     |
+-------------------------------------+-------+
4 rows in set (0.001 sec)

MariaDB [(none)]> show global status like '%aborted%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Aborted_clients  | 3     |
| Aborted_connects | 0     |
+------------------+-------+
2 rows in set (0.001 sec)

The server has 5 GB of RAM.


About the store table:

MariaDB [etherpad]> show create table store;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| store | CREATE TABLE `store` (
  `key` varchar(100) COLLATE utf8_bin NOT NULL DEFAULT '',
  `value` longtext COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`key`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.000 sec)

MariaDB [etherpad]> select count(*) from store;
+----------+
| count(*) |
+----------+
|   779443 |
+----------+
1 row in set (1 min 19.378 sec)

Here is the iostat output:

$ iostat -xm 5 3
Linux 4.14.0-0.bpo.3-amd64 (hostname)      01/18/2021      _x86_64_        (1 cpu)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.63    2.39   16.53   22.68    0.23   49.54

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
xvdap2           7.67   37.91      0.07      0.67     0.13    37.91   1.67  50.00   16.35    2.54   0.05     9.40    18.01   4.35  19.82
xvdap1           0.51    1.25      0.00      0.01     0.02     0.07   3.58   5.64    7.52   27.01   0.03     4.15     4.24   1.63   0.29

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          18.51    2.21   15.49   55.33    0.40    8.05

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
xvdap2           4.00  157.80      0.02      1.53     0.00    71.00   0.00  31.03    5.80   55.33   7.93     4.00     9.92   4.37  70.72
xvdap1           0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           8.96    2.44   15.68   15.27    0.41   57.23

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
xvdap2           0.00   22.20      0.00      0.40     0.00    35.40   0.00  61.46    0.00   22.81   0.30     0.00    18.27   4.11   9.12
xvdap1           0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00

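UPDATE 2021-01-24: I tried to pinpoint the problem by quasi-bisecting the table with a limit clause and found that, out of roughly 800,000 records, the database crashes on every select of a record after 663,187. A few records before 663,187, one contains seemingly garbled data; see below.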

MariaDB [etherpad]> select * from store limit 663184,1\G;
*************************** 1. row ***************************
  key:
value:
                     f[Y
                                       f[팩

Doesn't this hint at data corruption? What can I do about it? Get rid of these records?
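The quasi-bisection with a limit clause described above could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes every offset below some threshold is readable and every offset from the threshold on crashes the server, which matches what was observed here. The names first_failing_offset and mysql_probe are hypothetical helpers, not part of any MariaDB tooling, and mysql_probe assumes the mysql client can log in without prompting.

```python
import subprocess
from typing import Callable


def first_failing_offset(probe: Callable[[int], bool], total: int) -> int:
    """Return the smallest offset in [0, total) for which probe() fails.

    Assumes probe(offset) is True below some threshold and False from the
    threshold on. Returns total if no offset fails.
    """
    lo, hi = 0, total
    while lo < hi:
        mid = (lo + hi) // 2
        if probe(mid):
            lo = mid + 1  # mid is readable, so the first failure lies above it
        else:
            hi = mid      # mid fails, so the first failure is mid or below
    return lo


def mysql_probe(offset: int) -> bool:
    """Hypothetical probe: True if the row at this offset can be read.

    Relies on the mysql client exiting non-zero when the query fails,
    for example with ERROR 2013 after the server has crashed.
    """
    query = f"SELECT `key` FROM store LIMIT {offset},1"
    result = subprocess.run(
        ["mysql", "--batch", "--execute", query, "etherpad"],
        capture_output=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Simulated probe standing in for the real server: offsets below
    # 663187 are readable, everything from there on "crashes".
    def simulated(offset: int) -> bool:
        return offset < 663187

    print(first_failing_offset(simulated, 779443))  # prints 663187
```

With about 780,000 rows this needs roughly 20 probes. Against the real server, each failed probe crashes mysqld, so a real run would have to wait for InnoDB crash recovery to finish before issuing the next probe; the sketch does not handle that.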

Solution

From the information available so far, consider the following for the [mysqld] section of your my.cnf:

innodb_buffer_pool_size=2G  # to use 40% of available RAM
REMOVE thread_cache_size to allow default sizing (or set it to 256)
REMOVE thread_stack to allow default calc of slightly larger thread_stack per ref man

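Once we know the number of cores/CPUs, we may be able to offer further suggestions.

When more than 1 CPU is available, additional suggestions can be provided.

Your query, select count(*) from store;, might complete in under a minute if you tried SELECT COUNT(`key`) FROM store; instead, which should read only the index rather than every row.

Have a great 2021.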
