linux – "service mysqld stop" times out (and then "mysqld dead but subsys locked")

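I installed mysql and mysql-server via yum on my 64-bit CentOS 5 server. It starts up fine, but when I try to stop it, the stop command hangs and I have to "Ctrl-C" it. Then I run "service mysqld status" and it shows: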
mysqld dead but subsys locked

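I run ps aux and mysql is nowhere to be found. Starting mysqld again via "service mysqld start" works fine. Trying to stop it produces the same problem.

I then realized that /var/lock/subsys/mysqld still exists. While mysqld was running, I checked /var/run/mysqld/mysqld.pid and it matches the pid of the running service.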
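The "dead but subsys locked" message follows from the combination described above: no live process, no pid file, but a lock file left behind. A simplified sketch of that decision logic is shown here; it is not the actual code from /etc/init.d/functions, and `subsys_state` is a hypothetical helper name. The return values follow the LSB status codes (0 running, 1 dead with pid file, 2 dead with lock file, 3 stopped).

```shell
#!/bin/bash
# Hypothetical, simplified version of the checks behind "service <name> status".
# Arguments: $1 = pid file path, $2 = lock file path.
subsys_state() {
    local pidfile=$1 lockfile=$2 pid
    if [ -f "$pidfile" ]; then
        pid=$(cat "$pidfile")
        # kill -0 sends no signal; it only tests whether the PID can be signalled.
        # Note: it also fails for a live process owned by another user when not root.
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "running"
            return 0
        fi
        echo "dead but pid file exists"
        return 1
    fi
    if [ -f "$lockfile" ]; then
        echo "dead but subsys locked"
        return 2
    fi
    echo "stopped"
    return 3
}

# Demo with temporary files standing in for the real paths.
tmp=$(mktemp -d)
touch "$tmp/mysqld.lock"        # lock file left behind, no pid file
subsys_state "$tmp/mysqld.pid" "$tmp/mysqld.lock"
rm -rf "$tmp"
```

On the affected server the real paths would be /var/run/mysqld/mysqld.pid and /var/lock/subsys/mysqld.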

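I tried reinstalling mysql and deleting all of its files and configuration, to no avail.

What should I do?

Edit:

I added some echo statements to the /etc/init.d/mysqld file, specifically in the stop function: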

stop(){
        if [ ! -f "$mypidfile" ]; then
            # not running; per LSB standards this is "ok"
            action $"Stopping $prog: " /bin/true
            return 0
        fi  
        echo "beginning stop sequence"
        MYSQLPID=`cat "$mypidfile"`
        if [ -n "$MYSQLPID" ]; then
            /bin/kill "$MYSQLPID" >/dev/null 2>&1
            ret=$?    # capture kill's status before echo overwrites $?
            echo "killing pid $MYSQLPID"
            if [ $ret -eq 0 ]; then
                echo "return code $ret after kill attempt"
                TIMEOUT="$STOPTIMEOUT"
                echo "timeout is set to $STOPTIMEOUT"
                while [ $TIMEOUT -gt 0 ]; do
                    /bin/kill -0 "$MYSQLPID" >/dev/null 2>&1 || break
                    sleep 1
                    let TIMEOUT=${TIMEOUT}-1
                    echo "timeout is now $TIMEOUT"
                done
                if [ $TIMEOUT -eq 0 ]; then
                    echo "Timeout error occurred trying to stop MySQL Daemon."
                    ret=1
                    action $"Stopping $prog: " /bin/false
                else
                    echo "attempting to del lockfile: $lockfile"
                    rm -f $lockfile
                    rm -f "$socketfile"
                    action $"Stopping $prog: " /bin/true
                fi
            else
                action $"Stopping $prog: " /bin/false
            fi
        else
            # failed to read pidfile, probably insufficient permissions
            action $"Stopping $prog: " /bin/false
            ret=4
        fi
        return $ret
}

This is the result I get when I try to stop the service:

[root@server]# service mysqld stop
beginning stop sequence
killing pid 9145
return code 0 after kill attempt
timeout is set to 60
timeout is now 59
timeout is now 58
timeout is now 57
timeout is now 56
timeout is now 55
timeout is now 54
timeout is now 53
timeout is now 52
timeout is now 51
timeout is now 50
timeout is now 49

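From reading the code, it appears that it never breaks out of the while loop and so never gets to delete the lock file. Am I interpreting this wrong? I checked the same file on my other server and it uses the same code. I'm dumbfounded.

Edit:
In the while loop portion

/bin/kill -0 "$MYSQLPID" >/dev/null 2>&1 || break

for some reason the return code is not recognized. When service mysqld stop is called, the process does get killed, but I'm not sure why the loop is not allowed to break.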
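The loop can only exit early if `kill -0` returns a nonzero status once the process is gone. A minimal sketch of the same wait logic follows, using the bash builtin `kill` rather than /bin/kill. This assumes the script runs under bash, and `wait_for_exit` is a hypothetical helper name, not part of the stock init script.

```shell
#!/bin/bash
# Hypothetical helper: wait up to $2 seconds for PID $1 to disappear.
# Returns 0 if the process went away, 1 if it is still there after the timeout.
wait_for_exit() {
    local pid=$1 timeout=$2
    while [ "$timeout" -gt 0 ]; do
        # Builtin kill -0: nonzero status means the PID can no longer be signalled.
        kill -0 "$pid" >/dev/null 2>&1 || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1
}

# Demo: start a background process, signal it, then wait for it to vanish.
sleep 30 &
demo_pid=$!
kill "$demo_pid"
wait "$demo_pid" 2>/dev/null    # reap the child so the PID really disappears
if wait_for_exit "$demo_pid" 5; then
    echo "stopped"
else
    echo "timeout"
fi
```

This only sidesteps a misbehaving external binary inside the loop; it does not explain why /bin/kill returns the wrong status.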

Edit:
Further testing shows some odd behavior between calling /bin/kill and calling just kill. They apparently return different codes. Why?

[root@server]# /bin/kill 25200
kill 25200: No such process
[user@server]# echo ${?}
0
[root@server]# kill 25200
-bash: kill: (25200) - No such process
[root@server]# echo ${?}
1
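In bash, plain `kill` is a shell builtin while /bin/kill is a separate binary, which is why the two can behave differently at all. The comparison above can be repeated in one go with a small script like the following sketch. It assumes a PID that has just exited and been reaped is not reused during the test; the variable names are illustrative only.

```shell
#!/bin/bash
# Obtain a PID that has just exited: start a trivial child and reap it.
true &
dead_pid=$!
wait "$dead_pid"

# Shell builtin kill: expected to fail (nonzero) for a PID that no longer exists.
kill -0 "$dead_pid" 2>/dev/null
builtin_rc=$?
echo "builtin kill -0 exit status: $builtin_rc"

# External binary, if present at the path the init script uses.
if [ -x /bin/kill ]; then
    /bin/kill -0 "$dead_pid" >/dev/null 2>&1
    external_rc=$?
    echo "/bin/kill -0 exit status:    $external_rc"
    if [ "$external_rc" -eq 0 ]; then
        echo "WARNING: /bin/kill reports success for a dead PID"
    fi
else
    echo "/bin/kill not found; skipped"
fi
```

On a healthy system both statuses should be nonzero; a 0 from /bin/kill matches the behavior seen in the transcript above.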

Edit: I logged in as a non-root user and tried "kill" and "/bin/kill", and the results were surprising:

[notroot@server ~]$kill -0 23232
-bash: kill: (23232) - No such process
[notroot@server ~]$echo $?
1
[notroot@server ~]$/bin/kill -0 23232
kill 23232: No such process
(No info could be read for "-p": geteuid()=501 but you should be root.)
[notroot@server ~]$echo $?
0

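When running kill and /bin/kill as a non-root user on my other server, the "No info could be read" error does not appear.

Edit: Added the logging described by quanta and checked the mysql log:

After a start and a stop, the mysql log shows: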

110918 00:11:28 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
110918  0:11:28 [Note] Plugin 'FEDERATED' is disabled.
110918  0:11:28  InnoDB: Initializing buffer pool, size = 16.0M
110918  0:11:28  InnoDB: Completed initialization of buffer pool
110918  0:11:29  InnoDB: Started; log sequence number 0 44233
110918  0:11:29 [Warning] 'user' entry 'root@server' ignored in --skip-name-resolve mode.
110918  0:11:29 [Warning] 'user' entry '@server' ignored in --skip-name-resolve mode.
110918  0:11:29 [Note] Event Scheduler: Loaded 0 events
110918  0:11:29 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.58-ius'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Distributed by The IUS Community Project
110918  0:11:34 [Note] /usr/libexec/mysqld: Normal shutdown

110918  0:11:34 [Note] Event Scheduler: Purging the queue. 0 events
110918  0:11:34  InnoDB: Starting shutdown...
110918  0:11:39  InnoDB: Shutdown completed; log sequence number 0 44233
110918  0:11:39 [Note] /usr/libexec/mysqld: Shutdown complete

110918 00:11:39 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

Then in tmp/mysql.log:

kill 23080: No such process
kill 23080: No such process
kill 23080: No such process
kill 23080: No such process
kill 23080: No such process
kill 23080: No such process
kill 23080: No such process
kill 23080: No such process
kill 23080: No such process
kill 23080: No such process

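I interrupted the stop process midway so I wouldn't have to wait for the timeout. It looks like the process was killed. I think the problem is still the different return codes from "kill" and "/bin/kill".

Solution

First things first: that was very thorough, systematic and complete debugging. Well done.

On my RHEL 5.6 box, if I try to kill a nonexistent pid, I always get a return code of 1. I tried as root and as an unprivileged user, both with the full path and with just the command name. I also only get the terse "kill XXX: No such process", with no verbose error messages.

It may be a good idea to run rpm -Vv util-linux and see whether someone has replaced /bin/kill with a "new and improved" version. Even if the rpm verification says the file is pristine, I would try renaming /bin/kill and copying the binary over from a working machine. If replacing the file helps and you cannot find a legitimate source for the change, then regardless of the output of the rpm verification I would assume the machine has been compromised.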
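The read-only part of that check could look like the sketch below. It assumes, as the answer does, that /bin/kill belongs to the util-linux package on this system; the md5sum line is an addition here, meant only for comparing by hand against the same file on a known-good machine. Nothing is renamed or replaced.

```shell
#!/bin/bash
# Read-only checks on the external kill binary; nothing is modified.
target=/bin/kill
pkg=util-linux    # assumption taken from the answer; may differ on other distributions

if command -v rpm >/dev/null 2>&1 && [ -e "$target" ]; then
    # Which package does the RPM database say owns the file?
    rpm -qf "$target"
    # Verbose verification of the package, filtered to the file of interest.
    rpm -Vv "$pkg" 2>/dev/null | grep -- "$target"
    # Checksum to compare manually with a known-good machine.
    md5sum "$target"
    checked=yes
else
    echo "rpm or $target not available on this system; nothing checked"
    checked=no
fi
```

A verification line for /bin/kill that shows flags such as S, 5 or T in place of dots indicates the file's size, checksum or modification time differs from what the package recorded.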
