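Tarantool fails to start due to a disk write error

Problem description

I am trying to start Tarantool in Docker from scratch (no existing data). I used the Docker command suggested in the tutorial and ran it under Docker Desktop 2.4.0.0 on macOS 10.15.6 (Catalina):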

docker run \
  --name mytarantool \
  -d -p 3301:3301 \
  -v /data/dir/on/host:/var/lib/tarantool \
  tarantool/tarantool:2.5.1

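(/data/dir/on/host was replaced with a local directory on my laptop.) I also tried the latest version, 2.6.0.

The container terminates shortly after starting. docker logs shows the following: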

2020-10-02 20:51:10.331 [1] main/103/tarantool-entrypoint.lua C> Tarantool 2.6.0-0-g47aa4e01e
2020-10-02 20:51:10.331 [1] main/103/tarantool-entrypoint.lua C> log level 5
2020-10-02 20:51:10.332 [1] main/103/tarantool-entrypoint.lua I> mapping 268435456 bytes for memtx tuple arena...
2020-10-02 20:51:10.332 [1] main/103/tarantool-entrypoint.lua I> mapping 134217728 bytes for vinyl tuple arena...
2020-10-02 20:51:10.335 [1] main/103/tarantool-entrypoint.lua I> instance uuid 1811ff01-13d1-45c8-9878-0974bf27ee40
2020-10-02 20:51:10.335 [1] iproto/101/main I> binary: bound to 0.0.0.0:3301
2020-10-02 20:51:10.335 [1] main/103/tarantool-entrypoint.lua I> initializing an empty data directory
2020-10-02 20:51:10.351 [1] main/103/tarantool-entrypoint.lua I> assigned id 1 to replica 1811ff01-13d1-45c8-9878-0974bf27ee40
2020-10-02 20:51:10.351 [1] main/103/tarantool-entrypoint.lua I> cluster uuid 12ca546b-29ea-4af3-a407-f24e91c0e636
2020-10-02 20:51:10.357 [1] snapshot/101/main I> saving snapshot `/var/lib/tarantool/00000000000000000000.snap.inprogress'
2020-10-02 20:51:10.361 [1] snapshot/101/main I> done
2020-10-02 20:51:10.364 [1] main/103/tarantool-entrypoint.lua I> ready to accept requests
2020-10-02 20:51:10.365 [1] main/103/tarantool-entrypoint.lua I> set 'log_level' configuration option to 5
2020-10-02 20:51:10.365 [1] main/105/checkpoint_daemon I> scheduled next checkpoint for Fri Oct  2 22:10:11 2020
2020-10-02 20:51:10.367 [1] main/103/tarantool-entrypoint.lua I> set 'listen' configuration option to "3301"
2020-10-02 20:51:10.367 [1] main/103/tarantool-entrypoint.lua I> set 'log_format' configuration option to "plain"
2020-10-02 20:51:10.384 [1] wal/101/main xlog.c:1026 !> SystemError /var/lib/tarantool/00000000000000000000.xlog: can't allocate disk space: Invalid argument
2020-10-02 20:51:10.384 [1] main/103/tarantool-entrypoint.lua txn.c:876 E> ER_WAL_IO: Failed to write to disk
2020-10-02 20:51:10.391 [1] main txn.c:876 E> ER_WAL_IO: Failed to write to disk
2020-10-02 20:51:10.391 [1] main F> Fatal error, exiting the event loop

Meanwhile, the container did manage to create a 5.9K 00000000000000000000.snap file and a 97B 00000000000000000000.xlog file:

$ ls -hal
total 24
drwxr-xr-x@ 4 user  staff   128B  2 Oct 13:51 .
drwxr-xr-x  3 user  staff    96B  2 Oct 12:56 ..
-rw-r--r--  1 user  staff   5.9K  2 Oct 13:51 00000000000000000000.snap
-rw-r--r--  1 user  staff    97B  2 Oct 13:51 00000000000000000000.xlog

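If I start the container without mounting a local directory, it starts successfully.

I assume something is wrong with my local file system (or with how it is seen from inside the container), or perhaps it is a permissions problem, but I can't figure out what exactly.

If I exec a shell in the successfully started container, I see that the xlog file is larger and that the files are owned by tarantool:tarantool: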

$ docker exec -it 016 sh
/opt/tarantool # ls -hal /var/lib/tarantool/
total 1044
drwxr-xr-x    2 tarantoo tarantoo    4.0K Oct  2 20:40 .
drwxr-xr-x    1 root     root        4.0K Aug  2 16:31 ..
-rw-r--r--    1 tarantoo tarantoo    5.9K Oct  2 20:40 00000000000000000000.snap
-rw-r--r--    1 tarantoo tarantoo     273 Oct  2 20:40 00000000000000000000.xlog

But with the directory bind-mounted, it looks different:

$ docker run -it -p 3031:3031 -v /Users/user/project/storage:/var/lib/tarantool tarantool/tarantool:2.6.0 sh
/opt/tarantool # ls -hal /var/lib/tarantool/
total 16
drwxr-xr-x    4 tarantoo root         128 Oct  2 21:18 .
drwxr-xr-x    1 root     root        4.0K Aug  2 16:31 ..
-rw-r--r--    1 root     root        5.9K Oct  2 21:18 00000000000000000000.snap
-rw-r--r--    1 root     root          97 Oct  2 21:18 00000000000000000000.xlog

I tried changing the owner of the directory and files:

$ docker run -it -p 3031:3031 -v /Users/user/project/storage:/var/lib/tarantool tarantool/tarantool:2.6.0 sh
/opt/tarantool # chown tarantool:tarantool -R /var/lib/tarantool/

and checked that the change persists after a container restart:

$ docker run -it -p 3031:3031 -v /Users/user/project/storage:/var/lib/tarantool tarantool/tarantool:2.6.0 sh
/opt/tarantool # ls -hal /var/lib/tarantool/
total 16
drwxr-xr-x    4 tarantoo tarantoo     128 Oct  2 21:18 .
drwxr-xr-x    1 root     root        4.0K Aug  2 16:31 ..
-rw-r--r--    1 tarantoo tarantoo    5.9K Oct  2 21:18 00000000000000000000.snap
-rw-r--r--    1 tarantoo tarantoo      97 Oct  2 21:18 00000000000000000000.xlog

Now the permissions look the same as in the working container. But starting the container normally still ends with the same problem:

$ docker run -it -p 3031:3031 -v /Users/user/project/storage:/var/lib/tarantool tarantool/tarantool:2.6.0
Creating configuration file: /etc/tarantool/config.yml
Config:
---
force_recovery: false
memtx_dir: /var/lib/tarantool
listen: 3301
pid_file: /var/run/tarantool/tarantool.pid
vinyl_dir: /var/lib/tarantool
wal_dir: /var/lib/tarantool
2020-10-02 21:22:29.680 [1] main/103/tarantool-entrypoint.lua C> Tarantool 2.6.0-0-g47aa4e01e
2020-10-02 21:22:29.681 [1] main/103/tarantool-entrypoint.lua C> log level 5
2020-10-02 21:22:29.685 [1] main/103/tarantool-entrypoint.lua I> mapping 268435456 bytes for memtx tuple arena...
2020-10-02 21:22:29.685 [1] main/103/tarantool-entrypoint.lua I> mapping 134217728 bytes for vinyl tuple arena...
2020-10-02 21:22:29.687 [1] main/103/tarantool-entrypoint.lua I> instance uuid 74d33452-7f39-4ebf-a2f7-c1da6cb8c54b
2020-10-02 21:22:29.691 [1] main/103/tarantool-entrypoint.lua I> instance vclock {}
2020-10-02 21:22:29.691 [1] iproto/101/main I> binary: bound to 0.0.0.0:3301
2020-10-02 21:22:29.693 [1] main/103/tarantool-entrypoint.lua I> recovery start
2020-10-02 21:22:29.694 [1] main/103/tarantool-entrypoint.lua I> recovering from `/var/lib/tarantool/00000000000000000000.snap'
2020-10-02 21:22:29.695 [1] main/103/tarantool-entrypoint.lua I> cluster uuid 258099e8-803c-4a10-a3e0-57f6cd796f18
2020-10-02 21:22:29.708 [1] main/103/tarantool-entrypoint.lua I> assigned id 1 to replica 74d33452-7f39-4ebf-a2f7-c1da6cb8c54b
2020-10-02 21:22:29.709 [1] main/103/tarantool-entrypoint.lua I> recover from `/var/lib/tarantool/00000000000000000000.xlog'
2020-10-02 21:22:29.710 [1] main/103/tarantool-entrypoint.lua recovery.cc:156 W> file `/var/lib/tarantool/00000000000000000000.xlog` wasn't correctly closed
2020-10-02 21:22:29.713 [1] main/103/tarantool-entrypoint.lua I> ready to accept requests
2020-10-02 21:22:29.713 [1] main/103/tarantool-entrypoint.lua C> leaving orphan mode
2020-10-02 21:22:29.713 [1] main/103/tarantool-entrypoint.lua I> set 'log_level' configuration option to 5
2020-10-02 21:22:29.713 [1] main/105/checkpoint_daemon I> scheduled next checkpoint for Fri Oct  2 23:01:09 2020
2020-10-02 21:22:29.720 [1] main/103/tarantool-entrypoint.lua I> set 'listen' configuration option to "3301"
2020-10-02 21:22:29.720 [1] main/103/tarantool-entrypoint.lua I> set 'log_format' configuration option to "plain"
2020-10-02 21:22:29.723 [1] wal/101/main xlog.c:1026 !> SystemError /var/lib/tarantool/00000000000000000000.xlog: can't allocate disk space: Invalid argument
2020-10-02 21:22:29.723 [1] main/103/tarantool-entrypoint.lua txn.c:876 E> ER_WAL_IO: Failed to write to disk
2020-10-02 21:22:29.726 [1] main txn.c:876 E> ER_WAL_IO: Failed to write to disk
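2020-10-02 21:22:29.726 [1] main F> Fatal error, exiting the event loop

Strangely, if I check the permissions locally on my laptop, they have not changed at all. I thought this was some Docker magic, but given that the ownership change persists across container restarts, I'm not sure how it works.

But maybe the problem has nothing to do with permissions... what then? How can I fix it?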

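Solution

Answering my own question.

Apparently the problem has nothing to do with permissions; it lies at the file system virtualization level.

It can be fixed by turning off the "gRPC FUSE" feature in the Docker Desktop preferences. Presumably (given the message can't allocate disk space: Invalid argument), the problem is that this implementation does not support fallocate() with certain arguments (see https://github.com/docker/for-mac/issues/4964#issuecomment-702748937).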
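To check whether a given mount has this limitation, here is a minimal Python sketch that calls fallocate() on a scratch file and reports the errno. It rests on assumptions that the logs above do not confirm: that the failing call uses the FALLOC_FL_KEEP_SIZE mode, that it runs on 64-bit Linux (for example inside a container with the same bind mount), and that Python is available there. The name probe_fallocate is made up for illustration.

```python
import ctypes
import os
import tempfile

# Value taken from <linux/falloc.h>. Assumption: the failing call uses this
# mode; the log output above does not say which flags were passed.
FALLOC_FL_KEEP_SIZE = 0x01


def probe_fallocate(directory, length=1 << 20):
    """Try fallocate(KEEP_SIZE) on a scratch file inside `directory`.

    Returns 0 on success, otherwise the errno set by the failed call.
    """
    # Load the C library of the running process (Linux only).
    libc = ctypes.CDLL(None, use_errno=True)
    # int fallocate(int fd, int mode, off_t offset, off_t len);
    # off_t is assumed to be 64 bits wide (64-bit Linux).
    libc.fallocate.argtypes = [
        ctypes.c_int, ctypes.c_int, ctypes.c_int64, ctypes.c_int64,
    ]
    libc.fallocate.restype = ctypes.c_int

    fd, path = tempfile.mkstemp(dir=directory)
    try:
        if libc.fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, length) == 0:
            return 0
        return ctypes.get_errno()
    finally:
        # Always clean up the scratch file, whatever the outcome.
        os.close(fd)
        os.unlink(path)
```

Calling probe_fallocate("/var/lib/tarantool") inside a container with the bind mount would be expected to return errno.EINVAL (matching "Invalid argument") if the mount has the same limitation, and 0 on a file system that supports the call.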

Update: if you want to use the gRPC FUSE feature for file sharing, consider updating to version 2.4.2.0, in which this issue has been resolved.