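Problem description
I am finding it very hard to follow the Ray guides for running a Docker image on a Ray cluster in order to execute a Python script. There seems to be a lack of simple working examples. Here is my Dockerfile: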
FROM rayproject/ray
WORKDIR /usr/src/app
COPY . .
CMD ["step_1.py"]
ENTRYPOINT ["python3"]
I use it to build the container image and push it to Docker Hub ("myimage" is just an example):
docker build -t myimage .
docker push myimage
step_1.py prints "hello" once per second for 200 seconds:
import time
for i in range(200):
    time.sleep(1)
    print("hello")
Here is my config.yaml. Again, very simple:
cluster_name: simple-1
min_workers: 0
max_workers: 2
docker:
    image: "myimage"
    container_name: "my_simple_docker_container"
    pull_before_run: True
idle_timeout_minutes: 5
provider:
    type: aws
    region: eu-west-2
    availability_zone: eu-west-2a
file_mounts_sync_continuously: False
auth:
    ssh_user: ubuntu
    ssh_private_key: /home/user/.ssh/aws_ubuntu_test.pem
head_node:
    InstanceType: c5.2xlarge
    ImageId: ami-xxxxx826a6b31fd2c
    KeyName: aws_ubuntu_test
    BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
              VolumeSize: 200
worker_nodes:
    InstanceType: c5.2xlarge
    ImageId: ami-xxxxx826a6b31fd2c
    KeyName: aws_ubuntu_test
    InstanceMarketOptions:
        MarketType: spot
head_setup_commands:
    - pip install boto3==1.4.8
worker_setup_commands: []
head_start_ray_commands:
    - ray stop
    - ulimit -n 65536; ray start --head --port=6379 --object-manager-port=8076 --autoscaling-config=~/ray_bootstrap_config.yaml
worker_start_ray_commands:
    - ray stop
    - ulimit -n 65536; ray start --address=$RAY_HEAD_IP:6379 --object-manager-port=8076
In the terminal I typed:
ray up simple1.yaml
and this error appears every time:
Shared connection to x.x.xx.119 closed.
"docker cp" requires exactly 2 arguments.
See 'docker cp --help'.
Usage:  docker cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH|-
        docker cp [OPTIONS] SRC_PATH|- CONTAINER:DEST_PATH
Copy files/folders between a container and the local filesystem
Shared connection to x.x.xx.119 closed.
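The same Docker image runs fine on any other remote machine, just not on the Ray cluster.
If anyone can help me I will be eternally grateful, and I even promise to write a tutorial on Medium once I am through this struggle.
Solution
I think the problem may be the use of ENTRYPOINT. The Ray Cluster Launcher starts Docker with a command like the following: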
docker run --rm --name <NAME> -d -it --net=host <image_name> bash
When I run docker build -t myimage . and then docker run --rm -it myimage bash, Docker fails with:
python3: can't open file 'bash': [Errno 2] No such file or directory
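The error above follows from how Docker combines ENTRYPOINT with the command given to docker run: arguments after the image name replace CMD and are appended to ENTRYPOINT. As a minimal sketch, assuming exec-form (JSON array) instructions as in the Dockerfile from the question, the rule can be modelled in a few lines of Python. Note that effective_argv is a hypothetical helper written only for illustration; it is not part of Docker or Ray.

```python
# Simplified model of how Docker builds a container's process argv when
# ENTRYPOINT and CMD are given in exec (JSON array) form.
# effective_argv is a hypothetical helper for illustration, not a Docker API.
from typing import List, Optional


def effective_argv(entrypoint: Optional[List[str]],
                   cmd: Optional[List[str]],
                   run_args: Optional[List[str]] = None) -> List[str]:
    """Return the argv the container would start with.

    Arguments given after the image name on `docker run` replace CMD,
    and the result is appended to ENTRYPOINT (if any).
    """
    tail = run_args if run_args else (cmd or [])
    return list(entrypoint or []) + list(tail)


# The image from the question: ENTRYPOINT ["python3"], CMD ["step_1.py"].
default_run = effective_argv(["python3"], ["step_1.py"])
launcher_run = effective_argv(["python3"], ["step_1.py"], ["bash"])

# Assumed alternative (not from the original post):
# no ENTRYPOINT, CMD ["python3", "step_1.py"].
alt_default = effective_argv(None, ["python3", "step_1.py"])
alt_launcher = effective_argv(None, ["python3", "step_1.py"], ["bash"])

print(default_run)   # ['python3', 'step_1.py']
print(launcher_run)  # ['python3', 'bash']
print(alt_default)   # ['python3', 'step_1.py']
print(alt_launcher)  # ['bash']
```

Under this simplified model, the launcher's trailing bash is handed to python3 as a script name, which matches the "can't open file 'bash'" message. Removing ENTRYPOINT and putting the interpreter into CMD would probably let bash replace the default command instead, so the container could start the way the launcher expects.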