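Problem description
I want to run MLflow with Docker, entirely without cloud storage such as S3 or Blob. I therefore followed this guide and tried to set the artifact store to an atmoz sftp server running in another Docker container. As suggested in the MLFlow docs, I tried to authenticate with a host key, but when I try to register my artifact I get the following error: pysftp.exceptions.CredentialException: No password or key specified.
I guess something is wrong with my host key setup. I also tried to follow this guide (mentioned in this question), but unfortunately it does not go into enough detail for my limited knowledge of containers, sftp servers and public/private key setup. My docker-compose looks like this...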
services:
  db:
    restart: always
    image: mysql/mysql-server:5.7.28
    container_name: mlflow_db
    expose:
      - "3306"
    networks:
      - backend
    environment:
      - MYSQL_DATABASE=${MYSQL_DATABASE}
      - MYSQL_USER=${MYSQL_USER}
      - MYSQL_PASSWORD=${MYSQL_PASSWORD}
      - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}
    volumes:
      - dbdata:/var/lib/mysql
  mlflow-sftp:
    image: atmoz/sftp
    container_name: mlflow-sftp
    ports:
      - "2222:22"
    volumes:
      - ./storage/sftp:/home/foo/storage
      - ./ssh_host_ed25519_key:/home/foo/.ssh/ssh_host_ed25519_key.pub:ro
      - ./ssh_host_rsa_key:/home/foo/.ssh/ssh_host_rsa_key.pub:ro
    command: foo::1001
    networks:
      - backend
  web:
    restart: always
    build: ./mlflow
    depends_on:
      - mlflow-sftp
    image: mlflow_server
    container_name: mlflow_server
    expose:
      - "5000"
    networks:
      - frontend
      - backend
    volumes:
      - ./ssh_host_ed25519_key:/root/.ssh/ssh_host_ed25519_key:ro
      - ./ssh_host_rsa_key:/root/.ssh/ssh_host_rsa_key:ro
    command: >
      bash -c "sleep 3
      && ssh-keyscan -H mlflow-sftp >> ~/.ssh/known_hosts
      && mlflow server --backend-store-uri mysql+pymysql://${MYSQL_USER}:${MYSQL_PASSWORD}@db:3306/${MYSQL_DATABASE} --default-artifact-root sftp://foo@localhost:2222/storage --host 0.0.0.0"
  nginx:
    restart: always
    build: ./nginx
    image: mlflow_nginx
    container_name: mlflow_nginx
    ports:
      - "80:80"
    networks:
      - frontend
    depends_on:
      - web
networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
volumes:
  dbdata:
...and then in my Python script I create a new mlflow experiment as follows.
remote_server_uri = "http://localhost:80"
mlflow.set_tracking_uri(remote_server_uri)
EXPERIMENT_NAME = "test43"
mlflow.create_experiment(EXPERIMENT_NAME) #,artifact_location=ARTIFACT_URI)
mlflow.set_experiment(EXPERIMENT_NAME)
with mlflow.start_run():
    print(mlflow.get_artifact_uri())
    print(mlflow.get_registry_uri())
    lr = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
    lr.fit(train_x, train_y)
    predicted_qualities = lr.predict(test_x)
    (rmse, mae, r2) = eval_metrics(test_y, predicted_qualities)
    print("Elasticnet model (alpha=%f, l1_ratio=%f):" % (alpha, l1_ratio))
    print(" RMSE: %s" % rmse)
    print(" MAE: %s" % mae)
    print(" R2: %s" % r2)
    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1_ratio", l1_ratio)
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)
    mlflow.log_metric("mae", mae)
    tracking_url_type_store = urlparse(mlflow.get_tracking_uri()).scheme
    if tracking_url_type_store != "file":
        mlflow.sklearn.log_model(lr, "model", registered_model_name="ElasticnetWineModel")
    else:
        mlflow.sklearn.log_model(lr, "model")
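I have not modified the Dockerfiles from the first guide mentioned; you can see them here. My guess is that I messed up the host keys, perhaps by putting them in the wrong directory, but after hours of brute-force experimentation I hope someone can point me in the right direction. Let me know if anything is missing to reproduce the error.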
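The error message says that pysftp found neither a password nor a private key. One way to narrow this down is to check what credentials are actually visible in the environment that performs the artifact upload. The following is only a minimal sketch using the standard library. It assumes that pysftp falls back to ~/.ssh/id_rsa or ~/.ssh/id_dsa when no key is passed explicitly, and that an IdentityFile entry in ~/.ssh/config would also be picked up; both points should be verified against the installed versions. The helper name describe_sftp_credentials is hypothetical and is not part of MLflow or pysftp.

```python
import os
from urllib.parse import urlparse

# Default private key names that pysftp is assumed to fall back to when
# neither a password nor an explicit private key is supplied (assumption).
DEFAULT_KEY_NAMES = ("id_rsa", "id_dsa")


def describe_sftp_credentials(artifact_uri, ssh_dir="~/.ssh"):
    """Summarise which credentials are visible for an sftp:// artifact URI.

    Hypothetical diagnostic helper: it only inspects the URI and the
    contents of ssh_dir, it does not open any network connection.
    """
    parsed = urlparse(artifact_uri)
    ssh_dir = os.path.expanduser(ssh_dir)
    # Collect the default key files that actually exist in ssh_dir.
    default_keys = [
        os.path.join(ssh_dir, name)
        for name in DEFAULT_KEY_NAMES
        if os.path.isfile(os.path.join(ssh_dir, name))
    ]
    return {
        "host": parsed.hostname,
        "port": parsed.port or 22,
        "username": parsed.username,
        "has_password": parsed.password is not None,
        "has_ssh_config": os.path.isfile(os.path.join(ssh_dir, "config")),
        "default_keys": default_keys,
    }


if __name__ == "__main__":
    # URI taken from the --default-artifact-root in the compose file above.
    info = describe_sftp_credentials("sftp://foo@localhost:2222/storage")
    print(info)
    if not (info["has_password"] or info["has_ssh_config"] or info["default_keys"]):
        print("No password, ssh config, or default key found for this user.")
```

The check is meant to be run wherever the upload happens, which may be the machine running the Python script rather than the mlflow_server container (an assumption worth testing on both sides). Note that in the compose file above the keys are mounted as ssh_host_rsa_key and ssh_host_ed25519_key, so they would not match the default file names this sketch looks for.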