Problem description
I am running runtime 8.1 (which includes Apache Spark 3.1.1 and Scala 2.12) and trying to get hyperopt working as described in
https://docs.databricks.com/applications/machine-learning/automl-hyperparam-tuning/hyperopt-spark-mlflow-integration.html
but I get
py4j.Py4JException: Method maxNumConcurrentTasks([]) does not exist
when I try
spark_trials = SparkTrials()
Is there anything special I need to do to get this to work?
Here is the cluster I am using:
{
  "autoscale": {
    "min_workers": 1,
    "max_workers": 2
  },
  "cluster_name": "mlops_tiny_ml",
  "spark_version": "8.2.x-cpu-ml-scala2.12",
  "spark_conf": {},
  "aws_attributes": {
    "first_on_demand": 1,
    "availability": "SPOT_WITH_FALLBACK",
    "zone_id": "us-west-2b",
    "instance_profile_arn": "arn:aws:iam::112437402463:instance-profile/databricks_instance_role_s3",
    "spot_bid_price_percent": 100,
    "ebs_volume_type": "GENERAL_PURPOSE_SSD",
    "ebs_volume_count": 3,
    "ebs_volume_size": 100
  },
  "node_type_id": "m4.large",
  "driver_node_type_id": "m4.large",
  "ssh_public_keys": [],
  "custom_tags": {},
  "spark_env_vars": {},
  "autotermination_minutes": 120,
  "enable_elastic_disk": false,
  "cluster_source": "UI",
  "init_scripts": [],
  "cluster_id": "0xxxxxt404"
}
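Solution
Hyperopt is included only in the DBR ML runtimes, not in the stock runtimes. You can check this by looking at the release notes for each runtime: DBR 8.1 vs. DBR 8.1 ML.
From the docs:
Databricks Runtime for Machine Learning incorporates MLflow and Hyperopt, two open source tools that automate the process of model selection and hyperparameter tuning.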
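To make the check above concrete, here is a minimal sketch of how one might verify the environment before constructing SparkTrials. The helper names (looks_like_ml_runtime, hyperopt_importable, make_spark_trials) are hypothetical, and the "-ml-" substring test is only an assumption based on the spark_version string shown in the question, not an official API.

```python
import importlib.util


def looks_like_ml_runtime(spark_version: str) -> bool:
    """Heuristic (assumption): ML runtime version strings contain '-ml-',
    as in '8.2.x-cpu-ml-scala2.12' from the cluster JSON in the question."""
    return "-ml-" in spark_version


def hyperopt_importable() -> bool:
    """Return True if a top-level 'hyperopt' package can be found."""
    return importlib.util.find_spec("hyperopt") is not None


def make_spark_trials(parallelism: int = 2):
    """Create SparkTrials only when hyperopt is present; fail clearly otherwise."""
    if not hyperopt_importable():
        raise RuntimeError(
            "hyperopt is not installed; attach the notebook to an ML runtime cluster"
        )
    # Imported lazily so this module loads even where hyperopt is missing.
    from hyperopt import SparkTrials

    return SparkTrials(parallelism=parallelism)


if __name__ == "__main__":
    print(looks_like_ml_runtime("8.2.x-cpu-ml-scala2.12"))
    print(hyperopt_importable())
```

On a cluster, make_spark_trials() would be called from a notebook attached to an ML runtime. The version check only inspects the string, so it is worth confirming that the notebook is actually attached to the cluster whose configuration you are looking at.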