How can I retrieve Airflow connections in bulk using bash? Google Composer

Problem description

I am trying to create a Composer environment using an infrastructure-as-code approach. To do that, I need to be able to store and retrieve Airflow connections and variables programmatically, and keep them versioned somewhere.

The following command lets me list all the connections in the given environment:

gcloud composer environments run $COMPOSER_ENV connections -- --list

The problem with the output is that it comes back as an almost unusable plain-text table (see the image below). Ideally, it would return a JSON-like structure instead.

  • Question 1: Is there a quick way to export (and import) the connections (and variables) as JSON?
  • Question 2: If the answer to question 1 is "there is no way", how can this data be turned into a nice dictionary or similar key-value structure?

Additionally, any further explanation of the $COMPOSER_ENV approach would be greatly appreciated.

Solutions

Using cat file in place of your gcloud command, since I don't have a way to generate the table from your question myself:

$ cat tst.awk
!/[[:space:]]/ {
    # Skip all lines that separate the data rows
    next
}
(++nr) == 1 {
    # Set FS to whatever the combination of control chars is at the start of the first data line
    match($0,/[^[:blank:]]+/)
    FS = "[[:blank:]]*" substr($0,1,RLENGTH) "[[:blank:]]*"
}
{
    # Get rid of the FSs at the start and end of the line to avoid leading/trailing null fields
    gsub("^" FS "|" FS "$","")
}
nr == 1 {
    # Store the header lines for later use
    for (i=1; i<=NF; i++) {
        gsub(/[[:blank:]]+/,"_",$i)
        hdr[i] = $i
    }
    print "{"
    next
}
{
    # Print the json-equivalent for the data on the current line
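    # \047 is a single quote: convert single-quoted values to double quotes so they look JSON-quoted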
    gsub(/\047/,"\"")
    printf "%s  %s: {\n",(nr>2 ? ",\n" : ""),$1
    for (i=2; i<=NF; i++) {
        printf "    \"%s\": %s%s\n",hdr[i],$i,(i<NF ? "," : "")
    }
    printf "  }",$1
}
END {
    print "\n}"
}

$ cat file | awk -f tst.awk
{
  "airflow_db": {
    "Conn_Type": "mysql","Host": "airflow-sqlp...rvice.default","Port": None,"Is_Encrypted": True,"Is_Extra_Encrypted": False,"Extra": None
  },"beeline_default": {
    "Conn_Type": "beeline","Host": "localhost","Port": 10000,"Is_Encrypted": False,"Is_Extra_Encrypted": True,"Extra": "gAAAAABfdZs0...yjt7nj1C2Dzgm"
  },"bigquery_default": {
    "Conn_Type": "google_cloud_platform","Host": None,"Extra": "gAAAAABfdZs2...AOdwY-EnZLg=="
  },"local_mysql": {
    "Conn_Type": "mysql","presto_default": {
    "Conn_Type": "presto","Port": 3400,"google_cloud_default": {
    "Conn_Type": "google_cloud_platform","Extra": "gAAAAABfdZs2...oMm2saUwAxQ=="
  },"hive_cli_default": {
    "Conn_Type": "hive_cli","pig_cli_default": {
    "Conn_Type": "pig_cli","hiveserver2_default": {
    "Conn_Type": "hiveserver2","metastore_default": {
    "Conn_Type": "hive_metastore","Port": 9083,"Extra": "gAAAAABfdZs0...vNSgFh1mE1HY="
  },"mongo_default": {
    "Conn_Type": "mongo","Host": "mongo","Port": 27017,"mysql_default": {
    "Conn_Type": "mysql","Host": "mysql","postgres_default": {
    "Conn_Type": "postgres","Host": "postgres","sqlite_default": {
    "Conn_Type": "sqlite","Host": "/tmp/sqlite_default.db","http_default": {
    "Conn_Type": "http","Host": "https://www.httpbin.org/","mssql_default": {
    "Conn_Type": "mssql","Port": 1433,"vertica_default": {
    "Conn_Type": "vertica","Port": 5433,"wasb_default": {
    "Conn_Type": "wasb","Extra": "gAAAAABfdZs0...ST7E2347-uG4="
  },"webhdfs_default": {
    "Conn_Type": "hdfs","Port": 50070,"ssh_default": {
    "Conn_Type": "ssh","sftp_default": {
    "Conn_Type": "sftp","Port": 22,"Extra": "gAAAAABfdZs0...guLrr1ky5XpN2"
  },"fs_default": {
    "Conn_Type": "fs","Extra": "gAAAAABfdZs0...WqhP9ZLa8gQ=="
  },"aws_default": {
    "Conn_Type": "aws","spark_default": {
    "Conn_Type": "spark","Host": "yarn","Extra": "gAAAAABfdZs0...18ws2BelkcL8="
  },"druid_broker_default": {
    "Conn_Type": "druid","Host": "druid-broker","Port": 8082,"Extra": "gAAAAABfdZs0...sC6Kcd9mOKhE="
  },"druid_ingest_default": {
    "Conn_Type": "druid","Host": "druid-overlord","Port": 8081,"Extra": "gAAAAABfdZs0...CpBdCkHuk5lqw"
  },"redis_default": {
    "Conn_Type": "redis","Host": "redis","Port": 6379,"Extra": "gAAAAABfdZs0...E1qdjhMngIg=="
  },"sqoop_default": {
    "Conn_Type": "sqoop","Host": "rmdbs","Extra": ""
  },"emr_default": {
    "Conn_Type": "emr","Extra": "gAAAAABfdZs0...GsJIS8IjaBuM="
  },"databricks_default": {
    "Conn_Type": "databricks","qubole_default": {
    "Conn_Type": "qubole","segment_default": {
    "Conn_Type": "segment","Extra": "gAAAAABfdZs0...oawClUj4Qzj8i"
  },"azure_data_lake_default": {
    "Conn_Type": "azure_data_lake","Extra": "gAAAAABfdZs0...DMIAMmOeZNg=="
  },"azure_cosmos_default": {
    "Conn_Type": "azure_cosmos","Extra": "gAAAAABfdZs0...tusOfGrWviAk="
  },"azure_contai...ances_default": {
    "Conn_Type": "azure_container_instances","Extra": "gAAAAABfdZs0...q460BKvTu4Lk="
  },"cassandra_default": {
    "Conn_Type": "cassandra","Host": "cassandra","Port": 9042,"dingding_default": {
    "Conn_Type": "http","Host": "","opsgenie_default": {
    "Conn_Type": "http","google_cloud...store_default": {
    "Conn_Type": "google_cloud_platform","Extra": "gAAAAABfdZs2...ltsxQHWUgxA=="
  },"google_cloud_storage_default": {
    "Conn_Type": "google_cloud_platform","Extra": "gAAAAABfdZs2...RNLazPEE7gQ=="
  }
}

Note that I don't know whether that is actually valid JSON; I'm just moving the blocks of text from their position in the input to their position in the output. Hopefully it will be easy for you to make whatever changes are needed to produce the output you actually want (that information was missing from the question).
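The script above reads a saved copy of the table (file). To run it against the live listing instead, one option is to pipe the command output straight into it; this is only a sketch, assuming the same $COMPOSER_ENV as in the question, that the script is saved as tst.awk, and that any status lines gcloud prints around the table are stripped first:

# Sketch: capture the connection listing, then convert the table with the awk script above.
# gcloud may interleave its own status messages; filter those out before feeding awk if needed.
gcloud composer environments run $COMPOSER_ENV connections -- --list > conns.txt
awk -f tst.awk conns.txt > connections.json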


As I understand it, you want to export the connections to a .json file.

Currently, according to the documentation, and using the latest stable Airflow version, you can export the connections to a .json file. The command is as follows:

airflow connections export connections.json

Or alternatively:

airflow connections export /tmp/connections --format json

The .json format uses the following schema:

{
  "airflow_db": {
    "conn_type": "mysql","host": "mysql","login": "root","password": "plainpassword","schema": "airflow","port": null,"extra": null
  },"druid_broker_default": {
    "conn_type": "druid","host": "druid-broker","login": null,"password": null,"schema": null,"port": 8082,"extra": "{\"endpoint\": \"druid/v2/sql\"}"
  }
} 
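For the import half of question 1, recent Airflow releases also provide a matching subcommand; a minimal sketch, assuming a version new enough to ship it and a file produced by the export above:

airflow connections import connections.json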

In addition, each connection can be stored in an environment variable that follows the naming convention AIRFLOW_CONN_{CONN_ID} (see here).
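As a sketch of that convention, a connection can be supplied as a URI in such a variable and Airflow resolves it at runtime without touching the metadata database (the connection id my_pg below is hypothetical):

# Hypothetical connection id "my_pg"; Airflow reads AIRFLOW_CONN_MY_PG as a connection URI
export AIRFLOW_CONN_MY_PG='postgresql://user:password@db.example.com:5432/mydb'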
