Accessing elements in YAML using Python

Problem description

I am using YAML and PyYAML to configure my application.

Is it possible to configure something like this?

config.yml:

root:
    repo_root: /home/raghhuveer/code/data_science/papers/cv/AlexNet_lght
    data_root: $root.repo_root/data

service:
    root: $root.data_root/csv/xyz.csv

YAML loading function:

import logging
import os

import yaml

def load_config(config_path):
    config_path = os.path.abspath(config_path)

    if not os.path.isfile(config_path):
        raise FileNotFoundError("{} does not exist".format(config_path))

    with open(config_path) as f:
        config = yaml.load(f, Loader=yaml.SafeLoader)
    logging.info("Config used for run - \n{}".format(yaml.dump(config, sort_keys=False)))
    return DotDict(config)  # DotDict is my own helper class (not shown here)

Current output:

root:
  repo_root: /home/raghhuveer/code/data_science/papers/cv/AlexNet_lght
  data_root: ${root.repo_root}/data

service:
  root: ${root.data_root}/csv/xyz.csv

Desired output:

root:
  repo_root: /home/raghhuveer/code/data_science/papers/cv/AlexNet_lght
  data_root: /home/raghhuveer/code/data_science/papers/cv/AlexNet_lght/data

service:
  root: /home/raghhuveer/code/data_science/papers/cv/AlexNet_lght/data/csv/xyz.csv

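Is this even possible with Python? If so, any help would be much appreciated.

Thanks.

Solution

General approach:

  • Read the file as is
  • Search for strings that contain $:
    • Determine the "path" of the "variable"
    • Replace the "variable" with the actual value

An example that uses recursive calls on dictionaries and replaces strings: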

import pprint
import re

import yaml

def convert(node, top=None):
    """Replace $key1.key2 references with actual values. Modifies dicts in place."""
    if top is None:
        top = node  # top should be the original input
    if isinstance(node, dict):
        ret = {k: convert(v, top) for k, v in node.items()}  # recursively convert items
        if node != ret:  # in case order matters, repeat until nothing changes
            ret = convert(ret, top)  # keep resolving against the original top
        node.update(ret)  # update the original dict
        return node  # return the updated dict (for the recursive case)
    if isinstance(node, str):
        variables = re.findall(r"\$\w+(?:\.\w+)*", node)  # find $key_1.key_2.keyN sequences
        for var in variables:
            keys = var[1:].split(".")  # remove the dollar sign and split on dots to get the key chain
            val = top  # starting from the top ...
            for k in keys:  # ... for each key in the key chain ...
                val = val[k]  # ... go one level down
            node = node.replace(var, str(val))  # replace the $key sequence with the actual value
        return node  # return the modified string
    return node  # other types (int, float, ...) pass through unchanged; TODO: handle lists

with open("in.yml") as f:
    config = yaml.safe_load(f)  # load as is
convert(config)  # convert it (in place)
pprint.pprint(config)

Output:

{'root': {'data_root': '/home/raghhuveer/code/data_science/papers/cv/AlexNet_lght/data',
          'repo_root': '/home/raghhuveer/code/data_science/papers/cv/AlexNet_lght'},
 'service': {'root': '/home/raghhuveer/code/data_science/papers/cv/AlexNet_lght/data/csv/xyz.csv'}}
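
The example above leaves lists as a TODO, and the question's output also shows the ${root.repo_root} spelling. The following is only a sketch of one way to cover both, not part of the original answer. The names resolve, lookup and _REF are made up for illustration, the /opt/app paths are placeholder data, and the depth limit of 20 is an arbitrary guard against circular references. It returns a new structure instead of modifying the input, and it resolves each referenced value recursively, so the order of keys should not matter.

```python
import re

# Matches "${a.b}" (path in group 1) or "$a.b" (path in group 2).
_REF = re.compile(r"\$\{([\w.]+)\}|\$(\w+(?:\.\w+)*)")


def lookup(top, path):
    """Follow a dotted key path such as "root.repo_root" down from top."""
    value = top
    for key in path.split("."):
        value = value[key]
    return value


def resolve(node, top=None, depth=0):
    """Return a copy of node with every reference replaced by its value."""
    if top is None:
        top = node
    if depth > 20:  # arbitrary limit, assumed large enough for real configs
        raise ValueError("references nested too deeply (circular reference?)")
    if isinstance(node, dict):
        return {k: resolve(v, top, depth) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve(v, top, depth) for v in node]
    if isinstance(node, str):
        def substitute(match):
            target = lookup(top, match.group(1) or match.group(2))
            # The target may contain references itself, so resolve it first.
            return str(resolve(target, top, depth + 1))
        return _REF.sub(substitute, node)
    return node  # int, float, bool, None pass through unchanged


# Placeholder data standing in for a loaded config file.
config = {
    "service": {"root": "$root.data_root/csv/xyz.csv",
                "extra": [8080, "${root.repo_root}/sock"]},
    "root": {"repo_root": "/opt/app", "data_root": "${root.repo_root}/data"},
}
resolved = resolve(config)
print(resolved["service"]["root"])
```

Using re.sub with a callback rather than str.replace avoids one pitfall of plain replacement: a short reference such as $root.repo can no longer clobber the beginning of a longer one such as $root.repo_root.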

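Note: YAML is not essential here; the same approach also works with JSON, XML, or other formats.

Note 2: If you are using only YAML and Python, some of the answers in this post may be useful (using anchors and references, and application-specific local tags).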
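
To make the second note more concrete, here is a rough sketch of how anchors, aliases and a local tag might be combined in PyYAML. A YAML alias can only reuse a whole value, so appending a suffix needs something extra. The !join tag, the ConfigLoader class and the /opt/app paths below are assumptions invented for this illustration; they are not built into YAML or PyYAML.

```python
import yaml


class ConfigLoader(yaml.SafeLoader):
    """SafeLoader subclass, so the custom tag is not added to SafeLoader itself."""


def join_constructor(loader, node):
    """Build one string from the items of a sequence tagged with !join."""
    parts = loader.construct_sequence(node, deep=True)
    return "".join(str(part) for part in parts)


# "!join" is a hypothetical, application-specific tag name.
ConfigLoader.add_constructor("!join", join_constructor)

document = """
root:
  repo_root: &repo_root /opt/app
  data_root: &data_root !join [*repo_root, /data]
service:
  root: !join [*data_root, /csv/xyz.csv]
"""

config = yaml.load(document, Loader=ConfigLoader)
print(config["service"]["root"])
```

The trade-off compared with the $key approach is that the config file must declare every anchor before the alias that uses it, and any other program reading the file has to understand the custom tag as well.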