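Parse variables from a "key1='val1' key2='val2'" string in bash without using eval

Problem description

I have a project-specific command that produces output in the following form:

Parameter1='value1' Parameter2='Value2' ... # single-quoted values

I want to assign those values explicitly, so that printing each parameter shows its corresponding value.

Here xtc_cmd get is the project-specific command: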

root@renway:~# FOO=`xtc_cmd get lan_ifname lan_ipaddr lan_netmask`
root@renway:~#
root@renway:~# echo $FOO
SYSCFG_lan_ifname='br1' SYSCFG_lan_ipaddr='10.0.0.1' SYSCFG_lan_netmask='255.255.255.0'
root@renway:~#
root@renway:~# echo $SYSCFG_lan_ifname

root@renway:~# echo $SYSCFG_lan_ipaddr

root@renway:~# echo $SYSCFG_lan_netmask

The variables only print their values after I run 'eval $FOO'. For security reasons, I want to avoid 'eval'.

Here is the output of the script run:

root@renway:~# /tmp/test.sh
++ xtc_cmd get lan_ifname lan_ipaddr lan_netmask
+ FOO='SYSCFG_lan_ifname='\''br1'\''
SYSCFG_lan_ipaddr='\''10.0.0.1'\''
SYSCFG_lan_netmask='\''255.255.255.0'\'''
+ echo 'SYSCFG_lan_ifname='\''br1'\''' 'SYSCFG_lan_ipaddr='\''10.0.0.1'\''' 'SYSCFG_lan_netmask='\''255.255.255.0'\'''
SYSCFG_lan_ifname='br1' SYSCFG_lan_ipaddr='10.0.0.1' SYSCFG_lan_netmask='255.255.255.0'

How can I actually assign and print these variables?

The input string of interest:

FOO='SYSCFG_lan_ipaddr='\''10.0.0.1'\'' SYSCFG_sysdate='\'''\''$(date>> /tmp/date.txt)0'\'''\'' SYSCFG_lan_pd_interfaces='\''brlan0 brlan19 brlan20'\'''

Expected output:

foo_SYSCFG_lan_ipaddr=10.0.0.1
foo_SYSCFG_sysdate='$(date>> /tmp/date.txt)0' #single quoted value
foo_SYSCFG_lan_pd_interfaces=brlan0 brlan19 brlan20 #whitespace separated string

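The challenge is SYSCFG_sysdate: unlike the other parameters, its value must keep its single quotes, as in '$(date>> /tmp/date.txt)0'.

Sorry, I failed to highlight or mention this parameter earlier. It is there to test for malicious command injection, so the expected result is that the value is stored as-is without the command being executed. With the 'eval' builtin, the date command gets executed, which is not wanted.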
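A minimal sketch of the behaviour described above, under stated assumptions: the payload is a hypothetical stand-in for the SYSCFG_sysdate test value, and its command substitution only echoes a marker instead of writing to a file. It contrasts eval with plain parameter expansion on the same text.

```shell
#!/usr/bin/env bash
# Hypothetical payload shaped like the SYSCFG_sysdate test value. The command
# substitution only echoes a marker, so running this sketch is harmless.
payload="SYSCFG_sysdate=''\$(echo INJECTED)0''"

# eval re-parses the string as shell code: the substitution is executed.
eval "$payload"
printf 'with eval:    [%s]\n' "$SYSCFG_sysdate"

# Plain parameter expansion treats the same text as data: nothing is executed.
raw_value=${payload#*=}          # everything after the first "="
printf 'without eval: [%s]\n' "$raw_value"
```

The marker ending up inside the variable after eval is what shows that the embedded command ran.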

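I get the desired output by running Zilog80's POSIX V1 script, which uses the 'set' builtin.

The POSIX V2 script, however, only runs correctly when the SYSCFG_sysdate parameter is absent.

Special thanks to @Charles Duffy and @Zilog80 for their many valuable comments and guidance on this question.

Solution

Borrowing from an answer to a closely related question (Reading quoted/escaped arguments correctly from a string):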

#!/usr/bin/env bash
FOO="SYSCFG_lan_ifname='br1' SYSCFG_lan_ipaddr='10.0.0.1' SYSCFG_lan_netmask='255.255.255.0'"
 
case $BASH_VERSION in
  ''|[1-3].*) echo "ERROR: Bash 4.0 required; this is ${BASH_VERSION:-not bash}" >&2; exit 1;;
esac
 
declare -A kwargs=( )
while IFS= read -r -d ''; do
  [[ $REPLY = *=* ]] || {
    printf 'ERROR: Item %q is not in assignment form\n' "$REPLY" >&2
    continue
  }
  kwargs[${REPLY%%=*}]=${REPLY#*=}
done < <(xargs printf '%s\0' <<<"$FOO")
 
# show what we parsed for demonstration purposes
declare -p kwargs >&2

You can see this running in an online sandbox at https://ideone.com/KniaC4; its output is an associative array of the following form:

declare -A kwargs=([SYSCFG_lan_ifname]="br1" [SYSCFG_lan_netmask]="255.255.255.0" [SYSCFG_lan_ipaddr]="10.0.0.1" )

...so you can refer to "${kwargs[SYSCFG_lan_ifname]}" or "${kwargs[SYSCFG_lan_ipaddr]}".
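As a usage sketch (not part of the original script), the array can be queried and iterated like this. The array content is copied from the sample output rather than parsed, and SYSCFG_lan_gateway is a hypothetical key used only to show the fallback.

```shell
#!/usr/bin/env bash
# Needs bash 4.0+ for associative arrays. The array content is copied from the
# sample output above rather than parsed, to keep the sketch short.
declare -A kwargs=(
  [SYSCFG_lan_ifname]="br1"
  [SYSCFG_lan_ipaddr]="10.0.0.1"
  [SYSCFG_lan_netmask]="255.255.255.0"
)

# Look up one key, with a fallback when the key is absent.
ifname=${kwargs[SYSCFG_lan_ifname]:-unset}
gateway=${kwargs[SYSCFG_lan_gateway]:-unset}   # hypothetical key, not in the sample data
echo "ifname=$ifname gateway=$gateway"

# Iterate over all parsed keys; iteration order is not guaranteed.
for key in "${!kwargs[@]}"; do
  printf '%s -> %s\n' "$key" "${kwargs[$key]}"
done
```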

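This is safer than assigning to regular bash variables, because it does not let an attacker modify PATH, LD_PRELOAD, or other environment variables that change the behavior of the shell, the linker, the loader, the standard C library, and so on. (Note that even if you do not explicitly export the assignments created by this code, assigning to a variable that is already exported automatically exports the new value; so security issues that apply only to environment variables, and not to regular shell variables, can still come into play here.)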
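The auto-export point can be checked with a small sketch. DEMO_SETTING and DEMO_OTHER are hypothetical stand-ins for something like PATH or LD_PRELOAD; nothing here comes from the original script.

```shell
#!/usr/bin/env bash
# DEMO_SETTING is a hypothetical stand-in for PATH or LD_PRELOAD.
export DEMO_SETTING=original

# Assigning to an already-exported name, with no "export" in sight...
printf -v DEMO_SETTING '%s' 'attacker-controlled'
# ...still changes what child processes see.
child_view=$(sh -c 'printf %s "$DEMO_SETTING"')
echo "child sees: $child_view"

# Storing the same kind of pair in an associative array (bash 4.0+) leaves
# the environment alone.
export DEMO_OTHER=original
declare -A kwargs=()
kwargs[DEMO_OTHER]='attacker-controlled'
child_view_array=$(sh -c 'printf %s "$DEMO_OTHER"')
echo "child sees: $child_view_array"
```

In the first case the child process sees the new value, in the second it still sees the original one.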


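Warning: xargs parses strings in a way that is not fully compatible with the POSIX sh standard. See the link given above for details and other options (Python has a fully compatible parser, for example, and the linked answer describes how to use it from bash).

Alternately, with older Bash versions

When associative arrays are not available, the regular variables can be given a prefix instead: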

#!/usr/bin/env bash
FOO="SYSCFG_lan_ifname='br1' SYSCFG_lan_ipaddr='10.0.0.1' SYSCFG_lan_netmask='255.255.255.0'"
 
while IFS= read -r -d ''; do
  [[ $REPLY = *=* ]] || {
    printf 'ERROR: Item %q is not in assignment form\n' "$REPLY" >&2
    continue
  }
  printf -v "foo_${REPLY%%=*}" '%s' "${REPLY#*=}"
done < <(xargs printf '%s\0' <<<"$FOO")
 
# show what we parsed for demonstration purposes

for var in ${!foo_*}; do
  echo "$var has value: ${!var}"
done

See this running at https://ideone.com/7UZJkT, with output:

foo_SYSCFG_lan_ifname has value: br1
foo_SYSCFG_lan_ipaddr has value: 10.0.0.1
foo_SYSCFG_lan_netmask has value: 255.255.255.0
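One possible hardening step, offered as a sketch rather than as part of the script above: check that the key is a plain identifier before handing it to printf -v, since a name containing characters such as [ may be treated as an array reference. assign_prefixed is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# assign_prefixed is a hypothetical helper, not part of the answer's script.
# It refuses keys that are not plain shell identifiers before calling printf -v.
assign_prefixed() {
  local item=$1 key value
  key=${item%%=*}
  value=${item#*=}
  case $key in
    ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*)
      printf 'ERROR: refusing key %q\n' "$key" >&2
      return 1
      ;;
  esac
  printf -v "foo_$key" '%s' "$value"
}

assign_prefixed "SYSCFG_lan_ifname=br1"
assign_prefixed 'bad[$(echo oops >&2)]=x' || echo "rejected"
echo "$foo_SYSCFG_lan_ifname"
```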

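To add to @CharlesDuffy's answer: for those still stuck with old, "non-upgradable" hardware or VMs, here is a POSIX/old-bash way to achieve this safely. Tested with dash, ksh93, and bash 2.05b, 3 and 4. I could not dig up my old Bourne shell 92.

EDIT: Thanks to helpful comments from @CharlesDuffy:

  1. Updated to handle blanks/spaces/newlines/whatever in the 'value' part, in a basic way (multiple blanks are reduced to one space, newlines are swallowed). Still looking for a better way to handle this.

  2. The generated variable names are now prefixed with _ to prevent any attempt to override PATH or LD_PRELOAD.

EDIT2: Added a Bash 2/3/4 and ksh version that handles tabs/spaces/newlines in values. See below.

EDIT3: Added a POSIX-compliant rev 2 that can handle TAB, NEWLINE and multiple SPACE.

POSIX compliant V1:

This one does not handle newlines and tabs in the variable value part well. It does not crash, but the affected variable gets "packed" onto one line, with spaces in place of the newlines, and every tab or run of multiple spaces reduced to a single space.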

#!/bin/sh
# If you only have bash 4.x, you can test in bash 3.1 compatibility mode:
# shopt -s compat31
FOO="SYSCFG_lan_ifname='br1' SYSCFG_lan_ipaddr='10.0.0.1' 
SYSCFG_lan_netmask='255.255.255.0' SYSCFG_space='my   space' SYSCFG_newline='I have 
many multi
lines input"
# An "env variable" definer that uses the read command
# to parse and define the env variable
define() {
  IFS=\= read -r key value <<EOF
$1
EOF

  # Unquote the value; adapt as fits your needs
  value="${value#\'}"
  value="${value%\'}"
  read -r "_${key}" << EOF
${value}
EOF

}
unset _SYSCFG_lan_ifname
unset _SYSCFG_lan_ipaddr
unset _SYSCFG_lan_netmask
unset _SYSCFG_space
unset _SYSCFG_newline
# Using the set command to "parse" the variables string
set ${FOO}
while [ "$1" ] ; do
  key_value="$1"
  while [ "$1" ] && [ "${key_value%\'}" = "${key_value}" ]  ; do
    shift
    key_value="${key_value} $1"
  done
  define "${key_value}"
  [ "$1" ] && shift
done
echo "${_SYSCFG_lan_ifname}"
echo "${_SYSCFG_lan_ipaddr}"
echo "${_SYSCFG_lan_netmask}"
echo "${_SYSCFG_space}"
echo "${_SYSCFG_newline}"

The output is the same with ksh93, bash 2..4 and dash:

br1
10.0.0.1
255.255.255.0
my space
I have many multi lines input
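The collapsing visible in the last two lines of that output comes from the unquoted expansion in the set command, which splits on any run of spaces, tabs and newlines. A small sketch of just that effect, assuming the default IFS and using a made-up value:

```shell
#!/bin/sh
# Made-up value with a run of spaces and an embedded newline.
value='my   space
second line'

set -f            # disable globbing so a "*" in the data is not expanded
set -- $value     # unquoted: split on runs of space, tab and newline
set +f

count=$#
echo "word count: $count"
joined="$*"       # rejoin with single spaces (first character of default IFS)
echo "$joined"
```

Once the words are rejoined, the original spacing and line breaks cannot be recovered, which is the limitation V1 accepts.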

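POSIX compliant V2:

This version can handle special characters and, partially, newlines. It does not use the set command to parse the string, which avoids any potential glob effect. We rely on the basic shell trimmers # and %. It can also handle different quoting styles and escaped single/double quotes in the string. The define function handles multiple lines by putting \n between them in the here-doc, so translating those is left to the script user.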

#!/bin/sh
# If you only have bash 4.x, you can test in bash 3.1 compatibility mode:
# shopt -s compat31
# Test string. There is a TAB between "input" and "and".
FOO="SYSCFG_lan_ifname='br1' SYSCFG_lan_ipaddr='10.0.0.1 *' 
SYSCFG_lan_netmask=\"255.255.255.0\" SYSCFG_space='mypath\\ so\'urce\\my   space' 
SYSCFG_newline='I have 
many  multi 
 lines input    and /path/to/thi ngs'"
#
# Define here the prefix you want for the variables. A prefix is required
# to keep a variable from overriding LD_PRELOAD, PATH, etc.
_prefix="_"
# 
# The POSIX way for a new line constant.
_NL="
"
# An "env variable" definer that uses the read command to parse and define
define() {
  _key="$1"
  _value="$2"
  _quote="$3"
  _tmp=""
  # The POSIX read command can only read one line at a time.
  # For multiline values, we loop to rebuild the full value.
  while read -r _line ; do 
    [ "${_tmp}" ] && _tmp="${_tmp}\n${_line}" || _tmp="${_line}";
  done  <<EOF
${_value}
EOF

  read -r "${_prefix}${_key}" << EOF
${_tmp}
EOF

}
unset _SYSCFG_lan_ifname
unset _SYSCFG_lan_ipaddr
unset _SYSCFG_lan_netmask
unset _SYSCFG_space
unset _SYSCFG_newline
# First, we trim blanks
_FOO="${FOO# * }"
_FOO="${_FOO% * }"
# We use the basic shell trimmers to "parse" the string
while [ "${_FOO}" ] ; do
  # Get the first assignment from the beginning
  _FOO_next="${_FOO#*=}"
  if [ "${_FOO_next}" != "${_FOO}" ] ; then
    # If there are backslashes in the string, we need to double them before
    # using it as a pattern. We do that in a way that is safe regarding FOO's content.
    _FOO_next_pattern="$( sed 's/\\/\\\\/g' <<EOF
${_FOO_next}
EOF
    )"
    # We have an assignment to parse
    _key="${_FOO%=${_FOO_next_pattern}}"
    # We must have a key; assignments without a key part are ignored.
    # If needed, you can output an error message in the else branch.
    if [ "${_key}" ] ; then
      # Trimming spaces and newlines
      _key="${_key## }"
      _key="${_key##${_NL}}"
      _key="${_key## }"
      _quote="\'"
      # Test whether it is a single quote; if not, try a double quote
      [ "${_FOO_next}" = "${_FOO_next#${_quote}}" ] && _quote="\""
      # If it is not a double quote either, consider the value unquoted...
      [ "${_FOO_next}" = "${_FOO_next#${_quote}}" ] && _quote=""  
      # Extracting value part and trim quotes if any
      if [ "${_quote}" ] ; then 
        _FOO_next="${_FOO_next#${_quote}}"
        _FOO_next_pattern="${_FOO_next_pattern#${_quote}}"
      fi
      _value="${_FOO_next}"
      if [ "${_quote}" ] ; then 
        _FOO_next="${_FOO_next#*[^\\]${_quote}}"
        _FOO_next_pattern="${_FOO_next_pattern#*[^\\]${_quote}}"
      else
        # If the value part is not quoted,we look for the next unescaped space
        # as the delimiter for the next key/value pair.
        _FOO_next="${_FOO_next#*[^\\] }"
        _FOO_next_pattern="${_FOO_next_pattern#*[^\\] }"
      fi
      _value="${_value%${_quote}${_FOO_next_pattern}}"
      # We have parsed everything needed to set the variable
      define "${_key}" "${_value}" "${_quote}"
      _FOO="${_FOO_next}"
    else
      _FOO="${_FOO_next#*[^\\] }"
    fi
  else
    # Nothing more to parse
    _FOO=""
  fi
done
printf "%s\n" "${_SYSCFG_lan_ifname}"
printf "%s\n" "${_SYSCFG_lan_ipaddr}"
printf "%s\n" "${_SYSCFG_lan_netmask}"
printf "%s\n" "${_SYSCFG_space}"
printf "%s\n" "${_SYSCFG_newline}"

The output is the same with ksh93, bash 2..4 and dash:

br1
10.0.0.1 *
255.255.255.0
mypath\ so\'urce\my   space
I have\nmany  multi\nlines input    and /path/to/thi ngs
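The last line shows the literal \n markers that V2 leaves in the value. One possible way to translate them back, sketched here with a shortened sample value, is printf with the %b format. This is an assumption about what a script user might want, not part of the script above.

```shell
#!/bin/sh
# Sample value in the shape V2 produces: a literal backslash followed by "n".
stored='I have\nmany  multi\nlines input'

# printf %b expands backslash escapes in its argument. Caveat: it expands every
# escape it recognises, so values holding other backslashes (paths, for
# instance) would be altered as well.
decoded=$(printf '%b' "$stored")
printf '%s\n' "$decoded"
```

Because of the caveat in the comment, this only suits values where \n is the only backslash sequence expected.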

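BASH V2+ and KSH93 compatible:

This one is not POSIX compliant, because variable substitution with a pattern (/) is not POSIX. The $'\x<hex ASCII code>' literal (ANSI-C quoting) is not POSIX either, and the following script can only be used with ASCII-based UNIX shells (forget EBCDIC...). Anyway, this one can handle newlines/tabs/multiple spaces in the variable value part.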

#!/bin/sh
# If you only have bash 4.x, you can test in bash 3.1 compatibility mode:
# shopt -s compat31
# Test string. There is a TAB between "input" and "and".
FOO="SYSCFG_lan_ifname='br1' SYSCFG_lan_ipaddr='10.0.0.1 *' 
SYSCFG_lan_netmask='255.255.255.0' SYSCFG_space='mypath\\ source\\my   space' 
SYSCFG_newline='I have 
many  multi 
 lines input    and /path/to/thi ngs"
# 
# For bash 2.0, we can't do inline substitution of ESC, NL or TAB because
# of the following bug:
#  FOO="`echo -e \"multi\nline\"`";echo "${FOO//$'\x0a'/$'\x1b'}" ==> multi'line
# Bash 2.0 wrongly includes one quote in the output in this case.
# To avoid that, we store ESC and NL in local variables, which is also better
# for readability.
_ESC=$'\x1b'
_NL=$'\x0a'
_TAB=$'\x09'
# Same kind of trouble with the backslash in bash 2.0: the substitution needs
# 'double' escaping for them in bash 2.0, so we store BKS, test it and double it
# if required.
# However, if used as a variable in the pattern or substitution part, we then have
# to deal with two forms of escaped backslash, since shells don't "dedouble"/escape
# them for the substitute value, only for the pattern.
_BKS_PATTERN="\\\\"
_BKS="\\"
if [ "${_BKS_PATTERN//\\/X}" != "XX" ] ; then
  # Hello bash 2.0
  _BKS_PATTERN="\\\\\\\\"
  _BKS="\\\\"
fi
# An "env variable" definer that uses the read command to parse and define
define() {
  IFS=\= read -r _key _value <<EOF
$1
EOF

  # Unquote the _value; adapt as fits your needs
  _value="${_value#\'}"
  _value="${_value%\'}"
  _value="${_value%\'${_BKS_PATTERN}}"
  # Unescape the _key string to trim escaped nl
  _key="${_key#${_ESC}}"
  _key="${_key%${_ESC}}"
  # Unescape the _value string
  _value="${_value//${_BKS_PATTERN} / }"
  _value="${_value//${_ESC}${_ESC}/${_TAB}}"
  _value="${_value//${_ESC}/${_NL}}"
  read -d\' -r "_${_key}" <<EOF
${_value}'
EOF

}
unset _SYSCFG_lan_ifname
unset _SYSCFG_lan_ipaddr
unset _SYSCFG_lan_netmask
unset _SYSCFG_space
unset _SYSCFG_newline
# First, we escape each newline with 0x1B
_FOO="${FOO//${_NL}/${_ESC}}"
# Second, escape each tab with a double ESC. All tabs.
_FOO="${_FOO//${_TAB}/${_ESC}${_ESC}}"
# Third, escape each space. All spaces.
_FOO="${_FOO// /${_BKS} }"
# Using the set command to "parse" the variables string
set ${_FOO}
while [ "$1" ] ; do
  _key_value="$1"
  while [ "$1" ] && [ "${_key_value%\'${_BKS_PATTERN}}" = "${_key_value}" ] ; do
    shift
    _key_value="${_key_value} $1"
  done
  define "${_key_value}"
  [ "$1" ] && shift
done
printf "%s\n" "${_SYSCFG_lan_ifname}"
printf "%s\n" "${_SYSCFG_lan_ipaddr}"
printf "%s\n" "${_SYSCFG_lan_netmask}"
printf "%s\n" "${_SYSCFG_space}"
printf "%s\n" "${_SYSCFG_newline}"

The output is the same with ksh93 and bash 2 and later:

(Note that we use printf to render the tab character between "input" and "and".)

br1
10.0.0.1 *
255.255.255.0
mypath\ source\my   space
I have
many  multi
 lines input    and /path/to/thi ngs
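The core trick of this version, replacing newlines with ESC before word splitting and restoring them afterwards, can be looked at in isolation. The sample string is made up, and the sketch assumes a bash newer than 2.0 (so without the quoting bug mentioned in the script's comments) and the default IFS.

```shell
#!/usr/bin/env bash
_ESC=$'\x1b'
_NL=$'\x0a'
original=$'first line\nsecond line'

# Hide the newline behind ESC, which is not part of the default IFS.
escaped=${original//${_NL}/${_ESC}}

set -f
set -- $escaped          # splits on the two spaces only, not on the hidden newline
set +f
word_count=$#

# Put the newline back once splitting is done.
restored=${escaped//${_ESC}/${_NL}}
printf '%s words\n%s\n' "$word_count" "$restored"
```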

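How can I actually assign and print these variables?

Write a parser: parse the input and assign the variables. Read a line and split it on the first =; the left part is the variable name and the right part is the variable value. Remove the ' quotes, or parse whatever quoting style the command outputs, and assign the result to the variable name extracted from the line. Along the lines of this pseudocode: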

while <split line on first =>; do
    variable_value=$(parse second part after =)
    printf -v "$variable_name" %s "$variable_value"
done <<<"$FOO"
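One way the pseudocode could be filled in, as a rough sketch under simplifying assumptions: the input holds one key='value' pair per line, values contain no embedded quotes, and the cfg_ prefix is a hypothetical choice to keep names like PATH out of reach.

```shell
#!/usr/bin/env bash
# Assumes one key='value' pair per line and values without embedded quotes;
# the "cfg_" prefix is a hypothetical choice to keep PATH and friends safe.
input="SYSCFG_lan_ifname='br1'
SYSCFG_lan_pd_interfaces='brlan0 brlan19 brlan20'"

while IFS= read -r line; do
  [[ $line = *=* ]] || continue          # skip lines without "="
  variable_name=${line%%=*}              # text before the first "="
  variable_value=${line#*=}              # text after the first "="
  variable_value=${variable_value#\'}    # drop one leading single quote
  variable_value=${variable_value%\'}    # drop one trailing single quote
  # Only accept plain identifiers as variable names.
  [[ $variable_name =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || continue
  printf -v "cfg_$variable_name" '%s' "$variable_value"
done <<<"$input"

echo "$cfg_SYSCFG_lan_ifname"
echo "$cfg_SYSCFG_lan_pd_interfaces"
```

Input where several pairs share one line, as in the question, would first need a splitting step such as the ones shown in the other answers.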

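Assuming you already know the names of the variables (or have a way of finding them), and you don't have to worry about security (see Charles Duffy's answer), you can source from a string: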

$ FOO='SYSCFG_lan_ifname='\''br1'\'' SYSCFG_lan_ipaddr='\''10.0.0.1'\'' SYSCFG_lan_netmask='\''255.255.255.0'\'''

$ set | grep SYSCFG
FOO='SYSCFG_lan_ifname='\''br1'\'' SYSCFG_lan_ipaddr='\''10.0.0.1'\'' SYSCFG_lan_netmask='\''255.255.255.0'\'''

$ . <(echo "${FOO}")

$ set | grep SYSCFG
FOO='SYSCFG_lan_ifname='\''br1'\'' SYSCFG_lan_ipaddr='\''10.0.0.1'\'' SYSCFG_lan_netmask='\''255.255.255.0'\'''
SYSCFG_lan_ifname=br1
SYSCFG_lan_ipaddr=10.0.0.1
SYSCFG_lan_netmask=255.255.255.0
