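Problem description
| Index | String_column |
|---|---|
| 1 | A:blahblahblah. B: whatever. C: idkidk. |
| 2 | A:blahblah. C: idkidk |
| 3 | B:whatever. C: idkidk |
| 4 | B:whatever. D: randomstuff |
I need to generate a new column for each distinct key, holding the value that belongs to that key. The problem is that a plain split(.) does not work, because not every entry has the same set of keys.
This is basically what I want to achieve: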
| A | B | C | D |
|---|---|---|---|
| blahblahblah | whatever | idkidk | |
| blahblah | | idkidk | |
| | whatever | idkidk | |
| | whatever | | randomstuff |
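I have been struggling with this for a while, but nothing I try seems to work. Any suggestions?
Solution
Here is a solution that works as long as no value or key contains : and no key contains any spaces. I changed one key in the example data to test multi-letter keys.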
* Example generated by -dataex-. For more info, type help dataex
clear
input byte Index str68 String_column
1 "A:blahblahblah. non_single_letter_key: whatever whatever. C: idkidk."
2 "A:blahblah. C: idkidk"
3 "B:whatever whatever. C: idkidk"
4 "B:whatever whatever. D: randomstuff"
end
* Get the number of rows and loop over them
count
forvalues row = 1/`r(N)' {
    * Get the raw string for this row
    local raw_string = String_column[`row']
    * Get the first key in the raw string (anything before the first :)
    gettoken nextkey raw_string : raw_string, parse(":")
    local raw_string = subinstr("`raw_string'", ":", "", 1) // Remove the parse character ":"
    * Loop over the raw_string until it is empty
    while "`raw_string'" != "" {
        * Get the key from above or from the previous iteration
        local key "`nextkey'"
        * Last pair in the string: raw_string only contains the last value
        if strpos("`raw_string'", ":") == 0 {
            local value "`raw_string'"
            local raw_string ""
        }
        * Not yet the last pair: parse out this value and the next key
        else {
            * Get all content until the next parse character
            gettoken value_and_nextkey raw_string : raw_string, parse(":")
            local raw_string = subinstr("`raw_string'", ":", "", 1) // Remove the parse character ":"
            * Reverse that content and get the first word in the reversed result
            local v_and_nk_reversed = strreverse("`value_and_nextkey'")
            gettoken next_key_reversed value_reversed : v_and_nk_reversed, parse(" ")
            * Reverse back to get the value for this pair and the next key
            local value = strreverse("`value_reversed'")
            local nextkey = strreverse("`next_key_reversed'")
        }
        * Test if a variable exists for this key; if not, create it
        cap confirm variable `key'
        if _rc != 0 {
            gen `key' = ""
        }
        * Add the value for this row in the variable for this key
        replace `key' = "`value'" if _n == `row'
    }
}
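To check the result, the new variables can be listed next to the index. The sketch below is an optional follow-up and not part of the loop itself. It assumes the loop has already run on the example data, so the variables A, B, C, D and non_single_letter_key all exist. Because each value is cut out of the raw string as is, it may keep the blanks around it, so the sketch trims them first with strtrim() (available in Stata 14 and later; older versions offer trim() instead).

```stata
* Sketch only: assumes the loop above has already run on the example data,
* so the variables A, B, C, D and non_single_letter_key exist.
foreach var of varlist A B C D non_single_letter_key {
    * Remove leading and trailing blanks left over from the parsing
    replace `var' = strtrim(`var')
}
* Show one row per observation with one column per key
list Index A B C D non_single_letter_key, noobs
```

Trailing periods are left in place here, since a period could just as well be part of a value. Stripping them would be a separate decision about the data.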