Problem description
I currently have this regular expression:
(?P<key>\w+)=(?P<value>[a-zA-Z0-9-_:/@. ]+)
Input line 1: event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";
Input line 2: dev=b;json={"dest":"123","home":AZ":"loc":"sys"},"ab":9,"home":null,"someKey":"someValue";
It correctly extracts "event=1921;" but does not extract the other two types.
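To see why only the first pair is captured, the pattern can be tried outside Kusto. Below is a minimal Python sketch; Python is used only because its re module accepts the same (?P<name>...) group syntax, and the helper name extract_pairs is hypothetical. The value character class contains neither { nor ", so no match can start at "json=".

```python
import re

# Pattern from the question, with both named groups fully parenthesized.
PAIR_RE = re.compile(r'(?P<key>\w+)=(?P<value>[a-zA-Z0-9-_:/@. ]+)')

def extract_pairs(line):
    """Return every (key, value) pair the pattern finds in `line`."""
    return [(m.group("key"), m.group("value")) for m in PAIR_RE.finditer(line)]

# Raw strings keep the backslashes exactly as they appear in the input lines.
line1 = r'event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";'
line2 = r'dev=b;json={"dest":"123","home":AZ":"loc":"sys"},"ab":9,"home":null,"someKey":"someValue";'

print(extract_pairs(line1))  # [('event', '1921')]
print(extract_pairs(line2))  # [('dev', 'b')]
```

The "json=" key is followed by {, and the remaining properties use ":" rather than "=", so a single key=value pattern of this shape cannot reach them.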
Solution
You should be able to use the parse operator: https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/parseoperator
For example:
print input = 'event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";'
| parse input with * "json=" json:dynamic ',"id"' * '"name":"' name '"' *
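If your payload/property names are entirely dynamic, then:
a. I recommend evaluating the option of structuring the source data in a standard format (currently, even its "json" part is not valid JSON).
b. You can try the following approach: it works, but it is very inefficient (not recommended for large-scale data processing).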
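datatable(input:string)
[
    'event=1921;json={"source":"A","name":"gone";',
    'dev=b;json={"dest":"123","home":AZ":"loc":"sys"},"ab":9,"home":null,"someKey":"someValue";'
]
// Split each row into the key=value prefix, the body of the json part, and the trailing suffix
| parse input with prefix ";json={" json:dynamic '},' suffix
// Collect the key=value pairs found in the prefix into one property bag
| mv-apply x = extract_all(@'(\w+)=(\w+)',prefix) on (
    project p = pack(tostring(x[0]),x[1])
    | summarize b1 = make_bag(p)
)
// Collect the "key":value pairs found in the suffix into a second property bag
| mv-apply y = extract_all(@'"(\w+)":"?(\w+)"?',suffix) on (
    project p = pack(tostring(y[0]),y[1])
    | summarize b2 = make_bag(p)
)
// Restore the braces around the json part, merge both bags, and expand the keys into columns
| project json = strcat("{",json,"}"),b = bag_merge(b1,b2)
| evaluate bag_unpack(b)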
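For readers who want to follow the logic of the query step by step, here is a rough Python analogue. It is a sketch, not a Kusto replacement: the function name flatten is hypothetical, and str.partition is assumed to be an acceptable stand-in for the literal delimiters used by parse (it splits on the first occurrence). The two regular expressions are the same ones passed to extract_all above.

```python
import re

def flatten(line):
    """Split `line` around ';json={' and '},' and merge all pairs into one dict.

    Returns None when either delimiter is missing.
    """
    prefix, sep1, rest = line.partition(";json={")
    body, sep2, suffix = rest.partition("},")
    if not (sep1 and sep2):
        return None
    # key=value pairs before the json part (same regex as the first extract_all)
    bag = dict(re.findall(r'(\w+)=(\w+)', prefix))
    # "key":value pairs after the json part (same regex as the second extract_all)
    bag.update(re.findall(r'"(\w+)":"?(\w+)"?', suffix))
    # Put the braces back around the json body, like strcat("{", json, "}")
    bag["json"] = "{" + body + "}"
    return bag

line2 = r'dev=b;json={"dest":"123","home":AZ":"loc":"sys"},"ab":9,"home":null,"someKey":"someValue";'
print(flatten(line2))
```

In this sketch every extracted value is a string, so 9 becomes "9" and null becomes "null"; any type conversion would have to be added separately.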