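How to change a JSON column holding an array of objects into a single object in SQL Server

Problem description

I am working with a SQL table that has a column of JSON values. Each row of this column holds a string containing a JSON structure. The structure is always an array of one or more objects, each with a single key/value pair. The number of objects and their keys can vary. For example, the first row might look like this: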

[{"Page Type":"Service"},{"Organization ID":"111555666"},{"Service ID":"333444"},{"refUrl":"https://randomURL"}]

The second row's value might look like this:

[{"Page View":"Page"},{"Search Data":"9"},{"Search distance":"undefined"},{"Search Location":"undefined"},{"Search Filters":"{}"},{"Search No Restrictions":"undefined"},{"Search Term":"Services"},{"Search Type":"Id"}]

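I am trying to convert each of these values into a single object with multiple elements.

So the first row would look like this:

{"Page Type":"Service","Organization ID":"111555666","Service ID":"333444","refUrl":"https://randomURL"}

And the second row like this: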

{"Page View":"Page","Search Data":"9","Search distance":"undefined","Search Location":"undefined","Search Filters":"{}","Search No Restrictions":"undefined","Search Term":"Services","Search Type":"Id"}

I tried this approach:

SELECT FRUA.Id, REPLACE(REPLACE(REPLACE(REPLACE(JSON_column,'{',''),'}',''),'[','{'),']','}') FROM test.table AS FRUA

This works, but it can also change braces or brackets that happen to appear inside values, and it breaks nested elements. Is there a better way to achieve this on SQL Server (Azure, version 12.0.2000.8)?
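To make that concern concrete, here is a minimal sketch of the nested REPLACE applied to a shortened, hypothetical two-element array whose first value is the string "{}" (as in the second sample row). The variable name and the shortened data are assumptions for illustration only.

```sql
-- Hypothetical shortened sample: the "Search Filters" value is the string "{}"
DECLARE @json nvarchar(max) = N'[{"Search Filters":"{}"},{"Search Type":"Id"}]';

SELECT
    @json AS Original,
    -- strip every { and }, then turn the outer [ ] into { }
    REPLACE(REPLACE(REPLACE(REPLACE(@json, '{', ''), '}', ''), '[', '{'), ']', '}') AS Flattened;
-- Flattened: {"Search Filters":"","Search Type":"Id"}
```

The result is still valid JSON, but the "{}" value has silently become an empty string, which is the kind of unintended change a purely textual replacement cannot avoid.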

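Solution

This is an unnamed JSON array of JSON objects. To access the elements of the array, this answer uses JSON_QUERY with array index paths ('$[0]', '$[1]', and so on). Once the JSON objects have been extracted from the array into columns, JSON_VALUE extracts the field value from each one. Once the field values are in columns, the resulting row is serialized with FOR JSON PATH, specifying WITHOUT_ARRAY_WRAPPER so that the output is a single object rather than a one-element array.

JSON data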

declare @json           nvarchar(max)=
N'[{"Page View":"Page"},{"Search Data":"9"},{"Search Distance":"undefined"},{"Search Location":"undefined"},{"Search Filters":"{}"},{"Search No Restrictions":"undefined"},{"Search Term":"Services"},{"Search Type":"Id"}]';

Query

with j_cte as (
    -- pull each single-key object out of the array by position
    select
        json_query(@json,'$[0]') AS a,
        json_query(@json,'$[1]') AS b,
        json_query(@json,'$[2]') AS c,
        json_query(@json,'$[3]') AS d,
        json_query(@json,'$[4]') AS e,
        json_query(@json,'$[5]') AS f,
        json_query(@json,'$[6]') AS g,
        json_query(@json,'$[7]') AS h )
-- read the value out of each object, then serialize the row as one JSON object
select
    json_value(jc.a,N'$."Page View"') AS [Page View],
    json_value(jc.b,N'$."Search Data"') AS [Search Data],
    json_value(jc.c,N'$."Search Distance"') AS [Search Distance],
    json_value(jc.d,N'$."Search Location"') AS [Search Location],
    json_value(jc.e,N'$."Search Filters"') AS [Search Filters],
    json_value(jc.f,N'$."Search No Restrictions"') AS [Search No Restrictions],
    json_value(jc.g,N'$."Search Term"') AS [Search Term],
    json_value(jc.h,N'$."Search Type"') AS [Search Type]
from j_cte jc
for json path, without_array_wrapper;

Output

{
  "Page View": "Page","Search Data": "9","Search Distance": "undefined","Search Location": "undefined","Search Filters": "{}","Search No Restrictions": "undefined","Search Term": "Services","Search Type": "Id"
}

Another possible solution is to use OPENJSON() to extract each JSON object from the stored JSON array, then build the final output with SUBSTRING() and STRING_AGG():

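Table:

CREATE TABLE Data (JsonData varchar(1000))
INSERT INTO Data (JsonData)
VALUES
   ('[{"Page View":"Page"},{"Search Data":"9"},{"Search Distance":"undefined"},{"Search Location":"undefined"},{"Search Filters":"{}"},{"Search No Restrictions":"undefined"},{"Search Term":"Services"},{"Search Type":"Id"}]'),
   ('[{"Page Type":"Service"},{"Organization ID":"111555666"},{"Service ID":"333444"},{"refUrl":"https://randomURL"}]')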

Statement:

UPDATE Data
SET JsonData = (
   SELECT CONCAT('{',STRING_AGG(SUBSTRING([value],2,LEN([value]) - 2),','),'}')
   FROM OPENJSON(JsonData)
)

Result:

JsonData
{"Page View":"Page","Search Data":"9","Search Distance":"undefined","Search Location":"undefined","Search Filters":"{}","Search No Restrictions":"undefined","Search Term":"Services","Search Type":"Id"}
{"Page Type":"Service","Organization ID":"111555666","Service ID":"333444","refUrl":"https://randomURL"}
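The SUBSTRING() step works on the text of each array element, so it depends on every element being a non-empty object, and STRING_AGG() without an ORDER BY does not promise to keep the array order. As a hedged variation, not part of the answer above, the sketch below parses each element with a second OPENJSON() call and rebuilds the object from its key/value pairs. It assumes the Data table and JsonData column from the example, assumes every value is a JSON string (as in the sample rows), and needs a version that has STRING_AGG() and STRING_ESCAPE(), such as Azure SQL Database or SQL Server 2017 and later. It is written as a SELECT so the result can be inspected before changing any data.

```sql
SELECT
    d.JsonData AS Original,
    (
        SELECT CONCAT(
            '{',
            STRING_AGG(
                -- assumption: every value is a JSON string, so it is re-quoted as one
                CONCAT('"', STRING_ESCAPE(kv.[key], 'json'), '":"',
                       STRING_ESCAPE(kv.[value], 'json'), '"'),
                ','
            ) WITHIN GROUP (ORDER BY CONVERT(int, arr.[key])),  -- keep array order
            '}'
        )
        FROM OPENJSON(d.JsonData) AS arr          -- one row per array element
        CROSS APPLY OPENJSON(arr.[value]) AS kv   -- one row per key/value pair
    ) AS Merged
FROM Data AS d;
```

Note that STRING_ESCAPE() with the 'json' type also escapes forward slashes as \/, so a URL value such as the refUrl in the second row comes back as https:\/\/randomURL. That is still valid JSON for the same string, but the text is not byte-for-byte identical to the Result shown above. Numbers, booleans, nulls, and nested objects would need extra handling based on the [type] column that OPENJSON() returns.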