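Problem description
I have a table in which one column stores large JSON arrays.
Querying the raw value appears to be much slower than parsing the column value with OPENJSON before returning it.
Questions:
Is OPENJSON actually faster than returning the large NVARCHAR value?
Why would it be faster in this situation?
Example schema: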
CREATE TABLE [ExampleTable] (
    [Id] [UNIQUEIDENTIFIER] NOT NULL,
    [Timestamp] [DATETIME2](7) NOT NULL,
    [PreviousObjects] [NVARCHAR](MAX) NOT NULL
)
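The PreviousObjects value in each row is a JSON array, typically containing about 10,000 elements.
Id is the table's primary key.
Id has a unique clustered index.
Timestamp has a non-unique, non-clustered index.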
Example query 1:
SELECT TOP 1 [PreviousObjects]
FROM [ExampleTable]
ORDER BY [Timestamp] DESC
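As you might expect, the query above was my first attempt at getting the JSON into my application.
For a JSON array with 10k elements, the response time in my Azure SQL environment is typically 10-15 seconds.
In my local environment, using a Docker-hosted instance of mcr.microsoft.com/mssql/server:2017-latest, this query can take up to 50 seconds.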
Table 'ExampleTable'. Scan count 1,logical reads 4,physical reads 0,read-ahead reads 0,lob logical reads 6972,lob physical reads 0,lob read-ahead reads 10908.
Statistics profile:
Rows,Executes,StmtText,StmtId,NodeId,Parent,PhysicalOp,LogicalOp,Argument,DefinedValues,EstimateRows,EstimateIO,EstimateCPU,AvgRowSize,TotalSubtreeCost,OutputList,Warnings,Type,Parallel,EstimateExecutions
1,1,"SELECT TOP 1 [PreviousObjects]
FROM [ExampleTable]
ORDER BY [Timestamp] DESC",NULL,0.00671277,SELECT,NULL
1," |--Top(TOP EXPRESSION:((1)))",2,Top,TOP EXPRESSION:((1)),1E-07,4035,[LocalDatabase].[dbo].[ExampleTable].[PreviousObjects],ExampleTable_ROW,1
1," |--Nested Loops(Inner Join,OUTER REFERENCES:([LocalDatabase].[dbo].[ExampleTable].[Id]))",3,Nested Loops,Inner Join,OUTER REFERENCES:([LocalDatabase].[dbo].[ExampleTable].[Id]),4.18E-05,4043,0.00671267,"[LocalDatabase].[dbo].[ExampleTable].[Timestamp],[LocalDatabase].[dbo].[ExampleTable].[PreviousObjects]",
" |--Index Scan(OBJECT:([LocalDatabase].[dbo].[ExampleTable].[IX_ExampleTable_Timestamp]),ORDERED BACKWARD)",4,Index Scan,"OBJECT:([LocalDatabase].[dbo].[ExampleTable].[IX_ExampleTable_Timestamp]),ORDERED BACKWARD","[LocalDatabase].[dbo].[ExampleTable].[Id],[LocalDatabase].[dbo].[ExampleTable].[Timestamp]",0.003125,0.000168,31,0.0032831,
" |--Clustered Index Seek(OBJECT:([LocalDatabase].[dbo].[ExampleTable].[PK_ExampleTable]),SEEK:([LocalDatabase].[dbo].[ExampleTable].[Id]=[LocalDatabase].[dbo].[ExampleTable].[Id]) LOOKUP ORDERED FORWARD)",6,Clustered Index Seek,"OBJECT:([LocalDatabase].[dbo].[ExampleTable].[PK_ExampleTable]),SEEK:([LocalDatabase].[dbo].[ExampleTable].[Id]=[LocalDatabase].[dbo].[ExampleTable].[Id]) LOOKUP ORDERED FORWARD",0.0001581,0.0034412,2
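The IO profile and statistics output for example query 1 look like what SQL Server emits when the corresponding session options are switched on. A minimal sketch of how such output could be reproduced, assuming the column is named PreviousObjects and the session runs in a client such as SSMS or sqlcmd that shows the messages; the exact settings the author used are not stated:

```sql
-- Assumption: these session options were the source of the figures above.
SET STATISTICS IO ON;       -- per-table logical/physical/LOB read counts
SET STATISTICS PROFILE ON;  -- per-operator rows, executes and estimates

SELECT TOP 1 [PreviousObjects]
FROM [ExampleTable]
ORDER BY [Timestamp] DESC;

-- Switch the options off again so later statements are not affected.
SET STATISTICS PROFILE OFF;
SET STATISTICS IO OFF;
```

Adding SET STATISTICS TIME ON to the same session would also report CPU and elapsed time on the server, which helps separate server-side work from the time spent delivering the result to the client.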
Example query 2:
DECLARE @json NVARCHAR(MAX) = (
    SELECT TOP 1 [PreviousObjects]
    FROM [ExampleTable]
    ORDER BY [Timestamp] DESC
)
SELECT *
FROM OPENJSON(@json)
WITH (
    [Id] NVARCHAR(100),
    [ElementTimestamp] DATETIME2,
    [Hash] NVARCHAR(500)
)
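This is the second query I tried and, counterintuitively, I found that it returns much faster than example query 1.
For the same data set, the response time in my Azure SQL environment is typically 1-4 seconds.
In my local environment, using a Docker-hosted instance of mcr.microsoft.com/mssql/server:2017-latest, this query tends to return consistently in under 2 seconds.
Both results for example query 2 are surprisingly fast, even though it contains the same query as example query 1, albeit reading the value into a variable in memory.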
IO profile: Table 'ExampleTable'. Scan count 1,lob logical reads 0,lob read-ahead reads 0.
Statistics profile:
Rows,"DECLARE @json NVARCHAR(MAX) = (
SELECT TOP 1 [PreviousObjects]
FROM [ExampleTable]
ORDER BY [Timestamp] DESC
)",0.006718207,NULL
0," |--Compute Scalar(DEFINE:([Expr1003]=[LocalDatabase].[dbo].[ExampleTable].[PreviousObjects]))",Compute Scalar,DEFINE:([Expr1003]=[LocalDatabase].[dbo].[ExampleTable].[PreviousObjects]),[Expr1003]=[LocalDatabase].[dbo].[ExampleTable].[PreviousObjects],[Expr1003],
" |--Nested Loops(Left Outer Join)",Left Outer Join,4.18E-06,0.006718107,
" |--Constant Scan",Constant Scan,1.157E-06,9,
" |--Top(TOP EXPRESSION:((1)))",5,
" |--Nested Loops(Inner Join,
" |--Index Scan(OBJECT:([LocalDatabase].[dbo].[ExampleTable].[IX_ExampleTable_Timestamp]),7,
" |--Clustered Index Seek(OBJECT:([LocalDatabase].[dbo].[ExampleTable].[PK_ExampleTable]),2
---
Rows,EstimateExecutions
14000,"SELECT *
FROM OPENJSON(@json)
WITH (
[Id] NVARCHAR(100),[ElementTimestamp] DATETIME2,[Hash] NVARCHAR(500)
)",50,5.0157E-05,NULL
14000," |--Table-valued function",Table-valued function,621,"OPENJSON_EXPLICIT.[Id],OPENJSON_EXPLICIT.[ElementTimestamp],OPENJSON_EXPLICIT.[Hash]",PLAN_ROW,1
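Note: in my local setup I actually have 14k rows, not the 10k stated above.
Both sets of statistics and IO profiles were recorded in the local Docker setup, but similar results were observed in the Azure SQL environment.
What is going on here?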
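One way to start narrowing this down is to compare how much data each query actually hands back to the client. The sketch below is only a diagnostic, not an explanation: it assumes the column is named PreviousObjects and that the three properties in the WITH clause are the only ones being read, and the result column names (RawJsonBytes, ElementCount, ShreddedBytes) are made up for illustration.

```sql
DECLARE @json NVARCHAR(MAX) = (
    SELECT TOP 1 [PreviousObjects]
    FROM [ExampleTable]
    ORDER BY [Timestamp] DESC
);

-- Size of the raw JSON document that example query 1 returns as one value
-- (NVARCHAR stores 2 bytes per character).
SELECT DATALENGTH(@json) AS RawJsonBytes;

-- Row count and approximate size of the result set that example query 2 returns.
SELECT COUNT(*) AS ElementCount,
       SUM(ISNULL(DATALENGTH([Id]), 0)
         + ISNULL(DATALENGTH([ElementTimestamp]), 0)
         + ISNULL(DATALENGTH([Hash]), 0)) AS ShreddedBytes
FROM OPENJSON(@json)
WITH (
    [Id] NVARCHAR(100),
    [ElementTimestamp] DATETIME2,
    [Hash] NVARCHAR(500)
);
```

If RawJsonBytes turns out to be far larger than ShreddedBytes, the array elements probably carry properties that the WITH clause leaves out, and the two queries are moving very different amounts of data. If the two numbers are similar, the difference more likely lies in how the single large value is read and streamed rather than in its size.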