gpt4 book ai didi

sql-server - XQuery 计划复杂性

转载 作者:行者123 更新时间:2023-12-02 22:26:06 26 4
gpt4 key购买 nike

我有一个 SQLCLR 标量函数,它将生成一个 XmlReader,我需要根据需要将其分解为内联结果集。这些 XML 对象是按需生成的,因此我无法使用 XML 索引。生成的数据集中通常有超过 100 列。考虑这个示例代码:

CREATE XML SCHEMA COLLECTION RAB AS '
<xsd:schema xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" elementFormDefault="qualified">
<xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd" />

<xsd:element name="r" type="r"/>

<xsd:complexType name="r">
<xsd:attribute name="a" type="sqltypes:int" use="required"/>
<xsd:attribute name="b" type="sqltypes:int" use="required"/>
<xsd:attribute name="c" type="sqltypes:int" use="required"/>
</xsd:complexType>
</xsd:schema>';
GO

DECLARE @D TABLE(x XML(DOCUMENT RAB) NOT NULL);

INSERT INTO @D
VALUES
('<r a="3" b="4" c="34"/>'),
('<r a="5" b="6" c="56"/>'),
('<r a="7" b="8" c="78"/>')

SELECT x.value('/r/@a', 'int') a, x.value('/r/@b', 'int') b, x.value('/r/@c', 'int') c
FROM @d a

这会用一些 XML 值填充表变量中的类型化 XML 列,并将属性分成单独的列。这个执行计划看起来过于困惑:

 |--Compute Scalar(DEFINE:([Expr1009]=[Expr1008], [Expr1016]=[Expr1015], [Expr1023]=[Expr1022]))
|--Nested Loops(Inner Join, OUTER REFERENCES:([a].[x]))
|--Nested Loops(Inner Join, OUTER REFERENCES:([a].[x]))
| |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[x]))
| | |--Table Scan(OBJECT:(@d AS [a]))
| | |--Stream Aggregate(DEFINE:([Expr1008]=MIN([Expr1024])))
| | |--Compute Scalar(DEFINE:([Expr1024]=CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],XML Reader with XPath filter.[lvalue],XML Reader wi
| | |--Table-valued function
| |--Stream Aggregate(DEFINE:([Expr1015]=MIN([Expr1025])))
| |--Compute Scalar(DEFINE:([Expr1025]=CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],XML Reader with XPath filter.[lvalue],XML Reader with XP
| |--Table-valued function
|--Stream Aggregate(DEFINE:([Expr1022]=MIN([Expr1026])))
|--Compute Scalar(DEFINE:([Expr1026]=CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],XML Reader with XPath filter.[lvalue],XML Reader with XPath f
|--Table-valued function

每列都有一个嵌套循环!如果我连接多个这些表,每个表有 100 列,那么查询计划将会变得太复杂。另外,我不明白这些 StreamAggregate 运算符的用途。内容如下:

MIN(
CASE WHEN @d.[x] as [a].[x] IS NULL
THEN NULL ELSE
CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant,
CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7))
,0),0))>=(128)
THEN CONVERT_IMPLICIT(int,
CASE WHEN datalength(xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7)))<(128)
THEN NULL
ELSE xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7))
END,0)
ELSE CONVERT_IMPLICIT(int,
CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7)),0),0),0)
END
END)

哎呀!我认为使用带有 sqltype 类型的类型化 XML 组应该可以避免转换?

要么我高估了它的效率,要么我做错了什么。我的问题是如何解决这个问题,这样我就不会为每列添加额外的查询计划运算符,并且最好避免转换,或者我应该放弃并找到一种非 xpath 方式来执行此操作?

引用文献:

sqlTypes http://msdn.microsoft.com/en-us/library/ee320775%28v=sql.105%29.aspx

XML 数据类型方法 http://technet.microsoft.com/en-us/library/ms190798%28v=sql.105%29.aspx

enter image description here

最佳答案

查询计划中有一些谜团需要首先理清。计算标量的作用是什么以及为什么存在流聚合。

表值函数返回分解的 XML 的节点表,每个分解的行对应一行。当您使用类型化 XML 时,这些列是 value、lvalue、lvaluebin 和 tid。这些列在计算标量中用于计算实际值。那里的代码看起来有点奇怪,我不能说我理解为什么它是这样的,但它的要点是函数 xsd_cast_to_maybe_large 返回值,并且有处理该值的代码当值等于或大于 128 字节时的情况。

CASE WHEN datalength(
CONVERT_IMPLICIT(sql_variant,
CONVERT_IMPLICIT(nvarchar(64),
xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],
XML Reader with XPath filter.[lvalue],
XML Reader with XPath filter.[lvaluebin],
XML Reader with XPath filter.[tid],(15),(5),(0)),0),0))>=(128)
THEN CONVERT_IMPLICIT(int,CASE WHEN datalength(xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],
XML Reader with XPath filter.[lvalue],
XML Reader with XPath filter.[lvaluebin],
XML Reader with XPath filter.[tid],(15),(5),(0)))<(128)
THEN NULL
ELSE xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],
XML Reader with XPath filter.[lvalue],
XML Reader with XPath filter.[lvaluebin],
XML Reader with XPath filter.[tid],(15),(5),(0))
END,0)
ELSE CONVERT_IMPLICIT(int,CONVERT_IMPLICIT(sql_variant,
CONVERT_IMPLICIT(nvarchar(64),
xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],
XML Reader with XPath filter.[lvalue],
XML Reader with XPath filter.[lvaluebin],
XML Reader with XPath filter.[tid],(15),(5),(0)),0),0),0)
END

非类型化 XML 的相同计算标量要简单得多,而且实际上更容易理解。

CASE WHEN datalength(XML Reader with XPath filter.[value])>=(128) 
THEN CONVERT_IMPLICIT(int,XML Reader with XPath filter.[lvalue],0)
ELSE CONVERT_IMPLICIT(int,XML Reader with XPath filter.[value],0)
END

如果value中的字节数超过128字节,则从lvalue获取,否则从value获取。在使用非类型化 XML 的情况下,返回的节点表仅输出列 id、value 和 lvalue。

当您使用类型化 XML 时,节点值的存储将根据架构中指定的数据类型进行优化。看起来它可能最终出现在节点表中的 value、lvalue 或 lvaluebin 中,具体取决于它是什么类型的值,并且 xsd_cast_to_maybe_large 可以帮助解决问题。

流聚合对计算标量的返回值执行 min()。我们知道并且 SQL Server 确实(至少有时)知道当您在 value() 函数中指定 XPath 时,表值函数只会返回一行。解析器确保我们正确构建 XPath,但当查询优化器查看估计行时,它会看到 200 行。解析 XML 的表值函数的基本估计是 10000 行,然后使用所使用的 XPath 进行一些调整。在本例中,最终只有 1 行,但有 200 行。我的纯粹猜测是流聚合是为了解决这种差异。它永远不会聚合任何内容,仅发送返回的一行,但它确实会影响整个分支的基数估计,并确保优化器使用 1 行作为该分支的估计。当优化器选择连接策略等时,这当然非常重要。

那么 100 个属性怎么样?是的,如果你使用值(value)函数100次,就会有100个分支。但这里还需要进行一些优化。我创建了一个测试装置,以查看使用 10 行中的 100 个属性时哪种形状和形式的查询最快。

获胜者是使用非类型化 XML,而不是使用 nodes() 函数来粉碎 r

select X.value('(/r/@a1)[1]', 'int') as a1,
X.value('(/r/@a2)[1]', 'int') as a2,
X.value('(/r/@a3)[1]', 'int') as a3
from @T

还有一种方法可以使用数据透视来避免 100 个分支,但根据实际查询的情况,这可能是不可能的。从数据透视表出来的数据类型必须相同。您当然可以将它们提取为字符串并转换为列列表中的适当类型。它还要求您的表有一个主键/唯一键。

select a1, a2, a3
from (
select T.ID, -- primary key of @T
A.X.value('local-name(.)', 'nvarchar(50)') as Name,
A.X.value('.', 'int') as Value
from @T as T
cross apply T.X.nodes('/r/@*') as A(X)
) as T
pivot(min(T.Value) for Name in (a1, a2, a3)) as P

数据透视查询的查询计划,10行100个属性:

enter image description here

下面是结果和我使用的测试设备。我测试了 100 个属性、10 行以及所有 int 属性。

结果:

Test                                                Duration (ms)
-------------------------------------------------- -------------
untyped XML value('/r[1]/@a') 195
untyped XML value('(/r/@a)[1]') 108
untyped XML value('@a') cross apply nodes('/r') 131
untyped XML value('@a') cross apply nodes('/r[1]') 127
typed XML value('/r/@a') 185
typed XML value('(/r/@a)[1]') 148
typed XML value('@a') cross apply nodes('/r') 176
untyped XML pivot 34
typed XML pivot 52

代码:

drop type dbo.TRABType
drop type dbo.TType;
drop xml schema collection dbo.RAB;

go

declare @NumAtt int = 100;
declare @Attribs nvarchar(max);

with xmlnamespaces('http://www.w3.org/2001/XMLSchema' as xsd)
select @Attribs = (
select top(@NumAtt) 'a'+cast(row_number() over(order by 1/0) as varchar(11)) as '@name',

'sqltypes:int' as '@type',
'required' as '@use'
from sys.columns
for xml path('xsd:attribute')
)
--CREATE XML SCHEMA COLLECTION RAB AS

declare @Schema nvarchar(max) =
'
<xsd:schema xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" elementFormDefault="qualified">
<xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd" />
<xsd:element name="r" type="r"/>
<xsd:complexType name="r">[ATTRIBS]</xsd:complexType>
</xsd:schema>';

set @Schema = replace(@Schema, '[ATTRIBS]', @Attribs)

create xml schema collection RAB as @Schema

go

create type dbo.TType as table
(
ID int identity primary key,
X xml not null
);

go

create type dbo.TRABType as table
(
ID int identity primary key,
X xml(document rab) not null
);


go

declare @NumAtt int = 100;
declare @NumRows int = 10;

declare @X nvarchar(max);
declare @C nvarchar(max);
declare @M nvarchar(max);

declare @S1 nvarchar(max);
declare @S2 nvarchar(max);
declare @S3 nvarchar(max);
declare @S4 nvarchar(max);
declare @S5 nvarchar(max);
declare @S6 nvarchar(max);
declare @S7 nvarchar(max);
declare @S8 nvarchar(max);
declare @S9 nvarchar(max);

set @X = N'<r '+
(
select top(@NumAtt) 'a'+cast(row_number() over(order by 1/0) as varchar(11))+'="'+cast(row_number() over(order by 1/0) as varchar(11))+'" '
from sys.columns
for xml path('')
)+
'/>';

set @C =
stuff((
select top(@NumAtt) ',a'+cast(row_number() over(order by 1/0) as varchar(11))
from sys.columns
for xml path('')
), 1, 1, '')

set @M =
stuff((
select top(@NumAtt) ',MAX(CASE WHEN name = ''a'+cast(row_number() over(order by 1/0) as varchar(11))+''' THEN val END)'
from sys.columns
for xml path('')
), 1, 1, '')


declare @T dbo.TType;
insert into @T(X)
select top(@NumRows) @X
from sys.columns;

declare @TRAB dbo.TRABType;
insert into @TRAB(X)
select top(@NumRows) @X
from sys.columns;


-- value('/r[1]/@a')
set @S1 = N'
select T.ID'+
(
select top(@NumAtt) ', T.X.value(''/r[1]/@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')'
from sys.columns
for xml path('')
)+
' from @T as T
option (maxdop 1)';

-- value('(/r/@a)[1]')
set @S2 = N'
select T.ID'+
(
select top(@NumAtt) ', T.X.value(''(/r/@a'+cast(row_number() over(order by 1/0) as varchar(11))+')[1]'', ''int'')'
from sys.columns
for xml path('')
)+
' from @T as T
option (maxdop 1)';

-- value('@a') cross apply nodes('/r')
set @S3 = N'
select T.ID'+
(
select top(@NumAtt) ', T2.X.value(''@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')'
from sys.columns
for xml path('')
)+
' from @T as T
cross apply T.X.nodes(''/r'') as T2(X)
option (maxdop 1)';


-- value('@a') cross apply nodes('/r[1]')
set @S4 = N'
select T.ID'+
(
select top(@NumAtt) ', T2.X.value(''@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')'
from sys.columns
for xml path('')
)+
' from @T as T
cross apply T.X.nodes(''/r[1]'') as T2(X)
option (maxdop 1)';

-- value('/r/@a') typed XML
set @S5 = N'
select T.ID'+
(
select top(@NumAtt) ', T.X.value(''/r/@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')'
from sys.columns
for xml path('')
)+
' from @TRAB as T
option (maxdop 1)';

-- value('(/r/@a)[1]')
set @S6 = N'
select T.ID'+
(
select top(@NumAtt) ', T.X.value(''(/r/@a'+cast(row_number() over(order by 1/0) as varchar(11))+')[1]'', ''int'')'
from sys.columns
for xml path('')
)+
' from @TRAB as T
option (maxdop 1)';

-- value('@a') cross apply nodes('/r') typed XML
set @S7 = N'
select T.ID'+
(
select top(@NumAtt) ', T2.X.value(''@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')'
from sys.columns
for xml path('')
)+
' from @TRAB as T
cross apply T.X.nodes(''/r'') as T2(X)
option (maxdop 1)';

-- pivot
set @S8 = N'
select ID, '+@C+'
from (
select T.ID,
A.X.value(''local-name(.)'', ''nvarchar(50)'') as Name,
A.X.value(''.'', ''int'') as Value
from @T as T
cross apply T.X.nodes(''/r/@*'') as A(X)
) as T
pivot(min(T.Value) for Name in ('+@C+')) as P
option (maxdop 1)';

-- typed pivot
set @S9 = N'
select ID, '+@C+'
from (
select T.ID,
A.X.value(''local-name(.)'', ''nvarchar(50)'') as Name,
cast(cast(A.X.query(''string(.)'') as varchar(11)) as int) as Value
from @TRAB as T
cross apply T.X.nodes(''/r/@*'') as A(X)
) as T
pivot(min(T.Value) for Name in ('+@C+')) as P
option (maxdop 1)';


exec sp_executesql @S1, N'@T dbo.TType readonly', @T;
exec sp_executesql @S2, N'@T dbo.TType readonly', @T;
exec sp_executesql @S3, N'@T dbo.TType readonly', @T;
exec sp_executesql @S4, N'@T dbo.TType readonly', @T;
exec sp_executesql @S5, N'@TRAB dbo.TRABType readonly', @TRAB;
exec sp_executesql @S6, N'@TRAB dbo.TRABType readonly', @TRAB;
exec sp_executesql @S7, N'@TRAB dbo.TRABType readonly', @TRAB;
exec sp_executesql @S8, N'@T dbo.TType readonly', @T;
exec sp_executesql @S9, N'@TRAB dbo.TRABType readonly', @TRAB;

关于sql-server - XQuery 计划复杂性,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/26684398/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com