gpt4 book ai didi

sql-server - SQL Server 2016 中 STRING_SPLIT 位置的返回值

转载 作者:行者123 更新时间:2023-12-02 20:23:59 24 4
gpt4 key购买 nike

我可以使用 SQL Server 2016 或更高版本中的 STRING_SPLIT 函数返回特定位置的值吗?

我知道不能保证 select 的顺序,但是它是使用 STRING_SPLIT 吗?

DROP TABLE IF EXISTS #split

SELECT 'z_y_x' AS splitIt
INTO #split UNION
SELECT 'a_b_c'

SELECT * FROM #split;

WITH cte
AS (
SELECT ROW_NUMBER() OVER ( PARTITION BY s.splitIt ORDER BY s.splitIt ) AS position,
s.splitIt,
value
FROM #split s
CROSS APPLY STRING_SPLIT(s.splitIt, '_')
)
SELECT * FROM cte WHERE position = 2

这总是返回第二个元素的值吗? b 代表 a_b_c,y 代表 z_y_x?

我不明白为什么 Microsoft 不随该函数的值返回位置指示符列。

最佳答案

有 - 从 v2016 开始 - 一个解决方案 via FROM OPENJSON() :

DECLARE @str VARCHAR(100) = 'val1,val2,val3';

SELECT *
FROM OPENJSON('["' + REPLACE(@str,',','","') + '"]');

结果

key value   type
0 val1 1
1 val2 1
2 val3 1

文档清楚地说明了:

When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.

对于您的情况,这是:

SELECT 'z_y_x' AS splitIt
INTO #split UNION
SELECT 'a_b_c'

DECLARE @delimiter CHAR(1)='_';

SELECT *
FROM #split
CROSS APPLY OPENJSON('["' + REPLACE(splitIt,@delimiter,'","') + '"]') s
WHERE s.[key]=1; --zero based

希望 STRING_SPLIT() 的 future 版本将包含此信息

更新性能测试,与流行的Jeff-Moden-splitter

进行比较

试试这个:

USE master;
GO

CREATE DATABASE dbTest;
GO

USE dbTest;
GO
--Jeff Moden's splitter
CREATE FUNCTION [dbo].[DelimitedSplit8K](@pString VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS(
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
GO
--Avoid first call bias
SELECT * FROM dbo.DelimitedSplit8K('a,b,c',',');
GO

--Table to keep the results
CREATE TABLE Results(ID INT IDENTITY,ResultSource VARCHAR(100),durationMS INT, RowsCount INT);
GO
--Table with strings to split
CREATE TABLE dbo.DelimitedItems(ID INT IDENTITY,DelimitedNString nvarchar(4000),DelimitedString varchar(8000));
GO

--获取包含 100 个项目的随机混合字符串的行
--尝试使用行数(GO 后面的计数)和 TOP 的计数

INSERT INTO DelimitedItems(DelimitedNString)
SELECT STUFF((
SELECT TOP 100 ','+REPLACE(v.[name],',',';')
FROM master..spt_values v
WHERE LEN(v.[name])>0
ORDER BY NewID()
FOR XML PATH('')),1,1,'')
--Keep it twice in varchar and nvarchar
UPDATE DelimitedItems SET DelimitedString=DelimitedNString;
GO 500 --create 500 differently mixed rows

--测试

DECLARE @d DATETIME2;

SET @d = SYSUTCDATETIME();
SELECT DI.ID, DS.Item, DS.ItemNumber
INTO #TEMP
FROM dbo.DelimitedItems DI
CROSS APPLY dbo.DelimitedSplit8K(DI.DelimitedNString,',') DS;
INSERT INTO Results(ResultSource,RowsCount,durationMS)
SELECT 'delimited8K with NVARCHAR(4000)'
,(SELECT COUNT(*) FROM #TEMP) AS RowCountInTemp
,DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME()) AS Duration_NV_ms_delimitedSplit8K

SET @d = SYSUTCDATETIME();
SELECT DI.ID, DS.Item, DS.ItemNumber
INTO #TEMP2
FROM dbo.DelimitedItems DI
CROSS APPLY dbo.DelimitedSplit8K(DI.DelimitedString,',') DS;
INSERT INTO Results(ResultSource,RowsCount,durationMS)
SELECT 'delimited8K with VARCHAR(8000)'
,(SELECT COUNT(*) FROM #TEMP2) AS RowCountInTemp
,DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME()) AS Duration_V_ms_delimitedSplit8K

SET @d = SYSUTCDATETIME();
SELECT DI.ID, OJ.[Value] AS Item, OJ.[Key] AS ItemNumber
INTO #TEMP3
FROM dbo.DelimitedItems DI
CROSS APPLY OPENJSON('["' + REPLACE(DI.DelimitedNString,',','","') + '"]') OJ;
INSERT INTO Results(ResultSource,RowsCount,durationMS)
SELECT 'OPENJSON with NVARCHAR(4000)'
,(SELECT COUNT(*) FROM #TEMP3) AS RowCountInTemp
,DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME()) AS Duration_NV_ms_OPENJSON

SET @d = SYSUTCDATETIME();
SELECT DI.ID, OJ.[Value] AS Item, OJ.[Key] AS ItemNumber
INTO #TEMP4
FROM dbo.DelimitedItems DI
CROSS APPLY OPENJSON('["' + REPLACE(DI.DelimitedString,',','","') + '"]') OJ;
INSERT INTO Results(ResultSource,RowsCount,durationMS)
SELECT 'OPENJSON with VARCHAR(8000)'
,(SELECT COUNT(*) FROM #TEMP4) AS RowCountInTemp
,DATEDIFF(MILLISECOND,@d,SYSUTCDATETIME()) AS Duration_V_ms_OPENJSON
GO
SELECT * FROM Results;
GO

--清理

DROP TABLE #TEMP;
DROP TABLE #TEMP2;
DROP TABLE #TEMP3;
DROP TABLE #TEMP4;

USE master;
GO
DROP DATABASE dbTest;

结果:

500 行 200 个项目

1220    delimited8K with NVARCHAR(4000)
274 delimited8K with VARCHAR(8000)
417 OPENJSON with NVARCHAR(4000)
443 OPENJSON with VARCHAR(8000)

500 行 100 个项目

421 delimited8K with NVARCHAR(4000)
140 delimited8K with VARCHAR(8000)
213 OPENJSON with NVARCHAR(4000)
212 OPENJSON with VARCHAR(8000)

5 行 100 个项目

10  delimited8K with NVARCHAR(4000)
5 delimited8K with VARCHAR(8000)
3 OPENJSON with NVARCHAR(4000)
4 OPENJSON with VARCHAR(8000)

500 行中的 5 个项目

32  delimited8K with NVARCHAR(4000)
30 delimited8K with VARCHAR(8000)
28 OPENJSON with NVARCHAR(4000)
24 OPENJSON with VARCHAR(8000)

--无限长度(仅适用于OPENJSON)--填充时没有TOP子句
--结果约为 500 行中的 500 个项目

1329    OPENJSON with NVARCHAR(4000)
1117 OPENJSON with VARCHAR(8000)

事实:

  • 流行的分割函数不喜欢NVARCHAR
  • 该函数仅限于 8k 字节容量内的字符串
  • 只有在 VARCHAR 中包含很多项和很多行的情况下,拆分器函数才会位于前面。
  • 在所有其他情况下,OPENJSON 似乎或多或少更快......
  • OPENJSON 可以处理(几乎)无限数量
  • OPENJSON v2016 要求
  • 每个人都在等待 STRING_SPLIT 的位置

更新在测试中添加了 STRING_SPLIT

同时,我使用 STRING_SPLIT() 重新运行测试,其中包含另外两个测试部分。作为位置,我必须返回一个硬编码值,因为该函数不返回部件的索引。

在所有测试的情况下,OPENJSONSTRING_SPLIT 接近,而且通常更快:

1000 行中有 5 个项目

250 delimited8K with NVARCHAR(4000)
124 delimited8K with VARCHAR(8000) --this function is best with many rows in VARCHAR
203 OPENJSON with NVARCHAR(4000)
204 OPENJSON with VARCHAR(8000)
235 STRING_SPLIT with NVARCHAR(4000)
234 STRING_SPLIT with VARCHAR(8000)

30 行 200 个项目

140 delimited8K with NVARCHAR(4000)
31 delimited8K with VARCHAR(8000)
47 OPENJSON with NVARCHAR(4000)
31 OPENJSON with VARCHAR(8000)
47 STRING_SPLIT with NVARCHAR(4000)
31 STRING_SPLIT with VARCHAR(8000)

10000 行中的 100 个项目

8145    delimited8K with NVARCHAR(4000)
2806 delimited8K with VARCHAR(8000) --fast with many rows!
5112 OPENJSON with NVARCHAR(4000)
4501 OPENJSON with VARCHAR(8000)
5028 STRING_SPLIT with NVARCHAR(4000)
5126 STRING_SPLIT with VARCHAR(8000)

关于sql-server - SQL Server 2016 中 STRING_SPLIT 位置的返回值,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/50061036/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com