
sql-server - Select the string between "\" and "." in SQL Server

Reposted. Author: 行者123. Updated: 2023-12-05 00:31:17

I have a table with a column named file that contains directory paths. Sample data:

C:\filedata\6860_f11.xlxb_3.30 test - 0.3 ML 

C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML

Note that I only want to get 6860_f11.xlxb for #1 and 1191_f12.xlxb for #2.

For #1 the path contains only one folder, filedata, but for #2 it contains two folders, cloud\files.

Here is my code:

select
    SUBSTRING([file], 0, CHARINDEX('.xlxb', [file]) + 4) as xlsb_file
from
    [Projects].[dbo].[ProjFiles]

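Is there a way to get the string after the folders, up to the underscore that follows .xlxb?

Best answer

No CLR needed. No regex needed. The simplest and best-performing way to solve this is with NGrams8K. It's 2 a.m. where I live, so I'll keep this quick.

Note this query: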

DECLARE @string VARCHAR(150) = 'C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML';

SELECT RetPos = f.p, RetVal = e.s
FROM (SELECT MAX(position)+1 FROM samd.NGrams8k(@string,1) WHERE token = '\') AS f(p)
CROSS APPLY (VALUES(SUBSTRING(@string,f.p,CHARINDEX('.',@string,f.p)-f.p+5))) AS e(s);

Results:

RetPos RetVal
------ ---------------
16 1191_f12.xlxb
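
If samd.NGrams8k is not installed, the same position can probably be found with built-in functions only. The following is a minimal sketch, not the answer's method: it swaps the NGrams lookup for REVERSE plus CHARINDEX to locate the last backslash, and it assumes a VARCHAR input (so DATALENGTH equals the character count) that contains at least one '\' followed by a '.'.

```sql
-- Sketch: built-in functions only, no samd.NGrams8k.
-- Assumes VARCHAR input containing at least one '\' followed by a '.'.
DECLARE @string VARCHAR(150) = 'C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML';

SELECT RetPos = f.p, RetVal = e.s
FROM
(
  -- Position of the last '\' is DATALENGTH - (its position in the reversed string) + 1;
  -- adding one more gives the first character of the file name.
  VALUES (DATALENGTH(@string) - CHARINDEX('\', REVERSE(@string)) + 2)
) AS f(p)
CROSS APPLY
(
  -- Same length arithmetic as the query above: up to the dot, plus a 4-character extension.
  VALUES (SUBSTRING(@string, f.p, CHARINDEX('.', @string, f.p) - f.p + 5))
) AS e(s);
```

For the sample string this should return the same row as above (RetPos 16, RetVal 1191_f12.xlxb). DATALENGTH is used instead of LEN because LEN ignores trailing spaces, which would shift the computed position for values that end in a space.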

Now against a table:

CREATE TABLE #yourtable ([file] VARCHAR(150));
INSERT INTO #yourtable
VALUES ('C:\filedata\6860_f11.xlxb_3.30 test - 0.3 ML'),
('C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML');

SELECT *
FROM #yourtable AS t
CROSS APPLY
(
SELECT newstring = e.s
FROM (SELECT MAX(position) FROM samd.NGrams8k(t.[file],1) WHERE token = '\') AS f(p)
CROSS APPLY (VALUES(SUBSTRING(t.[file],f.p+1,CHARINDEX('.',t.[file],f.p+1)-f.p+4))) AS e(s)
) AS itvf_str_extract;
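
The queries above hard-code the length of the extension (the +4 and +5). Since the question asks for everything up to the underscore that follows .xlxb, here is a hedged variant that looks for that underscore instead. It is a sketch under assumptions: the table name #demo is made up for illustration, the column is VARCHAR, and rows without a backslash, dot, or trailing underscore are meant to come back as NULL rather than raise an error.

```sql
-- Hypothetical demo table; the name #demo is not from the original post.
CREATE TABLE #demo ([file] VARCHAR(150));
INSERT INTO #demo
VALUES ('C:\filedata\6860_f11.xlxb_3.30 test - 0.3 ML'),
       ('C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML');

SELECT t.[file], newstring = e.s
FROM #demo AS t
-- f.p: first character after the last backslash
CROSS APPLY (VALUES (DATALENGTH(t.[file]) - CHARINDEX('\', REVERSE(t.[file])) + 2)) AS f(p)
-- d.p: first dot at or after f.p (0 if none)
CROSS APPLY (VALUES (CHARINDEX('.', t.[file], f.p))) AS d(p)
-- u.p: first underscore at or after that dot, which skips the underscore inside the file name
CROSS APPLY (VALUES (CHARINDEX('_', t.[file], d.p))) AS u(p)
-- CASE guards SUBSTRING against a negative length when a delimiter is missing
CROSS APPLY (VALUES (CASE WHEN d.p > 0 AND u.p > d.p
                          THEN SUBSTRING(t.[file], f.p, u.p - f.p)
                     END)) AS e(s);

DROP TABLE #demo;
```

For the two sample rows this should yield 6860_f11.xlxb and 1191_f12.xlxb. Searching for the underscore from the dot onward matters here because both file names already contain an underscore before the extension.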

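It really is that simple. It will also outperform any CLR/regex-based solution, period.

Side note: John Cappelletti's solution is excellent (as always). Under the hood it is very similar to my NGrams solution, but not identical; compare these two queries: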

DECLARE @string VARCHAR(150) = 'C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML';
DECLARE @Delimiter1 varchar(100) = '\', @Delimiter2 varchar(100) = '.';

-- Alan B
SELECT
RetSeq = 1,
RetPos = f.p,
RetVal = e.s
FROM (SELECT MAX(position)+1 FROM samd.NGrams8k(@string,1) WHERE token = '\') AS f(p)
CROSS APPLY (VALUES(SUBSTRING(@string,f.p,CHARINDEX('.',@string,f.p)-f.p+5))) AS e(s);

-- John C
with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(N) As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
cte3(N) As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)

Select RetSeq = Row_Number() over (Order By N)
,RetPos = N
,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1)
From (
Select *,RetVal = Substring(@String, N, L)
From cte4
) A
Where charindex(@Delimiter2,RetVal)>1;
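
To make the "under the hood" comparison more concrete, here is a rough sketch of what the unigram step amounts to: number every character position with a tally, take the one-character token at each position, and keep the highest position whose token is a backslash. This is a simplified stand-in written for illustration, not the actual source of samd.NGrams8k; the CTE names e1, tally, and unigrams are made up, and a VARCHAR input is assumed.

```sql
-- Simplified stand-in for the unigram lookup; not the real NGrams8k source.
DECLARE @string VARCHAR(150) = 'C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML';

WITH e1(n) AS
(
  -- 10 seed rows
  SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(n)
),
tally(position) AS
(
  -- One row per character: 10^4 candidate rows, capped at the string length
  SELECT TOP (ISNULL(DATALENGTH(@string), 0))
         ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
  FROM e1 AS a CROSS JOIN e1 AS b CROSS JOIN e1 AS c CROSS JOIN e1 AS d
),
unigrams AS
(
  -- The single-character token at each position
  SELECT position, token = SUBSTRING(@string, position, 1)
  FROM tally
)
SELECT LastBackslash = MAX(position)
FROM unigrams
WHERE token = '\';
```

For the sample string this should return 15, which is why the queries above start the extraction at position 16. John C's query builds a similar tally in cte1 and cte2, but uses it to split the string on every delimiter rather than to find only the last one.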

Now the execution plans:

[Image: execution plans for the two queries; not included in this copy]

Regarding sql-server - Select the string between "\" and "." in SQL Server, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53718380/
