
analytics - Efficient way to generate 2 billion rows in SQL Server 2014 Developer

Reposted. Author: 行者123. Updated: 2023-12-02 15:02:04

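Long story short: I'm testing a system that purges entries from a table over a network connection, and the feature is expected to handle upwards of 2 billion entries.

I need to stress-test this to be sure.

Here is my test script (it produces at most 9.8 million rows in ten minutes):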

DECLARE @I INT = 0

WHILE @I < 2000000001
BEGIN
    INSERT INTO "Table here"
    VALUES(@I)

    SET @I = @I + 1
END

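Can anyone offer any suggestions, or let me know what the upper limit of my test environment is likely to be in this situation?

Best answer

Below is a method that uses CROSS JOIN in batches of 10M rows. It loaded 2 billion rows in about 6 minutes on my desktop machine.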

--create and load 2B row table using cross join CTE source
CREATE TABLE dbo.TableHere(
    ID int NOT NULL
);

DECLARE
      @TargetRowCount int = 2000000000
    , @RowsInserted int = 0;

WHILE @RowsInserted < @TargetRowCount
BEGIN

    --t10 = 10 rows, t1k = 10*10*10 rows, t10m = 1k*1k*10 = 10M numbered rows
    WITH
         t10 AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t(n))
        ,t1k AS (SELECT 0 AS n FROM t10 AS a CROSS JOIN t10 AS b CROSS JOIN t10 AS c)
        ,t10m AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS num FROM t1k AS a
            CROSS JOIN t1k AS b CROSS JOIN t10 AS c)
    INSERT INTO dbo.TableHere WITH(TABLOCKX)
    SELECT num + @RowsInserted --offset each batch by the rows already loaded
    FROM t10m;

    SET @RowsInserted += @@ROWCOUNT;

    --report progress immediately instead of waiting for the batch to finish
    DECLARE @CurrentTimestampString varchar(1000) = FORMAT(SYSDATETIME(),'yyyy-MM-dd HH:mm:ss');
    RAISERROR('%s: %d of %d rows inserted', 0, 0, @CurrentTimestampString, @RowsInserted, @TargetRowCount) WITH NOWAIT;

END;
GO
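
Once the loop finishes, it may be worth confirming that the table holds what the test expects before pointing the purge at it. The following is only a sketch, assuming the dbo.TableHere table from the script above; the column aliases are illustrative. The first query reads the row count from metadata, which is fast but approximate. The second scans the table, so it is exact but can take a while on 2 billion rows.

```sql
-- Fast, approximate row count read from metadata (no table scan).
SELECT SUM(p.rows) AS ApproxRowCount
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.TableHere')
  AND p.index_id IN (0, 1); -- 0 = heap, 1 = clustered index

-- Exact check; scans the entire table.
-- COUNT_BIG returns bigint, so it stays safe if the row count ever exceeds the int range.
SELECT COUNT_BIG(*) AS ExactRowCount
     , MIN(ID)      AS MinID
     , MAX(ID)      AS MaxID
FROM dbo.TableHere;
```

If the load ran to completion on an empty table, the exact query should report 2,000,000,000 rows with IDs running from 1 to 2,000,000,000, because ROW_NUMBER() starts at 1 and each batch is offset by the rows already inserted.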

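Consider creating a permanent tally table with incremental values. Not only does this improve the performance of test data generation, the materialized tally table can also be reused for a variety of other common tasks, such as generating ranges of incremental values, datetime sequences, and so on.

By avoiding the repeated cross joins in each iteration, using a tally table improved performance on my machine by 30% (4.5 minutes). Edit: down to 2.5 minutes after a processor upgrade (Intel i9-12900K).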

SET NOCOUNT ON;
--create and load 10M row tally table
--DROP TABLE IF EXISTS requires SQL Server 2016+, so use OBJECT_ID on SQL Server 2014
IF OBJECT_ID(N'dbo.Tally', N'U') IS NOT NULL
    DROP TABLE dbo.Tally;
CREATE TABLE dbo.Tally(
    Num int NOT NULL CONSTRAINT PK_Tally PRIMARY KEY
);
WITH
     t10 AS (SELECT n FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) t(n))
    ,t1k AS (SELECT 0 AS n FROM t10 AS a CROSS JOIN t10 AS b CROSS JOIN t10 AS c)
    ,t10m AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS num FROM t1k AS a
        CROSS JOIN t1k AS b CROSS JOIN t10 AS c)
INSERT INTO dbo.Tally WITH(TABLOCKX) (Num)
SELECT num
FROM t10m;
ALTER INDEX PK_Tally ON dbo.Tally REBUILD WITH (FILLFACTOR=100);
GO

--create and load 2B row table using tally table source
--(assumes dbo.TableHere does not already exist; drop it first if the cross join script was run)
CREATE TABLE dbo.TableHere(
    ID int NOT NULL
);

DECLARE
      @TargetRowCount int = 2000000000
    , @RowsInserted int = 0;

WHILE @RowsInserted < @TargetRowCount
BEGIN

    --each iteration copies the 10M tally rows, offset by the rows already loaded
    INSERT INTO dbo.TableHere WITH(TABLOCKX) (ID)
    SELECT Num + @RowsInserted
    FROM dbo.Tally;

    SET @RowsInserted += @@ROWCOUNT;

    DECLARE @CurrentTimestampString varchar(1000) = FORMAT(SYSDATETIME(),'yyyy-MM-dd HH:mm:ss');
    RAISERROR('%s: %d of %d rows inserted', 0, 0, @CurrentTimestampString, @RowsInserted, @TargetRowCount) WITH NOWAIT;

END;
GO
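
To illustrate the reuse mentioned earlier, here is a minimal sketch of a date sequence built from the same tally table. It assumes dbo.Tally exists as created above, with Num running from 1 to 10,000,000. The @StartDate and @DayCount variables and their values are hypothetical, chosen only for the example.

```sql
-- Hypothetical inputs for illustration; adjust as needed.
DECLARE @StartDate date = '20240101';
DECLARE @DayCount  int  = 31;

-- One row per day: Num = 1 maps to the start date, Num = 2 to the next day, and so on.
-- The predicate on Num can use the PK_Tally index, so only @DayCount rows are read.
SELECT DATEADD(DAY, Num - 1, @StartDate) AS CalendarDate
FROM dbo.Tally
WHERE Num <= @DayCount
ORDER BY Num;
```

With the values shown, this returns one row for each day of January 2024. The same pattern should work for other units (HOUR, MINUTE, and so on) by changing the DATEADD datepart.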

Regarding analytics - Efficient way to generate 2 billion rows in SQL Server 2014 Developer, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48776162/
