gpt4 book ai didi

sql-server - 批量插入带有外键的嵌套 xml 作为第一个表的标识列

转载 作者:数据小太阳 更新时间:2023-10-29 02:08:19 26 4
gpt4 key购买 nike

我有一个 xml 如下:

<Records>
<Record>
<Name>Best of Pop</Name>
<Studio>ABC studio</Studio>
<Artists>
<Artist>
<ArtistName>John</ArtistName>
<Age>36</Age>
</Artist>
<Artist>
<ArtistName>Jessica</ArtistName>
<Age>20</Age>
</Artist>
</Artists>
</Record>
<Record>
<Name>Nursery rhymes</Name>
<Studio>XYZ studio</Studio>
<Artists>
<Artist>
<ArtistName>Judy</ArtistName>
<Age>10</Age>
</Artist>
<Artist>
<ArtistName>Rachel</ArtistName>
<Age>15</Age>
</Artist>
</Artists>
</Record>
</Records>

此文件可能包含数百万条记录。我的 MS SQL 数据库,运行于 Azure SQL Database , 有以下 2 个表来存储这些记录:

  1. Record (RecordId [PK, identity, auto-increment], Name, Studio)

  2. 艺术家(RecordId [外键引用 Record.RecordId],艺术家姓名,年龄)

是否可以在Record表中批量插入记录,获取RecordIds,然后在一次遍历xml中批量插入艺术家信息到Artist表中使用 xml 节点方法?

长期以来,我一直在寻找一种有效的方法来做到这一点,但没有成功。

我尝试过类似于描述的方法 herehere ,但我无法找到解决方案。

解决方案方向的任何指示都会有很大帮助。

更新:@srutzky:感谢您的解决方案。这完全符合我的要求。但有一个陷阱。我必须使用节点方法来解决问题。我已经更改了查询的第一部分。但是我被困在了下半场。这就是我的工作。

DECLARE @Record TABLE (RecordId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
Name NVARCHAR(400) UNIQUE,
Studio NVARCHAR(400));
DECLARE @Artist TABLE (ArtistId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
RecordId INT NOT NULL,
ArtistName NVARCHAR(400), Age INT);

INSERT INTO @Record (Name, Studio)
SELECT T.c.value(N'(Name/text())[1]', 'NVARCHAR(400)'),
T.c.value(N'(Studio/text())[1]', 'NVARCHAR(400)')
FROM @ImportData.nodes('/Records/Record') T(c);

SELECT * FROM @Record

你能帮我完成第二部分吗?我是这种 xml 处理方法的新手。

UPDATE2:我明白了......我绞尽脑汁思考了几个小时,尝试了一些东西,最终找到了解决方案。

DECLARE @Record TABLE (RecordId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
Name NVARCHAR(400) UNIQUE,
Studio NVARCHAR(400));
DECLARE @Artist TABLE (ArtistId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
RecordId INT NOT NULL,
ArtistName NVARCHAR(400),
Age INT);

INSERT INTO @Record (Name, Studio)
SELECT T.c.value(N'(Name/text())[1]', 'NVARCHAR(400)'),
T.c.value(N'(Studio/text())[1]', 'NVARCHAR(400)')
FROM @ImportData.nodes('/Records/Record') T(c);

INSERT INTO @Artist (RecordId, ArtistName, Age)
SELECT (SELECT RecordId FROM @Record WHERE Name=T.c.value(N'(../../Name/text())[1]', 'NVARCHAR(400)')),
T.c.value(N'(ArtistName/text())[1]', 'NVARCHAR(400)'),
T.c.value(N'(Age/text())[1]', 'INT')
FROM @ImportData.nodes('/Records/Record/Artists/Artist') T(c);

SELECT * FROM @Record
SELECT * FROM @Artist

@srutzky:非常感谢您为我指明了正确的方向。欢迎提出任何改进此解决方案的建议。

最佳答案

这无论如何都不能一次性完成,因为您不能在同一 DML 语句中插入两个表(好吧,在触发器和 OUTPUT 子句之外,这两者在这里都无济于事)。但它可以通过两次有效地完成。 <Name> 的事实<Record> 中的元素是唯一的是关键,因为它允许我们使用 Record表作为第二遍的查找表(即当我们获取 Artist 行时)。

首先,您需要(好吧,应该)创建一个 UNIQUE INDEXRecord (Name ASC) .在下面的示例中,我使用的是 UNIQUE CONSTRAINT ,但这只是因为我使用表变量而不是临时表来使示例代码更容易重新运行(不需要在顶部显式 IF EXISTS DROP)。该指标将有助于第二遍的性能。

该示例使用 OPENXML,因为这很可能比使用 .nodes() 更有效函数,因为同一个文档需要遍历两次。 OPENXML 的最后一个参数功能,2 , 指定文档是“基于元素的”,因为默认解析正在寻找“基于属性的”。

DECLARE @DocumentID INT, @ImportData XML;

SET @ImportData = N'
<Records>
<Record>
<Name>Best of Pop</Name>
<Studio>ABC studio</Studio>
<Artists>
<Artist>
<ArtistName>John</ArtistName>
<Age>36</Age>
</Artist>
<Artist>
<ArtistName>Jessica</ArtistName>
<Age>20</Age>
</Artist>
</Artists>
</Record>
<Record>
<Name>Nursery rhymes</Name>
<Studio>XYZ studio</Studio>
<Artists>
<Artist>
<ArtistName>Judy</ArtistName>
<Age>10</Age>
</Artist>
<Artist>
<ArtistName>Rachel</ArtistName>
<Age>15</Age>
</Artist>
</Artists>
</Record>
</Records>';


DECLARE @Record TABLE (RecordId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
Name NVARCHAR(400) UNIQUE,
Studio NVARCHAR(400));
DECLARE @Artist TABLE (ArtistId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
RecordId INT NOT NULL,
ArtistName NVARCHAR(400), Age INT);

EXEC sp_xml_preparedocument @DocumentID OUTPUT, @ImportData;

-- First pass: extract "Record" rows
INSERT INTO @Record (Name, Studio)
SELECT Name, Studio
FROM OPENXML (@DocumentID, N'/Records/Record', 2)
WITH (Name NVARCHAR(400) './Name/text()',
Studio NVARCHAR(400) './Studio/text()');


-- Second pass: extract "Artist" rows
INSERT INTO @Artist (RecordId, ArtistName, Age)
SELECT rec.RecordId, art.ArtistName, art.Age
FROM OPENXML (@DocumentID, N'/Records/Record/Artists/Artist', 2)
WITH (Name NVARCHAR(400) '../../Name/text()',
ArtistName NVARCHAR(400) './ArtistName/text()',
Age INT './Age/text()') art
INNER JOIN @Record rec
ON rec.[Name] = art.[Name];


EXEC sp_xml_removedocument @DocumentID;
-------------------

SELECT * FROM @Record ORDER BY [RecordID];
SELECT * FROM @Artist ORDER BY [RecordID];

引用资料:

编辑:
根据使用 .nodes() 的新要求函数而不是 OPENXML ,以下将起作用:

DECLARE @ImportData XML;

SET @ImportData = N'
<Records>
<Record>
<Name>Best of Pop</Name>
<Studio>ABC studio</Studio>
<Artists>
<Artist>
<ArtistName>John</ArtistName>
<Age>36</Age>
</Artist>
<Artist>
<ArtistName>Jessica</ArtistName>
<Age>20</Age>
</Artist>
</Artists>
</Record>
<Record>
<Name>Nursery rhymes</Name>
<Studio>XYZ studio</Studio>
<Artists>
<Artist>
<ArtistName>Judy</ArtistName>
<Age>10</Age>
</Artist>
<Artist>
<ArtistName>Rachel</ArtistName>
<Age>15</Age>
</Artist>
</Artists>
</Record>
</Records>';

IF (OBJECT_ID('tempdb..#Record') IS NOT NULL)
BEGIN
DROP TABLE #Record;
END;
IF (OBJECT_ID('tempdb..#Artist') IS NOT NULL)
BEGIN
DROP TABLE #Artist;
END;

CREATE TABLE #Record (RecordId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
Name NVARCHAR(400) UNIQUE,
Studio NVARCHAR(400));
CREATE TABLE #Artist (ArtistId INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
RecordId INT NOT NULL,
ArtistName NVARCHAR(400),
Age INT);


-- First pass: extract "Record" rows
INSERT INTO #Record (Name, Studio)
SELECT col.value(N'(./Name/text())[1]', N'NVARCHAR(400)') AS [Name],
col.value(N'(./Studio/text())[1]', N'NVARCHAR(400)') AS [Studio]
FROM @ImportData.nodes(N'/Records/Record') tab(col);


-- Second pass: extract "Artist" rows
;WITH artists AS
(
SELECT col.value(N'(../../Name/text())[1]', N'NVARCHAR(400)') AS [RecordName],
col.value(N'(./ArtistName/text())[1]', N'NVARCHAR(400)') AS [ArtistName],
col.value(N'(./Age/text())[1]', N'INT') AS [Age]
FROM @ImportData.nodes(N'/Records/Record/Artists/Artist') tab(col)
)
INSERT INTO #Artist (RecordId, ArtistName, Age)
SELECT rec.RecordId, art.ArtistName, art.Age
FROM artists art
INNER JOIN #Record rec
ON rec.[Name] = art.RecordName;

-- OR --
-- INSERT INTO #Artist (RecordId, ArtistName, Age)
SELECT rec.RecordId,
col.value(N'(./ArtistName/text())[1]', N'NVARCHAR(400)') AS [ArtistName],
col.value(N'(./Age/text())[1]', N'INT') AS [Age]
FROM @ImportData.nodes(N'/Records/Record/Artists/Artist') tab(col)
INNER JOIN #Record rec
ON rec.Name = col.value(N'(../../Name/text())[1]', N'NVARCHAR(400)');

-------------------

SELECT * FROM #Record ORDER BY [RecordID];
SELECT * FROM #Artist ORDER BY [RecordID];

插入#Artist有两种选择如上所示。第一个使用 CTE 将 XML 提取从 INSERT/SELECT 查询中抽象出来。另一个是简化版本,类似于您在问题的 UPDATE 2 中的查询。

关于sql-server - 批量插入带有外键的嵌套 xml 作为第一个表的标识列,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/26124743/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com