
sql - Load a matrix from a file into PostgreSQL tables

Reposted. Author: 行者123. Updated: 2023-11-29 13:06:23

I have a universities.txt file that looks like this:

Alabama

Air University
Alabama A&M University
Alabama State University
Concordia College-Selma
Faulkner University
Huntingdon College
Jacksonville State University
Judson College
Miles College
Oakwood College
Samford University
Southeastern Bible College
Southern Christian University
Spring Hill College
Stillman College
Talladega College
University of North Alabama
University of South Alabama
University of West Alabama

Alaska

Alaska Bible College
Alaska Pacific University
Sheldon Jackson College
University of Alaska - Anchorage
University of Alaska - Fairbanks
University of Alaska - Southeast

Arizona

American Indian College of the Assemblies of God
Arizona State University
Arizona State University East
Arizona State University West
DeVry University-Phoenix
Embry-Riddle Aeronautical University
Grand Canyon University
Northcentral University
Northern Arizona University

.. and so on, where in this case Alabama, Alaska and Arizona are locations and everything else is a university. What I want to do is load the locations into a table called Location and the universities into a table called University, where the Location column of the University table is a foreign key referencing the Id of the Location table, like this:

CREATE TABLE Location (
    Id   SERIAL PRIMARY KEY,
    Name TEXT
);

CREATE TABLE University (
    Id       SERIAL PRIMARY KEY,
    Location INTEGER REFERENCES Location (Id) NOT NULL,
    Name     TEXT
);

So what I want to do in Postgres is something like this:

for (int i = 0; i < universities.size(); i++) {
    // each entry in the universities vector is a tuple: the first entry is the country/state
    // and the second entry is a vector of the universities as Strings
    Vector tuple = (Vector) universities.get(i);
    // insert into location table
    String state = (String) tuple.get(0);
    Vector u = (Vector) tuple.get(1);
    for (int j = 0; j < u.size(); j++) {
        // insert into university table with i as FK to location table
    }
}

Does anyone know how to do this?

Best answer

Here is a pure SQL solution.

Use COPY to import your file into a temporary table, then a single DML statement with data-modifying CTEs (requires PostgreSQL 9.1 or later) to do the rest. Both steps should be fast:

A temporary table with a single text column (dropped automatically at the end of the session):

CREATE TEMP TABLE tmp (txt text);

Import the data from the file:

COPY tmp FROM '/path/to/file.txt';

If you are doing this from a remote client, use the psql meta-command \copy instead.
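
As a sketch of that client-side variant (not part of the original answer): the path is the same placeholder as in the COPY command, and because \copy is a psql meta-command, the whole command has to be written on a single line.

-- Run inside psql; the file is read on the client machine, not on the server.
\copy tmp FROM '/path/to/file.txt'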

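My solution depends on the data format shown in the question, namely that there is an empty line before and after each city. I assume actual empty strings in the import file. Make sure there is a leading row with an empty string before the first city, to avoid a special case.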

The rows are inserted in sequence. I rely on that for the following window functions, which have no ORDER BY.

WITH x AS (
    SELECT txt
          ,row_number() OVER () AS rn
          ,lead(txt) OVER () = '' AND
           lag(txt)  OVER () = '' AS city
    FROM   tmp  -- don't remove empty rows just yet
    ), y AS (
    SELECT txt, city
          ,sum(city::int) OVER w AS id
    FROM   x
    WHERE  txt <> ''  -- remove empty rows now
    WINDOW w AS (ORDER BY rn)
    ), l AS (
    INSERT INTO location (id, name)
    SELECT id, txt
    FROM   y
    WHERE  city
    ), u AS (
    INSERT INTO university (location, name)
    SELECT id, txt
    FROM   y
    WHERE  NOT city
    )
SELECT setval('location_id_seq', max(id))
FROM   y;

Voilà.
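
One way to sanity-check the load is to count the universities stored per location. This query is an addition, not part of the original answer; it assumes only the location and university tables defined in the question and that the statement above has already been run.

-- Each location with the number of universities that reference it.
SELECT l.id, l.name, count(u.id) AS universities
FROM   location l
LEFT   JOIN university u ON u.location = l.id
GROUP  BY l.id, l.name
ORDER  BY l.id;

A location that shows 0 universities, or a university name that shows up as a location, would suggest that the empty lines in the file do not match the format described above.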

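  • CTE x marks cities based on the empty string values in the rows before and after each city.

  • CTE y adds a running sum of cities (id), which yields a perfectly valid id for each city, shared with the universities that belong to it.

  • CTEs l and u perform the inserts, which is easy now.

  • The final SELECT sets the next value of the sequence attached to location.id. We have not used it so far, so we have to set it to the current maximum, or we would run into duplicate key errors on future INSERTs into location.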
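
The first two steps can be tried in isolation on a few hand-inserted rows. This is a minimal sketch, not part of the original answer: the table tmp_demo and its serial column ord are hypothetical names introduced for illustration, and ord is used here to make the row order explicit instead of relying on insertion order.

-- Hypothetical demo table; "ord" records the order of the rows explicitly.
CREATE TEMP TABLE tmp_demo (ord serial, txt text);

-- Sample rows in file order, including the leading empty row.
INSERT INTO tmp_demo (txt) VALUES
    (''), ('Alabama'), (''),
    ('Air University'), ('Alabama A&M University'),
    (''), ('Alaska'), (''),
    ('Alaska Bible College'), ('Alaska Pacific University');

WITH x AS (
    SELECT ord, txt,
           -- a city has an empty string on both sides
           lag(txt)  OVER w = '' AND
           lead(txt) OVER w = '' AS city
    FROM   tmp_demo
    WINDOW w AS (ORDER BY ord)
)
SELECT txt, city,
       -- running count of cities seen so far, used as the location id
       sum(city::int) OVER (ORDER BY ord) AS id
FROM   x
WHERE  txt <> ''  -- empty rows are dropped only after the flags are computed
ORDER  BY ord;

Only the Alabama and Alaska rows are flagged as city. The running sum gives id 1 to Alabama and its two universities and id 2 to Alaska and its two universities, which is the pairing the two INSERTs rely on.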

Regarding "sql - Load a matrix from a file into PostgreSQL tables", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/10407545/
