sql - Why does this code fail in PostgreSQL and how to fix it (workaround)? Is it a Postgres SQL engine flaw?


Reposted by: 搜寻专家. Updated: 2023-10-30 22:22:12

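I was working on a text-parsing task when I ran into strange Postgres behavior. My original code that exposed the error was written in Java with JDBC connectivity to PostgreSQL (v8.3.3 and v8.4.2 tested); here is my original post: Is it an error of PostgreSQL SQL engine and how to avoid (workaround) it? I have just ported the Java code given there to pure plpgsql, and it produces the same errors (the same behavior as described in the original post).

The simplified code no longer has anything to do with parsing. It just generates pseudo-random (but repeatable) words and inserts them after normalization: table spb_word holds the unique words with their ids, the final table spb_obj_word references them by id, and table spb_word4obj serves as the input buffer.

Here are my tables (c&p from the OP):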

create sequence spb_word_seq;

create table spb_word (
  id bigint not null primary key default nextval('spb_word_seq'),
  word varchar(410) not null unique
);

create sequence spb_obj_word_seq;

create table spb_obj_word (
  id int not null primary key default nextval('spb_obj_word_seq'),
  doc_id int not null,
  idx int not null,
  word_id bigint not null references spb_word (id),
  constraint spb_ak_obj_word unique (doc_id, word_id, idx)
);

create sequence spb_word4obj_seq;

create table spb_word4obj (
  id int not null primary key default nextval('spb_word4obj_seq'),
  doc_id int not null,
  idx int not null,
  word varchar(410) not null,
  word_id bigint null references spb_word (id),
  constraint spb_ak_word4obj unique (doc_id, word_id, idx),
  constraint spb_ak_word4obj2 unique (doc_id, word, idx)
);

And the code ported from the original Java code to plpgsql:

create sequence spb_wordnum_seq;

create or replace function spb_getWord() returns text as $$
declare
  rn int;
  letters varchar(255) :=   'ąćęłńóśźżjklmnopqrstuvwxyz';
                          --'abcdefghijklmnopqrstuvwxyz';
  llen int := length(letters);
  res text := '';
  wordnum int;
begin
  select nextval('spb_wordnum_seq') into wordnum;

  rn := 3 * (wordnum + llen * llen * llen);
  rn := (rn + llen) / (rn % llen + 1);
  rn := rn % (rn / 2 + 10);

  loop
    res := res || substring(letters, rn % llen, 1);
    rn := floor(rn / llen);
    exit when rn = 0;
  end loop;

  --raise notice 'word for wordnum=% is %', wordnum, res;

  return res;
end;
$$ language plpgsql;



create or replace function spb_runme() returns void as $$
begin
  perform setval('spb_wordnum_seq', 1, false);
  truncate table spb_word4obj, spb_word, spb_obj_word;

  for j in 0 .. 50000-1 loop

    if j % 100 = 0 then raise notice 'j = %', j; end if;

    delete from spb_word4obj where doc_id = j;

    for i in 0 .. 20 - 1 loop
      insert into spb_word4obj (word, idx, doc_id) values (spb_getWord(), i, j);         
    end loop;

    update spb_word4obj set word_id = w.id from spb_word w 
    where w.word = spb_word4obj.word and doc_id = j;

    insert into spb_word (word) 
    select distinct word from spb_word4obj 
    where word_id is null and doc_id = j;

    update spb_word4obj set word_id = w.id 
    from spb_word w 
    where w.word = spb_word4obj.word and 
    word_id is null and doc_id = j;

    insert into spb_obj_word (word_id, idx, doc_id) 
    select word_id, idx, doc_id from spb_word4obj where doc_id = j;
  end loop;
end;
$$ language plpgsql;

To run it, just execute select spb_runme() as an SQL statement.

Here is the first example of the error:

NOTICE:  j = 8200
ERROR:  duplicate key value violates unique constraint "spb_word_word_key"
CONTEXT:  SQL statement "insert into spb_word (word) select distinct word from spb_word4obj where word_id is null and doc_id =  $1 "
PL/pgSQL function "spb_runme" line 18 at SQL statement

And the second:

NOTICE:  j = 500
ERROR:  null value in column "word_id" violates not-null constraint
CONTEXT:  SQL statement "insert into spb_obj_word (word_id, idx, doc_id) select word_id, idx, doc_id from spb_word4obj where doc_id =  $1 "
PL/pgSQL function "spb_runme" line 27 at SQL statement

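These errors occur unpredictably: each time at a different iteration (j), and with different words triggering the error.

When the Polish national characters (ąćęłńóśźż) are removed from the generated words (the line letters varchar(255) := 'ąćęłńóśźżjklmnopqrstuvwxyz'; becomes letters varchar(255) := 'abcdefghijklmnopqrstuvwxyz';), there are no errors! My database was created with UTF-8 encoding, so non-ASCII characters should not be a problem, yet they clearly matter here!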
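The claim that the database uses UTF-8 can be checked directly. A minimal sketch, assuming a psql session connected to the affected database on one of the versions mentioned above (8.3 or 8.4). Looking at lc_collate and lc_ctype is an assumption about where else to look, not something the question reports:

```sql
-- Encoding of the current database (the question expects UTF8).
show server_encoding;

-- Locale settings used for string comparison and character classification.
-- Assumption: these read-only parameters exist on the server version in use
-- (they do on 8.3 and 8.4).
show lc_collate;
show lc_ctype;

-- Sanity check on the Polish letters: length counts characters,
-- octet_length counts bytes.
select length('ąćęłńóśźż')       as chars,
       octet_length('ąćęłńóśźż') as bytes;
```

With a UTF8 server encoding and a matching client_encoding, each of these nine letters occupies two bytes, so the last query should report 9 characters and 18 bytes.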

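Now my question: what is wrong with my code? Or is something seriously wrong with PostgreSQL? How can I work around this error?

BTW: if this is a bug in the PostgreSQL engine, how can this database be trusted? Should I move to one of the free alternatives (e.g. MySQL)?

UPDATE: additional explanation (mainly for OMG Ponies)

If I remove the unnecessary delete, I still get the same errors.

The function spb_getWord() must generate words with duplicates: it simulates parsing a text and splitting it into words, and some words repeat. That is normal, and the rest of my code handles the duplicates. Because spb_getWord() may produce duplicates, I insert the words into the buffer table spb_word4obj and then update word_id in that table for words already present in spb_word. So if a row in spb_word4obj has word_id not null, it is a duplicate, and I do not insert that word into spb_word. But, as OMG Ponies mentioned, I get the error duplicate key value violates unique constraint, which means that my code, which handles duplicates correctly, fails. I.e. my code fails because of an internal Postgres error: Postgres somehow executes correct code incorrectly and fails.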

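After inserting the new words into spb_word (duplicates have already been identified and marked so that they are not inserted), my code finally inserts the normalized words into spb_obj_word, replacing the word body with a reference to the non-duplicated entry in spb_word. But this again sometimes fails because of the internal Postgres error. Again, I believe my code is correct, but it fails because of a problem in the Postgres SQL engine itself.
Adding or removing the Polish national letters in the words generated by spb_getWord only assures me that this is a strange Postgres error: all the unique/duplicate considerations stay the same, yet allowing or disallowing certain letters in the words causes the errors or eliminates them. So this is not a case of an error in my code, i.e. improper handling of duplicates.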
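The flow described above can be traced by hand on a tiny input. A minimal sketch, assuming the tables from the question exist and are empty; the doc_id value 0 and the three sample words are made up for illustration:

```sql
-- Stage three words for one document; 'kot' appears twice.
insert into spb_word4obj (word, idx, doc_id) values ('kot', 0, 0);
insert into spb_word4obj (word, idx, doc_id) values ('pies', 1, 0);
insert into spb_word4obj (word, idx, doc_id) values ('kot', 2, 0);

-- Step 1: mark staged words that spb_word already knows (none yet).
update spb_word4obj set word_id = w.id
from spb_word w
where w.word = spb_word4obj.word and doc_id = 0;

-- Step 2: add each still-unknown word exactly once.
insert into spb_word (word)
select distinct word from spb_word4obj
where word_id is null and doc_id = 0;

-- Step 3: resolve the ids of the words just added.
update spb_word4obj set word_id = w.id
from spb_word w
where w.word = spb_word4obj.word and word_id is null and doc_id = 0;

-- Step 4: copy the normalized rows to the final table.
insert into spb_obj_word (word_id, idx, doc_id)
select word_id, idx, doc_id from spb_word4obj where doc_id = 0;

-- Count the distinct words and the word occurrences.
select (select count(*) from spb_word)     as words,
       (select count(*) from spb_obj_word) as occurrences;
```

When every step behaves as intended, the final query reports 2 words and 3 occurrences. The failures in the question correspond to step 1 or step 3 not matching a word that is in fact present in spb_word.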

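The second thing that assures me there is no error in my code is the unpredictable moment at which the error is detected. Every run of my code processes the same sequence of words, so it should always break at the same place, on the same value. But it does not: it fails at quite random moments.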

Best answer

NOTICE: j = 8200
ERROR: duplicate key value violates unique constraint "spb_word_word_key"
CONTEXT: SQL statement "insert into spb_word (word) select distinct word from spb_word4obj where word_id is null and doc_id = $1 "
PL/pgSQL function "spb_runme" line 18 at SQL statement

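...tells you that your spb_getWord() is generating values that already exist in the SPB_WORD table. You need to update the function to check whether the word already exists before exiting the function; if it does, regenerate until you hit a word that does not exist yet.

I think your spb_runme() needs to resemble: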

create or replace function spb_runme() returns void as $$
DECLARE
  v_word VARCHAR(410);

begin
  perform setval('spb_wordnum_seq', 1, false);
  truncate table spb_word4obj, spb_word, spb_obj_word;

  for j in 0 .. 50000-1 loop

    if j % 100 = 0 then raise notice 'j = %', j; end if;

    for i in 0 .. 20 - 1 loop
      -- call the generator once per word and keep the result
      v_word := spb_getWord();

      -- parent row first; this relies on spb_getWord() having been
      -- changed to return only words not yet in spb_word (see above)
      INSERT INTO spb_word (word) VALUES (v_word);

      -- then the child row, carrying the parent's id
      INSERT INTO spb_word4obj 
        (word, idx, doc_id, word_id)
        SELECT w.word, i, j, w.id
          FROM SPB_WORD w 
         WHERE w.word = v_word;

    end loop;

    INSERT INTO spb_obj_word (word_id, idx, doc_id) 
    SELECT w4o.word_id, w4o.idx, w4o.doc_id 
      FROM SPB_WORD4OBJ w4o 
     WHERE w4o.doc_id = j;

  end loop;
end;
$$ language plpgsql;

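Using this would allow you to change word_id so that it does not accept NULL. When dealing with foreign keys, populate the table that the foreign key references first: start with the parent, then deal with its children.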
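The constraint change mentioned above could look like the following. A sketch only, assuming spb_word4obj contains no rows with a NULL word_id at the time it runs (otherwise the statement is rejected):

```sql
-- Tighten the buffer table once every staged row is inserted with its word_id.
alter table spb_word4obj alter column word_id set not null;

-- To go back to the nullable column the question started with:
-- alter table spb_word4obj alter column word_id drop not null;
```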

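The other change I made was to store the result of spb_getWord() in a variable (v_word), because calling the function multiple times means you would get a different value each time.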
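The reason is visible in spb_getWord() itself: it draws its input from nextval('spb_wordnum_seq'), and every nextval call advances the sequence. A small illustration using a throwaway sequence (demo_seq is a hypothetical name, not part of the question's schema):

```sql
-- Temporary sequence, dropped automatically at the end of the session.
create temporary sequence demo_seq;

-- Each nextval call consumes a new number, so the two columns differ.
select nextval('demo_seq') as first_call,
       nextval('demo_seq') as second_call;
```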

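Last thing: I removed the delete statement. You have already truncated the table, so there is nothing in it to delete, and certainly nothing associated with the value of j.

About "sql - Why does this code fail in PostgreSQL and how to fix it (workaround)? Is it a Postgres SQL engine flaw?": a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/2089772/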
