gpt4 book ai didi

java - 如何使用可变长度字符串标记列

转载 作者:行者123 更新时间:2023-12-01 13:52:14 26 4
gpt4 key购买 nike

我有一个关于在列中标记字符串的问题我有像这样的表

id  list1   i love dogs2   i like cats and dogs  

and so on

it should be converted to

id  list1   i1   love1   dogs2   i2   like2   cates2   and2   dogs

How do I tokenize this? I tried using this code inside cursor and procedure

SELECT id, regexp_substr(str, '[^ ]+', 1, level) TOKEN
from test
CONNECT by level <= length(regexp_replace (str, '[^ ]+')) + 1;

但是从 java 调用时速度非常慢。还有其他选择吗?

谢谢安尼尔

最佳答案

正如我在评论中提到的,您可以尝试使用 substrinstr 函数而不是 regexp_substr,也许性能会提高您可以接受:

CREATE TABLE test_list_tab (
id NUMBER,
str VARCHAR2(100)
);

CREATE TABLE test_tokens_tab (
id NUMBER,
token VARCHAR2(100)
);

INSERT INTO test_list_tab VALUES (1, 'i love dogs');
INSERT INTO test_list_tab VALUES (2, 'i like cats and dogs');
INSERT INTO test_list_tab VALUES (3, 'i');
INSERT INTO test_list_tab VALUES (4, 'abc');
INSERT INTO test_list_tab VALUES (5, 'abc i');
INSERT INTO test_list_tab VALUES (6, NULL);

DECLARE
v_token test_list_tab.str%TYPE;
v_space_position NUMBER := 1;
v_prev_space_position NUMBER := 1;
BEGIN
FOR v_rec IN (SELECT id, str FROM test_list_tab WHERE str IS NOT NULL)
LOOP
v_prev_space_position := 0;

LOOP
v_space_position := instr(v_rec.str, ' ', v_prev_space_position + 1);
IF v_space_position > 0 THEN
v_token := substr(v_rec.str, v_prev_space_position + 1, v_space_position - v_prev_space_position - 1);
ELSE
v_token := substr(v_rec.str, v_prev_space_position + 1);
END IF;

INSERT INTO test_tokens_tab VALUES (v_rec.id, v_token);

v_prev_space_position := v_space_position;
EXIT WHEN v_space_position = 0;
END LOOP;
END LOOP;

COMMIT;
END;

SELECT id, token FROM test_tokens_tab;

输出:

        ID TOKEN ---------- -------         1 i                1 love             1 dogs             2 i                2 like             2 cats             2 and              2 dogs             3 i                4 abc              5 abc              5 i 

关于java - 如何使用可变长度字符串标记列,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/19881543/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com