
PHP - Searching for keywords in a string and improving the quality and accuracy of keyword extraction

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 05:50:06

I have the following PHP code:

$Keywords = array(
    ', JOE.' => '1',
    ', JOE' => '2',
    'JOE' => '3',
    'JOE.' => '4',
    '/JOE' => '5',
    '/JOE/' => '6',
    'JOE/.' => '7',
    ',JOE.' => '8'
);
$Text = "JOE is JOE is JOE is JOE is JOE is JOE is JOE. Hello , JOE. Hey ,JOE. Come on , JOE. Dude,JOE/. Shut up ,JOE. What is the meaning of /JOE/? Of course, JOE";

extract_keyword ($Keywords, $Text);

function extract_keyword ($Keywords, $Text){
    mb_internal_encoding('UTF-8');

    uksort($Keywords, function ($a, $b) {
        $as = mb_strlen($a);
        $bs = mb_strlen($b);

        if ($as > $bs) {
            return -1;
        }
        else if ($bs > $as) {
            return 1;
        }
        return 0;
    });

    $Keywords_ci = array();

    foreach ($Keywords as $k => $v) {
        $Keywords_ci[$k] = $v;
    }

    $re = '/\b(?:' . join('|', array_map(function($keyword) {
        return preg_quote($keyword, '/');
    }, array_keys($Keywords))) . ')\b/i';

    $KeywordArrayKey = array();
    $KeywordArrayValue = array();
    $NewArray = array();
    preg_match_all($re, $Text, $matches);
    foreach ($matches[0] as $keyword) {
        $KeywordArrayKey[] = $keyword;
        $KeywordArrayValue[] = $Keywords_ci[$keyword];
        if(!empty($keyword) && !empty($Keywords_ci[$keyword])) {
            $NewArray[] = array($keyword => $Keywords_ci[$keyword]);
        }
    }
    print_r($NewArray) ."<br><br>";
}

The code outputs the following:

Array ( 
[0] => Array ( [JOE] => 3 )
[1] => Array ( [JOE] => 3 )
[2] => Array ( [JOE] => 3 )
[3] => Array ( [JOE] => 3 )
[4] => Array ( [JOE] => 3 )
[5] => Array ( [JOE] => 3 )
[6] => Array ( [JOE] => 3 )
[7] => Array ( [JOE] => 3 )
[8] => Array ( [JOE] => 3 )
[9] => Array ( [JOE] => 3 )
[10] => Array ( [JOE] => 3 )
[11] => Array ( [JOE] => 3 )
[12] => Array ( [JOE] => 3 )
[13] => Array ( [, JOE] => 2 ) )

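As you can see, the problem is that the code is not accurate enough: it fails to extract occurrences of keywords that are present in $Keywords, such as ', JOE.' => '1' or 'JOE/.' => '7'. In fact, my goal is to also extract '/JOE' => '5', '/JOE/' => '6', 'JOE.' => '4', and so on. Could you please take a look at the code and let me know how to improve the quality and accuracy of the keyword extraction? Thanks for your help.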

Note 1: print_r($Keywords_ci); prints Array ( [, JOE.] => 1 [JOE/.] => 7 [,JOE.] => 8 [, JOE] => 2 [/JOE/] => 6 [JOE.] => 4 [/JOE] => 5 [JOE] => 3 ), but what I am looking for is to echo every instance of the available keywords found in $Text, for example '/JOE/' => '6' and ',JOE.' => '8'.

Note 2: Below is the expected output of print_r($NewArray):

Array ( 
[0] => Array ( [JOE] => 3 )
[1] => Array ( [JOE] => 3 )
[2] => Array ( [JOE] => 3 )
[3] => Array ( [JOE] => 3 )
[4] => Array ( [JOE] => 3 )
[5] => Array ( [JOE] => 3 )
[6] => Array ( [JOE.] => 4 )
[7] => Array ( [, JOE.] => 1 )
[8] => Array ( [,JOE.] => 8 )
[9] => Array ( [, JOE.] => 1 )
[10] => Array ( [JOE/.] => 7 )
[11] => Array ( [,JOE.] => 8 )
[12] => Array ( [/JOE/] => 6 )
[13] => Array ( [, JOE] => 2 ) )

Best answer

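Once the keywords are sorted from longest to shortest, you know that a string will be checked before any possible substring of it (/JOE/ before /JOE). You can therefore use str_replace to remove the actual matches, so that searching for /JOE does not match inside /JOE/ (given that /JOE/ has already been searched for). Use the count parameter of str_replace to get the number of matched items: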

<?php
$Keywords = array(
    ', JOE.' => '1',
    ', JOE' => '2',
    'JOE' => '3',
    'JOE.' => '4',
    '/JOE' => '5',
    '/JOE/' => '6',
    'JOE/.' => '7',
    ',JOE.' => '8'
);
$Text = "JOE is JOE. Hello , JOE. Hey ,JOE. Come on , JOE. Dude,JOE/. Shut up ,JOE. What is the meaning of /JOE/? Of course, JOE";

// Sort keywords longest first so a keyword is tried before any of its substrings.
uksort($Keywords, function ($a, $b) {
    $as = mb_strlen($a);
    $bs = mb_strlen($b);

    if ($as > $bs) {
        return -1;
    }
    else if ($bs > $as) {
        return 1;
    }
    return 0;
});

$result = array(); // initialized so print_r() also works when nothing matches
$copy = $Text;
foreach ($Keywords as $keyword => $value) {
    // Remove every occurrence so that shorter keywords cannot match inside it;
    // $count receives the number of replacements made.
    $copy = str_replace($keyword, '', $copy, $count);
    if ($count > 0) {
        $result[$keyword] = $value;
    }
}

print_r($result);
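
The loop above reports which keywords occur, but not each occurrence in text order as the expected output in the question lists them. The following is a minimal sketch of one possible alternative, not taken from the original answer. It keeps the longest-first ordering but builds a single regular expression, adding `\b` only on the side of a keyword that starts or ends with a word character, since `\b` cannot match between a space and a comma or slash. The function name `extract_keywords_in_order` is hypothetical, matching is assumed to be case-sensitive (the question's `/i` flag is dropped), and PHP 7 or later is assumed.

```php
<?php
// Hypothetical helper, not part of the original question or answer.
function extract_keywords_in_order(array $keywords, string $text): array
{
    // Longest first. Byte length is enough here: a proper substring
    // always has fewer bytes than the string that contains it.
    uksort($keywords, function ($a, $b) {
        return strlen((string) $b) <=> strlen((string) $a);
    });

    $alternatives = array_map(function ($keyword) {
        $keyword = (string) $keyword;
        // Add a word boundary only next to a word character, so that
        // 'JOE' does not match inside a longer word while ', JOE.' and
        // '/JOE/' can still match after a space.
        $prefix = preg_match('/^\w/u', $keyword) ? '\b' : '';
        $suffix = preg_match('/\w$/u', $keyword) ? '\b' : '';
        return $prefix . preg_quote($keyword, '/') . $suffix;
    }, array_keys($keywords));

    $re = '/' . implode('|', $alternatives) . '/u';

    $result = array();
    if (preg_match_all($re, $text, $matches)) {
        foreach ($matches[0] as $match) {
            // One entry per occurrence, in the order found in the text.
            $result[] = array($match => $keywords[$match]);
        }
    }
    return $result;
}

$Keywords = array(
    ', JOE.' => '1',
    ', JOE' => '2',
    'JOE' => '3',
    'JOE.' => '4',
    '/JOE' => '5',
    '/JOE/' => '6',
    'JOE/.' => '7',
    ',JOE.' => '8'
);
$Text = "JOE is JOE is JOE is JOE is JOE is JOE is JOE. Hello , JOE. Hey ,JOE. Come on , JOE. Dude,JOE/. Shut up ,JOE. What is the meaning of /JOE/? Of course, JOE";

$found = extract_keywords_in_order($Keywords, $Text);
print_r($found);
```

Because the regex engine tries the alternatives left to right at each position, placing longer keywords first lets '/JOE/' win over '/JOE' and 'JOE.' win over 'JOE' when both could match at the same spot.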

You can use the $count variable filled in by str_replace to actually count how many times each keyword occurs.
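
As a minimal sketch of that idea, the variant below stores the number of occurrences per keyword instead of the keyword's value. The helper name `count_keywords` is hypothetical, and `strlen` is used in place of `mb_strlen` on the assumption that byte length is sufficient for the longest-first ordering.

```php
<?php
// Hypothetical helper, not part of the original answer.
function count_keywords(array $keywords, string $text): array
{
    // Longest keywords first, so '/JOE/' is consumed before '/JOE' and 'JOE'.
    uksort($keywords, function ($a, $b) {
        return strlen((string) $b) <=> strlen((string) $a);
    });

    $counts = array();
    foreach (array_keys($keywords) as $keyword) {
        // str_replace() writes the number of replacements into $count.
        $text = str_replace((string) $keyword, '', $text, $count);
        if ($count > 0) {
            $counts[$keyword] = $count;
        }
    }
    return $counts;
}

$Keywords = array(
    ', JOE.' => '1',
    ', JOE' => '2',
    'JOE' => '3',
    'JOE.' => '4',
    '/JOE' => '5',
    '/JOE/' => '6',
    'JOE/.' => '7',
    ',JOE.' => '8'
);
$Text = "JOE is JOE. Hello , JOE. Hey ,JOE. Come on , JOE. Dude,JOE/. Shut up ,JOE. What is the meaning of /JOE/? Of course, JOE";

$counts = count_keywords($Keywords, $Text);
print_r($counts);
```

Note that deleting a match makes the text on either side of it adjacent. If that could accidentally form a new keyword in your data, replacing matches with a character that appears in no keyword, rather than with an empty string, would likely be safer.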

Regarding "PHP - Searching for keywords in a string and improving the quality and accuracy of keyword extraction", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24715628/
