
Java-based Aho-Corasick string matching library for a PHP application

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 04:33:20

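I have a piece of PHP code that successfully searches the $post data for the $list keywords and echoes the results wherever the similarity is roughly 80-90%. Here is the code: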

$list = array(
"Data" => "9",
"Data Structure" => "10",
"Database" => "11",
"Creativity" => "12",
"Forest" => "13",
"Al Pacino" => "14",
"Humans" => "15",
"Technology" => "16"
);

$post = array ('Database', 'Law', 'Tech', 'Creative');

$all_key_values = $all_keys = array();

foreach ($post as $keyword) {
    foreach ($list as $word => $num) {
        // similar_text() returns the number of matching characters
        $sim_chars = similar_text($keyword, $word);
        if ($sim_chars / strlen($keyword) > .8 || $sim_chars / strlen($word) > .8) {
            // more than 80% similar relative to either string
            $all_key_values[] = $num;
            $all_keys[] = $word;
        } elseif (stripos($keyword, $word) !== false || strpos($word, $keyword) !== false) {
            // fallback: one string contains the other
            $all_key_values[] = $num;
            $all_keys[] = $word;
        }
    }
}

print_r(implode(',', $all_key_values));
print_r(implode(',', $all_keys));

Now, the problem is that I want to use an Aho-Corasick library written in Java to search for the $list keywords in $fulltext. You can find the code here.

require_once("http://localhost:8080/JavaBridge/java/Java.inc");

$list = array(
"Data" => "9",
"Data Structure" => "10",
"Database" => "11",
"Creativity" => "12",
"Forest" => "13",
"Al Pacino" => "14",
"Humans" => "15",
"Technology" => "16"
);

$fulltext = "A forest, also referred to as a wood or the woods, is an area with a high density of trees. As with cities, depending on various cultural definitions, what is considered a forest may vary significantly in size and have different classifications according to how and of what the forest is composed.[1] A forest is usually an area filled with trees but any tall densely packed area of vegetation may be considered a forest, even underwater vegetation such as kelp forests, or non-vegetation such as fungi,[2] and bacteria. Tree forests cover approximately 9.4 percent of the Earth's surface (or 30 percent of total land area), though they once covered much more (about 50 percent of total land area). They function as habitats for organisms, hydrologic flow modulators, and soil conservers, constituting one of the most important aspects of the biosphere. A typical tree forest is composed of the overstory (canopy or upper tree layer) and the understory. The understory is further subdivided into the shrub layer, herb layer, and also the moss layer and soil microbes. In some complex forests, there is also a well-defined lower tree layer. Forests are central to all human life because they provide a diverse range of resources: they store carbon, aid in regulating the planetary climate, purify water and mitigate natural hazards such as floods. Forests also contain roughly 90 percent of the worlds terrestrial biodiversity.";

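So, my question is how to call the Aho-Corasick library to search for $list in $fulltext and find the keywords that match with 100% similarity. Thank you very much for your help and time.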

Best answer

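You cannot include a Java library in PHP code. However, you can write a server application in Java that accepts data from your PHP code. Any number of approaches come to mind, from socket communication and web services to a simple command-line tool. As an alternative, you can of course always re-implement the Java library in PHP, which would probably teach you a lot about PHP, Java, and the algorithm.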
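As an illustration of the command-line option, here is a minimal sketch of a self-contained Java tool. It is not the library linked in the question: the class name `AhoCorasickSketch`, its `search` method, and the convention of passing keywords as arguments and the text on standard input are all assumptions made for this example. Matching is case-insensitive so that "Forest" in $list is found as "forest" in $fulltext.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical class for illustration; not the API of any existing library.
class AhoCorasickSketch {
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        final List<String> out = new ArrayList<>(); // keywords ending at this node
        Node fail;                                  // longest proper suffix that is also a trie prefix
    }

    private final Node root = new Node();

    public AhoCorasickSketch(List<String> keywords) {
        // Step 1: insert every keyword (lower-cased) into the trie.
        for (String kw : keywords) {
            Node n = root;
            for (char c : kw.toLowerCase(Locale.ROOT).toCharArray()) {
                n = n.next.computeIfAbsent(c, k -> new Node());
            }
            n.out.add(kw);
        }
        // Step 2: breadth-first traversal to compute failure links.
        Queue<Node> queue = new ArrayDeque<>();
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node cur = queue.remove();
            for (Map.Entry<Character, Node> e : cur.next.entrySet()) {
                Node child = e.getValue();
                Node f = cur.fail;
                while (f != root && !f.next.containsKey(e.getKey())) {
                    f = f.fail;
                }
                Node target = f.next.get(e.getKey());
                child.fail = (target != null) ? target : root;
                // Inherit keywords that end at the failure target.
                child.out.addAll(child.fail.out);
                queue.add(child);
            }
        }
    }

    /** Returns the distinct keywords found in text, in order of first match. */
    public Set<String> search(String text) {
        Set<String> found = new LinkedHashSet<>();
        Node n = root;
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            while (n != root && !n.next.containsKey(c)) {
                n = n.fail;
            }
            n = n.next.getOrDefault(c, root);
            found.addAll(n.out);
        }
        return found;
    }

    // Keywords come from the arguments, the text from standard input (Java 9+).
    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: AhoCorasickSketch KEYWORD... < text");
            return;
        }
        String text = new String(System.in.readAllBytes(), StandardCharsets.UTF_8);
        for (String kw : new AhoCorasickSketch(Arrays.asList(args)).search(text)) {
            System.out.println(kw);
        }
    }
}
```

On the PHP side, one possible arrangement is to start this program with `proc_open`, pass the keys of $list as arguments, write $fulltext to its standard input, and map each printed keyword back to its number through $list. Note that this sketch matches substrings, so "Forest" also matches inside "Forests"; whole-word matching would need an extra boundary check.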

Regarding the Java-based Aho-Corasick string matching library for a PHP application, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24460572/
