gpt4 book ai didi

algorithm - aho corasick 算法的状态转移表

转载 作者:塔克拉玛干 更新时间:2023-11-03 03:27:46 28 4
gpt4 key购买 nike

请帮助我理解 Aho-Corasick 算法中多于一种模式的状态转换表的构造。

请简单详细的解释一下,让我明白。

我正在关注 this纸和here是那个的动画。

谢谢。

最佳答案

第一阶段

创建关键字树:

Starting at the root, follow the path labeled by chars of Pi        If the path ends before Pi, continue it by adding new edges and ...                            nodes for the remaining characters of P        Store identifier i of Pi at the terminal node of the path

I demonstrate it by an example :

Suppose we have a finite set of patterns P = {he, she, his, hers}.

enter image description here

Phase2

Next we extend the keyword tree into an automaton,to support linear-time matching as follows.

Obviously the automaton consists of two parts :

  • states
  • transitions

States : Exactly the nodes achieved by the keyword tree; and initial state = root of the tree.

enter image description here

Transitions : using goto(q,a) function as follows.

/*This function gives the state entered from current state q      by matching target char a*/- If edge (q,v) in the tree is labeled by a, then g(q,a)=v; - g(0,a)=0 for each a that does not label an edge out of the root//the automton stays at the initial state while scanning non-matching characters       - O.W. g(q,a)={}

enter image description here

Using the automaton :

SearchofTarget T[1..m] //considering the target string T as an 
array of chars indexed from 1 to m
q:= 0; // initial state(root)
for i:= 1 to m do
while g(q,T[i]) == {} do
q:= fail(q); // follow a fail
q:= goto(q,T[i]); // follow a goto
if output(q) != {} then print i, out(q);
endfor;

正如您在上面的伪代码中看到的,它调用了两个函数:fail(q)output(q)

fail(q) ://对于 q !=0 给出在不匹配时进入的状态

failure(q) is the node labeled by the longest proper suffix  w of L(q) ...
s.t. w is a prefix of some pattern
//L(q) is the label of node q as the concatenation of edge labels ...
on the path from the root to q

输出(q):

gives the set of patterns recognized when entering state q

在计算完这两个函数之后,自动机看起来像这样:

enter image description here

现在你可能想知道什么时候计算这些方法,所以请查看这些以获得更正式的形式:

enter image description here

希望这对您有所帮助,如果还有什么不明确的地方,请不要犹豫,尽管问。

关于algorithm - aho corasick 算法的状态转移表,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/22398190/

28 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com