gpt4 book ai didi

java - 如何找到 TRIE 中最长的字符串

转载 作者:塔克拉玛干 更新时间:2023-11-02 19:59:19 24 4
gpt4 key购买 nike

我已经通过 2 个类实现了 DS Trie:Trie 和 TrieNode。我需要编写一个函数来返回 O(h) 中 Trie 中最长的字符串。我的 TrieNode 有一个字段 LinkedList 存储每个节点的 child 。我们仍然没有了解 BFS 或 DFS,所以我正在尝试思考一些创造性的方法来解决它。

我已经有一个函数(一个单独的函数)可以通过给定的字符插入/创建一个新节点:在构建 Trie 时:创建一个带有字段“maxDepth=0”的节点,指示我当前的深度。对于我创建的每个新节点,我将一直迭代到他的父节点(每个节点已经有一个指向他父节点的指针)等等,直到我到达根节点,并将他父节点的深度增加 1。现在,我将通过这种方式创建返回最长字符串的函数: 对于每个节点:遍历我的子节点,查找最大整数“maxDepth”,然后向下查找。这样做直到达到“maxDepth==0”。例如,我的算法适用于这个字符串:“aacgace”

       root      
/ \
(2)a g(0)
/
(1)c
/
(0)e

=> 'ace' 实际上是最长的。但不适用于此字符串:“aacgae”

      root      
/ \
(2)a g(0)
/ \
(0)c (0)e

=> 看起来节点 'a' 有一个 child ,而他的 child 也有一个 child ,但事实并非如此。

一般来说,我试图利用创建 Trie 的第一个函数(运行时间:O(h*c)),因此第二个函数(返回最长字符串)的运行时间将更少我可以。 O(h)

最佳答案

不确定你真正想做什么,但你可以找到一个 trie 的例子 here .

基本上我通过构建器来创建 trie;让我们快速了解一下单词是如何添加到 trie 中的:

// In TrieBuilder
final TrieNodeBuilder nodeBuilder = new TrieNodeBuilder();

// ...

/**
* Add one word to the trie
*
* @param word the word to add
* @return this
* @throws IllegalArgumentException word is empty
*/
public TrieBuilder addWord(@Nonnull final String word)
{
Objects.requireNonNull(word);

final int length = word.length();

if (length == 0)
throw new IllegalArgumentException("a trie cannot have empty "
+ "strings (use EMPTY instead)");
nrWords++;
maxLength = Math.max(maxLength, length);
nodeBuilder.addWord(word);
return this;
}

这会延迟将单词添加到 TrieNodeBuilder,它会执行以下操作:

private boolean fullWord = false;

private final Map<Character, TrieNodeBuilder> subnodes
= new TreeMap<>();

TrieNodeBuilder addWord(final String word)
{
doAddWord(CharBuffer.wrap(word));
return this;
}

/**
* Add a word
*
* <p>Here also, a {@link CharBuffer} is used, which changes position as we
* progress into building the tree, character by character, node by node.
* </p>
*
* <p>If the buffer is "empty" when entering this method, it means a match
* must be recorded (see {@link #fullWord}).</p>
*
* @param buffer the buffer (never null)
*/
private void doAddWord(final CharBuffer buffer)
{
if (!buffer.hasRemaining()) {
fullWord = true;
return;
}

final char c = buffer.get();
TrieNodeBuilder builder = subnodes.get(c);
if (builder == null) {
builder = new TrieNodeBuilder();
subnodes.put(c, builder);
}
builder.doAddWord(buffer);
}

假设我们将“trouble”和“troubling”都添加到 trie 中;这是怎么回事:

  • 第一次为“麻烦”的每个角色创建节点;
  • 第二次,直到“l”的所有节点都存在;然后为“ing”创建所有节点。

现在,如果我们添加“troubles”,将为“e”之后的“s”创建另一个节点。

fullWord 变量告诉我们这里是否有一个潜在的完全匹配;这是搜索功能:

public final class Trie
{
private final int nrWords;
private final int maxLength;
private final TrieNode node;

// ...

/**
* Search for a string into this trie
*
* @param needle the string to search
* @return the length of the match (ie, the string) or -1 if not found
*/
public int search(final String needle)
{
return node.search(needle);
}
// ...
}

TrieNode 中我们有:

public final class TrieNode
{
private final boolean fullWord;

private final char[] nextChars;
private final TrieNode[] nextNodes;

// ...

public int search(final String needle)
{
return doSearch(CharBuffer.wrap(needle), fullWord ? 0 : -1, 0);
}

/**
* Core search method
*
* <p>This method uses a {@link CharBuffer} to perform searches, and changes
* this buffer's position as the match progresses. The two other arguments
* are the depth of the current search (ie the number of nodes visited
* since root) and the index of the last node where a match was found (ie
* the last node where {@link #fullWord} was true.</p>
*
* @param buffer the charbuffer
* @param matchedLength the last matched length (-1 if no match yet)
* @param currentLength the current length walked by the trie
* @return the length of the match found, -1 otherwise
*/
private int doSearch(final CharBuffer buffer, final int matchedLength,
final int currentLength)
{
/*
* Try and see if there is a possible match here; there is if "fullword"
* is true, in this case the next "matchedLength" argument to a possible
* child call will be the current length.
*/
final int nextLength = fullWord ? currentLength : matchedLength;


/*
* If there is nothing left in the buffer, we have a match.
*/
if (!buffer.hasRemaining())
return nextLength;

/*
* OK, there is at least one character remaining, so pick it up and see
* whether it is in the list of our children...
*/
final int index = Arrays.binarySearch(nextChars, buffer.get());

/*
* If not, we return the last good match; if yes, we call this same
* method on the matching child node with the (possibly new) matched
* length as an argument and a depth increased by 1.
*/
return index < 0
? nextLength
: nextNodes[index].doSearch(buffer, nextLength, currentLength + 1);
}
}

注意 -1 是如何在 doSearch() 的第一次调用中作为“nextLength”参数传递的。

假设我们有一个包含上面三个词的 trie,下面是搜索“tr”的调用序列,但失败了:

  • doSearch("tr", -1, 0)(节点为根);
  • doSearch("tr", -1, 1)(节点为't');
  • doSearch("tr", -1, 2)(节点为'r');
  • 没有下一个字符:返回 nextLength; nextLength 为-1,不匹配。

现在,如果我们有“麻烦”:

  • doSearch("troubles", -1, 0)(节点为根节点);
  • doSearch("troubles", -1, 1)(节点为't');
  • doSearch("troubles", -1, 2)(节点为'r');
  • doSearch("troubles", -1, 3)(节点为'o');
  • doSearch("troubles", -1, 4)(节点是'u');
  • doSearch("troubles", -1, 5)(节点为'b');
  • doSearch("troubles", -1, 6)(节点为'l');
  • doSearch("troubles", -1, 7)(节点是'e');
  • doSearch("troubles", 7, 8)(全词为真!节点为“s”);
  • 没有下一个字符:返回nextLength,即8;我们有一场比赛。

关于java - 如何找到 TRIE 中最长的字符串,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/30544454/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com