gpt4 book ai didi

java - 在 Java 中使用 Trie 确定电话号码前缀

转载 作者:行者123 更新时间:2023-11-30 11:02:36 26 4
gpt4 key购买 nike

我正在尝试创建一个快速搜索功能来确定电话号码的前缀。

我正在将前缀数据从数据库加载到内存中作为 TreeMap,其中键是前缀,值是包含有关此前缀(国家等)信息的对象。

这是 TreeMap 的填充方式:

private static TreeMap<String, PrefixData> prefixMap = new TreeMap<>();
//EXAMPLE of DB query
try {
Class.forName("org.postgresql.Driver");
Connection connection = DriverManager.getConnection(dbURL, dbUser, dbPass);
connection.setAutoCommit(false);
Statement stmt = connection.createStatement();

ResultSet rs = stmt.executeQuery("SELECT * FROM countries_prefixes");

//Looping resultset
while (rs.next()) {
//TODO Review fields that must be stored in memory
String country = rs.getString("name");

//Populating map with data object (keeping nr prefix as key and object as value)
prefixMap.put(rs.getString("country_prefix"), new PrefixData(country));
}

rs.close();
stmt.close();
connection.close();

} catch (ClassNotFoundException | SQLException e) {
System.out.println(e.toString());
}

假设我有要查看的电话号码:37251845632;35844021546;34651478966等等……

有些前缀是 1 位长,有些是 2 位长,有些是 3 位长等等......

所以我创建了一个有效的循环:

//TODO Try some other search methods (Tries)
//array for prefix length priority
int[] sequence = {3, 4, 2, 1, 5, 6};

//Performing search from the map
for (int i = 0; i < sequence.length; i++) {

//Extracting prefix from phone nr
String prefix = phoneNr.substring(1, sequence[i] + 1);

if (prefixMap.containsKey(prefix)) {
PrefixData pdata = prefixMap.get(prefix);
System.out.println(String.format("Found matching key [%s] in TreeMap", prefix));
System.out.println(String.format("[NAME: %s] [REGEX: %s] ", pdata.getCountryName(), pdata.getNrRegex()));

//Validate number format with regex
if (pdata.getNrRegex().trim() != null && !pdata.getNrRegex().trim().isEmpty()) {
System.out.println("Regex for number validation is present!");
if (phoneNr.matches(pdata.getNrRegex().replaceAll("^/|/$", ""))) {
System.out.println("NUMBER IS VALID!");
} else {
System.out.println("INVALID NUMBER!");
}
}

return pdata;

}
}
return null;
}

现在循环运行良好,但速度很慢。我听说过一些关于 Tries 的消息,它更快,但我不明白如何在我的场景中实现它。

感谢任何帮助!

最佳答案

正如我所说,循环有效,但这不是实现我的目标的好方法。

所以我做了一些研究并提出了使用前缀树 (Trie) 实现的解决方案。

稍微看一下Trie是什么可以发现here .

现在是 Trie 实现部分。我知道找到一个已经编写和测试过的代码会更快,所以我找到了谷歌实现 here .以及 Vladimir Kroz 的实现 here .

做了一些小修改,这就是解决方案。我将提供两种解决方案:

前缀映射接口(interface)

package tiesImpl;

/**
* Maps string prefixes to values. For example, if you {@code put("foo", 1)},
* {@code get("foobar")} returns {@code 1}. Prohibits null values.
*
* <p>Use instead of iterating over a series of string prefixes calling
* {@code String.startsWith(prefix)}.
*
* @author crazybob@google.com (Bob Lee)
* @param <T>
*/

public interface PrefixMap<T> {
/**
* Maps prefix to value.
*
* @param prefix
* @param value
* @return The previous value stored for this prefix, or null if none.
* @throws IllegalArgumentException if prefix is an empty string.
*/
T put(CharSequence prefix, T value);

/**
* Finds a prefix that matches {@code s} and returns the mapped value.
*
* If multiple prefixes in the map match {@code s}, the longest match wins.
*
* @param s
* @return value for prefix matching {@code s} or {@code null} if none match.
*/
T get(CharSequence s);
}

PrefixTrie 类

package tiesImpl;

/*
* Copyright (C) 2007 Google Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import java.util.LinkedHashMap;
import java.util.Map;

/**
* Trie implementation supporting CharSequences as prefixes.
*
* Prefixes are sequences of characters, and the set of allowed characters is
* specified as a range of sequential characters. By default, any seven-bit
* character may appear in a prefix, and so the trie is a 128-ary tree.
*
* @author crazybob@google.com (Bob Lee)
* @author mharris@google.com (Matthew Harris)
* @param <T>
*/
public class PrefixTrie<T> implements PrefixMap<T> {
// The set of allowed characters in prefixes is given by a range of
// consecutive characters. rangeOffset denotes the beginning of the range,
// and rangeSize gives the number of characters in the range, which is used as
// the number of children of each node.

private final char rangeOffset;
private final int rangeSize;

private final Node<T> root;

/**
* Constructs a trie for holding strings of seven-bit characters.
*/
public PrefixTrie() {
rangeOffset = '\0';
rangeSize = 128;
root = new Node<>(rangeSize);
}

/**
* Constructs a trie for holding strings of characters.
*
* The set of characters allowed in prefixes is given by the range
* [rangeOffset, lastCharInRange], inclusive.
*
* @param firstCharInRange
* @param lastCharInRange
*/
public PrefixTrie(char firstCharInRange, char lastCharInRange) {
this.rangeOffset = firstCharInRange;
this.rangeSize = lastCharInRange - firstCharInRange + 1;

if (rangeSize <= 0) {
throw new IllegalArgumentException("Char range must include some chars");
}

root = new Node<>(rangeSize);
}

/**
* {@inheritDoc}
*
* @param prefix
* @param value
* @throws IllegalArgumentException if prefix contains a character outside
* the range of legal prefix characters.
*/
@Override
public T put(CharSequence prefix, T value) {
if (value == null) {
throw new NullPointerException();
}

Node<T> current = root;
for (int i = 0; i < prefix.length(); i++) {
int nodeIndex = prefix.charAt(i) - rangeOffset;
try {
Node<T> next = current.next[nodeIndex];
if (next == null) {
next = current.next[nodeIndex] = new Node<>(rangeSize);
}
current = next;
} catch (ArrayIndexOutOfBoundsException e) {
throw new IllegalArgumentException(
"'" + prefix.charAt(i) + "' is not a legal prefix character.");
}
}
T oldValue = current.value;
current.value = value;
return oldValue;
}

/**
* {@inheritDoc}
* @param s
*/
@Override
public T get(CharSequence s) {
Node<T> deepestWithValue = root;
Node<T> current = root;
for (int i = 0; i < s.length(); i++) {
int nodeIndex = s.charAt(i) - rangeOffset;
if (nodeIndex < 0 || rangeSize <= nodeIndex) {
return null;
}
current = current.next[nodeIndex];
if (current == null) {
break;
}
if (current.value != null) {
deepestWithValue = current;
}
}
return deepestWithValue.value;
}

/**
* Returns a Map containing the same data as this structure.
*
* This implementation constructs and populates an entirely new map rather
* than providing a map view on the trie, so this is mostly useful for
* debugging.
*
* @return A Map mapping each prefix to its corresponding value.
*/
public Map<String, T> toMap() {
Map<String, T> map = newLinkedHashMap();
addEntries(root, new StringBuilder(), map);
return map;
}

/**
* Adds to the given map all entries at or below the given node.
*
* @param node
* @param builder A StringBuilder containing the prefix for the given node.
* @param map
*/
private void addEntries(Node<T> node,
StringBuilder builder,
Map<String, T> map) {
if (node.value != null) {
map.put(builder.toString(), node.value);
}

for (int i = 0; i < node.next.length; i++) {
Node<T> next = node.next[i];
if (next != null) {
builder.append((char) (i + rangeOffset));
addEntries(next, builder, map);
builder.deleteCharAt(builder.length() - 1);
}
}
}

private static class Node<T> {

T value;
final Node<T>[] next;

@SuppressWarnings("unchecked")
Node(int numChildren) {
next = new Node[numChildren];
}
}

/**
* Creates a {@code LinkedHashMap} instance.
*
* @param <K>
* @param <V>
* @return a newly-created, initially-empty {@code LinkedHashMap}
*/
public static <K, V> LinkedHashMap<K, V> newLinkedHashMap() {
return new LinkedHashMap<>();
}

}

Vladimir Kroz 实现:Trie 类

package tiesImpl;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/**
* Prefix table based on Trie structure. Allows to perform incremental lookup
* and match based on search key prefixes (classic example - determine phone
* area code for given phone number)
*
* @param <V> a type of value object to be stored along with prefix (e.g when
* key is a country name, the value could be a name of the country)
*
* @author Vladimir Kroz
* https://vkroz.wordpress.com/2012/03/23/prefix-table-trie-implementation-in-java/
*/

public class Trie<V> {

Entry<V> entry;
char key;
Map<Character, Trie<V>> children;

public Trie() {
this.children = new HashMap<>(10);
entry = new Entry<>();
}

/**
* non-public, used by _put()
*/
Trie(char key) {
this.children = new HashMap<>(10);
this.key = key;
entry = new Entry<>();
}

public void put(String key, V value) {
_put(new StringBuffer(key), new StringBuffer(""), value);
}

void _put(StringBuffer remainder, StringBuffer prefix, V value) {
if (remainder.length() > 0) {
char keyElement = remainder.charAt(0);
Trie<V> t = null;
try {
t = children.get(keyElement);
} catch (IndexOutOfBoundsException e) {
}
if (t == null) {
t = new Trie<>(keyElement);
children.put(keyElement, t);
}
prefix.append(remainder.charAt(0));
t._put(remainder.deleteCharAt(0), prefix, value);
} else {
this.entry.value = value;
this.entry.prefix = prefix.toString();
}

}

/**
* Retrieves element from prefix table matching as a prefix to provided
* key. E.g. if key is "37251656565" and prefix table has node "372" then
* this call will return the value of "372"
*
* @param key a string which starts with prefix to be searched in the table
* (e.g. phone number)
* @return an Object associated with matching prefix (i.e if key is a phone
* number it may return a corresponding country name)
*/
public V get(String key) {
return _get(new StringBuffer(key), 0);
}

/**
* Returns true if key has matching prefix in the table
*
* @param key
* @return
*/
public boolean hasPrefix(String key) {
return this.get(key) != null;
}

V _get(StringBuffer key, int level) {
if (key.length() > 0) {
Trie<V> t = children.get(key.charAt(0));
if (t != null) {
//FYI: modified code to return deepest with value instead of returning null if prefix doesn't have corresponding value.
V result = t._get(key.deleteCharAt(0), ++level);
return result == null ? entry.value : result;

} else {
return (level > 0) ? entry.value : null;
}

} else {
return entry.value;
}
}

@Override
//For debugging
public String toString() {

Iterator<Character> it = children.keySet().iterator();
StringBuffer childs = new StringBuffer();

while (it.hasNext()) {
Character _key = it.next();
childs.append(String.format("\n%s\n",
//Adding a tab to the beginning of every line to create a visual tree
String.format("%s: %s", _key, children.get(_key)).replaceAll("(?m)(^)", "\t")));
}

return String.format("Trie [entry=%s, children=%s]", entry, childs);
}

static public class Entry<V> {

String prefix;
V value;

public Entry() {
}

public Entry(String p, V v) {
prefix = p;
value = v;
}

public String prefix() {
return prefix;
}

public V value() {
return value;
}

@Override
public String toString() {
return "Entry [prefix=" + prefix + ", value=" + value + "]";
}

}
}

最后是测试部分

package tiesImpl;

/**
* Class for testing different implementations of prefix tree (Trie).
*
* @author lkallas
*/

public class TriesTest {

private static final PrefixTrie<String> googleTrie = new PrefixTrie<>();
private static final Trie<String> krozTrie = new Trie<>();

public static void main(String[] args) {

//Inserting prefixes to Google implementation of Trie
googleTrie.put("7", "Russia");
googleTrie.put("77", "Abkhazia");
googleTrie.put("746", "Some unknown country");

//Inserting prefixes to Vladimir Kroz implementation of Trie
krozTrie.put("7", "Russia");
krozTrie.put("77", "Abkhazia");
krozTrie.put("746", "Some unknown country");

System.out.println("Google version of get: " + googleTrie.get("745878787"));
System.out.println("Vladimir Kroz version of get: " + krozTrie.get("745878787"));

}

}

希望这个回答对其他人也有所帮助!干杯!

关于java - 在 Java 中使用 Trie 确定电话号码前缀,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/30707260/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com