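Trie lookup. A trie, also known as a dictionary tree or word search tree, is a tree structure used to store a large number of strings. Its advantage is that it exploits the common prefixes of the strings to save storage space.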
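To make the prefix-sharing claim concrete, here is a minimal sketch that is separate from the implementation in this post. The class name PrefixSharingDemo and its helper methods are made up for illustration, and children are kept in a HashMap purely to keep the example short. It counts how many nodes a trie needs for a few words and compares that with the number of characters stored when each word is kept on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: names and structure are assumptions, not the post's Trie class.
public class PrefixSharingDemo {

    /** Minimal trie node: one child per distinct next character. */
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord;
    }

    /** Inserts every word and returns the number of nodes created (root excluded). */
    static int countNodes(String... words) {
        Node root = new Node();
        int created = 0;
        for (String word : words) {
            Node current = root;
            for (int i = 0; i < word.length(); i++) {
                char c = word.charAt(i);
                Node next = current.children.get(c);
                if (next == null) {
                    // No shared prefix from here on: a new node is needed.
                    next = new Node();
                    current.children.put(c, next);
                    created++;
                }
                current = next;
            }
            current.isWord = true;
        }
        return created;
    }

    /** Total number of characters if every word were stored separately. */
    static int countChars(String... words) {
        int total = 0;
        for (String word : words) {
            total += word.length();
        }
        return total;
    }

    public static void main(String[] args) {
        String[] words = {"tea", "team", "ten"};
        System.out.println("characters stored separately: " + countChars(words));
        System.out.println("trie nodes (root excluded):   " + countNodes(words));
    }
}
```

For "tea", "team" and "ten" this reports 10 characters against 5 trie nodes, because "te" and "tea" are stored only once. The saving is in stored characters; whether it also saves bytes depends on the per-node overhead of the chosen node layout.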
package com.jwetherell.algorithms.data_structures;

import java.util.Arrays;

/**
 * A trie, or prefix tree, is an ordered tree data structure that is used to
 * store an associative array where the keys are usually strings.
 *
 * == This is NOT a compact Trie. ==
 *
 * http://en.wikipedia.org/wiki/Trie
 *
 * @author Justin Wetherell <phishman3579@gmail.com>
 */
public class Trie<C extends CharSequence> {

    private int size = 0;

    protected INodeCreator creator = null;
    protected Node root = null;

    /**
     * Default constructor.
     */
    public Trie() { }

    /**
     * Constructor with external Node creator.
     */
    public Trie(INodeCreator creator) {
        this.creator = creator;
    }

    /**
     * Create a new node for sequence.
     *
     * @param parent node of the new node.
     * @param character which represents this node.
     * @param isWord signifies if the node represents a word.
     * @return Node which was created.
     */
    protected Node createNewNode(Node parent, Character character, boolean isWord) {
        return (new Node(parent, character, isWord));
    }

    /**
     * Add sequence to trie.
     *
     * @param seq to add to the trie.
     * @return True if sequence is added to trie or false if it already exists.
     */
    public boolean add(C seq) {
        return (this.addSequence(seq) != null);
    }

    /**
     * Add sequence to trie.
     *
     * @param seq to add to the trie.
     * @return Node which was added to trie or null if it already exists.
     */
    protected Node addSequence(C seq) {
        // An empty sequence has no last character to mark as a word, so it is not stored
        if (seq.length() == 0) return null;

        if (root == null) {
            if (this.creator == null) root = createNewNode(null, null, false);
            else root = this.creator.createNewNode(null, null, false);
        }

        int length = (seq.length() - 1);
        Node prev = root;
        // For each Character in the input except the last, we'll either go to an
        // already defined child or create a child if one does not exist
        for (int i = 0; i < length; i++) {
            Node n = null;
            Character c = seq.charAt(i);
            int index = prev.childIndex(c);
            // If 'prev' has a child which starts with Character c
            if (index >= 0) {
                // Go to the child
                n = prev.getChild(index);
            } else {
                // Create a new child for the character
                if (this.creator == null) n = createNewNode(prev, c, false);
                else n = this.creator.createNewNode(prev, c, false);
                prev.addChild(n);
            }
            prev = n;
        }

        // Deal with the last character of the input separately: its node is the
        // one that gets marked as a word
        Node n = null;
        Character c = seq.charAt(length);
        int index = prev.childIndex(c);
        // If 'prev' already contains a child with the last Character
        if (index >= 0) {
            n = prev.getChild(index);
            // If the node doesn't represent a string already
            if (n.isWord == false) {
                // Mark the node as the end of the full input string
                n.character = c;
                n.isWord = true;
                size++;
                return n;
            } else {
                // String already exists in Trie
                return null;
            }
        } else {
            // Create a new node for the input string
            if (this.creator == null) n = createNewNode(prev, c, true);
            else n = this.creator.createNewNode(prev, c, true);
            prev.addChild(n);
            size++;
            return n;
        }
    }

    /**
     * Remove sequence from the trie.
     *
     * @param sequence to remove from the trie.
     * @return True if sequence was removed or false if sequence is not found.
     */
    public boolean remove(C sequence) {
        if (root == null) return false;

        // Find the key in the Trie
        Node previous = null;
        Node node = root;
        int length = (sequence.length() - 1);
        for (int i = 0; i <= length; i++) {
            char c = sequence.charAt(i);
            int index = node.childIndex(c);
            if (index >= 0) {
                previous = node;
                node = node.getChild(index);
            } else {
                return false;
            }
        }

        // The path exists but is only a prefix of other words (or the sequence is
        // empty), so there is nothing to remove and size must not change
        if (!node.isWord) return false;

        if (node.childrenSize > 0) {
            // The node which contains the input string has children, just unmark it as a word
            node.isWord = false;
        } else {
            // The node which contains the input string does NOT have children
            int index = previous.childIndex(node.character);
            // Remove node from previous node
            previous.removeChild(index);
            // Go back up the trie removing nodes until you find a node which represents a string
            while (previous != null && previous.isWord == false && previous.childrenSize == 0) {
                if (previous.parent != null) {
                    int idx = previous.parent.childIndex(previous.character);
                    if (idx >= 0) previous.parent.removeChild(idx);
                }
                previous = previous.parent;
            }
        }
        size--;
        return true;
    }

    /**
     * Get node which represents the sequence in the trie.
     *
     * @param seq to find a node for.
     * @return Node which represents the sequence or NULL if not found.
     */
    protected Node getNode(C seq) {
        if (root == null) return null;

        // Find the string in the trie
        Node n = root;
        int length = (seq.length() - 1);
        for (int i = 0; i <= length; i++) {
            char c = seq.charAt(i);
            int index = n.childIndex(c);
            if (index >= 0) {
                n = n.getChild(index);
            } else {
                // String does not exist in trie
                return null;
            }
        }
        return n;
    }

    /**
     * Does the trie contain the sequence.
     *
     * @param seq to locate in the trie.
     * @return True if sequence is in the trie.
     */
    public boolean contains(C seq) {
        Node n = this.getNode(seq);
        // If the node found in the trie is not marked as a word then the
        // input string was not found
        return (n != null && n.isWord);
    }

    /**
     * Number of sequences in the trie.
     *
     * @return number of sequences in the trie.
     */
    public int size() {
        return size;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public String toString() {
        return TriePrinter.getString(this);
    }

    protected static class Node {

        private static final int MINIMUM_SIZE = 2;

        protected Node[] children = new Node[MINIMUM_SIZE];
        protected int childrenSize = 0;
        protected Node parent = null;
        protected boolean isWord = false; // Signifies this node represents a word
        protected Character character = null; // First character that differs from the parent's string

        protected Node(Node parent, Character character, boolean isWord) {
            this.parent = parent;
            this.character = character;
            this.isWord = isWord;
        }

        protected void addChild(Node node) {
            // Grow the array by roughly 1.5x when it is full
            if (childrenSize >= children.length) {
                children = Arrays.copyOf(children, ((children.length * 3) / 2) + 1);
            }
            children[childrenSize++] = node;
        }

        protected boolean removeChild(int index) {
            if (index < 0 || index >= childrenSize) return false;

            children[index] = null;
            childrenSize--;

            // Shift the children after 'index' down by one
            System.arraycopy(children, index + 1, children, index, childrenSize - index);
            // Clear the stale duplicate left in the last slot by the shift
            children[childrenSize] = null;

            if (childrenSize >= MINIMUM_SIZE && childrenSize < children.length / 2) {
                children = Arrays.copyOf(children, childrenSize);
            }
            return true;
        }

        protected int childIndex(Character character) {
            // Linear scan; returns a negative value when no child matches
            for (int i = 0; i < childrenSize; i++) {
                Node c = children[i];
                if (c.character.equals(character)) return i;
            }
            return Integer.MIN_VALUE;
        }

        protected Node getChild(int index) {
            if (index >= childrenSize) return null;
            return children[index];
        }

        protected int getChildrenSize() {
            return childrenSize;
        }

        /**
         * {@inheritDoc}
         */
        @Override
        public String toString() {
            StringBuilder builder = new StringBuilder();
            if (isWord == true) builder.append("Node=").append(isWord).append("\n");
            for (int i = 0; i < childrenSize; i++) {
                Node c = children[i];
                builder.append(c.toString());
            }
            return builder.toString();
        }
    }

    protected static interface INodeCreator {

        /**
         * Create a new node for sequence.
         *
         * @param parent node of the new node.
         * @param character which represents this node.
         * @param isWord signifies if the node represents a word.
         * @return Node which was created.
         */
        public Node createNewNode(Node parent, Character character, boolean isWord);
    }

    protected static class TriePrinter {

        public static <C extends CharSequence> void print(Trie<C> trie) {
            System.out.println(getString(trie));
        }

        public static <C extends CharSequence> String getString(Trie<C> tree) {
            return getString(tree.root, "", null, true);
        }

        protected static <C extends CharSequence> String getString(Node node, String prefix, String previousString, boolean isTail) {
            StringBuilder builder = new StringBuilder();
            String string = null;
            if (node.character != null) {
                String temp = String.valueOf(node.character);
                if (previousString != null) string = previousString + temp;
                else string = temp;
            }
            builder.append(prefix + (isTail ? "└── " : "├── ")
                    + ((node.isWord == true) ? ("(" + node.character + ") " + string) : node.character) + "\n");
            if (node.children != null) {
                for (int i = 0; i < node.childrenSize - 1; i++) {
                    builder.append(getString(node.children[i], prefix + (isTail ? "    " : "│   "), string, false));
                }
                if (node.childrenSize >= 1) {
                    builder.append(getString(node.children[node.childrenSize - 1], prefix + (isTail ? "    " : "│   "), string, true));
                }
            }
            return builder.toString();
        }
    }
}
Test code
// Excerpt from a larger test class: debug, debugTime, debugMemory, validateStructure,
// validateContents, unsorted, sorted, INVALID, testNames, testIndex, testResults,
// handleError and DataStructures.getMemoryUse are defined outside this method and
// are not shown here.
private static boolean testTrie() {
    {
        long count = 0;

        long addTime = 0L;
        long removeTime = 0L;
        long beforeAddTime = 0L;
        long afterAddTime = 0L;
        long beforeRemoveTime = 0L;
        long afterRemoveTime = 0L;

        long memory = 0L;
        long beforeMemory = 0L;
        long afterMemory = 0L;

        // Trie.
        if (debug > 1) System.out.println("Trie.");
        testNames[testIndex] = "Trie";

        // Pass 1: add the unsorted items in order
        count++;

        if (debugMemory) beforeMemory = DataStructures.getMemoryUse();
        if (debugTime) beforeAddTime = System.currentTimeMillis();
        Trie<String> trie = new Trie<String>();
        for (int i = 0; i < unsorted.length; i++) {
            int item = unsorted[i];
            String string = String.valueOf(item);
            trie.add(string);
            if (validateStructure && !(trie.size() == i + 1)) {
                System.err.println("YIKES!! " + item + " caused a size mismatch.");
                handleError(trie);
                return false;
            }
            if (validateContents && !trie.contains(string)) {
                System.err.println("YIKES!! " + string + " doesn't exist.");
                handleError(trie);
                return false;
            }
        }
        if (debugTime) {
            afterAddTime = System.currentTimeMillis();
            addTime += afterAddTime - beforeAddTime;
            if (debug > 0) System.out.println("Trie add time = " + addTime / count + " ms");
        }
        if (debugMemory) {
            afterMemory = DataStructures.getMemoryUse();
            memory += afterMemory - beforeMemory;
            if (debug > 0) System.out.println("Trie memory use = " + (memory / count) + " bytes");
        }

        // A value that was never added must be neither found nor removable
        String invalid = INVALID.toString();
        boolean contains = trie.contains(invalid);
        boolean removed = trie.remove(invalid);
        if (contains || removed) {
            System.err.println("Trie invalidity check. contains=" + contains + " removed=" + removed);
            return false;
        } else System.out.println("Trie invalidity check. contains=" + contains + " removed=" + removed);

        if (debug > 1) System.out.println(trie.toString());

        long lookupTime = 0L;
        long beforeLookupTime = 0L;
        long afterLookupTime = 0L;
        if (debugTime) beforeLookupTime = System.currentTimeMillis();
        for (int item : unsorted) {
            String string = String.valueOf(item);
            trie.contains(string);
        }
        if (debugTime) {
            afterLookupTime = System.currentTimeMillis();
            lookupTime += afterLookupTime - beforeLookupTime;
            if (debug > 0) System.out.println("Trie lookup time = " + lookupTime / count + " ms");
        }

        if (debugTime) beforeRemoveTime = System.currentTimeMillis();
        for (int i = 0; i < unsorted.length; i++) {
            int item = unsorted[i];
            String string = String.valueOf(item);
            trie.remove(string);
            if (validateStructure && !(trie.size() == unsorted.length - (i + 1))) {
                System.err.println("YIKES!! " + item + " caused a size mismatch.");
                handleError(trie);
                return false;
            }
            if (validateContents && trie.contains(string)) {
                System.err.println("YIKES!! " + string + " still exists.");
                handleError(trie);
                return false;
            }
        }
        if (debugTime) {
            afterRemoveTime = System.currentTimeMillis();
            removeTime += afterRemoveTime - beforeRemoveTime;
            if (debug > 0) System.out.println("Trie remove time = " + removeTime / count + " ms");
        }

        contains = trie.contains(invalid);
        removed = trie.remove(invalid);
        if (contains || removed) {
            System.err.println("Trie invalidity check. contains=" + contains + " removed=" + removed);
            return false;
        } else System.out.println("Trie invalidity check. contains=" + contains + " removed=" + removed);

        // Pass 2: add the unsorted items in reverse order
        count++;

        if (debugMemory) beforeMemory = DataStructures.getMemoryUse();
        if (debugTime) beforeAddTime = System.currentTimeMillis();
        for (int i = unsorted.length - 1; i >= 0; i--) {
            int item = unsorted[i];
            String string = String.valueOf(item);
            trie.add(string);
            if (validateStructure && !(trie.size() == unsorted.length - i)) {
                System.err.println("YIKES!! " + item + " caused a size mismatch.");
                handleError(trie);
                return false;
            }
            if (validateContents && !trie.contains(string)) {
                System.err.println("YIKES!! " + string + " doesn't exist.");
                handleError(trie);
                return false;
            }
        }
        if (debugTime) {
            afterAddTime = System.currentTimeMillis();
            addTime += afterAddTime - beforeAddTime;
            if (debug > 0) System.out.println("Trie add time = " + addTime / count + " ms");
        }
        if (debugMemory) {
            afterMemory = DataStructures.getMemoryUse();
            memory += afterMemory - beforeMemory;
            if (debug > 0) System.out.println("Trie memory use = " + (memory / count) + " bytes");
        }

        contains = trie.contains(invalid);
        removed = trie.remove(invalid);
        if (contains || removed) {
            System.err.println("Trie invalidity check. contains=" + contains + " removed=" + removed);
            return false;
        } else System.out.println("Trie invalidity check. contains=" + contains + " removed=" + removed);

        if (debug > 1) System.out.println(trie.toString());

        lookupTime = 0L;
        beforeLookupTime = 0L;
        afterLookupTime = 0L;
        if (debugTime) beforeLookupTime = System.currentTimeMillis();
        for (int item : unsorted) {
            String string = String.valueOf(item);
            trie.contains(string);
        }
        if (debugTime) {
            afterLookupTime = System.currentTimeMillis();
            lookupTime += afterLookupTime - beforeLookupTime;
            if (debug > 0) System.out.println("Trie lookup time = " + lookupTime / count + " ms");
        }

        if (debugTime) beforeRemoveTime = System.currentTimeMillis();
        for (int i = 0; i < unsorted.length; i++) {
            int item = unsorted[i];
            String string = String.valueOf(item);
            trie.remove(string);
            if (validateStructure && !(trie.size() == unsorted.length - (i + 1))) {
                System.err.println("YIKES!! " + item + " caused a size mismatch.");
                handleError(trie);
                return false;
            }
            if (validateContents && trie.contains(string)) {
                System.err.println("YIKES!! " + string + " still exists.");
                handleError(trie);
                return false;
            }
        }
        if (debugTime) {
            afterRemoveTime = System.currentTimeMillis();
            removeTime += afterRemoveTime - beforeRemoveTime;
            if (debug > 0) System.out.println("Trie remove time = " + removeTime / count + " ms");
        }

        contains = trie.contains(invalid);
        removed = trie.remove(invalid);
        if (contains || removed) {
            System.err.println("Trie invalidity check. contains=" + contains + " removed=" + removed);
            return false;
        } else System.out.println("Trie invalidity check. contains=" + contains + " removed=" + removed);

        // Pass 3: add the sorted items, then remove them in reverse order
        long addSortedTime = 0L;
        long removeSortedTime = 0L;
        long beforeAddSortedTime = 0L;
        long afterAddSortedTime = 0L;
        long beforeRemoveSortedTime = 0L;
        long afterRemoveSortedTime = 0L;

        if (debugMemory) beforeMemory = DataStructures.getMemoryUse();
        if (debugTime) beforeAddSortedTime = System.currentTimeMillis();
        for (int i = 0; i < sorted.length; i++) {
            int item = sorted[i];
            String string = String.valueOf(item);
            trie.add(string);
            if (validateStructure && !(trie.size() == (i + 1))) {
                System.err.println("YIKES!! " + item + " caused a size mismatch.");
                handleError(trie);
                return false;
            }
            if (validateContents && !trie.contains(string)) {
                System.err.println("YIKES!! " + item + " doesn't exist.");
                handleError(trie);
                return false;
            }
        }
        if (debugTime) {
            afterAddSortedTime = System.currentTimeMillis();
            addSortedTime += afterAddSortedTime - beforeAddSortedTime;
            if (debug > 0) System.out.println("Trie add time = " + addSortedTime + " ms");
        }
        if (debugMemory) {
            afterMemory = DataStructures.getMemoryUse();
            memory += afterMemory - beforeMemory;
            if (debug > 0) System.out.println("Trie memory use = " + (memory / (count + 1)) + " bytes");
        }

        contains = trie.contains(invalid);
        removed = trie.remove(invalid);
        if (contains || removed) {
            System.err.println("Trie invalidity check. contains=" + contains + " removed=" + removed);
            return false;
        } else System.out.println("Trie invalidity check. contains=" + contains + " removed=" + removed);

        if (debug > 1) System.out.println(trie.toString());

        lookupTime = 0L;
        beforeLookupTime = 0L;
        afterLookupTime = 0L;
        if (debugTime) beforeLookupTime = System.currentTimeMillis();
        for (int item : sorted) {
            String string = String.valueOf(item);
            trie.contains(string);
        }
        if (debugTime) {
            afterLookupTime = System.currentTimeMillis();
            lookupTime += afterLookupTime - beforeLookupTime;
            if (debug > 0) System.out.println("Trie lookup time = " + lookupTime / (count + 1) + " ms");
        }

        if (debugTime) beforeRemoveSortedTime = System.currentTimeMillis();
        for (int i = sorted.length - 1; i >= 0; i--) {
            int item = sorted[i];
            String string = String.valueOf(item);
            trie.remove(string);
            if (validateStructure && !(trie.size() == i)) {
                System.err.println("YIKES!! " + item + " caused a size mismatch.");
                handleError(trie);
                return false;
            }
            if (validateContents && trie.contains(string)) {
                System.err.println("YIKES!! " + item + " still exists.");
                handleError(trie);
                return false;
            }
        }
        if (debugTime) {
            afterRemoveSortedTime = System.currentTimeMillis();
            removeSortedTime += afterRemoveSortedTime - beforeRemoveSortedTime;
            if (debug > 0) System.out.println("Trie remove time = " + removeSortedTime + " ms");
        }

        contains = trie.contains(invalid);
        removed = trie.remove(invalid);
        if (contains || removed) {
            System.err.println("Trie invalidity check. contains=" + contains + " removed=" + removed);
            return false;
        } else System.out.println("Trie invalidity check. contains=" + contains + " removed=" + removed);

        // Record the averaged results for this data structure
        if (testResults[testIndex] == null) testResults[testIndex] = new long[6];
        testResults[testIndex][0] += addTime / count;
        testResults[testIndex][1] += removeTime / count;
        testResults[testIndex][2] += addSortedTime;
        testResults[testIndex][3] += removeSortedTime;
        testResults[testIndex][4] += lookupTime / (count + 1);
        testResults[testIndex++][5] += memory / (count + 1);

        if (debug > 1) System.out.println();
    }

    return true;
}