Closed. This question is
off-topic。它当前不接受答案。
想改善这个问题吗?
Update the question,所以它是
on-topic,用于堆栈溢出。
4年前关闭。
我不确定我将如何攻击霍夫曼树的遍历。这棵树是正确的,我很难弄清楚如何以一种好的方式遍历它。由于某种原因,我的遍历方法没有任何结果...
更新:清理代码,使其更加面向对象
节点类:
public class Node
{
public int frekvens; //Frequency
public char tegn; //Symbol
public Node venstre; //Left child
public Node høyre; //Right child
public string s; //result string
public string resultat;
public Node (char c) // Node constructor containing symbol.
{
frekvens = 1;
tegn = c;
}
public Node (int f, Node venstre, Node høyre) // Node Constructor containing frequency and children
{
frekvens = f;
this.venstre = venstre;
this.høyre = høyre;
}
public Node (Node node) // Node constructor containing a node
{
frekvens = node.frekvens;
tegn = node.tegn;
this.venstre = venstre;
this.høyre = høyre;
}
public void ØkMed1() // Inkrement frequency by one
{
frekvens = frekvens + 1;
}
public char getVenstreTegn ()
{
return venstre.tegn;
}
public char getHøyreTegn ()
{
return venstre.tegn;
}
public int getVenstreFrekvens ()
{
return venstre.frekvens;
}
public int getHøyreFrekvens ()
{
return høyre.frekvens;
}
public int getFrekvens()
{
return frekvens;
}
public bool ErTegn(char c)
{
if ( c == tegn)
{
return false;
}
else
{
return true;
}
}
//Pretty sure this does not work as intended
public string traverser (Node n) //Traverse the tree
{
if (n.tegn != '\0') //If the node containes a symbol --> a leaf
{
resultat += s;
}
else
{
if (n.getVenstreTegn() == '\0') //If left child does not have a symbol
{
s += "0";
traverser(n.venstre);
}
if (n.getHøyreTegn() == '\0') //If right child does not have a symbol
{
s += "1";
traverser(n.høyre);
}
}
return resultat;
}
public string Resultat() //Used priviously to check if i got the correct huffman tree
{
string resultat;
resultat = "Tegn: " + Convert.ToString(tegn) +" frekvens: " + Convert.ToString(frekvens) + "\n";
return resultat;
}
}
Huffman_Tree类:
public class Huffman_Tre
{
string treString;
List<Node> noder = new List<Node>();
public Node rot;
public void bygg (string input)
{
bool funnet; //Found
char karakter; //character
for (int i = 0; i < input.Length;i++) //Loops through string and sets character
//with coresponding freqeuncy in the node list
{
karakter = input[i];
funnet = false; //default
for (int j = 0; j< noder.Count; j++)
{
if (noder[j].ErTegn(karakter) == false) //if the character already exists
{
noder[j].ØkMed1(); //inkrement frequency by one
funnet = true;
break;
}
}
if (!funnet) //if the character does not exist
{
noder.Add(new Node(karakter)); //add the character to list
}
}
//Sorting node list acending by frequency
var sortertListe = noder.OrderBy(c => c.frekvens).ToList();
noder = sortertListe;
do
{
noder.Add(new Node((noder[0].frekvens + noder[1].frekvens), noder[0],noder[1]));
//Remove the leaf nodes
noder.RemoveAt(0);
noder.RemoveAt(0);
} while(noder.Count >= 2);
}
public Node getRot()
{
return rot;
}
public string visTre()
{
foreach (Node node in noder)
{
treString += node.Resultat();
}
return treString;
}
public bool erNull()
{
if (noder[0].tegn == '\0')
{
return true;
}
else
return false;
}
}
主程序:
private void btnKomprimer_Click(object sender, System.Windows.RoutedEventArgs e)
{
string input; //The string input I want to compress
input = txtInput.Text; //initialize input to text input
input = input.ToLower();
txtOutput.Text = "";
Huffman_Tre tre = new Huffman_Tre();
tre.bygg(input);
Node rot = new Node(tre.getRot());
txtOutput.Text += rot.traverser(rot);
}
}
还有一点时间,在玩C#6.0的同时,我给出了一个霍夫曼树的例子。它尚未优化(甚至没有优化!),但它可以很好地作为示例。并且它将帮助您了解“挑战”可能发生的地方。由于我的英语水平远胜于北欧语言,因此我使用英语命名,希望您不要介意。
首先,让我们从保留频率的类开始。
public sealed class HuffmanFrequencyTable
{
#region Properties
/// <summary>
/// Holds the characters and their corresponding frequencies
/// </summary>
public Dictionary<char, int> FrequencyTable { get; set; } = new Dictionary<char, int>();
#endregion
#region Methods
/// <summary>
/// Clears the internal frequency table
/// </summary>
public void Clear()
{
FrequencyTable?.Clear();
}
/// <summary>
/// Accepts and parses a new line (string) which is then
/// merged with the existing dictionary or frequency table
/// </summary>
/// <param name="line">The line to parse</param>
public void Accept(string line)
{
if (!string.IsNullOrEmpty(line))
{
line.GroupBy(ch => ch).
ToDictionary(g => g.Key, g => g.Count()).
ToList().
ForEach(x => FrequencyTable[x.Key] = x.Value);
}
}
/// <summary>
/// Performs a dump of the frequency table, ordering all characters, lowest frequency first.
/// </summary>
/// <returns>The frequency table in the format 'character [frequency]'</returns>
public override string ToString()
{
return FrequencyTable?.PrintFrequencies();
}
#endregion
}
请注意,ToString()方法使用了一种扩展方法,该方法能够“转储”所用字典的内容。这些扩展位于名为Helpers的静态类中,如下所示:
/// <summary>
/// Extension method that helps to write the contents of a generic Dictionary to a string, ordered by it's values and
/// printing the key and it's value between brackets.
/// </summary>
/// <typeparam name="TKey">Generic key</typeparam>
/// <typeparam name="TValue">Generic value type</typeparam>
/// <param name="dictionary">The dictionary</param>
/// <exception cref="ArgumentNullException">Throws an argument null exception if the provided dictionary is null</exception>
/// <returns></returns>
public static string PrintFrequencies<TKey, TValue>(this IDictionary<TKey, TValue> dictionary)
{
if (dictionary == null)
throw new ArgumentNullException("dictionary");
var items = from kvp in dictionary
orderby kvp.Value
select kvp.Key + " [" + kvp.Value + "]";
return string.Join(Environment.NewLine, items);
}
现在,有了这个FrequencyTable,我们可以开始研究如何构建节点。 Huffman使用二叉树,因此最好生成一个具有左右子节点的Node类。我也自由地在这里执行遍历算法。此类的构建如下:
public sealed class HuffmanNode
{
#region Properties
/// <summary>
/// Holds the left node, if applicable, otherwise null
/// </summary>
public HuffmanNode Left { get; set; } = null;
/// <summary>
/// Holds the right node, if applicable, otherwise null
/// </summary>
public HuffmanNode Right { get; set; } = null;
/// <summary>
/// Holds the Character (or null) for this particular node
/// </summary>
public char? Character { get; set; } = null;
/// <summary>
/// Holds the frequency for this particular node, defaulted to 0
/// </summary>
public int Frequency { get; set; } = default(int);
#endregion
#region Methods
/// <summary>
/// Traverses all nodes recursively returning the binary
/// path for the corresponding character that has been found.
/// </summary>
/// <param name="character">The character to find</param>
/// <param name="data">The datapath (containing '1's and '0's)</param>
/// <returns>The complete binary path for a character within a node</returns>
public List<bool> Traverse(char? character, List<bool> data)
{
//Check the leafs for existing characters
if (null == Left && null == Right)
{
//We're at an endpoint of our 'tree', so return it's data or nothing when the symbol
//characters do not match
return (bool)character?.Equals(Character) ? data : null;
}
else
{
List<bool> left = null;
List<bool> right = null;
//TODO: If possible refactor with proper C# 6.0 features
if (null != Left)
{
List<bool> leftPath = new List<bool>(data);
leftPath.Add(false); //Add a '0'
left = Left.Traverse(character, leftPath); //Recursive traversal for child nodes within this left node.
}
if (null != Right)
{
List<bool> rightPath = new List<bool>(data);
rightPath.Add(true); //Add a '1'
right = Right.Traverse(character, rightPath); //Recursive traversal for childnodes within this right node
}
return (null != left) ? left : right;
}
}
#endregion
}
我在HuffmanTree类中使用Node类。从逻辑上讲,树是由节点组成的。相应的HuffmanTree是这样写的:
public sealed class HuffmanTree
{
#region Fields
/// <summary>
/// Field for keeping the Huffman nodes in. Internally used.
/// </summary>
private List<HuffmanNode> nodes = new List<HuffmanNode>();
#endregion
#region Properties
/// <summary>
/// Holds the Huffman tree
/// </summary>
public HuffmanNode Root { get; set; } = null;
/// <summary>
/// Holds the frequency table for all parsed characters
/// </summary>
public HuffmanFrequencyTable Frequencies { get; private set; } = new HuffmanFrequencyTable()
/// <summary>
/// Holds the amount of bits after encoding the tree.
/// Primary usable for decoding.
/// </summary>
public int BitCountForTree { get; private set; } = default(int);
#endregion
#region Methods
/// <summary>
/// Builds the Huffman tree
/// </summary>
/// <param name="source">The source to build the Hufftree from</param>
/// <exception cref="ArgumentNullException">Thrown when source is null or empty</exception>
public void BuildTree(string source)
{
nodes.Clear(); //As we build a new tree, first make sure it's clean :)
if (string.IsNullOrEmpty(source))
throw new ArgumentNullException("source");
else
{
Frequencies.Accept(source);
foreach (KeyValuePair<char, int> symbol in Frequencies.FrequencyTable)
{
nodes.Add(new HuffmanNode() { Character = symbol.Key, Frequency = symbol.Value });
}
while (nodes.Count > 1)
{
List<HuffmanNode> orderedNodes = nodes.OrderBy(node => node.Frequency).ToList();
if (orderedNodes.Count >= 2)
{
List<HuffmanNode> takenNodes = orderedNodes.Take(2).ToList();
HuffmanNode parent = new HuffmanNode()
{
Character = null,
Frequency = takenNodes[0].Frequency + takenNodes[1].Frequency,
Left = takenNodes[0],
Right = takenNodes[1]
};
//Remove the childnodes from the original node list and add the new parent node
nodes.Remove(takenNodes[0]);
nodes.Remove(takenNodes[1]);
nodes.Add(parent);
}
}
Root = nodes.FirstOrDefault();
}
}
/// <summary>
/// Encodes a given string to the corresponding huffman encoding path
/// </summary>
/// <param name="source">The source to encode</param>
/// <returns>The binary huffman representation of the source</returns>
public BitArray Encode(string source)
{
if (!string.IsNullOrEmpty(source))
{
List<bool> encodedSource = new List<bool>();
//Traverse the tree for each character in the passed source (string) and add the binary path to the encoded source
encodedSource.AddRange(source.SelectMany(character =>
Root.Traverse(character, new List<bool>())
).ToList()
);
//For decoding, we might need the amount of bits to skip trailing bits.
BitCountForTree = encodedSource.Count;
return new BitArray(encodedSource.ToArray());
}
else return null;
}
/// <summary>
/// Decodes a given binary path to represent it's string value
/// </summary>
/// <param name="bits">BitArray for traversing the tree</param>
/// <returns></returns>
public string Decode(BitArray bits)
{
HuffmanNode current = Root;
string decodedString = string.Empty;
foreach (bool bit in bits)
{
//Find the correct current node depending on the bit set or not set.
current = (bit ? current.Right ?? current : current.Left ?? current);
if (current.IsLeaf())
{
decodedString += current.Character;
current = Root;
}
}
return decodedString;
}
#endregion
}
这段代码中有趣的是,我决定使用
BitArrays
,该
public BitArray Encode(string source)
在树建立时将保留树的二进制路径。这里的
BitCountForTree
方法包含一个肮脏的hack。我跟踪用于编码的总位数并将其存储在
Encoding.Default.GetBytes(tbOutput.Text)
属性中。执行解码时,我将使用此属性删除可能出现的任何尾随位。有一种更好的方法可以执行此操作,但我将保留此方法供您查找。
同样,此类使用为HuffmanNode编写的扩展方法。不过这很简单:
/// <summary>
/// Determines whether a given Huffman node is a leaf or not.
/// A node is considered to be a leaf when it has no childnodes
/// </summary>
/// <param name="node">A huffman node</param>
/// <returns>True if no children are left, false otherwise</returns>
public static bool IsLeaf(this HuffmanNode node)
{
return (null == node.Left && null == node.Right);
}
此扩展方法可方便地确定给定节点是否实际上是叶节点。叶子是没有子节点的节点,因此是二叉树的末端(或者最好是该树的分支)。
现在有趣的部分是,我如何在这里工作。我已经构建了一个具有3个文本框的Windows窗体应用程序。一个用于实际输入,一个用于二进制(编码)输出,最后一个用于显示压缩结果。
我还放置了两个简单的按钮,一个用于执行霍夫曼编码,一个用于霍夫曼解码。
霍夫曼编码方法编写如下(仅在编码按钮的事件处理程序中):
string input = tbInput.Text;
Tree.BuildTree(input); //Build the huffman tree
BitArray encoded = Tree.Encode(input); //Encode the tree
//First show the generated binary output
tbBinaryOutput.Text = string.Join(string.Empty, encoded.Cast<bool>().Select(bit => bit ? "1" : "0"));
//Next, convert the binary output to the new characterized output string.
byte[] bytes = new byte[(encoded.Length / 8) + 1];
encoded.CopyTo(bytes, 0);
tbOutput.Text = Encoding.Default.GetString(bytes); //Write the compressed output to the textbox.
请注意,编码的二进制字符串没有任何尾随位。我将保留C#的编码机制。不利的一面是,在解码时我必须跟踪它。
现在解码也不太困难。尽管在本例中,我使用的是上面放置的编码代码生成的压缩输出。另外,我假设霍夫曼树(及其频率表!!!)已经建立。通常,频率表存储在压缩文件中,因此可以对其进行重建。
//First convert the compressed output to a bit array again again and skip trailing bits.
bool[] boolAr = new BitArray(Encoding.Default.GetBytes(tbOutput.Text)).Cast<bool>().Take(Tree.BitCountForTree).ToArray();
BitArray encoded = new BitArray( boolAr );
string decoded = Tree.Decode(encoded);
MessageBox.Show(decoded, "Decoded result: ", MessageBoxButtons.OK, MessageBoxIcon.Information);
请注意我创建的肮脏技巧,因为
肯定会生成一个字节数组,它可能包含不需要解码的尾随位。因此,根据重建树,我只需要实际需要的位数。
因此,在运行时,当使用“世界知名的句子”“快速的狐狸跳过懒惰的程序员”时,我的示例提供了以下输出:
按下“霍夫编码”按钮后:
并按下“ Huff解码”按钮后:
现在,此代码实际上可以使用某些优化,因为您可能会考虑使用数组而不是字典。还有更多,但要由您考虑。
我是一名优秀的程序员,十分优秀!