当前位置: 代码迷 >> 综合 >> *****Huffman Codes(※构造哈夫曼树,※构造前缀树)
  详细解决方案

*****Huffman Codes(※构造哈夫曼树,※构造前缀树)

热度:60   发布时间:2024-02-11 11:34:45.0

【学到的东西】

1)如何构造哈夫曼树(①优先队列、push*n、pop*2、push,同时累加,这样可以得到整棵树的带权路径长度(从一位大佬的代码里学到的)②直接构造二叉树(parent数组),可使用优先队列pop最小的两个出来(也可以用算法找出最小的两个数),合并后加入队列(数组),看了另外一个大佬的代码,写的很好),以及如何由哈夫曼树得到哈夫曼编码(从叶子开始找parent,如果是parent的左孩子,则+0,右孩子+1,最后颠倒,也是在“另外一个大佬的代码”里看到的)

2)如何构造字典树(next[],用code[i]-'a'表示next[]的下标(这里是code[i]-'0')),如何判断前缀码【我的做法是:Insert(code)时①若没有增加新结点,则code为别人的前缀②若在已成为编码的叶子上增加新结点,则该叶子是code的前缀,我用了每次插入code时最后一个结点的isCode来表示“已成为编码”,这样虽然很方便,但是会增加每个结点的空间)】

7-9 Huffman Codes (30分)

In 1953, David A. Huffman published his paper "A Method for the Construction of Minimum-Redundancy Codes", and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since "aaaxuaxz" and "aazuaxax" can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.

Input Specification:

Each input file contains one test case. For each case, the first line gives an integer N (2≤N≤63), then followed by a line that contains all the Ndistinct characters and their frequencies in the following format:

 

c[1] f[1] c[2] f[2] ... c[N] f[N]

where c[i] is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (≤1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:

c[i] code[i]

where c[i] is the i-th character and code[i] is an non-empty string of no more than 63 '0's and '1's.

Output Specification:

For each test case, print in each line either "Yes" if the student's submission is correct, or "No" if not.

Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.

Sample Input:

7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11

Sample Output:

Yes
Yes
No
No
#include<cstdio>
#include<iostream>
#include<cstring>
#include<queue>
using namespace std;
#define MaxSize 63char c[MaxSize];  //c[]数组
int fre_of_ch[128];  //记录每个ASCII码对应的频次
int N;  //字符个数
typedef struct trienode {  //用字典树(前缀树)判断前缀码int val;  //头结点不存储0、1struct trienode* next[2];int isCode;  //该结点是否为一个编码
}*TrieNode;TrieNode createNode() {TrieNode p = new struct trienode;p->val = -1;p->next[0] = p->next[1] = NULL;p->isCode = 0;return p;
}void Create_c(char c[]);
void Insert(string code, int& isPre, TrieNode t);
void freeTree(TrieNode t);int main() {Create_c(c);cin >> N;priority_queue<int, vector<int>, greater<int>> q;  //优先队列,push同时将小的排在最前面char ch;int fre, i, Huff_sum = 0;for (i = 0; i < N; ++i) {  //不改变Ncin >> ch >> fre;fre_of_ch[(int)ch] = fre;q.push(fre);}while (q.size() > 1) {fre = q.top();q.pop();fre += q.top();q.pop();q.push(fre);Huff_sum += fre;  //Huff_sum累加,最终得到哈夫曼编码的总长度。因为每个叶子在第几层就被累加了几次,所以能够得到整棵树的带权路径长度。}//printf("the shortest sum: %d\n", Huff_sum);int M, j, sum;string code;cin >> M;while (M--) {int isPre = 1;sum = 0;TrieNode t = createNode();for (i = 0; i < N; ++i) {  //不改变Ncin >> ch >> code;sum += code.length() * fre_of_ch[(int)ch];if (!isPre) continue;if (code.length() > N) {  //有符号数和无符号数运算,会归为无符号数isPre = 0;continue;  //不能break,因为本组输入还在继续}Insert(code, isPre, t);  //用字典树判断是否存在前缀重合的问题}//printf("the sum: %d\n", sum);if (isPre && sum == Huff_sum) cout << "Yes" << endl;else cout << "No" << endl;freeTree(t);}
}void Create_c(char c[]) {int i;for (i = 0; i < MaxSize; ++i) {if (i >= 0 && i <= 9) c[i] = '0' + i;else if (i >= 10 && i <= 35) c[i] = 'a' + i - 10;else if (i >= 36 && i <= 61) c[i] = 'A' + i - 36;else c[i] = '_';  //95}//print_c();
}void freeTree(TrieNode t) {TrieNode p = t;if (p->next[0]) freeTree(p->next[0]);if (p->next[1]) freeTree(p->next[1]);delete p;t = NULL;
}void Insert(string code, int &isPre, TrieNode t) {  int i, newnode = 0;  //newnode表示初始时没有产生新结点TrieNode p = t;for (i = 0; i < code.length(); ++i) {if (!isPre) break;if (p->next[code[i] - '0'] == NULL) {if (p->isCode) {  //p表示一个编码,若在上面插入新结点,则p表示的编码是code的前缀isPre = 0;break;}TrieNode q = createNode();q->val = code[i] - '0';if (i == code.length() - 1) q->isCode = 1;  //该code的最后一位p->next[code[i] - '0'] = q;newnode = 1;  //产生了新结点p = p->next[code[i] - '0'];}else {p = p->next[code[i] - '0'];}}if(newnode==0)isPre = 0;  //整个编码读完,没有产生新结点,说明该编码是其他编码的前缀
}

 

  相关解决方案