c - Lempel-Ziv-Welch 减压不存在指数-6ren

c - Lempel-Ziv-Welch 减压不存在指数

转载作者：太空狗更新时间：2023-10-29 17:02:00

27

4

我有一个基于 LZW 压缩/解压缩算法的实现，并且大部分都将其平方。但是，我在测试的其中一个文件中遇到了问题。以下是所述文件的文本

#include "bits.h"

int check_endianness(){
    int i = 1;

我的实现遇到的问题是 int i = 1; 之前的空格组下面我分别包含了我的压缩和解压缩循环以及它们的相关调试输出。

压缩循环

i=0;
while(i < input_len && ret == LZW_ERR_OK){
    //get next byte
    char curChar = input_char(f_input, &io_flags);
    i++;

    //not necessary to check for stream end here since the while condition does that
    if(io_flags == STREAM_ERR_READ){
        ret = LZW_ERR_READ;
        break;
    }

    seqset(&temp, &curChar, 1);

    //add bytes to temp until a sequence is found that is not in lookup
    while(i < input_len && dictionary_has_entry(lookup, temp)){
        char curChar = input_char(f_input, &io_flags);
        i++;

        //check for errors / end of file
        if(io_flags != STREAM_ERR_OK){
            if(io_flags == STREAM_ERR_READ)
                ret = LZW_ERR_READ;
            break;
        }

        seqadd(&temp, &curChar, 1);
    }

    if(temp.length > 1){
        #ifdef DEBUG
        printf("Adding entry %d: ", lookup->next_value);
        for(int j = 0; j < temp.length; j++)
            printf("%.4x ", temp.data[j]);
        printf("\n");
        #endif // DEBUG
        dictionary_set_entry(lookup, temp, DICT_SET_FOREVER);
        temp.length--; //if temp.length == 1, then the EOF was probably reached
        i--; //This is important so that the entire file is read
    }

    fseek(f_input, -1, SEEK_CUR);
    output_short(f_output, dictionary_get_entry_byKey(lookup, temp)->value, STREAM_USE_ENDIAN);
    #ifdef DEBUG
    printf("Writing code: %d\n", dictionary_get_entry_byKey(lookup, temp)->value);
    #endif // DEBUG
}

压缩输出

Adding entry 297: 007b 000d
Writing code: 123
Adding entry 298: 000d 000a 0020
Writing code: 273
Adding entry 299: 0020 0020
Writing code: 32
Adding entry 300: 0020 0020 0020
Writing code: 299
Adding entry 301: 0020 0069
Writing code: 32
Adding entry 302: 0069 006e 0074 0020

减压循环

i=0;
while(i < input_len && ret == LZW_ERR_OK){
    short code;
    entry *e;
    code = input_short(f_input, &io_flags, STREAM_USE_ENDIAN);
    if(io_flags == STREAM_ERR_READ){
        ret = LZW_ERR_READ;
        break;
    }

    i += 2;

    //e should never be NULL
    printf("Reading code: %d\n", code);
    e = dictionary_get_entry_byValue(lookup, code);
    assert(e != NULL);

    seqset(&temp, e->key.data, e->key.length);

    //requires a slightly different approach to the lookup loop in lzw_encode
    while(i < input_len && e != NULL){
        code = input_short(f_input, &io_flags, STREAM_USE_ENDIAN);
        //check for errors / end of file
        if(io_flags != STREAM_ERR_OK){
            if(io_flags == STREAM_ERR_READ)
                ret = LZW_ERR_READ;
            break;
        }
        i += 2;
        printf("Reading code: %d\n", code);

        //e should never be NULL
        e = dictionary_get_entry_byValue(lookup, code);
        assert(e != NULL); <------------ This is where it is failing

        //start adding bytes to temp
        for(size_t j = 0; j < e->key.length; j++){
            seqadd(&temp, &e->key.data[j], 1);
            if(dictionary_get_entry_byKey(lookup, temp) == NULL){

                //sequence not found, add it to dictionary
                printf("Adding entry %d: ", lookup->next_value);
                dictionary_set_entry(lookup, temp, DICT_SET_FOREVER);
                for(int k = 0; k < temp.length; k++)
                    printf("%.4x ", temp.data[k]);
                printf("\n");
                e = NULL; //to escape from while
                break;
            }
        }
    }

    //if more than one code was read go back by two bytes
    if(e == NULL){
        i -= 2;
        fseek(f_input, -2, SEEK_CUR);
    }

    //only write up to the characters that made a known sequence
    temp.length--;
    for(size_t j = 0; j < temp.length; j++){
        output_char(f_output, temp.data[j]);
        #ifdef DEBUG
        //printf("%c", temp.data[j]);
        #endif // DEBUG
    }
}

解压输出

Reading code: 123
Reading code: 273
Adding entry 297: 007b 000d
Reading code: 273
Reading code: 32
Adding entry 298: 000d 000a 0020
Reading code: 32
Reading code: 299
299, 299 <----error output from dictionary (code asked > next value)
Assertion failed: e != NULL, file lzw.c, line 212

如有任何帮助，我们将不胜感激。

最佳答案

您遇到了 Lempel Ziv Welch 减压算法中臭名昭著的 KwKwK 问题。

来自原始文章，A Technique for High-Performance Data Compression ，Terry A. Welch，IEEE 计算机，1984 年 6 月，第 8-19 页:

The abnormal case occurs whenever an input character string contains the sequence KwKwK, where Kw already appears in the compressor string table. The compressor will parse out Kw, send CODE(Kw), and add KwK to its string table. Next it will parse out KwK and send the just-generated CODE(KwK). The decompressor, on receiving CODE(KwK), will not yet have added that code to its string table because it does not yet know an extension character for the prior string.

文章解释了如何解决这个问题。

关于c - Lempel-Ziv-Welch 减压不存在指数，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/42130786/

27

4

0

文章推荐： Angular 2's component' s样式在封装的子组件中没有继承

文章推荐： const 导致不兼容的指针类型。为什么只针对双指针？

c - Baum Welch 算法的初始隐马尔可夫模型
在尝试为隐马尔可夫模型编写程序时，我对 Baum-Welch 算法的初始 HMM 做了最简单的假设:将所有内容都作为均匀分布。也就是说， A[i][j] = 1/statenumber; B[i][j
java - Baum-Welch 实现示例
我正在尝试了解 Baum-Welch 算法(与隐马尔可夫模型一起使用)。我了解前向后向模型的基本理论，但如果有人能用一些代码来帮助解释它会很好(我发现阅读代码更容易，因为我可以玩弄它来理解它)。我检查
nlp - 用于词性标注器的 Baum-Welch 算法
大家。我正在使用 Baum-Welch 算法来训练词性标注器，它完全是无监督的方式。问题来了:当我得到标签结果时，我只得到一个数字序列。我不知道哪个标签代表 VV、NN、DT。我该如何解决这个问题？
algorithm - 为什么在 Baum Welch 算法中使用隐马尔可夫模型与马尔可夫模型
所以我正在尝试构建 Baum Welch 算法来进行词性标注以供练习。但是，我对使用隐马尔可夫模型与马尔可夫模型感到困惑。因为看起来你正在失去从一个状态到另一个状态的上下文。由于在移动到下一个状态时不
c++ - Lempel-Ziv-Welch 算法的解码问题
我必须实现 LZW 算法，但我发现解码部分存在一些问题。我认为代码是正确的，因为它适用于我在网上某处找到的示例:如果我按如下方式初始化我的字典 m_dictionary.push_back("a");
c - Lempel-Ziv-Welch 减压不存在指数
我有一个基于 LZW 压缩/解压缩算法的实现，并且大部分都将其平方。但是，我在测试的其中一个文件中遇到了问题。以下是所述文件的文本 #include "bits.h" int check_endian
r - 两个样本(不等方差)welch t 检验的统计功效和样本量确定？
两个样本 t 检验(等方差合并标准偏差)的命令是 power.t.test(n=, delta=, sd=, type="two.sample") 给定不等方差和样本数的两个样本，我如何计算统计功效？
dictionary - LZW (Limpel-Ziv-Welch) 字典编码定界符问题
本题可能并不严格限于LZW算法，可能涵盖LZ77和LZ78的其他实现: 我一直在尝试编写一个涉及 LZW 字典编码方案的压缩/解压缩实用程序。问题是我发现有必要在编码阶段写出每个代码字(或“代码”)后
python - 似乎无法在 scipy 中正确使用 welch 函数
我正在尝试通过使用Python的scipy包来学习DSP。我已经测量了来自机器的一些 200Hz 信号。现在我想检查信号的频谱密度。这是绘制出来的信号: 正如您所看到的，该信号显然不是低频信号。但是，
python - 为什么 scipy.signal.welch 会抑制零频？
我尝试用whelch方法，发现零频不正常 import numpy as np import scipy.signal as signal import matplotlib.pyplot as pl
algorithm - Viterbi 训练或 Baum-Welch 算法来估计转换和发射概率？
我正在尝试使用 Viterbi 算法在 HMM 上找到最可能的路径(即状态序列)。但是，我不知道转换矩阵和发射矩阵，我需要根据观察(数据)来估计它们。要估计这些矩阵，我应该使用哪种算法:Baum-W
python - 使用 scipy.signal.welch 找不到合适的能量
对于给定的离散时间信号 x(t)带间距dt (等于 1/fs，fs 是采样率)，能量为: E[x(t)] = sum(abs(x)**2.0)/fs 然后我做 x(t) 的 DFT : x_tf =
c - Baum Welch 算法中的 Gamma 和浮点精度
我目前正在尝试用 C 语言实现 Baum Welch 算法，但我遇到了以下问题:gamma 函数: gamma(i,t) = alpha(i,t) * beta(i,t) / sum over `i`
algorithm - 是否可以使用 Baum-Welch 算法估计(隐藏)状态的初始分布？
我已经实现了 Baum-Welch 算法，并且正在玩一些玩具数据，这些数据是根据已知分布生成的。数据呈正态分布，根据隐藏状态具有不同的均值和标准差。有2个州。除了隐藏状态的初始分布之外，该算法似乎收敛
python - Scipy welch 和 MATLAB pwelch 没有提供相同的答案
我在 python 中遇到了名为 welch ( https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.signal.we
hidden-markov-models - 我必须如何使用 Baum-Welch 和多次观察来训练 HMM？
我在理解 Baum-Welch 算法的工作原理时遇到了一些问题。我读到它调整了 HMM 的参数(转换和发射概率)，以最大化给定模型可以看到我的观察序列的概率。但是，如果我有多个观察序列会发生什么？我
r - 在 R 中使用 Welch 校正的单向 ANOVA 的事后检验
我在 R 中使用 oneway.test() 运行了带有韦尔奇校正的单向方差分析测试，因为我的数据违反了等方差假设(转换没有解决问题) . 一个简单的数据示例: > dput(df) structur
python - scipy.signal welch 与 matlab pwelch 等价
我有以下 MATLAB 代码来计算信号的 PSD: x = linspace(0, 10, 100001); dt = x(2) - x(1); Fs = 1 / dt; a1 = 1; f1 = 5
r - 如何在 R 中进行 lm() 输出 welch t 测试
有人告诉我，也见过一些例子，其中线性模型和 t 检验基本上是相同的检验，只是 t 检验是带有虚拟编码预测变量的专门线性模型。有没有办法让 lm 的输出输出与 r 中正常 t.test 函数相同的 t
r - 如何在 R 中进行 lm() 输出 welch t 测试
有人告诉我，也见过一些例子，其中线性模型和 t 检验基本上是相同的检验，只是 t 检验是带有虚拟编码预测变量的专门线性模型。有没有办法让 lm 的输出输出与 r 中正常 t.test 函数相同的 t

首页

博学

6Ren·AI

商城

c - Lempel-Ziv-Welch 减压不存在指数