c++ - Concealing packet loss in a PCM stream

Reposted by: 太空宇宙. Updated: 2023-11-04 14:27:07

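I want to use "packet loss concealment" (PLC) to conceal lost PCM frames in an audio stream. Unfortunately, I cannot find a library that comes without licensing restrictions and code bloat (though I am open to suggestions).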

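I have found some GPL code written by Steve Underwood for the Asterisk project that implements PLC. It has several limitations; however, as Steve suggests in his code, the algorithm can be applied to other streams with some work. Currently, the code works on 8 kHz, 16-bit signed, mono streams.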

Variations of the code can be found with a simple search on Google Code Search.

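My hope is to adapt the code to work with other streams. Initially, the goal is to adjust the algorithm for 8+ kHz, 16-bit signed, multichannel audio (all in a C++ environment). Eventually, I would like to make the code available under the GPL license, in the hope that it helps others...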
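For multichannel audio, 16-bit PCM is commonly interleaved frame by frame, so sample `i` of channel `c` sits at index `i * channels + c`. The sketch below assumes that layout, which is also the index arithmetic the code in this post uses on the caller's buffer. `extract_channel` is a hypothetical helper, not part of the original code.

```cpp
#include <vector>

// Copy one channel out of an interleaved buffer (L R L R ... for stereo).
std::vector<short> extract_channel(const short *interleaved, int frames,
                                   int channels, int channel_index)
{
    std::vector<short> mono(frames);

    for (int i = 0; i < frames; i++)
        mono[i] = interleaved[i * channels + channel_index];

    return mono;
}
```

One design point worth keeping in mind: the channel stride applies only to the interleaved buffer handed in by the caller. A per-channel history buffer is already mono, so it should be indexed with a stride of 1.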

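My attempt is attached below. The code includes a main function that "drops" frames with a given probability. Unfortunately, the code does not work as expected. I get EXC_BAD_ACCESS when running it in gdb, but gdb gives me no trace when I use the "bt" command. Clearly I am trampling on memory somewhere, but I am not sure exactly where. When I comment out the amdf_pitch function, the code runs without crashing...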
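The test bench below decides whether to drop each block with `rand() % 100 + 1`. As an alternative sketch of the same idea, assuming C++11 is available, the decision can be isolated in a small function built on `<random>`. `should_drop` is a hypothetical name, not something from the original code.

```cpp
#include <random>

// Returns true when the current block should be treated as lost.
// loss_percent is the drop probability in percent (0 to 100).
bool should_drop(std::mt19937 &rng, int loss_percent)
{
    // Draw an integer in [1, 100]; values at or below loss_percent count as a loss.
    std::uniform_int_distribution<int> dist(1, 100);
    return dist(rng) <= loss_percent;
}
```

Seeding the generator with a fixed value makes a failing run repeatable, which can help when chasing a crash that depends on where the losses fall.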

int main (int argc, char *argv[])
{
 std::ifstream fin("C:\\cc32kHz.pcm");

 if(!fin.is_open())
 {
  std::cout << "Failed to open input file" << std::endl;
  return 1;
 }

 std::ofstream fout_repaired("C:\\cc32kHz_repaired.pcm");

 if(!fout_repaired.is_open())
 {
  std::cout << "Failed to open output repaired file" << std::endl;
  return 1;
 }

 std::ofstream fout_lossy("C:\\cc32kHz_lossy.pcm");

 if(!fout_lossy.is_open())
 {
  std::cout << "Failed to open output lossy file" << std::endl;
  return 1;
 }

 audio::PcmConcealer Concealer;
 Concealer.Init(1, 16, 32000);

 //Generate random numbers;
 srand( time(NULL) );

 int value = 0;
 int probability = 5;

 while(!fin.eof())
 {
  char arr[2];
  fin.read(arr, 2);

  //Generate's random number;
  value = rand() % 100 + 1;

  if(value <= probability)
  {
   char blank[2] = {0x00, 0x00};

   fout_lossy.write(blank, 2);

   //Fill in data;
   Concealer.Fill((int16_t *)blank, 1);
   fout_repaired.write(blank, 2);
  }
  else
  {
   //Write data to file;
   fout_repaired.write(arr, 2);
   fout_lossy.write(arr, 2);

   Concealer.Receive((int16_t *)arr, 1);
  }
 }

 fin.close();
 fout_repaired.close();
 fout_lossy.close();

 return 0;
}

PcmConcealer.hpp

/*
 * Code adapted from Steve Underwood of the Asterisk Project. This code inherits
 * the same licensing restrictions as the Asterisk Project.
 */


#ifndef __PCMCONCEALER_HPP__
#define __PCMCONCEALER_HPP__

/**

1. What does it do?
The packet loss concealment module provides a suitable synthetic fill-in signal,
to minimise the audible effect of lost packets in VoIP applications. It is not
tied to any particular codec, and could be used with almost any codec which does not
specify its own procedure for packet loss concealment.

Where a codec specific concealment procedure exists, the algorithm is usually built
around knowledge of the characteristics of the particular codec. It will, therefore,
generally give better results for that particular codec than this generic concealer will.

2. How does it work?
While good packets are being received, the plc_rx() routine keeps a record of the trailing
section of the known speech signal. If a packet is missed, plc_fillin() is called to produce
a synthetic replacement for the real speech signal. The average magnitude difference function
(AMDF) is applied to the last known good signal, to determine its effective pitch.
Based on this, the last pitch period of signal is saved. Essentially, this cycle of speech
will be repeated over and over until the real speech resumes. However, several refinements
are needed to obtain smooth pleasant sounding results.

- The two ends of the stored cycle of speech will not always fit together smoothly. This can
  cause roughness, or even clicks, at the joins between cycles. To soften this, the
  1/4 pitch period of real speech preceding the cycle to be repeated is blended with the last
  1/4 pitch period of the cycle to be repeated, using an overlap-add (OLA) technique (i.e.
  in total, the last 5/4 pitch periods of real speech are used).

- The start of the synthetic speech will not always fit together smoothly with the tail of
  real speech passed on before the erasure was identified. Ideally, we would like to modify
  the last 1/4 pitch period of the real speech, to blend it into the synthetic speech. However,
  it is too late for that. We could have delayed the real speech a little, but that would
  require more buffer manipulation, and hurt the efficiency of the no-lost-packets case
  (which we hope is the dominant case). Instead we use a degenerate form of OLA to modify
  the start of the synthetic data. The last 1/4 pitch period of real speech is time reversed,
  and OLA is used to blend it with the first 1/4 pitch period of synthetic speech. The result
  seems quite acceptable.

- As we progress into the erasure, the chances of the synthetic signal being anything like
  correct steadily fall. Therefore, the volume of the synthesized signal is made to decay
  linearly, such that after 50ms of missing audio it is reduced to silence.

- When real speech resumes, an extra 1/4 pitch period of synthetic speech is blended with the
  start of the real speech. If the erasure is small, this smoothes the transition. If the erasure
  is long, and the synthetic signal has faded to zero, the blending softens the start up of the
  real signal, avoiding a kind of "click" or "pop" effect that might occur with a sudden onset.

3. How do I use it?
Before audio is processed, call plc_init() to create an instance of the packet loss
concealer. For each received audio packet that is acceptable (i.e. not including those being
dropped for being too late) call plc_rx() to record the content of the packet. Note this may
modify the packet a little after a period of packet loss, to blend real and synthetic data smoothly.
When a real packet is not available in time, call plc_fillin() to create a synthetic substitute.
That's it!

*/


/*! Minimum allowed pitch (66 Hz) */
#define PLC_PITCH_MIN(SAMPLE_RATE) ((double)(SAMPLE_RATE) / 66.6)

/*! Maximum allowed pitch (200 Hz) */
#define PLC_PITCH_MAX(SAMPLE_RATE) ((SAMPLE_RATE) / 200)

/*! Maximum pitch OLA window */
//#define PLC_PITCH_OVERLAP_MAX(SAMPLE_RATE) ((PLC_PITCH_MIN(SAMPLE_RATE)) >> 2)

/*! The length over which the AMDF function looks for similarity (20 ms) */
#define CORRELATION_SPAN(SAMPLE_RATE) ((20 * (SAMPLE_RATE)) / 1000)

/*! History buffer length. The buffer must also be at least 1.25 times
    PLC_PITCH_MIN, but that is much smaller than the buffer needs to be for
    the pitch assessment. */
//#define PLC_HISTORY_LEN(SAMPLE_RATE) ((CORRELATION_SPAN(SAMPLE_RATE)) + (PLC_PITCH_MIN(SAMPLE_RATE)))


namespace audio
{


typedef struct
{
    /*! Consecutive erased samples */
    int missing_samples;

    /*! Current offset into pitch period */
    int pitch_offset;

 /*! Pitch estimate */
    int pitch;

 /*! Buffer for a cycle of speech */
    float *pitchbuf;//[PLC_PITCH_MIN];

 /*! History buffer */
    short *history;//[PLC_HISTORY_LEN];

 /*! Current pointer into the history buffer */
    int buf_ptr;
} plc_state_t;


class PcmConcealer
{
public:
 PcmConcealer();

 ~PcmConcealer();

 void Init(int channels, int bit_depth, int sample_rate);

 //Process a block of received audio samples.
 int Receive(short amp[], int frames);

 //Fill-in a block of missing audio samples.
 int Fill(short amp[], int frames);

 void Destroy();

private:

 int amdf_pitch(int min_pitch, int max_pitch, short amp[], int channel_index, int frames);
 void save_history(plc_state_t *s, short *buf, int channel_index, int frames);
 void normalise_history(plc_state_t *s);

 /** Holds the states of each of the channels **/
 std::vector< plc_state_t * > ChannelStates;

 int plc_pitch_min;
 int plc_pitch_max;
 int plc_pitch_overlap_max;
 int correlation_span;
 int plc_history_len;

 int channel_count;
 int sample_rate;

 bool Initialized;
};


}

#endif

PcmConcealer.cpp

/*
 * Code adapted from Steve Underwood of the Asterisk Project. This code inherits
 * the same licensing restrictions as the Asterisk Project.
 */

#include "audio/PcmConcealer.hpp"

/* We do a straight line fade to zero volume when we are filling in for missing data.
   0.0025 per sample reaches zero after 400 samples: 50 ms at 8 kHz, but only 12.5 ms at 32 kHz. */
#define ATTENUATION_INCREMENT       0.0025                              /* Attenuation per sample */


#if !defined(INT16_MAX)
#define INT16_MAX       (32767)
#define INT16_MIN       (-32767-1)
#endif


#ifdef WIN32
inline double rint(double x)
{
     return floor(x + 0.5);
}
#endif

inline short fsaturate(double damp)
{
    if (damp > 32767.0)
        return  INT16_MAX;

    if (damp < -32768.0)
        return  INT16_MIN;

 return (short)rint(damp);
}

namespace audio
{

PcmConcealer::PcmConcealer() : Initialized(false)
{


}

PcmConcealer::~PcmConcealer()
{
 Destroy();
}

void PcmConcealer::Init(int channels, int bit_depth, int sample_rate)
{
 if(Initialized)
  return;

 if(channels <= 0 || bit_depth != 16)
  return;

 Initialized = true;

 channel_count = channels;
 this->sample_rate = sample_rate;

 //////////////

 double min = PLC_PITCH_MIN(sample_rate);
 int imin = (int)min;

 double max = PLC_PITCH_MAX(sample_rate);
 int imax = (int)max;

 plc_pitch_min = imin;
 plc_pitch_max = imax;
 plc_pitch_overlap_max = (plc_pitch_min >> 2);
 correlation_span = CORRELATION_SPAN(sample_rate);
 plc_history_len = correlation_span + plc_pitch_min;

 //////////////

 for(int i = 0; i < channel_count; i ++)
 {
  plc_state_t *t = new plc_state_t;
  memset(t, 0, sizeof(plc_state_t));

  t->pitchbuf = new float[plc_pitch_min];
  t->history = new short[plc_history_len];

  ChannelStates.push_back(t);
 }
}

void PcmConcealer::Destroy()
{
 if(!Initialized)
  return;

 while(ChannelStates.size())
 {
  plc_state_t *s = ChannelStates.at(0);

  if(s)
  {
   /* Both buffers were allocated with new[], so release them with delete[] */
   delete [] s->history;
   delete [] s->pitchbuf;

   memset(s, 0, sizeof(plc_state_t));
   delete s;
  }

  ChannelStates.erase(ChannelStates.begin());
 }

 ChannelStates.clear();

 Initialized = false;
}

//Process a block of received audio samples.
int PcmConcealer::Receive(short amp[], int frames)
{
 if(!Initialized)
  return 0;

 int j = 0;

 for(int k = 0; k < ChannelStates.size(); k++)
 {
  int i;
  int overlap_len;
  int pitch_overlap;

  float old_step;
  float new_step;
  float old_weight;
  float new_weight;
  float gain;

  plc_state_t *s = ChannelStates.at(k);

  if (s->missing_samples)
  {
   /* Although we have a real signal, we need to smooth it to fit well
    with the synthetic signal we used for the previous block */

   /* The start of the real data is overlapped with the next 1/4 cycle
    of the synthetic data. */
   pitch_overlap = s->pitch >> 2;

   if (pitch_overlap > frames)
    pitch_overlap = frames;

   gain = 1.0 - s->missing_samples * ATTENUATION_INCREMENT;

   if (gain < 0.0)
    gain = 0.0;

   new_step = 1.0/pitch_overlap;
   old_step = new_step*gain;
   new_weight = new_step;
   old_weight = (1.0 - new_step)*gain;

   for (i = 0;  i < pitch_overlap;  i++)
   {
    int index = (i * channel_count) + j;

    amp[index] = fsaturate(old_weight * s->pitchbuf[s->pitch_offset] + new_weight * amp[index]);

    if (++s->pitch_offset >= s->pitch)
     s->pitch_offset = 0;

    new_weight += new_step;
    old_weight -= old_step;

    if (old_weight < 0.0)
     old_weight = 0.0;
   }

   s->missing_samples = 0;
  }

  save_history(s, amp, j, frames);

  j++;
 }

    return frames;
}

//Fill-in a block of missing audio samples.
int PcmConcealer::Fill(short amp[], int frames)
{
 if(!Initialized)
  return 0;

 int j =0;

 for(int k = 0; k < ChannelStates.size(); k++)
 {
  short *tmp = new short[plc_pitch_overlap_max];

  int i;
  int pitch_overlap;

  float old_step;
  float new_step;
  float old_weight;
  float new_weight;
  float gain;

  short *orig_amp;
  int orig_len;

  orig_amp = amp;
  orig_len = frames;

  plc_state_t *s = ChannelStates.at(k);

  if (s->missing_samples == 0)
  {
   // As the gap in real speech starts we need to assess the last known pitch,
   //and prepare the synthetic data we will use for fill-in
   normalise_history(s);
   s->pitch = amdf_pitch(plc_pitch_min, plc_pitch_max, s->history + plc_history_len - correlation_span - plc_pitch_min, j, correlation_span);

   // We overlap a 1/4 wavelength
   pitch_overlap = s->pitch >> 2;

   // Cook up a single cycle of pitch, using a single cycle of the real signal with 1/4
   //cycle OLA'ed to make the ends join up nicely
   // The first 3/4 of the cycle is a simple copy
   for (i = 0;  i < s->pitch - pitch_overlap;  i++)
    s->pitchbuf[i] = s->history[plc_history_len - s->pitch + i];

   // The last 1/4 of the cycle is overlapped with the end of the previous cycle
   new_step = 1.0/pitch_overlap;
   new_weight = new_step;

   for (  ;  i < s->pitch;  i++)
   {
    s->pitchbuf[i] = s->history[plc_history_len - s->pitch + i]*(1.0 - new_weight) + s->history[plc_history_len - 2*s->pitch + i]*new_weight;
    new_weight += new_step;
   }

   // We should now be ready to fill in the gap with repeated, decaying cycles
   // of what is in pitchbuf

   // We need to OLA the first 1/4 wavelength of the synthetic data, to smooth
   // it into the previous real data. To avoid the need to introduce a delay
   // in the stream, reverse the last 1/4 wavelength, and OLA with that.

   gain = 1.0;
   new_step = 1.0/pitch_overlap;
   old_step = new_step;
   new_weight = new_step;
   old_weight = 1.0 - new_step;

   for (i = 0;  i < pitch_overlap;  i++)
   {
    int index = (i * channel_count) + j;

    amp[index] = fsaturate(old_weight * s->history[plc_history_len - 1 - i] + new_weight * s->pitchbuf[i]);
    new_weight += new_step;
    old_weight -= old_step;

    if (old_weight < 0.0)
     old_weight = 0.0;
   }

   s->pitch_offset = i;
  }
  else
  {
   gain = 1.0 - s->missing_samples*ATTENUATION_INCREMENT;
   i = 0;
  }

  for (  ;  gain > 0.0  &&  i < frames;  i++)
  {
   int index = (i * channel_count) + j;

   amp[index] = s->pitchbuf[s->pitch_offset]*gain;
   gain -= ATTENUATION_INCREMENT;

   if (++s->pitch_offset >= s->pitch)
    s->pitch_offset = 0;
  }

  for (  ;  i < frames;  i++)
  {
   int index = (i * channel_count) + j;
   amp[index] = 0;
  }

  s->missing_samples += orig_len;
  save_history(s, amp, j, frames);

  delete [] tmp;

  j++;
    }

 return frames;
}

void PcmConcealer::save_history(plc_state_t *s, short *buf, int channel_index, int frames)
{
    if (frames >= plc_history_len)
    {
        /* Just keep the last part of the new data, starting at the beginning of the buffer */
        //memcpy(s->history, buf + len - plc_history_len, sizeof(short)*plc_history_len);

  int frames_to_copy = plc_history_len;

  for(int i = 0; i < frames_to_copy; i ++)
  {
   int index = (channel_count * (i + frames - plc_history_len)) + channel_index;
   s->history[i] = buf[index];
  }

        s->buf_ptr = 0;
        return;
    }

 if (s->buf_ptr + frames > plc_history_len)
    {
        /* Wraps around - must break into two sections */
        //memcpy(s->history + s->buf_ptr, buf, sizeof(short)*(plc_history_len - s->buf_ptr));

  short *hist_ptr = s->history + s->buf_ptr;
  int frames_to_copy = plc_history_len - s->buf_ptr;

  for(int i = 0; i < frames_to_copy; i ++)
  {
   int index = (channel_count * i) + channel_index;
   hist_ptr[i] = buf[index];
  }

        frames -= (plc_history_len - s->buf_ptr);


        //memcpy(s->history, buf + (plc_history_len - s->buf_ptr), sizeof(short)*len);

  frames_to_copy = frames;

  for(int i = 0; i < frames_to_copy; i ++)
  {
   int index = (channel_count * (i + (plc_history_len - s->buf_ptr))) + channel_index;
   s->history[i] = buf[index];
  }

        s->buf_ptr = frames;
        return;
    }

    /* Can use just one section */
    //memcpy(s->history + s->buf_ptr, buf, sizeof(short)*len);

 short *hist_ptr = s->history + s->buf_ptr;
 int frames_to_copy = frames;

 for(int i = 0; i < frames_to_copy; i ++)
 {
  int index = (channel_count * i) + channel_index;
  hist_ptr[i] = buf[index];
 }

 s->buf_ptr += frames;
}

void PcmConcealer::normalise_history(plc_state_t *s)
{
    if (s->buf_ptr == 0)
        return;

    /* Allocate only after the early return above, so the scratch buffer is never leaked */
    short *tmp = new short[plc_history_len];

    memcpy(tmp, s->history, sizeof(short)*s->buf_ptr);
    memcpy(s->history, s->history + s->buf_ptr, sizeof(short)*(plc_history_len - s->buf_ptr));
    memcpy(s->history + plc_history_len - s->buf_ptr, tmp, sizeof(short)*s->buf_ptr);

    s->buf_ptr = 0;

 delete [] tmp;
}

int PcmConcealer::amdf_pitch(int min_pitch, int max_pitch, short amp[], int channel_index, int frames)
{
    int i;
    int j;
    int acc;
    int min_acc;
    int pitch;

    pitch = min_pitch;
    min_acc = INT_MAX;

    for (i = max_pitch;  i <= min_pitch;  i++)
    {
        acc = 0;

  for (j = 0;  j < frames;  j++)
  {
   int index1 = (channel_count * (i+j)) + channel_index;
   int index2 = (channel_count * j) + channel_index;

   //std::cout << "Index 1: " << index1 << ", Index 2: " << index2 << std::endl;

            acc += abs(amp[index1] - amp[index2]);
        }

  if (acc < min_acc)
        {
            min_acc = acc;
            pitch = i;
        }
    }

 std::cout << "Pitch: " << pitch << std::endl;

    return pitch;
}



}

P.S. I must admit that digital audio is not my strong suit...

Best answer

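Fixed the problem. It was in the amdf_pitch function. There were also some minor bugs elsewhere (now fixed). As a result, the code now runs the test bench, inserting blanks with a given probability.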
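The updated header further down declares `amdf_pitch` without a channel index, which suggests the search now runs directly on a per-channel history buffer. As a hedged illustration of the technique only, a minimal AMDF pitch search over a mono, non-interleaved buffer might look like this. `amdf_pitch_mono` and its parameter names are assumptions, not the author's actual fix.

```cpp
#include <climits>
#include <cstdlib>

// Estimate the pitch period (in samples) of a mono buffer with the AMDF.
// amp must hold at least longest_lag + span samples.
int amdf_pitch_mono(int shortest_lag, int longest_lag, const short amp[], int span)
{
    int best_lag = shortest_lag;
    long best_acc = LONG_MAX;

    for (int lag = shortest_lag; lag <= longest_lag; lag++)
    {
        // Sum of absolute differences between the signal and itself shifted by lag.
        long acc = 0;

        for (int j = 0; j < span; j++)
            acc += std::abs(amp[lag + j] - amp[j]);

        // Keep the first lag with the smallest difference.
        if (acc < best_acc)
        {
            best_acc = acc;
            best_lag = lag;
        }
    }

    return best_lag;
}
```

The largest index read is `longest_lag + span - 1`, so a history buffer of `correlation_span + plc_pitch_min` samples is exactly large enough when the search starts at the beginning of the buffer.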

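Using Audacity, I studied the raw PCM streams created by the test bench. When a set of blank frames is encountered, the transition from received data to blank is smoothed as expected; however, when we go from blank back to valid/received data, we get clicks, because the smoothing does not appear to work at this stage. Any suggestions?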
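For reference, the blank-to-received smoothing is an overlap-add: the first quarter pitch period of real audio is cross-faded with the continuing synthetic signal. The sketch below isolates that step under simplifying assumptions (mono data, synthetic samples already lined up with the real ones). `blend_on_resume` is a hypothetical name; the weights follow the same scheme as `Receive()` in the listing.

```cpp
#include <cmath>
#include <vector>

// Blend the first `overlap` real samples with the continuing synthetic signal.
// `gain` is the level the synthetic signal had decayed to (0.0 to 1.0).
void blend_on_resume(std::vector<short> &real, const std::vector<float> &synthetic,
                     int overlap, float gain)
{
    if (overlap > (int)real.size()) overlap = (int)real.size();
    if (overlap > (int)synthetic.size()) overlap = (int)synthetic.size();
    if (overlap <= 0) return;

    float new_step = 1.0f / overlap;
    float old_step = new_step * gain;
    float new_weight = new_step;
    float old_weight = (1.0f - new_step) * gain;

    for (int i = 0; i < overlap; i++)
    {
        // Real audio fades in while the synthetic audio fades out.
        float mixed = old_weight * synthetic[i] + new_weight * real[i];

        if (mixed > 32767.0f) mixed = 32767.0f;
        if (mixed < -32768.0f) mixed = -32768.0f;

        real[i] = (short)std::lround(mixed);

        new_weight += new_step;
        old_weight -= old_step;

        if (old_weight < 0.0f) old_weight = 0.0f;
    }
}
```

Because the blend modifies the received block in place, a caller has to write the block to its output after the blend; writing it first would store the unblended samples and leave the click in the file.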

I have attached the updated code:

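#include <cstdint>
#include <cstdlib>
#include <ctime>
#include <fstream>
#include <iostream>

#include "audio/PcmConcealer.hpp"

int main (int argc, char *argv[])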
{
    std::ifstream fin("C:\\cc32kHz.pcm", std::ios::binary);

    if(!fin.is_open())
    {
        std::cout << "Failed to open input file" << std::endl;
        return 1;
    }

    std::ofstream fout_repaired("C:\\cc32kHz_repaired.pcm", std::ios::binary);

    if(!fout_repaired.is_open())
    {
        std::cout << "Failed to open output repaired file" << std::endl;
        return 1;
    }

    std::ofstream fout_lossy("C:\\cc32kHz_lossy.pcm", std::ios::binary);

    if(!fout_lossy.is_open())
    {
        std::cout << "Failed to open output lossy file" << std::endl;
        return 1;
    }

    audio::PcmConcealer Concealer;
    Concealer.Init(1, 16, 32000);  //1-channel, 16-bit, 32kHz

    //Generate random numbers;
    srand( time(NULL) );

    int value = 0;
    int probability = 3;  //Percent chance that a block is dropped

    char arr[1024];

    //Loop on the read itself; testing eof() before reading runs one extra pass
    while(fin.read(arr, sizeof(arr)) || fin.gcount() > 0)
    {
        //gcount() reports the bytes actually read; tellg() returns -1 once the stream has failed
        int bytes_read = (int)fin.gcount();
        int frames_read = bytes_read / 2;  //1 channel, 16-bit samples

        if(!frames_read)
            break;

        //Generate a random number;
        value = rand() % 100 + 1;

        if(value <= probability)
        {
            char blank[1024] = {0};

            fout_lossy.write(blank, bytes_read);

            //Fill in data;
            Concealer.Fill((int16_t *)blank, frames_read);
            fout_repaired.write(blank, bytes_read);
        }
        else
        {
            //Write data to file;
            fout_lossy.write(arr, bytes_read);

            //Receive() blends the start of the block with the synthetic signal in place,
            //so the repaired stream must be written after it, not before
            Concealer.Receive((int16_t *)arr, frames_read);
            fout_repaired.write(arr, bytes_read);
        }
    }

    fin.close();
    fout_repaired.close();
    fout_lossy.close();

    return 0;
}

PcmConcealer.hpp

/*
 * PcmConcealer.hpp
 * Code adapted from Steve Underwood of the Asterisk Project. This code inherits
 * the same licensing restrictions as the Asterisk Project.
 */



#ifndef __PCMCONCEALER_HPP__
#define __PCMCONCEALER_HPP__

#include <vector>

/**

1. What does it do?
The packet loss concealment module provides a suitable synthetic fill-in signal,
to minimise the audible effect of lost packets in VoIP applications. It is not
tied to any particular codec, and could be used with almost any codec which does not
specify its own procedure for packet loss concealment.

Where a codec specific concealment procedure exists, the algorithm is usually built
around knowledge of the characteristics of the particular codec. It will, therefore,
generally give better results for that particular codec than this generic concealer will.

2. How does it work?
While good packets are being received, the plc_rx() routine keeps a record of the trailing
section of the known speech signal. If a packet is missed, plc_fillin() is called to produce
a synthetic replacement for the real speech signal. The average magnitude difference function
(AMDF) is applied to the last known good signal, to determine its effective pitch.
Based on this, the last pitch period of signal is saved. Essentially, this cycle of speech
will be repeated over and over until the real speech resumes. However, several refinements
are needed to obtain smooth pleasant sounding results.

- The two ends of the stored cycle of speech will not always fit together smoothly. This can
  cause roughness, or even clicks, at the joins between cycles. To soften this, the
  1/4 pitch period of real speech preceding the cycle to be repeated is blended with the last
  1/4 pitch period of the cycle to be repeated, using an overlap-add (OLA) technique (i.e.
  in total, the last 5/4 pitch periods of real speech are used).

- The start of the synthetic speech will not always fit together smoothly with the tail of
  real speech passed on before the erasure was identified. Ideally, we would like to modify
  the last 1/4 pitch period of the real speech, to blend it into the synthetic speech. However,
  it is too late for that. We could have delayed the real speech a little, but that would
  require more buffer manipulation, and hurt the efficiency of the no-lost-packets case
  (which we hope is the dominant case). Instead we use a degenerate form of OLA to modify
  the start of the synthetic data. The last 1/4 pitch period of real speech is time reversed,
  and OLA is used to blend it with the first 1/4 pitch period of synthetic speech. The result
  seems quite acceptable.

- As we progress into the erasure, the chances of the synthetic signal being anything like
  correct steadily fall. Therefore, the volume of the synthesized signal is made to decay
  linearly, such that after 50ms of missing audio it is reduced to silence.

- When real speech resumes, an extra 1/4 pitch period of synthetic speech is blended with the
  start of the real speech. If the erasure is small, this smoothes the transition. If the erasure
  is long, and the synthetic signal has faded to zero, the blending softens the start up of the
  real signal, avoiding a kind of "click" or "pop" effect that might occur with a sudden onset.

3. How do I use it?
Before audio is processed, call plc_init() to create an instance of the packet loss
concealer. For each received audio packet that is acceptable (i.e. not including those being
dropped for being too late) call plc_rx() to record the content of the packet. Note this may
modify the packet a little after a period of packet loss, to blend real and synthetic data smoothly.
When a real packet is not available in time, call plc_fillin() to create a synthetic substitute.
That's it!

*/


/*! Minimum allowed pitch (66 Hz) */
#define PLC_PITCH_MIN(SAMPLE_RATE) ((double)(SAMPLE_RATE) / 66.6)

/*! Maximum allowed pitch (200 Hz) */
#define PLC_PITCH_MAX(SAMPLE_RATE) ((SAMPLE_RATE) / 200)

/*! Maximum pitch OLA window */
//#define PLC_PITCH_OVERLAP_MAX(SAMPLE_RATE) ((PLC_PITCH_MIN(SAMPLE_RATE)) >> 2)

/*! The length over which the AMDF function looks for similarity (20 ms) */
#define CORRELATION_SPAN(SAMPLE_RATE) ((20 * (SAMPLE_RATE)) / 1000)

/*! History buffer length. The buffer must also be at least 1.25 times
    PLC_PITCH_MIN, but that is much smaller than the buffer needs to be for
    the pitch assessment. */
//#define PLC_HISTORY_LEN(SAMPLE_RATE) ((CORRELATION_SPAN(SAMPLE_RATE)) + (PLC_PITCH_MIN(SAMPLE_RATE)))


namespace audio
{


typedef struct
{
    /*! Consecutive erased samples */
    int missing_samples;

    /*! Current offset into pitch period */
    int pitch_offset;

    /*! Pitch estimate */
    int pitch;

    /*! Buffer for a cycle of speech */
    float *pitchbuf;//[PLC_PITCH_MIN];

    /*! History buffer */
    short *history;//[PLC_HISTORY_LEN];

    /*! Current pointer into the history buffer */
    int buf_ptr;
} plc_state_t;


class PcmConcealer
{
public:
    PcmConcealer();

    ~PcmConcealer();

    void Init(int channels, int bit_depth, int sample_rate);

    //Process a block of received audio samples.
    int Receive(short amp[], int frames);

    //Fill-in a block of missing audio samples.
    int Fill(short amp[], int frames);

    void Destroy();

private:

    inline int amdf_pitch(int min_pitch, int max_pitch, short amp[], int frames);
    void save_history(plc_state_t *s, short *buf, int channel_index, int frames);
    void normalise_history(plc_state_t *s);

    /** Holds the states of each of the channels **/
    std::vector< plc_state_t * > ChannelStates;

    int plc_pitch_min;
    int plc_pitch_max;
    int plc_pitch_overlap_max;
    int correlation_span;
    int plc_history_len;

    int channel_count;
    int sample_rate;

    bool Initialized;
};


}

#endif

PcmConcealer.cpp

/*
 * PcmConcealer.cpp
 *
 * Code adapted from Steve Underwood of the Asterisk Project. This code inherits
 * the same licensing restrictions as the Asterisk Project.
 */

#include "audio/PcmConcealer.hpp"

#include <climits>
#include <cmath>
#include <cstdlib>
#include <cstring>
#include <iostream>

/* We do a straight line fade to zero volume when we are filling in for missing data.
   0.0025 per sample reaches zero after 400 samples: 50 ms at 8 kHz, but only 12.5 ms at 32 kHz. */
#define ATTENUATION_INCREMENT       0.0025                              /* Attenuation per sample */


#ifndef INT16_MAX
#define INT16_MAX       (32767)
#endif

#ifndef INT16_MIN
#define INT16_MIN       (-32767-1)
#endif


#ifdef WIN32
inline double rint(double x)
{
     return floor(x + 0.5);
}
#endif

inline short fsaturate(double damp)
{
    if (damp > 32767.0)
        return  INT16_MAX;

    if (damp < -32768.0)
        return  INT16_MIN;

    return (short)rint(damp);
}

namespace audio
{

PcmConcealer::PcmConcealer() : Initialized(false)
{


}

PcmConcealer::~PcmConcealer()
{
    Destroy();
}

void PcmConcealer::Init(int channels, int bit_depth, int sample_rate)
{
    if(Initialized)
        return;

    if(channels <= 0 || bit_depth != 16)
        return;

    Initialized = true;

    channel_count = channels;
    this->sample_rate = sample_rate;

    //////////////

    double min = PLC_PITCH_MIN(sample_rate);
    int imin = (int)min;

    double max = PLC_PITCH_MAX(sample_rate);
    int imax = (int)max;

    plc_pitch_min = imin;
    plc_pitch_max = imax;
    plc_pitch_overlap_max = (plc_pitch_min >> 2);
    correlation_span = CORRELATION_SPAN(sample_rate);
    plc_history_len = correlation_span + plc_pitch_min;

    //////////////

    for(int i = 0; i < channel_count; i ++)
    {
        plc_state_t *t = new plc_state_t;
        memset(t, 0, sizeof(plc_state_t));

        t->pitchbuf = new float[plc_pitch_min];
        t->history = new short[plc_history_len];

        ChannelStates.push_back(t);
    }
}

void PcmConcealer::Destroy()
{
    if(!Initialized)
        return;

    while(ChannelStates.size())
    {
        plc_state_t *s = ChannelStates.at(0);

        if(s)
        {
            /* Both buffers were allocated with new[], so release them with delete[] */
            delete [] s->history;
            delete [] s->pitchbuf;

            memset(s, 0, sizeof(plc_state_t));
            delete s;
        }

        ChannelStates.erase(ChannelStates.begin());
    }

    ChannelStates.clear();

    Initialized = false;
}

//Process a block of received audio samples.
int PcmConcealer::Receive(short amp[], int frames)
{
    if(!Initialized)
        return 0;

    int j = 0;

    for(int k = 0; k < ChannelStates.size(); k++)
    {
        int i;
        int overlap_len;
        int pitch_overlap;

        float old_step;
        float new_step;
        float old_weight;
        float new_weight;
        float gain;

        plc_state_t *s = ChannelStates.at(k);

        if (s->missing_samples)
        {
            /* Although we have a real signal, we need to smooth it to fit well
                with the synthetic signal we used for the previous block */

            /* The start of the real data is overlapped with the next 1/4 cycle
                of the synthetic data. */
            pitch_overlap = s->pitch >> 2;


            if (pitch_overlap > frames)
                pitch_overlap = frames;

            gain = 1.0 - s->missing_samples * ATTENUATION_INCREMENT;

            if (gain < 0.0)
                gain = 0.0;

            new_step = 1.0/pitch_overlap;
            old_step = new_step*gain;
            new_weight = new_step;
            old_weight = (1.0 - new_step)*gain;

            for (i = 0;  i < pitch_overlap;  i++)
            {
                int index = (i * channel_count) + j;

                amp[index] = fsaturate(old_weight * s->pitchbuf[s->pitch_offset] + new_weight * amp[index]);

                if (++s->pitch_offset >= s->pitch)
                    s->pitch_offset = 0;

                new_weight += new_step;
                old_weight -= old_step;

                if (old_weight < 0.0)
                    old_weight = 0.0;
            }

            s->missing_samples = 0;
        }

        save_history(s, amp, j, frames);

        j++;
    }

    return frames;
}

//Fill-in a block of missing audio samples.
int PcmConcealer::Fill(short amp[], int frames)
{
    if(!Initialized)
        return 0;

    int j =0;

    for(int k = 0; k < ChannelStates.size(); k++)
    {
        short *tmp = new short[plc_pitch_overlap_max];

        int i;
        int pitch_overlap;

        float old_step;
        float new_step;
        float old_weight;
        float new_weight;
        float gain;

        short *orig_amp;
        int orig_len;

        orig_amp = amp;
        orig_len = frames;

        plc_state_t *s = ChannelStates.at(k);

        if (s->missing_samples == 0)
        {
            // As the gap in real speech starts we need to assess the last known pitch,
            //and prepare the synthetic data we will use for fill-in
            normalise_history(s);
            s->pitch = amdf_pitch(plc_pitch_min, plc_pitch_max, s->history + (plc_history_len - correlation_span - plc_pitch_min), correlation_span);

            // We overlap a 1/4 wavelength
            pitch_overlap = s->pitch >> 2;

            // Cook up a single cycle of pitch, using a single cycle of the real signal with 1/4
            //cycle OLA'ed to make the ends join up nicely
            // The first 3/4 of the cycle is a simple copy
            for (i = 0;  i < s->pitch - pitch_overlap;  i++)
                s->pitchbuf[i] = s->history[plc_history_len - s->pitch + i];

            // The last 1/4 of the cycle is overlapped with the end of the previous cycle
            new_step = 1.0/pitch_overlap;
            new_weight = new_step;

            for (  ;  i < s->pitch;  i++)
            {
                s->pitchbuf[i] = s->history[plc_history_len - s->pitch + i]*(1.0 - new_weight) + s->history[plc_history_len - 2*s->pitch + i]*new_weight;
                new_weight += new_step;
            }

            // We should now be ready to fill in the gap with repeated, decaying cycles
            //  of what is in pitchbuf

            // We need to OLA the first 1/4 wavelength of the synthetic data, to smooth
            //  it into the previous real data. To avoid the need to introduce a delay
            //  in the stream, reverse the last 1/4 wavelength, and OLA with that.

            gain = 1.0;
            new_step = 1.0/pitch_overlap;
            old_step = new_step;
            new_weight = new_step;
            old_weight = 1.0 - new_step;

            for (i = 0;  (i < pitch_overlap) && (i < frames);  i++)
            {
                int index = (i * channel_count) + j;

                amp[index] = fsaturate(old_weight * s->history[plc_history_len - 1 - i] + new_weight * s->pitchbuf[i]);
                new_weight += new_step;
                old_weight -= old_step;

                if (old_weight < 0.0)
                    old_weight = 0.0;
            }

            s->pitch_offset = i;
        }
        else
        {
            gain = 1.0 - s->missing_samples*ATTENUATION_INCREMENT;
            i = 0;
        }

        for (  ;  gain > 0.0  &&  i < frames;  i++)
        {
            int index = (i * channel_count) + j;

            amp[index] = s->pitchbuf[s->pitch_offset]*gain;
            gain -= ATTENUATION_INCREMENT;

            if (++s->pitch_offset >= s->pitch)
                s->pitch_offset = 0;
        }

        // Silence any frames left once the gain has decayed to zero
        for (  ;  i < frames;  i++)
        {
            int index = (i * channel_count) + j;
            amp[index] = 0;
        }

        s->missing_samples += orig_len;
        save_history(s, amp, j, frames);

        delete [] tmp;

        j++;
    }

    return frames;
}

void PcmConcealer::save_history(plc_state_t *s, short *buf, int channel_index, int frames)
{
    if (frames >= plc_history_len)
    {
        /* Just keep the last part of the new data, starting at the beginning of the buffer */
        //memcpy(s->history, buf + len - plc_history_len, sizeof(short)*plc_history_len);

        int frames_to_copy = plc_history_len;

        for(int i = 0; i < frames_to_copy; i ++)
        {
            int index = (channel_count * (i + frames - plc_history_len)) + channel_index;
            s->history[i] = buf[index];
        }

        s->buf_ptr = 0;
        return;
    }

    if (s->buf_ptr + frames > plc_history_len)
    {
        /* Wraps around - must break into two sections */
        //memcpy(s->history + s->buf_ptr, buf, sizeof(short)*(plc_history_len - s->buf_ptr));

        short *hist_ptr = s->history + s->buf_ptr;
        int frames_to_copy = plc_history_len - s->buf_ptr;

        for(int i = 0; i < frames_to_copy; i ++)
        {
            int index = (channel_count * i) + channel_index;
            hist_ptr[i] = buf[index];
        }

        frames -= (plc_history_len - s->buf_ptr);


        //memcpy(s->history, buf + (plc_history_len - s->buf_ptr), sizeof(short)*len);

        frames_to_copy = frames;

        for(int i = 0; i < frames_to_copy; i ++)
        {
            int index = (channel_count * (i + (plc_history_len - s->buf_ptr))) + channel_index;
            s->history[i] = buf[index];
        }

        s->buf_ptr = frames;
        return;
    }

    /* Can use just one section */
    //memcpy(s->history + s->buf_ptr, buf, sizeof(short)*len);

    short *hist_ptr = s->history + s->buf_ptr;
    int frames_to_copy = frames;

    for(int i = 0; i < frames_to_copy; i ++)
    {
        int index = (channel_count * i) + channel_index;
        hist_ptr[i] = buf[index];
    }

    s->buf_ptr += frames;
}

void PcmConcealer::normalise_history(plc_state_t *s)
{
    // Nothing to rotate; return before allocating so the buffer is not leaked
    if (s->buf_ptr == 0)
        return;

    short *tmp = new short[plc_history_len];

    memcpy(tmp, s->history, sizeof(short)*s->buf_ptr);
    memcpy(s->history, s->history + s->buf_ptr, sizeof(short)*(plc_history_len - s->buf_ptr));
    memcpy(s->history + plc_history_len - s->buf_ptr, tmp, sizeof(short)*s->buf_ptr);

    s->buf_ptr = 0;

    delete [] tmp;
}

int PcmConcealer::amdf_pitch(int min_pitch, int max_pitch, short amp[], int frames)
{
    int i;
    int j;
    int acc;
    int min_acc;
    int pitch;

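    pitch = min_pitch;
    min_acc = INT_MAX;

    // Note: min_pitch and max_pitch appear to be lags in samples, so max_pitch
    // (the shortest lag) is the smaller number. The loop reads amp[i + j], so
    // amp must hold at least min_pitch + frames non-interleaved samples.
    for (i = max_pitch;  i <= min_pitch;  i++)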
    {
        acc = 0;

        /*for (j = 0;  j < frames;  j++)
        {
            int index1 = (channel_count * (i+j)) + channel_index;
            int index2 = (channel_count * j) + channel_index;

            //std::cout << "Index 1: " << index1 << ", Index 2: " << index2 << std::endl;

            acc += abs(amp[index1] - amp[index2]);
        }*/

        for (j = 0;  j < frames;  j++)
            acc += abs(amp[i + j] - amp[j]);

        if (acc < min_acc)
        {
            min_acc = acc;
            pitch = i;
        }
    }

    //std::cout << "Pitch: " << pitch << std::endl;

    return pitch;
}



}
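
The question notes that the crash disappears when amdf_pitch is commented out. The inner loop reads amp[i + j] for every lag i up to min_pitch, so one thing worth checking is whether the buffer passed in really holds min_pitch + frames samples. Below is a minimal sketch of the same AMDF search with that bound checked up front. It assumes a mono (non-interleaved) buffer held in a std::vector; the function name amdf_pitch_checked and its parameter names are hypothetical and not part of the original class.

```cpp
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper, not part of PcmConcealer: an AMDF pitch search that
// refuses to read past the end of the buffer.
// shortest_lag and longest_lag are lags in samples (the listing above calls
// them max_pitch and min_pitch, since a higher pitch means a shorter lag).
// Returns the lag with the smallest accumulated difference, or -1 if the
// arguments are invalid or the buffer is too short for the search.
int amdf_pitch_checked(const std::vector<short>& amp,
                       int shortest_lag, int longest_lag, int window)
{
    if (shortest_lag <= 0 || longest_lag < shortest_lag || window <= 0)
        return -1;

    // The largest index touched is (longest_lag + window - 1).
    const std::size_t needed = static_cast<std::size_t>(longest_lag)
                             + static_cast<std::size_t>(window);
    if (amp.size() < needed)
        return -1;

    int best_lag = longest_lag;
    long long best_acc = LLONG_MAX;

    for (int lag = shortest_lag; lag <= longest_lag; ++lag)
    {
        // Sum of absolute differences between the signal and itself
        // shifted by `lag` samples.
        long long acc = 0;
        for (int j = 0; j < window; ++j)
            acc += std::abs(static_cast<int>(amp[lag + j]) - static_cast<int>(amp[j]));

        // Strict comparison keeps the shortest lag when several tie.
        if (acc < best_acc)
        {
            best_acc = acc;
            best_lag = lag;
        }
    }

    return best_lag;
}
```

A return value of -1 from a call site would suggest the history buffer is smaller than the search needs, which is the kind of out-of-bounds read that could produce EXC_BAD_ACCESS.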

A similar question about concealing packet loss in a PCM stream in C++ can be found on Stack Overflow: https://stackoverflow.com/questions/2965061/
