
python - I am getting the peak frequency from a WAV file, but it does not work for a recorded 2-channel WAV

Reposted. Author: 行者123. Updated: 2023-11-28 18:30:38

I am extracting the peak frequency from a WAV file.

My code for getting the peak frequency from a WAV file is:

import wave
import struct
import numpy as np
import contextlib

if __name__ == '__main__':
    fname = "test.wav"
    frate = 0
    data_size = 0
    with contextlib.closing(wave.open(fname, 'r')) as f:
        frate = f.getframerate()
        data_size = f.getnframes()
    wav_file = wave.open(fname, 'r')
    data = wav_file.readframes(data_size)
    # Total number of 16-bit samples across all channels
    data_size = data_size * wav_file.getnchannels()
    print(wav_file.getparams())
    wav_file.close()
    data = struct.unpack('{n}h'.format(n=data_size), data)
    data = np.array(data)

    w = np.fft.fft(data)
    freqs = np.fft.fftfreq(len(w))
    print(freqs.min(), freqs.max())

    # Find the peak in the coefficients
    idx = np.argmax(np.abs(w))
    freq = freqs[idx]
    freq_in_hertz = abs(freq * frate)
    print(freq_in_hertz)

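I recorded a WAV file at a 48000 Hz sample rate, 16 bits wide, with 2 channels. That file contains a 1000 Hz sine tone, but the script outputs only 500 Hz. I don't know where I went wrong. The script works fine, however, for a single-channel file and for the generated 48000 Hz, 16-bit, 2-channel WAV file.

I generated the WAV file with the following script: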

import math
import wave
import struct

if __name__ == '__main__':
    # http://stackoverflow.com/questions/3637350/how-to-write-stereo-wav-files-in-python
    # http://www.sonicspot.com/guide/wavefiles.html
    freq = 1000
    data_size = 454656 * 2
    fname = "test.wav"
    frate = 48000.0
    amp = 64000.0
    nchannels = 2
    sampwidth = 2
    framerate = int(frate)
    nframes = data_size
    comptype = "NONE"
    compname = "not compressed"
    data = [math.sin(2 * math.pi * freq * (x / frate))
            for x in range(data_size)]
    wav_file = wave.open(fname, 'w')
    wav_file.setparams(
        (nchannels, sampwidth, framerate, nframes, comptype, compname))
    for v in data:
        wav_file.writeframes(struct.pack('h', int(v * amp / 2)))
    wav_file.close()

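I don't know what I did wrong. I have uploaded my WAV files. The script-generated file, script_gen.wav, has a 48000 Hz sample rate, 2 channels, 16 bit. The recorded files are a 2-channel WAV with 48000 sample rate, 2 channels, 16 bit, and a 1-channel WAV (I am not allowed to post more links here, so I will post it in the comments) with 48000 sample rate, 1 channel, 16 bit.

I checked the peak frequency of all of these files in Audacity, and it shows 1000 Hz in every case.

But when I try my script, I get the correct output for the 1-channel WAV, while the 2-channel WAV fails.

Update: for the 2-channel file, the output is half of the actual peak frequency.

I feel like I am missing something. Can anyone help me solve this?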

Best answer

Why so complicated? Consider the following:

#!/usr/bin/env python3
import numpy as np
from numpy import fft
import scipy.io.wavfile as wf
import matplotlib.pyplot as plt

sr = 44100 # sample rate
len_sig = 2 # length of resulting signal in seconds

f = 1000 # frequency in Hz

# set your time axis (endpoint=False keeps the sample spacing at exactly 1/sr)
t = np.linspace(0, len_sig, sr*len_sig, endpoint=False)

# set your signal
mono_data = np.sin(2*np.pi*t*f)

# write single channel .wav file
wf.write('mono.wav', sr, mono_data)

# write two-channel .wav file
stereo_data = np.vstack((mono_data, mono_data)).T
wf.write('stereo.wav', sr, stereo_data)
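
For readers who would rather stay with the standard-library `wave` and `struct` modules used in the question, here is a minimal sketch of writing a two-channel file. The key point is that each frame of a stereo file holds one sample per channel, so both samples are packed together. The question's generator packs a single `'h'` value per `writeframes` call while declaring 2 channels, so consecutive samples of one sine end up alternating between left and right. The function name `write_stereo_tone` and its default arguments are illustrative assumptions, not part of the original posts.

```python
import math
import struct
import wave


def write_stereo_tone(path, freq=1000.0, rate=48000, seconds=1.0, amp=16000):
    """Write a 16-bit stereo WAV with the same sine tone in both channels.

    Hypothetical helper for illustration; returns the number of frames written.
    """
    n_frames = int(rate * seconds)
    frames = bytearray()
    for i in range(n_frames):
        sample = int(amp * math.sin(2 * math.pi * freq * i / rate))
        # '<hh' packs one little-endian 16-bit sample per channel: left, right
        frames += struct.pack("<hh", sample, sample)

    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)      # stereo
        wav.setsampwidth(2)      # 2 bytes = 16 bits per sample
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))
    return n_frames
```

Calling `write_stereo_tone("stereo_tone.wav")` should produce one second of a 1000 Hz tone at 48000 Hz, with 4 bytes per frame (2 channels times 2 bytes).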

Now test it by loading and analyzing the data:

# Load data
mono_sr, mono_data = wf.read('mono.wav')
stereo_sr, stereo_data = wf.read('stereo.wav')

# analyze the data
X_mono = fft.fft(mono_data) / len(mono_data) # remember to normalize your amplitudes

# Remember that half of the energy of the signal is distributed over the
# positive frequencies and the other half over the negative frequencies.
#
# Commonly you want to see a magnitude spectrum. That means we ignore the phases. Hence, we
# simply multiply the spectrum by 2 and consider ONLY the first half of it.
freq_nq = len(X_mono) // 2
X_mono = abs(X_mono[:freq_nq]) * 2
freqs_mono = fft.fftfreq(len(mono_data), 1/mono_sr)[:freq_nq]

# in order to analyze a stereo signal, first mix both channels down to one (here by averaging)
sum_stereo = stereo_data.sum(axis=1) / 2

# and now the same way as above
freq_nq = len(sum_stereo) // 2
X_stereo= abs(fft.fft(sum_stereo))[:freq_nq] / len(stereo_data) * 2
freqs_stereo = fft.fftfreq(len(stereo_data), 1/stereo_sr)[:freq_nq]
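
The code above works because `wf.read` returns stereo data as a 2-D array with one column per channel. The question's script instead unpacks the raw bytes into one flat array, where the samples are interleaved as L0, R0, L1, R1, and so on. Running an FFT over that flat array while still scaling by the file's sample rate is a likely explanation for the halved result: the stream is twice as long as either channel, so the tone lands at half the expected frequency. The sketch below reproduces this with synthetic data only; the helper name `peak_hz` is an assumption made for illustration.

```python
import numpy as np

sr = 48000          # sample rate in Hz, as in the question
f = 1000            # tone frequency in Hz
n = sr              # one second of audio, so FFT bins are 1 Hz apart

t = np.arange(n) / sr
channel = np.sin(2 * np.pi * f * t)

# A real stereo file stores its frames interleaved: L0, R0, L1, R1, ...
interleaved = np.column_stack((channel, channel)).ravel()


def peak_hz(x, rate):
    """Frequency in Hz of the largest bin in the one-sided spectrum."""
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1 / rate)
    return freqs[np.argmax(spectrum)]


# Treating the interleaved stream as a single channel at `sr` halves the result
wrong = peak_hz(interleaved, sr)

# De-interleave first: one row per frame, one column per channel
frames = interleaved.reshape(-1, 2)
right = peak_hz(frames[:, 0], sr)

print(wrong, right)  # about 500 Hz and about 1000 Hz
```

The same reasoning suggests why the question's generated file appeared to work: its single sine was spread across both channels sample by sample, so reading it back as one flat stream happens to restore the original tone.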

Peak selection:

freqs_mono[np.argmax(X_mono)]        # == 1000.0
freqs_stereo[np.argmax(X_stereo)] # == 1000.0
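
If SciPy is not available, the same peak picking can be sketched with the standard `wave` module plus NumPy, de-interleaving the frames before the FFT. This is a minimal sketch that assumes 16-bit PCM input; the function name `peak_frequencies` is hypothetical, and it reports one peak per channel rather than mixing the channels down.

```python
import wave

import numpy as np


def peak_frequencies(path):
    """Return the peak frequency in Hz for each channel of a 16-bit PCM WAV."""
    with wave.open(path, "rb") as wav:
        n_channels = wav.getnchannels()
        rate = wav.getframerate()
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        raw = wav.readframes(wav.getnframes())

    # '<i2' = little-endian signed 16-bit; one row per frame, one column per channel
    samples = np.frombuffer(raw, dtype="<i2").reshape(-1, n_channels)

    # Frequency axis is based on the number of frames, not the number of samples
    freqs = np.fft.rfftfreq(samples.shape[0], d=1 / rate)
    peaks = []
    for ch in range(n_channels):
        spectrum = np.abs(np.fft.rfft(samples[:, ch]))
        spectrum[0] = 0  # ignore any DC offset when looking for the peak
        peaks.append(float(freqs[np.argmax(spectrum)]))
    return peaks
```

For a stereo recording of a 1000 Hz tone, `peak_frequencies("test.wav")` would be expected to return two values close to 1000, one per channel.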

Plot the results:

fig, (ax1, ax2) = plt.subplots(2, figsize=(10,5), sharex=True, sharey=True)
ax1.set_title('mono signal')
ax1.set_xlim([0, 2000])
ax1.plot(freqs_mono, X_mono, 'b', lw=2)

ax2.set_title('stereo signal')
ax2.plot(freqs_stereo, X_stereo, 'g', lw=2)
ax2.set_xlim([0, 2000])
plt.tight_layout()
plt.show()

[Figure: mono and stereo peaks]

This question (python - I am getting the peak frequency from a WAV file, but it does not work for a recorded 2-channel WAV) comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/37813059/
