audio - FFmpeg video-to-audio conversion results in a different duration

Reposted by: 行者123 | Updated: 2023-12-02 23:01:44

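I am trying to convert an MP4 file to a mono WAV file sampled at 16,000 Hz.

When I run the following command, the duration goes from 00:09:59.99 (MP4) to 00:09:57.64 (WAV). With the original, longer version of the file it goes from 00:48:37.46 (MP4) to 00:48:23.38 (WAV).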

ffmpeg -i <FILE_NAME>.mp4 -ac 1 -ar 16000 <FILE_NAME>.wav

I also tried the command below. The result was worse: the duration went from 00:09:59.99 (MP4) to 00:12:56.29 (AAC).

ffmpeg -i <FILE_NAME>.mp4 -vn -acodec copy <FILE_NAME>.aac

Log attached:

Report written to "ffmpeg-20200610-093115.log"
Command line:
ffmpeg -i short.mp4 -ac 1 -ar 16000 short.wav -report
ffmpeg version 4.1.1 Copyright (c) 2000-2019 the FFmpeg developers
  built with Apple LLVM version 10.0.0 (clang-1000.11.45.5)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.1.1 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags='-I/Library/Java/JavaVirtualMachines/openjdk-11.0.2.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/openjdk-11.0.2.jdk/Contents/Home/include/darwin' --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libmp3lame --enable-libopus --enable-librubberband --enable-libsnappy --enable-libtesseract --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-videotoolbox --disable-libjack --disable-indev=jack --enable-libaom --enable-libsoxr
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Splitting the commandline.
Reading option '-i' ... matched as input url with argument 'short.mp4'.
Reading option '-ac' ... matched as option 'ac' (set number of audio channels) with argument '1'.
Reading option '-ar' ... matched as option 'ar' (set audio sampling rate (in Hz)) with argument '16000'.
Reading option 'short.wav' ... matched as output url.
Reading option '-report' ... matched as option 'report' (generate a report) with argument '1'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option report (generate a report) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url short.mp4.
Successfully parsed a group of options.
Opening an input file: short.mp4.
[NULL @ 0x7f98a3008200] Opening 'short.mp4' for reading
[file @ 0x7f98a2904440] Setting default whitelist 'file,crypto'
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] Format mov,mp4,m4a,3gp,3g2,mj2 probed with size=2048 and score=100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] ISO: File Type Major Brand: mp42
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] Unknown dref type 0x206c7275 size 12
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] Processing st: 0, edit list 0 - media time: 0, duration: 7679872
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] Unknown dref type 0x206c7275 size 12
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] Processing st: 1, edit list 0 - media time: 1024, duration: 26459559
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] drop a frame at curr_cts: 0 @ 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] Before avformat_find_stream_info() pos: 11213917 bytes read:318782 seeks:1 nb_streams:2
[h264 @ 0x7f98a3808800] nal_unit_type: 7(SPS), nal_ref_idc: 3
[h264 @ 0x7f98a3808800] nal_unit_type: 8(PPS), nal_ref_idc: 3
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] demuxer injecting skip 1024 / discard 0
[aac @ 0x7f98a1008c00] skip 1024 / discard 0 samples due to side data
[h264 @ 0x7f98a3808800] nal_unit_type: 6(SEI), nal_ref_idc: 0
[h264 @ 0x7f98a3808800] nal_unit_type: 5(IDR), nal_ref_idc: 3
[h264 @ 0x7f98a3808800] Format yuv420p chosen by get_format().
[h264 @ 0x7f98a3808800] Reinit context to 640x368, pix_fmt: yuv420p
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] All info found
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f98a3008200] After avformat_find_stream_info() pos: 21961 bytes read:351550 seeks:2 frames:46
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'short.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 1
    compatible_brands: isommp41mp42
    creation_time   : 2020-06-10T16:12:17.000000Z
  Duration: 00:09:59.99, start: 0.000000, bitrate: 149 kb/s
    Stream #0:0(eng), 1, 1/12800: Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 640x360 [SAR 1:1 DAR 16:9], 47 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      creation_time   : 2020-06-10T16:12:17.000000Z
      handler_name    : Core Media Video
    Stream #0:1(eng), 45, 1/44100: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 98 kb/s (default)
    Metadata:
      creation_time   : 2020-06-10T16:12:17.000000Z
      handler_name    : Core Media Audio
Successfully opened the file.
Parsing a group of options: output url short.wav.
Applying option ac (set number of audio channels) with argument 1.
Applying option ar (set audio sampling rate (in Hz)) with argument 16000.
Successfully parsed a group of options.
Opening an output file: short.wav.
[file @ 0x7f98a0c1db40] Setting default whitelist 'file,crypto'
Successfully opened the file.
Stream mapping:
  Stream #0:1 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
[aac @ 0x7f98a100de00] skip 1024 / discard 0 samples due to side data
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
detected 12 logical cores
[graph_0_in_0_1 @ 0x7f98a0e2c4c0] Setting 'time_base' to value '1/44100'
[graph_0_in_0_1 @ 0x7f98a0e2c4c0] Setting 'sample_rate' to value '44100'
[graph_0_in_0_1 @ 0x7f98a0e2c4c0] Setting 'sample_fmt' to value 'fltp'
[graph_0_in_0_1 @ 0x7f98a0e2c4c0] Setting 'channel_layout' to value '0x4'
[graph_0_in_0_1 @ 0x7f98a0e2c4c0] tb:1/44100 samplefmt:fltp samplerate:44100 chlayout:0x4
[format_out_0_0 @ 0x7f98a0e2cb80] Setting 'sample_fmts' to value 's16'
[format_out_0_0 @ 0x7f98a0e2cb80] Setting 'sample_rates' to value '16000'
[format_out_0_0 @ 0x7f98a0e2cb80] Setting 'channel_layouts' to value '0x4'
[format_out_0_0 @ 0x7f98a0e2cb80] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 0x7f98a0c16ac0] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
[auto_resampler_0 @ 0x7f98a0e2d540] [SWR @ 0x7f98a28e1000] Using fltp internally between filters
[auto_resampler_0 @ 0x7f98a0e2d540] ch:1 chl:mono fmt:fltp r:44100Hz -> ch:1 chl:mono fmt:s16 r:16000Hz
Output #0, wav, to 'short.wav':
  Metadata:
    major_brand     : mp42
    minor_version   : 1
    compatible_brands: isommp41mp42
    ISFT            : Lavf58.20.100
    Stream #0:0(eng), 0, 1/16000: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s (default)
    Metadata:
      creation_time   : 2020-06-10T16:12:17.000000Z
      handler_name    : Core Media Audio
      encoder         : Lavc58.35.100 pcm_s16le
size=   17152kB time=00:09:16.63 bitrate= 252.4kbits/s speed=1.11e+03x    
[out_0_0 @ 0x7f98a0e2c700] EOF on sink link out_0_0:default.
No more output streams to write to, finishing.
size=   18676kB time=00:09:59.99 bitrate= 255.0kbits/s speed=1.11e+03x    
video:0kB audio:18676kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000408%
Input file #0 (short.mp4):
  Input stream #0:0 (video): 1 packets read (3689 bytes); 
  Input stream #0:1 (audio): 25739 packets read (7375414 bytes); 25738 frames decoded (26355712 samples); 
  Total: 25740 packets (7379103 bytes) demuxed
Output file #0 (short.wav):
  Output stream #0:0 (audio): 25739 frames encoded (9562163 samples); 25739 packets muxed (19124326 bytes); 
  Total: 25739 packets (19124326 bytes) muxed
25738 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x7f98a0c1dc40] Statistics: 4 seeks, 76 writeouts
[AVIOContext @ 0x7f98a29045c0] Statistics: 10902846 bytes read, 29 seeks

Best answer

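Containers such as MP4 and MKV store packets with timestamps. One by-product is that silence in an audio track can be represented simply by adjusting the timestamps of the packets on either side of it, with no audio data stored for the gap. Containers such as WAV or a raw AAC bitstream have no timestamps, so any "silence" encoded this way is lost.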
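As a toy illustration of the point above, the sketch below compares the duration implied by packet timestamps with the duration left once the timestamps are thrown away. The three packets and the one-second gap are invented for the example and are not taken from the log; `Packet`, `timestamped_duration` and `raw_duration` are hypothetical names.

```python
from dataclasses import dataclass

SAMPLE_RATE = 44100  # Hz, same rate as the input audio in the question
FRAME = 1024         # samples per AAC-LC frame


@dataclass
class Packet:
    pts: int         # presentation timestamp, in samples
    nb_samples: int  # samples of audio actually carried


def timestamped_duration(packets):
    """Duration as a timestamped container sees it: end of the last packet."""
    last = max(packets, key=lambda p: p.pts)
    return (last.pts + last.nb_samples) / SAMPLE_RATE


def raw_duration(packets):
    """Duration after timestamps are dropped: samples played back to back."""
    return sum(p.nb_samples for p in packets) / SAMPLE_RATE


# Made-up data: two contiguous packets, then a 1 s timestamp gap (implied
# silence), then a third packet.
packets = [
    Packet(pts=0, nb_samples=FRAME),
    Packet(pts=FRAME, nb_samples=FRAME),
    Packet(pts=2 * FRAME + SAMPLE_RATE, nb_samples=FRAME),
]

lost = timestamped_duration(packets) - raw_duration(packets)
print(f"silence lost without timestamps: {lost:.3f} s")
```

The gap carries no samples, so it exists only in the timestamps; a format that stores only samples cannot keep it.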

Your input audio is sampled at 44100 Hz. In this line near the end of the log,

Input stream #0:1 (audio): 25739 packets read (7375414 bytes); 25738 frames decoded (26355712 samples);

you can see that the input stream has 26355712 samples. At 44100 Hz, that is about 597.635 seconds, or 00:09:57.64. That is exactly what you get in the WAV output.
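The arithmetic can be checked with a few lines of Python. All sample counts are copied from the log above. One assumption is labeled in the code: that the edit-list duration printed on the `Processing st: 1` line is expressed in the audio stream's 1/44100 time base. `fmt` is a hypothetical helper that rounds to centiseconds, which is how the durations in the question appear to be displayed.

```python
def fmt(seconds):
    """Format seconds as HH:MM:SS.cc, rounded to the nearest centisecond."""
    cs = round(seconds * 100)
    hours, cs = divmod(cs, 360000)
    minutes, cs = divmod(cs, 6000)
    secs, cs = divmod(cs, 100)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{cs:02d}"


IN_RATE = 44100                # Hz, input audio stream
OUT_RATE = 16000               # Hz, requested with -ar 16000

decoded_samples = 26355712     # "25738 frames decoded (26355712 samples)"
encoded_samples = 9562163      # "25739 frames encoded (9562163 samples)"
# Assumption: this value is in units of 1/44100 s.
edit_list_duration = 26459559  # "Processing st: 1, edit list 0 ... duration"

audio_seconds = decoded_samples / IN_RATE
wav_seconds = encoded_samples / OUT_RATE
container_seconds = edit_list_duration / IN_RATE

print("decoded audio :", fmt(audio_seconds))
print("WAV output    :", fmt(wav_seconds))
print("MP4 container :", fmt(container_seconds))
print(f"missing       : {container_seconds - audio_seconds:.2f} s")
```

Under that assumption, the roughly 2.35 s difference is time the MP4 accounts for through timestamps but for which no audio samples were decoded.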

To insert silence so that the source duration is preserved, use

ffmpeg -i <FILE_NAME>.mp4 -af aresample=async=1 -ac 1 -ar 16000 <FILE_NAME>.wav
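To confirm the effect on a given file, one option is to run the conversion and then compare the durations reported by `ffprobe`. The sketch below is a minimal wrapper, assuming `ffmpeg` and `ffprobe` are on `PATH`; the function names are hypothetical, and the file names in the usage note are the ones from the log.

```python
import subprocess


def build_convert_cmd(src, dst, rate=16000, channels=1):
    """Build the ffmpeg command from the answer as an argument list."""
    return [
        "ffmpeg", "-i", src,
        "-af", "aresample=async=1",  # pad timestamp gaps with silence
        "-ac", str(channels),
        "-ar", str(rate),
        dst,
    ]


def probe_duration(path):
    """Return the container duration in seconds as reported by ffprobe."""
    result = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1",
         path],
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())


def convert_and_compare(src, dst):
    """Convert src to dst, then return (source_duration, output_duration)."""
    subprocess.run(build_convert_cmd(src, dst), check=True)
    return probe_duration(src), probe_duration(dst)
```

Calling `convert_and_compare("short.mp4", "short.wav")` should return two values that are close to each other if the missing time was indeed caused by timestamp gaps. The command does not pass `-y`, so ffmpeg will ask before overwriting an existing output file.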

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/62308695/
