
php - Parsing an .srt file

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 05:55:55

1
00:00:00,074 --> 00:00:02,564
Previously on Breaking Bad...

2
00:00:02,663 --> 00:00:04,393
Words...

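I need to parse an srt file with PHP and print all the subs in the file using variables.

I couldn't find the right regular expression. I need to get the id, the time and the subtitle as variables, and the output must not contain array() wrappers or the like; it has to look the same as in the original file.

I mean I have to print it like this: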

$number <br> (e.g. 1)
$time <br> (e.g. 00:00:00,074 --> 00:00:02,564)
$subtitle <br> (e.g. Previously on Breaking Bad...)

By the way, I have this code, but it doesn't match the lines. It needs to be changed, but how?

$srt_file = file('test.srt',FILE_IGNORE_NEW_LINES);
$regex = "/^(\d)+ ([\d]+:[\d]+:[\d]+,[\d]+) --> ([\d]+:[\d]+:[\d]+,[\d]+) (\w.+)/";

foreach($srt_file as $srt){

preg_match($regex,$srt,$srt_lines);

print_r($srt_lines);
echo '<br />';

}

Best answer

Here is a short state machine that parses the SRT file line by line:

define('SRT_STATE_SUBNUMBER', 0);
define('SRT_STATE_TIME', 1);
define('SRT_STATE_TEXT', 2);
define('SRT_STATE_BLANK', 3);

$lines = file('test.srt');

$subs = array();
$state = SRT_STATE_SUBNUMBER;
$subNum = 0;
$subText = '';
$subTime = '';

foreach ($lines as $line) {
    switch ($state) {
        case SRT_STATE_SUBNUMBER:
            // ignore stray blank lines between subs
            if (trim($line) == '') {
                break;
            }
            $subNum = trim($line);
            $state = SRT_STATE_TIME;
            break;

        case SRT_STATE_TIME:
            $subTime = trim($line);
            $state = SRT_STATE_TEXT;
            break;

        case SRT_STATE_TEXT:
            if (trim($line) == '') {
                // a blank line ends the current sub: store it and start over
                $sub = new stdClass;
                $sub->number = $subNum;
                list($sub->startTime, $sub->stopTime) = explode(' --> ', $subTime);
                $sub->text = $subText;
                $subText = '';
                $state = SRT_STATE_SUBNUMBER;

                $subs[] = $sub;
            } else {
                // file() keeps the line ending, so multi-line text stays intact
                $subText .= $line;
            }
            break;
    }
}

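if ($state == SRT_STATE_TEXT) {
    // if the file was missing the trailing blank line, we'll be in this
    // state here and the last sub has not been stored yet. Build a new
    // object for it; reusing $sub would overwrite the previous sub.
    $sub = new stdClass;
    $sub->number = $subNum;
    list($sub->startTime, $sub->stopTime) = explode(' --> ', $subTime);
    $sub->text = $subText;
    $subs[] = $sub;
}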

print_r($subs);

Result:

Array
(
[0] => stdClass Object
(
[number] => 1
[stopTime] => 00:00:24,400
[startTime] => 00:00:20,000
[text] => Altocumulus clouds occur between six thousand
)

[1] => stdClass Object
(
[number] => 2
[stopTime] => 00:00:27,800
[startTime] => 00:00:24,600
[text] => and twenty thousand feet above ground level.
)

)
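The startTime and stopTime fields in this result are plain strings. If they need to be compared or subtracted, one option is to convert them to milliseconds first. The following is only a sketch: srtTimeToMs is a hypothetical helper name that does not appear in the original answer, and it assumes timestamps in the hours:minutes:seconds,milliseconds form shown above.

```php
<?php
// Hypothetical helper (not part of the original answer): converts an SRT
// timestamp such as "00:00:24,400" to whole milliseconds.
function srtTimeToMs(string $time): int
{
    // Expected shape: hours:minutes:seconds,milliseconds
    if (!preg_match('/^(\d+):(\d{2}):(\d{2}),(\d{3})$/', trim($time), $m)) {
        throw new InvalidArgumentException("Not an SRT timestamp: $time");
    }

    // hours -> minutes -> seconds -> milliseconds
    return (((int)$m[1] * 60 + (int)$m[2]) * 60 + (int)$m[3]) * 1000 + (int)$m[4];
}

// Duration of the first sub from the result above
echo srtTimeToMs('00:00:24,400') - srtTimeToMs('00:00:20,000'), "\n"; // 4400
```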

You can iterate over the $subs array or access individual subs by array offset:

echo $subs[0]->number . ' says ' . $subs[0]->text . "\n";

Or display all the subs by looping over them and printing each one:

foreach($subs as $sub) {
echo $sub->number . ' begins at ' . $sub->startTime .
' and ends at ' . $sub->stopTime . '. The text is: <br /><pre>' .
$sub->text . "</pre><br />\n";
}
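The question asked for a regular expression and for output laid out like the original file. The attempt in the question fails mainly because file() hands the pattern one line at a time, while the pattern expects the number, time and text together on a single line. A regex-based alternative could look like the sketch below. It assumes the whole file fits in one string (for example via file_get_contents) and that subs are separated by blank lines. parseSrt and the array keys are hypothetical names, not part of the original answer.

```php
<?php
// Hypothetical helper: parses SRT content held in a single string and
// returns one array per sub with 'number', 'time' and 'text' keys.
function parseSrt(string $content): array
{
    // Normalise line endings and drop a UTF-8 byte order mark if present.
    $content = str_replace(array("\r\n", "\r"), "\n", $content);
    $content = preg_replace('/^\xEF\xBB\xBF/', '', $content);

    // Number line, time line (anything after the end time is ignored),
    // then optional text that may span several lines.
    $regex = '/^(\d+)\n(\d+:\d{2}:\d{2},\d{3}) --> (\d+:\d{2}:\d{2},\d{3})[^\n]*(?:\n(.*))?$/s';

    $subs = array();
    // Subs are separated by one or more blank lines.
    foreach (preg_split('/\n{2,}/', trim($content)) as $block) {
        if (preg_match($regex, $block, $m)) {
            $subs[] = array(
                'number' => $m[1],
                'time'   => $m[2] . ' --> ' . $m[3],
                'text'   => $m[4] ?? '', // a sub may have no text at all
            );
        }
    }
    return $subs;
}

// Sample input taken from the question; a real script would use
// file_get_contents('test.srt') instead.
$srt = "1\n00:00:00,074 --> 00:00:02,564\nPreviously on Breaking Bad...\n\n"
     . "2\n00:00:02,663 --> 00:00:04,393\nWords...\n";

foreach (parseSrt($srt) as $sub) {
    echo $sub['number'], "<br>\n";
    echo $sub['time'], "<br>\n";
    // Escape the text for HTML and keep its internal line breaks.
    echo nl2br(htmlspecialchars($sub['text'])), "<br>\n";
}
```

Splitting on blank lines first keeps the pattern short, at the cost of treating any blank line inside a subtitle's text as the end of that sub. Blocks that do not match the pattern are skipped silently.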

Further reading: SubRip Text File Format

Regarding php - parsing an .srt file, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/11659118/
