perl - How to speed up pattern recognition in Perl

Repost. Author: 行者123. Updated: 2023-12-01 11:36:56

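This is my current program. It takes a .fasta file (a file containing genetic code), builds a hash table from the data, and prints it. However, it is very slow: it takes each 5-character substring of the sequence and compares it against every other 5-character substring in the file.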

use strict;
use warnings;
use Data::Dumper;

my $total = $#ARGV + 1;
my $row;
my $compare;
my %hash;
my $unique = 0;
open( my $f1, '<:encoding(UTF-8)', $ARGV[0] ) or die "Could not open file '$ARGV[0]' $!\n";

my $discard = <$f1>;
while ( $row = <$f1> ) {
    chomp $row;
    $compare .= $row;
}
my $size = length($compare);
close $f1;
for ( my $i = 0; $i < $size - 6; $i++ ) {
    my $vs = ( substr( $compare, $i, 5 ) );
    for ( my $j = 0; $j < $size - 6; $j++ ) {
        foreach my $value ( substr( $compare, $j, 5 ) ) {
            if ( $value eq $vs ) {
                if ( exists $hash{$value} ) {
                    $hash{$value} += 1;
                } else {
                    $hash{$value} = 1;
                }
            }
        }
    }
}
foreach my $val ( values %hash ) {
    if ( $val == 1 ) {
        $unique++;
    }
}

my $OUTFILE;
open $OUTFILE, ">output.txt" or die "Error opening output.txt: $!\n";
print {$OUTFILE} "Number of unique keys: " . $unique . "\n";
print {$OUTFILE} Dumper( \%hash );
close $OUTFILE;

Thanks in advance for your help!

Best answer

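It's not clear from the description what this script is supposed to do, but if you are looking for sets of matching 5-character sequences, you don't actually need to do any string matching: you can run through the whole sequence once and keep a tally of how many times each 5-letter sequence occurs.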

use strict;
use warnings;
use Data::Dumper;

my $str = '';    # store the sequence here
my %hash;

# read the whole file into one string
open( my $in, '<:encoding(UTF-8)', $ARGV[0] )
    or die "Could not open file '$ARGV[0]' $!\n";
my $discard = <$in>;    # skip the FASTA header line, as the question's code does
while ( my $line = <$in> ) {
    chomp $line;
    $str .= $line;
}
close $in;

# not sure if you were deliberately omitting the last two letters of sequence
# this looks at all the sequence
my $l_size = length($str) - 4;
for ( my $i = 0; $i < $l_size; $i++ ) {
    $hash{ substr( $str, $i, 5 ) }++;
}

# grep in a scalar context will count the values.
my $unique = grep { $_ == 1 } values %hash;

# lexical filehandle and three-argument open for the output file
open( my $out, '>', 'output.txt' ) or die "Error opening output.txt: $!\n";
print {$out} "Number of unique keys: " . $unique . "\n";
print {$out} Dumper( \%hash );
close $out;
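
To make the single-pass idea easier to check by hand, here is a minimal sketch of the same tally wrapped in a function and run on a short in-memory string instead of a file. The helper name `count_kmers`, its `$k` parameter, and the sample sequence are illustrative assumptions, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): tally every
# window of length $k in $seq with a single pass over the string.
sub count_kmers {
    my ( $seq, $k ) = @_;
    my %count;
    my $last_start = length($seq) - $k;    # last index where a full window fits
    # If $seq is shorter than $k, the range below is empty and nothing is counted.
    for my $i ( 0 .. $last_start ) {
        $count{ substr( $seq, $i, $k ) }++;
    }
    return \%count;
}

# Made-up 10-letter sequence, so there are 6 windows of length 5.
my $counts = count_kmers( 'ACGTACGTAC', 5 );

# grep in scalar context returns the number of matching values.
my $unique = grep { $_ == 1 } values %$counts;

printf "%s => %d\n", $_, $counts->{$_} for sort keys %$counts;
print "Number of unique keys: $unique\n";
```

For this sample, ACGTA and CGTAC each occur twice while GTACG and TACGT occur once, so the unique count is 2. The loop does one hash update per window, whereas the nested loops in the question compare every window against every other window, which is why the running time grows so quickly with sequence length there.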

Regarding perl - how to speed up pattern recognition in Perl, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/25999603/
