multithreading - Perl system call causes hang when using threads

Reposted by: 行者123  Updated: 2023-12-02 00:06:13

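I'm new to Perl, so please excuse my ignorance. (I'm using Windows 7.)

I borrowed echicken's threads example script and wanted to use it as the basis for a script that makes a number of system calls, but I've run into a problem that is beyond my understanding. To illustrate it, the example code below runs a simple ping command.

$nb_process is the number of threads allowed to run at the same time.
$nb_compute is the number of times we want to run the subroutine (that is, the total number of times the ping command is issued).

When I set $nb_compute and $nb_process to the same value, it works fine.

However, when I reduce $nb_process (to limit the number of threads running at any one time), it seems to lock up once the number of threads defined in $nb_process has been started.

If I remove the system call (the ping command), it works fine.

I see the same behavior with other system calls (not just ping).

Can anyone help? The script is below.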

#!/opt/local/bin/perl -w  
 use threads;  
 use strict;  
 use warnings;  

 my @a = ();  
 my @b = ();  


 sub sleeping_sub ( $ $ $ ); 

 print "Starting main program\n";  

 my $nb_process = 3;  
 my $nb_compute = 6;  
 my $i=0;  
 my @running = ();  
 my @Threads;  
 while (scalar @Threads < $nb_compute) {  

     @running = threads->list(threads::running);  
     print "LOOP $i\n";  
     print "  - BEGIN LOOP >> NB running threads = ".(scalar @running)."\n";  

     if (scalar @running < $nb_process) {  
         my $thread = threads->new( sub { sleeping_sub($i, \@a, \@b) });  
         push (@Threads, $thread);  
         my $tid = $thread->tid;  
         print "  - starting thread $tid\n";  
     }  
     @running = threads->list(threads::running);  
     print "  - AFTER STARTING >> NB running Threads = ".(scalar @running)."\n";  
     foreach my $thr (@Threads) {  
         if ($thr->is_running()) {  
             my $tid = $thr->tid;  
             print "  - Thread $tid running\n";  
         }  
         elsif ($thr->is_joinable()) {  
             my $tid = $thr->tid;  
             $thr->join;  
             print "  - Results for thread $tid:\n";  
             print "  - Thread $tid has been joined\n";  
         }  
     }  

     @running = threads->list(threads::running);  
     print "  - END LOOP >> NB Threads = ".(scalar @running)."\n";  
     $i++;  
 }  

 print "\nJOINING pending threads\n";  
 while (scalar @running != 0) {  
    foreach my $thr (@Threads) {  
         $thr->join if ($thr->is_joinable());  
     }  
     @running = threads->list(threads::running);  
}  
 print "NB started threads = ".(scalar @Threads)."\n";  
 print "End of main program\n";  


 sub sleeping_sub ( $ $ $ ) { 
    my @res2 = `ping 136.13.221.34`; 
    print "\n@res2";
    sleep(3);  
 }

Best answer

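The main problem with your program is that you have a busy loop that tests whether a thread can be joined. This is wasteful. Furthermore, you could reduce the number of global variables to make your code easier to understand.

Other eyebrow-raisers:

Never use prototypes unless you know exactly what they mean.
sleeping_sub does not use any of its arguments.
You use the threads::running list a lot without considering whether this is actually correct.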
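As a small illustration of why prototypes can surprise you, here is a minimal sketch. The subs with_proto and without_proto are hypothetical names used only for this example; the point is that a ($$) prototype imposes scalar context on each argument at the call site, so an array is passed as its element count.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: the ($$) prototype forces
# scalar context on each argument at the call site.
sub with_proto($$) { return "@_" }

# Same body without a prototype: arguments are flattened into one list.
sub without_proto { return "@_" }

my @items = (10, 20, 30);

my $proto_result = with_proto(@items, 'x');     # @items becomes its count
my $plain_result = without_proto(@items, 'x');  # @items is flattened

print "with prototype:    $proto_result\n";     # prints "3 x"
print "without prototype: $plain_result\n";     # prints "10 20 30 x"
```

In the question's script the prototype happens to be harmless because only scalars and references are passed, but it adds nothing there either.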

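It seems you want to run only N workers at once, but launch M workers in total. Here is a fairly elegant way to implement this. The main idea is that we have a queue between the threads, where a thread that has just finished can enqueue its thread ID. That thread will then be joined. To limit the number of threads, we use a semaphore: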

use threads; use strict; use warnings;
use feature 'say';  # "say" works like "print", but appends newline.
use Thread::Queue;
use Thread::Semaphore;

my @pieces_of_work = 1..6;
my $num_threads = 3;
my $finished_threads = Thread::Queue->new;
my $semaphore = Thread::Semaphore->new($num_threads);

for my $task (@pieces_of_work) {
  $semaphore->down;  # wait for permission to launch a thread

  say "Starting a new thread...";

  # create a new thread in scalar context
  threads->new({ scalar => 1 }, sub {
    my $result = worker($task);                # run actual task
    $finished_threads->enqueue(threads->tid);  # report as joinable "in a second"
    $semaphore->up;                            # allow another thread to be launched
    return $result;
  });

  # maybe join some threads
  while (defined( my $thr_id = $finished_threads->dequeue_nb )) {
    join_thread($thr_id);
  }
}

# wait for all threads to be finished, by "down"ing the semaphore:
$semaphore->down for 1..$num_threads;
# end the finished thread ID queue:
$finished_threads->enqueue(undef);

# join any threads that are left:
while (defined( my $thr_id = $finished_threads->dequeue )) {
  join_thread($thr_id);
}

where join_thread and worker are defined as

sub worker {
  my ($task) = @_;
  sleep rand 2; # sleep random amount of time
  return $task + rand; # return some number
}

sub join_thread {
  my ($tid) = @_;
  my $thr = threads->object($tid);
  my $result = $thr->join;
  say "Thread #$tid returned $result";
}

We could get the output:

Starting a new thread...
Starting a new thread...
Starting a new thread...
Starting a new thread...
Thread #3 returned 3.05652608754778
Starting a new thread...
Thread #1 returned 1.64777186731541
Thread #2 returned 2.18426146087901
Starting a new thread...
Thread #4 returned 4.59414651998983
Thread #6 returned 6.99852684265667
Thread #5 returned 5.2316971836585

(The order and the return values are not deterministic.)

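Using a queue makes it easy to tell which thread has finished. Semaphores make it easier to protect resources or to limit the number of things running in parallel.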

The main benefit of this pattern over a busy loop is that it uses far less CPU. This also shortens the overall execution time.
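To check for yourself that the semaphore really caps concurrency, something like the following sketch may help. The shared counters $active and $peak, the limit of 3, and the 0.1 second pause standing in for real work are all assumptions made for this illustration; they are not part of the code above.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

my $limit     = 3;                          # assumed concurrency cap
my $semaphore = Thread::Semaphore->new($limit);

my $active :shared = 0;   # workers currently between down and up
my $peak   :shared = 0;   # highest value $active ever reached

my @threads;
for my $task (1 .. 6) {
    $semaphore->down;     # blocks without spinning until a slot is free
    push @threads, threads->create(sub {
        {
            lock($active);                  # also guards $peak
            $active++;
            $peak = $active if $active > $peak;
        }
        select(undef, undef, undef, 0.1);   # stand-in for real work
        {
            lock($active);
            $active--;
        }
        $semaphore->up;                     # hand the slot back
        return $task;
    });
}

# join in launch order; each thread returns its task number
my @done = map { $_->join } @threads;
print "peak concurrency: $peak (limit $limit)\n";
```

Because the main thread sleeps inside down instead of polling threads->list, it costs nothing while the workers are busy.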

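While this is a very big improvement, we can do better! Spawning threads is expensive: it is basically a fork() without all the copy-on-write optimizations available on Unix systems. The whole interpreter is copied, including all the variables, all the state, and so on that you have already created.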
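A quick way to observe this copying is the following sketch, where @data and $seen_in_child are names made up for the example. A variable that is not explicitly shared is cloned into the new thread, so changes made there never reach the parent.

```perl
use strict;
use warnings;
use threads;

my @data = (1, 2, 3);   # ordinary (unshared) variable, cloned per thread

# The child modifies its own private copy and reports its size.
my $seen_in_child = threads->create({ scalar => 1 }, sub {
    push @data, 4;
    return scalar @data;
})->join;

print "child saw $seen_in_child elements, ",
      "parent still has ", scalar(@data), "\n";
```

The more data the program has built up before a thread is created, the more there is to clone.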

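Therefore, threads should be used sparingly and spawned as early as possible. I have already introduced you to queues, which can pass values between threads. We can extend this so that a few worker threads constantly pull work from an input queue and return results via an output queue. The difficulty now is to have the last thread to exit finish the output queue.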

use threads; use strict; use warnings;
use feature 'say';
use Thread::Queue;
use Thread::Semaphore;

# define I/O queues
my $input_q  = Thread::Queue->new;
my $output_q = Thread::Queue->new;

# spawn the workers
my $num_threads = 3;
my $all_finished_s = Thread::Semaphore->new(1 - $num_threads); # a negative start value!
my @workers;
for (1 .. $num_threads) {
  push @workers, threads->new( { scalar => 1 }, sub {
    while (defined( my $task = $input_q->dequeue )) {
      my $result = worker($task);
      $output_q->enqueue([$task, $result]);
    }
    # we get here when the input queue is exhausted.
    $all_finished_s->up;
    # end the output queue if we are the last thread (the semaphore is > 0).
    if ($all_finished_s->down_nb) {
      $output_q->enqueue(undef);
    }
  });
}

# fill the input queue with tasks
my @pieces_of_work = 1 .. 6;
$input_q->enqueue($_) for @pieces_of_work;

# finish the input queue
$input_q->enqueue(undef) for 1 .. $num_threads;

# do something with the data
while (defined( my $result = $output_q->dequeue )) {
  my ($task, $answer) = @$result;
  say "Task $task produced $answer";
}

# join the workers:
$_->join for @workers;

With worker defined as before, we get:

Task 1 produced 1.15207098293783
Task 4 produced 4.31247785766295
Task 5 produced 5.96967474718984
Task 6 produced 6.2695013168678
Task 2 produced 2.02545636412421
Task 3 produced 3.22281619053999

(The three threads are only joined after all the output has been printed, so printing the join results would be boring.)

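The second solution gets a lot simpler when we detach the threads: the main thread won't exit before all workers have finished, because it is listening on the output queue, which is ended by the last thread to finish.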
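A detached variant might look roughly like the sketch below. It is not a drop-in replacement: the square sub is a placeholder task introduced only for this example, and the results are collected into a hash (%answer_for, also an assumed name) so they can be printed in a stable order.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Thread::Semaphore;

my $input_q  = Thread::Queue->new;
my $output_q = Thread::Queue->new;

my $num_threads    = 3;
my $all_finished_s = Thread::Semaphore->new(1 - $num_threads);

# Placeholder task for illustration only (an assumption, not the real worker).
sub square { my ($n) = @_; return $n * $n }

for (1 .. $num_threads) {
    # detach right away: no thread objects to keep, nothing to join later
    threads->create(sub {
        while (defined(my $task = $input_q->dequeue)) {
            $output_q->enqueue([$task, square($task)]);
        }
        $all_finished_s->up;
        # down_nb succeeds exactly once, and only after every worker
        # has called up, so the output queue is ended a single time
        $output_q->enqueue(undef) if $all_finished_s->down_nb;
    })->detach;
}

# fill the input queue, then send one end marker per worker
$input_q->enqueue($_)    for 1 .. 6;
$input_q->enqueue(undef) for 1 .. $num_threads;

# the main thread blocks here until the last worker ends the output queue
my %answer_for;
while (defined(my $result = $output_q->dequeue)) {
    my ($task, $answer) = @$result;
    $answer_for{$task} = $answer;
}

print "Task $_ produced $answer_for{$_}\n"
    for sort { $a <=> $b } keys %answer_for;
```

Compared with the joined version, the @workers array and the final join loop disappear; the end-of-output marker is the only synchronization the main thread needs.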

For more on "multithreading - Perl system call causes hang when using threads", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/18268054/