linux - Linux kernel add_timer reliability at resolution of one jiffy?

Reposted by: 行者123 · Updated: 2023-12-02 07:14:27

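The code given below is a simple Linux kernel module (driver) that calls a function repeatedly, 10 times, using add_timer at a resolution of 1 jiffy (that is, the timer is scheduled to fire at jiffies + 1). Using the bash script rerun.sh, I then obtain the timestamps from the printout in syslog and visualize them with gnuplot.

In most cases, I get a syslog output like this: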

[ 7103.055787] Init testjiffy: 0 ; HZ: 250 ; 1/HZ (ms): 4
[ 7103.056044]  testjiffy_timer_function: runcount 1 
[ 7103.060045]  testjiffy_timer_function: runcount 2 
[ 7103.064052]  testjiffy_timer_function: runcount 3 
[ 7103.068050]  testjiffy_timer_function: runcount 4 
[ 7103.072053]  testjiffy_timer_function: runcount 5 
[ 7103.076036]  testjiffy_timer_function: runcount 6 
[ 7103.080044]  testjiffy_timer_function: runcount 7 
[ 7103.084044]  testjiffy_timer_function: runcount 8 
[ 7103.088060]  testjiffy_timer_function: runcount 9 
[ 7103.092059]  testjiffy_timer_function: runcount 10 
[ 7104.095429] Exit testjiffy

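... which results in a time sequence and delta histogram like these:

[plot: time sequence and delta histogram]

This is, essentially, the quality of timing that I'd expect from the code.

However, every once in a while, I get a capture like this: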

[ 7121.377507] Init testjiffy: 0 ; HZ: 250 ; 1/HZ (ms): 4
[ 7121.380049]  testjiffy_timer_function: runcount 1 
[ 7121.384062]  testjiffy_timer_function: runcount 2 
[ 7121.392053]  testjiffy_timer_function: runcount 3 
[ 7121.396055]  testjiffy_timer_function: runcount 4 
[ 7121.400068]  testjiffy_timer_function: runcount 5 
[ 7121.404085]  testjiffy_timer_function: runcount 6 
[ 7121.408084]  testjiffy_timer_function: runcount 7 
[ 7121.412072]  testjiffy_timer_function: runcount 8 
[ 7121.416083]  testjiffy_timer_function: runcount 9 
[ 7121.420066]  testjiffy_timer_function: runcount 10 
[ 7122.417325] Exit testjiffy

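... which renders like this:

[plot: time sequence and delta histogram]

... and I'm like: "WHOOOOOAAAAAA ... wait a second..." - isn't there a pulse dropped from the sequence? Meaning that add_timer missed a slot, and then fired the function in the next 4 ms slot?

The interesting thing is that, while running these tests, I have nothing started except a terminal, a web browser and a text editor, so I cannot see anything running that might hog the OS/kernel; and thus I really cannot see why the kernel would make such a big miss (of an entire jiffy period). When I read about Linux kernel timing, e.g. "The simplest and least accurate of all timers ... is the timer API", I read "least accurate" as: "don't expect exactly a 4 ms period" (as per this example). And I don't: I'm fine with the variance shown in the (first) histogram. But I don't expect a whole period to be missed!

So my question(s) are:

Is this expected behavior from add_timer at this resolution (that a period can occasionally be missed)?

If so, is there a way to "force" add_timer to fire the function at each 4 ms slot, as specified by a jiffy on this platform?

Is it possible that I get a "wrong" timestamp - e.g. the timestamp reflecting when the actual "print" to syslog happened, rather than when the function actually fired?

Note that I'm not looking for a period resolution below what corresponds to a jiffy (in this case, 4 ms); nor am I looking to decrease the delta variance when the code works properly. So as I see it, I don't have "high resolution timer" demands, nor "hard real-time" demands - I just want add_timer to fire reliably. Would that be possible on this platform, without resorting to special "real-time" configurations of the kernel?

Bonus question: in rerun.sh below, you'll note two sleeps marked MUSTHAVE; if either of them is left out or commented out, the OS/kernel freezes and requires a hard reboot. And I cannot see why - is it really possible that running rmmod after insmod from bash is so fast that it conflicts with the normal process of module loading/unloading?

Platform info: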

$ cat /proc/cpuinfo | grep "processor\|model name\|MHz\|cores"
processor   : 0       # (same for 1)
model name  : Intel(R) Atom(TM) CPU N450   @ 1.66GHz
cpu MHz             : 1000.000
cpu cores   : 1
$ echo $(cat /etc/issue ; uname -a)
Ubuntu 11.04 \n \l Linux mypc 2.6.38-16-generic #67-Ubuntu SMP Thu Sep 6 18:00:43 UTC 2012 i686 i686 i386 GNU/Linux
$ echo $(lsb_release -a 2>/dev/null | tr '\n' ' ')
Distributor ID: Ubuntu Description: Ubuntu 11.04 Release: 11.04 Codename: natty

Code:

$ cd /tmp/testjiffy
$ ls
Makefile  rerun.sh  testjiffy.c

Makefile :

obj-m += testjiffy.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

testjiffy.c :

/*
 *  [http://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html#AEN189 The Linux Kernel Module Programming Guide]
 */


#include <linux/module.h>   /* Needed by all modules */
#include <linux/kernel.h>   /* Needed for KERN_INFO */
#include <linux/init.h>     /* Needed for the macros */
#include <linux/jiffies.h>
#include <linux/timer.h>    /* struct timer_list, init_timer, add_timer */
#include <linux/time.h>
#define MAXRUNS 10

static volatile int runcount = 0;
static struct timer_list my_timer;

static void testjiffy_timer_function(unsigned long data)
{
  int tdelay = 100;

  runcount++;
  if (runcount == 5) {
    while (tdelay > 0) { tdelay--; } // small delay
  }

  printk(KERN_INFO
    " %s: runcount %d \n",
    __func__, runcount);

  if (runcount < MAXRUNS) {
    my_timer.expires = jiffies + 1;
    add_timer(&my_timer);
  }
}


static int __init testjiffy_init(void)
{
    printk(KERN_INFO
    "Init testjiffy: %d ; HZ: %d ; 1/HZ (ms): %d\n",
               runcount,      HZ,        1000/HZ);

  init_timer(&my_timer);

    my_timer.function = testjiffy_timer_function;
    //my_timer.data = (unsigned long) runcount;

  my_timer.expires = jiffies + 1;
    add_timer(&my_timer);
    return 0;
}

static void __exit testjiffy_exit(void)
{
    printk(KERN_INFO "Exit testjiffy\n");
}

module_init(testjiffy_init);
module_exit(testjiffy_exit);

MODULE_LICENSE("GPL");

rerun.sh :

#!/usr/bin/env bash

set -x
make clean
make
# blank syslog first
sudo bash -c 'echo "0" > /var/log/syslog'
sleep 1   # MUSTHAVE 01!
# reload kernel module/driver
sudo insmod ./testjiffy.ko
sleep 1   # MUSTHAVE 02!
sudo rmmod testjiffy
set +x

# copy & process syslog

max=0;
for ix in _testjiffy_*.syslog; do
  aa=${ix#_testjiffy_};
  ab=${aa%.syslog} ;
  case $ab in
    *[!0-9]*) ab=0;;          # reset if non-digit obtained; else
    *) ab=$(echo $ab | bc);;  # remove leading zeroes (else octal)
  esac
  if (( $ab > $max )) ; then
    max=$((ab));
  fi;
done;
newm=$( printf "%05d" $(($max+1)) );
PLPROC='chomp $_;
if (!$p) {$p=0;}; if (!$f) {$f=$_;} else {
  $a=$_-$f; $d=$a-$p;
  print "$a $d\n" ; $p=$a;
};'

set -x
grep "testjiffy" /var/log/syslog | cut -d' ' -f7- > _testjiffy_${newm}.syslog
grep "testjiffy_timer_function" _testjiffy_${newm}.syslog \
  | sed 's/\[\(.*\)\].*/\1/' \
  | perl -ne "$PLPROC" \
  > _testjiffy_${newm}.dat
set +x

cat > _testjiffy_${newm}.gp <<EOF
set terminal pngcairo font 'Arial,10' size 900,500
set output '_testjiffy_${newm}.png'
set style line 1 linetype 1 linewidth 3 pointtype 3 linecolor rgb "red"
set multiplot layout 1,2 title "_testjiffy_${newm}.syslog"
set xtics rotate by -45
set title "Time positions"
set yrange [0:1.5]
set offsets graph 50e-3, 1e-3, 0, 0
plot '_testjiffy_${newm}.dat' using 1:(1.0):xtic(gprintf("%.3se%S",\$1)) notitle with points ls 1, '_testjiffy_${newm}.dat' using 1:(1.0) with impulses ls 1
binwidth=0.05e-3
set boxwidth binwidth
bin(x,width)=width*floor(x/width) + width/2.0
set title "Delta diff histogram"
set style fill solid 0.5
set autoscale xy
set offsets graph 0.1e-3, 0.1e-3, 0.1, 0.1
plot '_testjiffy_${newm}.dat' using (bin(\$2,binwidth)):(1.0) smooth freq with boxes ls 1
unset multiplot
EOF
set -x; gnuplot _testjiffy_${newm}.gp ; set +x

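EDIT: Motivated by this comment by @granquet, I tried to obtain scheduler statistics from /proc/schedstat and /proc/sched_debug, by running dd through call_usermodehelper. Note that most of the time it "skips" (that is, a file is missing for the 7th, or 6th, or Xth run of the function); but I managed to obtain two complete runs, and posted them at https://gist.github.com/anonymous/5709699 (as I noticed gist may be preferred to pastebin on SO), since the output is rather massive. The *_11* files log a proper run; the *_17* files log a run with a "drop".

Note that I have also switched to mod_timer_pinned in the module, and it doesn't help much (the gist logs were obtained with the module using this function). These are the changes in testjiffy.c: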

#include <linux/kmod.h> // usermode-helper API
...
char fcmd[] = "of=/tmp/testjiffy_sched00";
char *dd1argv[] = { "/bin/dd", "if=/proc/schedstat", "oflag=append", "conv=notrunc", &fcmd[0], NULL };
char *dd2argv[] = { "/bin/dd", "if=/proc/sched_debug", "oflag=append", "conv=notrunc", &fcmd[0], NULL };
static char *envp[] = {
      "HOME=/",
      "TERM=linux",
      "PATH=/sbin:/bin:/usr/sbin:/usr/bin", NULL };

static void testjiffy_timer_function(unsigned long data)
{
  int tdelay = 100;
  unsigned long tjnow;

  runcount++;
  if (runcount == 5) {
    while (tdelay > 0) { tdelay--; } // small delay
  }

  printk(KERN_INFO
    " %s: runcount %d \n",
    __func__, runcount);

  if (runcount < MAXRUNS) {
    mod_timer_pinned(&my_timer, jiffies + 1);
    tjnow = jiffies;
    printk(KERN_INFO
      " testjiffy expires: %lu - jiffies %lu => %lu / %lu\n",
      my_timer.expires, tjnow, my_timer.expires-tjnow, jiffies);
    sprintf(fcmd, "of=/tmp/testjiffy_sched%02d", runcount);
    call_usermodehelper( dd1argv[0], dd1argv, envp, UMH_NO_WAIT );
    call_usermodehelper( dd2argv[0], dd2argv, envp, UMH_NO_WAIT );
  }
}

... and in rerun.sh:

...
set +x

for ix in /tmp/testjiffy_sched*; do
  echo $ix | tee -a _testjiffy_${newm}.sched
  cat $ix >> _testjiffy_${newm}.sched
done
set -x ; sudo rm /tmp/testjiffy_sched* ; set +x

cat > _testjiffy_${newm}.gp <<EOF
...

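I'll use this post for a verbose reply.

@CL.: many thanks for the answer. Good to have it confirmed that it is "possible that your timer function gets called at a later jiffy"; by logging the jiffies, I too realized that the timer function gets called at a later time - and other than that, there isn't anything "wrong" per se.

Good to know about the timestamps. I wondered whether it is possible that the timer function fires at the right time, but the kernel preempts the kernel logging service (I believe it's klogd), so that I get a delayed timestamp. However, I'm trying to create a "looped" (or rather, periodic) timer function to write to hardware, and I first noticed this "drop" by realizing that the PC doesn't write data at certain intervals on the USB bus; given that the timestamps confirm that behavior, that is probably not the problem here (I guess).

I have modified the timer function so that it fires relative to the scheduled time of the last timer (my_timer.expires) - again via mod_timer_pinned instead of add_timer: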

static void testjiffy_timer_function(unsigned long data)
{
  int tdelay = 100;
  unsigned long tjlast;
  unsigned long tjnow;

  runcount++;
  if (runcount == 5) {
    while (tdelay > 0) { tdelay--; } // small delay
  }

  printk(KERN_INFO
    " %s: runcount %d \n",
    __func__, runcount);

  if (runcount < MAXRUNS) {
    tjlast = my_timer.expires;
    mod_timer_pinned(&my_timer, tjlast + 1);
    tjnow = jiffies;
    printk(KERN_INFO
      " testjiffy expires: %lu - jiffies %lu => %lu / %lu last: %lu\n",
      my_timer.expires, tjnow, my_timer.expires-tjnow, jiffies, tjlast);
  }
}

... and for the first few tries it works impeccably. Eventually, however, I get this:

[13389.775508] Init testjiffy: 0 ; HZ: 250 ; 1/HZ (ms): 4
[13389.776051]  testjiffy_timer_function: runcount 1 
[13389.776063]  testjiffy expires: 3272445 - jiffies 3272444 => 1 / 3272444 last: 3272444
[13389.780053]  testjiffy_timer_function: runcount 2 
[13389.780068]  testjiffy expires: 3272446 - jiffies 3272445 => 1 / 3272445 last: 3272445
[13389.788054]  testjiffy_timer_function: runcount 3 
[13389.788073]  testjiffy expires: 3272447 - jiffies 3272447 => 0 / 3272447 last: 3272446
[13389.788090]  testjiffy_timer_function: runcount 4 
[13389.788096]  testjiffy expires: 3272448 - jiffies 3272447 => 1 / 3272447 last: 3272447
[13389.792070]  testjiffy_timer_function: runcount 5 
[13389.792091]  testjiffy expires: 3272449 - jiffies 3272448 => 1 / 3272448 last: 3272448
[13389.796044]  testjiffy_timer_function: runcount 6 
[13389.796062]  testjiffy expires: 3272450 - jiffies 3272449 => 1 / 3272449 last: 3272449
[13389.800053]  testjiffy_timer_function: runcount 7 
[13389.800063]  testjiffy expires: 3272451 - jiffies 3272450 => 1 / 3272450 last: 3272450
[13389.804056]  testjiffy_timer_function: runcount 8 
[13389.804072]  testjiffy expires: 3272452 - jiffies 3272451 => 1 / 3272451 last: 3272451
[13389.808045]  testjiffy_timer_function: runcount 9 
[13389.808057]  testjiffy expires: 3272453 - jiffies 3272452 => 1 / 3272452 last: 3272452
[13389.812054]  testjiffy_timer_function: runcount 10 
[13390.815415] Exit testjiffy

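... which renders like so:

[plot: time sequence and delta histogram]

... so, basically, I have a delay/"drop" at the +8 ms slot (which should be @ 3272446 jiffies), and then two function runs at the +12 ms slot (which would be @ 3272447 jiffies); you can even see that the label on the plot is "more bolded" because of it. This is better, in the sense that the "drop" sequence is now synchronous to a proper, non-drop sequence (which is as you said: "to avoid that one late timer function shifts all following timer calls"). However, I still miss a beat; and since I have to write bytes to hardware at each beat in order to keep a sustained, constant transfer rate, this unfortunately doesn't help me much.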

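As for the other suggestion, to "use ten timers": because of my ultimate goal (writing to hardware using a periodic lo-res timer function), I thought at first that it doesn't apply. But if nothing else is possible (other than doing some special real-time kernel preparations), then I'll certainly try a scheme with 10 (or N) timers (maybe stored in an array) that are fired periodically, one after another.

EDIT: just adding the leftover relevant comments: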

USB transfers are either scheduled in advance (isochronous) or have no timing guarantees (asynchronous). If your device doesn't use isochronous transfers, it's badly misdesigned. – CL. Jun 5 at 10:47

Thanks for the comment, @CL. - "... scheduled in advance (isochronous)..." cleared a confusion I had. I'm (eventually) targeting an FT232, which only has BULK mode - and as long as the bytes per timer hit is low, I can actually "cheat" my way through in "streaming" data with add_timer; however, when I transfer an amount of bytes close to consuming the bandwidth, then these "misfires" start getting noticeable as drops. So I was interested in testing the limits of that, for which I need a reliably repetitive "timer" function - is there anything else I could try to have a reliable "timer"? – sdaau Jun 5 at 12:27

@sdaau Bulk transfers are not suitable for streaming. You cannot fix shortcomings in the hardware protocol by using another kind of software timer. – CL. Jun 5 at 13:50

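... and as my response to @CL.: I'm aware that I won't be able to fix the shortcomings; I was more interested in observing them. Say, if a kernel function makes a periodic USB write, I could observe the signals on a scope/analyzer, and hopefully see in what sense bulk mode is unsuitable. But first, I'd have to trust that the function can (at least somewhat) reliably repeat at a periodic rate (i.e. "generate" a clock/tick), and I wasn't aware, until now, that I cannot really trust add_timer at jiffy resolution (as it is capable of skipping a whole period relatively easily). However, it seems that moving to Linux's high-resolution timers (hrtimer) does give me a reliable periodic function in this sense, so I guess that solves my problem (posted in my answer below).

Best answer

Many thanks for all the comments and answers; they all pointed to things that must be taken into account. But given that I'm somewhat of a forever noob, I still needed to do some more reading before gaining some understanding (I hope a correct one). Also, I couldn't really find anything specific to periodically "ticking" functions, so I'll post a more verbose answer here.

In brief: for a reliable periodic Linux kernel function at a resolution of a jiffy, do not use add_timer (<linux/timer.h>), as it may "drop" an entire period; use high-resolution timers (<linux/hrtimer.h>) instead. In more detail: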

Is it possible that I get a "wrong" timestamp - ...?

@CL.: The timestamp in the log is the time when that string was printed to the log.

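So, maybe possible - but as it turns out, that's not the problem here:

Is this expected behavior from add_timer at this resolution (that a period can occasionally be missed)?

I guess so; as it turns out - yes:

If so, is there a way to "force" add_timer to fire the function at each 4ms slot, as specified by a jiffy on this platform?

... and (I guess again), as it turns out - no.

Now, the reasons for this are somewhat subtle, and I hope that if I didn't get them right, someone will correct me. First of all, the first misconception I had was that "a clock is just a clock" (in the sense of: even if it is implemented as computer code), but that is not quite correct. The kernel basically has to "queue" an "event" somewhere each time you use something like add_timer; and this request may come from anything, really: from any (and all) sorts of drivers, or possibly even from userspace.

The problem is that this "queuing" has a cost. In addition to having to handle (the equivalent of) traversing an array and inserting (and removing) items in it, the kernel also has to handle timer delays that span several orders of magnitude (from, say, milliseconds to maybe tens of seconds). There is also the fact that some drivers (apparently, those for network protocols) queue a lot of timer events that are usually cancelled before they run, while other types may require completely different behavior (as in my case: in a periodic function, you expect that most of the time the event will not be cancelled, and you also queue the events one by one). On top of that, the kernel needs to handle this for uniprocessor vs. SMP vs. multiprocessor platforms. Thus, there is a cost-benefit tradeoff involved in implementing timer handling in the kernel.

It turns out that the architecture around jiffies/add_timer is designed to handle the most common devices, and for them, precision at the resolution of a jiffy is not an issue; but this also means that one cannot expect a reliable timer at the resolution of a single jiffy with this method. This is compounded by the fact that the kernel handles these "event queues" by treating them (somewhat) like interrupt service requests (IRQ), and that there are several levels of priority in IRQ handling in the kernel, where a higher-priority routine can pre-empt a lower-priority one (that is: interrupt and suspend the lower-priority routine, even if it is being executed at the time, and allow the higher-priority routine to go about its business). Or, as previously noted: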

@granquet: timers run in soft irq context, which means they have the highest priority and they preempt everything running/runnable on the CPU ... but hardware interrupts which are not disabled when servicing a soft irq. So you might (most probable explanation) get an Hardware interrupt here and there that preempts your timer ... and thus you get an interrupt that is not serviced at the right time.

@CL.: It is indeed possible that your timer function gets called at a later jiffy than what expires was set to. Possible reasons are scheduling delays, other drivers that disable interrupts for too long (graphics and WLAN drivers are usual culprits), or some crappy BIOS executing SMI code.

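I now think so too - I think this could be an illustration of what happens:

jiffies changes to, say, 10000 (== 40000 ms @ 250 Hz)

Let's say the timer function (queued by add_timer) is about to start running, but hasn't started running yet

Let's say that here, the network card generates (for whatever reason) a hardware interrupt

The hardware interrupt, having a higher priority, causes the kernel to pre-empt (stop and suspend) the timer function (possibly already started by now, and just a few instructions in)

That means the kernel now has to reschedule the timer function to run at a later point. Since one only works with integer operations in the kernel, and the time resolution for this kind of event is in jiffies, the best it can do is reschedule it for jiffies + 1 (10001 == 40004 ms @ 250 Hz)

Now the kernel switches the context to the IRQ service routine of the network card driver, and it goes about its business

Let's say the IRQ service routine completes in 200 μs - that means we're now (in "absolute" terms) at 40000.2 ms; however, we are also still at 10000 jiffies

If the kernel now switched the context back to the timer function, it would complete without me necessarily noticing the delay;

... however, that will not happen, because the timer function has been scheduled for the next jiffy!

So the kernel goes about its business (possibly sleeping) for approximately the next 3.8 ms

jiffies changes to 10001 (== 40004 ms @ 250 Hz)

The (previously rescheduled) timer function runs - and this time completes without interruption

I haven't really done a detailed analysis to see whether the sequence of events is exactly as described above. But I'm quite persuaded that it is something close; in other words, a resolution problem - especially since the high-resolution timer approach does not seem to show this behavior. It would indeed be great to obtain a scheduler log and know exactly what happened to cause a pre-empt, but I doubt that the roundtrip to userspace, which I attempted in the OP edit in response to @granquet's comment, is the right way to do it.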

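In any case, going back to this:

Note that I'm not looking for a period resolution below what corresponds to a jiffy (in this case, 4ms); nor am I looking to decrease the delta variance when the code works properly. So as I see it, I don't have "high resolution timer" demands, nor "hard real-time" demands ...

... that was a bad mistake on my part: as the analysis above shows, I did have "high resolution" demands! Had I realized that earlier, I might have found the relevant reading sooner. Anyway, some docs that were relevant for me (even if they don't specifically discuss periodic functions) are: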

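LDD3: 5.3. Semaphores and Mutexes - (describing a driver with different requirements): "no accesses will be made from interrupt handlers or other asynchronous contexts. There are no particular latency (response time) requirements; application programmers understand that I/O requests are not usually satisfied immediately"

Documentation/timers/hrtimers.txt - "the timers.c code is very "tightly coded" around jiffies and 32-bitness assumptions, and has been honed and micro-optimized for a relatively narrow use case (jiffies in a relatively narrow HZ range) for many years - and thus even small extensions to it easily break the wheel concept"

T. Gleixner, D. Niehaus, Hrtimers and Beyond: Transforming the Linux Time Subsystems (pdf) - (most detailed; see also the diagrams inside) "The Cascading Timer Wheel (CTW), which was implemented in 1997, replaced the original time ordered double linked list to resolve the scalability problem of the linked list's O(N) insertion time... The current approach to timer management in Linux does a good job of satisfying an extremely wide range of requirements, but it cannot provide the quality of service required in some cases precisely because it must satisfy such a wide range of requirements... The timeout related timers are kept in the existing timer wheel and a new subsystem optimized for (high resolution) timer requirements hrtimers was implemented. hrtimers are entirely based on human time (units: nanoseconds)... They are kept in a time sorted, per-CPU list, implemented as a red-black tree."

The high-resolution timer API [LWN.net] - "At its core, the hrtimer mechanism remains the same. Rather than using the "timer wheel" data structure, hrtimers live on a time-sorted linked list, with the next timer to expire being at the head of the list. A separate red/black tree is also used to enable the insertion and removal of timer events without scanning through the list. But while the core remains the same, just about everything else has changed, at least superficially."

Software interrupts and realtime [LWN.net] - "The softirq mechanism is meant to handle processing that is almost (but not quite) as important as the handling of hardware interrupts. Softirqs run at a high priority (though with an interesting exception, described below), but with hardware interrupts enabled. They thus will normally preempt any work except the response to a "real" hardware interrupt... Starting with the 3.0 realtime patch set, though, that capability went away... In response, in 3.6.1-rt1, the handling of softirqs has changed again."

High- (but not too high-) resolution timeouts [LWN.net] - "_poll() and epoll_wait() take an integer number of milliseconds; select() takes a struct timeval with microsecond resolution, and ppoll() and pselect() take a struct timespec with nanosecond resolution. They are all the same, though, in that they convert this timeout value to jiffies, with a maximum resolution between one and ten milliseconds. A programmer might program a pselect() call with a 10 nanosecond timeout, but the call may not return until 10 milliseconds later, even in the absence of contention for the CPU. ... It's a useful feature, but it comes at the cost of some significant API changes._"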

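From the quotes it is clear that the high-resolution timing facilities in the kernel are still under active development (with API changes), and I was afraid that I might have to install a special "real-time patch" kernel. Thankfully, high-resolution timers are seemingly available (and working) in my 2.6.38-16 SMP kernel without any special changes. Below is the listing of the modified testjiffy.c kernel module, which now uses high-resolution timers but otherwise keeps the same period as determined by jiffies. For testing, I made it loop 200 times (instead of 10 as in the OP); and after running the rerun.sh script some 20-30 times, this is the worst result I got:

[plot: time sequence and delta histogram]

The time sequence is now obviously unreadable, but the histogram can still tell us something: taking 0.00435 - 0.004 (= 0.004 - 0.00365) = 350 μs as the maximum deviation, it represents only 100 * (350/4000) = 8.75% of the expected period, which I certainly don't have a problem with. Additionally, I never got a drop (or correspondingly, an entire 2 * period = 8 ms delay), nor a 0 ms delay; the captures I got are otherwise of the quality shown in the first image in the OP. Now, of course I could run a longer test and see more precisely how reliable it is, but this is all the reliability I'd expect/need to see for this simple case. Contrast that with the OP, where I'd get a drop in just 10 loops with roughly the probability of a coin toss: every second or third run of the rerun.sh script would show a drop, even with low OS resource usage!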

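Finally, note that the source below should have the problem spotted by @CL. ("Your module is buggy: you must ensure that the timer is not pending before the module is unloaded") fixed, in the context of hrtimer. This seemingly answers my bonus question, as it removes the need for either of the "MUSTHAVE" sleeps in the rerun.sh script. However, note that since 200 loops @ 4 ms take 0.8 s, the sleep between insmod and rmmod is still needed if we want a full 200-tick capture (otherwise, on my machine, I get only some 7 ticks captured).

Well, I hope I got this right now (at least most of it) - if not, corrections are welcome :)

testjiffy(-hr).c :
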
#include <linux/module.h>   /* Needed by all modules */
#include <linux/kernel.h>   /* Needed for KERN_INFO */
#include <linux/init.h>     /* Needed for the macros */
#include <linux/jiffies.h>
#include <linux/time.h>
#define MAXRUNS 200

#include <linux/hrtimer.h>


static volatile int runcount = 0;

//~ static struct timer_list my_timer;
static unsigned long period_ms;
static unsigned long period_ns;
static ktime_t ktime_period_ns;
static struct hrtimer my_hrtimer;


//~ static void testjiffy_timer_function(unsigned long data)
static enum hrtimer_restart testjiffy_timer_function(struct hrtimer *timer)
{
  int tdelay = 100;
  unsigned long tjnow;
  ktime_t kt_now;
  int ret_overrun;

  runcount++;
  if (runcount == 5) {
    while (tdelay > 0) { tdelay--; } // small delay
  }

  printk(KERN_INFO
    " %s: runcount %d \n",
    __func__, runcount);

  if (runcount < MAXRUNS) {
    tjnow = jiffies;
    kt_now = hrtimer_cb_get_time(&my_hrtimer);
    ret_overrun = hrtimer_forward(&my_hrtimer, kt_now, ktime_period_ns);
    printk(KERN_INFO
      " testjiffy jiffies %lu ; ret: %d ; ktnsec: %lld \n",
      tjnow, ret_overrun, ktime_to_ns(kt_now));
    return HRTIMER_RESTART;
  }
  else return HRTIMER_NORESTART;
}


static int __init testjiffy_init(void)
{
  struct timespec tp_hr_res;
  period_ms = 1000/HZ;
  hrtimer_get_res(CLOCK_MONOTONIC, &tp_hr_res);
  printk(KERN_INFO
    "Init testjiffy: %d ; HZ: %d ; 1/HZ (ms): %ld ; hrres: %lld.%.9ld\n",
               runcount,      HZ,        period_ms, (long long)tp_hr_res.tv_sec, tp_hr_res.tv_nsec );

  hrtimer_init(&my_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
  my_hrtimer.function = &testjiffy_timer_function;
  period_ns = period_ms * 1000000UL; /* ms -> ns; an integer constant avoids a floating-point literal in kernel code */
  ktime_period_ns = ktime_set(0,period_ns);
  hrtimer_start(&my_hrtimer, ktime_period_ns, HRTIMER_MODE_REL);

  return 0;
}

static void __exit testjiffy_exit(void)
{
  int ret_cancel = 0;
  while( hrtimer_callback_running(&my_hrtimer) ) {
    ret_cancel++;
  }
  if (ret_cancel != 0) {
    printk(KERN_INFO " testjiffy Waited for hrtimer callback to finish (%d)\n", ret_cancel);
  }
  if (hrtimer_active(&my_hrtimer) != 0) {
    ret_cancel = hrtimer_cancel(&my_hrtimer);
    printk(KERN_INFO " testjiffy active hrtimer cancelled: %d (%d)\n", ret_cancel, runcount);
  }
  if (hrtimer_is_queued(&my_hrtimer) != 0) {
    ret_cancel = hrtimer_cancel(&my_hrtimer);
    printk(KERN_INFO " testjiffy queued hrtimer cancelled: %d (%d)\n", ret_cancel, runcount);
  }
  printk(KERN_INFO "Exit testjiffy\n");
}

module_init(testjiffy_init);
module_exit(testjiffy_exit);

MODULE_LICENSE("GPL");

Regarding "linux - Linux kernel add_timer reliability at resolution of one jiffy?", the corresponding question can be found on Stack Overflow: https://stackoverflow.com/questions/16920238/
