gpt4 book ai didi

linux - SMP 中的 CPU 停顿

转载 作者:太空宇宙 更新时间:2023-11-04 04:13:43 25 4
gpt4 key购买 nike

            We are facing an issue on which we need some help.

简要说明:

            We have enabled SMP in Linux 2.6.39.4 kernel and cross compiled it for PPC-476. After booting, kernel is able to map both the processors (2 cores at h/w). The problem we are facing is, while running modprobe command repeatedly,  one of the cpu goes into stall state. We have tried to dump stack of all active cpus (using sysrq) while one of the cpu is in stall state. The stack dump showed both the processors were executing same process (with same PID) i.e. modprobe.

问题:1. 两个处理器是否有可能在具有相同 PID 的运动中执行相同的进程。2. 两个进程同时执行是否会引起某种竞争条件,从而导致任一 CPU 进入停顿状态。

日志==================================================

SysRq : Show backtrace of all active CPUs
CPU0:
NIP: 701786c4 LR: 701752f0 CTR: 00000004
REGS: 9fb4fdc0 TRAP: 0501 Not tainted (2.6.39.4)
MSR: 00029000 <EE,ME,CE> CR: 44002048 XER: 00000000
TASK = 8f868ae0[827] 'modprobe' THREAD: 9fb48000 CPU: 0
GPR00: 08101820 9fb4fe70 8f868ae0 22222222 0002374d 00000002 00849ffc 00000000
GPR08: a2bb3a8c 00000810 a2c2aac4 00000148 44002088
NIP [701786c4] __sw_hweight32+0x50/0x58
LR [701752f0] __bitmap_weight+0x54/0xc0
Call Trace:
[9fb4fe70] [20000000] 0x20000000 (unreliable)
[9fb4fe90] [7006b1cc] sys_init_module+0x11a8/0x1ca4
[9fb4ff40] [7000f1b8] ret_from_syscall+0x0/0x3c
--- Exception: c01 at 0x10050e38
LR = 0x100a6708
Instruction dump:
7c634838 7c004838 7c001a14 5409e13e 7c090214 3d200f0f 61290f0f 7c004838
5409c23e 7c090214 5409843e 7c090214 <5403063e> 4e800020 5460f87e 70005555
CPU1:
NIP: 700a9678 LR: 700a964c CTR: 7012b0f8
REGS: 9fb4fe30 TRAP: 0501 Not tainted (2.6.39.4)
MSR: 00029000 <EE,ME,CE> CR: 80008022 XER: 20000000
TASK = 8f868ae0[827] 'modprobe' THREAD: 9fb48000 CPU: 1
GPR00: 00000000 9fb4fee0 8f868ae0 9efb9f20 9efb9f20 9fb4fee8 1004d6d0 0002d000
GPR08: 9efb9d68 9f8eee00 9efb9f20 00000000 20002022
NIP [700a9678] do_munmap+0x114/0x314
LR [700a964c] do_munmap+0xe8/0x314
Call Trace:
[9fb4fee0] [00000014] 0x14 (unreliable)
[9fb4ff20] [700aa7c8] sys_munmap+0x44/0x74
[9fb4ff40] [7000f1b8] ret_from_syscall+0x0/0x3c
--- Exception: c01 at 0x100510a8
LR = 0x10022088
Instruction dump:
4bffe461 7c641b79 41820010 80040004 7f9d0040 419d01d4 83210008 2e190000
41920200 83d9000c 801f0074 2f800000 <419e00f8> 2f9e0000 419e00f0 801e0004
Call Trace:
[9ffaff00] [70008654] show_stack+0x6c/0x1a4 (unreliable)
[9ffaff40] [70192d30] showacpu+0x84/0xcc
[9ffaff60] [700679bc] generic_smp_call_function_single_interrupt+0x100/0x18c
[9ffaff90] [7000ff6c] call_function_single_action+0x10/0x24
[9ffaffa0] [7006f7f4] handle_irq_event_percpu+0xa0/0x21c
[9ffaffe0] [700728d8] handle_percpu_irq+0x88/0xb8
[9ffafff0] [7000e038] call_handle_irq+0x18/0x28
[9fb4fdf0] [700044b0] do_IRQ+0xe8/0x1a0
[9fb4fe20] [7000f81c] ret_from_except+0x0/0x18
--- Exception: 501 at do_munmap+0x114/0x314
LR = do_munmap+0xe8/0x314
[9fb4fee0] [00000014] 0x14 (unreliable)
[9fb4ff20] [700aa7c8] sys_munmap+0x44/0x74
[9fb4ff40] [7000f1b8] ret_from_syscall+0x0/0x3c
--- Exception: c01 at 0x100510a8

================================================================

最佳答案

你的两个问题都是错误的。两个核心不可能同时运行相同的任务。

从 OOPS/panic 跟踪中,我们可以知道 CPU 0 上的内核首先发生 panic ,然后触发 SysRq“显示所有事件 CPU 的回溯”。也就是说,CPU 0 发生 panic 时,CPU 1 工作正常,但 CPU 0 异常触发转储 CPU 1 的回溯。

现在让我们分析一下为什么CPU 0会出现panic:从回溯来看,它发生在“modprobe”期间,所以请分析您系统上的内核模块,以找到触发panic的模块。

关于linux - SMP 中的 CPU 停顿,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/17849398/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com