Panic in strcmp() during boot on grsec 3.11.6



Postby wizeman » Sat Nov 02, 2013 4:28 pm

While booting 3.11.6 with the grsecurity-2.9.1-3.11.6-201310292050.patch applied I consistently run into this crash during boot:

http://i.imgur.com/lnMQDFn.jpg

Any idea of what could be wrong?

The same kernel boots and seems to work fine without the grsec patch applied.
wizeman
 
Posts: 14
Joined: Sat Nov 02, 2013 4:06 pm

Re: Panic in strcmp() during boot on grsec 3.11.6

Postby wizeman » Thu Nov 07, 2013 3:50 pm

Some more data: it also fails with the same error using the updated "grsecurity-2.9.1-3.11.6-201311021635.patch", and it always fails consistently in the same way.

If I use kernel 3.2.52, with "grsecurity-2.9.1-3.2.52-201311021628.patch" applied and with the same (or at least, as similar as possible) kernel config, it boots up and works fine.

I can provide more info if necessary.
wizeman
 
Posts: 14
Joined: Sat Nov 02, 2013 4:06 pm

Re: Panic in strcmp() during boot on grsec 3.11.6

Postby wizeman » Thu Nov 07, 2013 4:47 pm

Some more info which might be relevant: I am using ZFS as the root filesystem, which boots and works fine on both 3.11.6 without grsecurity and 3.2.52 with grsecurity.
However, I don't know if the module that is failing is a ZFS module or not.
wizeman
 
Posts: 14
Joined: Sat Nov 02, 2013 4:06 pm

Re: Panic in strcmp() during boot on grsec 3.11.6

Postby PaX Team » Thu Nov 07, 2013 5:46 pm

while the full logs would be nice to have, this seems to be some null pointer dereference triggered during module loading. if you tell me how to reproduce it in qemu i'll take a look.
PaX Team
 
Posts: 2310
Joined: Mon Mar 18, 2002 4:35 pm

Re: Panic in strcmp() during boot on grsec 3.11.6

Postby wizeman » Thu Nov 07, 2013 9:50 pm

Strangely, running qemu with the exact same kernel and initrd seems to boot perfectly fine.

I even tried running:
Code:
$ qemu-kvm -m 512 -cpu kvm64 -smp 1,cores=4 -snapshot /dev/sda


... which makes qemu use my hard disk directly (effectively read-only, since -snapshot redirects writes to temporary files), which should reproduce my exact config more closely... but it also boots up just fine!
However, when I reboot the machine and boot the exact same config on bare metal, it will crash on boot.

I will now attempt to get a more complete kernel log when booting on bare metal (unless you have another suggestion).

Thanks!
wizeman
 
Posts: 14
Joined: Sat Nov 02, 2013 4:06 pm

Re: Panic in strcmp() during boot on grsec 3.11.6

Postby PaX Team » Sun Nov 10, 2013 8:09 pm

so i'll need those logs then, your best bet is netconsole probably if you have another machine around.
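
For anyone following along, here is a minimal sketch of the receiving side, assuming the crashing machine is booted with a `netconsole=` kernel parameter that points at the second machine. netconsole sends plain UDP datagrams, by default to target port 6666. The function name `iter_netconsole` is only an illustrative choice, not part of any existing tool.

```python
import socket


def iter_netconsole(sock, timeout=5.0):
    """Yield each UDP payload received on sock as text.

    Stops once no datagram has arrived for `timeout` seconds.
    """
    sock.settimeout(timeout)
    while True:
        try:
            payload, _sender = sock.recvfrom(65535)
        except socket.timeout:
            return
        # Kernel messages are normally ASCII; replace anything undecodable.
        yield payload.decode("utf-8", errors="replace")
```

To use it, bind a `socket.SOCK_DGRAM` socket to `("0.0.0.0", 6666)` on the logging machine (or whichever port the kernel parameter names) and print each yielded message as it arrives, so that whatever was sent before the panic is kept.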
PaX Team
 
Posts: 2310
Joined: Mon Mar 18, 2002 4:35 pm

Re: Panic in strcmp() during boot on grsec 3.11.6

Postby wizeman » Sun Nov 10, 2013 10:13 pm

I was able to get a netconsole log today.
However, the machine behaves differently with netconsole: instead of crashing immediately, it seems to freeze during boot for several seconds, until I press some keys (or perhaps until some timeout occurs), and then it crashes.

Here's the last few lines of the netconsole log:

Code:
[    0.466038] netpoll: netconsole: device eth0 not up yet, forcing it
[    0.471076] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[    1.455257] tsc: Refined TSC clocksource calibration: 2394.560 MHz
[    2.456220] Switched to clocksource tsc
[   60.517633] r8169 0000:06:00.0 eth0: unable to load firmware patch rtl_nic/rtl8168e-2.fw (-2)
[   60.530334] r8169 0000:06:00.0 eth0: link down
[   60.530349] r8169 0000:06:00.0 eth0: link down
[   62.935196] r8169 0000:06:00.0 eth0: link up
[   62.939297] console [netcon0] enabled
[   62.939366] netconsole: network logging started
[   62.940292] Freeing unused kernel memory: 1272K (ffffffff8183c000 - ffffffff8197a000)
[   62.954531] BUG: unable to handle kernel NULL pointer dereference at            (nil)
[   62.954725] IP: [<ffffffff812a4800>] strcmp+0x10/0x40
[   62.954850] PGD 0
[   62.954959] Thread overran stack, or stack corrupted
[   62.955034] Oops: 0000 [#1] SMP
[   62.955187] Modules linked in:
[   62.955305] CPU: 1 PID: 488 Comm: modprobe Not tainted 3.11.7 #1-NixOS
[   62.955382] Hardware name: Dell Inc.          Dell System XPS L502X/XXXXXX, BIOS A11 05/29/2012
[   62.955490] task: ffff880232f3cf60 ti: ffff880232f3d3d0 task.ti: ffff880232f3d3d0
[   62.955597] RIP: 0010:[<ffffffff812a4800>]  [<ffffffff812a4800>] strcmp+0x10/0x40
[   62.955744] RSP: 0018:ffff88023054bbe8  EFLAGS: 00010246
[   62.955824] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000010
[   62.955903] RDX: 0000000000000da1 RSI: ffffffff817008db RDI: 0000000000000000
[   62.955982] RBP: ffff88023054bbf8 R08: ffffffff810b6410 R09: ffffffffa0006040
[   62.956064] R10: ffffffff81665ca5 R11: ffffffff81665cb6 R12: ffffffff817008db
[   62.956143] R13: 0000000000000010 R14: 0000000000000000 R15: 00000000000006d0
[   62.956223] FS:  0000033c65c4b700(0000) GS:ffff88023e440000(0000) knlGS:0000000000000000
[   62.956333] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   62.956409] CR2: 0000000000000000 CR3: 00000000014bc000 CR4: 00000000000607f0
[   62.956489] Stack:
[   62.956563]  0000000000000000 ffffffff816de350 ffff88023054bc18 ffffffff810b6438
[   62.956829]  0000000000000000 ffffffff816d7650 ffff88023054bc78 ffffffff812b0ed0
[   62.957092]  ffff88023054bc58 ffffffff810b6410 ffffffff816de350 0000000000000da1
[   62.957357] Call Trace:
[   62.957435]  [<ffffffff810b6438>] cmp_name+0x28/0x40
[   62.957511]  [<ffffffff812b0ed0>] bsearch+0x70/0xa0
[   62.957587]  [<ffffffff810b6410>] ? mod_find_symname+0x90/0x90
[   62.957670]  [<ffffffff810b65dd>] find_symbol_in_section+0x4d/0xf0
[   62.957750]  [<ffffffff810b6590>] ? section_objs+0x70/0x70
[   62.957830]  [<ffffffff810b70e0>] each_symbol_section+0x30/0x70
[   62.957912]  [<ffffffff810b716c>] find_symbol+0x4c/0x90
[   62.957990]  [<ffffffff810ba26b>] load_module+0x19eb/0x25f0
[   62.958069]  [<ffffffff810bafe1>] SyS_init_module+0x171/0x240
[   62.958153]  [<ffffffff814ae2d5>] system_call_fastpath+0x18/0x1d
[   62.958229] Code: 44 00 00 41 0f b6 0c 14 88 0c 10 48 83 c2 01 84 c9 75 f0 48 89 d8 5b 41 5c 5d c3 55 48 89 e5 41 54 49 89 f4 53 48 89 fb 31 c0 90 <0f> b6 14 03 41 3a 14 04 75 16 48 83 c0 01 84 d2 75 ee 5b 41 5c
[   62.960961] RIP  [<ffffffff812a4800>] strcmp+0x10/0x40
[   62.961077]  RSP <ffff88023054bbe8>
[   62.961150] CR2: 0000000000000000
[   62.961238] ---[ end trace 61a6180546662351 ]---
[   62.961314] Kernel panic - not syncing: grsec: halting the system due to suspicious kernel crash caused by root


And here's the full log: http://pastebin.com/raw.php?i=1S96Qxx5
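
One way to read the register dump above, as a rough sketch: on x86-64 Linux the first two function arguments are passed in RDI and RSI, so pulling those out of the oops shows which strcmp() argument was NULL. The helper names `parse_registers` and `null_arguments` are made up for illustration, and the sample lines are copied from the log above.

```python
import re

# Matches "NAME: 16-hex-digits" pairs such as "RDI: 0000000000000000".
REGISTER_RE = re.compile(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b")

# Two register lines copied verbatim from the oops.
SAMPLE = """\
[   62.955824] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000010
[   62.955903] RDX: 0000000000000da1 RSI: ffffffff817008db RDI: 0000000000000000
"""


def parse_registers(oops_text):
    """Return {register name: integer value} for every pair found in the text."""
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(oops_text)}


def null_arguments(registers):
    """List which of the first two argument registers (RDI, RSI) hold zero."""
    return [name for name in ("RDI", "RSI") if registers.get(name) == 0]


print(null_arguments(parse_registers(SAMPLE)))  # prints ['RDI']
```

Only RDI is zero here, i.e. the first argument to strcmp(), while RSI holds a kernel address. As far as I can tell from the Code: bytes, the faulting instruction reads a byte through RBX, which was copied from RDI just before, so the two agree. This does not explain why the pointer was NULL in the first place.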

Thanks!
wizeman
 
Posts: 14
Joined: Sat Nov 02, 2013 4:06 pm

Re: Panic in strcmp() during boot on grsec 3.11.6

Postby wizeman » Wed Nov 13, 2013 5:25 pm

Notice that the oops output includes the line "Thread overran stack, or stack corrupted", so the kernel suspects stack corruption.

I now suspect this may be caused by ZFS rather than grsecurity. Although the stack trace is different, I also found a stack corruption report in the ZFS on Linux bug tracker:

https://github.com/zfsonlinux/zfs/issues/1778

I also ran into a similar bug in ZFS a few years ago, which caused stack corruption in an unrelated kernel thread due to DMA transfers into the wrong kernel buffer, so it may be something similar here.
wizeman
 
Posts: 14
Joined: Sat Nov 02, 2013 4:06 pm

Re: Panic in strcmp() during boot on grsec 3.11.6

Postby PaX Team » Sat Nov 16, 2013 12:25 pm

do you load zfs as a module or is it patched into the kernel directly? i'm just asking because it's not listed as an already loaded module in the oops. you could also try to boot from a non-zfs file system (and not compile/load zfs at all) just to see if it's a factor.
PaX Team
 
Posts: 2310
Joined: Mon Mar 18, 2002 4:35 pm

Re: Panic in strcmp() during boot on grsec 3.11.6

Postby wizeman » Mon Nov 18, 2013 8:09 pm

PaX Team wrote:do you load zfs as a module or is it patched into the kernel directly? i'm just asking because it's not listed as an already loaded module in the oops. you could also try to boot from a non-zfs file system (and not compile/load zfs at all) just to see if it's a factor.


Actually, I load it as an external module... so I think you may be right.

I guess I will have to do some deeper debugging when I have the chance...

Thanks!
wizeman
 
Posts: 14
Joined: Sat Nov 02, 2013 4:06 pm

