Yes, thanks for the idea. I've also set up a test machine with the same userland so I can look into it, and hooked up a serial console.
Also, I've found out why the problem still happens even when CONFIG_GRKERNSEC_BRUTE is not set in the kernel config: it is because I have "RES_CRASH 3 10m" in my /etc/grsec/policy. It seems to override CONFIG_GRKERNSEC_BRUTE in the kernel. When I disable RES_CRASH in the policy too, segfaults are no longer detected as bruteforce, nor do they cause any problem, so that is a workaround.
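As a rough sketch of how I read "RES_CRASH 3 10m" (an assumption based on the log below, where the ban follows the third segfault and lasts 600 seconds, not on the grsecurity source): count crashes inside a sliding time window and act once the count reaches the limit. The names `CrashLimiter` and `record_crash` are made up for illustration.

```python
from collections import deque


class CrashLimiter:
    """Hypothetical model of a 'N crashes within T seconds' limit."""

    def __init__(self, max_crashes, window_seconds):
        self.max_crashes = max_crashes
        self.window_seconds = window_seconds
        self.crash_times = deque()

    def record_crash(self, now):
        """Record a crash at time `now` (seconds).

        Returns True when the number of crashes still inside the
        window has reached the limit.
        """
        self.crash_times.append(now)
        # Forget crashes that have fallen out of the window.
        while now - self.crash_times[0] > self.window_seconds:
            self.crash_times.popleft()
        return len(self.crash_times) >= self.max_crashes


# "RES_CRASH 3 10m" modelled as 3 crashes within 600 seconds.
limiter = CrashLimiter(max_crashes=3, window_seconds=600)
quick_results = [limiter.record_crash(t) for t in (0.0, 1.0, 2.0)]

# Crashes spread further apart than the window never reach the limit.
slow_limiter = CrashLimiter(max_crashes=3, window_seconds=600)
slow_results = [slow_limiter.record_crash(t) for t in (0.0, 400.0, 800.0)]

print(quick_results, slow_results)
```

Three quick segfaults in a row, as in my test program, would trip the limit on the third one under this model.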
With the serial console I've found part of the explanation for the symptoms.
As you recall, I said that after bruteforce detection kills the processes of my user there is about half a minute of severe slowdown, and only then does the reboot happen.
I've also found that the bruteforce detection seems to kill more than just the user's session: I was logged in on the virtual consoles, as root (in "gradm -a admin" mode) on tty1 and as a user on tty2, and provoked a bruteforce detection as the user.
It closed not only the user's shell, but also the root one (which printed the message about "exiting special role admin")!
The serial console reveals that the reboot happens because of softdog, which we use as part of the Debian watchdog package: it prints "SoftDog: Initiating system reboot." and then the machine immediately reboots.
It seems the slowdown, or whatever causes it, makes softdog register it as a machine lockup.
The softdog params are "Software Watchdog Timer: 0.07 initialized. soft_noboot=0 soft_margin=60 sec (nowayout= 0)"
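For reference, a minimal sketch of what those parameters mean in practice, assuming the usual softdog behaviour that the timer fires once nothing has pinged the watchdog for soft_margin seconds. The helper names `parse_softdog_params` and `watchdog_expired` are hypothetical; only the banner string is taken from the machine.

```python
import re

# Banner exactly as printed by the kernel on this machine.
BANNER = ("Software Watchdog Timer: 0.07 initialized. "
          "soft_noboot=0 soft_margin=60 sec (nowayout= 0)")


def parse_softdog_params(line):
    """Extract name=value integer pairs from the softdog banner."""
    return {name: int(value)
            for name, value in re.findall(r"(\w+)=\s*(\d+)", line)}


def watchdog_expired(seconds_since_last_ping, params):
    """True if the (modelled) softdog timer would have fired by now."""
    return seconds_since_last_ping >= params["soft_margin"]


params = parse_softdog_params(BANNER)
print(params)
print(watchdog_expired(30, params), watchdog_expired(61, params))
```

With soft_noboot=0, an expired timer leads to a reboot rather than just a log message, which matches what the serial console showed.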
Armed with that knowledge, I stopped the watchdog daemon (and most of the other daemons) and provoked a bruteforce detection as a user.
The machine still got very slow, but I managed to log back in as root (slowly) without a reboot, and then the kernel oopsed (and rebooted due to sysctl kernel.panic=60).
Here is what the serial console logged, from the end of one boot to the next reboot:
Running local boot scripts (/etc/rc.local).
grsec: (default:D:/sbin/gradm) grsecurity 2.2.2 RBAC system loaded by /sbin/gradm[gradm:1706] uid/euid:0/0 gid/egid:0/0, parent /etc/init.d/grsec[S99x5_grsec:1705] uid/euid:0/0 gid/egid:0/0
warning: `ntpd' uses 32-bit capabilities (legacy support in use)
grsec: (default:D:/sbin/gradm) successful change to special role admin (id 1) by /sbin/gradm[gradm:18540] uid/euid:0/0 gid/egid:0/0, parent /bin/zsh4[zsh:18451] uid/euid:0/0 gid/egid:0/0
getpwnam__segfa[19675]: segfault at 28 ip 00000000004005ed sp 000003e22f405d70 error 4 in getpwnam__segfault_reboota_masinu[400000+1000]
grsec: (default:D:/) Segmentation fault occurred at 0000000000000028 in /home/mnalis/work/test_getpwnam/getpwnam__segfault_reboota_masinu[getpwnam__segfa:19675] uid/euid:500/500 gid/egid:500/500, parent /bin/zsh4[zsh:18490] uid/euid:500/500 gid/egid:500/500
getpwnam__segfa[19676]: segfault at 28 ip 00000000004005ed sp 000003844d5fe450 error 4 in getpwnam__segfault_reboota_masinu[400000+1000]
grsec: (default:D:/) Segmentation fault occurred at 0000000000000028 in /home/mnalis/work/test_getpwnam/getpwnam__segfault_reboota_masinu[getpwnam__segfa:19676] uid/euid:500/500 gid/egid:500/500, parent /bin/zsh4[zsh:18490] uid/euid:500/500 gid/egid:500/500
getpwnam__segfa[19677]: segfault at 28 ip 00000000004005ed sp 000003ea2183e090 error 4 in getpwnam__segfault_reboota_masinu[400000+1000]
grsec: (default:D:/) Segmentation fault occurred at 0000000000000028 in /home/mnalis/work/test_getpwnam/getpwnam__segfault_reboota_masinu[getpwnam__segfa:19677] uid/euid:500/500 gid/egid:500/500, parent /bin/zsh4[zsh:18490] uid/euid:500/500 gid/egid:500/500
grsec: (default:D:/) possible exploit bruteforcing on /home/mnalis/work/test_getpwnam/getpwnam__segfault_reboota_masinu[getpwnam__segfa:19677] uid/euid:500/500 gid/egid:500/500, parent /bin/zsh4[zsh:18490] uid/euid:500/500 gid/egid:500/500 banning execution for 600 seconds/home/mnalis/work/test_getpwnam/getpwnam__segfault_reboota_masinu[getpwnam__segfa:19677] uid/euid:500/500 gid/egid:500/500, parent /bin/zsh4[zsh:18490] uid/euid:500/500 gid/egid:500/500
grsec: (default:D:/) special role admin (id 1) exited by /bin/zsh4[zsh:18451] uid/euid:0/0 gid/egid:0/0, parent /bin/login[login:1735] uid/euid:0/0 gid/egid:0/0
BUG: soft lockup - CPU#1 stuck for 61s! [migration/1:6]
Modules linked in: xfs exportfs ipv6 nls_iso8859_1 nls_cp437 ext2 ide_generic ide_core isofs vfat fat 8139too r8169 mii sky2 softdog
CPU 1:
Modules linked in: xfs exportfs ipv6 nls_iso8859_1 nls_cp437 ext2 ide_generic ide_core isofs vfat fat 8139too r8169 mii sky2 softdog
Pid: 6, comm: migration/1 Not tainted 2.6.32.46-grsec201110250925-nobrute2 #1 P35-DS3L
RIP: 0010:[<ffffffff8138ecf7>] [<ffffffff8138ecf7>] schedule+0x7e/0x9f8
RSP: 0000:ffff88010ba81dc0 EFLAGS: 00000282
RAX: ffff880028287490 RBX: ffff88010ba81e70 RCX: 0000000000006db8
RDX: 0000000000000001 RSI: ffff880028290890 RDI: ffff88010ba52350
RBP: ffffffff8100349e R08: ffff880028280000 R09: 0000000000000000
R10: 0000000000000000 R11: ffff880028292e40 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000000 R15: ffff880028292e40
FS: 0000000000000000(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000007b2e38 CR3: 00000000013a2000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
[<ffffffff8100336b>] ? retint_restore_args+0x6/0xd
[<ffffffff8100336b>] ? retint_restore_args+0x6/0xd
[<ffffffff81033fda>] ? migration_thread+0x0/0x274
[<ffffffff81034172>] ? migration_thread+0x198/0x274
[<ffffffff81033fda>] ? migration_thread+0x0/0x274
[<ffffffff810566ee>] ? kthread+0x87/0x8f
[<ffffffff81003bef>] ? child_rip+0xf/0x20
[<ffffffff8100336b>] ? retint_restore_args+0x6/0xd
[<ffffffff81056667>] ? kthread+0x0/0x8f
[<ffffffff81003be0>] ? child_rip+0x0/0x20
Kernel panic - not syncing: softlockup: hung tasks
Pid: 6, comm: migration/1 Not tainted 2.6.32.46-grsec201110250925-nobrute2 #1
Call Trace:
<IRQ> [<ffffffff81387c18>] panic+0x75/0x14e
[<ffffffff8100349e>] ? apic_timer_interrupt+0xe/0x30
[<ffffffff8107c965>] softlockup_tick+0x17c/0x193
[<ffffffff810490ed>] run_local_timers+0x18/0x20
[<ffffffff81049124>] update_process_times+0x2f/0x5a
[<ffffffff81062ce0>] tick_sched_timer+0x6d/0x93
[<ffffffff8105992c>] __run_hrtimer+0xd0/0x165
[<ffffffff81062c73>] ? tick_sched_timer+0x0/0x93
[<ffffffff8105a152>] hrtimer_interrupt+0xd1/0x1a9
[<ffffffff8105a081>] ? hrtimer_interrupt+0x0/0x1a9
[<ffffffff81003cfc>] ? call_softirq+0x1c/0x2c
[<ffffffff8105a081>] ? hrtimer_interrupt+0x0/0x1a9
[<ffffffff81017ef3>] smp_apic_timer_interrupt+0x9f/0xbc
[<ffffffff81033fda>] ? migration_thread+0x0/0x274
[<ffffffff810034b8>] apic_timer_interrupt+0x28/0x30
<EOI> [<ffffffff8138ecf7>] ? schedule+0x7e/0x9f8
[<ffffffff8100336b>] ? retint_restore_args+0x6/0xd
[<ffffffff8100336b>] ? retint_restore_args+0x6/0xd
[<ffffffff81033fda>] ? migration_thread+0x0/0x274
[<ffffffff81034172>] ? migration_thread+0x198/0x274
[<ffffffff81033fda>] ? migration_thread+0x0/0x274
[<ffffffff810566ee>] ? kthread+0x87/0x8f
[<ffffffff81003bef>] ? child_rip+0xf/0x20
[<ffffffff8100336b>] ? retint_restore_args+0x6/0xd
[<ffffffff81056667>] ? kthread+0x0/0x8f
[<ffffffff81003be0>] ? child_rip+0x0/0x20
Rebooting in 60 seconds..
Linux version 2.6.32.46-grsec201110250925-nobrute2 (mnalis@sabik) (gcc version 4.6.1 (Debian 4.6.1-15) ) #1 SMP Wed Nov 9 17:31:22 UTC 2011
Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.32.46-grsec201110250925-nobrute2 root=/dev/sda2 ro elevator=cfq console=tty0 console=ttyS0,115200
KERNEL supported cpus:
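To pick the key events out of a capture like the one above, something along these lines could be used. It is only a sketch: the regular expressions are written against the two lines quoted from this log, and the function name `summarize` is made up for illustration.

```python
import re

# Patterns written to match the lines seen in this particular log.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)
PANIC = re.compile(r"Kernel panic - not syncing: (?P<reason>.*)")


def summarize(lines):
    """Return a list of (kind, details...) tuples for lockups and panics."""
    events = []
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            events.append(("soft_lockup", int(match["cpu"]),
                           int(match["secs"]), match["comm"],
                           int(match["pid"])))
            continue
        match = PANIC.search(line)
        if match:
            events.append(("panic", match["reason"].strip()))
    return events


# Two lines copied verbatim from the serial console log.
sample = [
    "BUG: soft lockup - CPU#1 stuck for 61s! [migration/1:6]",
    "Kernel panic - not syncing: softlockup: hung tasks",
]
events = summarize(sample)
print(events)
```

Fed the whole capture line by line, this would report the soft lockup on CPU 1 in migration/1 and the panic that follows it.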
I hope some of that helps in tracking it down.