Cron triggers general deadlock

Post by linkfanel » Mon Apr 16, 2007 9:02 pm

Hi there,

I've been running Debian sid on a server of mine, and after I upgraded and booted it into a new grsec-patched kernel, it started hanging in the morning when the daily cron jobs ran. After investigation, I've isolated a culprit script: /etc/cron.daily/find, which is basically a wrapper around updatedb.

At first, when I start the script, everything runs fine for a while. But toward the end of the execution, or even after it has completed, things go wrong. Several processes get stuck in uninterruptible sleep and the load climbs. When I type commands over ssh, characters echo back v.e.r.y. s.l.o.w.l.y., until the system eventually stops responding altogether.

It affects newly spawned processes as well as already running ones (top, sshd) and processes that are just sitting around without really doing much (ospfd). I have a cron job that runs every minute, and I end up with several of these:

Code:
S    /usr/sbin/cron
D     \_ /USR/SBIN/CRON
D     \_ /USR/SBIN/CRON
D     \_ /USR/SBIN/CRON
...
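
For anyone trying to capture the same picture without relying on ps, here is a minimal Python sketch that lists tasks in state D (uninterruptible sleep) by reading /proc/<pid>/stat. The function names and the proc_root parameter are made up for illustration; only the layout of the stat file (pid, command name in parentheses, then the state letter) is taken from the kernel's proc interface.

```python
import os


def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    The command name sits in parentheses and may itself contain spaces
    or parentheses, so split on the last ')' rather than on whitespace.
    """
    left, _, right = line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    return int(pid_text), comm, right.split()[0]


def blocked_tasks(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for tasks currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were looking
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

Running it in a loop while the problem builds up would show which tasks enter D state first, which may be more telling than the final pile-up of cron children.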


Meanwhile, the CPU spends around 80% of its time waiting, and most of the rest in system time. I have to say that this server has too little memory and routinely swaps a lot; it may reach these figures in normal use, but it usually copes just fine (without crashing). Indeed, it swaps a lot while updatedb is running. But even after updatedb has completed and everything starts going wrong, I still see this CPU usage, with nothing left running that should cause intensive swapping. I've tried stracing some processes, but saw nothing of interest. The logs show nothing unusual.
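
To put numbers like these on record over time, the following Python sketch samples the aggregate cpu line of /proc/stat twice and reports the share of time spent in user, system and iowait. It assumes that the "waiting" figure above is the iowait counter as reported by top, which the post does not state explicitly; the function names are hypothetical.

```python
import time


def read_cpu_times(path="/proc/stat"):
    """Return the counters of the aggregate 'cpu' line as a list of ints."""
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields and fields[0] == "cpu":
                return [int(value) for value in fields[1:]]
    raise RuntimeError("no aggregate cpu line found in " + path)


def cpu_shares(before, after):
    """Percentage of elapsed CPU time spent in user, system and iowait.

    Field order on the cpu line: user, nice, system, idle, iowait, irq,
    softirq, steal. Later fields (guest time) are ignored because they
    are already counted inside user time.
    """
    deltas = [b - a for a, b in zip(before, after)][:8]
    total = sum(deltas)
    if total <= 0 or len(deltas) < 5:
        return {"user": 0.0, "system": 0.0, "iowait": 0.0}
    return {
        "user": 100.0 * deltas[0] / total,
        "system": 100.0 * deltas[2] / total,
        "iowait": 100.0 * deltas[4] / total,
    }


if __name__ == "__main__":
    first = read_cpu_times()
    time.sleep(5)  # sampling interval in seconds, chosen arbitrarily
    print(cpu_shares(first, read_cpu_times()))
```

A high iowait share that persists after updatedb has exited, with no swap activity to explain it, would be consistent with tasks blocked in the kernel rather than with ordinary memory pressure.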

I'm afraid to have to say that this problem does not occur on a vanilla kernel. I've been running linux 2.6.18.3 with grsecurity-2.1.9-2.6.18.2-200611100917.patch (I think) without any problem, and it first occurred when I upgraded to 2.6.19.3 with grsecurity-2.1.10-2.6.19.3-200702061822.patch (I think), and I've just been investigating it under 2.6.20.7 with grsecurity-2.1.10-2.6.20.6-200704091818.patch (I'm sure).

In case it could be useful, my server is an old Pentium MMX at 233 MHz with 48 MB of RAM, and a SATA drive on a Promise SATA 300 TX2+ controller, a normal / partition and several LVM partitions mounted on it, all in reiserfs. (It also has an IDE drive, whose modules were not even loaded at that time.) As far as I could see, updatedb does not "hang somewhere on a weird file." My .config is available at http://prue.dyn.linkfanel.net/config-2.6.20.7-grsec_andrea

As this server is not on the same continent as me, I am open to limited experimentation only (no, remotely rebooting with "echo b > /proc/sysrq-trigger" before losing all control is not that funny).
linkfanel
 
Posts: 39
Joined: Fri Jul 14, 2006 8:26 pm

Re: Cron triggers general deadlock

Post by PaX Team » Fri Apr 20, 2007 8:50 am

linkfanel wrote: I'm afraid to have to say that this problem does not occur on a vanilla kernel. I've been running linux 2.6.18.3 with grsecurity-2.1.9-2.6.18.2-200611100917.patch (I think) without any problem, and it first occurred when I upgraded to 2.6.19.3 with grsecurity-2.1.10-2.6.19.3-200702061822.patch (I think), and I've just been investigating it under 2.6.20.7 with grsecurity-2.1.10-2.6.20.6-200704091818.patch (I'm sure).
i don't have an immediate guess as to which feature could be causing this so we're left with trying to turn off features and see when the problem goes away. you could start with enabling PAGEEXEC instead of SEGMEXEC, that'll get rid of the vma mirroring code. if that doesn't help, try without KERNEXEC and UDEREF (even though failure caused by them would result in oops, not this kind of hang).
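
When turning features off one at a time, it helps to record exactly which PaX options each test kernel was built with. The Python sketch below extracts every CONFIG_PAX_* line from the text of a kernel .config. The function name is hypothetical, and the option names in the usage note are assumed spellings of the features named above; they should be checked against the actual .config.

```python
def pax_options(config_text):
    """Map each CONFIG_PAX_* symbol to its value, or 'n' if it is not set."""
    options = {}
    not_set_suffix = " is not set"
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_PAX_") and line.endswith(not_set_suffix):
            # Disabled options appear as comments: "# CONFIG_FOO is not set"
            name = line[2:-len(not_set_suffix)]
            options[name] = "n"
        elif line.startswith("CONFIG_PAX_") and "=" in line:
            name, _, value = line.partition("=")
            options[name] = value
    return options


def changed_options(old, new):
    """Return {name: (old_value, new_value)} for options that differ."""
    names = set(old) | set(new)
    return {
        name: (old.get(name, "n"), new.get(name, "n"))
        for name in sorted(names)
        if old.get(name, "n") != new.get(name, "n")
    }
```

For example, comparing a build with CONFIG_PAX_SEGMEXEC=y against one with CONFIG_PAX_PAGEEXEC=y (assumed names) through changed_options() would confirm that only the intended option was flipped between two test runs.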
PaX Team
 
Posts: 2310
Joined: Mon Mar 18, 2002 4:35 pm

Post by linkfanel » Sat May 05, 2007 3:14 pm

I have studied the problem further, and the hang also happens when running tar xjvf linux-2.6.20.8.tar.bz2 or the like... However, I eventually reproduced it on a vanilla kernel. It seems that the PaX code just makes the problem worse/more likely to happen.

Sorry for bothering you for nothing!

Post by PaX Team » Sat May 05, 2007 4:50 pm

linkfanel wrote: I have studied the problem further, and the hang also happens when running tar xjvf linux-2.6.20.8.tar.bz2 or the like... However, I eventually reproduced it on a vanilla kernel. It seems that the PaX code just makes the problem worse/more likely to happen.

Sorry for bothering you for nothing!
well, it wasn't for nothing, at least you can more or less reliably reproduce it ;-). what would help is looking at where tasks executing in the kernel are waiting, sysrq-w should log some useful info.
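
On a remote machine without a console keyboard, the sysrq-w request can be issued through /proc/sysrq-trigger, the same file mentioned earlier in the thread for the reboot command. The Python sketch below writes "w" to that file and then reads the kernel log through dmesg. It is an illustration under assumptions: the function names are invented, the script must run as root, and the kernel must be built with magic SysRq support.

```python
import subprocess


def request_blocked_task_dump(trigger_path="/proc/sysrq-trigger"):
    """Ask the kernel to log tasks in uninterruptible sleep (sysrq-w).

    Only the single character "w" is ever written, so this helper
    cannot trigger a reboot or any other destructive sysrq action.
    """
    with open(trigger_path, "w") as handle:
        handle.write("w")


def kernel_log_tail(lines=200):
    """Return the last `lines` lines of the kernel ring buffer via dmesg."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()[-lines:]


if __name__ == "__main__":
    request_blocked_task_dump()
    for line in kernel_log_tail():
        print(line)
```

Since the box becomes unresponsive quickly, it is probably worth starting this before the hang sets in and saving the output somewhere that survives a reboot.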

