Semi-random kernel panics on 3.7.1+201212271953


Re: Semi-random kernel panics on 3.7.1+201212271953

Post by ephox » Wed Jan 09, 2013 6:22 pm

Hi,

Thanks for your help, I found the bug :). I created a temporary fix for it. I will fix it properly in a later version.
You can find it in the next pax/grsec version or here:
http://www.grsecurity.net/~ephox/overfl ... 20130109.c
ephox
 

Re: Semi-random kernel panics on 3.7.1+201212271953

Post by Neokernsec » Wed Jan 09, 2013 11:16 pm

ephox wrote: Hi,

Thanks for your help, I found the bug :). I created a temporary fix for it. I will fix it properly in a later version.

Excellent, ephox!

In the interests of assessing "risk" to some of these features in PaX, would you say the overflow plugin is "high risk" with respect to being something that could cause these sorts of panics? I realise this is a new feature for grsecurity, and also can appreciate how radical it is in the sense that it works with the compiler to perform some additional semantic validation.

I've been a LONG TIME user of grsec patches, and this is the first time I've had *ANY* "stable" version that caused panics of any kind.

Thanks to you, and the whole PaX + grsec team again. I honestly couldn't imagine deploying front-facing servers without MLS/RBAC security in a heavily locked down configuration. To this day, most of my servers don't have any X11 stuff on them, which attracts some chuckling about "dinosaurs with their text mode consoles, etc", but the NOC guys in my datacentre give me respect because I've yet to have a COMSEC "incident", despite some of my customers being some of the most despised orgs in the world.

I'd also like to add that having just done some extensive burn-ins with new hardware (all having 64GB RAM and up), just about ALL of them report memory correction events via EDAC reporting. These are SDDC x4 ECC Intel chipset systems with proactive memory scrubbing, sometimes every 48 hours. These are corrected errors, and thus "harmless", but it raises the question: if a single bit within a 64GB system randomly flips within 24-48 hours, how many flipped bits will there be after a year's uptime?
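As a rough back-of-the-envelope sketch of that question, assuming the observed rate of one corrected single-bit event every 24-48 hours simply stays constant for a year (an assumption; real rates vary with hardware and environment, and the function name is only illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # ignoring leap years


def corrected_errors_per_year(hours_between_errors):
    """Expected number of corrected events in one year at a constant rate."""
    return HOURS_PER_YEAR / hours_between_errors


low = corrected_errors_per_year(48)   # one event every 48 hours
high = corrected_errors_per_year(24)  # one event every 24 hours
print(f"roughly {low:.1f} to {high:.1f} corrected events per year")
```

Under that assumption the count lands in the low hundreds per year. Because each event is corrected, and scrubbing rewrites the repaired data, these are events seen over the year rather than bits left flipped at the end of it.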

MLS/RBAC, SDDC, etc, are not silver bullets, but simply components to reduce risk and hopefully maximise reliability and data coherence. If only the ZFS licence got detoxified, the whole system would reflect a design philosophy that anticipates (and almost expects) data loss/corruption, and takes appropriate countermeasures and recovers from it accordingly.

Edit 1: Yes, it looks like you've nailed this bug at last!
Neokernsec
 

Re: Semi-random kernel panics on 3.7.1+201212271953

Post by PaX Team » Fri Jan 18, 2013 8:05 am

Neokernsec wrote: In the interests of assessing "risk" to some of these features in PaX, would you say the overflow plugin is "high risk" with respect to being something that could cause these sorts of panics? I realise this is a new feature for grsecurity, and also can appreciate how radical it is in the sense that it works with the compiler to perform some additional semantic validation.

note that we've always included experimental or in-progress features in PaX and grsec, and we usually say so when we announce them (in this case, on the mailing list and the blog). as for the risk of false positives, they'll be there for some time because finding and eliminating them is not easy. this is because by the time the plugin gets access to the internal representation of the code, it is already too late to tell intentional overflows (those introduced by gcc itself due to canonicalization of expressions) from accidental ones. if the gcc plugin system gave access to the language frontends, things could be much improved, but that's not the case today.
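A minimal sketch of the kind of intentional overflow described here, assuming the commonly seen gcc canonicalization of an unsigned `x - 1` into `x + 0xFFFFFFFF`. It uses Python with a 32-bit mask to model unsigned wraparound; the function names are hypothetical and this is not the plugin's actual code.

```python
MASK32 = 0xFFFFFFFF  # models a 32-bit unsigned int


def sub_one_as_written(x):
    """x - 1 as the programmer wrote it, truncated to 32 bits."""
    return (x - 1) & MASK32


def sub_one_canonicalized(x):
    """x + 0xFFFFFFFF, the form the compiler may rewrite x - 1 into.

    Returns the 32-bit result and whether the addition wrapped when
    checked in a wider type, as a naive overflow check would see it.
    """
    wide = x + MASK32  # no truncation, like recomputing in a wider type
    return wide & MASK32, wide > MASK32


for x in (5, 1, 0):
    result, wrapped = sub_one_canonicalized(x)
    assert result == sub_one_as_written(x)  # both forms agree on the value
    print(x, hex(result), "wrapped" if wrapped else "no wrap")
```

In this model the harmless `5 - 1` wraps in its rewritten form, while `0 - 1`, the case a programmer might actually care about, does not. A check that only sees the rewritten addition cannot tell which wrap was intended, which is the difficulty with false positives described above.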
Neokernsec wrote: I've been a LONG TIME user of grsec patches, and this is the first time I've had *ANY* "stable" version that caused panics of any kind.

the panic was a side effect of trying to kill the offending process that triggered the (false positive) size overflow check, which in this case happened in irq context.
PaX Team
 
