application crashed at _start()

Posted: Tue Mar 01, 2011 2:46 am
by lgskyxp
I'm using WindRiver2-SP4 (2.6.21) with the grsecurity patch grsecurity-2.1.10-2.6.21.5-200706182032.patch, on an Intel Core2 L7400.

We found that during system start-up, sometimes (not always, approximately 8% of the time), an application receives SIGSEGV. The log message looks like this:
-----------------------------
Feb 5 23:22:06 localhost kernel: grsec: signal 11 sent to /sbin/modprobe[modprobe:2074] uid/euid:0/0 gid/egid:0/0, parent /lib/udev/modprobe[modprobe:2072] uid/euid:0/0 gid/egid:0/0
-----------------------------
The affected applications are random: so far I have seen at least /lib/udev/scsi_id, /sbin/modprobe, /usr/sbin/crond, arping, xinetd, udevd and /sbin/pam_console_apply, so I don't think the problem is in the applications themselves.
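
To back up the "random applications" observation with numbers, the grsec messages can be tallied per executable. The following is a minimal Python sketch, not part of the original report. The regular expression is modeled only on the single log line quoted above, so other grsecurity versions may need a different pattern, and `count_signals` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern modeled on the one log line quoted in the post (assumption:
# other grsecurity versions may format this message differently).
GRSEC_SIGNAL = re.compile(
    r"grsec: signal (?P<sig>\d+) sent to "
    r"(?P<path>\S+)\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)

def count_signals(lines, signal=11):
    """Count how often each executable received the given signal."""
    counts = Counter()
    for line in lines:
        match = GRSEC_SIGNAL.search(line)
        if match and int(match.group("sig")) == signal:
            counts[match.group("path")] += 1
    return counts

# The sample is the log line from the post; in practice, pass the lines
# of the syslog file instead.
sample = [
    "Feb 5 23:22:06 localhost kernel: grsec: signal 11 sent to "
    "/sbin/modprobe[modprobe:2074] uid/euid:0/0 gid/egid:0/0, "
    "parent /lib/udev/modprobe[modprobe:2072] uid/euid:0/0 gid/egid:0/0",
]

for path, hits in count_signals(sample).most_common():
    print(path, hits)
```

Only the process that received the signal is counted. The parent named later in the same line is ignored because the pattern requires the "sent to" prefix.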

After checking the core dumps, I see that EIP always points to _start. One example is below:
-------------------------------
gdb /sbin/udevtrigger -c core.udevtrigger
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-23.el5)
Reading symbols from /sbin/udevtrigger...(no debugging symbols found)...done.
warning: exec file is newer than core file.
Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug/lib/ld-2.5.so.debug...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `/sbin/udevtrigger'.
Program terminated with signal 11, Segmentation fault.
#0 0x03780810 in _start () from /lib/ld-linux.so.2

(gdb) info reg
eax 0x0 0
ecx 0x0 0
edx 0x0 0
ebx 0x0 0
esp 0xbe8bbab0 0xbe8bbab0
ebp 0x0 0x0
esi 0x0 0
edi 0x0 0
eip 0x3780810 0x3780810 <_start>
eflags 0x212 [ AF IF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x0 0
-------------------------------
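
The cs and ss values in the register dump are x86 segment selectors, and splitting them into their fields shows which descriptors the crashed process was using. The sketch below is a small Python illustration of the standard selector layout (bits 0-1 RPL, bit 2 table indicator, bits 3-15 index). `decode_selector` is a hypothetical helper, and reading the result as the default user code and data segments is my understanding of the i386 Linux GDT layout rather than something stated in the thread.

```python
def decode_selector(selector):
    """Split an x86 segment selector into (index, table, rpl)."""
    rpl = selector & 0x3                        # bits 0-1: requested privilege level
    table = "LDT" if selector & 0x4 else "GDT"  # bit 2: table indicator
    index = selector >> 3                       # bits 3-15: descriptor table index
    return index, table, rpl

# Values taken from the "info reg" output above.
for name, value in (("cs", 0x73), ("ss", 0x7B)):
    index, table, rpl = decode_selector(value)
    print(f"{name}=0x{value:x}: {table} index {index}, RPL {rpl}")
```

Both selectors decode to ring 3 GDT entries (index 14 for cs, index 15 for ss), so the fault was delivered while returning to ordinary user-mode code, which fits the do_iret_error frame in the kernel trace.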

I dumped the kernel stack just before the SIGSEGV was sent to the application. It always looks like this:
-------------------------------
Feb 7 02:55:35 localhost kernel: [<c0153aba>] show_trace_log_lvl+0x1a/0x30
Feb 7 02:55:35 localhost kernel: [<c0154352>] show_trace+0x12/0x20
Feb 7 02:55:35 localhost kernel: [<c01544a6>] dump_stack+0x16/0x20
Feb 7 02:55:35 localhost kernel: [<c018077c>] force_sig_info+0xbc/0xd0
Feb 7 02:55:35 localhost kernel: [<c01541fe>] do_trap+0x5e/0x160
Feb 7 02:55:35 localhost kernel: [<c0155219>] do_iret_error+0x99/0xb0
Feb 7 02:55:35 localhost kernel: [<c047f9c3>] error_code+0x73/0x80
-------------------------------

I don't know why there is an iret error. After changing the kernel configuration, we found that disabling CONFIG_PAX_PAGEEXEC makes the kernel and the applications work well. I thought it might be an NX protection problem, but I can't find any log message like 'PAX: execution attempt in'.
Could you give me some hints on this problem? Is there anything else you need for further analysis?

Thanks

The relevant configuration segment is below:
CONFIG_PAX=y

CONFIG_PAX_SOFTMODE=y
CONFIG_PAX_EI_PAX=y
CONFIG_PAX_PT_PAX_FLAGS=y
# CONFIG_PAX_NO_ACL_FLAGS is not set
# CONFIG_PAX_HAVE_ACL_FLAGS is not set
CONFIG_PAX_HOOK_ACL_FLAGS=y

CONFIG_PAX_NOEXEC=y
CONFIG_PAX_PAGEEXEC=y
# CONFIG_PAX_SEGMEXEC is not set
CONFIG_PAX_EMUTRAMP=y
# CONFIG_PAX_MPROTECT is not set

CONFIG_PAX_ASLR=y
CONFIG_PAX_RANDKSTACK=y
CONFIG_PAX_RANDUSTACK=y
CONFIG_PAX_RANDMMAP=y

CONFIG_PAX_MEMORY_SANITIZE=y
CONFIG_PAX_MEMORY_UDEREF=y
# CONFIG_KEYS is not set
CONFIG_SECURITY=y
CONFIG_SECURITY_NETWORK=y
# CONFIG_SECURITY_NETWORK_XFRM is not set
CONFIG_SECURITY_CAPABILITIES=m
# CONFIG_SECURITY_ROOTPLUG is not set
CONFIG_SECURITY_BSDJAIL=m

Re: application crashed at _start()

Posted: Wed Mar 02, 2011 12:57 pm
by PaX Team
lgskyxp wrote: I'm using WindRiver2-SP4 (2.6.21) with the grsecurity patch grsecurity-2.1.10-2.6.21.5-200706182032.patch, on an Intel Core2 L7400.
first of all, this is a very old and therefore unsupported kernel; you should try something more recent and reproduce the problem there. with that said, the symptoms look familiar and that bug has already been fixed (around 2009; in fact it was a windriver guy who helped me debug it, so i assume they also backported the fix at the time to some of their products, not sure what you're using there). can you tell me if you had PAE enabled? if not, then the old PAGEEXEC method would be used, and that had a bug with the lazy CS limit tracking (compare switch_mm() in your patch with a more recent one, the 'else' branch in particular, and you'll see what changed there since then).
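
The PAE question can be answered from the kernel .config and /proc/cpuinfo. Below is a minimal Python sketch, not something from the thread. It assumes the option is named CONFIG_X86_PAE (on kernels of this age it may only be set indirectly through the 64GB highmem choice) and that the CPU flags are called `pae` and `nx`. The sample inputs are hypothetical and should be replaced with the real files.

```python
def config_enabled(config_text, option):
    """Return True if the kernel config sets `option` to y."""
    wanted = f"{option}=y"
    return any(line.strip() == wanted for line in config_text.splitlines())

def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()

# Hypothetical sample inputs; substitute the contents of the real
# .config and /proc/cpuinfo from the affected machine.
sample_config = (
    "CONFIG_HIGHMEM4G=y\n"
    "# CONFIG_X86_PAE is not set\n"
    "CONFIG_PAX_PAGEEXEC=y\n"
)
sample_cpuinfo = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr pae mce nx lm\n"

flags = cpu_flags(sample_cpuinfo)
print("CPU supports PAE:", "pae" in flags)
print("CPU supports NX:", "nx" in flags)
print("kernel built with PAE:", config_enabled(sample_config, "CONFIG_X86_PAE"))
```

If the CPU reports `pae` and `nx` but the kernel was built without PAE, the hardware NX bit cannot be used on 32-bit x86, which would match the case described above where the old PAGEEXEC method is in effect.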