gradm 1.9.13 gets stuck


Postby niz » Tue Dec 02, 2003 8:34 am

At first, gradm worked fine a couple of times. I was testing ACLs and they worked OK. After that I shut it down, and when I started it again later it got stuck.

No matter what I try to do with gradm now, it gets stuck.
The stuck gradm processes can't be killed.

Every once in a while the following appears in kern.log:

Code: Select all
Dec  2 10:24:54 xxxx kernel: Unable to handle kernel paging request at virtual address 69afac52
Dec  2 10:24:54 xxxx kernel: printing eip:
Dec  2 10:24:54 xxxx kernel: c02c03b2
Dec  2 10:24:54 xxxx kernel: *pde = 00000000
Dec  2 10:24:54 xxxx kernel: Oops: 0000
Dec  2 10:24:54 xxxx kernel: CPU:    0
Dec  2 10:24:54 xxxx kernel: EIP:    0010:[<c02c03b2>]    Not tainted
Dec  2 10:24:54 xxxx kernel: EFLAGS: 00010206
Dec  2 10:24:54 xxxx kernel: eax: 69afa1ca   ebx: 00000000   ecx: db6a4d80   edx: 000002a4
Dec  2 10:24:54 xxxx kernel: esi: c014e524   edi: c014e51c   ebp: c014e510   esp: cf17be4c
Dec  2 10:24:54 xxxx kernel: ds: 0018   es: 0018   ss: 0018
Dec  2 10:24:54 xxxx kernel: Process gradm (pid: 26391, stackpage=cf17b000)
Dec  2 10:24:54 xxxx kernel: Stack: c02b7737 cf17a000 080cbee8 cf17bf54 cf17be84 c02ba8b7 c0102c78 00000002
Dec  2 10:24:54 xxxx kernel: d3183840 00000001 c0110f80 db6a4da0 00000000 000000d0 080cbef0 00000010
Dec  2 10:24:54 xxxx kernel: 0000001a 000000f8 00000000 00000000 00000000 00000000 00000000 00000000
Dec  2 10:24:54 xxxx kernel: Call Trace:    [<c02b7737>] [<c02ba8b7>] [<c01a9bad>] [<c01a9c00>] [<c01c5c07>]
Dec  2 10:24:54 xxxx kernel: [<c01941a3>]
Dec  2 10:24:54 xxxx kernel:
Dec  2 10:24:54 xxxx kernel: Code: 8b 44 90 f8 85 c0 74 09 50 e8 88 c9 ef ff 83 c4 04 ff 0d 40


My ACL was:

Code: Select all
/ {
        /
#       /opt    rx
        /home   rx
        /mnt    r
        /dev
        /dev/random     r
        /dev/urandom    r
#       /dev/input      rw
#       /dev/psaux      rw
        /dev/tty?       rw
        /dev/null       rw
        /dev/pts        rw
        /dev/ptmx       rw
        /dev/tty        rw
#       /dev/dsp        rw
#       /dev/mixer      rw
        /dev/console    rw
        /dev/mem        h
        /dev/kmem       h
        /dev/port       h
        /dev/zero       rw
        /bin            rx
        /sbin           rx
        /lib            rx
        /usr            rx
        /etc            rx
#       /etc/postfix    r
        /etc/init.d     h
        /etc/shadow-    h
        /etc/shadow     h
        /proc           rwx
        /proc/sys       r
        /proc/kcore     h
        /root           r
        /tmp            rw
        /var            rx
        /var/cache      rw
        /var/spool      rw
#        /var/spool/postfix/lib rx
        /var/run        rw
        /var/tmp        rw
        /var/log
        /boot           r
        /etc/grsec      h
 
        -CAP_ALL
}


I used the following versions:
gradm-1.9.13
grsecurity-1.9.13-2.4.23
linux-2.4.23

I'm not using any kernel patches other than grsec.

Any ideas?
niz
 
Posts: 19
Joined: Mon Sep 09, 2002 6:12 am

debug results

Postby niz » Tue Dec 02, 2003 9:13 am

I debugged gradm a little and found that it gets stuck in gradm_misc.c at line 26:

if ((fd = open(GR_SYSCTL_PATH, O_WRONLY)) < 0) {
        fprintf(stderr, "Could not open %s\n", GR_SYSCTL_PATH);
        failure("open");
}

if (write(fd, buf, len) != len) {       /* !!!! CRASH !!!! */
        switch (errno) {
        case EFAULT:
                fprintf(stderr, "Error copying structures to the "
                        "kernel.\n");
                break;
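
For readers following along, the pattern in the excerpt is: open the control file write-only, hand the whole buffer to a single write(), and report the error if the call fails. A minimal self-contained sketch of that pattern is below. The function name write_control_file and its messages are hypothetical, and the path is a parameter because the value of GR_SYSCTL_PATH is not shown in the excerpt.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of gradm): write buf to path with one
 * write() call. Returns 0 on success, -1 on failure with errno set. */
int write_control_file(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0) {
        int saved = errno;              /* fprintf may clobber errno */
        fprintf(stderr, "Could not open %s: %s\n", path, strerror(saved));
        errno = saved;
        return -1;
    }

    ssize_t n = write(fd, buf, len);
    int saved = (n < 0) ? errno : 0;
    close(fd);

    if (n < 0) {
        fprintf(stderr, "write to %s failed: %s\n", path, strerror(saved));
        errno = saved;
        return -1;
    }
    if ((size_t)n != len) {
        /* The file accepted only part of the buffer. */
        fprintf(stderr, "short write to %s: %zd of %zu bytes\n", path, n, len);
        errno = EIO;
        return -1;
    }
    return 0;
}
```

A check like this only helps when write() returns. In the case described here the process gets stuck inside the call, so the fault has to be diagnosed from the kernel log instead.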

hmm..

Postby niz » Tue Dec 02, 2003 9:34 am

I found that after gradm gets stuck, the kernel stops logging anything.
I think my grsecurity is totally jammed, but not the whole kernel. The next time I can reboot that machine is at night, when nobody is using it.

Grsecurity kernel bug?

Postby niz » Tue Dec 02, 2003 1:03 pm

I'm pretty sure that if this is a bug in grsecurity, it's on the kernel side.
I can't test it much more because it's an important computer and it's working for now.

The next time I can try to debug the kernel on that machine is today (December 2) at 8 pm UTC, if anybody has ideas. But the machine is 100 km away from me (without a watchdog or management processor), so I don't want to crash it completely. On a kernel panic it should reboot (kernel parameter: panic=30).

I updated the kernel and grsecurity because of that kernel bug in older kernels, and I just hoped that it would work.. =)
Maybe I need to use an older kernel and just use TPE or an ACL to restrict users from running their own programs, so the machine would be immune to that kernel bug. But I have one shitty program running under its own user, and it needs to execute its own programs.

If anybody else has this problem, please post something so we can find where the problem might be.

Postby spender » Tue Dec 02, 2003 5:35 pm

Run ksymoops against that oops report and paste the results.

-Brad
spender
 
Posts: 2185
Joined: Wed Feb 20, 2002 8:00 pm

Postby mg » Tue Dec 02, 2003 6:32 pm

Hi,
I noticed this bug too. Here is the oops report.

Unable to handle kernel paging request at virtual address a9b56b26
c0349cd7
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0349cd7>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010206
eax: 00002e73 ebx: 00000000 ecx: a9b4b162 edx: 00000000
esi: 00000077 edi: 00000086 ebp: 000000d0 esp: c2a2fce4
ds: 0018 es: 0018 ss: 0018
Process gradm (pid: 22956, stackpage=c2a2f000)
Stack: 00000086 c033eb85 d6769ec0 00000050 00000077 00000050 c03421d1 c0397020
00000050 00000036 00000086 00000077 c2a2e24a 000059ac 00000000 00000000
c8a1a24a 0000354b 00000000 00000000 c020a530 c15a3780 c011c800 d6769e80
Call Trace: [<c033eb85>] [<c03421d1>] [<c020a530>] [<c033f4c1>] [<c020a530>]
[<c033f4c1>] [<c034000d>] [<c034000d>] [<c020d139>] [<c034b87f>] [<c01d2855>]
[<c01d28ff>] [<c01f3043>] [<c01b56bd>] [<c01b56a3>]
Code: 8b 54 81 f8 85 d2 75 53 48 ba 01 00 00 00 a3 34 4a 11 c0 85


>>EIP; c0349cd7 <acl_free_all+27/a0> <=====

>>eax; 00002e73 Before first symbol
>>ecx; a9b4b162 Before first symbol
>>esp; c2a2fce4 <_end+26851a4/2051b520>

Trace; c033eb85 <free_variables+95/1c0>
Trace; c03421d1 <gr_proc_handler+9e1/29a0>
Trace; c020a530 <dput+30/190>
Trace; c033f4c1 <chk_obj_label+1c1/5f0>
Trace; c020a530 <dput+30/190>
Trace; c033f4c1 <chk_obj_label+1c1/5f0>
Trace; c034000d <gr_search_file+4d/110>
Trace; c034000d <gr_search_file+4d/110>
Trace; c020d139 <iget4+109/110>
Trace; c034b87f <gr_acl_handle_open+6f/7c0>
Trace; c01d2855 <do_rw_proc+b5/f0>
Trace; c01d28ff <proc_writesys+2f/40>
Trace; c01f3043 <sys_write+a3/160>
Trace; c01b56bd <system_call+4d/50>
Trace; c01b56a3 <system_call+33/50>

Code; c0349cd7 <acl_free_all+27/a0>
00000000 <_EIP>:
Code; c0349cd7 <acl_free_all+27/a0> <=====
0: 8b 54 81 f8 mov 0xfffffff8(%ecx,%eax,4),%edx <=====
Code; c0349cdb <acl_free_all+2b/a0>
4: 85 d2 test %edx,%edx
Code; c0349cdd <acl_free_all+2d/a0>
6: 75 53 jne 5b <_EIP+0x5b>
Code; c0349cdf <acl_free_all+2f/a0>
8: 48 dec %eax
Code; c0349ce0 <acl_free_all+30/a0>
9: ba 01 00 00 00 mov $0x1,%edx
Code; c0349ce5 <acl_free_all+35/a0>
e: a3 34 4a 11 c0 mov %eax,0xc0114a34
Code; c0349cea <acl_free_all+3a/a0>
13: 85 00 test %eax,(%eax)
mg
 
Posts: 1
Joined: Tue Dec 02, 2003 6:31 pm

ksymoops

Postby niz » Tue Dec 02, 2003 6:35 pm

OK, I ran ksymoops and this is the result:
Code: Select all
xxxx:/# tail -27 /var/log/kern.log|head -19|cut -d ' ' -f 7-|ksymoops --no-ksyms --no-lsmod --no-object --system-map=/usr/src/linux-2.4.23/System.map
ksymoops 2.4.5 on i686 2.4.23-grsec.  Options used
     -V (default)
     -K (specified)
     -L (specified)
     -O (specified)
     -m /usr/src/linux-2.4.23/System.map (specified)

Unable to handle kernel paging request at virtual address 69afac52
c02c03b2
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c02c03b2>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010206
eax: 69afa1ca   ebx: 00000000   ecx: db6a4d80   edx: 000002a4
esi: c014e524   edi: c014e51c   ebp: c014e510   esp: cf17be4c
ds: 0018   es: 0018   ss: 0018
Process gradm (pid: 26391, stackpage=cf17b000)
Stack: c02b7737 cf17a000 080cbee8 cf17bf54 cf17be84 c02ba8b7 c0102c78 00000002
d3183840 00000001 c0110f80 db6a4da0 00000000 000000d0 080cbef0 00000010
0000001a 000000f8 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace:    [<c02b7737>] [<c02ba8b7>] [<c01a9bad>] [<c01a9c00>] [<c01c5c07>]
[<c01941a3>]
Code: 8b 44 90 f8 85 c0 74 09 50 e8 88 c9 ef ff 83 c4 04 ff 0d 40


>>EIP; c02c03b2 <acl_free_all+22/8c>   <=====

>>eax; 69afa1ca Before first symbol
>>ecx; db6a4d80 <END_OF_CODE+1b399f98/????>
>>esi; c014e524 <inodev_set+0/8>
>>edi; c014e51c <name_set+0/8>
>>ebp; c014e510 <acl_subj_set+0/8>
>>esp; cf17be4c <END_OF_CODE+ee71064/????>

Trace; c02b7737 <free_variables+bb/188>
Trace; c02ba8b7 <gr_proc_handler+a03/1c50>
Trace; c01a9bad <do_rw_proc+a1/b4>
Trace; c01a9c00 <proc_writesys+1c/24>
Trace; c01c5c07 <sys_write+8f/100>
Trace; c01941a3 <system_call+33/40>

Code;  c02c03b2 <acl_free_all+22/8c>
00000000 <_EIP>:
Code;  c02c03b2 <acl_free_all+22/8c>   <=====
   0:   8b 44 90 f8               mov    0xfffffff8(%eax,%edx,4),%eax   <=====
Code;  c02c03b6 <acl_free_all+26/8c>
   4:   85 c0                     test   %eax,%eax
Code;  c02c03b8 <acl_free_all+28/8c>
   6:   74 09                     je     11 <_EIP+0x11>
Code;  c02c03ba <acl_free_all+2a/8c>
   8:   50                        push   %eax
Code;  c02c03bb <acl_free_all+2b/8c>
   9:   e8 88 c9 ef ff            call   ffefc996 <_EIP+0xffefc996>
Code;  c02c03c0 <acl_free_all+30/8c>
   e:   83 c4 04                  add    $0x4,%esp
Code;  c02c03c3 <acl_free_all+33/8c>
  11:   ff 0d 40 00 00 00         decl   0x40
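
As a cross-check on the decoded oopses so far, the faulting instruction and the register dump can be tied to the reported fault address. The faulting load uses an operand of the form disp(base,index,4) with disp = 0xfffffff8 (that is, -8), so the address read is base + 4*index - 8. The sketch below does that arithmetic; the function name effective_address is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of the AT&T-syntax operand disp(base,index,scale)
 * on 32-bit x86. The sum wraps modulo 2^32, as it does on the CPU. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    /* x86 only encodes these four scale factors. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    /* Casting a negative disp to uint32_t yields its two's-complement
     * form, e.g. -8 becomes 0xfffffff8 as printed in the disassembly. */
    return base + index * scale + (uint32_t)disp;
}
```

With the registers from the oops directly above (eax = 69afa1ca as base, edx = 000002a4 as index) this gives 69afac52, and with the registers from mg's oops (ecx = a9b4b162 as base, eax = 00002e73 as index) it gives a9b56b26. Both match the reported fault addresses. In both cases ksymoops labels the base register "Before first symbol", which suggests the base pointer itself is bad rather than the index.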

In addition, I got one (weird) error message a few hours after that oops:
Code: Select all
Dec  2 19:04:21 xxxx kernel:      /# .* is not set.*/ { printf("%s=n\n", $2) }
Dec  2 19:04:21 xxxx kernel:    ! /# .*  ) by (sh:6344) UID(0) EUID(0), parent (sh:1680) UID(0) EUID(0)
Dec  2 19:04:21 xxxx kernel: BEGIN {
Dec  2 19:04:21 xxxx kernel:    menu_no = 0
Dec  2 19:04:21 xxxx kernel:    comment_is_option = 0
Dec  2 19:04:21 xxxx kernel:    parser("arch ) by (sh:12325) UID(0) EUID(0), parent (sh:1680) UID(0) EUID(0)
Dec  2 19:04:37 xxxx kernel: ) by (sh:22599) UID(0) EUID(0), parent (sh:208) UID(0) EUID(0)
Dec  2 19:14:13 xxxx kernel: ) by (sh:684) UID(0) EUID(0), parent (sh:13441) UID(0) EUID(0)

Postby spender » Tue Dec 02, 2003 7:21 pm

Can you remove all occurrences of __inline__ in grsecurity/gracl_alloc.c, recompile, and show the ksymoops output of a crash on the new kernel? I'd like to know the exact function it's crashing in.

-Brad

Postby niz » Tue Dec 02, 2003 7:51 pm

OK, I recompiled the kernel, but now I can't reproduce that oops.
It may take some time.

Postby niz » Wed Dec 03, 2003 6:19 pm

Here is the oops again, this time from the kernel built without __inline__ in gracl_alloc.c:
Code: Select all
ksymoops 2.4.5 on i686 2.4.23-grsec.  Options used
     -V (default)
     -K (specified)
     -L (specified)
     -O (specified)
     -m /usr/src/linux-2.4.23/System.map (specified)

Unable to handle kernel paging request at virtual address 1b04498a
c02c0355
*pde = 00000000
Oops: 0000
CPU:    1
EIP:    0010:[<c02c0355>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 1b043e72   ebx: 00000000   ecx: da656780   edx: 000002c8
esi: c014e524   edi: c014e51c   ebp: c014e510   esp: d79f1e48
ds: 0018   es: 0018   ss: 0018
Process gradm (pid: 31323, stackpage=d79f1000)
Stack: c02c03f5 c02b7737 d79f0000 080d72f8 d79f1f54 d79f1e84 c02ba8b7 c0102c78
00000002 ded8bbc0 00000001 c0110f80 da656760 00000000 000000d0 080d7300
00000011 0000001c 00000105 00000000 00000000 00000000 00000000 00000000
Call Trace:    [<c02c03f5>] [<c02b7737>] [<c02ba8b7>] [<c01a9bad>] [<c01a9c00>]
[<c01c5c07>] [<c01941a3>]
Code: 8b 44 90 f8 85 c0 74 09 50 e8 e5 c9 ef ff 83 c4 04 ff 0d 40


>>EIP; c02c0355 <alloc_pop+15/34>   <=====

>>eax; 1b043e72 Before first symbol
>>ecx; da656780 <END_OF_CODE+1a34b998/????>
>>esi; c014e524 <inodev_set+0/8>
>>edi; c014e51c <name_set+0/8>
>>ebp; c014e510 <acl_subj_set+0/8>
>>esp; d79f1e48 <END_OF_CODE+176e7060/????>

Trace; c02c03f5 <acl_free_all+1d/70>
Trace; c02b7737 <free_variables+bb/188>
Trace; c02ba8b7 <gr_proc_handler+a03/1c50>
Trace; c01a9bad <do_rw_proc+a1/b4>
Trace; c01a9c00 <proc_writesys+1c/24>
Trace; c01c5c07 <sys_write+8f/100>
Trace; c01941a3 <system_call+33/40>

Code;  c02c0355 <alloc_pop+15/34>
00000000 <_EIP>:
Code;  c02c0355 <alloc_pop+15/34>   <=====
   0:   8b 44 90 f8               mov    0xfffffff8(%eax,%edx,4),%eax   <=====
Code;  c02c0359 <alloc_pop+19/34>
   4:   85 c0                     test   %eax,%eax
Code;  c02c035b <alloc_pop+1b/34>
   6:   74 09                     je     11 <_EIP+0x11>
Code;  c02c035d <alloc_pop+1d/34>
   8:   50                        push   %eax
Code;  c02c035e <alloc_pop+1e/34>
   9:   e8 e5 c9 ef ff            call   ffefc9f3 <_EIP+0xffefc9f3>
Code;  c02c0363 <alloc_pop+23/34>
   e:   83 c4 04                  add    $0x4,%esp
Code;  c02c0366 <alloc_pop+26/34>
  11:   ff 0d 40 00 00 00         decl   0x40

I got that oops when running 'gradm -R'.
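
With __inline__ removed, the fault lands in alloc_pop, called from acl_free_all. The decoded instructions read a 4-byte slot at base[index - 2], test it for zero, call a function on it if it is non-zero, and then decrement a value in memory. A hypothetical user-space sketch of a pointer stack with that shape follows. The struct, the function names, the use of free(), and the convention that an index of 1 means "empty" are all assumptions for illustration, not the grsecurity source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stack of allocated pointers. */
struct ptr_stack {
    void **slots;        /* base pointer (eax in the oops above) */
    unsigned long next;  /* index (edx in the oops above); 1 = empty */
    unsigned long cap;   /* number of slots allocated */
};

int ptr_stack_init(struct ptr_stack *s, unsigned long cap)
{
    assert(s != NULL);
    s->slots = calloc(cap, sizeof *s->slots);
    s->next = 1;
    s->cap = cap;
    return s->slots ? 0 : -1;
}

int ptr_stack_push(struct ptr_stack *s, void *p)
{
    if (s->slots == NULL || s->next > s->cap)
        return -1;
    s->slots[s->next - 1] = p;
    s->next++;
    return 0;
}

/* Pop one entry. The read of slots[next - 2] corresponds to the
 * load that faulted in the oops. Returns 1 if an entry was popped. */
int ptr_stack_pop(struct ptr_stack *s)
{
    if (s->slots == NULL || s->next == 1)
        return 0;
    void *p = s->slots[s->next - 2];
    if (p != NULL)
        free(p);
    s->next--;
    return 1;
}

/* Free every entry, then the slot array. Returns the number popped. */
unsigned long ptr_stack_free_all(struct ptr_stack *s)
{
    unsigned long n = 0;
    while (ptr_stack_pop(s))
        n++;
    free(s->slots);
    s->slots = NULL;     /* a repeated call now does nothing */
    s->next = 1;
    return n;
}
```

If slots held a stale or garbage value when the pop ran, the read of slots[next - 2] would fault the way the oops shows. The thread does not establish how the pointer became invalid; clearing it after cleanup, as above, is just one defensive habit and not necessarily the fix that was applied.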

Postby devastor » Sat Dec 06, 2003 3:17 pm

Any news about this?
We're having a maintenance break here tomorrow morning and the idea was
to upgrade to 1.9.13 at the same time, but now this concerns me a lot..
I don't want to upgrade if there's a bug that will oops the kernel..
devastor
 
Posts: 41
Joined: Fri Oct 11, 2002 5:07 pm

Postby niz » Sat Dec 06, 2003 4:44 pm

The problem is still there, but I don't know whether the memleak fix in CVS would fix it, or whether the problem is SMP-specific.

Postby devastor » Sun Dec 07, 2003 12:44 pm

yep yep,
It seems to work fine on a single-CPU system.
However, the current CVS version, which I also tried, oopses the kernel immediately when
running gradm -E ;)

Postby spender » Sun Dec 14, 2003 10:01 pm

Can you try either current cvs of grsecurity or the grsecurity 2.0-rc4 patch from here:

http://grsecurity.net/~spender/

I believe I've resolved this issue.

-Brad

Postby niz » Sat Jan 17, 2004 9:06 pm

I think this problem is gone. I have been using the new grsecurity for a long time now and no problems have appeared.

