Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1


Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby zImage » Mon Mar 08, 2010 6:22 am

I think we have same thing here:

  • 2.6.32.2-grsec + mysql 5.0.{82,90} = crash
  • 2.6.32.9-grsec + mysql 5.0.{82,90} = crash
  • 2.6.30.2-grsec + mysql 5.0.{82,90} = ok

MySQL dies mostly with signal 11, often with signal 6, and occasionally with signal 8, or it just hangs and we have to kill it (-9). We are going to downgrade to 2.6.30 for the time being and hope for a resolution in 2.6.33.
zImage
 
Posts: 10
Joined: Mon Mar 27, 2006 10:44 am

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby dabetz » Tue Mar 09, 2010 6:57 am

Hello again,

so here is another debugging session in gdb with a little bit more debugging info from mysqld

[New Thread 0xad9aeb90 (LWP 14240)]
[New Thread 0xad7c4b90 (LWP 14241)]
[Thread 0xac47db90 (LWP 14232) exited]
[Thread 0xa9aa8b90 (LWP 14235) exited]
[Thread 0xa9fbab90 (LWP 14233) exited]
[Thread 0xad97db90 (LWP 14239) exited]
[New Thread 0xad97db90 (LWP 14246)]
[New Thread 0xa9fbab90 (LWP 14247)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xad7c4b90 (LWP 14241)]
0xb76116eb in strlen () from /lib/libc.so.6
(gdb) bt full
#0 0xb76116eb in strlen () from /lib/libc.so.6
No symbol table info available.
#1 0xb75e08c4 in vfprintf () from /lib/libc.so.6
No symbol table info available.
#2 0xb75e1810 in ?? () from /lib/libc.so.6
No symbol table info available.
#3 0xb75dcc76 in vfprintf () from /lib/libc.so.6
No symbol table info available.
#4 0xb75e66ff in fprintf () from /lib/libc.so.6
No symbol table info available.
#5 0x0871d7ef in _checkchunk (irem=0xac352770, filename=0x87f0c70 "sql_class.cc", lineno=2963) at safemalloc.c:472
flag = 1
magicp = 0xad7c41cc "¥¥¥¥"
data = 0xac352788 '¥' <repeats 24 times>, "h4z\025M"
#6 0x0871d9a6 in _sanity (filename=0x87f0c70 "sql_class.cc", lineno=2963) at safemalloc.c:515
irem = (struct st_irem *) 0xac352770
flag = 0
count = 1124
#7 0x0871d360 in _myfree (ptr=0x900e920, filename=0x87f0c70 "sql_class.cc", lineno=2963, myflags=0) at safemalloc.c:265
irem = (struct st_irem *) 0x8fa1570
_db_func_ = 0x87f0fa0 "~THD()"
_db_file_ = 0x87f0c70 "sql_class.cc"
_db_level_ = 4
_db_framep_ = (char **) 0xad7c4288
#8 0x082a4bb7 in Security_context::destroy (this=0xac35316c) at sql_class.cc:2963
No locals.
#9 0x0829ea57 in ~THD (this=0xac352838) at sql_class.cc:1100
_db_func_ = 0x87f66c9 "unlink_thd"
_db_file_ = 0x87f5ea5 "mysqld.cc"
_db_level_ = 3
_db_framep_ = (char **) 0xad7c4b90
dbug_violation_helper = {_entered = true}
#10 0x082b7bae in unlink_thd (thd=0xac352838) at mysqld.cc:1883
_db_func_ = 0x87f6776 "one_thread_per_connection_end"
_db_file_ = 0x87f5ea5 "mysqld.cc"
_db_level_ = 2
_db_framep_ = (char **) 0xb7833c0c
dbug_violation_helper = {_entered = true}
#11 0x082b7dd6 in one_thread_per_connection_end (thd=0xac352838, put_in_cache=true) at mysqld.cc:1962
_db_func_ = 0x8916d8d "?func"
_db_file_ = 0x8916d93 "?file"
_db_level_ = 1
_db_framep_ = (char **) 0x26e
dbug_violation_helper = {_entered = true}
#12 0x082c6d25 in handle_one_connection (arg=0xac352838) at sql_connect.cc:1738
net = (NET *) 0xac3528b4
create_user = true
thd = (class THD *) 0xac352838
#13 0xb783116f in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#14 0xb766dc0e in clone () from /lib/libc.so.6
No symbol table info available.
(gdb) x/8i $pc
0xb76116eb <strlen+11>: cmp %ch,(%eax)
0xb76116ed <strlen+13>: je 0xb761178a <strlen+170>
0xb76116f3 <strlen+19>: inc %eax
0xb76116f4 <strlen+20>: xor $0x3,%ecx
0xb76116f7 <strlen+23>: je 0xb7611713 <strlen+51>
0xb76116f9 <strlen+25>: cmp %ch,(%eax)
0xb76116fb <strlen+27>: je 0xb761178a <strlen+170>
0xb7611701 <strlen+33>: add $0x1,%eax
(gdb) x/8x $sp
0xad7c147c: 0xb75e08c4 0xa5a5a5a5 0x0890e080 0x0000001b
0xad7c148c: 0xb7835b35 0x0000000d 0x00000000 0xad7c14a8
(gdb) info reg
eax 0xa5a5a5a5 -1515870811
ecx 0x1 1
edx 0xad7c41cc -1384365620
ebx 0xb76ddff4 -1217536012
esp 0xad7c147c 0xad7c147c
ebp 0xad7c1a98 0xad7c1a98
esi 0xa5a5a5a5 -1515870811
edi 0xad7c41c8 -1384365624
eip 0xb76116eb 0xb76116eb <strlen+11>
eflags 0x10202 [ IF RF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51


#5 0x0871d7ef in _checkchunk (irem=0xac352770, filename=0x87f0c70 "sql_class.cc", lineno=2963) at safemalloc.c:472
static int _checkchunk(register struct st_irem *irem, const char *filename,
                       uint lineno)
{
  int flag=0;
  char *magicp, *data;

  data= (((char*) irem) + ALIGN_SIZE(sizeof(struct st_irem)) +
         sf_malloc_prehunc);
  /* Check for a possible underrun */
  if (*((uint32*) (data- sizeof(uint32))) != MAGICKEY)
  {
    fprintf(stderr, "Error: Memory allocated at %s:%d was underrun,",
            irem->filename, irem->linenum);

    fprintf(stderr, " discovered at %s:%d\n", filename, lineno);
    (void) fflush(stderr);
    DBUG_PRINT("safe",("Underrun at %p, allocated at %s:%d",
                       data, irem->filename, irem->linenum));
    flag=1;
  }
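For readers who do not know safemalloc, here is a minimal standalone sketch of the guard-word idea that the _checkchunk excerpt above relies on. It is not MySQL's code: the names guarded_alloc, guard_is_damaged, guarded_free, struct chunk_header and the GUARD_WORD value are all hypothetical, and the layout is simplified.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical marker value; this is NOT MySQL's MAGICKEY. */
#define GUARD_WORD 0x12345678u

/* Hypothetical bookkeeping header stored in front of every allocation. */
struct chunk_header {
    const char *filename;   /* source file that requested the memory */
    unsigned    lineno;     /* source line that requested the memory */
};

/* Header plus guard word, rounded up so the user data stays 16-byte aligned. */
#define PREFIX_SIZE \
    ((sizeof(struct chunk_header) + sizeof(uint32_t) + 15u) & ~(size_t)15u)

/* Allocate size bytes and place a guard word directly in front of them. */
void *guarded_alloc(size_t size, const char *filename, unsigned lineno)
{
    unsigned char *raw = malloc(PREFIX_SIZE + size);
    struct chunk_header *hdr;
    uint32_t guard = GUARD_WORD;

    if (raw == NULL)
        return NULL;
    hdr = (struct chunk_header *)raw;
    hdr->filename = filename;
    hdr->lineno = lineno;
    /* The guard occupies the last 4 bytes before the user data. */
    memcpy(raw + PREFIX_SIZE - sizeof guard, &guard, sizeof guard);
    return raw + PREFIX_SIZE;
}

/* Return 1 if the guard word in front of ptr was overwritten, else 0. */
int guard_is_damaged(const void *ptr)
{
    uint32_t guard;

    memcpy(&guard, (const unsigned char *)ptr - sizeof guard, sizeof guard);
    return guard != GUARD_WORD;
}

/* Release a block obtained from guarded_alloc. */
void guarded_free(void *ptr)
{
    if (ptr != NULL)
        free((unsigned char *)ptr - PREFIX_SIZE);
}
```

In the backtrace the check itself fired correctly; the crash happened while printing the report, because the filename pointer stored in the bookkeeping header had been overwritten as well.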


#7 0x0871d360 in _myfree (ptr=0x900e920, filename=0x87f0c70 "sql_class.cc", lineno=2963, myflags=0) at safemalloc.c:265
void _myfree(void *ptr, const char *filename, uint lineno, myf myflags)
{
  struct st_irem *irem;
  DBUG_ENTER("_myfree");
  DBUG_PRINT("enter",("ptr: %p", ptr));

  if (!sf_malloc_quick)
    (void) _sanity (filename, lineno);

  if ((!ptr && (myflags & MY_ALLOW_ZERO_PTR)) ||
      check_ptr("Freeing",(uchar*) ptr,filename,lineno))
    DBUG_VOID_RETURN;

  /* Calculate the address of the remember structure */
  irem= (struct st_irem *) ((char*) ptr- ALIGN_SIZE(sizeof(struct st_irem))-
                            sf_malloc_prehunc);
  /*
    Check to make sure that we have a real remember structure.
    Note: this test could fail for four reasons:
    (1) The memory was already free'ed
    (2) The memory was never new'ed
    (3) There was an underrun
    (4) A stray pointer hit this location
  */



The problem occurs on 2.6.32.9-grsec with mysql-5.1.44.
It seems that SANITIZE fails.

Could you give me some hints on how to do better debugging for you?

Greetings,
Daniel
dabetz
 
Posts: 22
Joined: Mon Nov 09, 2009 3:38 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby PaX Team » Tue Mar 09, 2010 8:53 am

dabetz wrote:so here is another debugging session in gdb with a little bit more debugging info from mysqld
thanks for the info, i've got a few more thoughts/questions on this.

1. you say SANITIZE fails, but i thought we'd established already that the problem occurred without it as well. is that still the case?
2. did you enable SL*B debugging in your kernel? (it uses 0xa5 as a poison value, although if that ended up in userland, we have a much bigger problem here)
3. alternatively, does mysql use 0xa5 for memory poisoning somewhere?
4. it seems that the 'irem' structure was overwritten with 0xa5 (you could try to dump it fully to be sure), that's why irem->filename was a garbage pointer.
5. i just got an idea looking at the 30->31 changes. take a look at arch/x86/include/asm/uaccess.h where i change access_ok and introduce __access_ok. can you revert back to the old access_ok like this (while keeping __access_ok of course):
Code:
#define access_ok(type, addr, size) (likely(__range_not_ok(addr, size) == 0))
and see if it changes anything? if it does, then the most likely reason is that mysql or the kernel has a very subtle race somewhere, we'll see if we can debug it.
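For context, the old access_ok quoted above is a pure address-range check: it does not touch the memory it validates. A simplified userspace model of that kind of check might look like the sketch below. The function names and the limit parameter are hypothetical, and the real __range_not_ok on 32-bit x86 is implemented differently (in inline assembly against the current address limit).

```c
#include <stdint.h>

/*
 * Simplified model of a user-range check on a 32-bit address space.
 * Returns nonzero when [addr, addr + size) is NOT acceptable, i.e. when
 * the end address wraps around or lies beyond limit.
 */
int range_not_ok_model(uint32_t addr, uint32_t size, uint32_t limit)
{
    uint32_t end = addr + size;   /* unsigned arithmetic wraps modulo 2^32 */

    if (end < addr)               /* wrapped past the top of the address space */
        return 1;
    return end > limit;
}

/* Mirrors the shape of the quoted macro: the range is ok when the check is 0. */
int access_ok_model(uint32_t addr, uint32_t size, uint32_t limit)
{
    return range_not_ok_model(addr, size, limit) == 0;
}
```

With an assumed limit of 0xC0000000, access_ok_model(0xad7c41cc, 4, 0xC0000000) accepts the address, while a range that wraps past 0xFFFFFFFF is rejected. Because nothing is read or written, a check of this form cannot interfere with a concurrent userland thread.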
PaX Team
 
Posts: 2310
Joined: Mon Mar 18, 2002 4:35 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby PaX Team » Tue Mar 09, 2010 8:55 am

zImage wrote:MySQL dies mostly with signal 11, often with signal 6, and occasionally with signal 8, or it just hangs and we have to kill it (-9). We are going to downgrade to 2.6.30 for the time being and hope for a resolution in 2.6.33.
can you do the same experiment as i've just suggested to Daniel please (and answer some of those questions too)? also, how easy is it for you to reproduce the problem? if someone could give me something easy to debug here locally or perhaps remote access to a box where i can play with this myself, it'd speed up debugging a lot ;).
PaX Team
 
Posts: 2310
Joined: Mon Mar 18, 2002 4:35 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby dabetz » Tue Mar 09, 2010 11:21 am

PaX Team wrote:1. you say SANITIZE fails, but i thought we'd established already that the problem occurred without it as well. is that still the case?

Compiled with --debug=full, mysql uses SAFEMALLOC and its sanity check.

PaX Team wrote:2. did you enable SL*B debugging in your kernel? (it uses 0xa5 as a poison value, although if that ended up in userland, we have a much bigger problem here)

Do you mean spinlock debugging? No, that is off in the kernel.

PaX Team wrote:3. alternatively, does mysql use 0xa5 for memory poisoning somewhere?

Good question. Next one, please. :-) Sorry, I don't know.

PaX Team wrote:4. it seems that the 'irem' structure was overwritten with 0xa5 (you could try to dump it fully to be sure), that's why irem->filename was a garbage pointer.

How can I dump it fully?

PaX Team wrote:5. i just got an idea looking at the 30->31 changes. take a look at arch/x86/include/asm/uaccess.h where i change access_ok and introduce __access_ok. can you revert back to the old access_ok like this (while keeping __access_ok of course):
Code:
#define access_ok(type, addr, size) (likely(__range_not_ok(addr, size) == 0))
and see if it changes anything? if it does, then the most likely reason is that mysql or the kernel has a very subtle race somewhere, we'll see if we can debug it.

OK, I'll give it a try. I will post the results later.

PaX Team wrote:if someone could give me something easy to debug here locally or perhaps remote access to a box where i can play with this myself, it'd speed up debugging a lot ;).


I will try to set up a test system for you.
My setup at the moment is a Core i7 server with a simple Typo3 installation.
I wrote a small Perl script that forks and generates random queries over 100 concurrent connections to mysql.
Sometimes the crash occurs after 30 minutes and sometimes after a few hours.


Thank you for your help.
Daniel
dabetz
 
Posts: 22
Joined: Mon Nov 09, 2009 3:38 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby PaX Team » Tue Mar 09, 2010 12:13 pm

dabetz wrote:Compiled with --debug=full, mysql uses SAFEMALLOC and its sanity check.
ah, i mistook your use of sanitize for the pax feature, i see now.
dabetz wrote:Do you mean spinlock debugging?
no, i meant slub/slab/whatever debugging (which would use such a poison value).
dabetz wrote:How can I dump it fully?
p *irem or something like that should work if you have debug/type info for gdb. you can also just dump the surrounding area as normal bytes (x command) to see what's in there. basically what i'm after is to determine how big an area was poisoned with 0xa5 (whole page, many pages, irem struct itself, etc).
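If the surrounding bytes are saved to a file (gdb's "dump binary memory" command can do that) and loaded into a buffer, the size of the poisoned area can be measured offline. The sketch below is one way to do it under that assumption; the function names are hypothetical and it operates on a plain byte buffer, not on a live process.

```c
#include <stddef.h>

#define POISON_BYTE 0xa5   /* fill value seen in the register and stack dumps */

/* Count consecutive poison bytes starting at buf[start] and going forward. */
size_t poison_run_forward(const unsigned char *buf, size_t len, size_t start)
{
    size_t n = 0;

    while (start + n < len && buf[start + n] == POISON_BYTE)
        n++;
    return n;
}

/* Count consecutive poison bytes ending just before buf[start]. */
size_t poison_run_backward(const unsigned char *buf, size_t start)
{
    size_t n = 0;

    while (n < start && buf[start - 1 - n] == POISON_BYTE)
        n++;
    return n;
}
```

Adding the two results for the offset that corresponds to irem gives the length of the contiguous 0xa5 region around it, which should help tell a single struct from a whole page.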
dabetz wrote:I will try to set up a test system for you.
ok thank you (email me with the details, my pgp key is on the servers)!
PaX Team
 
Posts: 2310
Joined: Mon Mar 18, 2002 4:35 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby zImage » Tue Mar 09, 2010 12:27 pm

PaX Team wrote:1. you say SANITIZE fails, but i thought we'd established already that the problem occurred without it as well. is that still the case?

If this is CONFIG_PAX_MEMORY_SANITIZE, we do not use this feature.

PaX Team wrote:5. i just got an idea looking at the 30->31 changes. take a look at arch/x86/include/asm/uaccess.h where i change access_ok and introduce __access_ok. can you revert back to the old access_ok

I'll try reverting this access_ok change and see what happens.

PaX Team wrote:also, how easy is it for you to reproduce the problem?

I don't know how to reproduce it. I just have to wait somewhere between 6 and 48 hours for it to happen. Since Daniel seems to be able to reproduce it every few hours, he might have results sooner than I will.
zImage
 
Posts: 10
Joined: Mon Mar 27, 2006 10:44 am

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby dabetz » Wed Mar 10, 2010 3:19 am

Hello,

reverting this access_ok change works for me, and mysql doesn't crash anymore.
It has now run for 16 hours and 27 million queries without any problems.

Reproducing it is very easy. I installed Typo3 and wrote a small Perl script that forks and sends queries to mysql over 100 concurrent connections. All tables are InnoDB. On a Core i7 with 12GB RAM, mysql handles about 700 queries per second.
After some time it crashes randomly.

Greetings,
Daniel
dabetz
 
Posts: 22
Joined: Mon Nov 09, 2009 3:38 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby biz » Wed Mar 10, 2010 1:41 pm

Interesting problem. I'm going to run multiple MySQL 5.1.44+ instances on 2.6.32.9+grsec.
Please report back after some more testing 8)

If access_ok is causing this, when can we count on an updated grsec patch for this kernel version?

Thanks for all the debugging effort!
biz
 
Posts: 4
Joined: Mon Mar 01, 2010 8:04 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby PaX Team » Wed Mar 10, 2010 4:03 pm

dabetz wrote:reverting this access_ok change works for me, and mysql doesn't crash anymore.
It has now run for 16 hours and 27 million queries without any problems.
ok, another experiment: keep the new access_ok macro but remove the VERIFY_WRITE/__put_user lines and see if that still works.
PaX Team
 
Posts: 2310
Joined: Mon Mar 18, 2002 4:35 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby PaX Team » Wed Mar 10, 2010 4:12 pm

biz wrote:If access_ok is causing this, when can we count on an updated grsec patch for this kernel version?
the new access_ok cannot really cause this, it does what the kernel is about to do anyway (access userland memory), so the underlying race (probably in mysql) would not be fixed if i simply reverted this feature. in the meantime i found that ALLOC_VAL is 0xa5 and is used by mysql in debug builds to initialize memory. the race is then probably about allocating memory and using it too soon, before it could be properly filled in with real data (probably from the kernel). i'm afraid that someone will need to talk to mysql developers as i have no idea about its internals and don't have time to debug it much further. any takers? ;)
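To illustrate the point about debug builds filling fresh memory with 0xa5, here is a small sketch of such a fill and of what an unfilled field looks like when read back as a 32-bit value. It is not MySQL's allocator; debug_alloc, read_word and FILL_BYTE are hypothetical names, and only the byte value is taken from the thread.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FILL_BYTE 0xa5   /* byte value the thread attributes to ALLOC_VAL */

/* Allocate size bytes and fill them with the debug pattern. */
void *debug_alloc(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        memset(p, FILL_BYTE, size);
    return p;
}

/* Read the 32-bit word stored offset bytes into block. */
uint32_t read_word(const void *block, size_t offset)
{
    uint32_t w;

    memcpy(&w, (const unsigned char *)block + offset, sizeof w);
    return w;
}
```

Any 32-bit field read from such a block before real data has been stored comes back as 0xa5a5a5a5, which matches the eax and esi values in the register dump earlier in the thread.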
PaX Team
 
Posts: 2310
Joined: Mon Mar 18, 2002 4:35 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby zImage » Wed Mar 10, 2010 4:29 pm

PaX Team wrote:
dabetz wrote:reverting this access_ok change works for me, and mysql doesn't crash anymore.
It has now run for 16 hours and 27 million queries without any problems.
ok, another experiment: keep the new access_ok macro but remove the VERIFY_WRITE/__put_user lines and see if that still works.


I'm going to try the following change to arch/x86/include/asm/uaccess.h:
Code:
                        if (__get_user(__c_ao, (char __user *)__addr_ao))\
                                break;                                  \
-                       if (type != VERIFY_WRITE)                       \
-                               continue;                               \
-                       if (__put_user(__c_ao, (char __user *)__addr_ao))\
-                               break;                                  \
                }                                                       \
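The lines removed in this diff read a byte from userland and then store the same value back. The thread does not establish what actually went wrong, but one interleaving in which such a read and write-back could lose a concurrent store is replayed deterministically below. This is a single-threaded illustration with hypothetical function names, not kernel code.

```c
/* Step 1 of the probe: read the byte (stands in for __get_user). */
unsigned char probe_read(const unsigned char *addr)
{
    return *addr;
}

/* Step 2 of the probe: store the saved value back (stands in for __put_user). */
void probe_write_back(unsigned char *addr, unsigned char saved)
{
    *addr = saved;
}

/*
 * Replay one hypothetical ordering of events on a single byte and
 * return its final value.
 */
unsigned char replay_lost_update(unsigned char initial, unsigned char new_value)
{
    unsigned char byte = initial;
    unsigned char saved;

    saved = probe_read(&byte);        /* prober sees the old content     */
    byte = new_value;                 /* another thread stores real data */
    probe_write_back(&byte, saved);   /* prober puts the stale byte back */
    return byte;
}
```

Calling replay_lost_update(0xa5, 'h') returns 0xa5: the real data is gone and the debug fill value is back, which is at least consistent with freshly allocated memory still looking uninitialized after it was written.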
zImage
 
Posts: 10
Joined: Mon Mar 27, 2006 10:44 am

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby dabetz » Thu Mar 11, 2010 7:39 am

Hello Pax Team,

PaX Team wrote:
dabetz wrote:reverting this access_ok change works for me, and mysql doesn't crash anymore.
It has now run for 16 hours and 27 million queries without any problems.
ok, another experiment: keep the new access_ok macro but remove the VERIFY_WRITE/__put_user lines and see if that still works.


this works for me too. :-)

Greetings,
Daniel
dabetz
 
Posts: 22
Joined: Mon Nov 09, 2009 3:38 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby PaX Team » Thu Mar 11, 2010 5:09 pm

dabetz wrote:
PaX Team wrote:ok, another experiment: keep the new access_ok macro but remove the VERIFY_WRITE/__put_user lines and see if that still works.
this works for me too. :-)
ok, i reworked this logic in the latest patch, let's see if it still produces the mysql crash. if not, it was my bug; otherwise there's still something in mysql/etc.
PaX Team
 
Posts: 2310
Joined: Mon Mar 18, 2002 4:35 pm

Re: Problem with 2.6.23.1-grsec and MySQL 5.0.32-7etch1

Postby biz » Fri Mar 12, 2010 7:51 am

Hello PaX Team,

just to get it right (I'm not into it), the fix is in:
http://www.grsecurity.net/~paxguy1/pax-linux-2.6.32.9-test25.patch
and it looks like it's included in:
http://www.grsecurity.net/stable/grsecurity-2.1.14-2.6.32.9-201003112025.patch

Is this right? :oops:
biz
 
Posts: 4
Joined: Mon Mar 01, 2010 8:04 pm
