Refcount overflow detected

Posted: Fri Aug 20, 2010 1:45 pm
by kmcfate
This is an odd one. Basically, after a system has been running for weeks (this one had been fine since Aug 2), the kernel starts killing every process that touches NFS mounts with a refcount overflow error.

Removing the nfs modules and reloading them fixes the problem.

How do I compile the kernel to provide useful backtraces? All the error dumps look like this:

Aug 20 11:32:36 php5-n384 kernel: PAX: refcount overflow detected in: cpv-maps:27025, uid/euid: 0/0
Aug 20 11:32:36 php5-n384 kernel: CPU 2:
Aug 20 11:32:36 php5-n384 kernel: Modules linked in: nfsd exportfs ipmi_devintf ipmi_si ipmi_msghandler dell_rbu nfs fscache nfs_acl auth_rpcgss lockd sunrpc ipt_LOG xt_limit xt_recent iptable_mangle ext2 d]
Aug 20 11:32:36 php5-n384 kernel: Pid: 27025, comm: cpv-maps Tainted: G B 2.6.32.16-2.x86_64 #1 PowerEdge 2970
Aug 20 11:32:36 php5-n384 kernel: RIP: 0010:[<ffffffffa0436a03>] [<ffffffffa0436a03>]
Aug 20 11:32:36 php5-n384 kernel: RSP: 0018:ffff88004334fa78 EFLAGS: 00000a96
Aug 20 11:32:36 php5-n384 kernel: RAX: 000000007fffffff RBX: ffff88042c4d0440 RCX: ffff88004334f988
Aug 20 11:32:36 php5-n384 kernel: RDX: 0000000000003f00 RSI: ffff880234818500 RDI: ffff88042c4d0440
Aug 20 11:32:36 php5-n384 kernel: RBP: ffff88004334fa88 R08: 0000000000000007 R09: ffff88042c4d0440
Aug 20 11:32:36 php5-n384 kernel: R10: 0000000000000004 R11: 0000000000000000 R12: ffff88042c4d0440
Aug 20 11:32:36 php5-n384 kernel: R13: ffff88004334fb68 R14: ffff880429aa7200 R15: ffff88004334fd08
Aug 20 11:32:36 php5-n384 kernel: FS: 00006f1ad820e6e0(0000) GS:ffff880028240000(0000) knlGS:00000000dd2ccb90
Aug 20 11:32:36 php5-n384 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug 20 11:32:36 php5-n384 kernel: CR2: 00000000006be248 CR3: 0000000078e8b000 CR4: 00000000000006f0
Aug 20 11:32:36 php5-n384 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 20 11:32:36 php5-n384 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 20 11:32:36 php5-n384 kernel: Call Trace:
Aug 20 11:32:36 php5-n384 kernel: [<ffffffffa0437276>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffffa042e699>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffffa042e763>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffffa044a630>] ?
Aug 20 11:32:36 php5-n384 kernel: [<ffffffffa04ebb26>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffffa04ebdaf>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffffa04db3a1>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff826a7b7c>] ?
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff811713b0>] ?
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff811650f1>] ?
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff8125c61f>] ?
Aug 20 11:32:36 php5-n384 kernel: [<ffffffffa04db51c>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffffa04d5cd3>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff81165b1f>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff8116633c>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff8116720f>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff81167353>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff81127efe>] ?
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff8100a743>] ?
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff8115cf47>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff8115d09b>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff8115d0c4>]
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff810d107b>] ?
Aug 20 11:32:36 php5-n384 kernel: [<ffffffff8100b1b2>]

Re: Refcount overflow detected

Posted: Sat Aug 21, 2010 7:24 am
by PaX Team
kmcfate wrote: This is an odd one. Basically after weeks of running a system (this one was fine since Aug 2) the kernel will start killing every process that touches NFS mounts with a refcount overflow error.

if you send me the corresponding vmlinux image, i can decode the addresses, but i think i already got this reported and have a fix for it for the next test patch.

Re: Refcount overflow detected

Posted: Tue Aug 24, 2010 8:23 am
by kmcfate

Re: Refcount overflow detected

Posted: Tue Aug 24, 2010 1:46 pm
by PaX Team
kmcfate wrote: Here is the boot image

thanks, but i need vmlinux, not vmlinuz ;).

Re: Refcount overflow detected

Posted: Tue Aug 24, 2010 2:26 pm
by kmcfate

Re: Refcount overflow detected

Posted: Tue Aug 24, 2010 6:09 pm
by PaX Team
ok, got it but i'll also need the module which triggered the refcount overflow detection. now the question is whether this box/kernel is still running or not. in the former case you should look at /proc/modules and find the module in there which covers the address ffffffffa0436a03 and send the corresponding .ko to me please. if you rebooted the box since then the module load order is probably different and we're out of luck and have to assume it's the same nfs issue i already fixed here...

Re: Refcount overflow detected

Posted: Tue Sep 07, 2010 8:31 am
by kmcfate
Took me a while, as I had to update to 2.6.32.18, so it was another 3 weeks before it hit again. Here is the relevant information:

http://server.darkink.com/grsec/vmlinux ... so5.x86_64

Sep 7 07:14:33 php5-n32 kernel: PAX: From 172.17.10.102: refcount overflow detected in: sshd:27732, uid/euid: 0/645
Sep 7 07:14:33 php5-n32 kernel: CPU 2:
Sep 7 07:14:33 php5-n32 kernel: Modules linked in: xt_comment ipv6 nfs fscache nfs_acl auth_rpcgss lockd sunrpc ipt_LOG xt_limit xt_recent iptable_mangle ext2 dm_mirror dm_region_hash dm_log dm_multipath dm_mod video output sbs sbshc power_meter acpi_pad parport_pc lp parport ses enclosure sg sr_mod cdrom joydev bnx2 i5k_amb pata_acpi hwmon i5000_edac iTCO_wdt serio_raw snd_pcm dcdbas iTCO_vendor_support edac_core snd_timer ata_piix ata_generic snd soundcore snd_page_alloc pcspkr mptspi mptscsih scsi_transport_spi mptbase sata_nv sata_svw megaraid_mbox megaraid_mm shpchp megaraid_sas ext3 jbd mbcache radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: microcode]
Sep 7 07:14:33 php5-n32 kernel: Pid: 27732, comm: sshd Not tainted 2.6.32.18-3.mosso5.x86_64 #1 PowerEdge 2950
Sep 7 07:14:33 php5-n32 kernel: RIP: 0010:[<ffffffffa0487a33>] [<ffffffffa0487a33>]
Sep 7 07:14:33 php5-n32 kernel: RSP: 0018:ffff88009b1dda78 EFLAGS: 00000a96
Sep 7 07:14:33 php5-n32 kernel: RAX: 000000007fffffff RBX: ffff880225bf71c0 RCX: ffff88009b1dd988
Sep 7 07:14:33 php5-n32 kernel: RDX: ffff8801ccc30900 RSI: ffff8800af7bc300 RDI: ffff880225bf71c0
Sep 7 07:14:33 php5-n32 kernel: RBP: ffff88009b1dda88 R08: 0a00000000000000 R09: ffff880225bf71c0
Sep 7 07:14:33 php5-n32 kernel: R10: 0000000000000001 R11: 0000000000000004 R12: ffff880225bf71c0
Sep 7 07:14:33 php5-n32 kernel: R13: ffff88009b1ddb68 R14: ffff880225219800 R15: ffff88009b1ddd08
Sep 7 07:14:33 php5-n32 kernel: FS: 00006578a7b8b710(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
Sep 7 07:14:33 php5-n32 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 7 07:14:33 php5-n32 kernel: CR2: 00006578a3f3d03e CR3: 00000002247f3000 CR4: 00000000000006f0
Sep 7 07:14:33 php5-n32 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 7 07:14:33 php5-n32 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep 7 07:14:33 php5-n32 kernel: Call Trace:
Sep 7 07:14:33 php5-n32 kernel: [<ffffffffa04882a6>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffffa047f699>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffffa047f763>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffffa049b9d0>] ?
Sep 7 07:14:33 php5-n32 kernel: [<ffffffffa053da76>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffffa053dcff>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffffa052d341>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff813ede8d>] ?
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff814918a1>] ?
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff8115e609>] ?
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff811533c1>] ?
Sep 7 07:14:33 php5-n32 kernel: [<ffffffffa052d4bc>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffffa0527e63>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff81153bda>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff8115434c>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff811544df>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff81154613>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff81116613>] ?
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff8114b177>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff8114b2cb>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff8114b2f4>]
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff810c0e3b>] ?
Sep 7 07:14:33 php5-n32 kernel: [<ffffffff8100b132>]


xt_comment 1040 2 - Live 0xffffffffa01b0000 0xffffffffa0182000
ipv6 341773 65 - Live 0xffffffffa0590000 0xffffffffa0588000
nfs 333102 10 - Live 0xffffffffa0524000 0xffffffffa051f000
fscache 50110 1 nfs, Live 0xffffffffa050a000 0xffffffffa0507000
nfs_acl 2813 1 nfs, Live 0xffffffffa0501000 0xffffffffa04fe000
auth_rpcgss 44955 1 nfs, Live 0xffffffffa04ed000 0xffffffffa04e8000
lockd 73462 1 nfs, Live 0xffffffffa04cf000 0xffffffffa04ca000
sunrpc 232133 35 nfs,nfs_acl,auth_rpcgss,lockd, Live 0xffffffffa047f000 0xffffffffa0477000
ipt_LOG 6212 3 - Live 0xffffffffa0470000 0xffffffffa046d000
xt_limit 2424 1 - Live 0xffffffffa0467000 0xffffffffa0464000
xt_recent 8161 2 - Live 0xffffffffa045c000 0xffffffffa0459000
iptable_mangle 3331 0 - Live 0xffffffffa0453000 0xffffffffa0450000
ext2 67575 1 - Live 0xffffffffa0412000 0xffffffffa040f000
dm_mirror 13199 0 - Live 0xffffffffa0405000 0xffffffffa0402000
dm_region_hash 11406 1 dm_mirror, Live 0xffffffffa03f9000 0xffffffffa03f6000
dm_log 9710 2 dm_mirror,dm_region_hash, Live 0xffffffffa03ed000 0xffffffffa03ea000
dm_multipath 16884 0 - Live 0xffffffffa03e0000 0xffffffffa03dd000
dm_mod 73531 3 dm_mirror,dm_log,dm_multipath, Live 0xffffffffa03c1000 0xffffffffa03bb000
video 22798 0 - Live 0xffffffffa03ae000 0xffffffffa03a8000
output 2495 1 video, Live 0xffffffffa03a2000 0xffffffffa039f000
sbs 11742 0 - Live 0xffffffffa0396000 0xffffffffa0393000
sbshc 4579 1 sbs, Live 0xffffffffa038d000 0xffffffffa038a000
power_meter 9458 0 - Live 0xffffffffa0382000 0xffffffffa037c000
acpi_pad 88053 0 - Live 0xffffffffa0374000 0xffffffffa035d000
parport_pc 23969 0 - Live 0xffffffffa0350000 0xffffffffa034a000
lp 11959 0 - Live 0xffffffffa0341000 0xffffffffa033e000
parport 40290 2 parport_pc,lp, Live 0xffffffffa032c000 0xffffffffa0329000
ses 6441 0 - Live 0xffffffffa0285000 0xffffffffa0273000
enclosure 8249 1 ses, Live 0xffffffffa0239000 0xffffffffa0231000
sg 33053 0 - Live 0xffffffffa02a4000 0xffffffffa022a000
sr_mod 15629 0 - Live 0xffffffffa026d000 0xffffffffa01f9000
cdrom 47007 1 sr_mod, Live 0xffffffffa02fe000 0xffffffffa01c5000
joydev 14517 0 - Live 0xffffffffa0177000 0xffffffffa0139000
bnx2 70837 0 - Live 0xffffffffa02dd000 0xffffffffa022e000
i5k_amb 5822 0 - Live 0xffffffffa01d8000 0xffffffffa01c8000
pata_acpi 3619 0 - Live 0xffffffffa01ba000 0xffffffffa01b3000
hwmon 2368 2 power_meter,i5k_amb, Live 0xffffffffa0190000 0xffffffffa008a000
i5000_edac 8771 0 - Live 0xffffffffa0135000 0xffffffffa0087000
iTCO_wdt 11716 0 - Live 0xffffffffa02d9000 0xffffffffa02d5000
serio_raw 4844 0 - Live 0xffffffffa02cf000 0xffffffffa02cc000
snd_pcm 88066 0 - Live 0xffffffffa028c000 0xffffffffa0289000
dcdbas 9129 0 - Live 0xffffffffa0281000 0xffffffffa027e000
iTCO_vendor_support 3255 1 iTCO_wdt, Live 0xffffffffa0278000 0xffffffffa0275000
edac_core 45858 3 i5000_edac, Live 0xffffffffa0260000 0xffffffffa025c000
snd_timer 25493 1 snd_pcm, Live 0xffffffffa024e000 0xffffffffa024b000
ata_piix 22900 0 - Live 0xffffffffa023f000 0xffffffffa023c000
ata_generic 3599 0 - Live 0xffffffffa0236000 0xffffffffa0233000
snd 76547 2 snd_pcm,snd_timer, Live 0xffffffffa0216000 0xffffffffa0212000
soundcore 7974 1 snd, Live 0xffffffffa020a000 0xffffffffa0207000
snd_page_alloc 8828 1 snd_pcm, Live 0xffffffffa01ff000 0xffffffffa01fc000
pcspkr 2078 0 - Live 0xffffffffa01f6000 0xffffffffa01f3000
mptspi 16288 0 - Live 0xffffffffa01e0000 0xffffffffa01dd000
mptscsih 34771 1 mptspi, Live 0xffffffffa01cd000 0xffffffffa01ca000
scsi_transport_spi 25792 1 mptspi, Live 0xffffffffa01bd000 0xffffffffa01b7000
mptbase 92206 2 mptspi,mptscsih, Live 0xffffffffa0197000 0xffffffffa0194000
sata_nv 23063 0 - Live 0xffffffffa0188000 0xffffffffa0185000
sata_svw 4877 0 - Live 0xffffffffa017f000 0xffffffffa017c000
megaraid_mbox 29498 0 - Live 0xffffffffa015f000 0xffffffffa008c000
megaraid_mm 11943 1 megaraid_mbox, Live 0xffffffffa0130000 0xffffffffa0075000
shpchp 32858 0 - Live 0xffffffffa0126000 0xffffffffa0019000
megaraid_sas 39448 4 - Live 0xffffffffa016d000 0xffffffffa0168000
ext3 131414 4 - Live 0xffffffffa013d000 0xffffffffa0072000
jbd 49582 1 ext3, Live 0xffffffffa0057000 0xffffffffa0054000
mbcache 7410 2 ext2,ext3, Live 0xffffffffa000f000 0xffffffffa000c000
radeon 605027 0 - Live 0xffffffffa009a000 0xffffffffa008e000
ttm 43165 1 radeon, Live 0xffffffffa007a000 0xffffffffa0077000
drm_kms_helper 27771 1 radeon, Live 0xffffffffa0069000 0xffffffffa0066000
drm 209618 3 radeon,ttm,drm_kms_helper, Live 0xffffffffa0024000 0xffffffffa001c000
i2c_algo_bit 5695 1 radeon, Live 0xffffffffa0015000 0xffffffffa0012000
i2c_core 30957 3 radeon,drm,i2c_algo_bit, Live 0xffffffffa0003000 0xffffffffa0000000
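
For anyone repeating the lookup PaX Team asked for, here is a rough sketch of matching the RIP from the trace against a /proc/modules listing. The function names (parse_modules, find_module) are made up for illustration. It assumes the first hex address on each line is the module's load base and that the size field can be used as the extent from that base; since this kernel prints two addresses per module, that is only an approximation.

```python
def parse_modules(text):
    """Parse /proc/modules-style text into (name, size, base) tuples."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        name, size = fields[0], int(fields[1])
        # Assumption: the first 0x... field is the load base of the module.
        addrs = [f for f in fields[2:] if f.startswith("0x")]
        if not addrs:
            continue
        modules.append((name, size, int(addrs[0], 16)))
    return modules


def find_module(modules, addr):
    """Return the name of the module whose [base, base + size) covers addr."""
    for name, size, base in modules:
        if base <= addr < base + size:
            return name
    return None


# A few lines copied from the listing above; on a live box you would read
# /proc/modules instead.
SAMPLE = """\
nfs 333102 10 - Live 0xffffffffa0524000 0xffffffffa051f000
lockd 73462 1 nfs, Live 0xffffffffa04cf000 0xffffffffa04ca000
sunrpc 232133 35 nfs,nfs_acl,auth_rpcgss,lockd, Live 0xffffffffa047f000 0xffffffffa0477000
ipt_LOG 6212 3 - Live 0xffffffffa0470000 0xffffffffa046d000
"""

if __name__ == "__main__":
    mods = parse_modules(SAMPLE)
    # RIP from the Sep 7 trace
    print(find_module(mods, 0xffffffffa0487a33))  # prints: sunrpc
```

With the RIP from the Sep 7 trace this points at sunrpc, so sunrpc.ko would be the file to send along with vmlinux.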

Re: Refcount overflow detected

Posted: Tue Sep 07, 2010 10:10 am
by spender
If you update to the latest grsec patch (uploaded over the weekend), it includes a fix for the sunrpc refcount problem. Thanks for your report.

-Brad

Re: Refcount overflow detected

Posted: Tue Sep 07, 2010 11:35 am
by kmcfate
Just to verify, the fix is in 2.2.0-2.6.32.21?

Re: Refcount overflow detected

Posted: Tue Sep 07, 2010 1:54 pm
by spender
Yes, it's in both 2.2.0-2.6.32.21 and 2.2.0-2.6.34.6.

-Brad