Kernel Panic with ZFS on Arch Linux

Posted: Wed Sep 16, 2015 2:37 am
by Sir Wuffleton
Hello! I've been trying to switch to a grsec-enabled kernel for my personal webserver, but have hit a pretty big roadblock: the kernel panics at boot when importing my root ZFS pool. I'm at a loss as to where to go next; all I have to go on is a vague memory of some issue with ZFS the last time I tried grsec.

I'm using the latest linux-grsec package from the repos (4.1.7.201509131604-1 as of writing), and have compiled ZFS and SPL against linux-grsec using the zfs-git (https://aur.archlinux.org/packages/zfs-git/) and spl-git (https://aur.archlinux.org/packages/spl-git/) packages (with appropriate edits for building against -grsec) from our AUR. They both use the commit of the latest stable release of ZFSonLinux, which is 0.6.5 as of writing.

The one disclaimer on our wiki page (https://wiki.archlinux.org/index.php/Grsecurity#Out-of-tree_kernel_module_compilation_failure) about building out-of-tree modules doesn't apply, since the gcc package hasn't been updated since the last release of the linux-grsec package. As such, I should be compiling my modules with the same toolchain as the repo package. That disclaimer also seems to cover build failures only, and I was able to compile both modules without issue.

I've attached the full output of the panic from my lights-out serial console below:
[Image: kernel panic output captured from the serial console]

Unfortunately, since this happens so early in my boot process, I'm not sure there's much extra debugging I can do short of building a bare-bones VM with just grsec and ZFS on an ext4 root, which might let me do some more poking and prodding. That said, if there's any other helpful information I can provide, I'll do my best to get it.

I know there are a lot of elements in play here that aren't maintained by you guys, but if you could help me decipher what's going on in this panic so I can figure out how to proceed, I'd really appreciate it!

Re: Kernel Panic with ZFS on Arch Linux

Posted: Wed Sep 16, 2015 3:42 am
by PaX Team
you must have USERCOPY enabled. zfs has at least one slab cache that is used to copy between the kernel and userland, but since it's not marked with SLAB_USERCOPY, the defense mechanism kicks in. the solution is simple: find where the zio_data_buf_16384 slab is created and add SLAB_USERCOPY to the flags argument (you can look at how we use it elsewhere). judging by the slab's name, there may be more caches that need this change. another way to fix this would be to determine whether this slab is actually meant to be used in user/kernel copying and, if not, rewrite the copying code, but i guess in this case this is legitimate use, so adding that flag is the proper action.
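
A rough sketch of what "add SLAB_USERCOPY to the flags argument" could look like. This is not the actual ZFS/SPL code: the helper name usercopy_slab_flags is hypothetical, and the SLAB_USERCOPY value below is a userspace placeholder so the snippet compiles outside a kernel tree. In a real module the flag would come from the grsecurity-patched <linux/slab.h>, and the #ifdef guard would let the same source still build against kernels that do not define it.

```c
/*
 * Userspace stand-in so this sketch compiles outside a kernel tree.
 * ASSUMPTION: the value below is a placeholder, not the kernel's
 * definition; a grsecurity-patched <linux/slab.h> provides the real one.
 */
#ifndef SLAB_USERCOPY
#define SLAB_USERCOPY 0x1UL
#endif

/*
 * Hypothetical helper: return the slab flags with SLAB_USERCOPY OR'ed in
 * when the kernel headers define it, and unchanged otherwise.
 */
static unsigned long usercopy_slab_flags(unsigned long flags)
{
#ifdef SLAB_USERCOPY
	flags |= SLAB_USERCOPY;	/* allow copies between this cache and userland */
#endif
	return flags;
}

/*
 * In kernel code the result would be passed as the flags argument where
 * the cache is created, for example (4.1-era signature):
 *
 *   kmem_cache_create(name, size, align, usercopy_slab_flags(flags), ctor);
 */
```

Since the slab name suggests a family of size-indexed caches, the change would most likely belong at the one place that creates all of them rather than being applied to the 16384-byte cache alone.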