My previous post was quite muddled, and the research I did for it was proven wrong by further tests.
I would like to update you on the situation.
Maybe it's better to start from the beginning, so that future readers/developers won't have to search through past topics.
Facts:
Linux 2.6.24 vanilla comes with builtin support for being a Xen domU (guest), and since the grsec patch only seems to conflict with Xen host (dom0) support, I've tried to make a grsec kernel run under Xen.
Brief tech explanation:
For those who don't know how Xen works: the Xen hypervisor takes a Linux image, fills a struct with some information, then "jumps" to the image's entry point and begins executing the code.
In the Linux guest code, the (x86) entry code first checks that the Xen hypervisor is compatible, and then sets itself up and boots normally.
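To make the handoff concrete, here is a minimal userspace sketch of it in C. It is only an illustration: `start_info_sketch`, `guest_entry` and `launch_guest` are hypothetical names, the struct is a heavily simplified stand-in for the real Xen structure, and `nr_pages` is filled with an arbitrary value.

```c
#include <string.h>

/* Simplified stand-in for the structure the hypervisor fills in;
 * the real one has many more fields. */
struct start_info_sketch {
    char magic[32];          /* e.g. "xen-3.0-x86_32p" */
    unsigned long nr_pages;  /* memory handed to the guest */
};

/* Hypothetical guest entry point: it receives the filled structure. */
typedef int (*guest_entry_fn)(const struct start_info_sketch *si);

/* Guest side: returns 0 if the magic looks compatible, -1 otherwise. */
int guest_entry(const struct start_info_sketch *si)
{
    if (memcmp(si->magic, "xen-3", 5) != 0)
        return -1;           /* the real kernel would crash here */
    return 0;                /* continue with the normal boot */
}

/* "Hypervisor" side: fill the structure, then jump to the entry point. */
int launch_guest(guest_entry_fn entry, const char *magic)
{
    struct start_info_sketch si;

    memset(&si, 0, sizeof(si));
    strncpy(si.magic, magic, sizeof(si.magic) - 1);
    si.nr_pages = 65536;     /* arbitrary illustrative value */
    return entry(&si);
}
```

Calling `launch_guest(guest_entry, "xen-3.0-x86_32p")` models a compatible hypervisor, while a magic such as "xen-2.0" makes the guest-side check fail.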
Other facts of interest:
Linux 2.6.24 vanilla boots without any problem under Xen. Tested, works.
Symptoms:
The guest grsec kernel *crashes* badly, executing ud2a (the "just crash" instruction) right after entering its entry point.
The hypervisor sees that the guest kernel has hit ud2a, and terminates it.
End of story: no printk or anything else (it doesn't even get past the entry code), so there is no debugging output.
Analysis and research:
Since the Xen hypervisor logs a few addresses when the kernel crashes, I took the EIP address and checked what was there under gdb (gdb vmlinux).
There I found the line where it crashed, and that line is:
BUG_ON(memcmp(xen_start_info->magic, "xen-3", 5) != 0);
As I said before, xen_start_info is the structure that the hypervisor fills in before starting the kernel, and that line performs the compatibility check I mentioned earlier.
The BUG_ON() macro evaluates the condition it is given and executes the infamous ud2a if the condition is true (hence, in this case, when the first 5 characters of xen_start_info->magic are NOT "xen-3").
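The behaviour of that line can be modelled in userspace with a small C sketch. `BUG_ON_SKETCH`, `bug_hits` and `magic_check_fires` are made-up names, and the sketch bumps a counter where the kernel would execute ud2a, so treat it as an approximation of the semantics, not the kernel's actual macro.

```c
#include <string.h>

/* Counts how often the sketch "crashed"; the kernel would execute
 * ud2a instead of incrementing a counter. */
static int bug_hits;

/* Userspace model of BUG_ON(): fire when the condition is true. */
#define BUG_ON_SKETCH(cond) do { if (cond) bug_hits++; } while (0)

/* Runs the same comparison as the failing line and reports whether
 * it fired. Both arguments must be at least 5 bytes long. */
int magic_check_fires(const char *magic, const char *expected)
{
    int before = bug_hits;

    BUG_ON_SKETCH(memcmp(magic, expected, 5) != 0);
    return bug_hits != before;
}
```

Because only the first 5 bytes are compared, a full magic such as "xen-3.0-x86_32p" passes against "xen-3"; the check fires only when one of the two sides differs within those 5 bytes.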
At first I thought the structure was not being filled in correctly for some reason (my first idea was an offset issue). But then I eventually succeeded in debugging the kernel at the very moment it was crashing (via the gdbserver that ships with Xen, and a remote gdb client), and I found that at the address where the magic was expected... there it was.
But what surprised me was that the second argument to memcmp() was *not* "xen-3".
The struct was ok, while the static string was not.
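The reasoning above (check each argument of memcmp() separately against what it should contain) can be sketched in C as follows. `hex_bytes` and `blame_side` are hypothetical helpers, not something taken from the kernel or from Xen. What the second argument actually contained is not recorded here, so any "wrong literal" value passed to these helpers is a placeholder.

```c
#include <stdio.h>
#include <string.h>

/* Writes the first n bytes at p into out as space-separated hex.
 * Stops early if the output buffer is too small. */
void hex_bytes(const void *p, size_t n, char *out, size_t outsz)
{
    const unsigned char *b = p;
    size_t used = 0;

    if (outsz == 0)
        return;
    out[0] = '\0';
    for (size_t i = 0; i < n && used + 4 <= outsz; i++)
        used += (size_t)snprintf(out + used, outsz - used,
                                 i ? " %02x" : "%02x", b[i]);
}

/* Compares both memcmp() arguments against a known-good reference.
 * Returns a bitmask: bit 0 set if the runtime value is wrong,
 * bit 1 set if the literal is wrong. */
int blame_side(const void *runtime, const void *literal,
               const void *reference, size_t n)
{
    int blame = 0;

    if (memcmp(runtime, reference, n) != 0)
        blame |= 1;
    if (memcmp(literal, reference, n) != 0)
        blame |= 2;
    return blame;
}
```

A result of 2 corresponds to the situation described here: the structure holds the expected magic, but the string it is compared against does not.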
This looks like a problem with the linker or something like that... Unfortunately, at this point I don't have the experience or knowledge to research the issue further.
How to reproduce the problem:
Set up Xen, and make sure Debian's (or your distro's) default kernel boots and works as a Xen guest.
Fetch a vanilla 2.6.24, configure it with Debian's guest configuration (/boot/config-version), patch it with grsec, run make oldconfig and disable every option that grsec brings with it.
This is my .config, in case you are wondering.
objcopy --strip-unneeded vmlinux vmlinux-stripped
gzip -9 vmlinux-stripped
mv vmlinux-stripped.gz /boot/vmlinuz-xen-grsec
And then use that image as the Xen domU kernel.
I hope I have given you all the data you may need to solve the problem.
Thank you in advance for your attention,
xstasi
PS: As far as Debian is concerned, Xen's paravirtualization host support only works under etch (2.6.18 kernel series).