bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 4:43 am
by Stephane
Hi all,

I'm experiencing a bandwidth problem with a router (Ubuntu running a 3.14.17 grsec kernel with netns and ovs). The bandwidth between 2 VMs on 2 different networks (basically 2 VLANs tagged with ovs) is really slow (80kb/s, shown in red), whereas when I reboot my router on a regular kernel everything is OK (3.87Gb/s, shown in green). Tests were done with iperf.

Here is a picture of the architecture:
http://postimg.org/image/dcs55r7nl/

Please tell me if you need more info.

grsec config:

CONFIG_GRKERNSEC_KMEM=y
CONFIG_GRKERNSEC_IO=y
CONFIG_GRKERNSEC_JIT_HARDEN=y
CONFIG_GRKERNSEC_PERF_HARDEN=y
CONFIG_GRKERNSEC_RAND_THREADSTACK=y
CONFIG_GRKERNSEC_PROC_MEMMAP=y
CONFIG_GRKERNSEC_KSTACKOVERFLOW=y
CONFIG_GRKERNSEC_BRUTE=y
CONFIG_GRKERNSEC_MODHARDEN=y
CONFIG_GRKERNSEC_HIDESYM=y
CONFIG_GRKERNSEC_RANDSTRUCT=y
CONFIG_GRKERNSEC_RANDSTRUCT_PERFORMANCE=y
CONFIG_GRKERNSEC_KERN_LOCKOUT=y
CONFIG_GRKERNSEC_PROC=y
CONFIG_GRKERNSEC_PROC_USERGROUP=y
CONFIG_GRKERNSEC_PROC_ADD=y
CONFIG_GRKERNSEC_LINK=y
CONFIG_GRKERNSEC_SYMLINKOWN=y
CONFIG_GRKERNSEC_FIFO=y
CONFIG_GRKERNSEC_SYSFS_RESTRICT=y
CONFIG_GRKERNSEC_DEVICE_SIDECHANNEL=y
CONFIG_GRKERNSEC_CHROOT=y
CONFIG_GRKERNSEC_CHROOT_MOUNT=y
CONFIG_GRKERNSEC_CHROOT_DOUBLE=y
CONFIG_GRKERNSEC_CHROOT_PIVOT=y
CONFIG_GRKERNSEC_CHROOT_CHDIR=y
CONFIG_GRKERNSEC_CHROOT_CHMOD=y
CONFIG_GRKERNSEC_CHROOT_FCHDIR=y
CONFIG_GRKERNSEC_CHROOT_MKNOD=y
CONFIG_GRKERNSEC_CHROOT_SHMAT=y
CONFIG_GRKERNSEC_CHROOT_UNIX=y
CONFIG_GRKERNSEC_CHROOT_FINDTASK=y
CONFIG_GRKERNSEC_CHROOT_NICE=y
CONFIG_GRKERNSEC_CHROOT_SYSCTL=y
CONFIG_GRKERNSEC_CHROOT_CAPS=y
CONFIG_GRKERNSEC_CHROOT_INITRD=y
CONFIG_GRKERNSEC_RESLOG=y
CONFIG_GRKERNSEC_SIGNAL=y
CONFIG_GRKERNSEC_TIME=y
CONFIG_GRKERNSEC_PROC_IPADDR=y
CONFIG_GRKERNSEC_RWXMAP_LOG=y
CONFIG_GRKERNSEC_DMESG=y
CONFIG_GRKERNSEC_HARDEN_PTRACE=y
CONFIG_GRKERNSEC_PTRACE_READEXEC=y
CONFIG_GRKERNSEC_SETXID=y
CONFIG_GRKERNSEC_HARDEN_IPC=y
CONFIG_GRKERNSEC_TPE=y
CONFIG_GRKERNSEC_BLACKHOLE=y
CONFIG_GRKERNSEC_NO_SIMULT_CONNECT=y
CONFIG_GRKERNSEC_DENYUSB=y
CONFIG_GRKERNSEC_SYSCTL=y
CONFIG_GRKERNSEC_SYSCTL_ON=y

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 8:29 am
by PaX Team
1. what about the PaX part of the config?
2. can you run perf on the host kernel while the two VMs are communicating to see where the CPU cycles are spent?

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 8:37 am
by Stephane
Sorry, I forgot the PaX part:

#
# Security options
#

#
# Grsecurity
#
CONFIG_PAX_KERNEXEC_PLUGIN=y
CONFIG_PAX_PER_CPU_PGD=y
CONFIG_TASK_SIZE_MAX_SHIFT=42
CONFIG_PAX_USERCOPY_SLABS=y
CONFIG_GRKERNSEC=y
CONFIG_GRKERNSEC_CONFIG_AUTO=y
# CONFIG_GRKERNSEC_CONFIG_CUSTOM is not set
CONFIG_GRKERNSEC_CONFIG_SERVER=y
# CONFIG_GRKERNSEC_CONFIG_DESKTOP is not set
# CONFIG_GRKERNSEC_CONFIG_VIRT_NONE is not set
CONFIG_GRKERNSEC_CONFIG_VIRT_GUEST=y
# CONFIG_GRKERNSEC_CONFIG_VIRT_HOST is not set
CONFIG_GRKERNSEC_CONFIG_VIRT_EPT=y
# CONFIG_GRKERNSEC_CONFIG_VIRT_SOFT is not set
# CONFIG_GRKERNSEC_CONFIG_VIRT_XEN is not set
# CONFIG_GRKERNSEC_CONFIG_VIRT_VMWARE is not set
CONFIG_GRKERNSEC_CONFIG_VIRT_KVM=y
# CONFIG_GRKERNSEC_CONFIG_VIRT_VIRTUALBOX is not set
CONFIG_GRKERNSEC_CONFIG_PRIORITY_PERF=y
# CONFIG_GRKERNSEC_CONFIG_PRIORITY_SECURITY is not set

#
# Default Special Groups
#
CONFIG_GRKERNSEC_PROC_GID=1001
CONFIG_GRKERNSEC_TPE_UNTRUSTED_GID=1005
CONFIG_GRKERNSEC_SYMLINKOWN_GID=1006

#
# Customize Configuration
#
#
# PaX
#
CONFIG_PAX=y

#
# PaX Control
#
# CONFIG_PAX_SOFTMODE is not set
CONFIG_PAX_EI_PAX=y
CONFIG_PAX_PT_PAX_FLAGS=y
CONFIG_PAX_XATTR_PAX_FLAGS=y
# CONFIG_PAX_NO_ACL_FLAGS is not set
CONFIG_PAX_HAVE_ACL_FLAGS=y
# CONFIG_PAX_HOOK_ACL_FLAGS is not set

#
# Non-executable pages
#
CONFIG_PAX_NOEXEC=y
CONFIG_PAX_PAGEEXEC=y
CONFIG_PAX_EMUTRAMP=y
CONFIG_PAX_MPROTECT=y
# CONFIG_PAX_MPROTECT_COMPAT is not set
# CONFIG_PAX_ELFRELOCS is not set
CONFIG_PAX_KERNEXEC=y
CONFIG_PAX_KERNEXEC_PLUGIN_METHOD_BTS=y
CONFIG_PAX_KERNEXEC_PLUGIN_METHOD="bts"

#
# Address Space Layout Randomization
#
CONFIG_PAX_ASLR=y
CONFIG_PAX_RANDKSTACK=y
CONFIG_PAX_RANDUSTACK=y
CONFIG_PAX_RANDMMAP=y

#
# Miscellaneous hardening features
#
# CONFIG_PAX_MEMORY_SANITIZE is not set
# CONFIG_PAX_MEMORY_STACKLEAK is not set
# CONFIG_PAX_MEMORY_STRUCTLEAK is not set
# CONFIG_PAX_MEMORY_UDEREF is not set
CONFIG_PAX_REFCOUNT=y
CONFIG_PAX_CONSTIFY_PLUGIN=y
CONFIG_PAX_USERCOPY=y
# CONFIG_PAX_USERCOPY_DEBUG is not set
CONFIG_PAX_SIZE_OVERFLOW=y
CONFIG_PAX_LATENT_ENTROPY=y

#
# Memory Protections
#
CONFIG_GRKERNSEC_KMEM=y
CONFIG_GRKERNSEC_IO=y
CONFIG_GRKERNSEC_JIT_HARDEN=y
CONFIG_GRKERNSEC_PERF_HARDEN=y


I'll run iperf on the host kernel and keep you posted asap. FYI, the host is also running the same kernel with grsec.

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 9:06 am
by Stephane
I'll try to describe my setup more clearly:


 VM-a      Router      VM-b
------ -> ------ -> ------
Host 1     Host 2     Host 3

VM-a (no grsec kernel) runs on Host 1 (grsec/PaX kernel). Router (grsec kernel) uses network namespaces (netns) and ovs, like a tenant router, and runs on Host 2 (grsec kernel). VM-b (no grsec kernel) runs on Host 3 (grsec kernel).

When VM-a communicates with VM-b via the router, throughput is about 80kb/s when the router runs a grsec kernel. When I reboot the router on a regular kernel, throughput returns to normal (about 3.87Gb/s).
iperf between Hosts 1, 2 and 3 is OK (9.89Gb/s).
So I suspect something is wrong with the grsec kernel on the router...

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 9:11 am
by PaX Team
so run perf stat on the router and then we'll know where it spends its time (if it's cpu bound at all, that is). you can also try to do a binary search for the config option that causes this, i'd start with SIZE_OVERFLOW/LATENT_ENTROPY/KERNEXEC/USERCOPY/REFCOUNT in that order.
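
The binary search suggested above can be sketched as follows. This is only an illustration under stated assumptions: it assumes a single option is responsible and that the suspects can be toggled independently (real PaX options may have Kconfig dependencies). `is_slow` is a hypothetical callback standing for "build and boot a kernel with only these suspects enabled, run iperf, and report whether the transfer is slow"; it is not an existing tool.

```python
from typing import Callable, List, Optional, Sequence

# Suspects named in the post above, in the suggested order.
SUSPECTS = [
    "CONFIG_PAX_SIZE_OVERFLOW",
    "CONFIG_PAX_LATENT_ENTROPY",
    "CONFIG_PAX_KERNEXEC",
    "CONFIG_PAX_USERCOPY",
    "CONFIG_PAX_REFCOUNT",
]


def find_culprit(
    options: Sequence[str],
    is_slow: Callable[[List[str]], bool],
) -> Optional[str]:
    """Bisect `options` to find the one option that makes the kernel slow.

    `is_slow(enabled)` is a hypothetical test: it should return True when a
    kernel built with exactly the options in `enabled` switched on (and the
    other suspects off) reproduces the slow transfer.
    """
    candidates = list(options)
    # If even the full set does not reproduce the problem, none is to blame.
    if not is_slow(candidates):
        return None
    while len(candidates) > 1:
        first_half = candidates[: len(candidates) // 2]
        if is_slow(first_half):
            # The problem reproduces with the first half alone.
            candidates = first_half
        else:
            # Otherwise the culprit must be among the remaining options.
            candidates = candidates[len(first_half):]
    return candidates[0]


if __name__ == "__main__":
    # Simulated test for demonstration: pretend one option is the culprit.
    pretend_culprit = "CONFIG_PAX_USERCOPY"
    result = find_culprit(SUSPECTS, lambda enabled: pretend_culprit in enabled)
    print("culprit:", result)
```

With five suspects this needs at most four kernel builds (one to confirm, three to narrow down) instead of five single-option builds, which matters when each test means a recompile and a reboot.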

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 9:34 am
by Stephane
On VM-A

Code:
adminomc@vma:~$ iperf -c 192.168.1.5 -t 900 -i 1
------------------------------------------------------------
Client connecting to 192.168.1.5, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.0.5 port 54046 connected with 192.168.1.5 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec   128 KBytes  1.05 Mbits/sec
[  3]  1.0- 2.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  2.0- 3.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  3.0- 4.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  4.0- 5.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  5.0- 6.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  6.0- 7.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  7.0- 8.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  8.0- 9.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  9.0-10.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 10.0-11.0 sec   128 KBytes  1.05 Mbits/sec
[  3] 11.0-12.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 12.0-13.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 13.0-14.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 14.0-15.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 15.0-16.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 16.0-17.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 17.0-18.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 18.0-19.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 19.0-20.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 20.0-21.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 21.0-22.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 22.0-23.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 23.0-24.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 24.0-25.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 25.0-26.0 sec   128 KBytes  1.05 Mbits/sec
[  3] 26.0-27.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 27.0-28.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 28.0-29.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 29.0-30.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 30.0-31.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 31.0-32.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 32.0-33.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 33.0-34.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 34.0-35.0 sec  0.00 Bytes  0.00 bits/sec
[  3] 35.0-36.0 sec  0.00 Bytes  0.00 bits/sec



On VM-B

Code:
adminomc@vmb:~$ iperf -s -i 1
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.5 port 5001 connected with 192.168.0.5 port 54046
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 1.0 sec  5.66 KBytes  46.3 Kbits/sec
[  4]  1.0- 2.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4]  2.0- 3.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4]  3.0- 4.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4]  4.0- 5.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4]  5.0- 6.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4]  6.0- 7.0 sec  9.90 KBytes  81.1 Kbits/sec
[  4]  7.0- 8.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4]  8.0- 9.0 sec  9.90 KBytes  81.1 Kbits/sec
[  4]  9.0-10.0 sec  9.90 KBytes  81.1 Kbits/sec
[  4] 10.0-11.0 sec  5.66 KBytes  46.3 Kbits/sec
[  4] 11.0-12.0 sec  8.48 KBytes  69.5 Kbits/sec
[  4] 12.0-13.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 13.0-14.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 14.0-15.0 sec  9.90 KBytes  81.1 Kbits/sec
[  4] 15.0-16.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 16.0-17.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 17.0-18.0 sec  17.0 KBytes   139 Kbits/sec
[  4] 18.0-19.0 sec  8.48 KBytes  69.5 Kbits/sec
[  4] 19.0-20.0 sec  5.66 KBytes  46.3 Kbits/sec
[  4] 20.0-21.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 21.0-22.0 sec  8.48 KBytes  69.5 Kbits/sec
[  4] 22.0-23.0 sec  11.3 KBytes  92.7 Kbits/sec
[  4] 23.0-24.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 24.0-25.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 25.0-26.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 26.0-27.0 sec  11.3 KBytes  92.7 Kbits/sec
[  4] 27.0-28.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 28.0-29.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 29.0-30.0 sec  5.66 KBytes  46.3 Kbits/sec
[  4] 30.0-31.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 31.0-32.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 32.0-33.0 sec  11.3 KBytes  92.7 Kbits/sec
[  4] 33.0-34.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 34.0-35.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4] 35.0-36.0 sec  9.90 KBytes  81.1 Kbits/sec
[  4] 36.0-37.0 sec  99.0 KBytes   811 Kbits/sec
[  4] 37.0-38.0 sec  35.4 KBytes   290 Kbits/sec
[  4] 38.0-39.0 sec  28.3 KBytes   232 Kbits/sec
[  4] 39.0-40.0 sec  45.2 KBytes   371 Kbits/sec
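
To compare runs like the two above without reading every interval, the per-second reports can be averaged with a short script. This is a rough sketch, assuming interval lines in the same format as pasted here and assuming the K/M/G prefixes on the bandwidth column are decimal (consistent with 128 KBytes in one second being reported as 1.05 Mbits/sec). The function name and the sample strings are illustrative only.

```python
import re

# One interval report, e.g. "[  4]  0.0- 1.0 sec  5.66 KBytes  46.3 Kbits/sec".
# Only the bandwidth value and its unit prefix are captured.
INTERVAL = re.compile(
    r"\[\s*\d+\]\s*[\d.]+-\s*[\d.]+\s+sec\s+[\d.]+\s+[KMG]?Bytes"
    r"\s+([\d.]+)\s+([KMG]?)bits/sec"
)

# Assumption: decimal prefixes on the bits/sec column.
SCALE = {"": 1.0, "K": 1e3, "M": 1e6, "G": 1e9}


def mean_bits_per_sec(report: str) -> float:
    """Average the per-interval bandwidth found in an iperf text report."""
    rates = [
        float(value) * SCALE[prefix]
        for value, prefix in INTERVAL.findall(report)
    ]
    # Header and separator lines do not match, so an empty list means
    # the text contained no interval reports at all.
    return sum(rates) / len(rates) if rates else 0.0


# First intervals copied from the client and server output in this post.
CLIENT_SAMPLE = """\
[  3]  0.0- 1.0 sec   128 KBytes  1.05 Mbits/sec
[  3]  1.0- 2.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  2.0- 3.0 sec  0.00 Bytes  0.00 bits/sec
"""

SERVER_SAMPLE = """\
[  4]  0.0- 1.0 sec  5.66 KBytes  46.3 Kbits/sec
[  4]  1.0- 2.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4]  2.0- 3.0 sec  7.07 KBytes  57.9 Kbits/sec
[  4]  3.0- 4.0 sec  7.07 KBytes  57.9 Kbits/sec
"""

if __name__ == "__main__":
    print("client mean bits/sec:", mean_bits_per_sec(CLIENT_SAMPLE))
    print("server mean bits/sec:", mean_bits_per_sec(SERVER_SAMPLE))
```

Because the pattern is applied with `findall`, it also works when several interval reports end up on one line of a pasted log.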



vmstat on the router

Code:
root@router:~# vmstat 1 10000
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 3132948  27740 183272    0    0    29     4  151  307  1  2 96  1  0
 0  0      0 3132576  27740 183304    0    0     0    16  335  631  2  2 96  0  0
 1  0      0 3132592  27740 183304    0    0     0     0  297  541  0  1 99  0  0
 0  0      0 3132820  27748 183304    0    0     0    24  335  645  2  3 95  0  0
 0  0      0 3132968  27748 183304    0    0     0     8  286  528  0  1 99  0  0
 0  0      0 3132924  27748 183304    0    0     0     0  434  623  2  3 95  0  0
 0  0      0 3132928  27764 183288    0    0     0    44  322  533  0  1 97  2  0
 0  0      0 3132644  27764 183304    0    0     0     0  383  705  2  2 96  0  0
 0  0      0 3132612  27764 183304    0    0     0     0  323  531  1  1 98  0  0
 0  0      0 3133068  27772 183300    0    0     0    12  326  595  2  2 96  1  0
 0  0      0 3133204  27772 183304    0    0     0     0  276  454  1  1 99  0  0


vmstat on the host running the router

Code:
root@phys-router-kvm:~# vmstat 1 1000
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 18450636 524700 6828520    0    0     0     1    0    0  0  0 99  0  0
 0  0      0 18451116 524700 6828520    0    0     0     0 1434 2810  1  1 99  0  0
 0  0      0 18452952 524700 6828520    0    0     0    80 1649 2808  1  0 98  0  0
 0  0      0 18453332 524700 6828520    0    0     0     0 1324 2571  1  0 99  0  0
 0  0      0 18454200 524700 6828520    0    0     0     0 1749 2985  1  1 99  0  0
 0  0      0 18455440 524700 6828520    0    0     0   156 1376 2924  1  1 99  0  0
 0  0      0 18456680 524700 6828520    0    0     0    20 1541 2905  1  1 99  0  0
 0  0      0 18457656 524700 6828520    0    0     0    64 1375 2773  1  1 99  0  0
 0  0      0 18458020 524700 6828520    0    0     0     0 1549 3073  1  0 99  0  0
 3  0      0 18459032 524700 6828520    0    0     0     0 1339 2606  1  1 99  0  0
 0  0      0 18460340 524700 6828520    0    0     0   200 1545 3219  1  1 99  0  0
 0  0      0 18460952 524700 6828520    0    0     0     0 1403 2751  1  0 99  0  0
 0  0      0 18461448 524700 6828520    0    0     0    88 1500 2979  1  0 99  0  0
 1  0      0 18461868 524700 6828520    0    0     0     0 1412 2889  1  0 99  0  0
 0  0      0 18462744 524700 6828520    0    0     0     0 1523 3029  1  1 99  0  0
 0  0      0 18461292 524700 6828524    0    0     0   148 1659 3751  1  1 99  0  0
 1  0      0 18462080 524700 6828528    0    0     0    12 1476 2799  1  0 99  0  0
 1  0      0 18462392 524700 6828528    0    0     0    88 1332 2517  1  0 99  0  0
 0  0      0 18462712 524700 6828524    0    0     0     0 1480 2961  1  1 99  0  0
 0  0      0 18463340 524700 6828524    0    0     0     0 1419 2706  1  0 99  0  0
 0  0      0 18463456 524700 6828524    0    0     0   196 1522 3280  1  1 99  0  0
 0  0      0 18463548 524700 6828524    0    0     0    12 1371 2787  1  0 99  0  0
 1  0      0 18464168 524700 6828524    0    0     0   168 1366 2657  1  0 99  0  0

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 9:40 am
by Stephane
Is there a buffer added somewhere?

LRO is off on my physical NICs.

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 10:01 am
by PaX Team
we don't touch networking code this deep, so i have no idea why the cpu is basically idle instead of transmitting packets. so i guess the next step is disabling config options until the culprit is found then we can do some debugging.

PS: i edited the console output above to use the code tag, feel free to do so yourself in the future ;)

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 10:05 am
by Stephane
OK, I'll compile a new kernel with SIZE_OVERFLOW/LATENT_ENTROPY/KERNEXEC/USERCOPY/REFCOUNT disabled to see if it fixes the issue. If it does, I'll disable them one by one, in the order you gave me, to find the culprit.

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 10:18 am
by PaX Team
also try a newer kernel please, yours is quite old now...

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 10:21 am
by Stephane
Yes :) I'm compiling 3.14.34, I'll keep you posted... thx

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 10:34 am
by PaX Team
another idea i just had is that maybe some procfs/sysfs restriction prevents your userland from setting a knob somewhere that'd be needed for proper network performance. maybe try to disable them too and/or check if all your interface/network settings end up being the same between the grsec and normal kernel.
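
One way to check whether the settings end up the same on both kernels is to save the output of `sysctl -a` once under each kernel and diff the two dumps. A minimal sketch, assuming the usual `key = value` line format of `sysctl -a`; the function names are made up for illustration and the sample values below are invented, not taken from the router in this thread.

```python
from typing import Dict, Optional, Tuple


def parse_sysctl(text: str) -> Dict[str, str]:
    """Parse `key = value` lines (as printed by `sysctl -a`) into a dict."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '=' at all
            settings[key.strip()] = value.strip()
    return settings


def diff_settings(
    a: Dict[str, str], b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return every key whose value differs, or that exists on one side only.

    A missing key is reported as None on the side where it is absent.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Invented sample dumps, for illustration only.
NORMAL_KERNEL = """\
net.ipv4.ip_forward = 1
net.ipv4.tcp_sack = 1
"""

GRSEC_KERNEL = """\
net.ipv4.ip_forward = 0
net.ipv4.tcp_sack = 1
net.core.rmem_max = 212992
"""

if __name__ == "__main__":
    changes = diff_settings(parse_sysctl(NORMAL_KERNEL), parse_sysctl(GRSEC_KERNEL))
    for key, (normal, grsec) in changes.items():
        print(f"{key}: normal={normal} grsec={grsec}")
```

With network namespaces in use, the dump would presumably need to be taken inside each namespace as well, since many `net.*` values are per-namespace.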

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 10:57 am
by Stephane
You mean CONFIG_GRKERNSEC_SYSFS_RESTRICT?

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 11:03 am
by PaX Team
yes, though on second thought this would affect non-root users only and i guess you're not doing any setup on the router as such.

Re: bandwidth problem netns ovs

Posted: Tue Mar 03, 2015 11:08 am
by Stephane
OK, SIZE_OVERFLOW/LATENT_ENTROPY/KERNEXEC/USERCOPY/REFCOUNT are now unset in my new 3.14.34 kernel and there is no change in behaviour... same problem...
I'll try turning CONFIG_GRKERNSEC_SYSFS_RESTRICT off... but I'm starting to be pessimistic.