Using Open vSwitch with DPDK
============================

Open vSwitch can use the Intel(R) DPDK library to operate entirely in
userspace. This file explains how to install and use Open vSwitch in
such a mode.

The DPDK support of Open vSwitch is considered experimental.
It has not been thoroughly tested.

This version of Open vSwitch should be built manually with "configure"
and "make".

Building and Installing:
------------------------
Requires DPDK 1.7.

DPDK:
Set the directory, e.g.: export DPDK_DIR=/usr/src/dpdk-1.7.0
cd $DPDK_DIR
Update config/common_linuxapp so that DPDK generates a single library
file (this modification is also required for the IVSHMEM build):
CONFIG_RTE_BUILD_COMBINE_LIBS=y
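If you prefer to script this step, the option can be flipped with sed. A sketch; the helper name is hypothetical, and it assumes the option currently reads CONFIG_RTE_BUILD_COMBINE_LIBS=n:

```shell
# Hypothetical helper (not part of DPDK): flip
# CONFIG_RTE_BUILD_COMBINE_LIBS from "n" to "y" in the given config file.
enable_combine_libs() {
    sed -i 's/^CONFIG_RTE_BUILD_COMBINE_LIBS=n$/CONFIG_RTE_BUILD_COMBINE_LIBS=y/' "$1"
}

# Usage:
# enable_combine_libs "$DPDK_DIR/config/common_linuxapp"
```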

For a default install without IVSHMEM:
make install T=x86_64-native-linuxapp-gcc

To include IVSHMEM (shared memory):
make install T=x86_64-ivshmem-linuxapp-gcc

For details refer to http://dpdk.org/

Linux kernel:
Refer to intel-dpdk-getting-started-guide.pdf for the DPDK kernel
requirements.

OVS:
Non-IVSHMEM:
export DPDK_BUILD=$DPDK_DIR/x86_64-native-linuxapp-gcc/
IVSHMEM:
export DPDK_BUILD=$DPDK_DIR/x86_64-ivshmem-linuxapp-gcc/

cd $OVS_DIR
./boot.sh
./configure --with-dpdk=$DPDK_BUILD
make

Refer to INSTALL.userspace for the general requirements of building
userspace OVS.

Using the DPDK with ovs-vswitchd:
---------------------------------

Setup system boot:
   Add to the kernel bootline: default_hugepagesz=1GB hugepagesz=1G hugepages=1
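After rebooting, you can confirm that the hugepages were actually allocated. A small sketch (the function name is an assumption for illustration; point it at /proc/meminfo):

```shell
# Hypothetical check (not part of OVS): print the number of allocated
# hugepages from meminfo-format input, so the bootline allocation above
# can be verified.
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "$1"
}

# Usage:
# hugepages_total /proc/meminfo
```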

First setup DPDK devices:
  - insert uio.ko
    e.g. modprobe uio
  - insert igb_uio.ko
    e.g. insmod $DPDK_BUILD/kmod/igb_uio.ko
  - Bind network device to igb_uio.
    e.g. $DPDK_DIR/tools/dpdk_nic_bind.py --bind=igb_uio eth1
    Alternate binding method:
      Find the target Ethernet devices:
        lspci -nn | grep Ethernet
      Bring down the interfaces (e.g. eth2, eth3):
        ifconfig eth2 down
        ifconfig eth3 down
      Look at the current devices (e.g. ixgbe devices):
        ls /sys/bus/pci/drivers/ixgbe/
        0000:02:00.0  0000:02:00.1  bind  module  new_id  remove_id  uevent  unbind
      Unbind the target PCI devices from their current driver (e.g. 02:00.0 ...):
        echo 0000:02:00.0 > /sys/bus/pci/drivers/ixgbe/unbind
        echo 0000:02:00.1 > /sys/bus/pci/drivers/ixgbe/unbind
      Bind them to the target driver (e.g. igb_uio):
        echo 0000:02:00.0 > /sys/bus/pci/drivers/igb_uio/bind
        echo 0000:02:00.1 > /sys/bus/pci/drivers/igb_uio/bind
      Check the binding for the listed devices:
        ls /sys/bus/pci/drivers/igb_uio
        0000:02:00.0  0000:02:00.1  bind  module  new_id  remove_id  uevent  unbind
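The unbind/bind steps above can be scripted for several devices at once. A sketch, assuming the same example devices and drivers (the function name and the SYSFS override are assumptions for illustration; running it against the real sysfs tree requires root):

```shell
# Hypothetical helper (not part of DPDK or OVS): rebind each listed PCI
# device from one driver to another via sysfs, as in the manual steps
# above. SYSFS defaults to the real sysfs PCI tree and can be
# overridden for testing.
rebind_devices() {
    old_driver=$1; new_driver=$2; shift 2
    for dev in "$@"; do
        echo "$dev" > "${SYSFS:-/sys/bus/pci}/drivers/$old_driver/unbind"
        echo "$dev" > "${SYSFS:-/sys/bus/pci}/drivers/$new_driver/bind"
    done
}

# Usage (matches the example devices above):
# rebind_devices ixgbe igb_uio 0000:02:00.0 0000:02:00.1
```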

Prepare system:
  - mount hugetlbfs
    e.g. mount -t hugetlbfs -o pagesize=1G none /dev/hugepages
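To make this mount persistent across reboots, an /etc/fstab entry along these lines can be used (a sketch matching the example mount above; adjust the mount point and page size to your system):

```
none /dev/hugepages hugetlbfs pagesize=1G 0 0
```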

Refer to http://www.dpdk.org/doc/quick-start for verifying the DPDK setup.

Start ovsdb-server as discussed in the INSTALL doc:
  Summary e.g.:
    First time only, create (or clear) the database:
      mkdir -p /usr/local/etc/openvswitch
      mkdir -p /usr/local/var/run/openvswitch
      rm /usr/local/etc/openvswitch/conf.db
      cd $OVS_DIR
      ./ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db \
          ./vswitchd/vswitch.ovsschema
    Start ovsdb-server:
      cd $OVS_DIR
      ./ovsdb/ovsdb-server --remote=punix:/usr/local/var/run/openvswitch/db.sock \
          --remote=db:Open_vSwitch,Open_vSwitch,manager_options \
          --private-key=db:Open_vSwitch,SSL,private_key \
          --certificate=db:Open_vSwitch,SSL,certificate \
          --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --pidfile --detach
    First time after db creation, initialize:
      cd $OVS_DIR
      ./utilities/ovs-vsctl --no-wait init

Start vswitchd:
DPDK configuration arguments can be passed to vswitchd via the --dpdk
argument. This needs to be the first argument passed to the vswitchd
process. The DPDK -c argument is ignored by ovs-dpdk, but it is a
required parameter for DPDK initialization.

 e.g.
 export DB_SOCK=/usr/local/var/run/openvswitch/db.sock
 ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach

If more than one GB of hugepages is allocated (as for IVSHMEM), set the
amount and use NUMA node 0 memory:

 ./vswitchd/ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,0 \
     -- unix:$DB_SOCK --pidfile --detach

To use ovs-vswitchd with DPDK, create a bridge with datapath_type
"netdev" in the configuration database. For example:

 ovs-vsctl add-br br0
 ovs-vsctl set bridge br0 datapath_type=netdev

Now you can add DPDK devices. OVS expects a DPDK device name to start
with "dpdk" and end with a port id. vswitchd should print the number of
DPDK devices found.

 ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
 ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk

Once the first DPDK port is added to vswitchd, it creates a polling
thread and polls the DPDK device in a continuous loop. Therefore CPU
utilization for that thread is always 100%.

Test flow script across NICs (assuming OVS is in /usr/src/ovs):
  Assume 1.1.1.1 on NIC port 1 (dpdk0)
  Assume 1.1.1.2 on NIC port 2 (dpdk1)
  Execute the script:

############################# Script:

#! /bin/sh
# Move to command directory
cd /usr/src/ovs/utilities/

# Clear current flows
./ovs-ofctl del-flows br0

# Add flows between port 1 (dpdk0) and port 2 (dpdk1)
./ovs-ofctl add-flow br0 in_port=1,dl_type=0x800,nw_src=1.1.1.1,\
nw_dst=1.1.1.2,idle_timeout=0,action=output:2
./ovs-ofctl add-flow br0 in_port=2,dl_type=0x800,nw_src=1.1.1.2,\
nw_dst=1.1.1.1,idle_timeout=0,action=output:1

######################################

With pmd multi-threading support, OVS creates one pmd thread for each
NUMA node by default. The pmd thread handles the I/O of all DPDK
interfaces on the same NUMA node. The following two commands can be used
to configure the multi-threading behavior.

 ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=<hex string>

The command above asks for a CPU mask for setting the affinity of pmd
threads. A set bit in the mask means a pmd thread is created and pinned
to the corresponding CPU core. For more information, please refer to
`man ovs-vswitchd.conf.db`.

 ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=<integer>

The command above sets the number of rx queues for each DPDK interface.
The rx queues are assigned to pmd threads on the same NUMA node in
round-robin fashion. For more information, please refer to
`man ovs-vswitchd.conf.db`.

Ideally, for maximum throughput, the pmd thread should not be scheduled
out, which would temporarily halt its execution. The following
affinitization methods can help.

Let's pick cores 4,6,8,10 for the pmd threads to run on. Also assume a
dual 8 core Sandy Bridge system with hyperthreading enabled, where CPU1
has cores 0,...,7 and 16,...,23 and CPU2 has cores 8,...,15 and
24,...,31. (A different cpu configuration could have different core mask
requirements.)

To the kernel bootline add a core isolation list for these cores and
their associated hyperthread cores (e.g. isolcpus=4,20,6,22,8,24,10,26).
Reboot the system for the isolation to take effect, then restart
everything.

Configure pmd threads on cores 4,6,8,10 using 'pmd-cpu-mask':

 ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=00000550
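The hex string is just the bitwise OR of (1 << core) for each chosen core. A sketch that derives the mask for the example cores:

```shell
#! /bin/sh
# Compute the pmd-cpu-mask hex string for cores 4, 6, 8 and 10.
mask=0
for core in 4 6 8 10; do
    mask=$((mask | (1 << core)))
done
printf '%08x\n' "$mask"   # prints 00000550
```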

You should be able to check that the pmd threads are pinned to the
correct cores via:

 top -p `pidof ovs-vswitchd` -H -d1

Note, the pmd threads on a NUMA node are only created if there is at
least one DPDK interface from that NUMA node added to OVS.

Note, core 0 is always reserved for non-pmd threads and should never be
set in the cpu mask.

DPDK Rings:
-----------

Following the steps above to create a bridge, you can now add DPDK rings
as a port to the vswitch. OVS will expect the DPDK ring device name to
start with "dpdkr" and end with a port id.

 ovs-vsctl add-port br0 dpdkr0 -- set Interface dpdkr0 type=dpdkr

DPDK rings client test application:

Included in the test directory is a sample DPDK application for testing
the rings. It comes from the base DPDK directory and has been modified
to work with the ring naming used within OVS.

Location: tests/ovs_client

To run the client:

 cd /usr/src/ovs/tests/
 ovsclient -c 1 -n 4 --proc-type=secondary -- -n "port id you gave dpdkr"

In the case of the dpdkr example above, the "port id you gave dpdkr" is 0.

It is essential to have --proc-type=secondary.

The application simply receives an mbuf on the receive queue of the
Ethernet ring and then places that same mbuf on the transmit ring of
the Ethernet ring. It is a trivial loopback application.

DPDK rings in VM (IVSHMEM shared memory communications)
-------------------------------------------------------

In addition to executing the client in the host, you can execute it
within a guest VM. To do so you will need a patched QEMU. You can
download the patch and a getting started guide at:

https://01.org/packet-processing/downloads

A general rule of thumb for better performance is that the client
application should not be assigned the same DPDK core mask "-c" as
the vswitchd.

Restrictions:
-------------

  - This support is for physical NICs. It has been tested with Intel
    NICs only.
  - Works with 1500 MTU; fixing this requires some changes in the DPDK
    library.
  - Currently the DPDK port does not make use of any offload
    functionality.
  ivshmem:
  - The shared memory is currently restricted to the use of 1GB
    huge pages.
  - All huge pages are shared amongst the host, clients, virtual
    machines etc.

Bug Reporting:
--------------

Please report problems to bugs@openvswitch.org.