Jarno Rajahalme [Tue, 9 Jun 2015 22:24:33 +0000 (15:24 -0700)]
odp-util: Simplify parsing function for GCC.
GCC 4.7.2 -O3 flagged potential use before initialization for the 'id'
and 'id_mask' being scanned in scan_vxlan_gbp(). For the 'id' this
was a real possiblity, but for the 'id_mask' it seems to be a false
positive in gcc analysis. Simplify scan_vxlan_gbp() to fix this.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Tue, 9 Jun 2015 22:24:33 +0000 (15:24 -0700)]
ofproto: Fix memory leak in ofproto_rule_delete().
Commit 401aa90e33be (ofproto: Fix memory leak in flow deletion.) fixed
the memory leak when a rule is deleted, but failed to do the same when
all rules in a bridge are deleted just before the bridge itself is
deleted.
This patch adds the necessary unref to ofproto_rule_delete().
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Mon, 16 Mar 2015 22:45:27 +0000 (15:45 -0700)]
ovsdb-monitor: add json cache
Although multiple jsonrpc monitors can share the same ovsdb monitor,
each change still needs to translated into json object from scratch.
This can be wasteful if multiple jsonrpc monitors are interested in the
same changes.
Json cache improves this by keeping an copy of json object generated
for transaction X to current transaction. When jsonrpc is interested
in a change, the cache is searched first, if an json object is found,
a copy of it is handed back, skipping the regeneration process.
Any commit to the monitor will empty the cache. This can be further
optimized to not throw away the cache if the updated tables and columns
are not being monitored.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Mon, 16 Mar 2015 10:03:20 +0000 (03:03 -0700)]
ovsdb-monitor: allow multiple jsonrpc monitors to share a single ovsdb
monitor
Store ovsdb monitor in a global hmap. If a newly created ovsdb monitor
object monitors the same tables and columns as an existing one, the
existing monitor will be reused.
With this patch, jsonrpc monitor and ovsdb monitor now have N:1 mapping.
The goals are to:
1) Reduce the cost of maintaining duplicated monitors.
2) Allow for create Json cache for the same updates. Json cache will be
introduced in the following patch.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Thu, 4 Jun 2015 20:45:55 +0000 (13:45 -0700)]
tunneling: Fix a tunnel name display bug
Currently, 'ovs-appctl tnl/ports/show' command won't display gre port
name correctly. Since netdev_vport_get_dpif_port() will not always
set the 'namebuf' it receives. Should use the name by its return
value instead. Found by inspection.
Also extend the test case to cover this command.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jesse Gross [Sat, 16 May 2015 00:03:17 +0000 (17:03 -0700)]
ofp-util: Convert flow_metadata to match structure.
We have a special flow_metadata structure to represent the parts
of a packet that aren't carried in the payload itself. This is
used in the case where we need to send the packet as a Packet In
to an OpenFlow controller. This is a subset of the more general
struct flow.
In practice, almost all operations we do on this structure involve
converting it to or from a match or have code that is the same as
a match. Serialization to NXM and back is done as a match. There
is special flow_metadata formatting code that is almost identical
to match formatting.
The uses for struct flow_metadata aren't performance critical
when it comes to memory, so we can save quite a bit of code by
just using a match structure directly instead. In addition, as
metadata increases and becomes more complex (Geneve options require
some special handling beyond just additional fields), using the
match structure means we only have to do this work in one place.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
This commit adds the windows installer to the OVS tree.
Requirements are the following:
Visual Studio Community 2013
WiX Toolset 3.9
Microsoft_VC120_CRT_x86.msm
More detailed information on the requirements and build instructions
can be found under:
https://github.com/cloudbase/ovs-windows-installer/blob/master/README.rst
To run and make the installer issue the following:
To uninstall one could use the following Powershell commandlets:
$app = Get-WmiObject -Class Win32_Product | Where-Object `
{ $_.Name -match "Open Vswitch" }
Alex Wang [Sun, 7 Jun 2015 18:38:52 +0000 (11:38 -0700)]
odp-util: Make sure vlan tci mask has exact match for VLAN_CFI.
OVS datapath has check which prevents the installation of flow
that matches VLAN TCI but does not have exact match for VLAN_CFI
bit. To follow this rule, ovs userspace must make sure the
flow key for datapath flow matching VLAN TCI has exact match for
VLAN_CFI bit.
Before this commit, this is not enforced, so OpenFlow flow like
"vlan_tci=0x000a/0x0fff,action=output:local" can generate datapath
flow like "vlan(vid=10/0xfff,pcp=0/0x0,cfi=1/0)".
With the OVS datapath check, the installation of such datapath
flow will be rejected with:
"|WARN|system@ovs-system: failed to put[create][modify] (Invalid argument)"
This commit makes ovs userspace always exact match the VLAN_CFI
bit if the flow matches VLAN TCI.
Reported-by: Ronald Lee <ronaldlee@vmware.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Use xzalloc instead of xmalloc for some key structure allocations in
ofproto-dpif (viz. ofproto_dpif, ofport_dpif and rule_dpif) so as to
prevent uninitialized values in these structures.
Signed-off-by: Sabyasachi Sengupta <sabyasachi.sengupta@alcatel-lucent.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 5 Jun 2015 15:13:28 +0000 (08:13 -0700)]
ofproto-dpif: Avoid creating OpenFlow ports for duplicate tunnels.
Until now, when two tunnels had an identical configuration, both of them
were assigned OpenFlow ports, but only one of those OpenFlow ports was
functional. With this commit, only one of the two (or more) identically
configured tunnels will be assigned an OpenFlow port number.
Reported-by: Keith Holleman <hollemanietf@gmail.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Co-authored-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
Non pmd threads have a core_id == UINT32_MAX, while queue ids used by
netdevs range from 0 to the number of CPUs. Therefore core ids cannot
be used directly to select a queue.
This commit introduces a simple mapping to fix the problem: pmd threads
continue using queues 0 to N (where N is the number of CPUs in the
system), while non pmd threads use queue N+1.
Fixes: d5c199ea7ff7 ("netdev-dpdk: Properly support non pmd threads.") Reported-by: 차은호 <eunho.cha@atto-research.com Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Mark D. Gray <mark.d.gray@intel.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Flavio Leitner <fbl@redhat.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Jesse Gross [Tue, 19 May 2015 00:17:41 +0000 (17:17 -0700)]
metaflow: Convert hex parsing to use new utility functions.
We now have functions that can do parsing and printing of long hex
strings, so we should use them for meta flow fields to ensure
consistent behavior.
Since these functions can handle infinitely long strings, we can
also increase the maximum field size for MFS_HEXADECIMAL types to
the limit allowed by NXM/OXM. This is useful for future large fields,
such as Geneve options.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Jarno Rajahalme [Tue, 2 Jun 2015 01:07:39 +0000 (18:07 -0700)]
bundles: Validate bundled messages.
OpenFlow bundle messages should be decoded and validated at the time
they are added to the bundle. This commit does this for flow mod and
port mod messages.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Fri, 29 May 2015 18:28:38 +0000 (11:28 -0700)]
classifier: Support duplicate rules.
OpenFlow 1.4 bundles are easier to implement when it is possible to
mark a rule as 'to_be_removed' and then insert a new, identical rule
with the same priority.
All but one out of the identical rules must be marked as
'to_be_removed', and the one rule that is not 'to_be_removed' must
have been inserted last.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Fri, 13 Mar 2015 23:35:49 +0000 (16:35 -0700)]
ovsdb-monitor: add ovsdb_monitor_changes
Currently, each monitor table contains a single hmap 'changes' to
track updates. This patch introduces a new data structure
'ovsdb_monitor_changes' that stores the updates 'rows' tagged by
its first commit transaction id. Each 'ovsdb_monitor_changes' is
refenece counted allowing multiple jsonrpc_monitors to share them.
The next patch will allow each ovsdb monitor table to store a list
of 'ovsdb_monitor_changes'. This patch stores only one, same as
before.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
jsonrpc_monitor_compose_update() seems to fit better than
jsonrpc_monitor_compose_table_update(), since it composes changes
from all tables. Albeit the original one is named after the
<table-updates> object described in RFC 7047.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Fri, 13 Mar 2015 06:15:56 +0000 (23:15 -0700)]
ovsdb-monitor: add transaction ids
With N:1 mappings, multiple jsonrpc server may be servicing the rpc
connection at a different pace. ovsdb-monitor thus needs to maintain
different change sets, depends on connection speed of each rpc
connections. Connections servicing at the same speed can share the
same change set.
Transaction ID is an concept added to describe the change set. One
possible view of the database state is a sequence of changes, more
precisely, commits be applied to it in order, starting from an
initial state, with commit 0. The logic can also be applied to the
jsonrpc monitor; each change it pushes corresponds to commits between
two transaction IDs.
This patch introduces transaction IDs. For ovsdb-monitor, it maintains
n_transactions, starting from 0. Each commit add 1 to the number.
Jsonrpc maintains and 'unflushed' transaction number, corresponding to
the next commit the remote has not seen. jsonrpc's job is simply to
notice there are changes in the ovsdb-monitor that it is interested in,
i.e. 'n_transactions' >= 'unflushed', get the changes in json format,
and push them to the remote site.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Thu, 9 Apr 2015 23:39:58 +0000 (16:39 -0700)]
ovsdb-monitor: stores jsonrpc-monitor in a linked-list
Currently, each ovsdb-monitor points to a single jsonrpc_monitor object.
This means there is 1:1 relationship between them.
In case multiple jsonrpc-monitors need to monitor the same tables and
the columns within them, then can share a single ovsdb-monitor, so the
updates only needs to be maintained once.
This patch, with a few following patches, will allow for N:1 mapping
between jsonrpc-monitor and ovsdb-monitor.
Maintaining jsonrpc-monitor pointers in a linked-list is essential
in allowing N:1 mapping. The ovsdb-monitor life cycle
is now reference counted. An empty list means zero references.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Refactoring ovsdb_monitor_get_initial() to not generate JSON object.
It only collect changes within the ovsdb_monitor().
ovsdb_jsonrpc_monitor_compose_table_update() is then used to generate
JSON object.
This change will also make future patch easier.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Wed, 11 Mar 2015 23:50:41 +0000 (16:50 -0700)]
jsonrpc-server: split monitors into database back end and JSON-RPC front
end
jsonrpc-server.c has two main functions. One deals with handling the
jsonrpc connections, the other deals with monitoring the database.
Currently, each jsonrpc connections has its own set of DB monitors.
This can be wasteful if a number of connections shares the same
monitors.
This patch, and a few following refactoring patches attempts to
split the jsonrpc handling front end off the main monitoring
functions within jsonrpc.c.
This patch changes the monitoring functions and data structures from
'ovsdb_jsonrpc_monitor_xxx' into 'ovsdb_monitor_xxx'
This and the following patches move the ovsdb_monitor backend functions
into their own file.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Thu, 9 Apr 2015 01:05:27 +0000 (18:05 -0700)]
ovsdb-test: add multiple clients to backlogged connection test
Backlogged connection test tests jsonrpc monitor's ability to combine
updates. Adding multiple clients to ensure that non-blocking clients
will get individual updates while blocking clients will get combined
updates.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
To be more explicit about which actions require datapath assistance,
split this out into a separate function. While this is fairly trivial
currently, there will be more special cases for the upcoming conntrack
changes.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 3 Mar 2015 21:28:32 +0000 (13:28 -0800)]
ofp-errors: Add Nicira extension code for OFPBMC_BAD_FIELD.
There are a couple of cases where OpenFlow 1.0 controllers that use
Nicira extensions can get OFPBMC_BAD_FIELD, so we should have an error
code for it in that protocol.
Jesse Gross [Mon, 18 May 2015 23:03:01 +0000 (16:03 -0700)]
odp-util: Geneve netlink decoding.
Even though userspace does not yet support Geneve options,
the kernel does and there is some basic support for decoding
those attributes. This adds the ability to print Geneve
attributes that might potentially come from the kernel.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Jesse Gross [Thu, 21 May 2015 01:47:21 +0000 (18:47 -0700)]
util: Library routines for printing and scanning large hex integers.
Geneve options are variable length and up to 124 bytes long, which means
that they can't be easily manipulated by the integer string functions
like we do for other fields. This adds a few helper routines to make
these operations easier.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Jesse Gross [Sun, 17 May 2015 05:08:20 +0000 (22:08 -0700)]
odp-util: Format tunnel attributes directly from netlink.
When we format most netlink attributes we do so from the netlink
itself, iterating through each one and printing the contents out.
However, for tunnels we don't do this - we first convert to the
OVS userspace representation and then format that. While convienient,
this isn't really ideal as the primary use of printing netlink
attributes is debugging and this conversion is lossy, particularly
when the attributes aren't as expected. The result is that unexpected
keys are silently ignored and the level of detail on errors is
minimal.
This situation becomes worse when we introduce support for Geneve.
The conversion to userspace format requires additional information
which we might not have (ovs-dpctl) and is more complicated than
other attributes so it is likely to be confusing in the event of a
bug. The information from the kernel is self-describing so it's
much more reliable to display it directly from the netlink.
This converts tunnel attribute formatting to be more similar to
other types of attributes. As a nice bonus the output becomes
more compact because it doesn't print zeroed out attributes in
cases where they aren't relevant and therefore not present.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Jesse Gross [Wed, 20 May 2015 18:57:35 +0000 (11:57 -0700)]
odp-util: Correctly generate wildcards when formating nested attributes.
When formatting netlink attributes if no mask is present a wildcarded
attribute is synthesized for the purposes of later processing. In
the case of nested attributes this must be done recursively, filling
in the correct attributes at each level rather than just generating
a set of zeros of the correct size. This is done already but it
always uses the attribute type for the top level keys - this corresponds
to nested ENCAP attributes. However, we have several levels of potentially
nested attributes for tunnels that each have their own types.
This uses an approach similar to the kernel where we have sets of
tables for the type of each attribute linked together by pointers.
This allows the mask generation function to automatically traverse
the nested attributes and always get the right types.
Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Sorin Vinturis [Wed, 27 May 2015 17:08:00 +0000 (17:08 +0000)]
datapath-windows: Removed memory barrier and master lock
There is no need to enforce Netlink serialization on transactions
sent from userspace. The access to the driver's shared resources
is synchronized anyway. Thus I have removed the master lock.
I also removed the memory barrier from filter dispatch routine. A
memory barrier is already in place in OvsReleaseSwitchContext
function, due to the use of InterlockedCompareExchange function.
Sorin Vinturis [Wed, 27 May 2015 16:58:25 +0000 (16:58 +0000)]
datapath-windows: Support for multiple VXLAN tunnels
At the moment the OVS extension supports only one VXLAN tunnel that
is cached in the extension switch context. Replaced the latter
cached pointer with an array list that contains all VXLAN tunnel
vports.
Sorin Vinturis [Wed, 27 May 2015 16:58:25 +0000 (16:58 +0000)]
datapath-windows: Support for custom VXLAN tunnel port
The kernel datapath supports only port 4789 for VXLAN tunnel creation.
Added support in order to allow for the VXLAN tunnel port to be
configurable to any port number set by the userspace.
The patch also checks to see if an existing WFP filter, for the
necessary UDP tunnel port, is already created before adding a new one.
This is a double check, because currently the userspace also verifies
this, but it is necessary to avoid future issues.
Custom VXLAN tunnel port requires the addition of a new WFP filter
with the new UDP tunnel port. The creation of a new WFP filter is
triggered in OvsInitVxlanTunnel function and the removal of the WFP
filter in OvsCleanupVxlanTunnel function.
But the latter functions are running at IRQL = DISPATCH_LEVEL, due
to the NDIS RW lock acquisition, and all WFP calls must be running at
IRQL = PASSIVE_LEVEL. This is why I have created a system thread which
records all filter addition/removal requests into a list for later
processing by the system thread. The ThreadStart routine processes all
received requests at IRQL = PASSIVE_LEVEL, which is the required IRQL
for the necessary WFP calls for adding/removal of the WFP filters.
The WFP filter for the default VXLAN port 4789 is not added anymore at
filter attach. All WFP filters for the tunnel ports are added when the
tunnel ports are initialized and are removed at cleanup. WFP operation
status is then reported to userspace.
It is necessary that OvsTunnelFilterUninitialize function is called
after OvsClearAllSwitchVports in order to allow for the added WFP
filters to be removed. OvsTunnelFilterUninitialize function closes the
global engine handle used by most of the WFP calls, including filter
removal.
Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com> Reported-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/66 Acked-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Wed, 20 May 2015 17:35:15 +0000 (10:35 -0700)]
tests: Fix in_port(name) test for ofproto/trace.
Commit c2a77f33ade (tests/ofproto-dpif: Use vlog to test dpif
behaviour.) mistakenly changed the test which checked that ovs-dpctl
accepts named ports as input. Restore the name to the test.
Reported-by: Gurucharan Shetty <gshetty@nicira.com> Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Tue, 19 May 2015 21:13:12 +0000 (14:13 -0700)]
extract-ofp-fields: Detect duplicate fields.
Figure out if a developer accidentally defines new NXM fields using an
existing number, and warn them. Useful particularly if new fields are
introduced upstream while rebasing an in-progress patchset.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
ovs_threads: Avoid running pthread destructors from main thread exit.
Windows uses pthreads-win32 library to provide the Linux pthread
functionality. It is observed that when the main thread calls
a pthread destructor after it exits, undefined behavior is seen
(e.g., junk values in data, causing pthread deadlocks).
Similar behavior has been seen by
other people as seen in the following email thread:
https://sourceware.org/ml/pthreads-win32/2003/msg00001.html
To avoid this, this commit de-registers the thread destructor
when the main thread exits (via the atexit handler).
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Although the ovs-ctl/ovs-lib takes care of creating the rundir,
it is correct to let the systemd manages the directory and let
the rpm know about the ownership too.
Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
ovs-docker: Add the ability to set the mac address.
For testing OVN, it is useful to set the mac address
of the container. Since ovs-docker hasn't been part
of any released versions of OVS, it is probably OK
to change the options style.
Ansis Atteka [Tue, 26 May 2015 23:49:49 +0000 (16:49 -0700)]
debian: install openvswitch kernel module under "updates" directory
This patch fixes a bug where "modprobe openvswitch" command on Ubuntu
distribution would have sometimes tried to load OVS kernel module that
shipped together with Linux Kernel, even though one had also installed
OVS datapath debian package created with module-assistant. Because of
this issue force-reload-kmod command occasionally malfunctioned and
failed to load the right kernel module.
This bug happened *occasionally* because the default Ubuntu depmod
configuration in /etc/depmod.d/ubuntu.conf is set to look for kernel
modules first in "updates" directory, then in "ubuntu" directory and
then in other directories. If there were two openvswitch.ko modules
in "other directories", then modprobe would have loaded kernel
module that was nondeterministically listed first by file system.
Signed-off-by: Ansis Atteka <aatteka@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Wed, 20 May 2015 20:14:29 +0000 (13:14 -0700)]
Revert "ovs-ofctl: Always prints recirc_id in decimal"
As there is the potential for this field to be maskable in future, and
the dpctl "-m" output prints a mask for it, return it to hexadecimal.
The next patch will make this consistent to the recirc action by making
the action print the recirc_id in hex as well.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
Joe Stringer [Fri, 22 May 2015 17:24:34 +0000 (10:24 -0700)]
dpctl: Don't print UFID if not present.
With verbose dpctl, if userspace runs against an older kernel, every
entry will have "ufid:<empty>" at the beginning. This is unnecessary and
introduces an additional format for scripts to parse. Drop it.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Tue, 19 May 2015 21:20:31 +0000 (14:20 -0700)]
extract-ofp-fields: Port to python3.
Mostly "print foo" -> "print(foo)" and "iteritems() -> items()". The
latter may be less efficient in python2, but we're not dealing with
massive numbers of items here so it shouldn't noticably slow the build.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
netdev-dpdk: Adapt the requested number of tx and rx queues.
This commit changes the semantics of 'netdev_set_multiq()' to allow OVS
DPDK to run on device with limited multi queue support.
* If a netdev doesn't have the requested number of rxqs it can simply
inform the datapath without failing.
* If a netdev doesn't have the requested number of txqs it should try
to create as many as possible and use locking.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Right now ethernet and ring devices use a mutex, while vhost devices use
a mutex or a spinlock to protect statistics. This commit introduces a
single spinlock that's always used for stats updates.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
We used to reserve DPDK lcore 0 for non pmd operations, making it
difficult to use core 0 for packet processing.
DPDK 2.0 properly support non EAL threads with lcore LCORE_ID_ANY.
Using non EAL threads for non pmd threads, we do not need to reserve
any core for non pmd operations
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Jarno Rajahalme [Fri, 22 May 2015 18:22:40 +0000 (11:22 -0700)]
datapath: Support masked set actions.
OVS kernel module support for masked set actions in already upstream
in Linux (commit 83d2b9ba1abca241df44a502b6da950a25856b5b). This
patch adds the same for the OVS tree kernel module.
The existing set action sets many fields at once. When only a subset
of the IP header fields, for example, should be modified, all the IP
fields need to be exact matched so that the other field values can be
copied to the set action. A masked set action allows modification of
an arbitrary subset of the supported header bits without requiring the
rest to be matched.
Masked set action is now supported for all writeable key types, except
for the tunnel key. The set tunnel action is an exception as any
input tunnel info is cleared before action processing starts, so there
is no tunnel info to mask.
The kernel module converts all (non-tunnel) set actions to masked set
actions. This makes action processing more uniform, and results in
less branching and duplicating the action processing code. When
returning actions to userspace, the conversion is inverted. We use a
kernel internal action code to be able to tell the userspace provided
and converted masked set actions apart.
Having the same RSS hash after recirculation can cause unnecessary
collisions in the exact match cache. A simple solution is to rehash it
with the recirculation depth if it is non-zero.
Suggested-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ethan Jackson [Wed, 20 May 2015 23:55:17 +0000 (16:55 -0700)]
dpif-netdev: Clear flow batches before execute.
When executing actions, it's possible a recirculation will occur
causing dp_netdev_input() to be called multiple times. If the batch
pointers embedded in dp_netdev_flow aren't cleared, it's possible
packets after the recirculation will be reinserted into a batch
associated with the original lookup. This could be very bad.
This patch fixes the problem by zeroing out flow batch pointers before
calling packet_batch_execute(). This probably has a slightly negative
performance impact, though I haven't tried it.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Kevin Traynor [Thu, 21 May 2015 16:26:48 +0000 (17:26 +0100)]
netdev-dpdk: Use default NIC configuration.
This patch simplifies Rx/Tx NIC configuration by removing
custom values and using the defaults provided by the DPDK
PMDs. This also enables Rx vectorisation which improves
performance.
Signed-off-by: Kevin Traynor <kevin.traynor@intel.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Ciara Loftus [Wed, 13 May 2015 13:54:56 +0000 (14:54 +0100)]
dpif-netdev: Increase the number of EMC entries
Prior to this commit, the number of possible entries in the Exact
Match Cache stood at 1024 per thread exacting to 0.18Mb. A typical
server system will have 2.5Mb cache per core meaning a larger EMC will
comfortably fit in. This patch increases the number of entries to 8192
per thread (1.4Mb) which in turn yields improved throughput when
processing multiple flows of traffic.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ethan Jackson [Sat, 16 May 2015 15:18:20 +0000 (08:18 -0700)]
dpdk: Ditch MAX_PKT_BURST macro.
The MAX_PKT_BURST and NETDEV_MAX_RX_BATCH macros had a confusing
relationship. They basically purport to do the same thing, making it
unclear which is the source of truth.
Furthermore, while NETDEV_MAX_RX_BATCH was 256, MAX_PKT_BURST was 32,
meaning we never process a batch larger than 32 packets further adding
to the confusion.
This patch resolves the issue by removing MAX_PKT_BURST completely,
and shrinking the new NETDEV_MAX_BURST macro to only 32. This should
have no change in the execution path except shrinking a couple of
structs and memory allocations (can't hurt).
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>