git.proxmox.com Git - ovs.git/log

Alin Serdean [Tue, 10 Jan 2017 16:48:28 +0000 (16:48 +0000)]

datapath-windows: Fix alignment in MapTunAttrToFlowPut

Found by inspection.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Shashank Ram [Tue, 24 Jan 2017 20:37:32 +0000 (12:37 -0800)]

datapath-windows: Add a wrapper to retrieve external vport

This wrapper improves readability.

Signed-off-by: Shashank Ram <rams@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Alin Serdean [Thu, 26 Jan 2017 23:43:39 +0000 (23:43 +0000)]

datapath-windows: Add support for OVS_KEY_ATTR_TCP set action

This patch adds support for set action with OVS_KEY_ATTR_TCP attribute
(change TCP source or destination port).

If the source or destination TCP port was changed, update the TCP checksum.

A sample flow can look like the following:
set(tcp(src=80,dst=443))

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
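
To illustrate the checksum update this commit describes, here is a minimal Python sketch of the incremental update from RFC 1624, applied when a 16-bit port field changes. It is not the Windows datapath code: the function names are hypothetical, and the toy "segment" stands in for the real TCP pseudo-header, header, and payload.

```python
def ones_complement_add(a, b):
    """Add two 16-bit values with end-around carry."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def update_checksum16(checksum, old_field, new_field):
    """Incrementally update an Internet checksum after one 16-bit field
    changes, following RFC 1624: HC' = ~(~HC + ~m + m')."""
    total = ones_complement_add(~checksum & 0xFFFF, ~old_field & 0xFFFF)
    total = ones_complement_add(total, new_field)
    return ~total & 0xFFFF


def internet_checksum(words):
    """Full checksum over 16-bit words, used here only to cross-check."""
    total = 0
    for word in words:
        total = ones_complement_add(total, word)
    return ~total & 0xFFFF


# Toy "segment": source port, destination port, and two payload words.
segment = [1234, 5678, 0xABCD, 0x0042]
checksum = internet_checksum(segment)

# Apply set(tcp(src=80,dst=443)) one field at a time.
checksum = update_checksum16(checksum, segment[0], 80)
checksum = update_checksum16(checksum, segment[1], 443)
```

The incremental result matches a full recomputation over the rewritten words, which is why only the old and new port values are needed.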

Alin Serdean [Thu, 26 Jan 2017 23:41:46 +0000 (23:41 +0000)]

datapath-windows: Add support for OVS_KEY_ATTR_UDP set action

This patch adds support for set action with OVS_KEY_ATTR_UDP attribute
(change UDP source or destination port).

If the source or destination UDP port was changed, update the UDP checksum.

A sample flow can look like the following:
set(udp(src=67,dst=68))

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
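
UDP adds two special cases to the usual incremental checksum update, sketched below in Python. This is an assumption-laden illustration, not the code from this commit: the function names are hypothetical, and whether the Windows datapath handles these cases in exactly this way is not stated in the commit message. Over IPv4, a UDP checksum of 0 means that no checksum was computed, and a computed value of 0 is transmitted as 0xFFFF (RFC 768).

```python
def ones_complement_add(a, b):
    """Add two 16-bit values with end-around carry."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def update_udp_checksum(checksum, old_port, new_port):
    """Return the UDP checksum after a port rewrite (RFC 1624 update).

    A checksum of 0 means "no checksum" over IPv4, so it is left alone.
    A computed result of 0 is sent as 0xFFFF instead.
    """
    if checksum == 0:
        return 0
    total = ones_complement_add(~checksum & 0xFFFF, ~old_port & 0xFFFF)
    total = ones_complement_add(total, new_port)
    result = ~total & 0xFFFF
    return result or 0xFFFF


# Rewrite a source port from 1000 to 67, then back again.
rewritten = update_udp_checksum(0x1234, 1000, 67)
restored = update_udp_checksum(rewritten, 67, 1000)
```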

Alin Serdean [Thu, 26 Jan 2017 23:45:16 +0000 (23:45 +0000)]

datapath-windows: Add function to get continuous buffer from context

This patch extracts the code that tries to get a contiguous IPv4 header
buffer from the function 'OvsUpdateIPv4Header' and moves it to a new
function 'OvsGetHeaderBySize'.

The new function can be used later when changing other headers, such as
UDP, TCP, or MPLS.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Alin Serdean [Thu, 26 Jan 2017 23:38:17 +0000 (23:38 +0000)]

datapath-windows: OvsUpdateIPv4Header remove unnecessary addition

bufferStart can be used directly to access the data of the net buffer.
Add the MDL offset to save unnecessary additions.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Sairam Venugopal [Fri, 27 Jan 2017 07:56:04 +0000 (23:56 -0800)]

Windows: Fix wmi.c to use count of wchar_t instead of sizeof

wcscat_s and wcscpy_s require the number of elements as an argument. wchar_t
uses 2 bytes of storage, so passing sizeof(internal_port_query) causes an
access violation error on Windows 2012 R2 (64 bit). This patch introduces
a #define WMI_QUERY_COUNT set to 2048 and uses that instead.

Reported-by: Sairam Venugopal <vsairam@vmware.com>
Reported-at: openvswitch/ovs-issues#121
Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
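
The arithmetic behind this fix can be spelled out in a few lines of Python. This sketch assumes the buffer is declared with WMI_QUERY_COUNT wchar_t elements and that wchar_t is 2 bytes, as the commit message states for Windows; the variable names are illustrative only.

```python
WCHAR_SIZE_WINDOWS = 2   # bytes per wchar_t on Windows, per the commit message
WMI_QUERY_COUNT = 2048   # element count, as named in the commit

# sizeof() on the buffer reports bytes, not elements.
buffer_bytes = WMI_QUERY_COUNT * WCHAR_SIZE_WINDOWS

claimed_with_sizeof = buffer_bytes     # elements claimed when passing sizeof
claimed_with_count = WMI_QUERY_COUNT   # elements claimed after the fix

# How many elements past the real end the callee was told it could use.
overstated_by = claimed_with_sizeof - claimed_with_count
```

Passing the byte size tells the function that the buffer holds twice as many elements as it really does.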

Alin Serdean [Fri, 16 Dec 2016 02:50:28 +0000 (02:50 +0000)]

windows: WSAPoll broken on windows

Unfortunately, WSAPoll misbehaves on Windows. For a detailed description of
the behavior, see: https://github.com/openvswitch/ovs-issues/issues/117

We replace WSAPoll with select, looking only for errors and write events.

Reported-at: https://github.com/openvswitch/ovs-issues/issues/117
Reported-by: Yin Lin <linyi@vmware.com>
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
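
The idea of waiting with select for write and error events only can be sketched with Python's standard select module. This is an analogy to the approach described above, not the C code from the patch; wait_writable is a hypothetical helper name.

```python
import select
import socket


def wait_writable(sock, timeout):
    """Return (writable, errored) for one socket using select().

    The read set is left empty: only write readiness and exceptional
    conditions are of interest here.
    """
    _, writable, errored = select.select([], [sock], [sock], timeout)
    return bool(writable), bool(errored)


# A connected pair of sockets is writable right away.
a, b = socket.socketpair()
try:
    writable, errored = wait_writable(a, 1.0)
finally:
    a.close()
    b.close()
```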

Mickey Spiegel [Fri, 27 Jan 2017 01:31:10 +0000 (17:31 -0800)]

ovn: rewrite redirect-chassis description in ovn-nb.xml

This optional patch addresses offline comments that the documentation
in ovn-nb.xml should not describe southbound constructs or flow
details, since it is user-facing documentation.

Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Mickey Spiegel [Fri, 27 Jan 2017 01:31:09 +0000 (17:31 -0800)]

ovn: ovn-nbctl commands for distributed NAT

This patch adds the new optional arguments "logical_port" and
"external_mac" to lr-nat-add, and displays that information in
lr-nat-list.

Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Mickey Spiegel [Fri, 27 Jan 2017 01:31:08 +0000 (17:31 -0800)]

ovn: distributed NAT flows

This patch implements the flows required in the ingress and egress
pipeline stages in order to support NAT on a distributed logical router.

NAT functionality is associated with the logical router gateway port.
The flows that carry out NAT functionality all have match conditions on
inport or outport equal to the logical router gateway port.  There are
additional flows that are used to redirect traffic when necessary,
using the tunnel key of a "chassisredirect" SB port binding in order to
redirect traffic to the instance of the logical router gateway port on
the centralized "redirect-chassis".

North/south traffic subject to one-to-one "dnat_and_snat" is handled
in a distributed manner, with south-to-north traffic going to the
local instance of the logical router gateway port.  North/south
traffic subject to (possibly one-to-many) "snat" is handled in a
centralized manner, with south-to-north traffic going to the instance
of the logical router gateway port on the "redirect-chassis".
North-to-south traffic is directed to the corresponding chassis by
limiting ARP responses to the appropriate instance of the logical
router gateway port on one chassis.  For centralized NAT rules, this
is the instance on the "redirect-chassis".  For distributed NAT rules,
this is the chassis where the corresponding logical port resides, using
an ethernet address specified in the NB NAT rule to trigger upstream
MAC learning.

East/west NAT traffic is all handled in a centralized manner.  While it
is certainly possible to handle some of this traffic in a distributed
manner, the centralized approach keeps the NAT flows simpler and
cleaner.  The expectation is that east/west NAT traffic is not as
important to optimize as north/south NAT traffic, with most east/west
traffic not requiring NAT.

Automated tests are currently limited to only a single node.  The
single node automated tests cover both north/south and east/west
traffic flows.

Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
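
The split between distributed and centralized handling described in this commit can be summarized as a small decision function. The Python sketch below is a reading of the commit message only, not of the generated flows. The function name and arguments are hypothetical, and it assumes that a "dnat_and_snat" rule is distributed only when the NB NAT rule carries a logical port and an ethernet address.

```python
def nat_handling(direction, nat_type, has_logical_port_and_mac=False):
    """Classify where traffic for a NAT rule is processed.

    direction: "north-south" or "east-west"
    nat_type:  "snat", "dnat", or "dnat_and_snat"
    """
    if direction == "east-west":
        # East/west NAT traffic is all handled centrally.
        return "centralized"
    if nat_type == "dnat_and_snat" and has_logical_port_and_mac:
        # One-to-one NAT can stay on the chassis of the logical port.
        return "distributed"
    # Everything else goes to the "redirect-chassis".
    return "centralized"
```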

Mickey Spiegel [Fri, 27 Jan 2017 01:31:07 +0000 (17:31 -0800)]

ovn: avoid snat recirc only on gateway routers

Currently, for performance reasons on gateway routers, ct_snat
that does not specify an IP address does not immediately trigger
recirculation.  On gateway routers, ct_snat that does not specify
an IP address happens in the UNSNAT pipeline stage, which is
followed by the DNAT pipeline stage that triggers recirculation
for all packets.  This DNAT pipeline stage recirculation takes
care of the recirculation needs of UNSNAT as well as other cases
such as UNDNAT.

On distributed routers, UNDNAT is handled in the egress pipeline
stage, separately from DNAT in the ingress pipeline stages.  The
DNAT pipeline stage only triggers recirculation for some packets.
Due to this difference in design, UNSNAT needs to trigger its own
recirculation.

This patch restricts the logic that avoids recirculation for
ct_snat, so that it only applies to datapaths representing
gateway routers.

Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Mickey Spiegel [Fri, 27 Jan 2017 01:31:06 +0000 (17:31 -0800)]

ovn: move load balancing flows after NAT flows

This will make it easy for distributed NAT to reuse some of the
existing code for NAT flows, while leaving load balancing and defrag
as functionality specific to gateway routers. There is no intent to
change any functionality in this patch.

Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Andy Zhou [Tue, 17 Jan 2017 23:56:58 +0000 (15:56 -0800)]

dp-packet: Enhance packet batch APIs.

One common use case of 'struct dp_packet_batch' is to process all
packets in the batch in order. Add an iterator for this use case
to simplify the logic at call sites.

Another common use case is to drop packets in the batch, by reading
all packets, but writing back pointers of fewer packets. Add macros
to support this use case.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
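
The two use cases named in this commit, in-order iteration and dropping by writing back fewer pointers, can be sketched in Python as follows. PacketBatch and its methods are hypothetical stand-ins for 'struct dp_packet_batch' and the new iterator and macros; the real API is in C and is not reproduced here.

```python
class PacketBatch:
    """Toy stand-in for a batch of packet references."""

    def __init__(self, packets):
        self.packets = list(packets)

    def __iter__(self):
        # Use case 1: visit every packet in the batch, in order.
        return iter(self.packets)

    def refill(self, keep):
        """Use case 2: read every packet, write back only the kept ones."""
        count = 0
        for packet in self.packets:           # read position covers all
            if keep(packet):
                self.packets[count] = packet  # write position trails behind
                count += 1
        del self.packets[count:]
        return count


# Packets are represented by their sizes; drop anything above 1500 bytes.
batch = PacketBatch([64, 1500, 9000, 128])
kept = batch.refill(lambda size: size <= 1500)
remaining = list(batch)
```

Because the write position never passes the read position, the batch can be compacted in place without a second array.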

Andy Zhou [Wed, 18 Jan 2017 10:30:26 +0000 (02:30 -0800)]

netdev-dummy: Add --len option for netdev-dummy/receive command

Currently, there is no way to specify the packet size when injecting
a packet via "netdev-dummy/receive" with a flow specification. So far,
packet size has not been important for testing OVS features, but it
will be useful when writing unit tests for future patches.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>

Alin Serdean [Thu, 26 Jan 2017 22:49:40 +0000 (22:49 +0000)]

windows: Change driver and MSI company name to LF

Until now we used 'Open vSwitch' as the company/organization name.

The project is now under The Linux Foundation ownership.

This patch updates the MSI and driver attributes to reflect that ownership.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Ben Pfaff [Thu, 26 Jan 2017 19:28:32 +0000 (11:28 -0800)]

lib: Remove generated ovs-fields.7 manpage on make clean.

Found by travis.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>

Aaron Conole [Thu, 26 Jan 2017 19:07:00 +0000 (14:07 -0500)]

libX.pc: use the correct output directory

When the ovsdb library pkgconfig changes were introduced, they placed
generated output in the src directory. This is incorrect, however, as
the output files should actually be placed in the build directory. It
is only seen when running `make distcheck` after enabling shared
libraries (ex: `./configure --enable-shared`).

Fixes: commit e72e07a97e95 ("lib: Add support for pkgconfig for libovsdb.")
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

Alin Serdean [Thu, 26 Jan 2017 18:37:42 +0000 (18:37 +0000)]

Add ovs-fields.7 to lib/.gitignore

Found by inspection.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

Alin Serdean [Thu, 26 Jan 2017 17:57:41 +0000 (17:57 +0000)]

test windows: appctl - route/add with gateway

This test passes on Windows; change the test accordingly.

Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Ben Pfaff [Thu, 26 Jan 2017 04:29:48 +0000 (20:29 -0800)]

extract-ofp-fields: Define .TQ directive in nroff output.

This missing directive caused groff warnings and probably some erroneous
output too.

Fixes: 96fee5e0a2a0 ("ovs-fields: New manpage to document Open vSwitch and OpenFlow fields.")
Reported-by: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Daniele Di Proietto <diproiettod@ovn.org>

Ben Pfaff [Thu, 26 Jan 2017 04:28:06 +0000 (20:28 -0800)]

ovs-fields: Eliminate non-ASCII characters from groff input.

It's difficult to make groff portably accept non-ASCII characters. It's
easier to replace them by groff escapes for the same characters, which
this commit does.

Fixes: 96fee5e0a2a0 ("ovs-fields: New manpage to document Open vSwitch and OpenFlow fields.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>

Numan Siddique [Wed, 25 Jan 2017 07:55:12 +0000 (13:25 +0530)]

ovn-northd: Add flows in DHCP_OPTIONS pipeline to support renew requests

ovn-northd adds the flows to send the DHCPv4 packets to ovn-controller
only with the match ip4.src = 0.0.0.0 and ip4.dst = 255.255.255.255.

When a DHCPv4 lease is about to expire, before sending a DHCPDISCOVER
packet, the client can send a DHCPREQUEST packet to renew its IP address,
with ip4.src set to its offered IP address and ip4.dst set to the DHCP
server IP address or the broadcast address.

This patch supports this missing scenario by adding the necessary
flows in DHCP_OPTIONS ingress pipeline.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
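
The address conditions described in this commit can be written out as a predicate. The Python sketch below covers only the ip4.src and ip4.dst part of the match. The function name is hypothetical, and other conditions that real flows would carry (such as the ingress port or UDP ports) are deliberately left out.

```python
from ipaddress import IPv4Address

BROADCAST = IPv4Address("255.255.255.255")
UNSPECIFIED = IPv4Address("0.0.0.0")


def matches_dhcp_flow(ip_src, ip_dst, offered_ip, server_ip):
    """Return True if the addresses match one of the DHCPv4 cases."""
    src, dst = IPv4Address(ip_src), IPv4Address(ip_dst)
    # Original case: a client without an address, broadcasting.
    if src == UNSPECIFIED and dst == BROADCAST:
        return True
    # Added case: a renewal sent from the client's offered address to
    # the DHCP server address or to the broadcast address.
    return src == IPv4Address(offered_ip) and dst in (
        IPv4Address(server_ip), BROADCAST)
```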

Ben Pfaff [Wed, 25 Jan 2017 21:58:03 +0000 (13:58 -0800)]

ovs-fields: New manpage to document Open vSwitch and OpenFlow fields.

There is still plenty of opportunity for improvement, but this new
ovs-fields(7) manpage is much more comprehensive than ovs-ofctl(8)
could be.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>

Ben Pfaff [Wed, 28 Dec 2016 20:41:31 +0000 (12:41 -0800)]

nroff: Improve formatting of ASCII diagrams.

This makes diagrams in ASCII output look about as good as one might
reasonably expect, so that in ovn-architecture(7), for example, this:

. 9 bits: reserved (0)
. 15 bits: ingress port
. 16 bits: egress port
. 24 bits: datapath

now gets formatted as:

    9          15          16         24
+--------+------------+-----------+--------+
|reserved|ingress port|egress port|datapath|
+--------+------------+-----------+--------+
    0

which isn't perfect but certainly more evocative than a bulleted list.

This will be more useful in upcoming commits that start using diagrams more
frequently.

Signed-off-by: Ben Pfaff <blp@ovn.org>
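
As a rough idea of how a list of field widths turns into such a box diagram, here is a small Python sketch. It is not the nroff generator from this commit: render_fields is a hypothetical function, and it omits the row of values (the "0" under "reserved").

```python
def render_fields(fields):
    """Render (bits, name) pairs as a one-row ASCII box diagram."""
    border = "+" + "+".join("-" * len(name) for _, name in fields) + "+"
    names = "|" + "|".join(name for _, name in fields) + "|"
    # Center each bit width over its box; the leading space skips the
    # left border column.
    widths = " " + " ".join(
        str(bits).center(len(name)) for bits, name in fields)
    return "\n".join([widths.rstrip(), border, names, border])


diagram = render_fields([(9, "reserved"), (15, "ingress port"),
                         (16, "egress port"), (24, "datapath")])
print(diagram)
```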

Andy Zhou [Fri, 20 Jan 2017 06:40:14 +0000 (22:40 -0800)]

xlate: Generate of datapath clone action when supported

Add logic to detect whether the datapath supports clone.
Enhance the xlate logic to make use of it.
Add logic to turn clone support on or off for testing.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>

Andy Zhou [Wed, 11 Jan 2017 02:13:47 +0000 (18:13 -0800)]

dpif-netdev: Add clone action

Add support for userspace datapath clone action.  The clone action
provides an action envelope to enclose an action list.
For example, with actions A, B, C and D,  and an action list:
      A, clone(B, C), D

The clone action will ensure that:

- D will see the same packet, and any meta states, such as flow, as
  action B.

- D will be executed regardless of whether B or C drops a packet. They
  can only drop a clone.

- When B drops a packet, clone will skip all remaining actions
  within the clone envelope. This feature is useful when we add
  the meter action later:  The meter action can be implemented as a
  simple action without its own envelope (unlike the sample action).
  When necessary, the flow translation layer can enclose a meter action
  in clone.

The clone action is very similar to the OpenFlow clone action.
This is by design to simplify vswitchd flow translation logic.

Without datapath clone, vswitchd simulates the effect by inserting
datapath actions to "undo" clone actions. The above flow would be
translated into   A, B, C, -C, -B, D.

However, there are two issues:
- The resulting datapath action list may be longer without using
  clone.

- Some actions, such as NAT, may not be possible to reverse.

This patch implements clone() simply with packet copy. The performance
can be improved with later patches, for example, to delay or avoid
packet copy if possible.  It seems datapath should have enough context
to carry out such optimization without the userspace context.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
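
The guarantees listed in this commit can be checked against a toy interpreter. The Python sketch below models clone() as "run the enclosed list on a copy of the packet", which is how the commit message describes the initial implementation. All names here (execute, dec_ttl, set_port, drop) are hypothetical, and packets are plain dictionaries.

```python
import copy


def execute(actions, packet, log):
    """Run actions in order; return False as soon as one drops the packet."""
    for action in actions:
        if action[0] == "clone":
            # The enclosed list runs on a copy. A drop inside the clone
            # skips the rest of the clone only.
            execute(action[1], copy.deepcopy(packet), log)
            continue
        name, func = action
        log.append((name, dict(packet)))  # record what this action saw
        if func(packet) is False:
            return False
    return True


def dec_ttl(packet):
    packet["ttl"] -= 1


def set_port(packet):
    packet["dst_port"] = 443


def drop(packet):
    return False


log = []
packet = {"ttl": 64, "dst_port": 80}
# A, clone(B, C), D where B drops the clone.
actions = [("A", dec_ttl),
           ("clone", [("B", drop), ("C", set_port)]),
           ("D", set_port)]
delivered = execute(actions, packet, log)
```

In this run C never executes, D still runs, and D sees the same packet contents that B saw.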

Andy Zhou [Fri, 13 Jan 2017 01:08:53 +0000 (17:08 -0800)]

lib: Add nl_msg_end_non_empty_nested()

A later patch will make use of nl_msg_end_non_empty_nested().

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>

Andy Zhou [Fri, 20 Jan 2017 06:13:19 +0000 (22:13 -0800)]

dpif-netdev: Avoid sending probe packets

When ofproto probes for datapath features, no packets should actually
be sent to the network. This patch fixes the userspace datapath by
dropping probe packets before action execution.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>

Russell Bryant [Wed, 7 Dec 2016 19:24:45 +0000 (14:24 -0500)]

ovn: Document upgrade procedure.

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>

Russell Bryant [Thu, 19 Jan 2017 19:11:48 +0000 (14:11 -0500)]

doc: Remove tutorials/ovn-basics.

The only thing worse than a lack of documentation is incorrect or
out-of-date documentation. Over time, this document has not kept up with
the pace of OVN and is no longer a good current resource.

For a sandbox based tutorial like this, I'd like to start over using
ovn-trace as the basis.

An even more important type of tutorial would be something along the lines
of: http://blog.spinhirne.com/p/blog-series.html

That blog series was fantastic and has been the primary tutorial reference
I have been sending people to since it was written.

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>

Ben Pfaff [Fri, 20 Jan 2017 17:06:23 +0000 (09:06 -0800)]

actions: Add new "ct_clear" action.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>

Ben Pfaff [Sat, 21 Jan 2017 19:03:42 +0000 (11:03 -0800)]

actions: Make "next" action able to jump from egress to ingress pipeline.

This feature is useful for centralized gateways.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>

Ben Pfaff [Fri, 20 Jan 2017 16:56:19 +0000 (08:56 -0800)]

actions: Introduce enum ovnact_pipeline.

This isn't used yet by the actions code, but an upcoming commit will
introduce a user. This commit just adjusts ovn-trace to use this common
type instead of its own local type.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>

Ben Pfaff [Fri, 20 Jan 2017 21:44:27 +0000 (13:44 -0800)]

actions: Omit table number when possible for formatting "next" action.

Until now, formatting the "next" action has always required including
the table number, because the action struct didn't include enough context
so that the formatter could decide whether the table number was the next
table or some other table.  This is more or less OK, but an upcoming commit
will add a "pipeline" field to the "next" action, which means that the same
policy there would require that the pipeline always be printed.  That's a
little obnoxious because 99+% of the time, the pipeline to be printed is
the same pipeline that the flow is in and printing it would be distracting.
So it's better to store some context to help with formatting.  This commit
begins adopting that policy for the existing table number field.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
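
The formatting policy described here comes down to one comparison once the flow's own table is known. A minimal Python sketch, with a hypothetical function name and the flow's table passed in as the stored context:

```python
def format_next(flow_table, target_table):
    """Format a "next" action, omitting the table number when implied."""
    if target_table == flow_table + 1:
        # The common case: advance to the table after the current one.
        return "next;"
    return "next(%d);" % target_table
```

For a flow in table 4, format_next(4, 5) gives "next;" while format_next(4, 9) gives "next(9);".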

Ben Pfaff [Fri, 20 Jan 2017 06:29:17 +0000 (22:29 -0800)]

actions: Separate action structures for "next" and "ct_next".

These actions aren't very similar but until now they both had the same
action structure. These structures are going to diverge in an upcoming
commit, so separate them now.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>

Ben Pfaff [Fri, 20 Jan 2017 21:41:23 +0000 (13:41 -0800)]

actions: Add new OVN action "clone".

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>

Ben Pfaff [Thu, 19 Jan 2017 19:11:04 +0000 (11:11 -0800)]

actions: Make "free" functions per-struct, not per-action.

In some cases multiple kinds of OVN action share the same structure. In
all of these cases, a given kind of structure is freed one particular way
(it would be confusing if this were not the case), so there's no benefit
in having per-action free functions. Therefore, this commit switches to
a free function per structure type.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>

Ben Pfaff [Sat, 21 Jan 2017 18:44:57 +0000 (10:44 -0800)]

ovn-trace: Fix selection of table that "next" jumps to.

The common case is that "next" advances to the next table, but it can
jump to any table.

Reported-by: Mickey Spiegel <mickeys.dev@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>

Ben Pfaff [Fri, 20 Jan 2017 04:39:26 +0000 (20:39 -0800)]

actions: Make "arp { drop; };" acceptable.

Before this commit, the OVN action parser would accept "arp {};" and then
the formatter would format it back as "arp { drop; };", but the parser
didn't accept the latter. There were basically two choices: make the
parser accept "arp { drop; };" or make the formatter output "arp {};"
(or both). This patch does (only) the former, and adds a test to avoid
regression.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>

Ben Pfaff [Fri, 20 Jan 2017 03:55:41 +0000 (19:55 -0800)]

lex: Make lexer_force_match() work for LEX_T_END.

Without this change, lexer_force_match(lex, LEX_T_END) mostly works, except
that in the failure case it emits an error that says "expecting `$'",
which is a surprising error message.

Arguably, lexer_force_end() could be removed entirely, but I don't see a
real problem with the existing arrangement.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>

Ben Pfaff [Thu, 19 Jan 2017 23:47:49 +0000 (15:47 -0800)]

actions: Fix "arp" and "nd_na" followed by another action.

OVN logical actions are supposed to be padded to a multiple of 8 bytes,
but the code for parsing "arp" and "nd_na" actions didn't do this properly.
The result was that it worked OK if one of these actions was the last one
in a sequence of logical actions, but failed badly if they were in the
middle. This commit fixes the problem, adds assertions to make it harder
for the problem to recur, and adds a test.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
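
Padding to a multiple of 8 bytes, and why it only matters when another action follows, can be shown in a few lines of Python. The helper names are hypothetical and the byte strings are placeholders, not real encoded OVN actions.

```python
def round_up_8(length):
    """Round a length up to the next multiple of 8."""
    return (length + 7) & ~7


def pad_action(encoded):
    """Append zero bytes so the encoded action ends on an 8-byte boundary."""
    return encoded + bytes(round_up_8(len(encoded)) - len(encoded))


def action_offsets(encoded_actions):
    """Return the start offset of each action in a padded sequence."""
    offsets, position = [], 0
    for encoded in encoded_actions:
        offsets.append(position)
        position += len(pad_action(encoded))
    return offsets


# A 13-byte action followed by another: without padding the second one
# would start at offset 13 instead of 16.
offsets = action_offsets([b"x" * 13, b"y" * 8, b"z" * 3])
```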

Ben Pfaff [Fri, 20 Jan 2017 17:27:38 +0000 (09:27 -0800)]

tnl-neigh-cache: Force revalidation for a new neighbor entry.

When a new ARP or ND entry was added, the code failed to force
revalidation. This commit fixes the problem.

Reported-by: László Sürü <laszlo.suru@ericsson.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327788.html
Tested-by: László Sürü <laszlo.suru@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

Daniele Di Proietto [Thu, 19 Jan 2017 01:31:51 +0000 (17:31 -0800)]

Documentation: Update DPDK doc after port naming change.

options:dpdk-devargs is always required now. This commit also changes
some of the names from 'dpdk0' to various others.

netdev-dpdk/detach accepts a PCI id instead of a port name.

CC: Ciara Loftus <ciara.loftus@intel.com>
Fixes: 55e075e65ef9 ("netdev-dpdk: Arbitrary 'dpdk' port naming")
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>

Mickey Spiegel [Mon, 9 Jan 2017 00:21:14 +0000 (16:21 -0800)]

ovn: Introduce distributed gateway port and "chassisredirect" port binding

Currently OVN distributed logical routers achieve reachability to
physical networks by passing through a "join" logical switch to a
centralized gateway router, which then connects to another logical
switch that has a localnet port connecting to the physical network.

This patch adds logical port and port binding abstractions that allow
an OVN distributed logical router to connect directly to a logical
switch that has a localnet port connecting to the physical network.
In this patch, this logical router port is called a "distributed
gateway port".

The primary design goal of distributed gateway ports is to allow as
much traffic as possible to be handled locally on the hypervisor
where a VM or container resides.  Whenever possible, packets from
the VM or container to the outside world should be processed
completely on that VM's or container's hypervisor, eventually
traversing a localnet port instance on that hypervisor to the
physical network.  Whenever possible, packets from the outside
world to a VM or container should be directed through the physical
network directly to the VM's or container's hypervisor, where the
packet will enter the integration bridge through a localnet port.

However, due to the implications of the use of L2 learning in the
physical network, as well as the need to support advanced features
such as one-to-many NAT (aka IP masquerading), where multiple
logical IP addresses spread across multiple chassis are mapped to
one external IP address, it will be necessary to handle some of the
logical router processing on a specific chassis in a centralized
manner.  For this reason, the user must associate a chassis with
each distributed gateway port.

In order to allow for the distributed processing of some packets,
distributed gateway ports need to be logical patch ports that
effectively reside on every hypervisor, rather than "l3gateway"
ports that are bound to a particular chassis.  However, the flows
associated with distributed gateway ports often need to be
associated with physical locations.  This is implemented in this
patch (and subsequent patches) by adding "is_chassis_resident()"
match conditions to several logical router flows.

While most of the physical location dependent aspects of distributed
gateway ports can be handled by restricting some flows to specific
chassis, one additional mechanism is required.  When a packet
leaves the ingress pipeline and the logical egress port is the
distributed gateway port, one of two different sets of actions is
required at table 32:
- If the packet can be handled locally on the sender's hypervisor
  (e.g. one-to-one NAT traffic), then the packet should just be
  resubmitted locally to table 33, in the normal manner for
  distributed logical patch ports.
- However, if the packet needs to be handled on the chassis
  associated with the distributed gateway port (e.g. one-to-many
  SNAT traffic or non-NAT traffic), then table 32 must send the
  packet on a tunnel port to that chassis.
In order to trigger the second set of actions, the
"chassisredirect" type of southbound port_binding is introduced.
Setting the logical egress port to the type "chassisredirect"
logical port is simply a way to indicate that although the packet
is destined for the distributed gateway port, it needs to be
redirected to a different chassis.  At table 32, packets with this
logical egress port are sent to a specific chassis, in the same
way that table 32 directs packets whose logical egress port is a
VIF or a type "l3gateway" port to different chassis.  Once the
packet arrives at that chassis, table 33 resets the logical egress
port to the value representing the distributed gateway port.  For
each distributed gateway port, there is one type "chassisredirect"
port, in addition to the distributed logical patch port
representing the distributed gateway port.

A "chassisredirect" port represents a particular instance, bound
to a specific chassis, of an otherwise distributed port.  A
"chassisredirect" port is associated with a chassis in the same
manner as a "l3gateway" port.  However, unlike "l3gateway" ports,
"chassisredirect" ports have no associated IP or MAC addresses,
and "chassisredirect" ports should never be used as the "inport".
Any pipeline stages that depend on port specific IP or MAC addresses
should be carried out in the context of the distributed gateway
port's logical patch port.

Although the abstraction represented by the "chassisredirect" port
binding is generalized, in this patch the "chassisredirect" port binding
is only created for NB logical router ports that specify the new
"redirect-chassis" option.  There is no explicit notion of a
"chassisredirect" port in the NB database.  The expectation is that when
capabilities are implemented that take advantage of "chassisredirect"
ports (e.g. distributed gateway ports), flows specifying a
"chassisredirect" port as the outport will be added as part of that
capability.

Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
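
The table 32 and table 33 behavior described in this commit can be sketched as two small functions. This Python model is a simplification for illustration: the function names, the dictionaries, and the port names "cr-lrp0" and "lrp0" are all hypothetical, and the handling of VIFs and "l3gateway" ports is omitted.

```python
def table_32(outport, local_chassis, redirect_bindings):
    """Decide the output step for a packet leaving the ingress pipeline.

    redirect_bindings maps a "chassisredirect" port name to the chassis
    it is bound to. Any other outport is treated here as a distributed
    logical patch port and handled locally.
    """
    chassis = redirect_bindings.get(outport)
    if chassis is not None and chassis != local_chassis:
        return ("tunnel", chassis)
    return ("resubmit", 33)


def table_33(outport, redirect_to_gateway_port):
    """Reset a chassisredirect outport to its distributed gateway port."""
    return redirect_to_gateway_port.get(outport, outport)


bindings = {"cr-lrp0": "chassis-gw"}       # hypothetical names
gateway_ports = {"cr-lrp0": "lrp0"}

on_sender = table_32("cr-lrp0", "chassis-1", bindings)
on_gateway = table_32("cr-lrp0", "chassis-gw", bindings)
final_outport = table_33("cr-lrp0", gateway_ports)
```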

Mickey Spiegel [Mon, 9 Jan 2017 00:21:13 +0000 (16:21 -0800)]

ovn: add is_chassis_resident match expression component

This patch introduces a new match expression component
is_chassis_resident().  Unlike match expression comparisons,
is_chassis_resident is not pushed down to OpenFlow.  It is a
conditional that is evaluated in the controller during expr_simplify(),
when it is replaced by a boolean expression.  The is_chassis_resident
conditional evaluates to "true" when the specified string identifies a
port name that is resident on this controller chassis, i.e., the
corresponding southbound database Port_Binding has a chassis column
that matches this chassis.  Otherwise it evaluates to "false".

This allows higher level features to specify flows that are only
installed on some chassis rather than on all chassis with the
corresponding datapath.

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
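
Since is_chassis_resident() is evaluated in the controller and not pushed down to OpenFlow, its effect is to filter which flows a given chassis installs. The Python sketch below models that with a dictionary standing in for the southbound Port_Binding table. The flow representation and helper names are hypothetical, and the real evaluation happens inside expr_simplify() on a full expression tree.

```python
def is_chassis_resident(port_name, port_bindings, this_chassis):
    """True if the named port's Port_Binding chassis is this chassis."""
    return port_bindings.get(port_name) == this_chassis


def flows_for_chassis(flows, port_bindings, this_chassis):
    """Keep only flows whose residency condition holds on this chassis.

    Each flow is (match, required_port); required_port is None when the
    flow has no is_chassis_resident() condition.
    """
    return [match for match, port in flows
            if port is None
            or is_chassis_resident(port, port_bindings, this_chassis)]


port_bindings = {"vm1": "chassis-1", "vm2": "chassis-2"}
flows = [("ip4.dst == 10.0.0.1", None),
         ("arp.tpa == 172.16.0.5", "vm1")]

installed_on_1 = flows_for_chassis(flows, port_bindings, "chassis-1")
installed_on_2 = flows_for_chassis(flows, port_bindings, "chassis-2")
```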

Shu Shen [Wed, 18 Jan 2017 18:55:20 +0000 (10:55 -0800)]

lacp: add test step for link recovery

An additional step is added to test case "lacp - negotiation" to
ensure the bond port and its slave interfaces properly re-negotiate
after a link previously down comes back.

Signed-off-by: Shu Shen <shu.shen@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

Gurucharan Shetty [Wed, 18 Jan 2017 12:59:39 +0000 (04:59 -0800)]

ovn-nbctl: Ability to bootstrap CA certificate.

Utilities like ovs-vsctl have the ability to bootstrap the CA
certificate. It is useful for ovn-nbctl to have the same
ability. One can then connect to the OVN NB database over
SSL for transactions without having to copy over the
certificate being used by the ovsdb-server backing OVN NB.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Ben Pfaff <blp@ovn.org>

Ben Pfaff [Wed, 18 Jan 2017 23:57:51 +0000 (15:57 -0800)]

faq: Document OVS packet buffering.

We get questions about this sometimes.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>

commit | commitdiff | tree

Jarno Rajahalme [Wed, 18 Jan 2017 23:26:03 +0000 (15:26 -0800)]

ofproto-dpif: Use acquire/release barriers with 'tables_version'.

Use memory_order_release when updating the tables version number to
make sure no memory accesses before the atomic_store (possibly
relating to setting up the new version) are reordered to take place
after the atomic_store, which makes the new version available to other
threads.

Correspondingly, use memory_order_acquire when reading the
current tables_version to make sure no later memory accesses (possibly
relating to the current version) are reordered to take place before
the atomic_read to ensure that those memory accesses can not relate to
an older version than returned by the atomic_read.

Suggested-by: Daniele Di Proietto <ddiproietto@vmware.com>
Fixes: 621b8064b7 ("ofproto: Infra for table versioning.")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

Binbin Xu [Wed, 18 Jan 2017 19:55:57 +0000 (03:55 +0800)]

configuration.rst: Update the example of DPDK port's configuration

Now that DPDK ports are hotplugged, a valid dpdk-devargs must be
specified. Otherwise, the DPDK device will not be available.

Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>

commit | commitdiff | tree

Gurucharan Shetty [Wed, 18 Jan 2017 11:21:12 +0000 (03:21 -0800)]

ovn-ctl: Add bootstrap ovn-controller CA certificate option.

ovn-controller accepts the option --bootstrap-ca-cert. With this
commit, ovn-ctl lets the user pass a value for it via the
--ovn-controller-ssl-bootstrap-ca-cert option.

Bootstrapping is useful for ovn-controller as you don't have to
copy the controller's certificate (self-signed or otherwise) to every host.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Lance Richardson <lrichard@redhat.com>

commit | commitdiff | tree

Aaron Conole [Mon, 16 Jan 2017 19:06:27 +0000 (14:06 -0500)]

libX: add new release / version info tags

This commit uses the $PACKAGE_VERSION automake variable to construct a
release and version info combination which sets the library name to be:

   libfoo-$(OVS_MAJOR_VERSION).so.$(OVS_MINOR_VERSION).0.$(OVS_MICRO_VERSION)

where formerly, it was always:

   libfoo.so.1.0.0

This allows releases of Open vSwitch libraries to reflect which specific
versions they came with, and sets up a pseudo ABI-versioning scheme.  In
this fashion, future releases of Open vSwitch could be installed
alongside older releases, allowing 3rd party utilities linked against
previous versions to continue to function.

ex:

$ ldd /path/to/utility
linux-vdso.so.1 (0x00007ffe92cf6000)
libopenvswitch-2.so.6 => /lib64/libopenvswitch-2.so.6 (0x00007f733b7a3000)
libssl.so.10 => /lib64/libssl.so.10 (0x00007f733b530000)
...

Note the library name and version information.
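
The naming pattern above can be sketched as a small helper. This is only an illustration of the string layout, assuming a three-component package version; the function name and the sample version are hypothetical and not part of the build system.

```python
def versioned_library_name(base, package_version):
    """Build 'lib<base>-<major>.so.<minor>.0.<micro>' from a version string.

    Assumes package_version has at least three dot-separated components,
    e.g. "2.6.90".
    """
    major, minor, micro = package_version.split(".")[:3]
    return f"{base}-{major}.so.{minor}.0.{micro}"
```

For example, versioned_library_name("libfoo", "2.6.90") yields "libfoo-2.so.6.0.90", where the old scheme always produced "libfoo.so.1.0.0".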

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

Mickey Spiegel [Tue, 17 Jan 2017 09:45:02 +0000 (01:45 -0800)]

ovn: document logical routers and logical patch ports in ovn-architecture

This patch adds a description of logical routers and logical patch ports,
including gateway routers, to ovn/ovn-architecture.7.xml.

Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

Russell Bryant [Mon, 16 Jan 2017 21:46:16 +0000 (16:46 -0500)]

vlan.rst: Strip leftover HTML.

Strip a couple of closing HTML tags that were left over from when this doc
was converted from the web site to RST.

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

nickcooper-zhangtonghao [Mon, 16 Jan 2017 12:56:39 +0000 (04:56 -0800)]

dpif-netdev: Avoids repeated addition of DP_STAT_LOST.

CC: Daniele Di Proietto <diproiettod@vmware.com>
Fixes: 8aaa125dab66 ("dpif-netdev: Share emc and fast path output batches.")
Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

ovs-numa: Remove unused functions.

ovs-numa doesn't need to keep the state of the pmd threads; that state
is an implementation detail of dpif-netdev.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

dpif-netdev: Centralized threads and queues handling code.

Currently we have three different code paths that deal with pmd threads
and queues, in response to different inputs:

1. When a port is added
2. When a port is deleted
3. When the cpumask changes or a port must be reconfigured.

1. and 2. are carefully written to minimize disruption to the running
datapath, while 3. brings down all the threads, reconfigures all the
ports, and restarts everything.

This commit removes the three separate code paths by introducing the
reconfigure_datapath() function, which takes care of adapting the pmd
threads and queues to the current datapath configuration, no matter how
we got there.

This aims at simplifying maintenance and introduces a long overdue
improvement: port reconfiguration (can happen quite frequently for
dpdkvhost ports) is now done without shutting down the whole datapath,
but just by temporarily removing the port that needs to be reconfigured
(while the rest of the datapath is running).

We now also recompute the rxq scheduling from scratch every time a port
is added or deleted.  This means that the queues will be more balanced,
especially when dealing with explicit rxq-affinity from the user
(without shutting down the threads and restarting them), but it also
means that adding or deleting a port might cause existing queues to be
moved between pmd threads.  This negative effect can be avoided by
taking into account the existing distribution when computing the new
scheduling, but I considered code clarity and fast reconfiguration more
important than optimizing port addition or removal (a port is added and
removed only once, but can be reconfigured many times).
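
The trade-off described above can be illustrated with a small sketch. It is not the scheduling code from this commit: it assumes a simple policy (pinned queues first, the rest round-robin) and hypothetical names, only to show why recomputing from scratch can move existing queues.

```python
from itertools import cycle


def schedule_rxqs(rxqs, pmd_cores, affinity=None):
    """Recompute the rxq -> pmd core assignment from scratch.

    affinity is a hypothetical dict of rxq name -> core id, standing in
    for an explicit rxq-affinity setting.  Queues without a valid pin are
    spread round-robin over pmd_cores.
    """
    affinity = affinity or {}
    assignment = {core: [] for core in pmd_cores}
    unpinned = []
    for rxq in rxqs:
        core = affinity.get(rxq)
        if core in assignment:
            assignment[core].append(rxq)
        else:
            unpinned.append(rxq)
    if pmd_cores:
        cores = cycle(pmd_cores)
        for rxq in unpinned:
            assignment[next(cores)].append(rxq)
    return assignment
```

With two cores, scheduling ["p1:0", "p2:0"] puts p1:0 on the first core. After adding a port so the list becomes ["p0:0", "p1:0", "p2:0"], the same function puts p1:0 on the second core, which is the kind of movement the message accepts in exchange for simpler code.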

Lastly, this commit moves the pmd threads state away from ovs-numa.  Now
the pmd threads state is kept only in dpif-netdev.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Co-authored-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

dpif-netdev: Use hmap for poll_list in pmd threads.

A future commit will use this to determine if a queue is already
contained in a pmd thread.

To keep the behavior unaltered we now have to sort queues before
printing them in pmd_info_show_rxq().

Also this commit introduces 'struct polled_queue' that will be used
exclusively in the fast path, uses 'struct dp_netdev_rxq' from 'struct
rxq_poll' and uses 'rx' for 'netdev_rxq' and 'rxq' for 'dp_netdev_rxq'.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Wed, 30 Nov 2016 02:10:41 +0000 (18:10 -0800)]

ovs-numa: Add per numa and global counts in dump.

They will be used by a future commit.

Suggested-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 29 Nov 2016 22:51:03 +0000 (14:51 -0800)]

ovs-numa: Don't use hmap_first_with_hash().

I think it's better to iterate the hmap than to use
hmap_first_with_hash(), because iterating handles hash collisions.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

ovs-numa: Add new dump types.

They will be used by a future commit.

This patch introduces some code duplication which will be removed in a
future commit.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

ovs-numa: New ovs_numa_dump_contains_core() function.

It will be used by a future commit. struct ovs_numa_dump now uses an
hmap instead of a list to make ovs_numa_dump_contains_core() more
efficient.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

dpctl: Avoid making assumptions on pmd threads.

Currently dpctl depends on ovs-numa module to delete and create flows on
different pmd threads for pmd devices.

The next commits will move the pmd threads state away from ovs-numa to
dpif-netdev, so the ovs-numa interface will not be supported.

Also, the assignment between ports and threads is an implementation
detail of dpif-netdev; dpctl shouldn't know anything about it.

This commit changes the dpif_flow_put() and dpif_flow_del() calls to
iterate over all the pmd threads, if pmd_id is PMD_ID_NULL.
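
A rough sketch of that dispatch rule follows. The per-thread flow tables are modelled as a hypothetical dict of core id -> flow dict, and PMD_ID_NULL is a stand-in sentinel rather than the real constant's value.

```python
PMD_ID_NULL = None  # stand-in sentinel; the real constant has another value


def flow_put(pmd_flow_tables, flow, actions, pmd_id=PMD_ID_NULL):
    """Install 'flow' on one pmd thread, or on all of them if unspecified."""
    targets = sorted(pmd_flow_tables) if pmd_id is PMD_ID_NULL else [pmd_id]
    for core in targets:
        pmd_flow_tables[core][flow] = actions
    return targets


def flow_del(pmd_flow_tables, flow, pmd_id=PMD_ID_NULL):
    """Delete 'flow' from one pmd thread, or from all of them if unspecified.

    Returns the core ids on which the flow was actually present.
    """
    targets = sorted(pmd_flow_tables) if pmd_id is PMD_ID_NULL else [pmd_id]
    return [
        core for core in targets
        if pmd_flow_tables[core].pop(flow, None) is not None
    ]
```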

A simple test is added.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

dpif-netdev: Make 'static_tx_qid' const.

Since the previous commit, 'static_tx_qid' doesn't need to be atomic and is
actually never touched (except for initialization), so it can be made
const.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

dpif-netdev: Create pmd threads for every numa node.

A lot of the complexity in the code that handles pmd threads and ports
in dpif-netdev is due to the fact that we postpone the creation of pmd
threads on a numa node until we have a port that needs to be polled on
that particular node.

Since the previous commit, a pmd thread with no ports will not consume
any CPU, so it seems easier to create all the threads at once.

This will also make future commits easier.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

dpif-netdev: Block pmd threads if there are no ports.

There's no reason for a pmd thread to perform its main loop if there are
no queues in its poll_list.

This commit introduces a seq object on which the pmd thread can be
blocked, if there are no queues.

When the main thread wants to reload a pmd thread it must now change
the seq object (in case it's blocked) and set 'reload' to true.
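
The blocking pattern can be sketched with a condition variable. This is an analogy built on Python's threading module, not the seq implementation in Open vSwitch; the class and function names are hypothetical.

```python
import threading


class Seq:
    """A tiny stand-in for a seq object: a counter that threads can wait on."""

    def __init__(self):
        self._cond = threading.Condition()
        self._value = 0

    def read(self):
        with self._cond:
            return self._value

    def change(self):
        """Bump the value and wake every waiter."""
        with self._cond:
            self._value += 1
            self._cond.notify_all()

    def wait(self, seen, timeout=None):
        """Block until the value differs from 'seen' (or the timeout expires)."""
        with self._cond:
            return self._cond.wait_for(lambda: self._value != seen, timeout)


def pmd_block_until_reload(reload_seq, seen):
    """What an idle pmd thread would do while its poll list is empty."""
    reload_seq.wait(seen)
    return reload_seq.read()
```

Reading the value before checking for work, and waiting on that remembered value, means a change() that happens between the check and the wait is not lost: the wait returns immediately.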

This is useful to avoid wasting CPU cycles and is also necessary for a
future commit.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

dpif-netdev: Use a boolean instead of pmd->port_seq.

There's no need for a sequence number, since the main thread has to wait
for the pmd thread, so there's no chance that an update will be
undetected.

A seq object will be introduced for another purpose in the next commit,
and changing this to boolean makes the code more readable.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

netdev-dpdk: Refactor construct and destruct.

Some refactoring for _construct() and _destruct() methods:
* Rename netdev_dpdk_init() to common_construct(). init() has a
  different meaning in the netdev context.
* Remove DPDK_DEV_ETH and DPDK_DEV_VHOST checks in common_construct()
  and move them to different functions.
* Introduce common_destruct().
* Avoid taking 'dev->mutex' in construct and destruct: we're guaranteed
  to be the only thread with access to the object.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

netdev-dpdk: Start also dpdkr devices only once on port-add.

Since commit 55e075e65ef9 ("netdev-dpdk: Arbitrary 'dpdk' port naming"),
we don't call rte_eth_dev_start() from netdev_open() anymore; we only call
it from netdev_reconfigure(). This commit does the same for 'dpdkr'
devices, and removes some useless code.

Calling rte_eth_dev_start() also from netdev_open() was unnecessary and
wasteful. Not doing it reduces code duplication and makes adding a port
faster (~900ms before the patch, ~400ms after).

Another reason why this is useful is that some DPDK drivers might have
problems with reconfiguration. For example, until DPDK commit
8618d19b52b1("net/vmxnet3: reallocate shared memzone on re-config"),
vmxnet3 didn't support being restarted with a different number of
queues.

Technically, the netdev interface changed because before opening rxqs or
calling netdev_send() the user must check if reconfiguration is
required. This patch also documents that, even though no change to the
userspace datapath (the only user) is required.

Lastly, this patch makes sure the errors returned by ofproto_port_add
(which includes the first port reconfiguration) are reported back to the
database.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

netdev-dpdk: Don't call rte_dev_stop() in update_flags().

Calling rte_eth_dev_stop() while the device is running causes a crash.

We could use rte_eth_dev_set_link_down(), but not every PMD implements
that, and I found one NIC where that has no effect.

Instead, this commit checks if the device has the NETDEV_UP flag when
transmitting or receiving (similarly to what we do for vhostuser). I
didn't notice any performance difference with this check in case the
device is up.
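
A minimal sketch of such a check on the transmit side. The flag value, the Device class and the drop counter are assumptions made for illustration and do not come from netdev-dpdk.

```python
from dataclasses import dataclass

NETDEV_UP = 0x1  # illustrative flag bit


@dataclass
class Device:
    flags: int = 0
    tx_dropped: int = 0


def send(dev, packets):
    """Return how many packets would be handed to the NIC.

    When the device is administratively down, nothing is transmitted and
    the packets are counted as dropped (the counter is an assumption).
    """
    if not dev.flags & NETDEV_UP:
        dev.tx_dropped += len(packets)
        return 0
    return len(packets)
```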

An alternative would be to remove the device queues from the pmd threads'
tx and rx caches, but that requires reconfiguration and I'd prefer
to avoid it, because the change can come from OpenFlow.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

dpif-netdev: Don't try to output on a device without txqs.

Tunnel devices have 0 txqs and don't support netdev_send(). While
netdev_send() simply returns EOPNOTSUPP, the XPS logic is still executed
on output, and that might be confused by devices with no txqs.

It seems better to have different structures in the fast path for ports
that support netdev_{push,pop}_header (tunnel devices), and ports that
support netdev_send. With this we can also remove a branch in
netdev_send().

This is also necessary for a future commit, which starts DPDK devices
without txqs.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

dpif-netdev: Take non_pmd_mutex to access tx cached ports.

As documented in dp_netdev_pmd_thread, we must take non_pmd_mutex to
access the tx port caches for the non pmd thread.

Found by inspection.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>

commit | commitdiff | tree

Daniele Di Proietto [Tue, 15 Nov 2016 23:40:49 +0000 (15:40 -0800)]

dpif-netdev: Fix memory leak.

We keep all the per-port classifiers around, since they can be reused,
but when a pmd thread is destroyed we should free them.

Found using valgrind.

Fixes: 3453b4d62a98 ("dpif-netdev: dpcls per in_port with sorted subtables")

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

Guoshuai Li [Sat, 7 Jan 2017 06:28:35 +0000 (14:28 +0800)]

python: Catch exception "SSL.SysCallError" for send by SSL.

When the OVSDB server is aborted, the SSL send function throws an
SSL.SysCallError exception, which we need to catch, returning its
-errno.

Also, the handler for the SSL.WantWriteError exception needs to return
-EAGAIN, consistent with its parent class, rather than EAGAIN.
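
The error mapping can be sketched as follows. To stay self-contained, the two exception classes below are local stand-ins for the pyOpenSSL ones, and the assumption that SysCallError carries (errno, description) in its args should be checked against the library; ssl_send is a hypothetical helper, not the code from this commit.

```python
import errno


class WantWriteError(Exception):
    """Stand-in for OpenSSL.SSL.WantWriteError."""


class SysCallError(Exception):
    """Stand-in for OpenSSL.SSL.SysCallError, raised as (errno, description)."""


def ssl_send(sock, data):
    """Send data, returning bytes sent or a negative errno value on failure."""
    try:
        return sock.send(data)
    except WantWriteError:
        # The socket is not ready for writing: report -EAGAIN, not EAGAIN.
        return -errno.EAGAIN
    except SysCallError as e:
        err, _description = e.args
        return -err
```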

Signed-off-by: Guoshuai Li <ligs@dtdream.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

Shu Shen [Sat, 14 Jan 2017 01:51:00 +0000 (17:51 -0800)]

Use PRIu32 format for ofp_port_t

Although ofp_port_t uses a 16-bit range, it is defined as a 32-bit type.
The format strings throughout the code base were using PRIu16 for
ofp_port_t, which leads the compiler to emit -Wformat warnings on
platforms that don't promote 16-bit to 32-bit integers, e.g., on macOS.

Signed-off-by: Shu Shen <shu.shen@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

Mickey Spiegel [Fri, 6 Jan 2017 20:00:28 +0000 (12:00 -0800)]

ovn: specify addresses of type "router" lsps as "router"

Currently in OVN, when a logical switch port of type "router" is
created, the MAC and optionally IP addresses of the peer logical
router port must be specified again as the addresses of the logical
switch port.

This patch allows the logical switch port's addresses to be
specified as the string "router", rather than explicitly copying the
logical router port's MAC and optionally IP addresses. The router
addresses are used to populate the logical switch's destination
lookup, and to populate op->lsp_addrs in ovn-northd.c, which in turn
is used to generate logical switch ARP and ND replies. Since ipam
already looks at logical router ports, the only ipam modification
necessary is to skip logical switch ports with addresses "router".
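
A sketch of the substitution, assuming the peer logical router port is given as a dict with a "mac" string and a "networks" list of CIDR strings. The helper name and data shapes are hypothetical and not taken from ovn-northd.c.

```python
def resolve_lsp_addresses(lsp_addresses, peer_lrp):
    """Expand the literal "router" into the peer router port's addresses.

    Each resulting entry is "<mac> <ip> ..." with prefix lengths stripped;
    any other entry is passed through unchanged.
    """
    resolved = []
    for entry in lsp_addresses:
        if entry == "router":
            ips = [net.split("/")[0] for net in peer_lrp.get("networks", [])]
            resolved.append(" ".join([peer_lrp["mac"]] + ips))
        else:
            resolved.append(entry)
    return resolved
```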

Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

Ben Pfaff [Wed, 4 Jan 2017 22:29:21 +0000 (14:29 -0800)]

db-ctl-base: Always support all tables in schema.

When one adds a new table to a database schema, it's easy to forget to
add the table to the list of tables in the *ctl.c program.  When this
happens, the database commands for that program don't work on that table
at all, even for commands like "list" and "create" that don't need any
special help.  This patch fixes that problem, by making sure that
db-ctl-base always has the complete list of tables.

Previously, each ctl_table_class pointed directly to the corresponding
ovsdb_idl_table_class.  With this patch, there are instead two parallel
arrays, one of ovsdb_idl_table_classes and the other of ctl_table_classes.
This change accounts for the bulk of the change to the db-ctl-base code.
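
The parallel-array layout can be sketched like this. The names, and the idea of a default entry with an empty "row_ids" list for tables that need no special help, are assumptions for illustration only.

```python
def build_table_classes(schema_tables, special_cases):
    """Return two parallel lists covering every table in the schema.

    idl_table_classes[i] and ctl_table_classes[i] describe the same table;
    tables without an entry in special_cases get a default ctl class.
    """
    idl_table_classes = list(schema_tables)
    ctl_table_classes = [
        special_cases.get(name, {"row_ids": []}) for name in idl_table_classes
    ]
    return idl_table_classes, ctl_table_classes


def lookup_ctl_class(table_name, idl_table_classes, ctl_table_classes):
    """Find a table's ctl class through its index in the idl list."""
    return ctl_table_classes[idl_table_classes.index(table_name)]
```

Because the ctl list is derived from the full schema list, a newly added table is always present, even when nobody wrote a special case for it.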

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Lance Richardson <lrichard@redhat.com>

commit | commitdiff | tree

Ben Pfaff [Thu, 12 Jan 2017 17:16:52 +0000 (09:16 -0800)]

travis: Update build list email address.

The lists these days prefer an ovs- prefix. Currently all of the build
emails are being dropped because the prefix is missing.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>

commit | commitdiff | tree

Andy Zhou [Wed, 11 Jan 2017 23:26:27 +0000 (15:26 -0800)]

dpif: Simplify dpif_execute_helper_cb()

The may_steal flag is now used, so remove OVS_UNUSED.

Since dp_packet_delete() handles the NULL pointer properly, we can
drop a few tracking variables, and make the code easier to follow.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>

commit | commitdiff | tree

Daniele Di Proietto [Thu, 12 Jan 2017 07:59:57 +0000 (23:59 -0800)]

netdev-vport: Do not log empty warnings on success.

set_tunnel_config() always logs a warning, even on success. This
shouldn't happen.

Without this, some unit tests fail.

Fixes: 9fff138ec3a6("netdev: Add 'errp' to set_config().")
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Antonio Fischetti <antonio.fischetti@intel.com>
Acked-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

Ben Pfaff [Thu, 12 Jan 2017 16:15:02 +0000 (08:15 -0800)]

ofproto-dpif: Make ofproto/trace output easier to read.

"ovs-appctl ofproto/trace" is invaluable for debugging, but as the users of
Open vSwitch have evolved it has failed to keep up with the times.  It's
pretty easy to design OpenFlow tables and pipelines that resubmit dozens of
times.  Each resubmit causes an additional tab of indentation, so the
output wraps around, sometimes again and again, and makes the output close
to unreadable.

ovn-trace pioneered better formatting for tracing in OVN logical datapaths,
mostly by not increasing indentation for tail recursion, which in practice
gets rid of almost all indentation.

This commit experiments with redoing ofproto/trace the same way.  Try
looking at, for example, the testsuite output for test 2282 "ovn -- 3 HVs,
3 LRs connected via LS, source IP based routes".  Without this commit, it
indents 61 levels (488 spaces!).  With this commit, it indents 1 level
(4 spaces) and it's possible to actually understand what's going on almost
at a glance.
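
The figures above are consistent with 8 columns per level before (61 x 8 = 488) and 4 spaces per level after (1 x 4 = 4). The sketch below models only a chain of tail resubmits with a hypothetical renderer; it is not the ofproto/trace formatter and does not reproduce the single remaining level.

```python
def render_chain(tables, indent_tail_calls, width):
    """Render a chain of tables where each one resubmits to the next.

    With indent_tail_calls=True every resubmit adds one level, as in the
    old output; with False, tail resubmits stay at the same depth.
    """
    lines = []
    depth = 0
    for table in tables:
        lines.append(" " * (depth * width) + table)
        if indent_tail_calls:
            depth += 1
    return lines


def indentation(line):
    """Number of leading spaces on a rendered line."""
    return len(line) - len(line.lstrip(" "))
```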

To see this for yourself, try the following command either with or without
this commit (but be sure to keep the change to ovn.at that adds an
ofproto/trace to the test):
make check TESTSUITEFLAGS='-d 2282' && less tests/testsuite.dir/2282/testsuite.log

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Justin Pettit <jpettit@ovn.org>

commit | commitdiff | tree

Daniele Di Proietto [Wed, 21 Dec 2016 01:58:14 +0000 (17:58 -0800)]

netdev: Add 'errp' to set_config().

Since 55e075e65ef9("netdev-dpdk: Arbitrary 'dpdk' port naming"),
set_config() is used to identify a DPDK device, so it's better to report
its detailed error message to the user.  Tunnel devices and patch ports
rely a lot on set_config() as well.

This commit adds a param to set_config() that can be used to return
an error message and makes use of that in netdev-dpdk and netdev-vport.

Before this patch:

$ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl: Error detected while setting up 'dpdk0': dpdk0: could not set
    configuration (Invalid argument).  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

$ ovs-vsctl add-port br0 p+ -- set Interface p+ type=patch
ovs-vsctl: Error detected while setting up 'p+': p+: could not set
    configuration (Invalid argument).  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

$ ovs-vsctl add-port br0 gnv0 -- set Interface gnv0 type=geneve
ovs-vsctl: Error detected while setting up 'gnv0': gnv0: could not set
    configuration (Invalid argument).  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

After this patch:

$ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl: Error detected while setting up 'dpdk0': 'dpdk0' is missing
    'options:dpdk-devargs'. The old 'dpdk<port_id>' names are not
    supported.  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

$ ovs-vsctl add-port br0 p+ -- set Interface p+ type=patch
ovs-vsctl: Error detected while setting up 'p+': p+: patch type requires
    valid 'peer' argument.  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

$ ovs-vsctl add-port br0 gnv0 -- set Interface gnv0 type=geneve
ovs-vsctl: Error detected while setting up 'gnv0': gnv0: geneve type
    requires valid 'remote_ip' argument.  See ovs-vswitchd log for
    details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".

CC: Ciara Loftus <ciara.loftus@intel.com>
CC: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>

commit | commitdiff | tree

xu.binbin1@zte.com.cn [Thu, 12 Jan 2017 14:18:13 +0000 (22:18 +0800)]

netdev-dpdk: Assign socket id according to device's numa id

DPDK ports specified via the 'dpdk-devargs' option can now be
hotplugged.

However, the socket id of these DPDK ports is not assigned correctly:
it is always 0. The socket id of a DPDK port should be assigned
according to the numa id of the device.

Fixes: 55e075e65ef9e ("netdev-dpdk: Arbitrary 'dpdk' port naming")
Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>

commit | commitdiff | tree

Joe Stringer [Tue, 10 Jan 2017 23:54:03 +0000 (15:54 -0800)]

revalidator: Complain for more ukey transitions.

For most ukey transition states, only one thread should be responsible
for transitioning the ukey into the new state. If another thread
attempts to transition the ukey into the same state (for instance,
evicting the datapath flow or deleting the ukey), then it is likely
performing additional work which should only happen once. Log all cases
of ukey transition into the current state, except for UKEY_OPERATIONAL
-> UKEY_OPERATIONAL which regularly occurs when revalidating ukeys.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>

commit | commitdiff | tree

Joe Stringer [Tue, 10 Jan 2017 23:54:02 +0000 (15:54 -0800)]

revalidator: Prevent double-delete of ukey.

revalidator_sweep__() splits checking for whether to delete a ukey from
the actual deletion to prevent taking the umap lock for too long.
However it uses information gathered from the first critical section to
decide to call ukey_delete(), i.e., the second critical section.

Since 67f08985d769 ("upcall: Replace ukeys for deleted flows."), it is
possible for a handler thread to receive an upcall for the same flow and
to replace the ukey which is being deleted with a new one, in between
these critical sections. This will remove the ukey from the cmap,
rcu-defer its deletion, and update the ukey state.

If this occurs in between the critical sections of revalidator cleanup
of the flow, then the revalidator will subsequently call ukey_delete()
to delete the original ukey, which was already deleted by the handler
thread. This leads to a segfault in cmap_replace__().

Guard against this by checking the ukey state in ukey_delete() while
holding the ukey lock.
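
The guard can be sketched as below. The two-state model, the dict standing in for the umap, and the ukey_replace helper are simplifications made for illustration; the real state machine has more states.

```python
import threading

UKEY_OPERATIONAL, UKEY_DELETED = range(2)  # simplified, hypothetical states


class Ukey:
    def __init__(self, key):
        self.key = key
        self.state = UKEY_OPERATIONAL
        self.lock = threading.Lock()


def ukey_replace(umap, old, new):
    """Handler side: swap in a new ukey and mark the old one deleted."""
    with old.lock:
        umap[new.key] = new
        old.state = UKEY_DELETED


def ukey_delete(umap, ukey):
    """Revalidator side: delete only if nobody deleted the ukey already."""
    with ukey.lock:
        if ukey.state == UKEY_DELETED:
            return False  # already gone; do not touch the map again
        if umap.get(ukey.key) is ukey:
            del umap[ukey.key]
        ukey.state = UKEY_DELETED
        return True
```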

Backtrace:
    Program terminated with signal 11, Segmentation fault.
    #0  0x00007fe969b13da3 in cmap_replace__ ()
    #1  0x00007fe969b14491 in cmap_replace ()
    #2  0x00007fe969aee9ff in ukey_delete ()
    #3  0x00007fe969aefd42 in revalidator_sweep__ ()
    #4  0x00007fe969af1bad in udpif_revalidator ()
    #5  0x00007fe969b8b2a6 in ovsthread_wrapper ()
    #6  0x00007fe968e07dc5 in start_thread () from /lib64/libpthread.so.0
    #7  0x00007fe96862c73d in clone () from /lib64/libc.so.6

Fixes: 54ebeff4c03d ("upcall: Track ukey states.")
Fixes: 67f08985d769 ("upcall: Replace ukeys for deleted flows.")
Reported-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>

commit | commitdiff | tree

nickcooper-zhangtonghao [Tue, 10 Jan 2017 05:56:14 +0000 (21:56 -0800)]

netdev-dummy: Limits the number of tx/rx queues.

This patch prevents ovs_rcu from reporting a WARN, caused by being
blocked for a long time, when ovs-vswitchd processes a port with many
rx/tx queues. The chosen limit on the number of tx/rx queues per port
should be appropriate, because DPDK uses it as a default max value.

Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>

commit | commitdiff | tree

Daniele Di Proietto [Wed, 5 Oct 2016 00:58:05 +0000 (17:58 -0700)]

dpdk: Late initialization.

With this commit, we allow the user to set other_config:dpdk-init=true
after the process is started. This makes it easier to start Open
vSwitch with DPDK using standard init scripts without restarting the
service.

This is still far from ideal, because initializing DPDK might still
abort the process (e.g. if there is not enough memory), so the user must
check the status of the process after setting dpdk-init to true.

Nonetheless, I think this is an improvement, because it doesn't require
restarting the whole unit.

CC: Aaron Conole <aconole@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Aaron Conole <aconole@redhat.com>

commit | commitdiff | tree

Jarno Rajahalme [Thu, 5 Jan 2017 23:33:13 +0000 (15:33 -0800)]

dpcls: Avoid one 8-byte chunk in subtable mask.

This patch allows skipping the 8-byte chunk comprising dp_hash and
in_port in the subtable mask when dp_hash is wildcarded. This will
slightly speed up the hash computation as one expensive function call
to hash_add64() can be skipped.

For each new netdev flow we wildcard in_port in the mask, so in the
typical case where dp_hash is also wildcarded, the resulting 8-byte
chunk will not be part of the subtable mask.

This manipulation of the mask is possible as the datapath classifier
is explicitly selected based on the in_port value, so that all the
datapath flows in the selected classifier have an exact match on that
in_port value. Given this, it is safe to ignore the in_port value
when doing a lookup in the chosen classifier.
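
A simplified model of the effect on the mask. It assumes that dp_hash occupies the upper 32 bits and in_port the lower 32 bits of one 64-bit chunk, and that all-zero chunks are not stored in the subtable mask; both the layout and the function name are assumptions for illustration.

```python
def subtable_mask_chunks(flow_mask):
    """Return the non-zero 8-byte chunks that would be hashed on lookup.

    flow_mask is a hypothetical dict with a 32-bit "dp_hash" mask, a 32-bit
    "in_port" mask and a list "rest" of further 64-bit chunks.  in_port is
    always wildcarded here because the classifier is already selected per
    in_port.
    """
    first = (flow_mask["dp_hash"] << 32) | 0  # in_port bits forced to zero
    chunks = [first] + list(flow_mask["rest"])
    return [chunk for chunk in chunks if chunk != 0]
```

When dp_hash is wildcarded too, the first chunk becomes zero and drops out, so one fewer chunk is fed to the hash.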

Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Co-authored-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Co-authored-by: Jarno Rajahalme <jarno@ovn.org>

commit | commitdiff | tree

Ben Pfaff [Tue, 10 Jan 2017 18:43:54 +0000 (10:43 -0800)]

AUTHORS: Add Dong Jun.

Signed-off-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

Ben Pfaff [Fri, 6 Jan 2017 04:36:52 +0000 (20:36 -0800)]

ovn-controller: Clear conntrack state inside clone action.

ovn-controller implements traversal from one OVN logical network to another
using the Open vSwitch "clone" action. The "clone" action preserves
connection tracking state, which is confusing for passing from one logical
datapath to another because this state is only relevant for a single
logical datapath and does not make sense in the new one. This commit
fixes a problem sometimes seen by ensuring that the connection tracking
state is cleared when these traversals happen.

Reported-by: Numan Siddique <nusiddiq@redhat.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/326948.html
Fixes: f1a8bd06d58f ("ovn-controller: Drop most uses of OVS patch ports.")
Tested-by: Dong Jun <dongj@dtdream.com>

commit | commitdiff | tree

Ben Pfaff [Fri, 6 Jan 2017 16:19:53 +0000 (08:19 -0800)]

New action "ct_clear".

This is being introduced specifically to allow a user of the "clone" action
to clear the connection tracking state, but it's implemented as a separate
action as a matter of clean design and in case another use case arises
later.

Reported-by: Mickey Spiegel <mickeys.dev@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/326981.html
Fixes: 7ae62a676d3a ("ofp-actions: Add clone action.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
Tested-by: Dong Jun <dongj@dtdream.com>

commit | commitdiff | tree

Ben Pfaff [Fri, 6 Jan 2017 00:11:15 +0000 (16:11 -0800)]

ofproto-dpif-xlate: Make clone save "was_mpls".

This seems like it's an optimization rather than a correctness issue, but
in general it's best to make "clone" like patch ports where there is no
reason to depart from its design, since we know that patch ports work well.

Reported-by: Mickey Spiegel <mickeys.dev@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/326981.html
Fixes: 7ae62a676d3a ("ofp-actions: Add clone action.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
Tested-by: Dong Jun <dongj@dtdream.com>

commit | commitdiff | tree

Ben Pfaff [Fri, 6 Jan 2017 04:37:15 +0000 (20:37 -0800)]

ofproto-dpif-xlate: Make "clone" save action set and stack.

This is a design decision but it seems conceptually cleaner than having
them leak through into the clone.

Reported-by: Mickey Spiegel <mickeys.dev@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/326981.html
Fixes: 7ae62a676d3a ("ofp-actions: Add clone action.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
Tested-by: Dong Jun <dongj@dtdream.com>

commit | commitdiff | tree

Andy Zhou [Tue, 20 Dec 2016 07:55:01 +0000 (23:55 -0800)]

ovsdb-idl: Enhance conditional monitoring API

Allow a client to know when a change to the monitoring conditions has
been accepted by the OVSDB server and the 'idl' contents have been
updated to match the new conditions.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>

commit | commitdiff | tree

andy zhou [Sat, 17 Dec 2016 00:55:09 +0000 (16:55 -0800)]

ovsdb-idl: Properly handle conditional monitor update error

When generating a conditional monitoring update request, the current
code failed to update the idl's 'request-id'. This bug caused the reply
to the update request, whether an ACK or a NACK, to be logged as an
unexpected message at the debug level and ignored by the core idl
logic.

In addition, the idl should not generate another conditional
monitoring update request while one is outstanding, so that the
requests and their replies are properly serialized.

When a conditional monitoring update is nacked by the server, put the
idl into a client-visible error state.

Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
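
The serialization rule described above can be sketched as a small state
machine. This is only an illustration, not the ovsdb-idl code: the names
cond_client, cond_maybe_send and cond_handle_reply are hypothetical, and
request ids are plain integers here rather than JSON-RPC ids.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical client-side state for conditional monitor updates. */
enum cond_state { COND_IDLE, COND_OUTSTANDING, COND_ERROR };

struct cond_client {
    enum cond_state state;
    unsigned int next_id;      /* Id to assign to the next request. */
    unsigned int request_id;   /* Id of the outstanding request, if any. */
    bool pending_change;       /* Conditions changed since the last request. */
};

/* Returns true and stores the request id in '*id_out' if an update request
 * should be sent now.  At most one request is outstanding at a time. */
static bool
cond_maybe_send(struct cond_client *c, unsigned int *id_out)
{
    if (c->state != COND_IDLE || !c->pending_change) {
        return false;
    }
    c->request_id = c->next_id++;   /* Remember the id to match the reply. */
    c->pending_change = false;
    c->state = COND_OUTSTANDING;
    *id_out = c->request_id;
    return true;
}

/* Returns true if the reply matches the outstanding request.  An ACK returns
 * the client to idle; a NACK moves it to a client-visible error state. */
static bool
cond_handle_reply(struct cond_client *c, unsigned int id, bool ack)
{
    if (c->state != COND_OUTSTANDING || id != c->request_id) {
        return false;               /* Unexpected message. */
    }
    c->state = ack ? COND_IDLE : COND_ERROR;
    return true;
}
```

In this sketch a second change made while a request is outstanding simply
stays pending until the reply arrives, which is one way to keep requests
and replies paired.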

commit | commitdiff | tree

nickcooper-zhangtonghao [Mon, 9 Jan 2017 01:30:22 +0000 (17:30 -0800)]

dpif-netdev: Uses the OVS_CORE_UNSPEC instead of magic numbers.

This patch uses OVS_CORE_UNSPEC instead of "-1" to mark a queue as
unpinned. More importantly, "-1" cast to unsigned int is equal to
NON_PMD_CORE_ID, so this change keeps the two values distinct.

Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
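
The collision mentioned above can be seen in a few lines of C. This is a
sketch under assumptions: EX_NON_PMD_CORE_ID and EX_CORE_UNSPEC are
stand-in names with assumed values (UINT32_MAX and INT_MAX), to be checked
against the actual definitions in the tree, and unsigned int is assumed to
be 32 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in values for illustration only (assumptions, not copied from OVS). */
#define EX_NON_PMD_CORE_ID UINT32_MAX
#define EX_CORE_UNSPEC INT_MAX

/* Returns true if 'core_id', once converted to unsigned int, is
 * indistinguishable from the non-PMD core id. */
static bool
looks_like_non_pmd(int core_id)
{
    return (unsigned int) core_id == EX_NON_PMD_CORE_ID;
}
```

With these stand-in values, looks_like_non_pmd(-1) is true while
looks_like_non_pmd(EX_CORE_UNSPEC) is false, which is the distinction the
patch is after.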

commit | commitdiff | tree

nickcooper-zhangtonghao [Mon, 9 Jan 2017 01:30:21 +0000 (17:30 -0800)]

netdev-dummy: Uses the NR_QUEUE instead of magic numbers.

NR_QUEUE is defined in "lib/dpif-netdev.h", and netdev-dpdk uses it
instead of a magic number. netdev-dummy should do the same.

Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>

commit | commitdiff | tree

nickcooper-zhangtonghao [Mon, 9 Jan 2017 01:30:19 +0000 (17:30 -0800)]

netdev-dpdk: Fix formatting typo.

Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>

commit | commitdiff | tree

Jarno Rajahalme [Fri, 6 Jan 2017 01:30:27 +0000 (17:30 -0800)]

nx-match: Only store significant bytes to stack.

Always storing the maximum mf_value size wastes about 120 bytes for
each stack entry. This patch changes the stack from an mf_value array
to a string of value-length pairs.

The length is stored after the value so that the stack pop may first
read the length and then the appropriate number of bytes.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
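
The value-then-length layout described above can be sketched as follows.
This is a minimal illustration, not the nx-match code: byte_stack,
stack_push and stack_pop are hypothetical names, the buffer is a fixed
256-byte array, and each entry is limited to 1..255 bytes so that one
trailing byte can hold the length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stack stored as a string of value-length pairs. */
struct byte_stack {
    uint8_t data[256];
    size_t used;            /* Number of bytes currently in 'data'. */
};

/* Appends the 'len' bytes at 'value', then one byte holding 'len'.
 * Returns 0 on success, -1 if 'len' is 0 or the entry does not fit. */
static int
stack_push(struct byte_stack *s, const uint8_t *value, uint8_t len)
{
    if (len == 0 || s->used + len + 1 > sizeof s->data) {
        return -1;
    }
    memcpy(&s->data[s->used], value, len);
    s->used += len;
    s->data[s->used++] = len;   /* Length goes after the value. */
    return 0;
}

/* Reads the trailing length byte first, then copies that many value bytes
 * into 'value_out' (which must have room for 255 bytes).  Returns the entry
 * length, or 0 if the stack is empty or inconsistent. */
static uint8_t
stack_pop(struct byte_stack *s, uint8_t *value_out)
{
    if (s->used == 0) {
        return 0;
    }
    uint8_t len = s->data[s->used - 1];
    if ((size_t) len + 1 > s->used) {
        return 0;
    }
    s->used -= (size_t) len + 1;
    memcpy(value_out, &s->data[s->used], len);
    return len;
}
```

Because the length sits at the top of the stack, a pop never has to scan
from the bottom to find where the last entry starts.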

openvswitch packages for PVE
