git.proxmox.com Git - mirror_ovs.git/log
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:03 +0000 (16:52 -0700)]
vconn: Update length of bundled messages.

Variable length messages need their length updated before they can be
added to the bundle.

The length is sometimes updated after encoding by the encoding function
itself, and otherwise at the latest when the message is sent out.  As
an OpenFlow message is added to a bundle add message, it will not be
sent by itself, and we need to update the length explicitly instead.
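
As an illustration of the explicit length update described above, here
is a minimal Python sketch (not the OVS C code).  It assumes only the
standard OpenFlow header layout (version, type, length, xid); the helper
name update_length() and the example values are hypothetical.

```python
import struct

# OpenFlow header: version (1 byte), type (1 byte), length (2 bytes), xid (4 bytes).
OFP_HEADER = struct.Struct("!BBHI")


def update_length(msg: bytearray) -> None:
    """Write the real buffer size into the header's 16-bit length field."""
    if len(msg) < OFP_HEADER.size:
        raise ValueError("buffer is shorter than an OpenFlow header")
    if len(msg) > 0xFFFF:
        raise ValueError("message does not fit in a 16-bit length field")
    struct.pack_into("!H", msg, 2, len(msg))


# Encode a header whose length only covers the header itself, then append
# a variable-length body, as an encoder might do.
msg = bytearray(OFP_HEADER.pack(0x05, 14, OFP_HEADER.size, 0x1234))
msg += b"\x00" * 24

# The message will be embedded in a bundle add message rather than sent
# on its own, so the length has to be fixed up explicitly first.
update_length(msg)
```

After the call, the length field matches the size of the buffer that is
about to be embedded.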

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:02 +0000 (16:52 -0700)]
ofproto: Make groups versioned.

This is a preparatory step for adding group mod support for bundles in a
following patch.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:02 +0000 (16:52 -0700)]
ofproto: refactor group mods.

This changes the ofproto providers' modify_group() to never fail.

Separating major refactoring to a separate patch should make following
patches easier to review.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:02 +0000 (16:52 -0700)]
ofproto: Report flow mods also from bundles.

Flow mod stats get skewed if they are not reported from bundles.  Move
reporting to ofproto_flow_mod_finish() so that it will be done in all
cases.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:02 +0000 (16:52 -0700)]
ofproto: Generalize flow_mod_requester.

Group mods also need a 'requester', so rename 'flow_mod_requester' as
'openflow_mod_requester'.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:02 +0000 (16:52 -0700)]
ofproto: Add generic non-intrusive object_collection.

Define rule_collection in terms of a new object_collection.  This
makes it easier to add other types of collections later.

This patch makes no functional changes.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:02 +0000 (16:52 -0700)]
ofproto: Use ofproto_mutex for groups and keep track of referring flows.

Adding group support for bundles is simpler if groups are also
modified under ofproto_mutex.

Eliminate the search for rules when deleting a group so that we will
not keep the mutex for too long.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:02 +0000 (16:52 -0700)]
ofproto: Make flow handling more symmetric.

Remove flows from ofproto data structures in the 'start' phase, even if
we may need to add them back in the 'revert' phase.

This makes bundled group mods easier, as a group delete may also
delete flows, and we need the referring flows to be updated in the
'start' phase so that we will not have stale references to the
referring flows.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:02 +0000 (16:52 -0700)]
ofproto: Take group references only when needed.

Avoid unnecessary references when RCU protection suffices.  This makes
group lookup memory management more like flow lookup memory
management.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:01 +0000 (16:52 -0700)]
ofproto: Lockless group lookups.

Make groups RCU protected and make group lookups lockless.  While this
makes group lookups perform better, the main motivation is to have a
unified memory management model for versioned data supported in
OpenFlow bundles.  Later patches will make groups versioned and add
bundle support for groups.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 23:52:01 +0000 (16:52 -0700)]
lib: Separate versioning to its own module.

Separate rule versioning to lib/versions.h to make it easier to use
versioning for other data types.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Babu Shanmugam [Thu, 28 Jul 2016 20:15:14 +0000 (16:15 -0400)]
ovn-controller: Restore ct zone assignment.

Recent commits reorganizing bindings handling and also moving ct zone
assignment to ovn-controller.c caused ct zone assignment to no longer
work.  The code relies on an "all_lports" sset that should contain all
logical ports that we should be assigning ct zones for.  Prior to this
change, all_lports was always empty.

Signed-off-by: Babu Shanmugam <bschanmu@redhat.com>
Co-authored-by: Russell Bryant <russell@ovn.org>
Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Daniele Di Proietto [Tue, 26 Apr 2016 02:06:40 +0000 (19:06 -0700)]
system-tests: Add ping through conntrack test.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
Daniele Di Proietto [Mon, 16 Nov 2015 06:07:25 +0000 (22:07 -0800)]
system-tests: Run conntrack tests with userspace.

The userspace connection tracker doesn't support ALGs, frag reassembly
or NAT yet, so skip those tests.

Also, connection tracking state input from a local port is not possible
in userspace.

Finally, the userspace datapath checks for the IPv4 header checksum, so
fix those in the hardcoded packets.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Daniele Di Proietto [Fri, 29 Jul 2016 17:32:13 +0000 (10:32 -0700)]
tests: Remove trim_zeros() from ovn tests.

trim_zeros() is not necessary anymore, since now we don't pad packets in
the userspace datapath.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Daniele Di Proietto [Fri, 29 Jul 2016 01:02:01 +0000 (18:02 -0700)]
netdev-*: Do not use dp_packet_pad() in recv() functions.

All the netdevs used by dpif-netdev (except for netdev-dpdk) have a
dp_packet_pad() call in the receive function, probably because the
userspace datapath couldn't properly handle short packets.

This doesn't appear to be the case anymore.

This commit removes the call to have a more consistent behavior with the
kernel datapath.

All the testsuite changes in this commit adjust the expectations for
packet lengths in flow dumps and other stats.  There's only one fix in
ovn.at: one of the test_ip() functions generated an incomplete udp
packet, which was not a problem until now, because of the padding.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Russell Bryant [Fri, 29 Jul 2016 18:51:07 +0000 (14:51 -0400)]
travis: Fix flake8 failures from flake8 3.0.

The "hacking" plugin for flake8 is not currently compatible with flake8
3.0.  Ensure that we install flake8 2.x on travis-ci.  Also update the
docs to indicate this incompatibility.

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
Joe Stringer [Fri, 29 Jul 2016 00:09:38 +0000 (17:09 -0700)]
fedora: Prioritize OVS modules in weak-updates.

Out-of-tree modules are installed into the kernel's "extra" modules
directory for the version that kmod-openvswitch is compiled against. For
all other kernels on the system at install time, a symlink is created in
the "weak-updates" directory. This provides a path for the same kernel
module to be used when minor kernel updates are done on a system.
However, without updating the depmod configuration the weak-update will
not be prioritized, so modprobe will switch back to using upstream
kernel modules when you upgrade. This patch introduces that depmod
configuration to ensure that the out-of-tree module is always used when
it is installed, regardless of kernel upgrades.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Joe Stringer [Fri, 29 Jul 2016 00:09:37 +0000 (17:09 -0700)]
rhel: Prioritize our vport-foo modules in depmod.

We've done the same for openvswitch.ko previously, but we really should
be doing this for vport modules as well; otherwise, depmod may try to
pair upstream vport modules with the out-of-tree openvswitch module
(leading to depmod warnings on package install, and failure to load the
module at runtime).

VMware-BZ: #1700293
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Jarno Rajahalme [Fri, 29 Jul 2016 18:04:49 +0000 (11:04 -0700)]
pvector: Expose non-concurrent priority vector.

PMD threads use pvectors but do not need the overhead of the
concurrent version.  Expose the non-concurrent version for
that use.

Note that struct pvector is renamed as struct cpvector (for concurrent
priority vector), and the former struct pvector_impl is now struct
pvector.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 18:04:48 +0000 (11:04 -0700)]
pvector: Get rid of special purpose of INT_MIN.

Allow clients to use the whole priority range.  Note that this changes
the semantics of PVECTOR_FOR_EACH_PRIORITY so that the iteration still
continues for entries at the given priority.

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 18:04:48 +0000 (11:04 -0700)]
pvector: Move PVECTOR_EXTRA_ALLOC to pvector.c.

There is no need to expose PVECTOR_EXTRA_ALLOC in the API.

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Jarno Rajahalme [Fri, 29 Jul 2016 18:04:48 +0000 (11:04 -0700)]
tests: Ignore proxy configuration.

As any proxy configuration may ruin kernel testsuite tests, it is
better to ignore all proxy configuration.

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
Sairam Venugopal [Tue, 26 Jul 2016 00:04:43 +0000 (17:04 -0700)]
datapath-windows: Post Conntrack delete and new events

Post Conntrack delete and create events when entries are deleted or
created.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Paul Boca <pboca@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-By: Yin Lin <linyi@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Sairam Venugopal [Tue, 26 Jul 2016 00:04:42 +0000 (17:04 -0700)]
datapath-windows: Update OvsReadEventCmdHandler in Datapath.c to support different events

OvsReadEventCmdHandler must now reflect the right event being read. If the
event is a Conntrack related event, then convert the entry to netlink
format and send it to userspace. If it's Vport event, retain the existing
workflow.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Paul Boca <pboca@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Sairam Venugopal [Tue, 26 Jul 2016 00:04:41 +0000 (17:04 -0700)]
datapath-windows: Add support for multiple event queue in Event.c

Update Event.c to have multiple event queues and a mechanism to retrieve the
associated queue. Introduce OvsPostCtEvent and OvsRemoveCtEventEntry
similar to OvsPostVportEvent and OvsRemoveVportEventEntry.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Paul Boca <pboca@cloudbasesolutions.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-By: Yin Lin <linyi@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Sairam Venugopal [Tue, 26 Jul 2016 00:04:40 +0000 (17:04 -0700)]
datapath-windows: Modify OvsCreateNlMsgFromCtEntry to make it reusable

Tweak the OvsCreateNlMsgFromCtEntry() method to reuse it for creating
netlink messages from other files. Also define the function in Conntrack.h
to make it accessible.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-By: Yin Lin <linyi@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-By: Yin Lin <linyi@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Sairam Venugopal [Tue, 26 Jul 2016 00:04:39 +0000 (17:04 -0700)]
datapath-windows: Define new multicast conntrack events and netlink protocol

The Hyper-V datapath supports NETLINK_GENERIC and NETLINK_NETFILTER
protocols for netlink communication. Define these two protocols in the
datapath.

Define new Conntrack events (new and delete) and add support for
subscribing to these events. Parse out OVS_NL_ATTR_MCAST_GRP and store it
as part of OVS_EVENT_SUBSCRIBE structure.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-By: Yin Lin <linyi@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Sairam Venugopal [Tue, 26 Jul 2016 00:04:38 +0000 (17:04 -0700)]
datapath-windows: Fix bugs in Event.c around subscribe and lock

When userspace tries to resubscribe to an existing queue, return
STATUS_INVALID_PARAMETER since it's not supported. The current bug
overwrites status to STATUS_SUCCESS.

The second bug fix is around releasing the EventQueue lock if an open
instance couldn't be found. The current version returns without
releasing the lock. Move the OvsAcquireEventQueueLock() call after the
instance is verified.

OvsGetOpenInstance does not enforce a safe read for
gOvsSwitchContext->dpNo. Use the gOvsSwitchContext->dispatchLock for
accessing the parameter.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-By: Yin Lin <linyi@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Sairam Venugopal [Tue, 26 Jul 2016 00:04:37 +0000 (17:04 -0700)]
datapath-windows: Explicitly name vport related event to vportEvent

OVS_EVENT_ENTRY currently handles only Vport related events. Updating the
name of the struct to OVS_VPORT_EVENT_ENTRY. Remove OVS_EVENT_STATUS since
it's currently not in use. Update the datapath to refer to events as
vportEvents. This will aid in the introduction of other events.

Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Acked-By: Yin Lin <linyi@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-By: Yin Lin <linyi@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Ben Pfaff [Wed, 27 Jul 2016 06:50:06 +0000 (23:50 -0700)]
tests: Remove most packet-forwarding related "sleep"s from OVN tests.

These arbitrary sleeps are often longer than necessary and can be too short
in corner cases.  This commit replaces them by a common macro that waits
only as long as necessary.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Flavio Fernandes <flavio@flaviof.com>
Ben Pfaff [Wed, 27 Jul 2016 06:18:12 +0000 (23:18 -0700)]
ovn: Make two end-to-end tests more reliable.

These tests change the northbound configuration and then immediately check
that the changes have taken effect on the hypervisors.  This can't work
reliably, so add a sleep to each one.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Ben Pfaff [Fri, 29 Jul 2016 04:41:29 +0000 (21:41 -0700)]
tests: Define trim_zeros in only one place.

Defining trim_zeros in a common place allows us to skip defining it in
every test that needs it.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Flavio Fernandes <flavio@flaviof.com>
Russell Bryant [Thu, 28 Jul 2016 21:22:41 +0000 (17:22 -0400)]
ovn-controller: Remove old values from local_ids.

local_ids is supposed to be the set of interface iface-id values from
this chassis that correspond to OVN logical ports.  We use this for
detecting when an interface has been removed as well as if child-ports
should be bound to this chassis.

Old values were not being removed from local_ids.  The most immediate
effect of this was that once an interface has been removed from a
chassis, we would think a removal has occurred *every* time through
binding_run and trigger the full binding processing.  This was
a performance problem.

The second problem this would cause is if a port that had child ports
was moved to another chassis.  We would end up with two chassis fighting
over the binding of the child ports.

Signed-off-by: Russell Bryant <russell@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Mauricio Vasquez B [Thu, 28 Jul 2016 20:49:26 +0000 (22:49 +0200)]
README: Add reference to DPDK installation.

There was not any reference to the DPDK installation in the main README file.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Mauricio Vasquez B [Thu, 28 Jul 2016 20:49:25 +0000 (22:49 +0200)]
README: add missing reference to INSTALL.SELinux.md

Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Nimay Desai [Wed, 27 Jul 2016 18:28:24 +0000 (11:28 -0700)]
ovn-northd, tests: Adding IPAM to ovn-northd.

Added an IPv4 and MAC addresses management system to ovn-northd. When a logical
switch's other_config:subnet field is set, logical ports attached to that
switch that have the keyword "dynamic" in their addresses column will
automatically be allocated a globally unique MAC address/unused IPv4 address
within the provided subnet. The allocated address will populate the
dynamic_addresses column. This can be useful for a user who wants to deploy
many VMs or containers with networking capabilities, but does not care about
the specific MAC/IPv4 addresses that are assigned.
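
The IPv4 half of the allocation described above can be sketched in a few
lines of Python.  This is only an illustration of "pick an unused address
within the subnet", not the ovn-northd implementation: the function name
allocate_ipv4() and the in_use set are hypothetical, and any addresses
that ovn-northd itself may reserve are not modelled.

```python
import ipaddress


def allocate_ipv4(subnet, in_use):
    """Return the first host address in 'subnet' that is not in 'in_use'.

    The chosen address is added to 'in_use'.  Returns None when the
    subnet has no free host address left.
    """
    network = ipaddress.ip_network(subnet)
    for host in network.hosts():
        if host not in in_use:
            in_use.add(host)
            return host
    return None


# Two ports with "dynamic" addresses on a switch with a tiny /30 subnet.
in_use = set()
first = allocate_ipv4("192.168.1.0/30", in_use)
second = allocate_ipv4("192.168.1.0/30", in_use)
third = allocate_ipv4("192.168.1.0/30", in_use)  # subnet is now exhausted
```

A linear scan is the simplest choice for a sketch; a real allocator would
more likely keep a bitmap or similar structure per subnet.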

Added tests in ovn.at for ipam.

Signed-off-by: Nimay Desai <nimaydesai1@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Flavio Leitner [Tue, 26 Jul 2016 01:16:31 +0000 (22:16 -0300)]
rhel: Fix ifup-ovs to delete ports first.

When ifdown isn't executed (system didn't shut down properly),
ports remain in the openvswitch's database.  In that case, an
inconsistency is left behind when the ifcfg was modified because
ovs-vsctl won't do anything to update existing port's configuration
in the database.

The ifup/ifdown will operate only on configured interfaces, so
this patch fixes the issue by deleting the port from the database
before attempting to configure it with fresh configuration.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Russell Bryant [Tue, 26 Jul 2016 20:29:25 +0000 (16:29 -0400)]
ovn: Rename "gateway" to "l3gateway".

When L3 gateway support was added, it introduced a port type called
"gateway" and a corresponding option called "gateway-chassis".  Since
that time, we also have an L2 gateway port type called "l2gateway" and a
corresponding option called "l2gateway-chassis".  This patch renames the
L3 gateway port type and option to "l3gateway" and "l3gateway-chassis"
to make things a little more clear and consistent.

Signed-off-by: Russell Bryant <russell@ovn.org>
Ryan Moats [Thu, 28 Jul 2016 16:53:03 +0000 (16:53 +0000)]
ovn: Add ovn-controller-vtep debian package

Having a separate debian package for deploying
the ovn-controller-vtep binary enables the ability
to assign specific nodes the role of communicating
with VTEP enabled TORs.

Change-Id: Ia36aea7d89bd011a57918820b2a9f6e3469b3e04
Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Ryan Moats [Thu, 28 Jul 2016 18:10:16 +0000 (18:10 +0000)]
ovn-controller: Clean up cases that lead to duplicate OF flows.

In physical_run, there are multiple places where OF flows can be
produced each cycle.  Because the desired flow table may not have
been completely cleared first, remove flows created during previous
runs before creating new flows.  This avoids collisions.

Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Thu, 28 Jul 2016 17:48:37 +0000 (10:48 -0700)]
system-ovn.at: Fix ICMP conntrack output.

Recent changes to the dump-conntrack command provide more info
(type,code), but the system-ovn tests weren't updated for this.
Update the tests.

Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Gurucharan Shetty <guru@ovn.org>
Thadeu Lima de Souza Cascardo [Thu, 28 Jul 2016 16:14:58 +0000 (13:14 -0300)]
route-table: flush addresses list when route table is reset

When the route table is reset, the addresses list may be out of date, as we race
for the many netlink socket notifications.

A quick fix for this is flushing the addresses list, before dumping the routes
and gathering source addresses for them.

That way, instead of using invalid source addresses or preventing an entry from
being added because of missing source addresses, repeated tests showed the correct
entry is always added.

As route-table.c is only built for Linux, we don't need to be concerned that
Windows does not have netdev_get_addrs_list_flush, since it uses
route-table-stub.c instead.

Fixes: a8704b502785 ("tunneling: Handle multiple ip address for given device.")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Dongjun [Thu, 28 Jul 2016 07:14:01 +0000 (15:14 +0800)]
ovn: add easy SNAT test case.

Signed-off-by: Dongjun <dongj@dtdream.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Chandra Sekhar Vejendla [Thu, 28 Jul 2016 04:34:06 +0000 (21:34 -0700)]
ovn: Allow SNAT traffic destined to router ip.

When router ip is used as SNAT IP, traffic destined to router
ip should not be dropped.

Fixes: 4685e523695c ("ovn: Support multiple addresses on a single logical
router port.")
Signed-off-by: Chandra Sekhar Vejendla <csvejend@us.ibm.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Kyle Mestery [Wed, 27 Jul 2016 17:40:23 +0000 (12:40 -0500)]
doc: Update INSTALL.Docker.md to reflect its focus on OVN

While reading this document, the title stood out to me as not
accurate. The title indicates it will discuss how to use
Open vSwitch with Docker, but in reality, it's about using
Open Virtual Networking with Docker.

This change updates the title, as well as the opening paragraphs
to more accurately reflect what the document is talking about.

Signed-off-by: Kyle Mestery <mestery@mestery.com>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Daniele Di Proietto [Thu, 26 May 2016 01:10:09 +0000 (18:10 -0700)]
conntrack: Add 'dl_type' parameter to conntrack_execute().

Now that dpif_execute has a 'flow' member, it's pretty easy to access
the flow (or the matching megaflow) in dp_execute_cb().

This means that it's no longer necessary for the connection tracker to
reextract 'dl_type' from the packet; it can be passed as a parameter.

This change means that we have to slightly complicate test-conntrack to
group the packets by dl_type before passing them to the connection
tracker.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
Daniele Di Proietto [Fri, 13 May 2016 22:04:17 +0000 (15:04 -0700)]
conntrack: Track ICMP type and code.

From the connection tracker perspective, an ICMP connection is a tuple
identified by source ip address, destination ip address and ICMP id.

While this allows basic ICMP traffic (pings) to work, it doesn't take
into account the icmp type: the connection tracker will allow
requests/replies in any directions.

This is improved by making the ICMP type and code part of the connection
tuple.  An ICMP echo request packet from A to B, will create a
connection that matches ICMP echo request from A to B and ICMP echo
replies from B to A.  The same is done for timestamp and info
request/replies, and for ICMPv6.

A new module, conntrack-icmp, is implemented to allow only "request"
types to create new connections.
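
The effect of putting the ICMP type and code into the connection tuple
can be sketched as follows.  This Python fragment is an illustration
only, not the conntrack-icmp module: the names IcmpKey, reverse_key()
and track() are hypothetical, the result strings are placeholders rather
than real ct_state flags, and only the ICMPv4 request/reply pairs (echo,
timestamp, info) are covered.

```python
from typing import NamedTuple

# ICMPv4 request type -> matching reply type.
ICMP4_REPLY_TYPE = {
    8: 0,    # echo request      -> echo reply
    13: 14,  # timestamp request -> timestamp reply
    15: 16,  # info request      -> info reply
}


class IcmpKey(NamedTuple):
    src: str
    dst: str
    icmp_id: int
    icmp_type: int
    icmp_code: int


def reverse_key(key):
    """Key of the reply that a request is allowed to receive."""
    return IcmpKey(key.dst, key.src, key.icmp_id,
                   ICMP4_REPLY_TYPE[key.icmp_type], key.icmp_code)


def track(connections, key):
    """Classify a packet against the set of known connection keys."""
    if key in connections:
        return "tracked"
    if key.icmp_type in ICMP4_REPLY_TYPE:
        # Only request types may create a new connection.
        connections.add(key)
        connections.add(reverse_key(key))
        return "new"
    return "invalid"


conns = set()
request = track(conns, IcmpKey("A", "B", 1, 8, 0))      # echo request A -> B
reply = track(conns, IcmpKey("B", "A", 1, 0, 0))        # echo reply   B -> A
unsolicited = track(conns, IcmpKey("A", "B", 1, 0, 0))  # echo reply   A -> B
```

With the type in the key, the unsolicited echo reply from A to B no
longer matches the connection created by A's request.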

Also, since they're tracked in both userspace and kernel
implementations, ICMP type and code are always printed in ct-dpif (a few
testcases are updated as a consequence).

Reported-by: Subramani Paramasivam <subramani.paramasivam@wipro.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
Daniele Di Proietto [Mon, 16 Nov 2015 06:07:25 +0000 (22:07 -0800)]
tests: Add conntrack ofproto-dpif tests.

While the system testsuite already has connection tracking tests, it
will still be useful to add some to the standard testsuite because:

* They're run more often by developers.
* Some of them are more interesting for the userspace datapath.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Daniele Di Proietto [Wed, 20 Apr 2016 18:19:18 +0000 (11:19 -0700)]
flow: Generate checksum and udp_len in flow_compose().

This is useful to test the connection tracker, which performs checksum
and udp length verification.
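
For reference, the kind of checksum a receiver verifies can be sketched
in Python.  This is the generic Internet checksum (ones' complement sum
of 16-bit words) applied to an IPv4 header, not the flow_compose() code;
the function names are hypothetical and the sample header bytes are
illustrative.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_length(payload: bytes) -> int:
    """The UDP length field covers the 8-byte header plus the payload."""
    return 8 + len(payload)


# A 20-byte IPv4 header with the checksum field (bytes 10-11) set to zero.
header = bytearray.fromhex("450000730000400040110000c0a80001c0a800c7")
csum = internet_checksum(bytes(header))
struct.pack_into("!H", header, 10, csum)

# A receiver verifies by summing the whole header, checksum included.
verified = internet_checksum(bytes(header)) == 0
```

A packet composed without this step would carry a zero checksum and
would fail that verification.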

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
Daniele Di Proietto [Mon, 16 Nov 2015 06:07:25 +0000 (22:07 -0800)]
dpif-netdev: Implement conntrack flush interface.

New functions are implemented in the conntrack module to support this.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Daniele Di Proietto [Mon, 16 Nov 2015 06:07:25 +0000 (22:07 -0800)]
dpif-netdev: Implement conntrack dump functions.

New functions are implemented in the conntrack module to support this.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Daniele Di Proietto [Mon, 16 Nov 2015 06:07:25 +0000 (22:07 -0800)]
dpif-netdev: Execute conntrack action.

This commit implements the OVS_ACTION_ATTR_CT action in dpif-netdev.

To allow ofproto-dpif to detect the conntrack feature, flow_put will no
longer discard flows with ct_* fields set. We still shouldn't allow
flows with NAT bits set, since there is no support for NAT.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Antonio Fischetti <antonio.fischetti@intel.com>
Daniele Di Proietto [Fri, 15 Apr 2016 00:00:35 +0000 (17:00 -0700)]
tests: Add test-conntrack pcap test.

Simple program that runs the packets in a pcap file through the
connection tracker and prints the 'ct_state' for each packet.

E.g. the line:

`./tests/ovstest test-conntrack capture.pcap 2`

sends the packets in `capture.pcap` to the connection tracker, 2 per
call.

Useful for debugging.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Daniele Di Proietto [Mon, 16 Nov 2015 06:07:25 +0000 (22:07 -0800)]
tests: Add very simple conntrack benchmark.

This introduces a very limited but simple benchmark for
conntrack_execute(). It just repeatedly sends the same batch of packets
through the connection tracker and returns the time spent to process
them.

While this is not a realistic benchmark, it has proven useful during
development to evaluate different batching and locking strategies.

E.g. the line:

`./tests/ovstest test-conntrack benchmark 1 14880000 32`

starts 1 thread that will send 14880000 packets to the connection
tracker, 32 at a time. It will print the time taken to process them.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Daniele Di Proietto [Thu, 28 Jul 2016 01:32:15 +0000 (18:32 -0700)]
XXX Improve comment.

Daniele Di Proietto [Mon, 16 May 2016 19:59:23 +0000 (12:59 -0700)]
conntrack: Periodically delete expired connections.

This commit adds a thread that periodically removes expired connections.

The expiration time of a connection can be expressed by:

expiration = now + timeout

For each possible 'timeout' value (there aren't many) we keep a list.
When the expiration is updated, we move the connection to the back of the
corresponding 'timeout' list. This way, the list is always ordered by
'expiration'.

When the cleanup thread iterates through the lists for expired
connections, it can stop at the first non expired connection.
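
The scheme above can be sketched in Python as follows.  This is an
illustration under stated assumptions, not the conntrack code: the class
name ExpiryLists and its methods are hypothetical, 'now' is assumed to
never go backwards, and locking between the cleanup thread and the packet
path is left out entirely.

```python
from collections import OrderedDict


class ExpiryLists:
    """One insertion-ordered list per timeout value."""

    def __init__(self, timeouts):
        self.lists = {t: OrderedDict() for t in timeouts}
        self.where = {}  # connection -> timeout list it currently sits on

    def update(self, conn, timeout, now):
        """Set expiration = now + timeout and move conn to the back."""
        old = self.where.get(conn)
        if old is not None:
            del self.lists[old][conn]
        self.lists[timeout][conn] = now + timeout
        self.where[conn] = timeout

    def sweep(self, now):
        """Remove and return expired connections."""
        expired = []
        for lst in self.lists.values():
            while lst:
                conn, expiration = next(iter(lst.items()))
                if expiration > now:
                    break  # everything behind it expires even later
                lst.popitem(last=False)
                del self.where[conn]
                expired.append(conn)
        return expired


lists = ExpiryLists([30, 120])
lists.update("a", 30, now=0)    # expires at 30
lists.update("b", 30, now=10)   # expires at 40
lists.update("c", 120, now=0)   # expires at 120
lists.update("a", 30, now=20)   # refreshed: moves behind "b", expires at 50
first_sweep = lists.sweep(now=45)
second_sweep = lists.sweep(now=200)
```

Because every list shares a single timeout and entries are appended in
order of increasing 'now', each sweep only visits the expired entries
plus at most one live entry per list.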

Suggested-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
Daniele Di Proietto [Mon, 16 Nov 2015 06:07:25 +0000 (22:07 -0800)]
conntrack: New userspace connection tracker.

This commit adds the conntrack module.

It is a connection tracker that resides entirely in userspace.  Its
primary user will be the dpif-netdev datapath.

The module's main goal is to provide conntrack_execute(), which offers a
convenient interface to implement the datapath ct() action.

The conntrack module uses two submodules to deal with the l4 protocol
details (conntrack-other for UDP and ICMP, conntrack-tcp for TCP).

The conntrack-tcp submodule implementation is adapted from FreeBSD's pf
subsystem, therefore it's BSD licensed.  It has been slightly altered to
match the OVS coding style and to allow the pickup of already
established connections.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Antonio Fischetti <antonio.fischetti@intel.com>
Acked-by: Joe Stringer <joe@ovn.org>
8 years agoflow: Introduce parse_dl_type().
Daniele Di Proietto [Mon, 16 Nov 2015 06:07:25 +0000 (22:07 -0800)]
flow: Introduce parse_dl_type().

The function simply returns the Ethernet type of the packet (after
skipping the VLAN tag, if present).  It will be used by a following
commit.
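
As a rough illustration of the idea (not the actual parse_dl_type()
implementation), the sketch below reads the Ethernet type from a raw frame
and steps over VLAN tags. The function name sketch_dl_type, the choice to
skip stacked 802.1Q/802.1ad tags, and the return value of 0 for truncated
frames are all assumptions made for this example; LLC/SNAP frames are not
handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_TYPE_VLAN_8021Q  0x8100
#define ETH_TYPE_VLAN_8021AD 0x88a8

/* Returns the Ethernet type of 'frame' in host byte order, after skipping
 * any VLAN tags, or 0 if the frame is too short to contain one. */
static uint16_t
sketch_dl_type(const uint8_t *frame, size_t len)
{
    size_t off = 12;             /* Skip destination and source MAC. */

    assert(frame != NULL);
    for (;;) {
        uint16_t type;

        if (len < off + 2) {
            return 0;            /* Truncated frame. */
        }
        type = (uint16_t) ((frame[off] << 8) | frame[off + 1]);
        if (type != ETH_TYPE_VLAN_8021Q && type != ETH_TYPE_VLAN_8021AD) {
            return type;
        }
        off += 4;                /* Skip the 2-byte TPID and 2-byte TCI. */
    }
}
```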

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
8 years agoflow: Export parse_ipv6_ext_hdrs().
Daniele Di Proietto [Mon, 16 Nov 2015 06:07:25 +0000 (22:07 -0800)]
flow: Export parse_ipv6_ext_hdrs().

This will be used by a future commit.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
8 years agopackets: Define ICMP types.
Daniele Di Proietto [Mon, 16 Nov 2015 06:07:25 +0000 (22:07 -0800)]
packets: Define ICMP types.

Linux and FreeBSD have slightly different names for these constants.
Windows doesn't define them.  It is simpler to redefine them from
scratch for OVS.  The new names are different than those used in Linux
and FreeBSD.

These definitions will be used by a future commit.

Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
8 years agodebian: Add six dependency to python-openvswitch.
Joe Stringer [Tue, 26 Jul 2016 19:34:16 +0000 (12:34 -0700)]
debian: Add six dependency to python-openvswitch.

python-openvswitch uses the Python "six" library, so add a dependency on
it to the Debian package.

VMware-BZ: #1700259
Reported-by: Devang Doshi <ddoshi@vmware.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
8 years agonetdev-provider: fix comments for netdev_rxq_recv
Mark Kavanagh [Tue, 26 Jul 2016 13:19:17 +0000 (14:19 +0100)]
netdev-provider: fix comments for netdev_rxq_recv

Commit 64839cf43 applies batch objects to netdev-providers, but
some comments were not updated accordingly. Fix these:
   - replace 'pkts' with 'batch'
   - replace '*cnt' with 'batch->count'
   - replace MAX_RX_BATCH with NETDEV_MAX_BURST
   - remove superfluous whitespace

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoovsdb-client: Fix memory leak reported by Valgrind.
William Tu [Wed, 27 Jul 2016 03:24:57 +0000 (20:24 -0700)]
ovsdb-client: Fix memory leak reported by Valgrind.

Testcase 1857: ovsdb-monitor.at:538 monitor-cond-change reports the
following definite memory leak:
    ovsdb_schema_create (ovsdb.c:34)
    ovsdb_schema_from_json (ovsdb.c:196)
    fetch_schema (ovsdb-client.c:385)
    do_monitor_cond (ovsdb-client.c:1112)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agotest-ovsdb: Fix memory leak reported by Valgrind.
William Tu [Wed, 27 Jul 2016 03:12:36 +0000 (20:12 -0700)]
test-ovsdb: Fix memory leak reported by Valgrind.

Valgrind testcase 1967: simple idl, conditional, modify as delete due
to condition - C reports the following leak:
    json_array_create_empty (json.c:185)
    json_parser_push_array (json.c:1234)
    json_parser_input (json.c:1328)
    json_lex_input (json.c:945)
    json_parser_feed (json.c:1103)
    json_from_string (json.c:1025)
    parse_json (test-ovsdb.c:227)
    update_conditions (test-ovsdb.c:2324)
    do_idl (test-ovsdb.c:2389)
    ovs_cmdl_run_command (command-line.c:121)
    main (test-ovsdb.c:73)

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agopython: Send old values of the updated cols in notify for update2
Numan Siddique [Tue, 26 Jul 2016 17:58:14 +0000 (23:28 +0530)]
python: Send old values of the updated cols in notify for update2

When the Python IDL calls the "notify" function after processing the
"update2" message from ovsdb-server, it is supposed to send the old values
of the updated columns as the last parameter. But the recent commit
"897c8064" sends the updated values instead, which breaks this behaviour.
This patch fixes the issue. It also updates the description of
the 'updates' param of the notify function to make it clearer.

Fixes: 897c8064 ("python: move Python idl to work with monitor_cond")
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agonetdev: do not allow devices to be opened with conflicting types
Thadeu Lima de Souza Cascardo [Wed, 27 Jul 2016 15:06:45 +0000 (12:06 -0300)]
netdev: do not allow devices to be opened with conflicting types

When a device is already opened, netdev_open should verify that the types match,
or else return an error.

Otherwise, users might expect to open a device with a certain type and get a
handle belonging to a different type.

This also prevents certain conflicting configurations that would have a port of
a certain type in the database and one of a different type on the system.

For example, when adding an interface with a type other than system, and there
is already a system interface with the same name, as the routing table will hold
a reference to that system interface, some conflicts will arise. The netdev will
be opened with the incorrect type and that will make vswitchd remove it, but
adding it again will fail as it already exists. Failing earlier prevents some
vswitchd loops in reconfiguring the interface.
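
The check can be pictured with the small sketch below. It is not the
netdev code: the registry, struct dev and dev_open() are hypothetical, and
returning EEXIST on a type conflict is an assumption chosen for the
example. A NULL type means "no preference", which matches the behaviour
described in the netdev-vport and in-band entries further down this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEVS 8

/* Hypothetical registry of opened devices. */
struct dev {
    char name[16];
    char type[16];
    int refcnt;
};

static struct dev devs[MAX_DEVS];
static int n_devs;

/* Opens 'name'.  If the device is already open, a non-NULL 'type' must
 * match the type it was opened with; otherwise an error is returned
 * instead of a handle of an unexpected type. */
static int
dev_open(const char *name, const char *type, struct dev **devp)
{
    struct dev *d;

    assert(name != NULL && devp != NULL);
    *devp = NULL;
    for (int i = 0; i < n_devs; i++) {
        if (!strcmp(devs[i].name, name)) {
            if (type && type[0] && strcmp(devs[i].type, type)) {
                return EEXIST;   /* Conflicting type: fail early. */
            }
            devs[i].refcnt++;
            *devp = &devs[i];
            return 0;
        }
    }
    if (n_devs == MAX_DEVS) {
        return ENOMEM;
    }
    d = &devs[n_devs++];
    snprintf(d->name, sizeof d->name, "%s", name);
    /* No type given and nothing open yet: default to "system". */
    snprintf(d->type, sizeof d->type, "%s",
             type && type[0] ? type : "system");
    d->refcnt = 1;
    *devp = d;
    return 0;
}
```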

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agodpif-netdev: use the open_type when creating the local port
Thadeu Lima de Souza Cascardo [Wed, 27 Jul 2016 15:06:44 +0000 (12:06 -0300)]
dpif-netdev: use the open_type when creating the local port

Instead of using the internal type, use the port_open_type when creating the
local port. That makes sure that whenever dpif_port_query is used, the netdev
open_type is returned instead of the "internal" type.

For other ports, that is already the case, as the netdev type is used when
creating the dp_netdev_port.

That changes the output of dpctl when showing the local port, and also when
trying to change its type. So, corresponding tests are fixed.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agonetdev-vport: don't use system type when opening netdev
Thadeu Lima de Souza Cascardo [Wed, 27 Jul 2016 15:06:43 +0000 (12:06 -0300)]
netdev-vport: don't use system type when opening netdev

tunnel_check_status_change__ calls netdev_open with type system. Using NULL
instead will default to system in case the device is not opened yet, and allow a
different type in case it's already opened.

Any type should be fine, as netdev_get_carrier will work with any of them.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoin-band: don't use system type when opening netdev
Thadeu Lima de Souza Cascardo [Wed, 27 Jul 2016 15:06:42 +0000 (12:06 -0300)]
in-band: don't use system type when opening netdev

A netdev might be already opened with a different type and that can be used
instead. The system type is already the default type that will be used when
there is no netdev opened and the type is not specified.

And as long as the opened netdev supports the required operations, its
type doesn't matter.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoin-band: use open_type when opening internal device
Thadeu Lima de Souza Cascardo [Wed, 27 Jul 2016 15:06:41 +0000 (12:06 -0300)]
in-band: use open_type when opening internal device

in-band code will open a device that it expects to be the main internal port of
the bridge. However, it's possible that the correct type is a different one. For
dpif-netdev, it might be a tap device, or a dummy device for dummy datapaths.
ofproto_port_open_type will give the correct type.

While this doesn't cause any problems right now, as the needed type would be
opened already, a later patch assumes that a netdev cannot be opened with
conflicting types.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoFAQ: Add contents section and enable internal links.
Bhanuprakash Bodireddy [Wed, 27 Jul 2016 21:16:17 +0000 (22:16 +0100)]
FAQ: Add contents section and enable internal links.

Add contents section to FAQ and enable internal links in doc for pretty
printing on GitHub.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agomove ovn/lib/<lex|actions|expr>.h to include/ovn
Aaron Rosen [Mon, 25 Jul 2016 22:04:32 +0000 (15:04 -0700)]
move ovn/lib/<lex|actions|expr>.h to include/ovn

This patch is done to enable in-tree building of the ovn-utils python
wrapper.  This is similar to what was done in commit
ee89ea7b477bb4fd05137de03b2e8443807ed9f4 (json: Move from lib to
include/openvswitch.).

Signed-off-by: Aaron Rosen <aaronorosen@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agophysical: Persist tunnels from one ovn-controller loop to the next.
Ryan Moats [Mon, 25 Jul 2016 16:28:52 +0000 (16:28 +0000)]
physical: Persist tunnels from one ovn-controller loop to the next.

While commit ab39371d68842b7e4000cc5d8718e6fc04e92795
(ovn-controller: Handle physical changes correctly) addressed
unit test failures, it did so at the cost of performance: [1]
notes that ovn-controller cpu usage is now pegged at 100%.

The root cause is that while the storage for tunnels is
persisted, their creation is not (which the above change
incorrectly assumed was the case).  This patch persists
tunneled data across invocations of physical_run.  A side
effect is the renaming of the localfvif_map_changed variable
to physical_map_changed and the extension of its scope to
include tunnel changes.

[1] http://openvswitch.org/pipermail/dev/2016-July/076058.html

Fixes: ab39371d6884 ("ovn-controller: Handle physical changes correctly")
Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Acked-by: Flavio Fernandes <flavio@flaviof.com>
Tested-by: Flavio Fernandes <flavio@flaviof.com>
Acked-by: Liran Schour <lirans@il.ibm.com>
Tested-by: Liran Schour <lirans@il.ibm.com>
Acked-by: Hui Kang <kangh@us.ibm.com>
Tested-by: Hui Kang <kangh@us.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovsdb: Fix memory leak in replication logic
Andy Zhou [Tue, 26 Jul 2016 02:23:02 +0000 (19:23 -0700)]
ovsdb: Fix memory leak in replication logic

Release the memory of the reply message to the initial "monitor" request.

Reported-at: http://openvswitch.org/pipermail/dev/2016-July/076075.html
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
8 years agoovsdb: Properly close replication rpc connection
Andy Zhou [Tue, 26 Jul 2016 02:22:03 +0000 (19:22 -0700)]
ovsdb: Properly close replication rpc connection

This patch removes the RPC-related memory leak reported at the link below.

Reported-at: http://openvswitch.org/pipermail/dev/2016-July/076075.html
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
8 years agoovsdb: Fix memory leak reported by valgrind.
Liran Schour [Mon, 25 Jul 2016 08:00:29 +0000 (11:00 +0300)]
ovsdb: Fix memory leak reported by valgrind.

Destroy the shash when the session's condition structure is destroyed.
Reported here: http://openvswitch.org/pipermail/dev/2016-July/075968.html

Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
8 years agoovs-router: Ignore IPv6 source addresses for IPv4 routes.
Thadeu Lima de Souza Cascardo [Sun, 24 Jul 2016 16:07:26 +0000 (13:07 -0300)]
ovs-router: Ignore IPv6 source addresses for IPv4 routes.

Though this should not happen when we have another address on the device that is
IPv4 mapped, we should prevent adding a routing entry to IPv4 with an IPv6
source address.

This entry has been observed when the addresses list was out of date.

Cached: 172.16.10.1/32 dev br3 SRC fe80::c4d0:14ff:feb1:b54b
Cached: 172.16.10.0/24 dev br3 SRC fe80::c4d0:14ff:feb1:b54b

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoutilities/ovs-ctl.in: Allow non-monitoring daemons
Aaron Conole [Mon, 25 Jul 2016 18:03:51 +0000 (14:03 -0400)]
utilities/ovs-ctl.in: Allow non-monitoring daemons

This commit allows the ovs-ctl command to spawn daemons without the
internal process monitor.  This is useful when integrating with,
e.g., systemd, which provides its own monitoring facilities.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Markos Chandras <mchandras@suse.de>
Acked-by: Ben Pfaff <blp@ovn.org>
Acked-by: Flavio Fernandes <flavio@flaviof.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agodpif-netdev: Introduce pmd-rxq-affinity.
Ilya Maximets [Wed, 27 Jul 2016 14:44:44 +0000 (17:44 +0300)]
dpif-netdev: Introduce pmd-rxq-affinity.

Add a new 'other_config:pmd-rxq-affinity' field to the Interface table to
manually pin RX queues to desired cores.

This functionality is required to achieve maximum performance because
different kinds of ports have different costs of rx/tx operations, and
only the user can know the expected workload on each port.

Example:
# ./bin/ovs-vsctl set interface dpdk0 options:n_rxq=4 \
                  other_config:pmd-rxq-affinity="0:3,1:7,3:8"
Queue #0 pinned to core 3;
Queue #1 pinned to core 7;
Queue #2 not pinned;
Queue #3 pinned to core 8.
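
To make the "queue:core" list format concrete, here is a small parsing
sketch. It is only an illustration of the syntax shown in the example, not
the dpif-netdev parser: parse_rxq_affinity and CORE_UNPINNED are
hypothetical names, and silently ignoring queue ids beyond n_rxq is an
assumption.

```c
#include <assert.h>
#include <stdio.h>

#define CORE_UNPINNED (-1)

/* Parses a "queue:core,queue:core,..." string into cores[0..n_rxq-1].
 * Queues that are not mentioned stay CORE_UNPINNED.  Returns 0 on
 * success, -1 if the string is malformed. */
static int
parse_rxq_affinity(const char *spec, int *cores, int n_rxq)
{
    const char *p = spec;

    assert(spec != NULL && cores != NULL && n_rxq >= 0);
    for (int i = 0; i < n_rxq; i++) {
        cores[i] = CORE_UNPINNED;
    }

    while (*p) {
        int queue, core, consumed;

        if (sscanf(p, "%d:%d%n", &queue, &core, &consumed) != 2
            || queue < 0 || core < 0) {
            return -1;
        }
        if (queue < n_rxq) {
            cores[queue] = core;
        }
        p += consumed;
        if (*p == ',') {
            p++;                 /* Move on to the next pair. */
        } else if (*p) {
            return -1;           /* Unexpected trailing characters. */
        }
    }
    return 0;
}
```

Parsing "0:3,1:7,3:8" with n_rxq=4 leaves queue 2 unpinned, matching the
example above.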

Cores that have an rxq explicitly assigned to them are automatically
isolated, because it's useful to keep a constant polling rate on
some performance-critical ports while adding/deleting other ports,
without having to pin all ports explicitly.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agodpif-netdev: Add reconfiguration request to dp_netdev.
Ilya Maximets [Wed, 27 Jul 2016 14:44:43 +0000 (17:44 +0300)]
dpif-netdev: Add reconfiguration request to dp_netdev.

The next patches will add new conditions under which reconfiguration is
required. It will be simpler to have a common way to request
reconfiguration.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agobridge: Pass interface's configuration to datapath.
Ilya Maximets [Wed, 27 Jul 2016 14:44:42 +0000 (17:44 +0300)]
bridge: Pass interface's configuration to datapath.

This commit adds functionality to pass value of 'other_config' column
of 'Interface' table to datapath.

This may be used to pass options that are not directly connected with the
netdev and to configure the behaviour of the datapath for different ports.
For example: pinning of rx queues to polling threads in dpif-netdev.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agodpif-netdev: XPS (Transmit Packet Steering) implementation.
Ilya Maximets [Wed, 27 Jul 2016 14:44:41 +0000 (17:44 +0300)]
dpif-netdev: XPS (Transmit Packet Steering) implementation.

If the number of CPUs in pmd-cpu-mask is not divisible by the number of
queues, and in a few more complex situations, TX queue-ids may be
distributed unfairly between PMD threads.

For example, if we have 2 ports with 4 queues and 6 CPUs in pmd-cpu-mask,
the following distribution is possible:
<------------------------------------------------------------------------>
pmd thread numa_id 0 core_id 13:
        port: vhost-user1       queue-id: 1
        port: dpdk0     queue-id: 3
pmd thread numa_id 0 core_id 14:
        port: vhost-user1       queue-id: 2
pmd thread numa_id 0 core_id 16:
        port: dpdk0     queue-id: 0
pmd thread numa_id 0 core_id 17:
        port: dpdk0     queue-id: 1
pmd thread numa_id 0 core_id 12:
        port: vhost-user1       queue-id: 0
        port: dpdk0     queue-id: 2
pmd thread numa_id 0 core_id 15:
        port: vhost-user1       queue-id: 3
<------------------------------------------------------------------------>

As we can see above, the dpdk0 port is polled by threads on cores
12, 13, 16 and 17.

By design of dpif-netdev, there is only one TX queue-id assigned to each
pmd thread. These queue-ids are sequential, similar to core-ids, and a
thread will send packets to the queue with exactly this queue-id
regardless of port.

In the previous example:

pmd thread on core 12 will send packets to tx queue 0
pmd thread on core 13 will send packets to tx queue 1
...
pmd thread on core 17 will send packets to tx queue 5

So, for the dpdk0 port, after truncation in netdev-dpdk:

core 12 --> TX queue-id 0 % 4 == 0
core 13 --> TX queue-id 1 % 4 == 1
core 16 --> TX queue-id 4 % 4 == 0
core 17 --> TX queue-id 5 % 4 == 1

As a result, only 2 of the 4 queues are used.
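
The arithmetic above can be reproduced with a few lines of C. The helper
count_used_txqs is a hypothetical name introduced for this sketch; it only
models the "static queue-id modulo number of TX queues" truncation
described in the text.

```c
#include <assert.h>

/* Counts how many distinct TX queues are hit when each static per-thread
 * queue-id is reduced modulo the number of TX queues of the port. */
static int
count_used_txqs(const int *static_ids, int n_ids, int n_txq)
{
    unsigned int used = 0;       /* Bitmask of queues already seen. */
    int count = 0;

    assert(n_txq > 0 && n_txq <= 32);
    for (int i = 0; i < n_ids; i++) {
        unsigned int bit;

        assert(static_ids[i] >= 0);
        bit = 1u << (static_ids[i] % n_txq);
        if (!(used & bit)) {
            used |= bit;
            count++;
        }
    }
    return count;
}
```

For the threads polling dpdk0 (static queue-ids 0, 1, 4 and 5) and 4 TX
queues, the function returns 2.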

To fix this issue, a form of XPS is implemented in the following way:

* TX queue-ids are allocated dynamically.
* When a PMD thread tries to send packets to a new port for the first
  time, it allocates the least used TX queue for this port.
* PMD threads periodically perform revalidation of
  allocated TX queue-ids. If a queue wasn't used in the last
  XPS_TIMEOUT_MS milliseconds, it will be freed during revalidation.
* XPS is not active if we have enough TX queues.
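
The allocation scheme in the list above might look roughly like the sketch
below. All names (struct port_txqs, struct thread_txq, thread_get_txq,
thread_revalidate) are hypothetical, locking is omitted, and the timeout
is passed as a parameter because the value of XPS_TIMEOUT_MS is not stated
here.

```c
#include <assert.h>

#define MAX_TXQ 8

/* Hypothetical per-port state: how many threads use each TX queue. */
struct port_txqs {
    int n_txq;
    int users[MAX_TXQ];
};

/* Hypothetical per-thread, per-port state. */
struct thread_txq {
    int qid;                     /* -1 while no queue is allocated. */
    long long last_used;         /* Time of the last send, in ms. */
};

/* Returns the queue to use for sending at time 'now', allocating the
 * least used queue of the port on first use. */
static int
thread_get_txq(struct thread_txq *t, struct port_txqs *p, long long now)
{
    assert(p->n_txq > 0 && p->n_txq <= MAX_TXQ);
    if (t->qid < 0) {
        int best = 0;

        for (int i = 1; i < p->n_txq; i++) {
            if (p->users[i] < p->users[best]) {
                best = i;
            }
        }
        p->users[best]++;
        t->qid = best;
    }
    t->last_used = now;
    return t->qid;
}

/* Periodic revalidation: frees the queue if it has been idle for at
 * least 'timeout_ms'. */
static void
thread_revalidate(struct thread_txq *t, struct port_txqs *p,
                  long long now, long long timeout_ms)
{
    if (t->qid >= 0 && now - t->last_used >= timeout_ms) {
        assert(p->users[t->qid] > 0);
        p->users[t->qid]--;
        t->qid = -1;
    }
}
```

With this scheme the four threads polling dpdk0 end up on four different
TX queues, instead of sharing two as in the static modulo mapping.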

Reported-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
8 years agoINSTALL.md: Update configure section for built-in intrinsics.
Bhanuprakash Bodireddy [Wed, 27 Jul 2016 18:31:11 +0000 (19:31 +0100)]
INSTALL.md: Update configure section for built-in intrinsics.

Built-in CRC32 intrinsics can be used for efficient hash computation on
processors with SSE4.2 support.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoINSTALL.RHEL: Update missing hyperlink for Fedora install guide.
Bhanuprakash Bodireddy [Wed, 27 Jul 2016 18:31:10 +0000 (19:31 +0100)]
INSTALL.RHEL: Update missing hyperlink for Fedora install guide.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoINSTALL.md: Update missing hyperlink for Windows install guide.
Bhanuprakash Bodireddy [Wed, 27 Jul 2016 18:31:09 +0000 (19:31 +0100)]
INSTALL.md: Update missing hyperlink for Windows install guide.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agorelease-process.md: Document OVS release process and propose a schedule.
Ben Pfaff [Fri, 22 Jul 2016 19:39:44 +0000 (12:39 -0700)]
release-process.md: Document OVS release process and propose a schedule.

This document has two different kinds of text:

   - The first sections of the document, "Release Strategy" and "Release
     Numbering", describe what we've already been doing for most of the
     history of Open vSwitch.  If there is anything surprising in them,
     then it's because our process has not been transparent enough, and not
     because we're making a change.

   - The final section of the document, "Release Scheduling", is a proposal
     for current and future releases.  We have not had a regular release
     schedule in the past, but it seems important to have one in the
     future, so this section requires review and feedback from everyone in
     the community.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
Acked-by: Ryan Moats <rmoats@us.ibm.com>
8 years agoovn-nbctl: Improve usage message.
Ben Pfaff [Wed, 27 Jul 2016 05:43:07 +0000 (22:43 -0700)]
ovn-nbctl: Improve usage message.

The most important change here is to delete misspelled "the".

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Numan Siddique <nusiddiq@redhat.com>
8 years agoovn: Make it possible for CMS to detect when the OVN system is up-to-date.
Ben Pfaff [Sun, 24 Jul 2016 20:14:59 +0000 (13:14 -0700)]
ovn: Make it possible for CMS to detect when the OVN system is up-to-date.

Until now, there has been no reliable way for the CMS (or ovn-nbctl, or
anything else) to detect when changes made to the northbound configuration
have been passed through to the southbound database or to the hypervisors.
This commit adds this feature to the system, by adding sequence numbers
to the northbound and southbound databases and adding code in ovn-nbctl,
ovn-northd, and ovn-controller to keep those sequence numbers up-to-date.

The biggest user-visible change from this commit is a new option
--wait to ovn-nbctl.  With --wait=sb, ovn-nbctl now waits for ovn-northd
to update the southbound database; with --wait=hv, it waits for the
changes to make their way to Open vSwitch on every hypervisor.
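
The waiting logic can be thought of as a comparison of sequence numbers.
The sketch below is only a model under stated assumptions: the names
(enum wait_mode, wait_satisfied) and the idea of one counter for the
southbound database plus one per hypervisor are illustrative, and the real
column names and update protocol are not given in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum wait_mode { WAIT_NONE, WAIT_SB, WAIT_HV };

/* Returns true once a northbound change stamped with sequence number
 * 'target' has propagated as far as 'mode' requires.  'sb_seq' is the
 * sequence number reached by the southbound database and 'hv_seqs' the
 * ones reported by each hypervisor. */
static bool
wait_satisfied(enum wait_mode mode, long long target,
               long long sb_seq, const long long *hv_seqs, size_t n_hv)
{
    assert(n_hv == 0 || hv_seqs != NULL);
    if (mode == WAIT_NONE) {
        return true;
    }
    if (sb_seq < target) {
        return false;            /* ovn-northd has not caught up yet. */
    }
    if (mode == WAIT_SB) {
        return true;
    }
    for (size_t i = 0; i < n_hv; i++) {
        if (hv_seqs[i] < target) {
            return false;        /* Some hypervisor is still behind. */
        }
    }
    return true;
}
```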

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
8 years agotravis: Update datapath target kernel list.
Pravin B Shelar [Mon, 18 Jul 2016 02:24:09 +0000 (19:24 -0700)]
travis: Update datapath target kernel list.

Update kernel list to latest stable release.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: Add support for kernel 4.6
Pravin B Shelar [Tue, 26 Jul 2016 01:40:05 +0000 (18:40 -0700)]
datapath: Add support for kernel 4.6

Most of the patch irons out the USE_UPSTREAM_TUNNEL case, where the
datapath directly uses the upstream tunneling modules.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
Acked-by: Amitabha Biswas <abiswas@us.ibm.com>
8 years agodatapath: compat: simplify ip_local_out().
Pravin B Shelar [Tue, 26 Jul 2016 20:37:46 +0000 (13:37 -0700)]
datapath: compat: simplify ip_local_out().

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: compat: unset skb encapsulation bit
Pravin B Shelar [Tue, 26 Jul 2016 00:49:54 +0000 (17:49 -0700)]
datapath: compat: unset skb encapsulation bit

The OVS compat layer can handle tunnel GSO packets, but it keeps
the skb encapsulation bit on for packets handled in GSO. This can
confuse some NIC drivers. I have seen this issue on Intel devices:

>>>  i40e 0000:42:00.0: TX driver issue detected, PF reset issued

The following patch resets this bit when the compat layer handles the packet.

VMware-BZ: 1698877
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agodatapath: compat: fix udp checksum calculation
Pravin B Shelar [Tue, 26 Jul 2016 00:49:53 +0000 (17:49 -0700)]
datapath: compat: fix udp checksum calculation

In the upstream Linux kernel networking stack, udp_set_csum() is called
with only the UDP header applied, but in the compat layer it can
be called with the IP header applied as well. This patch therefore takes
the offset into account.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
8 years agoovn-northd: Fix {}-enclosed constants for ND responder
Zong Kai LI [Tue, 26 Jul 2016 06:02:26 +0000 (14:02 +0800)]
ovn-northd: Fix {}-enclosed constants for ND responder

The match string for the ND responder was missing a comma as the
constant separator.

Signed-off-by: Zong Kai LI <zealokii@gmail.com>
Signed-off-by: Russell Bryant <russell@ovn.org>
8 years agoovn-northd: Add logical flows to support native DHCPv4
Numan Siddique [Tue, 26 Jul 2016 19:24:39 +0000 (00:54 +0530)]
ovn-northd: Add logical flows to support native DHCPv4

OVN implements a native DHCPv4 support which caters to the common
use case of providing an IP address to a booting instance by
providing stateless replies to DHCPv4 requests based on statically
configured address mappings. To do this it allows a short list of
DHCPv4 options to be configured and applied at each compute host
running ovn-controller.

A new table 'DHCP_Options' is added in OVN NB DB to store the DHCP
options. Logical ports refer to this table to configure the DHCPv4
options.

For each logical port configured with DHCPv4 options, the following flows
are added:
 - A logical flow which copies the DHCPv4 options to the DHCPv4
   request packets using the 'put_dhcp_opts' action and advances the
   packet to the next stage.

 - A logical flow which implements the DHCP responder by sending
   the DHCPv4 reply back to the inport once the 'put_dhcp_opts' action
   is applied.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Tested-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com>
Acked-by: Ramu Ramamurthy <ramu.ramamurthy@us.ibm.com>
8 years agorhel/openvswitch.spec: Add SELinux policy.
Joe Stringer [Mon, 25 Jul 2016 21:09:26 +0000 (14:09 -0700)]
rhel/openvswitch.spec: Add SELinux policy.

Commit 9b897c9125ef ("rhel: provide our own SELinux custom policy
package") added the SELinux policy to the fedora packaging as a
subpackage. This patch makes the corresponding change to
openvswitch.spec, so that users of that specfile can generate the
selinux policy package without having to build all of the fedora
packages.

VMware-BZ: #1692972
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
8 years agoselinux: Allow ovs-ctl force-reload-kmod.
Joe Stringer [Fri, 22 Jul 2016 21:10:51 +0000 (14:10 -0700)]
selinux: Allow ovs-ctl force-reload-kmod.

When invoking ovs-ctl force-reload-kmod via '/etc/init.d/openvswitch
force-reload-kmod', spurious errors related to 'hostname' and 'ip' would
be output, and the system's selinux audit log would complain about some
of the invocations such as those listed at the end of this commit message.

This patch loosens restrictions for openvswitch_t (used for ovs-ctl, as
well as all of the OVS daemons) to allow it to execute 'hostname' and
'ip' commands, and also to execute temporary files created as
openvswitch_tmp_t. This allows force-reload-kmod to run correctly.

Example audit logs:
type=AVC msg=audit(1468515192.912:16720): avc:  denied  { getattr } for
pid=11687 comm="ovs-ctl" path="/usr/bin/hostname" dev="dm-1"
ino=33557805 scontext=system_u:system_r:openvswitch_t:s0
tcontext=system_u:object_r:hostname_exec_t:s0 tclass=file

type=AVC msg=audit(1468519445.766:16829): avc:  denied  { getattr } for
pid=13920 comm="ovs-save" path="/usr/sbin/ip" dev="dm-1" ino=67572988
scontext=unconfined_u:system_r:openvswitch_t:s0
tcontext=system_u:object_r:ifconfig_exec_t:s0 tclass=file

type=AVC msg=audit(1468519445.890:16833): avc:  denied  { execute } for
pid=13849 comm="ovs-ctl" name="tmp.jdEGHntG3Z" dev="dm-1" ino=106876762
scontext=unconfined_u:system_r:openvswitch_t:s0
tcontext=unconfined_u:object_r:openvswitch_tmp_t:s0 tclass=file

VMware-BZ: #1692972
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Flavio Leitner <fbl@sysclose.org>
8 years agoMakefile.am: Add clang static analysis support
Bhanuprakash Bodireddy [Fri, 15 Jul 2016 18:43:24 +0000 (19:43 +0100)]
Makefile.am: Add clang static analysis support

Clang Static Analyzer is a source code analysis tool to find bugs. This
patch adds a make target to trigger static analysis using the commands
below.

./boot.sh
For Clang: ./configure CC=clang --with-dpdk
For GCC: ./configure CC=gcc --with-dpdk CFLAGS="-std=gnu99"
make clang-analyze

Run 'scan-view <results dir>' command to examine the bug report.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Acked-By: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
8 years agoovn-northd: Combine two NAT loops into one.
Guru Shetty [Wed, 13 Jul 2016 11:20:36 +0000 (04:20 -0700)]
ovn-northd: Combine two NAT loops into one.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>