Russell Bryant [Thu, 1 Oct 2015 18:26:26 +0000 (14:26 -0400)]
ovn: Add an ovs-sandbox based OVN tutorial.
While working on OVN and OVN integration, I've collected a set of
scripts for quickly setting up simple test environments using
ovs-sandbox with OVN enabled. It seemed like they could be useful to
others for learning about OVN or doing quick testing.
This patch introduces an ovs-sandbox based tutorial for exploring OVN
features in a simulated environment.
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Russell Bryant [Thu, 1 Oct 2015 18:26:25 +0000 (14:26 -0400)]
ovn: Add VLAN support for localnet ports.
This patch makes it possible to use a localnet port for connecting to a
specific VLAN on a locally accessible network. The only logical
modeling change is that it is now valid to set the "tag" field on
logical ports with a type of "localnet". Previously, the "tag" field
was only used for child ports.
We still use a single automatically created patch port between br-int
and the bridge configured to provide connectivity to a given network
(the ovn-controller bridge-mappings configuration). We use flows when
necessary to either match on VLAN ID or to add the VLAN ID before
sending the packet out.
Matching for a localnet port with a VLAN ID is done at priority 150 in
table 0, and is similar to how we match traffic from container child
ports. These cases are conceptually similar in that they're separate
logical ports on the same physical port.
Most of the code changes are due to a change in data structures. We
have to keep track of all of the localnet ports and then add flows for
them at the end. Previously this code tracked them as:
    hash of localnet bindings, hashed on network name
      localnet bindings:
        openflow port number
        list of port bindings
Now we have:
    hash of localnet bindings, hashed on network name
      localnet bindings:
        openflow port number
        hash of localnet vlans
          localnet vlans:
            VLAN ID (0 for untagged traffic)
            list of port bindings
A detailed example of using localnet ports with a VLAN ID is provided in
a later patch as a part of a larger OVN tutorial.
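As a rough illustration only (ovn-controller itself is written in C), the new nesting can be pictured with Python dictionaries. The class, function, and port names below are hypothetical and exist only for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LocalnetBinding:
    """Hypothetical stand-in for one entry, keyed by network name."""
    ofport: int  # OpenFlow port number of the patch port
    # VLAN ID (0 for untagged traffic) -> list of port bindings
    vlans: dict = field(default_factory=lambda: defaultdict(list))


def add_port_binding(bindings, network, ofport, tag, port_name):
    """Record a localnet port under its network name and VLAN ID."""
    binding = bindings.setdefault(network, LocalnetBinding(ofport))
    binding.vlans[tag or 0].append(port_name)
    return binding


bindings = {}
add_port_binding(bindings, "physnet1", 5, None, "ln-untagged")
add_port_binding(bindings, "physnet1", 5, 101, "ln-vlan101")
add_port_binding(bindings, "physnet1", 5, 101, "ln-vlan101-b")
```

Grouping by VLAN ID under each network makes it straightforward to emit one set of flows per (network, VLAN) pair at the end.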
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Flavio Leitner [Thu, 1 Oct 2015 22:31:09 +0000 (19:31 -0300)]
rhel: Add optional BuildRequires to libcap-ng
Commit e91b927d8 (lib/daemon: support --user option for all OVS daemon)
added optional usage of the libcap-ng library. It's packaged in Fedora,
so go ahead and add it by default to the Fedora spec file.
Our default systemd unit files don't make use of the --user option that
requires this library, but conceivably someone may want to customize
them and use this option.
For those that don't want to use the --user option, the Fedora package
offers an option (--without libcapng) to build the RPMs without it.
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
datapath-windows: Compute checksums for VXLAN inner packets.
Windows does not support VXLAN hardware offloading.
Currently we do not compute IP/TCP/UDP checksums for the inner packet. This
patch computes those checksums according to the enabled settings,
i.e. if IP checksum offloading is enabled for the inner packet, we compute it.
The same applies to TCP and UDP packets.
This patch also revises the computation of the ones' complement over different
memory blocks in the case where the lengths are odd.
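The odd-length case can be sketched as follows. This is a minimal Python illustration of summing a ones' complement checksum across several memory blocks, not the datapath-windows code; the function names are hypothetical. The key point is that an odd-length block leaves one byte that must be paired with the first byte of the next block.

```python
def fold(total):
    """Fold carries back into the low 16 bits (ones' complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum_blocks(blocks):
    """Internet checksum over a sequence of byte blocks of any length."""
    total = 0
    pending = None  # high byte left over from an odd-length block
    for block in blocks:
        data = bytes(block)
        if pending is not None and data:
            # Pair the leftover byte with the first byte of this block.
            total += (pending << 8) | data[0]
            data = data[1:]
            pending = None
        if len(data) % 2:
            pending = data[-1]
            data = data[:-1]
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]
    if pending is not None:
        total += pending << 8  # pad the final odd byte with zero
    return (~fold(total)) & 0xFFFF
```

Splitting the same bytes at an odd boundary must give the same result as summing them in one piece, which is the property the sketch preserves.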
Also per documentation:
https://msdn.microsoft.com/en-us/library/windows/hardware/ff568840%28v=vs.85%29.aspx
set the TCP flags FIN and PSH only for the last segment in the case LSO is
enabled.
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Currently, in the case of IP fragmentation, we report to userspace that
the flag for the last fragment is 3, when it should actually be a value
in the range 0..2.
This patch fixes the problem and also uses the values used in the common
header of the datapath.
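A minimal sketch of how the three values can be derived from the IPv4 fragment field, assuming the 0..2 range corresponds to the OVS_FRAG_TYPE_NONE, OVS_FRAG_TYPE_FIRST and OVS_FRAG_TYPE_LATER values of the common datapath header. The helper name is hypothetical and this is not the Windows datapath code.

```python
OVS_FRAG_TYPE_NONE = 0   # packet is not a fragment
OVS_FRAG_TYPE_FIRST = 1  # first fragment: offset 0 with "more fragments" set
OVS_FRAG_TYPE_LATER = 2  # any later fragment, including the last one

IP_MF = 0x2000       # "more fragments" flag in the IPv4 frag_off field
IP_OFFSET = 0x1FFF   # fragment offset mask


def frag_type(frag_off):
    """Classify an IPv4 frag_off value (host byte order)."""
    if frag_off & IP_OFFSET:
        return OVS_FRAG_TYPE_LATER
    if frag_off & IP_MF:
        return OVS_FRAG_TYPE_FIRST
    return OVS_FRAG_TYPE_NONE
```

Under this scheme the last fragment (non-zero offset, MF clear) is reported as 2, never 3.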
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Nithin Raju <nithin@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
util: Fix definition of LIST_FOR_EACH_CONTINUE macro.
The INIT_CONTAINER macro initializes ITER to NULL, which causes a
segmentation fault when it is dereferenced in (ITER)->MEMBER.next.
I changed it to the ASSIGN_CONTAINER macro, which does not initialize
ITER.
This does not fix any observable bug because LIST_FOR_EACH_CONTINUE is not
used anywhere.
Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Bruce Davie [Thu, 1 Oct 2015 19:07:45 +0000 (12:07 -0700)]
vtep: add ACLs to VTEP schema
Two new tables are added to the VTEP schema, for ACL entries and
ACLs (which are groups of entries). The physical port table is modified
to allow ACLs to be associated with ports, and the logical router table
is modified to allow ACLs to be attached to logical router ports.
Signed-off-by: Bruce Davie <bdavie@vmware.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Bruce Davie [Thu, 1 Oct 2015 19:07:44 +0000 (12:07 -0700)]
vtep: Document the meaning of VLAN zero for vlan_bindings
The meaning of a value of zero in the VLAN field when mapping <VLAN, port>
pairs to logical switches had not previously been specified in the VTEP
schema. It is now clarified that a value of zero refers to untagged
traffic.
Signed-off-by: Bruce Davie <bdavie@vmware.com>
Acked-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Russell Bryant [Thu, 1 Oct 2015 15:29:16 +0000 (11:29 -0400)]
Fix build when HAVE_LIBCAPNG is not defined.
The function daemon_become_new_user_linux was conditionally defined but
then used in code unconditionally. If HAVE_LIBCAPNG is not defined, the
function would never be called, but it still must exist.
Adjust the #if guard around the function to be around the body of the
function instead of outside of its definition to ensure the function is
always defined, even if empty.
Andy Zhou [Sat, 12 Sep 2015 02:10:19 +0000 (19:10 -0700)]
ovs-dev.py: add --user option
ovs-dev.py "run" command now accepts the "--user" option for running
all ovs daemons as "user". The argument can be specified in
"user[:group]" format.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
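For illustration, a "user[:group]" argument of the kind described in the commit above could be split as follows. This is a sketch only; the function name and the choice to reject an empty user name are assumptions, not taken from ovs-dev.py.

```python
def parse_user_spec(spec):
    """Split "user[:group]" into (user, group); group is None if omitted."""
    user, sep, group = spec.partition(":")
    if not user:
        # Assumption for this sketch: a user name is always required.
        raise ValueError("user name must not be empty: %r" % spec)
    return user, (group if sep and group else None)
```

For example, parse_user_spec("ovs:hugetlbfs") yields the pair ("ovs", "hugetlbfs"), while parse_user_spec("ovs") leaves the group unset.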
Andy Zhou [Fri, 11 Sep 2015 23:06:50 +0000 (16:06 -0700)]
ovs-dev.py: run operational commands as root
The switch operational commands (run, kill, reset and modinst) directly
or indirectly read and write files within the RUNDIR. Currently
these commands run in the current user context, with some "sudo"
commands thrown in to ensure daemons such as ovs-vswitchd will be
launched as root.
This approach works fine as long as ovs-dev.py is always
run as root (but then the added 'sudo' commands are redundant).
When invoking ovs-dev.py as non-root, RUNDIR ends up with a mix of
root-created and non-root-created files, making it confusing
to decide whether to run ovs-appctl as root or not. Multiple
invocations of ovs-dev.py as root and non-root cause permission issues,
since a file created by one user may no longer be
accessible when the user changes.
This patch improves the situation by always running those four operational
commands as root. When they are invoked as non-root, the command is
automatically re-run with "sudo". VARDIR will now
always be accessed as root. The next patch will add the --user and -u options
to allow for downgrading to running all daemons as non-root.
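The re-run behaviour can be sketched roughly as below. This is not the ovs-dev.py implementation; both function names are hypothetical, and the sketch assumes that simply prefixing the original argument vector with "sudo" is sufficient.

```python
import os
import sys


def sudo_command(argv, euid):
    """Return the argv to execute: unchanged for root, sudo-prefixed otherwise."""
    if euid == 0:
        return list(argv)
    return ["sudo"] + list(argv)


def rerun_as_root_if_needed():
    """Replace the current process with a sudo invocation when not root."""
    if os.geteuid() != 0:
        cmd = sudo_command(sys.argv, os.geteuid())
        os.execvp(cmd[0], cmd)  # does not return on success
```

Keeping the decision in a pure function makes it easy to check without actually invoking sudo.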
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
Andy Zhou [Fri, 11 Sep 2015 20:34:24 +0000 (13:34 -0700)]
ovs-dev.py: allow current directory to be used as the working directory
Rather than forcing a single ovs source tree under ~/ovs, this
change supports invoking the script from the root of any
ovs source tree as the working source tree. If the script is invoked
from a directory not recognized as an OVS source tree, ~/ovs will
then be used.
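A sketch of the selection logic, under stated assumptions: the commit does not say how an OVS source tree is recognized, so the heuristic below (a configure.ac file plus a vswitchd directory) is purely hypothetical, as are the function names.

```python
import os


def looks_like_ovs_tree(path):
    """Hypothetical heuristic; the real script may use a different check."""
    return (os.path.isfile(os.path.join(path, "configure.ac"))
            and os.path.isdir(os.path.join(path, "vswitchd")))


def pick_ovs_root(cwd, home, is_tree=looks_like_ovs_tree):
    """Use the current directory if it is an OVS tree, else fall back to ~/ovs."""
    if is_tree(cwd):
        return cwd
    return os.path.join(home, "ovs")
```

A caller would typically pass os.getcwd() and os.path.expanduser("~") for the two path arguments.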
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
Andy Zhou [Mon, 21 Sep 2015 22:06:00 +0000 (15:06 -0700)]
dpdk: reject --user option
dpdk datapath needs to run as root. Block the --user
option for now. It is likely we will revisit this issue for possibly
supporting --user option for dpdk datapath process as well.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Fri, 11 Sep 2015 01:44:27 +0000 (18:44 -0700)]
configure: add configuration options for libcap-ng
Add configuration option for enabling or disabling linking with
libcap-ng. Since capabilities are a security feature, the libcapng
option is handled as follows:
  - no option: use libcapng if it's present
  - --disable-libcapng: do not use libcapng
  - --enable-libcapng: do use libcapng and fail configuration if
    it's missing
On Linux, not linking with libcapng makes all OVS daemons fail when
--user option is specified.
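The three cases can be summarized as a small decision function. This is only a Python model of the behaviour described above, not the autoconf macro; the function name and the string encoding of the option are assumptions.

```python
def use_libcapng(option, library_present):
    """Model the configure decision.

    option is None (no flag given), "enable" or "disable".
    Returns True if OVS should link with libcap-ng.
    """
    if option == "disable":
        return False
    if option == "enable":
        if not library_present:
            raise RuntimeError("libcap-ng was requested but is missing")
        return True
    if option is None:
        return library_present
    raise ValueError("unknown option: %r" % (option,))
```

Only the explicit "enable" case turns a missing library into a configuration failure; the default silently adapts to what is installed.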
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
These functions could only work with 32-bit integers because of their
special cases for an argument of value 0. However, none of the existing
users depended on this special case, and some of the users did try to use
these functions with 64-bit integer arguments. Thus, this commit changes
them to support 64-bit integer arguments and drops the special cases for
zero.
This fixes a latent bug that applied rightmost_1bit_idx() to an ofpact
bitmap, which only becomes visible when an OFPACT_* with value greater than
32 is included in the bitmap.
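To see why the argument width matters, here is a small Python model of a 64-bit rightmost-1-bit index. It is not the OVS implementation; in particular, raising an error for zero is a choice made for this sketch, since the commit only says the special cases for zero were dropped.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def rightmost_1bit_idx(x):
    """Index of the least significant 1-bit of a nonzero 64-bit value."""
    x &= MASK64
    if x == 0:
        raise ValueError("result is undefined for zero")
    # x & -x isolates the lowest set bit; bit_length gives its position + 1.
    return (x & -x).bit_length() - 1


bitmap = 1 << 40                  # a bit above position 31 in a 64-bit bitmap
truncated = bitmap & MASK32       # what a 32-bit-only function would see
```

With a 32-bit argument the high bit is lost entirely (truncated is 0), which is how a bitmap entry above bit 31 can go unnoticed.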
Reported-by: Kyle Upton <kupton@baymicrosystems.com>
Reported-at: http://openvswitch.org/pipermail/dev/2015-September/060128.html
Signed-off-by: Ben Pfaff <blp@nicira.com>
datapath-windows: return netlink error for read operation
The kernel datapath returns a NL error message upon any errors
during read operations, and returns STATUS_SUCCESS as the return
code. We rely on the input NL request to get the family ID and the
PID. However, when the request is of type OVS_CTRL_CMD_EVENT_NOTIFY
or OVS_CTRL_CMD_READ_NOTIFY, there's no input buffer associated
with the request. So, we use a temporary input buffer to be able to
call the Netlink APIs for constructing the output NL error message.
Signed-off-by: Nithin Raju <nithin@vmware.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Commit fe089c0d1e18 ("vlog: abstract out interface to syslog daemon")
introduced --syslog-method flag that supersedes --syslog-target flag by:
  1. making the logging format configurable
  2. letting the daemon also talk over a UNIX domain socket (this is handy
     when the local rsyslog daemon is running in a different network
     namespace on the same host)
Signed-off-by: Ansis Atteka <aatteka@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 29 Sep 2015 22:40:22 +0000 (15:40 -0700)]
ovn: Implement basic end-to-end full mesh test.
This is a really basic test of the OVN features. It verifies that basic
L2 connectivity works as expected over a 3-hypervisor setup with 3 VMs
per hypervisor and all 9 VMs on a single logical switch, with a few ACLs.
The infrastructure added by this patch, which is based on similar code
from ovs-sim, should be useful as a basis for later and more advanced
OVN end-to-end tests.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
Ben Pfaff [Tue, 29 Sep 2015 17:19:37 +0000 (10:19 -0700)]
tests: Ignore more error messages for hidden rules test.
This test intentionally configures an unreachable controller. It ignored
some error messages in the log, specifically
br0: cannot find route for controller (240.0.0.1): ...
but a bug report says that other forms of messages can also appear, e.g.
br0<->tcp:240.0.0.1:6653: connection dropped (No route to host)
This commit therefore expands the logged error messages that will be
ignored to any message that includes the IP address 240.0.0.1.
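The broadened filter amounts to something like the following sketch. The test suite itself is written in autotest/shell rather than Python, so this only models the idea; the function name and the third log line are made up for illustration.

```python
IGNORED_ADDRESS = "240.0.0.1"


def unexpected_errors(log_lines):
    """Drop every message that mentions the intentionally unreachable controller."""
    return [line for line in log_lines if IGNORED_ADDRESS not in line]


log = [
    "br0: cannot find route for controller (240.0.0.1): ...",
    "br0<->tcp:240.0.0.1:6653: connection dropped (No route to host)",
    "br0: some unrelated error",  # hypothetical message that should still be reported
]
```

Matching on the address alone covers both message forms quoted above without listing each one.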
ofproto-dpif-upcall: Use flow_wildcards_has_extra().
Update the comment in ukey_revalidate() to reflect the fact that the
mask in ukey is not the datapath mask, but the originally translated
flow wildcards.
Use flow_wildcards_has_extra() instead of open coding equivalent (but
different) functionality. The old form and the code in
flow_wildcards_has_extra() ((dp | wc != dp) and (dp & wc != wc),
respectively) give the same result:
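The equivalence can also be checked exhaustively for small bit widths. The sketch below reads the two expressions as (dp | wc) != dp and (dp & wc) != wc; the helper names are hypothetical and this is not the OVS code.

```python
def has_extra_or_form(dp, wc):
    """Old open-coded form: wc has a bit set that dp lacks."""
    return (dp | wc) != dp


def has_extra_and_form(dp, wc):
    """Form attributed to flow_wildcards_has_extra() in the text above."""
    return (dp & wc) != wc


BITS = 4
mismatches = [
    (dp, wc)
    for dp in range(1 << BITS)
    for wc in range(1 << BITS)
    if has_extra_or_form(dp, wc) != has_extra_and_form(dp, wc)
]
```

Both forms are true exactly when wc contains a bit that dp does not, so the list of mismatches stays empty.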
The name 'lport_to_ofport' gives the impression that the
simap contains all the logical port to ofport mapping. In
reality, it only contains a local vif to ofport mapping.
The name 'localvif_to_ofport' feels like a better fit.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Russell Bryant <rbryant@redhat.com>
This patch adds the modifications needed to compile for x64 under
Windows:
  - created a new macro for testing whether we are compiling for x64;
    this defines the linker flag "/MACHINE:X64" as per documentation
    (https://msdn.microsoft.com/en-us/library/9yb4317s.aspx).
  - added x64 pthread libraries under the pthread defines
  - added documentation on how to build for x64
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
datapath: Backport "skbuff: Fix skb checksum flag on skb pull"
Upstream commit:
VXLAN device can receive skb with checksum partial. But the checksum
offset could be in outer header which is pulled on receive. This results
in negative checksum offset for the skb. Such skb can cause the assert
failure in skb_checksum_help(). Following patch fixes the bug by setting
checksum-none while pulling outer header.
Following is the kernel panic msg from old kernel hitting the bug.
Reported-by: Anupam Chanda <achanda@vmware.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 6ae459bdaae ("skbuff: Fix skb checksum flag on skb pull")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Zoltan Kiss [Fri, 25 Sep 2015 18:42:40 +0000 (11:42 -0700)]
ofproto-dpif: Do not block on uninitialized pause barriers.
e4e74c3a "dpif-netdev: Purge all ukeys when reconfigure pmd." introduced a new
dp_purge_cb function, which calls udpif_pause_revalidators() and that tries to
block on pause_barrier.
But if OVS was started with flow-restore-wait="true" (e.g. through ovs-ctl),
type_run() will have backer->recv_set_enable == false, and udpif_set_threads
won't initialize the barrier, which leads to a segfault like this:
This patch introduces ofproto_dpif_backer_enabled(), which checks
recv_set_enable before touching the latch and blocking on pause_barrier.
Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
Acked-by: Joe Stringer <joestringer@nicira.com>
datapath: Backport "openvswitch: Zero flows on allocation."
Upstream commit:
openvswitch: Zero flows on allocation.
When support for megaflows was introduced, OVS needed to start
installing flows with a mask applied to them. Since masking is an
expensive operation, OVS also had an optimization that would only
take the parts of the flow keys that were covered by a non-zero
mask. The values stored in the remaining pieces should not matter
because they are masked out.
While this works fine for the purposes of matching (which must always
look at the mask), serialization to netlink can be problematic. Since
the flow and the mask are serialized separately, the uninitialized
portions of the flow can be encoded with whatever values happen to be
present.
In terms of functionality, this has little effect since these fields
will be masked out by definition. However, it leaks kernel memory to
userspace, which is a potential security vulnerability. It is also
possible that other code paths could look at the masked key and get
uninitialized data, although this does not currently appear to be an
issue in practice.
This removes the mask optimization for flows that are being installed.
This was always intended to be the case as the mask optimizations were
really targeting per-packet flow operations.
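The leak can be pictured with a simplified Python model. The kernel works on ranges of the flow key rather than byte by byte, and both function names here are hypothetical; the sketch only shows why stale buffer contents survive when masked-out parts are skipped.

```python
def install_key_optimized(buf, key, mask):
    """Copy only bytes covered by a non-zero mask; other bytes keep old contents."""
    for i, (k, m) in enumerate(zip(key, mask)):
        if m:
            buf[i] = k & m
    return bytes(buf)


def install_key_zeroed(buf, key, mask):
    """Zero the whole buffer first, then apply the masked key."""
    buf[:] = bytes(len(buf))
    return install_key_optimized(buf, key, mask)


key = bytes.fromhex("0a0b0c0d")
mask = bytes.fromhex("ff0000ff")
leaky = install_key_optimized(bytearray.fromhex("deadbeef"), key, mask)
clean = install_key_zeroed(bytearray.fromhex("deadbeef"), key, mask)
```

Matching behaves the same in both cases because the mask hides the middle bytes, but serializing the first buffer would expose the stale values.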
Fixes: 03f0d916 ("openvswitch: Mega flow implementation")
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: ae5f2fb1 ("openvswitch: Zero flows on allocation.")
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
This patch adds an additional include file while compiling under MSVC.
Found by compiling under MSVC x64 and hitting the following problem:
http://stackoverflow.com/questions/23144151/64-bit-function-returns-32-bit-pointer
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Andy Zhou [Tue, 15 Sep 2015 20:51:17 +0000 (13:51 -0700)]
ofproto/bond: simplify rebalancing logic
The current bond rebalancing logic is more complicated than necessary.
When considering a bucket for rebalancing, we just need to make sure
post rebalancing traffic will be closer to the ideal traffic split
than before. This patch implements the simplification.
There is a bug in the current algorithm that causes a heavily loaded bucket
to ping-pong at each rebalancing interval. The simplified logic also fixes
this bug.
Though not the main motivation for the change, computations are now
done with integer math rather than floating-point math.
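The criterion can be sketched in a few lines. This is a simplified model rather than the ofproto/bond code: it assumes two slaves whose ideal split is an equal share, and the function name is hypothetical.

```python
def move_improves_balance(from_load, to_load, bucket_load):
    """True if moving the bucket brings the two loads closer together.

    All values are integers (for example, bytes per interval), so no
    floating-point arithmetic is needed.
    """
    before = abs(from_load - to_load)
    after = abs((from_load - bucket_load) - (to_load + bucket_load))
    return after < before
```

A bucket carrying 90 of 100 units is left alone when the other slave carries 20, since moving it would only swap which side is overloaded; that is the ping-pong case the simplified logic avoids.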
Reported-by: Gregory Smith <gasmith@nutanix.com>
Tested-by: Gregory Smith <gasmith@nutanix.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Fixes following compilation error:
In file included from ovs/datapath/linux/actions.c:30:
ovs/datapath/linux/compat/include/linux/if_vlan.h:65: error: redefinition of ‘__vlan_hwaccel_push_inside’
include/linux/if_vlan.h:353: note: previous definition of ‘__vlan_hwaccel_push_inside’ was here
ovs/datapath/linux/compat/include/linux/if_vlan.h:83:
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Enable support for Checksum offloads in STT if it's enabled in the Windows
VM. Set the Checksum Partial and Checksum Verified flags as mentioned in
the STT Draft - https://tools.ietf.org/html/draft-davie-stt-06
datapath-windows: Removed hardcoded names for internal/external vports
The internal/external vports will have the actual OS-based names, which
represent the NIC interface alias that is displayed by running
'Get-NetAdapter' Hyper-V PS command.
Ben Pfaff [Sat, 19 Sep 2015 16:48:26 +0000 (09:48 -0700)]
tests: Shorten line in table-features test.
By inserting "dnl" in a few places in this 1000+ character line, we bring
the physical line length down (making "git format-patch" willing to put
it into a patch) but m4 will still paste it together into a single line.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
The Netlink encoding of datapath flow keys cannot express wildcarding
the presence of a VLAN tag. Instead, a missing VLAN tag is interpreted
as exact match on the fact that there is no VLAN. This makes reading
datapath flow dumps confusing, since for everything else, a missing
key value means that the corresponding key was wildcarded.
Unless we refactor a lot of code that translates between Netlink and
struct flow representations, we have to do the same in the userspace
datapath. This makes at least the flow install logs show that the
vlan_tci field is matched to zero. However, the datapath flow dumps
remain as they were before, as they are performed using the netlink
format.
Add a test to verify that a packet with a VLAN will not match a rule
that may seem to wildcard the presence of the VLAN tag. Applying this
test without the userspace datapath modification showed that the
userspace datapath failed to create a new datapath flow for the VLAN
packet before this patch.
Reported-by: Tony van der Peet <tony.vanderpeet@gmail.com>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
datapath: Backport "openvswitch: allocate nr_node_ids flow_stats instead of num_possible_nodes"
Upstream commit:
openvswitch: allocate nr_node_ids flow_stats instead of num_possible_nodes
Some architectures like POWER can have a NUMA node_possible_map that
contains sparse entries. This causes memory corruption with openvswitch
since it allocates flow_cache with a multiple of num_possible_nodes() and
assumes the node variable returned by for_each_node will index into
flow->stats[node].
Use nr_node_ids to allocate a maximal sparse array instead of
num_possible_nodes().
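The sizing problem is easy to reproduce in miniature. In this Python model the node IDs are hypothetical and the helper names are invented; it only shows that an array sized by the count of possible nodes is too small when the IDs are sparse. In C the out-of-range write corrupts memory, whereas Python raises IndexError.

```python
possible_nodes = [0, 16]  # hypothetical sparse node_possible_map

num_possible_nodes = len(possible_nodes)   # counts the nodes
nr_node_ids = max(possible_nodes) + 1      # highest node ID plus one


def alloc_stats(n):
    """Allocate one stats slot per index 0..n-1."""
    return [None] * n


def touch_all(stats):
    """Index the array by node ID, as the flow stats code does."""
    for node in possible_nodes:
        stats[node] = 0
    return stats
```

Allocating nr_node_ids slots wastes the unused entries in between but guarantees that every node ID is a valid index.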
The crash was noticed after 3af229f2 was applied as it changed the
node_possible_map to match node_online_map on boot.
Fixes: 3af229f2071f5b5cb31664be6109561fbe19c861
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: bac541e4631 ("openvswitch: allocate nr_node_ids flow_stats
instead of num_possible_nodes")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
stream-ssl: Get peer-ca-cert functionality to work.
When --certificate option is provided, we currently use
SSL_CTX_use_certificate_chain_file() function to add
that certificate. If our single certificate file had multiple
certificates (as a chain), all of them would get added and sent
to the remote peer. But once you call
SSL_CTX_use_certificate_chain_file(), any future calls to
SSL_CTX_add_extra_chain_cert() (called when --peer-ca-cert option
is used) had no effect.
Since our man pages and INSTALL.SSL.md say that --certificate
is used to specify one certificate and additional certificates
are sent via --peer-ca-cert, this commit changes
SSL_CTX_use_certificate_chain_file() use to
SSL_CTX_use_certificate_file(). With this, additional certificates
can now be added via --peer-ca-cert option.
The test case added with this commit would fail without the
above changes.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
The test claimed to test peer-ca-cert functionality. But the
certificate provided via --peer-ca-cert was not actually sent
to the peer for bootstrapping. The bootstrapping was successful
because cert provided via --certificate was self-signed. Since the test
was not really testing the --peer-ca-cert functionality, change
the name of the test. We do not have any tests for bootstrapping,
so this test is still useful.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Currently when running the vswitch daemon we get a lot of messages of the
form:
2015-09-10T23:04:21Z|07255|dpif(revalidator11)|WARN|system@ovs-system: failed
to flow_del (Invalid argument).
After sending a delete flow command, userspace expects to receive the flow
key of the deleted flow.
Currently we only send back the statistics. This patch appends the flow key
attribute to the response buffer for the flow commands new, modify and
delete.
This patch also responds to the userspace with ENOENT in the case the flow
was not modified, deleted, created or retrieved.
Also incorporate some refactors.
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Sorin Vinturis <svinturis@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
If we have a flow rule of the following form:
actions=strip_vlan,set_tunnel:0x3e9,15,16,17 (where ports 15, 16 and 17 are
VXLAN OF ports with different tunnelling information)
then, in the current implementation, a packet that hits that flow is
sent out only once, with the first port's tunnelling information.
This patch saves the initial packet source port for further use of the
currently implemented pipeline and ignores the latter if it
is the last tunnelling port.
Ben Pfaff [Thu, 10 Sep 2015 17:00:41 +0000 (10:00 -0700)]
ofp-util: Fix struct ofputil_requestforward union membership.
'bands' should be paired with 'meter_mod' because 'bands' may hold the
storage for the meter's bands. ('bands' has nothing to do with
'group_mod'.)
Reported-by: niti Rohilla <niti1489@gmail.com>
Reported-at: http://openvswitch.org/pipermail/dev/2015-September/059847.html
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
datapath: Backport "openvswitch: Fix mask generation for nested attributes."
Upstream commit:
openvswitch: Fix mask generation for nested attributes.
Masks were added to OVS flows in a way that was backwards compatible
with userspace programs that did not generate masks. As a result, it is
possible that we may receive flows that do not have a mask and we need
to synthesize one.
Generating a mask requires iterating over attributes and descending into
nested attributes. For each level we need to know the size to generate the
correct mask. We do this with a linked table of attribute types.
Although the logic to handle these nested attributes was there in concept,
there are a number of bugs in practice. Examples include incomplete links
between tables, variable length attributes being treated as nested and
missing sanity checks.
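The idea of walking linked attribute tables can be sketched as follows. The attribute names, lengths and table layout are hypothetical and chosen only for illustration; the kernel code operates on Netlink attributes, not Python dictionaries.

```python
# Hypothetical tables: attribute type -> (payload length, nested table or None).
TUNNEL_ATTRS = {
    "id": (8, None),
    "ipv4_src": (4, None),
    "ipv4_dst": (4, None),
}
KEY_ATTRS = {
    "in_port": (4, None),
    "ethertype": (2, None),
    "tunnel": (None, TUNNEL_ATTRS),  # nested: sizes come from the linked table
}


def synthesize_mask(attrs, table):
    """Build an exact-match (all-ones) mask with the same shape as attrs."""
    mask = {}
    for attr_type, value in attrs.items():
        if attr_type not in table:
            raise ValueError("unknown attribute: %r" % attr_type)  # sanity check
        length, nested = table[attr_type]
        if nested is not None:
            mask[attr_type] = synthesize_mask(value, nested)
        else:
            if len(value) != length:
                raise ValueError("bad length for %r" % attr_type)  # sanity check
            mask[attr_type] = b"\xff" * length
    return mask
```

Each nesting level needs its own table so that the correct size is known for every attribute, which is where incomplete links between tables cause trouble.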
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: 982b5270 ("openvswitch: Fix mask generation for nested attributes.")
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
tunneling: Track recursion levels across ARP generation.
If a packet is output to a tunnel port when userspace tunneling is
enabled, it will cause an ARP packet to be generated if the destination
is unknown. This ARP packet is injected into the physical bridge as
a new packet, where it is flooded.
If there is a loop (such as if the tunnel destination is the same bridge),
the result will be infinite recursion. Even though we currently track
recursion limits, they are not effective here since each ARP packet is
considered to be a new translation. This changes the behavior so that
each ARP flow translation is initialized with the recursion counter of
the previous flow. Note that the problem only applies to ARP - data
packets in a loop will hit an existing recursion counter in the datapath.
An additional side effect of this change is that ARP packets are no
longer unconditionally flooded in the new bridge. They will now follow any
flow rules in the new bridge that might apply to them, the same as with
the kernel datapath.
Reported-by: David Evans <davidjoshuaevans@gmail.com>
Tested-by: David Evans <davidjoshuaevans@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Russell Bryant [Thu, 17 Sep 2015 18:27:07 +0000 (14:27 -0400)]
ovn: Update TODO with some notes on security.
The impact of the compromise of a chassis running ovn-controller came
up in a discussion with the developers of a system that could
potentially use OVN. Capture some notes on this issue as a todo item.
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
configure: Fix DPDK linking when using a relative path
When linking with DPDK, if a relative path is used with the
'--with-dpdk' flag, then OVS will always be compiled with vHost Cuse
support, even if it is not enabled in the DPDK build.
This patch fixes this problem, and enables the correct version of
vHost regardless of whether a relative or absolute path is used.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Sat, 12 Sep 2015 03:14:59 +0000 (20:14 -0700)]
ovn-nbctl: Enable database commands using db-ctl-base infrastructure.
This makes ovn-nbctl into a pretty slavish imitation of ovn-sbctl, using
almost the same code. It has two immediate benefits. First, multiple
commands can now be chained together into a single ovn-nbctl invocation.
Second, the database commands such as "create", "set", and so on allow
queries and modifications that don't have specific commands already.
In the following commit, this allows testing the OVN ACL feature.
Alex Wang [Thu, 6 Aug 2015 22:40:57 +0000 (15:40 -0700)]
ovn-controller-vtep: Extend vtep module to install Ucast_Macs_Remote.
This commit extends the vtep module to support creating the
'Ucast_Macs_Remote' table entries in the vtep database for
MAC addresses on the ovn logical ports.
Signed-off-by: Alex Wang <ee07b291@gmail.com>
Acked-by: Russell Bryant <rbryant@redhat.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
Alex Wang [Sat, 4 Jul 2015 06:13:24 +0000 (23:13 -0700)]
ovn-controller-vtep: Add vtep module.
This commit adds the vtep module to ovn-controller-vtep. The
module will scan through the Port_Binding table in OVN-SB database,
and update the vtep logical switches' tunnel keys.
Signed-off-by: Alex Wang <ee07b291@gmail.com>
Acked-by: Russell Bryant <rbryant@redhat.com>
Acked-by: Justin Pettit <jpettit@nicira.com>
datapath: Use netlink ipv4 API to handle the ipv4 addr attributes.
upstream: ("netlink: implement nla_put_in_addr and nla_put_in6_addr")
upstream: ("netlink: implement nla_get_in_addr and nla_get_in6_addr")
IP addresses are often stored in netlink attributes. Add generic functions
to do that.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Ben Pfaff [Sat, 12 Sep 2015 03:09:21 +0000 (20:09 -0700)]
ovn-nbctl: Give handler functions more specific names.
I find that it's nice to give functions for commands names specific to the
utility, even though they're static, because occasionally it makes it
easier to find them using "tags", "grep", etc.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <ee07b291@gmail.com>
tunnel: Validate IP header for userspace tunneling.
Currently, when doing userspace tunneling we don't perform much in
the way of integrity checks on the incoming IP header. The case of
tunneling is different from the usual case of switching since we are
acting as the endpoint here and should not allow invalid packets to
pass.
This adds checks for IP checksum, version, total length, and options and
drops packets that don't pass.
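The four checks can be sketched in Python as below. This is not the OVS tunnel code: the function name is hypothetical, and treating any header with options (IHL other than 5) as invalid is an assumption made for this sketch.

```python
import struct


def ipv4_header_ok(packet):
    """Return True if the IPv4 header at the start of packet passes basic checks."""
    if len(packet) < 20:
        return False
    version_ihl = packet[0]
    if version_ihl >> 4 != 4:            # version
        return False
    if version_ihl & 0x0F != 5:          # assumption: reject headers with options
        return False
    total_length = struct.unpack_from("!H", packet, 2)[0]
    if total_length < 20 or total_length > len(packet):   # total length
        return False
    # A valid header, checksum field included, sums to 0xFFFF in ones' complement.
    total = sum(struct.unpack_from("!10H", packet, 0))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF
```

Acting as the tunnel endpoint means a packet failing any of these checks is dropped rather than forwarded.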
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>