Thomas Graf [Tue, 28 Oct 2014 10:19:52 +0000 (11:19 +0100)]
doc: Convert docs to Markdown language
Converts the majority of docs over to use the Markdown language for
pretty printing on GitHub. It's a rough first convertion without
exploiting the full potential of Markdown at this point. Section
titles and indentation are fixed as needed. Minimal docs interlinking
is added.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Fri, 24 Oct 2014 00:33:14 +0000 (17:33 -0700)]
datapath-windows: OvsFindVportByPortIdAndNicIndex() and external port.
We use OvsFindVportByPortIdAndNicIndex() to lookup the vport for a
packte received from the Hyper-V switch. If a packet was indeed received
from the virtual external NIC, we should flag it.
Validation:
1. Install and Uninstall the OVS EXT Driver (without enabling the OVS
extension on the Hyper-V switch).
2. Install and Uninstall the OVS EXT Driver (with enabling the OVS
extension on the Hyper-V switch). Hyper-V switch had a few ports.
3. Install and Uninstall the OVS EXT Driver (with enabling the OVS
extension on the Hyper-V switch). Added a few ports before
uninstalling.
4. Install the OVS EXT driver, and test the following functionality:
a) ping between 2 VMs on the same host
b) ping between 2 VMs on 2 Hyper-Vs - one physical and another
virtual backed by VLAN (patch port between br-pif and br-int).
c) ping between 2 VMs on 2 Hyper-Vs - one physical and another
virtual backed by VXLAN.
d) Successful uninstallation after these tests.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Fri, 24 Oct 2014 00:33:13 +0000 (17:33 -0700)]
datapath-windows: Update vport add code.
In this patch, we make the following updates to the vport add code:
1. Clarify the roles of the different hash tables, so it is easier to
read/write code in the future.
2. Update OvsNewVportCmdHandler() to support adding bridge-internal
ports.
3. Fixes in OvsNewVportCmdHandler() to support adding external port.
Earlier, we'd hit ASSERTs.
4. I could not figure out way to add a port of type
OVS_PORT_TYPE_INTERNAL with name "internal" to the confdb using
ovs-vsctl.exe. And, this is needed in order to add the Hyper-V
internal port from userspace. To workaround this problem, we treat a
port of type OVS_PORT_TYPE_NETDEV with name "internal" as a request
to add the Hyper-V internal port. This is a workaround. The end
result is that there's a discrepancy between the port type in the
datpaath v/s confdb, but this has not created any trouble in testing
so far. If this ends up becoming an issue, we can mark the Hyper-V
internal port to be of type OVS_PORT_TYPE_NETDEV. No harm.
5. Because of changes indicated in #1, we also update the vport dump
code to look at the correct hash table for ports added from
userspace.
6. Add a OvsGetTunnelVport() for convenience.
7. Update ASSERTs() while cleaning up the switch.
8. Nuke OvsGetExternalVport() and OvsGetExternalMtu().
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Fri, 24 Oct 2014 00:33:12 +0000 (17:33 -0700)]
datapath-windows: Refactor core in Vport.c.
We do a bunch of changes that did not make sense to split up into
smaller patches:
1. Add descriptive comments to the important functions to clarify
purpose.
2. s/OvsInitVportCommon/InitHvVportCommon - this function is common code
for every port that shows up on the Hyper-V switch.
3. Introduce a InitOvsVportCommon() that is common code for evrey port
that gets added from userspace. This is especially useful for ports
that are not present on the Hyper-V switch. ie. tunnel ports and
bridge-internal ports.
4. Fix OvsClearAllSwitchVports() to remove ports from both the lists:
the ones added from Hyper-V as well as the ones added from OVS
userspace.
5. Update OvsInitVxlanTunnel() to not call into InitHvVportCommon
(formerly OvsInitVportCommon()) since it is not a port on the Hyper-v
switch. In a later patch in the series, we'll call
InitOvsVportCommon() for a VXLAN port.
6. 'numNonHvVports' increments and decrements ONLY for ports that are
added from OVS userspace but not present on the Hyper-V switch.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Fri, 24 Oct 2014 00:33:09 +0000 (17:33 -0700)]
datapath-windows: Clarify externalVport.
In this patch, we add some explanation about the usage of
'externalVport' in the switch context. Also, we rename 'externalVport'
to 'virtualExternalVport' in alignment with the explanation. Also, we
rename 'numVports' to 'numHvVports' since ports are added from 2 ends
now.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Fri, 24 Oct 2014 00:33:08 +0000 (17:33 -0700)]
datapath-windows: Re-init the list entry in OvsDeleteVportCmdHandler().
Without this patch, the kernel crashes when it tries to cleanup a port
at unload time when a port has been previously deleted from userspace.
Crash is in OvsRemoveAndDeleteVport() when we call into
RemoveEntryList().
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Sorin Vinturis [Mon, 27 Oct 2014 19:26:49 +0000 (19:26 +0000)]
datapath-windows: Fix BSOD caused by DV due to memory leaks.
If the OVS extension is enabled, Driver Verifier will issue a BSOD
due to memory leaks. This issue reproduces each time and the problem
is in the filter attach routine when the switch context is initialized.
Jarno Rajahalme [Mon, 27 Oct 2014 17:57:28 +0000 (10:57 -0700)]
lib/ovs-rcu: Support static initialization.
Currently, OVSRCU_TYPE_INITIALIZER always initializes the RCU pointer
as NULL. There is no reason why the RCU pointer could not be
initialized with a non-NULL value, however, as statically allocated
memory is even more stable than required for RCU.
This patch changes the initializer to OVSRCU_INITIALIZER(VALUE), which
can take any pointer value as a parameter.
This allows rculist, which is introduced in a following patch, to
provide an initializer similar to the one in the normal list.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Fri, 24 Oct 2014 20:22:24 +0000 (13:22 -0700)]
lib/classifier: Add lib/classifier-private.h.
tests/test-classifier.c used to include lib/classifier.c to gain
access to the internal data structures and some utility functions.
This was confusing, so this patch splits the relevant groups of
classifier internal definations to a new file
(lib/classifier-private.h), which is included by both lib/classifier.c
and tests/test-classifier.c. Other use of the new file is
discouraged.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Fri, 24 Oct 2014 20:22:24 +0000 (13:22 -0700)]
lib: Clean up vlog use.
Vlog functions assume a vlog module has been defined for the current
translation unit. Including lib/vlog.h from a header file makes the
vlog API visible even when no vlog module may not have been defined.
This patch removes the two cases in the tree where vlog.h was included
from a header file.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Fri, 24 Oct 2014 05:59:20 +0000 (14:59 +0900)]
ofproto: Only allow indirect groups with one bucket
OpenFlow 1.1 - 1.4 specify that indirect groups should
opperate on the "one defined bucket in the group". OpenFlow
1.2 - 1.4 also state "This group only supports a single bucket."
This patch enforces the single bucket limitation for indirect groups
when decoding group mod messages. A test is also added to exercise
this change.
Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Pravin B Shelar [Mon, 20 Oct 2014 23:13:04 +0000 (16:13 -0700)]
datapath: Use upstream ipv6_find_hdr().
ipv6_find_hdr() already fixed in newer upstram kernel by Ansis, we
can start using this API safely.
This patch also backports fix (ipv6: ipv6_find_hdr restore prev
functionality) to compat ipv6_find_hdr().
CC: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Ben Pfaff [Thu, 23 Oct 2014 21:34:04 +0000 (14:34 -0700)]
ofp-actions: Properly check for action that exceeds buffer length.
Commit c2d936a44fa (ofp-actions: Centralize all OpenFlow action code for
maintainability.) rewrote OpenFlow action parsing but failed to check that
actions don't overflow their buffers. This commit fixes the problem and
adds negative tests so that this bug doesn't recur.
Reported-by: Tomer Pearl <Tomer.Pearl@Contextream.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Justin Pettit <jpettit@nicira.com>
Nithin Raju [Thu, 23 Oct 2014 15:27:34 +0000 (08:27 -0700)]
dpif-netlink: Add support for packet receive on Windows.
In this patch, we add support in dpif-netlink.c to receive packets on
Windows. Windows does not natively support epoll(). Even though there
are mechanisms/interfaces that provide functionality similar to epoll(),
we take a simple approach of using a pool of sockets.
Here are some details of the implementaion to aid review:
1. There's pool of sockets per upcall handler.
2. The pool of sockets is initialized while setting up the handler in
dpif_netlink_refresh_channels() primarily.
3. When sockets are to be allocated for a vport, we walk through the
pool of sockets for all handlers and pick one of the sockets in each of
the pool. Within a handler's pool, sockets are picked in a round-robin
fashion.
4. We currently support only 1 handler, since there are some kernel
changes needed for support more than 1 handler per vport.
5. The pool size is also set to 1 currently.
The restructions imposed by #4 and #5 can be removed in the future
without much code churn.
Validation:
1. With a hacked up kernel which figures out the netlink socket that is
designated to receive packets, we are cable to perform pings between 2
VMs on the same Hyper-V host.
2. Compiled the code in Linux as well.
3. Tested with pool size == 2 as well, though in this patch we set the
pool size = 1.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Thu, 23 Oct 2014 15:27:33 +0000 (08:27 -0700)]
netlink-socket: Add packet subscribe functionality on Windows.
In this patch, we add support in userspace for packet subscribe API
similar to the join/leave MC group API that is used for port events.
The kernel code has already been commited.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
stream-tcp: Call setsockopt TCP_NODELAY after TCP is connected.
On Windows platform, TCP_NODELAY can only be set when TCP is established.
(This is an observed behavior and not written in any MSDN documentation.)
The current code does not create any problems while running unit tests
(because connections get established immediately) but is reportedly
observed while connecting to a different machine.
commit 8b76839(Move setsockopt TCP_NODELAY to when TCP is connected.)
made changes to call setsockopt with TCP_NODELAY after TCP is connected
only in lib/stream-ssl.c. We need the same change for stream-tcp too and
this commit does that.
Currently, a failure of setting TCP_NODELAY results in reporting
the error and then closing the socket. This commit changes that
behavior such that an error is reported if setting TCP_NODELAY
fails, but the connection itself is not torn down.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Commit a8d819675f3 (Remove stream, vconn, and rconn functions to get
local/remote IPs/ports.) removed the code that used the local socket
address but neglected to remove the code to fetch that address. This
commit removes the latter code also.
Ben Pfaff [Wed, 22 Oct 2014 21:57:39 +0000 (14:57 -0700)]
tunnel: Add to nw_tos bits instead of replacing them in tnl_port_send().
We normally only add 1-bits to wc->masks for datapath flow matching
purposes, never removing them. In this case, the bits that get set to
zero will be set back to 1 later on in the function, so this does not fix
any actual bug, but the principle of only setting to 1, not to 0, seems
sound to me.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Justin Pettit <jpettit@nicira.com>
Nithin Raju [Tue, 21 Oct 2014 23:10:36 +0000 (16:10 -0700)]
datapath-windows: Return success when duplicate flow is added.
If we are trying to insert a flow while there's already a key with the
same flow, return success instead of failure. It can be argued that we
should probably return a transactional error EEXIST, but we'll handle
this in a subsequent commit. I've added a comment to address this later.
Pravin B Shelar [Tue, 21 Oct 2014 21:10:41 +0000 (14:10 -0700)]
datapath: net: make skb_gso_segment error handling more robust
skb_gso_segment has three possible return values:
1. a pointer to the first segmented skb
2. an errno value (IS_ERR())
3. NULL. This can happen when GSO is used for header verification.
However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL
and would oops when NULL is returned.
Note that these call sites should never actually see such a NULL return
value; all callers mask out the GSO bits in the feature argument.
However, there have been issues with some protocol handlers erronously not
respecting the specified feature mask in some cases.
It is preferable to get 'have to turn off hw offloading, else slow' reports
rather than 'kernel crashes'.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Nithin Raju [Sat, 18 Oct 2014 18:39:38 +0000 (11:39 -0700)]
netlink-socket: Fix nl_sock_recv__() on Windows.
In nl_sock_recv__() on Windows, we realloc a new ofpbuf to copy received
data if the caller specified buffer is small. While we do so, we need
reset some of the other stack variables to point to the new ofpbuf.
Other fixes are around using 'error' rather than 'errno'.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Madhu Challa [Sat, 18 Oct 2014 20:18:01 +0000 (13:18 -0700)]
dpif: Fix crash in format_odp_actions() due to NULL actions.
When flow_get fails (in this case flow does not exist) simply log
the key part of the get and erase the rest of the flow because it
is invalid.
Verified the fix by doing ovs-ofctl del-flows when traffic is running.
2014-10-18T20:12:13.785Z|00011|dpif(revalidator20)|WARN|system@ovs-system: failed to flow_get (No such file or directory) dp_hash(0),recirc_id(0),skb_priority(0),in_port(2),skb_mark(0),eth(src=00:13:72:0b:52:fa,dst=00:14:72:0b:52:fa),eth_type(0x0800),ipv4(src=10.0.0.164,dst=11.0.0.164,proto=6,tos=0,ttl=4,frag=no),tcp(src=1651,dst=6095),tcp_flags(ack), packets:0, bytes:0, used:never
Signed-off-by: Madhu Challa <challa@noironetworks.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
An I/O request is queued in Kernel to be completed upon a packet mismatch.
This mechanism is similar to the port state notification.
Access to instance data should be under a lock (TBD)
Signed-off-by: Eitan Eliahu <eliahue@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
stream-fd: Merge stream-fd-windows and stream-fd-unix.
There was not much difference between the two files after moving
all of the Windows socket HANDLE polling functionality to poll-loop.c.
So merge them together.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
maryam.tahhan [Mon, 13 Oct 2014 14:17:09 +0000 (15:17 +0100)]
netdev-dpdk: Move to DPDK 1.7.1
This patch updates the documentation to reflect that DPDK 1.7.1
is supported. Travis scripts have also been updated to reflect
this. DPDK phy and ring ports were validated against DPDK 1.7.1.
Reviewed-by: Mark D. Gray <mark.d.gray@intel.com> Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Jarno Rajahalme [Sat, 18 Oct 2014 00:03:13 +0000 (17:03 -0700)]
lib/dpif-netdev: Fix EMC lookup.
Patch 0de8783a9 (lib/dpif-netdev: Integrate megaflow classifier.)
broke exact match cache lookup, but it went undetected since there are
no separate stats for EMC.
This patch fixes the problem by changing the struct netdev_flow_key
'len' member to cover only the 'mf' member, not the whole
netdev_flow_key, and ignoring the 'len' field in
netdev_flow_key_equal. Comparison is still accurate, as the miniflow
'map' field encodes the length in the number of 1-bits, and the map is
included in the comparison.
Reported-by: Alex Wang <alexw@nicira.com> Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
Li RongQing [Fri, 17 Oct 2014 13:37:51 +0000 (06:37 -0700)]
datapath: fix a use after free
pskb_may_pull() called by arphdr_ok can change skb->data, so put the arp
setting after arphdr_ok to avoid the use the freed memory
Fixes: 0714812134d7d ("openvswitch: Eliminate memset() from flow_extract.") Cc: Jesse Gross <jesse@nicira.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Eitan Eliahu [Fri, 17 Oct 2014 06:45:42 +0000 (23:45 -0700)]
datapath-windows: Packet subscribe handler
This change includes the following:
[1] Handler for subscribe/unsubscribe to a packet queue associated with a
socket pid.
[2] Allocation of per socket packet queue on a packet subscription.
[3] Removal of static allocated queues.
[4] Freeing the packet queue (on user mode process termination).
Jarno Rajahalme [Fri, 17 Oct 2014 16:37:11 +0000 (09:37 -0700)]
lib/dpif-netdev: Integrate megaflow classifier.
Megaflow inserts and removals are simplified:
- No need for classifier internal mutex, as dpif-netdev already has a
'flow_mutex'.
- Number of memory allocations/frees can be halved.
- Lookup code path can rely on netdev_flow_key always having inline data.
This will also be easier to simplify further when moving to per-thread
megaflow classifiers in the future.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Alex Wang <alexw@nicira.com>
Eitan Eliahu [Fri, 17 Oct 2014 00:53:27 +0000 (17:53 -0700)]
datapath-windows: Add packet miss read Netlink command.
The change include the Packet Read handler.
The current implementation reads once packet at a time. This should be updated
once user mode code is in place.
Ben Pfaff [Thu, 16 Oct 2014 22:00:03 +0000 (15:00 -0700)]
ofproto-dpif-xlate: Support BFD, CFM, carrier, and LACP for port liveness.
This is simpler and shorter than handling each of these by itself.
CC: Niels van Adrichem <N.L.M.vanAdrichem@tudelft.nl> Suggested-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Alex Wang <alexw@nicira.com>
docker: Integrate docker containers with Open vSwitch.
Open vSwitch does not have native integration with Docker.
INSTALL.Docker explains how Open vSwitch can be integrated
with docker non-natively.
ovs-docker is a helper script to add network interfaces to
docker containers and to attach them as ports to OVS bridge.
This script can be further enhanced as we understand different
use cases.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
ofproto-dpif-xlate: Use bfd forwarding status in fast-failover groups.
Integration of each interface' status as confirmed by BFD into the
FastFailover Group table. When BFD is configured and function
bfd_forwarding() reports false, odp_port_is_alive also reports false in
order to have a watched interface report false and omit to another
backup.
Test-suite has been succesfully run, as well as testing with ICMP echo
requests and replies that traffic was succesfully rerouted over the
backup path. More extensive load-consumption tests with a function that
only checked whether (bfd->state == STATE_UP) have been succesfully
performed, but was later changed to use the larger function
bfd_forwarding() as it captures all possible exceptions and is properly
mutually excluded.
Signed-off-by: Niels van Adrichem <n.l.m.vanadrichem@tudelft.nl> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Thu, 16 Oct 2014 00:27:14 +0000 (17:27 -0700)]
datapath-windows: Fixes in OvsSetVportCmdHandler()
In this patch, we make a few simple fixes based on reviewing the code.
The code as such is not tested. We'll be hitting the code path soon
and might make more fixes at that time.
Ankur Sharma [Wed, 15 Oct 2014 22:54:52 +0000 (15:54 -0700)]
datapath-windows: changes to existing PACKET_CMD handler.
In this patch we have made following changes:
OvsPacketExecute =>
Changed the data structure to have packet and
actions as pointer (instead of zero length array). It is done because
we will not do memcpy of packet now, pointer will just point
to corresponding offset in input buffer.
OvsExecuteDpIoctl =>
We only need input buffer now. Hence Changed the function signature.
Eitan Eliahu [Wed, 15 Oct 2014 09:14:03 +0000 (02:14 -0700)]
datapath-windows: Upcall NL packet format: Queue elem for packe in NL format.
[1] Allocate a queue element and space to hold the packet, key, tunnel key
and user data in NL format.
[2] Format the NL header
[3] Store packet, key, tunnel key and user data in NL format
[4] Calculates and insert checksum if offloaded.
Eitan Eliahu [Wed, 15 Oct 2014 09:13:10 +0000 (02:13 -0700)]
datapath-windows: Upcall NL packet format: Parametrized Key to NL conversion.
Extend the key and tunnel key conversion to nested NL format functions use the
NL attribute as a parameter so we can use them for missed packet formatting.
Add functions for calculating the space needed for storing the key and the
tunnel key in NL format
The VTEP emulator creates one OVS bridge for every logical switch and then
programs flow in it based on learned local macs and controller programmed
remote macs.
Multiple logical switches can have multiple OVS tunnels to the
same remote machine (with different tunnel ids). But VTEP schema expects
a single BFD session between two physical locators. Therefore
create a separate bridge ('bfd_bridge') and create a single OVS tunnel
between two physical locators (using reference counter).
The creation of BFD tunnels by the VTEP emulator is mostly for reporting
purposes. That is, it can be used by the controller to figure out that
a remote port is down. The emulator itself does not base any of its
forwarding decisions based on the state of a bfd tunnel.
Ankur Sharma [Sat, 11 Oct 2014 22:07:39 +0000 (15:07 -0700)]
datapath-windows: Remove setting of replyLen to zero.
This was one of the review comment which i forgot to address in
FLOW_DUMP checkin. We do not need to explicitly set replyLen to zero
as caller would have already set it.
Ankur Sharma [Sat, 11 Oct 2014 22:07:37 +0000 (15:07 -0700)]
datapath-windows: Validate ETHERTYPE and ETHERNET attribute.
During vswitchd boot up kernel is receiving FLOW_ADD commands
without ETHERTYPE attribute. Added additional check for the same
in this patch. During the same phase kernel is also receiving
packet_execute with no ETHERNET attribute.
Nithin Raju [Mon, 13 Oct 2014 03:56:19 +0000 (20:56 -0700)]
datapath-windows: loop iterator fixes in Vport.c
Validation:
- With these fixes, we no longer see the freeze during module
uninstallation or when we try to add a new port.
- We are able to add a port called "internal of type internal using:
ovs-dpctl.exe add-if ovs-system internal,type=internal
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Mon, 13 Oct 2014 03:56:16 +0000 (20:56 -0700)]
datapath-windows: remove vport from lists upon deletion
In this patch, we fix a bug in the vport delete code. When a vport is
deleted using a netlink command, we need to remove it from the
'ovsNamHashArray' and the 'portNoHashArray' as well. Addition of a vport
adds the port to the lists.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Deletion of a vport is now handled both by the netlink command vport
delete and the hyper-v switch port delete handler. If a netlink
command vport delete is requested on a vport that has no hyper-v
switch port counterpart (i.e., it is a tunnel port, or or the hyper-v
switch virtual nic is disconnected), the vport is deleted and removed.
If the hyper-v switch port delete is requested (i.e. the VNic is
disconnecting) and the ovs (datapath) part is deleted (i.e. was
deleted by netlink command vport delete, or was never created by
an netlink command vport new), then the hyper-v switch port delete
function handler deletes and removes the vport.
If the hyper-v switch port delete is requested while its datapath
counterpart is still alive, or, when the netlink command vport delete
is requested while the hyper-v switch port is still alive, the port
is only marked that it's part is deleted - the field hvDeleted was
added to OVS_VPORT_ENTRY to specify if the hyper-v switch port side
was deleted; if the ovs (datapath) port number is invalid, then it
means that the ovs (datapath) side of the port is deleted (or, not
created).
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Acked-by: Nithin Raju <nithin@vmware.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Mon, 13 Oct 2014 03:56:14 +0000 (20:56 -0700)]
datapath-windows: Add netlink command: vport new
Does the following:
a. before creating the vport, makes sure there is no existing vport
with the same ovs (datapath) port name. If this is not so, it means
that the specified port already exists: it returns NL_ERROR_EXIST.
b. looks up the vport:
o) if the vport type is "internal", then the internal vport of the
hyper-v switch is yielded.
o) if the vport type is "netdev" and the vport ovs (datapath) name
is "external", then the external vport is yielded. The switch can
have only one "external" vport. The method of looking up the
"external" port can be changed later, if a better method is found.
o) if the vport type is "netdev" but the name is not "external",
then a VM VNic is assumed, so the vport is looked up by hyper-v
switch port friendly name.
o) if none of the above, a tunneling vport type is expected, which
in our case, at the moment, can only be the one vxlan vport. Only
one vxlan vport is allowed, and it's saved in
switchContext->vxlanVport. The tunneling vport is the only kind
which is created in the netlink command vport new, because it does
not have a hyper-v switch port counterpart.
c. if the vport could not be found (non-tunneling vports), then the
NL_ERROR_INVAL is returned to the userspace.
d. if the vport was found, but it has a valid ovs (datapath) port
number, it means that this port was already created by a netlink
command vport new. Therefore, NL_ERROR_EXIST is returned to the
userspace.
e. if the netlink command vport new specified an ovs (datapath) port
number, then it means that the userspace is trying to re-create a
vport: that specified port number will be used. Otherwise, a new
ovs (datapath) port number is computed and assigned to the vport.
f. the ovsName field of the vport is set to the name given by the
OVS_VPORT_ATTR_NAME netlink attribute. The ovsNameLen is no longer
stored in the OVS_VPORT_ENTRY struct, because ovsName is
null-terminated.
g. the "portOptions" are set to the vport, if the attribute
OVS_VPORT_ATTR_OPTIONS was given. Otherwise, it is set to NULL.
portOptions is a PNL_ATTR, which is yet to be implemented. The
only option available for now would be vxlan udp destination port,
but we have a constant value there, so this option is not yet needed.
h. the upcall pid is set to the vport.
i. if the vport type is vxlan, then the vport pointer is also saved
to switchContext->vxlanVport.
j. Now that the ovs (datapath) port number and the ovs name were set,
the vport can be added to the hash array of vports, hashed on ovs name
and to the hash array of vports hashed by ovs (datapath) port number.
k. the reply is yielded to the userspace.
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Acked-by: Nithin Raju <nithin@vmware.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Tue, 7 Oct 2014 22:08:53 +0000 (15:08 -0700)]
netlink-socket: Always pass the output buffer in a transaction.
We need to pass down the output buffer so that the kernel can return
transaction status - error or otherwise.
Also, we were processing the output buffer only when when
'txn->reply != NULL' ie when the caller specified an ofpbuf for the
reply. In this patch, the code has been updated to process the reply
unconditionally, but making sure to copy the reply to the 'txn->reply'
only when it is not NULL. The reason for the unconditional processing is
we can pass up transactional errors in 'txn->error'. Otherwise, it
results in an endless loop of calling nl_transact().
Alex Wang [Fri, 10 Oct 2014 21:41:10 +0000 (14:41 -0700)]
ofproto-dpif-upcall: Fix out-of-scope use of stack memory.
Commit cc377352d (ofproto: Reorganize in preparation for direct
dpdk upcalls.) introduced the bug that keeps reference to 'struct
flow' defined on the stack inside while loop when running out of
the scope. This causes strange bug like wrong mask extraction
when the part of memory is corrupted, and could lead to even
more serious bug/crash.
This commit fixes the above issue by defining an array of the
'struct flow's on the function stack.
Found by running ovs on RHEL7.
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Fri, 10 Oct 2014 22:38:57 +0000 (15:38 -0700)]
lib/classifier: Make classifier_remove() more robust.
classifier already provides lockless lookups, and protected
modifications. When user wants to remove a flow, we currently require
the flow to exist in the classifier. To be thread safe, this requires
the caller to introduce their own mutex, lock it before a lookup, and
then issue classifier_remove() while the lock is still held.
This patch relaxes the "existence requirement" of the rule in
classifier_remove(), allowing it to be called on a rule that may have
already been removed from the classifier. This allows users to do a
classifier_lookup() and classifier_remove() without additional
syncronization.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>