Eitan Eliahu [Wed, 15 Oct 2014 09:14:03 +0000 (02:14 -0700)]
datapath-windows: Upcall NL packet format: Queue elem for packet in NL format.
[1] Allocate a queue element and space to hold the packet, key, tunnel key
and user data in NL format.
[2] Format the NL header
[3] Store packet, key, tunnel key and user data in NL format
[4] Calculate and insert checksum if offloaded.
Eitan Eliahu [Wed, 15 Oct 2014 09:13:10 +0000 (02:13 -0700)]
datapath-windows: Upcall NL packet format: Parametrized Key to NL conversion.
Extend the functions that convert the key and tunnel key to nested NL format
to take the NL attribute as a parameter, so we can use them for missed packet
formatting.
Add functions for calculating the space needed for storing the key and the
tunnel key in NL format.
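The space calculation can be sketched as below. This is only an illustration of the standard netlink attribute layout (a 4-byte attribute header, with every attribute padded to a 4-byte boundary); the sketch_* names are hypothetical and are not the functions added by this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Netlink attributes are aligned to 4 bytes; the attribute header
 * (16-bit length followed by 16-bit type) is 4 bytes long. */
#define SKETCH_NLA_ALIGNTO 4u
#define SKETCH_NLA_HDRLEN  4u

/* Round 'len' up to the next multiple of the attribute alignment. */
static size_t
sketch_nla_align(size_t len)
{
    return (len + SKETCH_NLA_ALIGNTO - 1) & ~(size_t)(SKETCH_NLA_ALIGNTO - 1);
}

/* Total space taken by one attribute carrying 'payload' bytes. */
static size_t
sketch_nla_total_size(size_t payload)
{
    return sketch_nla_align(SKETCH_NLA_HDRLEN + payload);
}

/* Space for a nested attribute: one extra header wrapping children whose
 * padded sizes sum to 'children_size'. */
static size_t
sketch_nla_nested_size(size_t children_size)
{
    assert(children_size % SKETCH_NLA_ALIGNTO == 0);
    return SKETCH_NLA_HDRLEN + children_size;
}
```

Summing sketch_nla_total_size() over every key attribute and wrapping the result with sketch_nla_nested_size() gives the number of bytes to reserve before formatting starts.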
The VTEP emulator creates one OVS bridge for every logical switch and then
programs flows in it based on learned local macs and controller programmed
remote macs.
Multiple logical switches can have multiple OVS tunnels to the
same remote machine (with different tunnel ids). But VTEP schema expects
a single BFD session between two physical locators. Therefore
create a separate bridge ('bfd_bridge') and create a single OVS tunnel
between two physical locators (using reference counter).
The creation of BFD tunnels by the VTEP emulator is mostly for reporting
purposes. That is, it can be used by the controller to figure out that
a remote port is down. The emulator itself does not base any of its
forwarding decisions on the state of a bfd tunnel.
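The reference counting described above can be sketched as follows. The emulator itself is not written in C and its data structures are not shown in this message, so the table, the sketch_* names and the return conventions are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_LOCATORS 8
#define SKETCH_IP_LEN 46

/* One BFD tunnel per remote physical locator (hypothetical layout). */
struct sketch_bfd_tunnel {
    char remote_ip[SKETCH_IP_LEN];
    int refs;                      /* 0 means the slot is unused. */
};

static struct sketch_bfd_tunnel sketch_tunnels[SKETCH_MAX_LOCATORS];

static struct sketch_bfd_tunnel *
sketch_bfd_find(const char *remote_ip)
{
    for (int i = 0; i < SKETCH_MAX_LOCATORS; i++) {
        if (sketch_tunnels[i].refs > 0
            && !strcmp(sketch_tunnels[i].remote_ip, remote_ip)) {
            return &sketch_tunnels[i];
        }
    }
    return NULL;
}

/* Takes a reference for one logical switch.  Returns 1 if the tunnel in
 * 'bfd_bridge' had to be created, 0 if an existing one was reused, and -1
 * if the table is full. */
static int
sketch_bfd_ref(const char *remote_ip)
{
    struct sketch_bfd_tunnel *t = sketch_bfd_find(remote_ip);

    assert(strlen(remote_ip) < SKETCH_IP_LEN);
    if (t) {
        t->refs++;
        return 0;
    }
    for (int i = 0; i < SKETCH_MAX_LOCATORS; i++) {
        if (sketch_tunnels[i].refs == 0) {
            snprintf(sketch_tunnels[i].remote_ip, SKETCH_IP_LEN, "%s",
                     remote_ip);
            sketch_tunnels[i].refs = 1;
            return 1;
        }
    }
    return -1;
}

/* Drops a reference.  Returns 1 if this was the last reference, so the
 * tunnel should be deleted, and 0 otherwise. */
static int
sketch_bfd_unref(const char *remote_ip)
{
    struct sketch_bfd_tunnel *t = sketch_bfd_find(remote_ip);

    if (t && --t->refs == 0) {
        return 1;
    }
    return 0;
}
```

With this shape, two logical switches tunneling to the same remote machine share a single BFD tunnel, and it is torn down only when the second one goes away.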
Ankur Sharma [Sat, 11 Oct 2014 22:07:39 +0000 (15:07 -0700)]
datapath-windows: Remove setting of replyLen to zero.
This was one of the review comments which I forgot to address in
FLOW_DUMP checkin. We do not need to explicitly set replyLen to zero
as the caller would have already set it.
Ankur Sharma [Sat, 11 Oct 2014 22:07:37 +0000 (15:07 -0700)]
datapath-windows: Validate ETHERTYPE and ETHERNET attribute.
During vswitchd boot up, the kernel receives FLOW_ADD commands
without the ETHERTYPE attribute. This patch adds a check for that
case. During the same phase, the kernel also receives
packet_execute with no ETHERNET attribute.
Nithin Raju [Mon, 13 Oct 2014 03:56:19 +0000 (20:56 -0700)]
datapath-windows: loop iterator fixes in Vport.c
Validation:
- With these fixes, we no longer see the freeze during module
uninstallation or when we try to add a new port.
- We are able to add a port called "internal" of type internal using:
ovs-dpctl.exe add-if ovs-system internal,type=internal
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Mon, 13 Oct 2014 03:56:16 +0000 (20:56 -0700)]
datapath-windows: remove vport from lists upon deletion
In this patch, we fix a bug in the vport delete code. When a vport is
deleted using a netlink command, we need to remove it from the
'ovsNamHashArray' and the 'portNoHashArray' as well. Addition of a vport
adds the port to the lists.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Deletion of a vport is now handled both by the netlink command vport
delete and the hyper-v switch port delete handler. If a netlink
command vport delete is requested on a vport that has no hyper-v
switch port counterpart (i.e., it is a tunnel port, or the hyper-v
switch virtual nic is disconnected), the vport is deleted and removed.
If the hyper-v switch port delete is requested (i.e. the VNic is
disconnecting) and the ovs (datapath) part is deleted (i.e. was
deleted by netlink command vport delete, or was never created by
a netlink command vport new), then the hyper-v switch port delete
function handler deletes and removes the vport.
If the hyper-v switch port delete is requested while its datapath
counterpart is still alive, or, when the netlink command vport delete
is requested while the hyper-v switch port is still alive, the port
is only marked to show that its part is deleted - the field hvDeleted was
added to OVS_VPORT_ENTRY to specify if the hyper-v switch port side
was deleted; if the ovs (datapath) port number is invalid, then it
means that the ovs (datapath) side of the port is deleted (or, not
created).
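The two-sided deletion rule can be sketched as below. The struct is a hypothetical stand-in that keeps only the fields this message talks about (it is not OVS_VPORT_ENTRY), the isTunnel flag is an assumption, and the invalid port number is taken to be the largest 16-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_DPPORT_NUMBER_INVALID UINT16_MAX

/* Hypothetical, reduced view of a vport. */
struct sketch_vport {
    bool hvDeleted;    /* Hyper-V switch port side already deleted? */
    uint32_t portNo;   /* OVS (datapath) port number, or invalid. */
    bool isTunnel;     /* Tunnel ports have no Hyper-V counterpart. */
};

enum sketch_delete_side {
    SKETCH_NETLINK_DELETE,   /* netlink command vport delete */
    SKETCH_HV_DELETE         /* Hyper-V switch port delete handler */
};

/* Records that one side of 'vport' is gone.  Returns true if the vport can
 * now be removed and freed, false if it is only marked because the other
 * side is still alive. */
static bool
sketch_vport_delete(struct sketch_vport *vport, enum sketch_delete_side side)
{
    if (side == SKETCH_NETLINK_DELETE) {
        /* The datapath side is deleted: invalidate the port number. */
        vport->portNo = SKETCH_DPPORT_NUMBER_INVALID;
        return vport->isTunnel || vport->hvDeleted;
    } else {
        /* The Hyper-V side is deleted. */
        vport->hvDeleted = true;
        return vport->portNo == SKETCH_DPPORT_NUMBER_INVALID;
    }
}
```

Whichever side is deleted second is the one that actually frees the vport.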
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Acked-by: Nithin Raju <nithin@vmware.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Mon, 13 Oct 2014 03:56:14 +0000 (20:56 -0700)]
datapath-windows: Add netlink command: vport new
Does the following:
a. before creating the vport, makes sure there is no existing vport
with the same ovs (datapath) port name. If there is one, the
specified port already exists and NL_ERROR_EXIST is returned.
b. looks up the vport:
o) if the vport type is "internal", then the internal vport of the
hyper-v switch is yielded.
o) if the vport type is "netdev" and the vport ovs (datapath) name
is "external", then the external vport is yielded. The switch can
have only one "external" vport. The method of looking up the
"external" port can be changed later, if a better method is found.
o) if the vport type is "netdev" but the name is not "external",
then a VM VNic is assumed, so the vport is looked up by hyper-v
switch port friendly name.
o) if none of the above, a tunneling vport type is expected, which
in our case, at the moment, can only be the one vxlan vport. Only
one vxlan vport is allowed, and it's saved in
switchContext->vxlanVport. The tunneling vport is the only kind
which is created in the netlink command vport new, because it does
not have a hyper-v switch port counterpart.
c. if the vport could not be found (non-tunneling vports), then the
NL_ERROR_INVAL is returned to the userspace.
d. if the vport was found, but it has a valid ovs (datapath) port
number, it means that this port was already created by a netlink
command vport new. Therefore, NL_ERROR_EXIST is returned to the
userspace.
e. if the netlink command vport new specified an ovs (datapath) port
number, then it means that the userspace is trying to re-create a
vport: that specified port number will be used. Otherwise, a new
ovs (datapath) port number is computed and assigned to the vport.
f. the ovsName field of the vport is set to the name given by the
OVS_VPORT_ATTR_NAME netlink attribute. The ovsNameLen is no longer
stored in the OVS_VPORT_ENTRY struct, because ovsName is
null-terminated.
g. the "portOptions" are set to the vport, if the attribute
OVS_VPORT_ATTR_OPTIONS was given. Otherwise, it is set to NULL.
portOptions is a PNL_ATTR, which is yet to be implemented. The
only option available for now would be vxlan udp destination port,
but we have a constant value there, so this option is not yet needed.
h. the upcall pid is set to the vport.
i. if the vport type is vxlan, then the vport pointer is also saved
to switchContext->vxlanVport.
j. Now that the ovs (datapath) port number and the ovs name were set,
the vport can be added to the hash array of vports, hashed on ovs name
and to the hash array of vports hashed by ovs (datapath) port number.
k. the reply is yielded to the userspace.
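The lookup choice in step b can be sketched as below. Comparing the vport type as a string follows the wording of this message; the enum, the function name and the string-typed parameters are assumptions for illustration, not the code in the patch.

```c
#include <assert.h>
#include <string.h>

/* Which lookup step b would pick (hypothetical names). */
enum sketch_lookup {
    SKETCH_LOOKUP_INTERNAL,       /* internal vport of the hyper-v switch */
    SKETCH_LOOKUP_EXTERNAL,       /* the single "external" vport */
    SKETCH_LOOKUP_FRIENDLY_NAME,  /* VM VNic, by port friendly name */
    SKETCH_LOOKUP_TUNNEL          /* tunneling vport, created on demand */
};

static enum sketch_lookup
sketch_classify_vport(const char *type, const char *ovs_name)
{
    assert(type && ovs_name);
    if (!strcmp(type, "internal")) {
        return SKETCH_LOOKUP_INTERNAL;
    }
    if (!strcmp(type, "netdev")) {
        return !strcmp(ovs_name, "external")
               ? SKETCH_LOOKUP_EXTERNAL
               : SKETCH_LOOKUP_FRIENDLY_NAME;
    }
    /* None of the above: a tunneling vport type is expected. */
    return SKETCH_LOOKUP_TUNNEL;
}
```

Only the last case creates a vport; in the first three a failed lookup leads to the NL_ERROR_INVAL result described in step c.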
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Acked-by: Nithin Raju <nithin@vmware.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Tue, 7 Oct 2014 22:08:53 +0000 (15:08 -0700)]
netlink-socket: Always pass the output buffer in a transaction.
We need to pass down the output buffer so that the kernel can return
transaction status - error or otherwise.
Also, we were processing the output buffer only when
'txn->reply != NULL', i.e. when the caller specified an ofpbuf for the
reply. In this patch, the code has been updated to process the reply
unconditionally, but making sure to copy the reply to the 'txn->reply'
only when it is not NULL. The reason for the unconditional processing is
that we can pass up transactional errors in 'txn->error'. Otherwise, it
results in an endless loop of calling nl_transact().
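The change in control flow can be sketched as below. The struct is a simplified, hypothetical stand-in for the transaction (a plain byte buffer instead of an ofpbuf), and the function is not the code in lib/netlink-socket.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified transaction. */
struct sketch_txn {
    char *reply;        /* Caller's reply buffer, or NULL if not wanted. */
    size_t reply_cap;   /* Capacity of 'reply' in bytes. */
    int error;          /* Transaction status reported by the kernel. */
};

/* Processes the kernel's output for 'txn'.  The status is always recorded,
 * even when the caller passed no reply buffer; the payload is copied only
 * when a buffer was supplied and the transaction succeeded.  Returns the
 * number of payload bytes copied. */
static size_t
sketch_process_output(struct sketch_txn *txn, int status,
                      const char *payload, size_t len)
{
    size_t n = 0;

    txn->error = status;
    if (txn->reply && status == 0) {
        n = len < txn->reply_cap ? len : txn->reply_cap;
        memcpy(txn->reply, payload, n);
    }
    return n;
}
```

Because the status is stored before the NULL check, a caller that did not ask for a reply still sees the error instead of retrying forever.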
Alex Wang [Fri, 10 Oct 2014 21:41:10 +0000 (14:41 -0700)]
ofproto-dpif-upcall: Fix out-of-scope use of stack memory.
Commit cc377352d (ofproto: Reorganize in preparation for direct
dpdk upcalls.) introduced a bug that keeps a reference to a 'struct
flow' defined on the stack inside a while loop after it has gone out
of scope. This causes strange bugs, such as wrong mask extraction
when that part of memory is corrupted, and could lead to even
more serious bugs or crashes.
This commit fixes the above issue by defining an array of the
'struct flow's on the function stack.
Found by running ovs on RHEL7.
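The pattern behind the fix can be sketched as below. The types and the function are hypothetical and much smaller than the real upcall code; only the lifetime issue is illustrated.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_BATCH 4

struct sketch_flow {
    int in_port;
};

struct sketch_upcall {
    const struct sketch_flow *flow;   /* Used after the loop has finished. */
};

/* Builds one upcall per port, then reads every upcall's flow afterwards. */
static int
sketch_sum_in_ports(const int *ports, size_t n)
{
    /* Fixed variant: the flows live for the whole function call. */
    struct sketch_flow flows[SKETCH_BATCH];
    struct sketch_upcall upcalls[SKETCH_BATCH];
    size_t n_upcalls = 0;
    int sum = 0;

    assert(n <= SKETCH_BATCH);
    while (n_upcalls < n) {
        /* Buggy variant: declaring 'struct sketch_flow flow;' here and
         * storing '&flow' leaves a dangling pointer, because the object's
         * lifetime ends at the closing brace of each iteration. */
        struct sketch_flow *flow = &flows[n_upcalls];

        flow->in_port = ports[n_upcalls];
        upcalls[n_upcalls].flow = flow;
        n_upcalls++;
    }
    for (size_t i = 0; i < n_upcalls; i++) {
        sum += upcalls[i].flow->in_port;
    }
    return sum;
}
```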
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Jarno Rajahalme [Fri, 10 Oct 2014 22:38:57 +0000 (15:38 -0700)]
lib/classifier: Make classifier_remove() more robust.
The classifier already provides lockless lookups and protected
modifications. When a user wants to remove a flow, we currently require
the flow to exist in the classifier. To be thread safe, this requires
the caller to introduce their own mutex, lock it before a lookup, and
then issue classifier_remove() while the lock is still held.
This patch relaxes the "existence requirement" of the rule in
classifier_remove(), allowing it to be called on a rule that may have
already been removed from the classifier. This allows users to do a
classifier_lookup() and classifier_remove() without additional
synchronization.
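The relaxed contract can be sketched as below with a plain linked list. This only illustrates a remove that tolerates an already removed rule; it has none of the classifier's locking or lockless-lookup machinery, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_rule {
    int priority;
    struct sketch_rule *next;
};

struct sketch_table {
    struct sketch_rule *head;
};

static void
sketch_insert(struct sketch_table *table, struct sketch_rule *rule)
{
    rule->next = table->head;
    table->head = rule;
}

/* Removes 'rule' from 'table' if it is still there.  Returns 'rule' when it
 * was removed by this call and NULL when it was not found, for example
 * because someone else removed it first. */
static struct sketch_rule *
sketch_remove(struct sketch_table *table, struct sketch_rule *rule)
{
    for (struct sketch_rule **p = &table->head; *p; p = &(*p)->next) {
        if (*p == rule) {
            *p = rule->next;
            rule->next = NULL;
            return rule;
        }
    }
    return NULL;
}
```

A caller can use the return value to decide whether it is the one responsible for freeing the rule.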
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Alin Serdean [Thu, 9 Oct 2014 17:46:58 +0000 (17:46 +0000)]
datapath-windows: Add port friendly name to OVS_VPORT_ENTRY
The port friendly name will be set by WMI / powershell script.
It will be used from within the netlink command vport new to
identify the hyper-v switch port it represents.
This patch also adds a function to lookup a vport by the
port friendly name.
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Co-authored-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Tested-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alin Serdean [Thu, 9 Oct 2014 17:46:57 +0000 (17:46 +0000)]
datapath-windows: Make VPORT ovs number values smaller than MAXUINT16
For this, the old method of finding vports based on the
ovs port numbers is removed. Now, the lookup of a vport
by ovs port number is done the same way as the lookup by
hyper-v switch port id.
This is done so that the kernel is able to interact with
the userspace correctly when using vport channels.
The problem manifested in lib/dpif-netlink.c, at the function
vport_add_channels.
This patch removes the field vportArray from OVS_SWITCH_CONTEXT.
In its place, portNoHashArray is set. In the OVS_VPORT_ENTRY
struct, we also add portNoLink. This new method will do lookup
and insertions of vports by ovs (datapath) port numbers the same
way it is done for hyper-v switch port ids.
This patch implicitly removes the indexes, which were previously
used in insertions and lookups on ovs port numbers. The removal
of the index also means that the vxlan vport can no longer be
looked up the same way as it was done before: now we hold a pointer
to it, vxlanVport in OVS_SWITCH_CONTEXT. For the moment, we can
have only one vxlan vport. When more are allowed, this field
will turn into a list of vxlan ports.
The invalid port number value (held in OVS_DPPORT_NUMBER_INVALID)
is now changed from 0 to MAXUINT16, the same as it is on linux.
The function OvsComputeVportNo has also been removed, since the
computation of a vport port number can no longer be done like this.
When vport add is added, a new, updated OvsComputeVportNo
function will be added.
Also, in OvsInitVportCommon, we no longer need to (and no longer can)
initialize vport->ovsName, nor vport->ovsNameLen, because they will
be initialized by the netlink command vport add. Since the netlink
command vport add will generate numbers for the datapath (ovs) port
numbers and set the port names, we cannot insert the vport into the
hash array of port numbers here, nor into the hash array of port names.
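The lookup by port number through a hash array can be sketched as below. The field names portNoHashArray and portNoLink follow this message, but the structs are reduced stand-ins, the chaining uses a plain pointer instead of a list entry, and the modulo hash is an assumption chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUCKETS 16
#define SKETCH_DPPORT_NUMBER_INVALID UINT16_MAX

struct sketch_vport {
    uint32_t portNo;                  /* OVS (datapath) port number. */
    struct sketch_vport *portNoLink;  /* Next vport in the same bucket. */
};

struct sketch_switch {
    struct sketch_vport *portNoHashArray[SKETCH_BUCKETS];
};

static unsigned int
sketch_hash_port_no(uint32_t portNo)
{
    return portNo % SKETCH_BUCKETS;
}

static void
sketch_insert_by_port_no(struct sketch_switch *sw, struct sketch_vport *vport)
{
    unsigned int b = sketch_hash_port_no(vport->portNo);

    /* Valid port numbers stay below the invalid marker. */
    assert(vport->portNo < SKETCH_DPPORT_NUMBER_INVALID);
    vport->portNoLink = sw->portNoHashArray[b];
    sw->portNoHashArray[b] = vport;
}

static struct sketch_vport *
sketch_find_by_port_no(const struct sketch_switch *sw, uint32_t portNo)
{
    struct sketch_vport *v = sw->portNoHashArray[sketch_hash_port_no(portNo)];

    while (v && v->portNo != portNo) {
        v = v->portNoLink;
    }
    return v;
}
```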
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Co-authored-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Tested-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alin Serdean [Thu, 9 Oct 2014 17:46:57 +0000 (17:46 +0000)]
datapath-windows: Rename switch context's portHashArray and vport's portLink
The field portLink of the OVS_VPORT_ENTRY is the link within the
OVS_SWITCH_CONTEXT's hash array of vports portHashArray, hashed by the
portId field of the OVS_VPORT_ENTRY.
Later on, we will need to modify the OVS_VPORT_ENTRY so that its port
numbers are set to maximum MAXUINT16. This will require that the field
vportArray of OVS_SWITCH_CONTEXT be removed and replaced with a hash
array, portNoHashArray. Also, a new field, portNoLink, will need to be
added to OVS_VPORT_ENTRY. In order to differentiate between portHashArray
and portNoHashArray, portHashArray is renamed to portIdHashArray. Also,
in order to differentiate between portLink and portNoLink, portLink
is renamed to portIdLink.
In a future patch the vport functionality will be changed to constrain
the port numbers to MAXUINT16.
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Co-authored-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Tested-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alin Serdean [Thu, 9 Oct 2014 17:46:56 +0000 (17:46 +0000)]
datapath-windows: Rename switch context's nameHashArray and vport's nameLink
The field nameLink of the OVS_VPORT_ENTRY is the link within the
OVS_SWITCH_CONTEXT's hash array of vports nameHashArray, hashed by the
ovsName field of the OVS_VPORT_ENTRY.
Later on, the friendly name of the hyper-v switch port will need to be
set from userspace using WMI. This will require that the hyper-v switch
port friendly name be set to the exact string value as the ovs
(datapath) port name set from netlink command vport add.
The vport will need to differentiate between the ovs (datapath) port
name and hyper-v switch port friendly name, because they may differ in
erroneous scenarios, or state differences between the hyper-v switch
port and the ovs (datapath) port. This may happen if the vport was
created by the netlink command vport add, but the VM disconnected (i.e.
the hyper-v switch port was later deleted).
Storing another field in vport, "portFriendlyName" would normally
make the current switchContext->nameHashArray and vport->nameLink
confusing since the "name"-s may be understood to mean the hyper-v
switch port friendly name, or the hyper-v switch port name, when it
actually refers to the ovs (datapath) port name.
Hence, the variable nameHashArray is renamed to ovsPortNameHashArray,
while the nameLink is renamed to ovsPortNameLink. This change will make
a clearer connection between these and the vport field "ovsName"
around which they revolve.
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Co-authored-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Tested-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alin Serdean [Thu, 9 Oct 2014 17:46:56 +0000 (17:46 +0000)]
datapath-windows: Rename OvsGetVportNo into OvsComputeVportNo and make public
OvsGetVportNo computes a new port number. Therefore, OvsComputeVportNo
is a clearer name for what the function does. Reading OvsGetVportNo
may give the false impression that it returns the port number of an
existing vport.
Also, since the responsibility of assigning dp port numbers no longer
falls on the hyper-v switch port handlers side, but on the netlink vport
commands side (vport add), we will need to use this compute port number
function from outside Vport.c. Therefore, this function declaration is
moved from Vport.c to Vport.h, and becomes public.
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Co-authored-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Tested-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alin Serdean [Thu, 9 Oct 2014 17:46:55 +0000 (17:46 +0000)]
datapath-windows: Update OVS_SWITCH_CONTEXT: external and internal port
The fields externalVport and internalVport of the OVS_SWITCH_CONTEXT
struct are currently defined as PVOID. However, all over the code they
are used as POVS_VPORT_ENTRY. In order to improve clarity and reduce the
need for useless casts to POVS_VPORT_ENTRY, this patch changes the type
from PVOID to POVS_VPORT_ENTRY.
This patch does not clean up the code that already uses casts to
POVS_VPORT_ENTRY. This cleanup can be done later on as well.
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Co-authored-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Tested-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alin Serdean [Thu, 9 Oct 2014 17:46:55 +0000 (17:46 +0000)]
datapath-windows: We don't need to keep validation ports in ovs
Validation ports are used internally by the hyper-v switch to validate
and verify settings for the real hyper-v switch ports that will be
connected to the VNic. The validation ports are of no use to us - we
must skip handling them, and return STATUS_SUCCESS as the OID result.
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Co-authored-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Tested-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alin Serdean [Thu, 9 Oct 2014 17:46:54 +0000 (17:46 +0000)]
datapath-windows: Rename hyper-v switch port and nic handlers
Functions such as OvsCreatePort are vague in regard to who creates it or
when. It wasn't a problem thus far, since the vports were created,
updated and destroyed from one place only (hyper-v switch part). But
now, with the netlink implementation of the vport commands, a part of
the vport is constructed by the netlink vport add, and the other part
is constructed by the hyper-v switch nic and port handlers.
This patch renames the hyper-v switch nic and port handlers, so that
they are now prefixed by "Hv" instead of "Ovs", in order to clarify
which of the functions are nic or port handlers. This will make the
usages from the netlink vport commands side and from the hyper-v
switch side clearer. It will also make more obvious which nic and port
functions are helper functions.
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Co-authored-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Tested-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Pravin B Shelar [Sat, 4 Oct 2014 03:23:58 +0000 (20:23 -0700)]
netdev-dpif: Add metadata to dpif-packet.
Today dpif-netdev has a single metadata for a given batch, since one
batch belongs to one port, but soon packets from a single tunnel port
can belong to different ports, so we need to have per-packet metadata.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Alex Wang [Thu, 9 Oct 2014 08:23:37 +0000 (08:23 +0000)]
ovs-vswitchd: Fix high cpu utilization when acquire idl lock fails.
When ovs-vswitchd fails to acquire the ovsdb idl lock (either due to
contention or due to invalid database path), ovs-vswitchd will spin
on the global connectivity sequence number and consume 100% cpu.
This happens because the local copy differs from the global sequence
number and is never updated when the ovsdb idl is not locked.
To fix this issue, this commit makes ovs-vswitchd skip checking the
global connectivity sequence number in that situation.
Reported-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Thu, 9 Oct 2014 07:02:07 +0000 (00:02 -0700)]
rhel7: Fix rpm install failure.
When trying to install the kernel module rpm built for RHEL7,
the install failed with following conflicts:
# rpm -i kmod-openvswitch-2.3.1-1.el7.x86_64.rpm
file /lib/modules/3.10.0-123.8.1.el7.x86_64/modules.devname
from install of kmod-openvswitch-2.3.1-1.el7.x86_64 conflicts
with file from package kernel-3.10.0-123.8.1.el7.x86_64
file /lib/modules/3.10.0-123.8.1.el7.x86_64/modules.softdep
from install of kmod-openvswitch-2.3.1-1.el7.x86_64 conflicts
with file from package kernel-3.10.0-123.8.1.el7.x86_64
A similar issue has already been reported and solved here:
Nithin Raju [Tue, 7 Oct 2014 01:05:39 +0000 (18:05 -0700)]
datapath-windows: Update OvsGetExtInfoIoctl() to the new vport add workflow
I applied the patches for the new vport add workflow that is out for
review, and found that some of the existing code in OvsGetExtInfoIoctl()
needs to be updated. In this patch, we add a CPP macro called
USE_NEW_VPORT_ADD_WORKFLOW and add the fixes under
USE_NEW_VPORT_ADD_WORKFLOW == 1. The current value is set to 0, since
the vport add code is not checked in yet.
Sending out this patch to unblock the vport add code when it gets checked
in. There are other fixes also required, but they are being addressed as
part of the review comments for vport-add.
Signed-off-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Nithin Raju [Wed, 8 Oct 2014 21:21:50 +0000 (14:21 -0700)]
datapath-windows: Add support for OVS_DP_CMD_NEW Netlink command.
In this change, we add support for the 'OVS_DP_CMD_NEW' netlink command
in the kernel. We don't really support creation of a new datapath. If
the name passed from userspace is the same as the name of the single
datapath we support, we return EEXIST.
This code is required to bootstrap ovs-vswitchd which makes the
following sequence of calls:
open_dpif_backer() -> dpif_create_and_open() -> dpif_create()
We also rename HandleDpTransaction() to HandleDpTransactionCommon().
Signed-off-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Ben Pfaff [Thu, 9 Oct 2014 05:13:31 +0000 (22:13 -0700)]
unaligned: Make get_unaligned_be64() compatible on GCC and non-GCC.
Until now, with GCC, get_unaligned_be64() had an interface that accepted
a "ovs_be64 *", and with other compilers it accepted any
pointer-to-64-bit type, but not void *. This commit fixes the problem,
making the interface the same in both cases.
This fixes a build error on MSVC:
lib/nx-match.c(320) : error C2100: illegal indirection
lib/nx-match.c(320) : error C2034: 'build_assert_failed' : type of bit
field too small for number of bits
lib/nx-match.c(320) : error C2296: '%' : illegal, left operand has
type 'void *'
lib/nx-match.c(320) : error C2198: 'ntohll' : too few arguments for call
It might appear that this patch changes get_unaligned_u64() but in fact
it only moves it earlier in the file (since it is now called from the
non-GCC fork of the #if).
Reported-by: Alin Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 12 Sep 2014 05:09:03 +0000 (22:09 -0700)]
ofp-actions: Support experimenter OXMs in Nicira extensions.
Some of the Nicira extension actions include fixed-size 32-bit members that
designate NXM fields. These actions can't accommodate 64-bit experimenter
OXMs, so we need to figure out some kind of solution. This commit does
that, in different ways for different actions.
For some actions, I did not think it was worthwhile to worry about
experimenter OXM, so I just disabled use of them. This is what I did for
bundle, learn, and multipath actions.
Other actions could be gracefully reinterpreted to support experimenter
OXM. This is true of reg_move, which uses NXM headers only at the end of
the action and such that using an experimenter OXM would make the action
longer (which unambiguously signals to older OVS that the action is an
error, which is desired behavior since older OVS cannot interpret this
action). The stack push and pop actions are also in this category.
reg_load was the most frustrating case. In OpenFlow 1.5 we had already
eliminated this action in favor of OF1.5+ set_field. In other OpenFlow
versions, though, reg_load is more powerful than set_field because it
can modify partial fields. This commit therefore adds a new variant of
reg_load, called reg_load2, which is simply OF1.5+ set_field with a Nicira
extension header on it.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Ben Pfaff [Wed, 8 Oct 2014 22:41:00 +0000 (15:41 -0700)]
nx-match: Add support for experimenter OXM.
OpenFlow 1.2+ defines a means for vendors to define vendor-specific OXM
fields, called "experimenter OXM". These OXM fields are expressed with a
64-bit OXM header instead of the 32-bit header used for standard OXM (and
NXM). Until now, OVS has not implemented experimenter OXM, and indeed we
have had little need to do so because of a pair of special 32-bit OXM classes
grandfathered to OVS as part of the OpenFlow 1.2 standardization process.
However, I want to prototype a feature for OpenFlow 1.5 that uses an
experimenter OXM as part of the prototype, so to do this OVS needs to
support experimenter OXM. This commit adds that support.
Most of this commit is a fairly straightforward change: it extends the type
used for OXM/NXM from 32 to 64 bits and adds code to encode and decode the
longer headers when necessary. Some other changes are necessary because
experimenter OXMs have a funny idea of the division between "header" and
"body": the extra 32 bits for experimenter OXMs are counted as part of the body
rather than the header according to the OpenFlow standard (even though this
does not entirely make sense), so arithmetic in various places has to be
adjusted, which is the reason for the new functions nxm_experimenter_len(),
nxm_payload_len(), and nxm_header_len().
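The length arithmetic can be sketched as below, working on the first 32 bits of an OXM header (16-bit class, 7-bit field, 1-bit hasmask, 8-bit length, with class 0xffff marking an experimenter OXM). The sketch_* functions are hypothetical; they mirror the idea behind the three new functions but are not their actual definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_OXM_CLASS_EXPERIMENTER 0xffffu

/* 'header' is the first 32 bits of an OXM header in host byte order. */
static unsigned int
sketch_oxm_class(uint32_t header)
{
    return header >> 16;
}

/* The 8-bit length field: bytes that follow the first 32 bits. */
static unsigned int
sketch_oxm_length(uint32_t header)
{
    return header & 0xff;
}

/* Extra bytes taken by the experimenter ID, if any. */
static unsigned int
sketch_experimenter_len(uint32_t header)
{
    return sketch_oxm_class(header) == SKETCH_OXM_CLASS_EXPERIMENTER ? 4 : 0;
}

/* Bytes that logically belong to the header. */
static unsigned int
sketch_header_len(uint32_t header)
{
    return 4 + sketch_experimenter_len(header);
}

/* Bytes of value (and mask) left once the experimenter ID, which the
 * standard counts in the length field, is subtracted. */
static unsigned int
sketch_payload_len(uint32_t header)
{
    assert(sketch_oxm_length(header) >= sketch_experimenter_len(header));
    return sketch_oxm_length(header) - sketch_experimenter_len(header);
}
```

For a standard OXM the header and payload lengths are simply 4 and the length field; for an experimenter OXM four bytes move from the payload to the header.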
Another change that calls for explanation is the new function mf_nxm_header()
that has been split from mf_oxm_header(). This function is used in actions
where the space for an NXM or OXM header is fixed so that there is no room
for a 64-bit experimenter type. An upcoming commit will add new variations
of these actions that can support experimenter OXM.
Testing experimenter OXM is tricky because I do not know of any in
widespread use. Two ONF proposals use experimenter OXMs: EXT-256 and
EXT-233. EXT-256 is not suitable to implement for testing because its use
of experimenter OXM is wrong and will be changed. EXT-233 is not suitable
to implement for testing because it requires adding a new field to struct
flow and I am not yet convinced that that field and the feature that it
supports is worth having in Open vSwitch. Thus, this commit assigns an
experimenter OXM code point to an existing OVS field that is currently
restricted from use by controllers, "dp_hash", and uses that for testing.
Because controllers cannot use it, this leaves future versions of OVS free
to drop the support for the experimenter OXM for this field without causing
backward compatibility problems.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Ben Pfaff [Wed, 10 Sep 2014 18:15:20 +0000 (11:15 -0700)]
nx-match: Speak of 'class' instead of 'vendor' for OXM/NXM.
OXM renamed the 'vendor' field from NXM to the 'class', and uses the term
"experimenter", which OVS usually renders as "vendor" for historical
reasons, as part of the extended 64-bit OXMs. To reduce confusion, this
commit adopts the OXM terminology for class.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Ben Pfaff [Tue, 7 Oct 2014 23:49:50 +0000 (16:49 -0700)]
ofp-actions: Support OF1.5 (draft) masked Set-Field, merge with reg_load.
OpenFlow 1.5 (draft) extends the OFPAT_SET_FIELD action originally
introduced in OpenFlow 1.2 so that it can set not just entire fields but
any subset of bits within a field as well. This commit adds support for
that feature when OpenFlow 1.5 is used.
With this feature, OFPAT_SET_FIELD becomes a superset of NXAST_REG_LOAD.
Thus, this commit merges the implementations of the two actions into a
single ofpact_set_field.
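Why a masked set-field covers reg_load can be sketched as below on a 32-bit field. Both functions are hypothetical illustrations of the bit arithmetic, not the ofpact_set_field implementation, and the reg_load parameters (bit offset and bit count) are simplified.

```c
#include <assert.h>
#include <stdint.h>

/* Masked set-field: bits selected by 'mask' come from 'value', all other
 * bits of the field keep their old contents. */
static uint32_t
sketch_set_field_masked(uint32_t old, uint32_t value, uint32_t mask)
{
    return (old & ~mask) | (value & mask);
}

/* reg_load style partial write of 'n_bits' bits starting at bit 'ofs',
 * expressed as a masked set-field. */
static uint32_t
sketch_reg_load(uint32_t old, uint32_t value, unsigned int ofs,
                unsigned int n_bits)
{
    uint32_t mask;

    assert(n_bits >= 1 && ofs < 32 && n_bits <= 32 - ofs);
    /* Avoid shifting a 32-bit value by 32, which is undefined in C. */
    mask = n_bits == 32 ? UINT32_MAX : ((UINT32_C(1) << n_bits) - 1) << ofs;
    return sketch_set_field_masked(old, value << ofs, mask);
}
```

Setting an entire field is the special case where the mask is all ones.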
ONF-JIRA: EXT-314 Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Nithin Raju [Wed, 8 Oct 2014 18:53:25 +0000 (11:53 -0700)]
lib/netdev-windows.c: nuke the init function.
The init function is not allowed to call into the kernel datapath
while running unit tests since the kernel datapath is not loaded.
Instead of making the function dummy, it is better to not have it
at all.
Ben Pfaff [Fri, 26 Sep 2014 23:00:44 +0000 (16:00 -0700)]
ovs-vsctl: Allow modifying "immutable" columns if we just created the row.
OVSDB has the concept of "immutable" columns, which are columns whose
values are fixed once a row is inserted. Until now, ovs-vsctl has not
allowed these columns to be modified at all. However, this is a little too
strict, because these columns can be set to any value at the time that the
row is inserted. This commit relaxes the ovs-vsctl requirement, then, to
allow an immutable column's value to be modified if its row has been
inserted within this transaction.
Requested-by: Mukesh Hira <mhira@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Lucian Petrut [Wed, 8 Oct 2014 17:05:02 +0000 (20:05 +0300)]
Update the WMI Script handling Hyper-V friendly port names
This patch ensures that the friendly port name is no longer than 16 characters
and that it is not already in use.
The method which checks the WMI jobs has been updated in order to provide
relevant error codes/descriptions.
Methods that retrieve the VM and the VM network adapter mapped to an OVS
port have been added as well. They are called:
Get-VMNetworkAdapterByOVSPort
Get-VMByOVSPort
Ben Pfaff [Tue, 7 Oct 2014 23:54:04 +0000 (16:54 -0700)]
vswitch.ovsschema: Correct schema version number.
Commit 3e5aeeb581faf7 (bridge: Keep bond active slave selection across OVS
restart) updated the OVS schema number from 7.9.0 to 8.0.0. However,
the major version number should only be incremented for incompatible schema
changes, ones that are likely to break software that interacts with the
schema. The change in question only added a column to a table, so it is
not an incompatible change. Therefore, this commit changes the schema
version number to 7.10.0, indicating a compatible change.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
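The versioning rule stated in this commit message can be written down as a small function. This is only an illustrative sketch; the struct and function names are hypothetical, and resetting the last component to zero on every bump is an assumption that happens to match the 7.9.0 to 7.10.0 example.

```c
#include <stdbool.h>

/* Hypothetical representation of an x.y.z schema version. */
struct schema_version {
    int major_ver;
    int minor_ver;
    int patch_ver;
};

/* Returns the version that follows 'v' after a schema change.  An
 * incompatible change bumps the major number; a compatible one (such as
 * adding a column) bumps only the minor number. */
static struct schema_version
schema_version_bump(struct schema_version v, bool incompatible)
{
    if (incompatible) {
        v.major_ver++;
        v.minor_ver = 0;
    } else {
        v.minor_ver++;
    }
    v.patch_ver = 0;
    return v;
}
```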
Ben Pfaff [Tue, 30 Sep 2014 21:05:14 +0000 (14:05 -0700)]
ofp-actions: Better support OXM in Copy-Field action.
The OpenFlow 1.5 (draft) Copy-Field action has two OXM headers, one after
the other. Until now, Open vSwitch has implemented these as a pair of
ovs_be32 members, which meant that only 32-bit OXM could be supported. This
commit changes the implementation to use nx_pull_header(), which means that
in the future when that function supports 64-bit experimenter OXMs,
Copy-Field will automatically get that support too.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
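For readers unfamiliar with the layout, the following sketch shows why a plain 32-bit member cannot hold every OXM header. It assumes the OXM TLV header layout from the OpenFlow specification (16-bit class, 7-bit field, 1-bit hasmask, 8-bit length) and that class 0xffff marks an experimenter OXM, whose header is followed by a 32-bit experimenter ID. The names oxm_decode() and oxm_header_len() are hypothetical and unrelated to nx_pull_header().

```c
#include <stdbool.h>
#include <stdint.h>

#define OXM_CLASS_EXPERIMENTER 0xffff

/* Decoded pieces of the basic 32-bit OXM header. */
struct oxm_fields {
    uint16_t oxm_class;
    uint8_t field;
    bool hasmask;
    uint8_t length;     /* payload length in bytes */
};

/* Splits a 32-bit OXM header, given in host byte order, into its parts. */
static struct oxm_fields
oxm_decode(uint32_t header)
{
    struct oxm_fields f;

    f.oxm_class = (uint16_t) (header >> 16);
    f.field = (uint8_t) ((header >> 9) & 0x7f);
    f.hasmask = ((header >> 8) & 1) != 0;
    f.length = (uint8_t) (header & 0xff);
    return f;
}

/* Returns the total header size in bytes: 4 for ordinary OXMs, 8 for
 * experimenter OXMs, which carry an extra 32-bit experimenter ID. */
static unsigned int
oxm_header_len(uint32_t header)
{
    return (header >> 16) == OXM_CLASS_EXPERIMENTER ? 8 : 4;
}
```

A parser that asks for the header length before consuming it can handle both sizes, whereas a fixed pair of 32-bit members cannot.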
Ben Pfaff [Wed, 17 Sep 2014 05:13:44 +0000 (22:13 -0700)]
nx-match: Move all knowledge of OXM/NXM here.
This improves the general abstraction of OXM/NXM by eliminating direct
knowledge of it from the meta-flow code and other places.
Some function renaming might be called for; for example, mf_oxm_header()
may not be the best name now that the function is implemented within
nx-match. However, these renamings would make this commit larger and
harder to review, so I'm postponing them.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Ben Pfaff [Fri, 5 Sep 2014 00:02:35 +0000 (17:02 -0700)]
ovs-ofctl: Encode cookies in OXM-compliant manner.
NXM/OXM are only supposed to put 1-bits in a value if the corresponding bit
in the mask is a 1-bit, but in the case of cookie matching, e.g.
ovs-ofctl del-flows br0 cookie=0x3/0x1
ovs-ofctl would encode a bad OXM. This fixes the problem.
(The test "ofproto - del flows based on cookie mask" in the OVS testsuite
tickles this bug.)
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
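The rule behind this fix can be illustrated with the cookie from the example. This is a sketch under the assumption that normalizing simply clears value bits outside the mask; the type and function names are hypothetical and not taken from ovs-ofctl.

```c
#include <stdbool.h>
#include <stdint.h>

/* A value/mask pair as it would be placed in a masked OXM. */
struct masked_u64 {
    uint64_t value;
    uint64_t mask;
};

/* True if no 1-bit of 'value' falls outside 'mask'. */
static bool
masked_u64_is_well_formed(uint64_t value, uint64_t mask)
{
    return (value & ~mask) == 0;
}

/* Clears the bits of 'value' that the mask does not cover. */
static struct masked_u64
masked_u64_normalize(uint64_t value, uint64_t mask)
{
    struct masked_u64 m = { value & mask, mask };
    return m;
}
```

For cookie=0x3/0x1, the pair as typed is not well formed because bit 1 of the value lies outside the mask; after normalization it becomes 0x1/0x1, which matches the same flows.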
Ben Pfaff [Tue, 7 Oct 2014 22:24:11 +0000 (15:24 -0700)]
meta-flow: Autogenerate mf_field data structures.
This is a first step toward improving the abstraction of OXM and NXM in the
tree. As an immediate improvement, this commit removes all of the
definitions of the OXM and NXM constants from the top-level header files,
because they are no longer used anywhere.
Jarno Rajahalme [Tue, 7 Oct 2014 21:35:04 +0000 (14:35 -0700)]
lib/bitmap: Faster bitmap functions.
Replace bitwise loops with a single operation, inline all bitmap
functions. Inlining allows the compiler to remove unnecessary code
due to some parameters being compile-time constants.
Before:
$ tests/ovstest test-bitmap benchmark 1000000
bitmap equal: 341 ms
bitmap scan: 8089 ms
After:
$ tests/ovstest test-bitmap benchmark 1000000
bitmap equal: 152 ms
bitmap scan: 146 ms
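The kind of change that could produce a speedup like the one in the scan benchmark is sketched below: a bit-at-a-time loop next to a word-at-a-time version. This is not the lib/bitmap code; the function names and the 64-bit word size are assumptions, and __builtin_ctzll() is a GCC/Clang builtin rather than standard C.

```c
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64

/* Reference version: tests one bit per iteration.  Returns the index of
 * the first 1-bit in [start, end), or 'end' if there is none. */
static size_t
scan_bitwise(const uint64_t *bitmap, size_t start, size_t end)
{
    for (size_t i = start; i < end; i++) {
        if ((bitmap[i / WORD_BITS] >> (i % WORD_BITS)) & 1) {
            return i;
        }
    }
    return end;
}

/* Word-at-a-time version with the same contract: skips all-zero words
 * with a single comparison each and locates the bit with count-trailing-
 * zeros instead of a loop. */
static size_t
scan_wordwise(const uint64_t *bitmap, size_t start, size_t end)
{
    if (start >= end) {
        return end;
    }

    size_t w = start / WORD_BITS;
    size_t last = (end - 1) / WORD_BITS;
    /* Ignore bits below 'start' in the first word. */
    uint64_t word = bitmap[w] & (UINT64_MAX << (start % WORD_BITS));

    while (!word) {
        if (w == last) {
            return end;
        }
        word = bitmap[++w];
    }

    size_t idx = w * WORD_BITS + (size_t) __builtin_ctzll(word);
    return idx < end ? idx : end;
}
```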
Ben Pfaff [Tue, 7 Oct 2014 19:59:14 +0000 (12:59 -0700)]
flow: Clean up MINIFLOW_FOR_EACH_IN_MAP.
It seemed awkward to have declarations outside the for loop.
This may also be a little faster because it avoids some calls to
count_1bits(). The idea for that change is due to Jarno Rajahalme
<jrajahalme@nicira.com>.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
BFD: Decreasing minimal transmit and receive interval
I found the BFD transmit interval was lower-bounded by the default value
without warning, although the documentation does not mention a lower bound.
Testing has been performed with transmit and receive intervals as low as 1
ms, and although CPU overhead was affected (especially with six or more BFD
sessions), it worked well.
Signed-off-by: Niels van Adrichem <n.l.m.vanadrichem@tudelft.nl> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Mon, 6 Oct 2014 20:07:20 +0000 (13:07 -0700)]
netdev-windows: add code to query netdev information
Primary goals of netdev-windows.c are:
1) To query the 'network device' information of a vport such as MTU, etc.
2) Monitor changes to the 'network device' information such as link
status.
In this change, we implement only #1. #2 can also be implemented, but it
does not seem to be required for the purposes of implementing
'ovs-dpctl.exe show'.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Mon, 6 Oct 2014 20:07:19 +0000 (13:07 -0700)]
netdev-windows: New module.
In this patch, we add a lib/netdev-windows.c which mostly contains stub
code and in subsequent patches, would use the netlink interface to query
netdev information for a vport.
The code implements netdev functionality for "internal" and "system"
types of vports.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Nithin Raju [Mon, 6 Oct 2014 20:07:17 +0000 (13:07 -0700)]
OvsDpInterfaceExt.h: add support for netlink family for netdev
In this patch, we define the netlink family, attributes and commands
for querying the 'network device' information of VPORTs, such as
MTU, link status, etc.
I considered adding the netdev command to the OVS_WIN_CONTROL_FAMILY
itself, but the netdev attributes are not compatible with the existing
attributes for the events. I also considered adding new attributes to
the VPORT family, but we'll have to extend the standard datapath
interface for that.
In this patch, we fix the definition of 'OVS_WIN_CONTROL_ATTR_MAX' as
well.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Sorin Vinturis [Mon, 6 Oct 2014 15:19:23 +0000 (15:19 +0000)]
datapath-windows: Incorrect assumption of the IRQL
Acquiring a spin lock raises the IRQL to DISPATCH_LEVEL. But
in many places of the code, while holding a spin lock, there
are useless checks for the current IRQL against DISPATCH_LEVEL.
Also, the dispatch flag is not correctly set when calling
NdisAcquireRWLockXXX() functions, which causes an extra check
of the current IRQL.
Batching the cmap find improves the memory behavior with large cmaps
and can make searches twice as fast:
$ tests/ovstest test-cmap benchmark 2000000 8 0.1 16
Benchmarking with n=2000000, 8 threads, 0.10% mutations, batch size 16:
cmap insert: 533 ms
cmap iterate: 57 ms
batch search: 146 ms
cmap destroy: 233 ms
cmap insert: 552 ms
cmap iterate: 56 ms
cmap search: 299 ms
cmap destroy: 229 ms
hmap insert: 222 ms
hmap iterate: 198 ms
hmap search: 2061 ms
hmap destroy: 209 ms
Batch size 1 has a small performance penalty, but all other batch sizes
are faster than non-batched cmap_find(). A batch size of 16 was
experimentally found to be better than 8 or 32, so now
classifier_lookup_miniflow_batch() performs the cmap find operations
in batches of 16.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
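One way batching can help memory behavior is to start the memory accesses for a whole batch before examining any of them. The sketch below shows that idea on a deliberately simple linear-probing table, not on cmap itself; every name in it is hypothetical, the table is assumed never to be full, and __builtin_prefetch() is a GCC/Clang builtin.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 1024     /* Must be a power of two. */
#define BATCH_MAX 16

struct slot {
    bool used;
    uint32_t key;
    int value;
};

static uint32_t
hash_u32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bu;
    x ^= x >> 16;
    return x;
}

/* Inserts or overwrites 'key'.  Assumes the table never fills up. */
static void
table_insert(struct slot *table, uint32_t key, int value)
{
    uint32_t i = hash_u32(key) & (TABLE_SIZE - 1);

    while (table[i].used && table[i].key != key) {
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    table[i] = (struct slot) { true, key, value };
}

/* Looks up at most BATCH_MAX keys.  Sets results[i] to the matching slot
 * or NULL, and returns a bitmap with bit i set for each key found. */
static uint32_t
table_find_batch(const struct slot *table, const uint32_t *keys, size_t n,
                 const struct slot **results)
{
    uint32_t idx[BATCH_MAX];
    uint32_t found = 0;

    if (n > BATCH_MAX) {
        n = BATCH_MAX;
    }

    /* Pass 1: compute every starting slot and request its cache line, so
     * the memory fetches for the whole batch can overlap. */
    for (size_t i = 0; i < n; i++) {
        idx[i] = hash_u32(keys[i]) & (TABLE_SIZE - 1);
        __builtin_prefetch(&table[idx[i]]);
    }

    /* Pass 2: probe; by now the first slot of each key is likely cached. */
    for (size_t i = 0; i < n; i++) {
        uint32_t j = idx[i];

        results[i] = NULL;
        while (table[j].used) {
            if (table[j].key == keys[i]) {
                results[i] = &table[j];
                found |= 1u << i;
                break;
            }
            j = (j + 1) & (TABLE_SIZE - 1);
        }
    }
    return found;
}
```

Whether 16 is the best batch size for a given structure is an empirical question, as the measurements above suggest.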
We use the 'counter' as a "lock" providing acquire-release
semantics. Therefore we can use normal, non-atomic accesses for the
memory operations between the atomic accesses to 'counter'. The
cmap_node.next needs to be RCU, so that cannot be changed.
For the writer this is straightforward, as we first acquire-read the
counter and after all the changes we release-store the counter. For
the reader this is a bit more complex, as we need to make sure the
last counter read is not reordered with the preceding read operations
on the bucket contents.
Also rearrange code to benefit from the fact that hash values are
unique in any bucket.
This patch seems to make cmap_insert() a bit faster.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
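The counter-as-lock scheme described in this commit message resembles a sequence lock. Below is a generic C11 sketch of that pattern, assuming a single writer; it is not the lib/cmap code, and all names are hypothetical. The payload uses relaxed atomics in place of the "normal" accesses the message mentions, because plain accesses that race with a writer are formally undefined in C11.

```c
#include <stdatomic.h>
#include <stdint.h>

struct versioned_pair {
    atomic_uint counter;        /* Even: stable.  Odd: write in progress. */
    _Atomic uint32_t hash;
    _Atomic uint32_t value;
};

/* Single writer assumed.  The counter is odd while the payload changes. */
static void
pair_write(struct versioned_pair *p, uint32_t hash, uint32_t value)
{
    unsigned int c = atomic_load_explicit(&p->counter, memory_order_acquire);

    atomic_store_explicit(&p->counter, c + 1, memory_order_relaxed);
    /* Keep the payload stores from moving above the odd-counter store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->hash, hash, memory_order_relaxed);
    atomic_store_explicit(&p->value, value, memory_order_relaxed);
    /* Publish: payload stores become visible before the even counter. */
    atomic_store_explicit(&p->counter, c + 2, memory_order_release);
}

/* Retries until it sees the same even counter before and after reading. */
static void
pair_read(struct versioned_pair *p, uint32_t *hash, uint32_t *value)
{
    unsigned int before, after;

    do {
        before = atomic_load_explicit(&p->counter, memory_order_acquire);
        *hash = atomic_load_explicit(&p->hash, memory_order_relaxed);
        *value = atomic_load_explicit(&p->value, memory_order_relaxed);
        /* Keep the final counter read from being reordered ahead of the
         * payload reads above. */
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&p->counter, memory_order_relaxed);
    } while ((before & 1) || before != after);
}
```

The acquire fence before the second counter read is the "bit more complex" part for the reader: without it, that read could be satisfied before the payload reads and miss a concurrent update.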
tests/test-cmap: Balance benchmarks between cmap and hmap.
The test cases have been carefully crafted so that we do the same
amount of "overhead" operations in each case. Earlier, with no
mutations, the number of random number generations was different for
hmap and cmap test cases. hmap test was also missing an ignore() call.
Now the numbers look like this:
$ tests/ovstest test-cmap benchmark 2000000 8 0
Benchmarking with n=2000000, 8 threads, 0.00% mutations:
cmap insert: 597 ms
cmap iterate: 65 ms
cmap search: 299 ms
cmap destroy: 251 ms
hmap insert: 243 ms
hmap iterate: 201 ms
hmap search: 299 ms
hmap destroy: 202 ms
So it seems search on cmap can be as fast as on hmap in the
single-threaded case.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
lib/cmap: Return number of nodes from functions modifying the cmap.
We already update the count field as the last step of these functions,
so returning the current count is very cheap. Callers that care about
the count become a bit more efficient, as they avoid an extra
non-inlineable function call.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
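The calling pattern this enables can be shown on a toy container. The sketch below is not cmap; the fixed-capacity set and its function names are hypothetical and exist only to show a modifying function returning the resulting element count.

```c
#include <stddef.h>

#define SET_CAPACITY 8

struct small_set {
    int items[SET_CAPACITY];
    size_t count;
};

/* Appends 'item' if there is room.  Returns the number of elements in the
 * set after the call. */
static size_t
small_set_insert(struct small_set *s, int item)
{
    if (s->count < SET_CAPACITY) {
        s->items[s->count++] = item;
    }
    return s->count;
}

/* Removes one occurrence of 'item' if present.  Returns the number of
 * elements in the set after the call. */
static size_t
small_set_remove(struct small_set *s, int item)
{
    for (size_t i = 0; i < s->count; i++) {
        if (s->items[i] == item) {
            /* Fill the hole with the last element. */
            s->items[i] = s->items[--s->count];
            break;
        }
    }
    return s->count;
}
```

A caller that needs to react when the container becomes empty can test the return value of small_set_remove() against zero directly, instead of making a second call to query the size.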
Jarno Rajahalme [Wed, 1 Oct 2014 22:35:45 +0000 (15:35 -0700)]
lib/match: Do not format undefined fields.
Add function flow_wildcards_init_for_packet() that can be used to set
sensible wildcards when megaflows are disabled. Before this, we set
all the mask bits to ones, which caused printing tunnel, mpls, and/or
transport port fields even for packets for which it makes no sense.
This has the side effect of generating different megaflow masks for
different packet types, so there will be more than one kind of mask in
the datapath classifier. This should not make practical difference,
as megaflows should not be disabled when performance is important.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
openvswitch.spec: Remove dependency with openvswitch-kmod.
Upstream Linux has an OVS kernel module that includes most (not all) of
the features that come with the kernel module from openvswitch.org.
So, it is okay to relax the requirement that the OVS userspace package
depend on the openvswitch-kmod package.
Signed-off-by: Gurucharan Shetty <gshetty@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>