Jarno Rajahalme [Thu, 2 Oct 2014 16:12:11 +0000 (09:12 -0700)]
lib/ovs-atomic-i586: Faster 64-bit atomics on 32-bit builds with SSE.
Aligned 64-bit memory accesses in i586 are atomic. By using an SSE
register we can make such memory accesses in one instruction without
bus-locking. Need to compile with -msse (or higher) to enable this
feature.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Wed, 1 Oct 2014 07:29:19 +0000 (00:29 -0700)]
datapath: avoid hard coding OVS_VPORT_TYPE_GENEVE
OVS_VPORT_TYPE_GENEVE is currently hard coded to 6. This is not
necessary since slot 5 has not been taken yet. Drop the hard
coded value before upstreaming GENEVE support to the Linux kernel.
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Alex Wang [Wed, 1 Oct 2014 18:49:10 +0000 (11:49 -0700)]
cmap: ovsrcu postpone the cmap destroy.
Currently, cmap_destroy() directly frees the cmap memory.
Some callers of cmap_destroy() (e.g. destroy_subtable()) still
allow other threads (e.g. pmd threads) to access the cmap at
the same time (e.g. via classifier_lookup_miniflow_batch()),
which could cause a segfault.
To fix the above issue, this commit uses ovsrcu to postpone
the freeing of the cmap memory.
Reported-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
When the bridge datapath_type is changed, ofproto is destroyed and immediately
recreated. This involves closing and reopening the mgmt socket. If the
destruction of the 'connmgr' is postponed, a race condition might happen, where
we first recreate the socket and then try to destroy it.
Reported-by: Daniel Badea <daniel.badea@windriver.com> Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Wed, 1 Oct 2014 00:03:07 +0000 (17:03 -0700)]
configure: Disable strict aliasing.
The C standard allows compilers to do type-based alias analysis, which
means that the compiler is allowed to assume that pointers to objects of
different types are pointers to different objects. For example, a compiler
may assume that "uint16_t *a" and "uint32_t *b" point to different and
nonoverlapping locations because the pointed-to types are different. This
can lead to surprising "optimizations" with compilers that by default do
this kind of analysis, which includes GCC and Clang.
The one escape clause that the C standard gives us is that character types
must be assumed to alias any other object. We've always tried to use this
escape clause to avoid problems with type-based alias analysis in the past.
I think that we should continue to try to do this in the future. It's hard
to tell what compiler we might want to use in the future, and one never
knows what kind of control that compiler allows over alias analysis.
However, recently I helped another developer debug a nasty and confusing
issue, which turned out to be the result of a surprising compiler
optimization due to alias analysis. I've seen enough of these that I don't
think it's worthwhile to risk more problems than we have to. Thus, this
commit turns off type-based alias analysis in GCC and Clang.
Linus Torvalds thinks that type-based alias analysis is not sane, at least
as GCC implements it: https://lkml.org/lkml/2003/2/26/158
The GCC manual says that -Wstrict-aliasing is only effective without
-fno-strict-aliasing, otherwise I'd keep -Wstrict-aliasing also.
Indications are that MSVC doesn't do type-based alias analysis by default.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Alex Wang [Tue, 30 Sep 2014 20:46:22 +0000 (13:46 -0700)]
bridge: Fix high cpu utilization.
When more than one ovs-vswitchd process is started,
only one process is enabled. The disabled processes should
just sleep. However, a bug in ovs makes the disabled processes
keep waking up on the global connectivity sequence number, which is
never sync'ed. Consequently, those processes use 100% cpu.
This commit fixes the bug by always syncing up the connectivity
sequence number for disabled processes.
Reported-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Joe Stringer <joestringer@nicira.com>
Ben Pfaff [Tue, 30 Sep 2014 16:57:08 +0000 (09:57 -0700)]
ovs-vswitchd: Better diagnose errors in DPDK command-line options.
With DPDK compiled in, when the --dpdk option was given other than as the
first command-line argument, ovs-vswitchd silently ignored it. Without
DPDK compiled in, when the --dpdk option was given anywhere, ovs-vswitchd
silently ignored it. However, in each case any options following --dpdk
were not ignored, and since --dpdk is normally followed by additional
DPDK-specific options, this caused even more confusing trouble.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com> Tested-by: Daniele Di Proietto <ddiproietto@vmware.com>
pstream-unix: Increase listen count to 64 in punix_open().
In my test with an OpenStack setup, ovs-ofctl failed to execute when
many flow rules were being added by multiple threads.
The error looks like this:
ovs-ofctl: /var/run/openvswitch/br1.mgmt: failed to open socket (Protocol error)
The backlog of 10 passed to listen(fd, 10) in punix_open() should be
increased; 64 may be a more suitable value.
Signed-off-by: Lilijun <jerry.lilijun@huawei.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Joe Stringer [Fri, 26 Sep 2014 17:28:05 +0000 (17:28 +0000)]
revalidator: Distinguish new and duplicate flows.
We previously counted flows that have been installed during the current
dump as duplicates, rather than recognising them as new flows. This
patch separates the counters out for these two cases.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 12 Sep 2014 21:42:47 +0000 (14:42 -0700)]
nx-match: Encode dp_hash and recirc_id in OXM also.
dp_hash and recirc_id are specific to OVS, but that doesn't mean that we
shouldn't encode them into flow matches when OXM is used in OpenFlow 1.2
and later.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Ben Pfaff [Tue, 2 Sep 2014 23:47:01 +0000 (16:47 -0700)]
ofp-actions, nx-match: Use mf_oxm_header() instead of explicit constants.
Following this change, only meta-flow.c uses any explicit NXM_* or OXM_*
constants. An upcoming commit will actually remove the definitions of
these constants, hiding them behind a functional interface, for better
abstraction.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Joe Stringer [Mon, 29 Sep 2014 18:09:57 +0000 (18:09 +0000)]
bridge: Fix bug where IDL wakeup causes 100% CPU.
Commit 9c537baf613a16e (bridge: Refactor the stats and status update.)
inadvertently changed the way that the status transaction is destroyed,
which could cause the main thread to constantly wake up.
The bug occurs when ovsdb_idl_txn_commit() returns TXN_INCOMPLETE and
there are no further changes to connectivity or 'status_txn_try_again'.
- ovsdb_idl_run() receives the transaction reply and updates the
transaction status to TXN_SUCCESS.
- status_update_wait() detects that the transaction is in progress, and
the status is TXN_SUCCESS so wakes up the main thread immediately.
- run_status_update() is meant to destroy the transaction now that it is
finished, however the logic is never run because there were no changes.
- Repeat the wakeup every time the main loop runs.
This patch fixes the behaviour by ensuring that ovsdb_idl_txn_commit()
gets a chance to run whenever there is an ongoing status transaction.
Bug was found by unloading and reloading the kernel module, then
immediately restarting ovs-vswitchd.
Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Alex Wang <alexw@nicira.com>
Currently, whenever there's a missed packet, the ovs driver allocates
memory and copies the packet even if no packet queue has been set up from
userspace. Then, if no queue has been created, the packet is released and
dropped.
The solution was to check for the existence of the userspace queue before
trying to allocate and add a new missed packet to the queue. If there is
no userspace queue created, the original packet is dropped without creating
a new missed packet.
Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com> Reported-by: Nithin Raju <nithin@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/32 Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Registered the FLOW_DEL command handler. The same command
handler as FLOW_ADD is good enough to handle the FLOW_DEL
case as well, with minor changes to the checking of the action
attribute.
Signed-off-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Tested-by: Ankur Sharma <ankursharma@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
This patch covers the changes needed to support FLOW_NEW command.
API _OvsFlowMapNlToFlowPutFlags has a bug, which will be fixed
with the patches for FLOW_DEL.
Signed-off-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Tested-by: Ankur Sharma <ankursharma@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
NlAttrParseNested was using the whole netlink payload for iteration.
This is not correct, as it would lead to exceeding the
nested attribute boundaries. Fixed the same in this patch.
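As a rough illustration of the boundary issue (a sketch in Python, not the
driver's C code; the helper names make_attr, parse_attrs and parse_nested are
hypothetical, and the layout assumed is the standard netlink attribute header:
16-bit length, 16-bit type, payload padded to 4 bytes), iteration over a nested
attribute should stop at the end of that attribute's payload, not at the end of
the whole message:

```python
import struct

NLA_HDRLEN = 4  # 16-bit nla_len + 16-bit nla_type


def nla_align(n):
    """Round n up to the 4-byte netlink attribute alignment."""
    return (n + 3) & ~3


def make_attr(attr_type, payload):
    """Build one attribute: header, payload, zero padding."""
    length = NLA_HDRLEN + len(payload)
    padding = b"\0" * (nla_align(length) - length)
    return struct.pack("=HH", length, attr_type) + payload + padding


def parse_attrs(buf, start, end):
    """Return (type, payload_start, payload_end) for attributes in buf[start:end]."""
    attrs = []
    off = start
    while off + NLA_HDRLEN <= end:
        length, attr_type = struct.unpack_from("=HH", buf, off)
        if length < NLA_HDRLEN or off + length > end:
            break  # malformed or truncated attribute
        attrs.append((attr_type, off + NLA_HDRLEN, off + length))
        off += nla_align(length)
    return attrs


def parse_nested(buf, nested_attr):
    """Parse only inside the nested attribute's own payload."""
    _, payload_start, payload_end = nested_attr
    return parse_attrs(buf, payload_start, payload_end)


# A nested attribute (type 10) holding two inner attributes, followed by an
# unrelated sibling attribute (type 11) at the top level.
inner = make_attr(1, b"\x01\x02\x03\x04") + make_attr(2, b"\x05\x06")
msg = make_attr(10, inner) + make_attr(11, b"\xaa\xbb\xcc\xdd")
top = parse_attrs(msg, 0, len(msg))
nested_types = [t for t, _, _ in parse_nested(msg, top[0])]
```

Bounding the walk by the nested payload yields only the inner attributes;
walking to the end of the message would also pick up the sibling attribute.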
Signed-off-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Tested-by: Ankur Sharma <ankursharma@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Moved the structure OVS_MESSAGE to Netlink.h.
This change is done for the following reasons.
a. Patch 2 in this series provides a generic API in Netlink.c
for creating netlink message. That API needs OVS_MESSAGE.
Including Datapath.h in Netlink.c/h gives a compilation error.
b. OVS_MESSAGE defines netlink messages hence moving it to
Netlink.h looked fine to me.
Signed-off-by: Ankur Sharma <ankursharma@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Nithin Raju <nithin@vmware.com> Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Tested-by: Ankur Sharma <ankursharma@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
datapath-windows: Add WMI Script that updates Hyper-V friendly port names.
The following script leverages the WMI infrastructure
offered in Hyper-V.
This script allows the user to change the
Msvm_EthernetPortAllocationSettingData property of a VM network adapter
connected to a Hyper-V Virtual Switch.
The Read event handler is executed when user mode issues a socket
receive on an MC socket associated with the event queue. A new IOCTL
READ command is used to differentiate between transaction based and packet
miss sockets.
An entry for the handler will be added once the Control family (logically
should have been added to the Vport family)
implementation is checked in.
User mode code for setting the socket type will follow.
Samuel Ghinet [Thu, 25 Sep 2014 21:22:22 +0000 (21:22 +0000)]
datapath-windows: Add file NetlinkError.h.
Contains error codes for netlink transactional errors.
These errors are passed to the "error" field (INT) of the NL_MSG_ERR struct.
The userspace requires them to be negative values: the nl_msg_nlmsgerr userspace
function transforms them from negative to positive values.
These error codes correspond to the userspace error codes defined in:
"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include\errno.h"
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Functionality for vport dump.
Later, when we add more netlink dump commands, some common code will need
to be split into functions.
Notes:
a) the current implementation of vport assumes the datapath feature
"multiple upcall pids" is not used. A single upcall pid is used now.
c) the vxlan destination udp port is currently a constant. When it becomes
configurable, the vport options netlink attribute will become relevant.
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Acked-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Samuel Ghinet [Thu, 25 Sep 2014 21:20:25 +0000 (21:20 +0000)]
datapath-windows: fix OVS_VPORT_TYPE
The windows ovs kernel uses an OVS_VPORT_TYPE enum that is incompatible with
the userspace counterpart (enum ovs_vport_type from openvswitch.h). We must use
the same enum type for the netlink communication to work properly.
This patch makes the fix: "typedef enum ovs_vport_type OVS_VPORT_TYPE" and
changes the corresponding kernel driver code:
o) vport types synthetic and emulated turn to: netdev
o) vport type internal turns to: internal
o) vport type external turns to: netdev (plus, we hold a field in vport,
"isExternal")
Signed-off-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Acked-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
The string.split() function splits a quoted string if there is whitespace
inside the quotes.
For example, the following code snippet will output ['printing', '"No', 'Diagnostic"']
args = 'printing "No Diagnostic"'
print args.split()
The above is a problem if we run the following command through vtep_ctl().
vtep-ctl set tunnel $uuid bfd_status:diagnostic="No Diagnostic"
The workaround is to use the split() function from shlex module.
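A minimal sketch of the difference, using only the standard shlex module. The
value 1234 is a placeholder standing in for $uuid, and the variable names are
chosen here for illustration only:

```python
import shlex

args = 'printing "No Diagnostic"'
naive = args.split()        # splits on the space inside the quotes
quoted = shlex.split(args)  # keeps the quoted words together

# Arguments as they might be handed to vtep-ctl (1234 stands in for $uuid).
cmd = 'set tunnel 1234 bfd_status:diagnostic="No Diagnostic"'
cmd_args = shlex.split(cmd)
```

Note that shlex.split() also removes the quote characters, so the last
argument arrives as a single word: bfd_status:diagnostic=No Diagnostic.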
ovs-vtep is an emulator and it works only on one
physical switch. This switch name is stored in the variable
'ps_name' and then passed around. An upcoming commit requires
access to this variable at more places and it is easier if this
variable is global.
Before destroying a logical switch, clean up any leftover local
mac information in the Ucast_Macs_Local or Mcast_Macs_Local table.
We need to do this to at least clean up the 'unknown-dst' information
added in the Mcast_Macs_Local table while creating the Logical_Switch
class in setup_ls().
vtep-ctl: Add Tunnel table to vtep_ctl_table_class.
This is needed to create, get, set records in the Tunnel table.
(We need to add the Tunnel table's 'local' and 'remote' columns
that point to the Physical_Locator record to cache because vtep-ctl
commands like 'add-ucast-local' will try to add an entry in
Physical_Locator table based on the contents of the cache.)
Alex Wang [Thu, 25 Sep 2014 20:10:55 +0000 (13:10 -0700)]
netdev-dpdk: Fix crash when there is no pci numa info.
When the kernel cannot obtain the pci numa info, the numa_node file
in the corresponding pci directory in sysfs will show -1. Then the
rte_eth_dev_socket_id() function will return it to ovs. On
current master, ovs assumes rte_eth_dev_socket_id() always
returns a non-negative value. So using this -1 in pmd thread
creation will cause ovs to crash.
To fix the above issue, this commit makes ovs always check the
return value of rte_eth_dev_socket_id() and use numa node 0 if
the return value is negative.
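The fallback amounts to the following (a Python sketch of the logic only;
pick_numa_node is a hypothetical name, and socket_id stands for whatever
rte_eth_dev_socket_id() returned):

```python
def pick_numa_node(socket_id):
    """Fall back to NUMA node 0 when the reported socket id is negative."""
    return socket_id if socket_id >= 0 else 0
```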
Reported-by: Daniel Badea <daniel.badea@windriver.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
Alex Wang [Thu, 25 Sep 2014 18:40:24 +0000 (11:40 -0700)]
netdev: Fix error check.
Reported-by: Daniel Badea <daniel.badea@windriver.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
Simon Horman [Thu, 25 Sep 2014 11:57:52 +0000 (11:57 +0000)]
ofproto-dpif-rid: remove struct rid_map
struct rid_map only has one member which is a struct hmap.
This allows for a slight simplification of the code by removing
struct rid_map and using a struct hmap directly instead.
Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
Alex Wang [Thu, 25 Sep 2014 23:51:01 +0000 (23:51 +0000)]
rstp.at: Fix intermittent test failure.
Sub-test "RSTP - dummy interface" checks the ovs-vswitchd.log
output immediately after command execution. The check may
fail if the write of new log is delayed by the IO thread.
This commit fixes the above issue by waiting for the
ovs-vswitchd.log output.
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Gurucharan Shetty <gshetty@nicira.com>
Simon Horman [Wed, 24 Sep 2014 12:41:02 +0000 (12:41 +0000)]
ofproto-dpif-rid: correct logic error in rid_pool_alloc_id()
When searching through the valid ids, an id should
be used if it is not found rather than if it is found.
It appears to me that without this change duplicate recirculation
ids may be used in cases where the last recirculation id has
been allocated, selection loops back to the beginning of the pool, and
reaches a recirculation id that is still in use.
As the number of recirculation ids is currently RECIRC_ID_N_IDS = 1024 this
does not seem beyond the bounds of possibility.
I have not verified that such a scenario can actually occur. But it seems
that a likely consequence would be that some packets may be forwarded
incorrectly.
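The intended check can be sketched as follows. This is a toy Python model, not
the rid_pool_alloc_id() code itself; the class IdPool and its fields are
assumptions made for illustration:

```python
class IdPool:
    """Toy id pool: hands out ids in [base, base + n_ids), wrapping around."""

    def __init__(self, base, n_ids):
        self.base = base
        self.n_ids = n_ids
        self.next_id = base
        self.in_use = set()

    def alloc(self):
        """Return the next id that is NOT in use, or None if the pool is full."""
        for _ in range(self.n_ids):
            candidate = self.next_id
            # Advance the cursor, wrapping back to the start of the pool.
            self.next_id = self.base + (candidate - self.base + 1) % self.n_ids
            if candidate not in self.in_use:  # use it only if it is not found
                self.in_use.add(candidate)
                return candidate
        return None

    def free(self, id_):
        """Release an id so that a later alloc() may hand it out again."""
        self.in_use.discard(id_)
```

With the condition inverted (using the candidate when it is found), the
wrap-around case would hand out an id that is still allocated.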
Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
MSVC converts 64 bit read/writes into two instructions (uses 'mov' as
seen through cl //FAs). So there is a possibility that an interrupt can
make a 64 bit read/write non-atomic even when 8 byte aligned. So we cannot
use a simple assignment. Use a full memory barrier function instead.
Alex Wang [Thu, 18 Sep 2014 21:35:30 +0000 (14:35 -0700)]
bridge: Refactor the stats and status update.
This commit refactors the stats and status update in bridge_run()
by moving the corresponding code to separate functions. This
makes the code more organized.
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Thu, 18 Sep 2014 21:10:24 +0000 (14:10 -0700)]
bridge: Rate limit the statistics update.
When ovs is running with a large topology (e.g. a large number of
interfaces), the stats update to ovsdb becomes huge and normally
requires multiple runs of the ovsdb jsonrpc message processing loop to
consume.
To prevent the periodic stats update from backlogging in the
jsonrpc sending queue, this commit adds rate limiting logic
which only allows new update if the previous one is done.
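The rate limiting rule can be pictured like this (a Python sketch of the idea
only; StatsUpdater and its methods are hypothetical and do not correspond to
functions in bridge.c):

```python
class StatsUpdater:
    """Allow a new stats update only after the previous one has completed."""

    def __init__(self):
        self.in_flight = False
        self.sent = 0

    def try_send(self):
        """Start a new update unless one is still pending; report if it was sent."""
        if self.in_flight:
            return False  # previous update not done yet: skip this round
        self.in_flight = True
        self.sent += 1
        return True

    def complete(self):
        """Called once the pending update has been fully processed."""
        self.in_flight = False
```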
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com> Acked-by: Flavio Leitner <fbl@redhat.com>
OVS needs to segment a large skb before sending it to userspace for
miss packet handling, but skb_gso_segment uses
skb->cb. This corrupted OVS_CB, which resulted in a panic.
Alex Wang [Mon, 22 Sep 2014 22:34:12 +0000 (15:34 -0700)]
ovs-pki: Use SHA-1 instead of SHA-512 as message digest.
Commit 9ff33ca7 (ovs-pki: Use SHA-512 instead of MD5 as message
digest.) changes the message digest algorithm to SHA-512. This
seems to break the unit tests on some xenserver 5.6/6.0 builds
causing the error: "SSL_connect: error:0D0C50A1:asn1 encoding
routines:ASN1_item_verify:unknown message digest algorithm".
As a solution, this commit changes the message digest algorithm
to SHA-1 which works for both the above xenserver builds and
centos 7.
OVS keeps a pointer to the packet key in skb->cb, but the packet key is
stored on the stack. This can make the code a bit tricky, so it is better to
get rid of the pointer.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Samuel Gauthier [Sat, 20 Sep 2014 13:25:23 +0000 (06:25 -0700)]
datapath: restore OVS_FLOW_CMD_NEW notifications
Since commit fb5d1e9e127a ("openvswitch: Build flow cmd netlink reply only if needed."),
the new flows are not notified to the listeners of OVS_FLOW_MCGROUP.
This commit fixes the problem by using the genl function, i.e.
genl_has_listeners(), instead of netlink_has_listeners().
Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Li RongQing [Sat, 20 Sep 2014 12:10:03 +0000 (05:10 -0700)]
datapath: change the data type of error status to atomic_long_t
Change the data type of error status from u64 to atomic_long_t, and use atomic
operations, then remove the lock which is used to protect the error status.
Atomic operations may be faster than a spin lock.
Cc: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
This was required for old compatibility code which updated stats
on the fake bond interface. Now vswitchd has dropped it. This
support was always deprecated, so it is finally being removed.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
dpif-netdev: Fix (packet) memory leaks in the slow path.
If a packet didn't match a rule in the fast path classifier its memory was
never freed. The issue was particularly clear with DPDK devices because it was
not possible to process more than ~250000 DPDK mbufs in the slow path.
This commit fixes the problem by:
* calling dpif_packet_delete() if the upcalls are disabled
* passing may_steal==true to dp_netdev_execute_actions() during normal upcall
processing
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Ben Pfaff [Fri, 19 Sep 2014 23:17:09 +0000 (16:17 -0700)]
ovs-pki: Use SHA-512 instead of MD5 as message digest.
This fixes numerous testsuite failures of the form "SSL_connect:
error:0D0C50A1:asn1 encoding routines:ASN1_item_verify:unknown message
digest algorithm" on systems that disable MD5 in OpenSSL. Centos 7 is one
example. Presumably it increases security as well for anyone who generates
certificates based on a new configuration created by the new ovs-pki.
Reported-by: Robert Strickler <anomalyst@gmail.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
This commit adds the multithreading functionality to OVS dpdk
module. Users are able to create multiple pmd threads and set
their cpu affinity via specifying the cpu mask string similar
to the EAL '-c COREMASK' option.
Also, the number of rx queues for each dpdk interface is made
configurable to help distribution of rx packets among multiple
pmd threads.
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
Alex Wang [Mon, 23 Jun 2014 01:08:15 +0000 (18:08 -0700)]
ovs-numa: Add support for cpu-mask configuration.
This commit adds support in ovs-numa module for reading a user
specified cpu mask, which configures the availability of the cores.
The cpu mask has the format of a hex string similar to the EAL '-c
COREMASK' option input or the 'taskset' mask input. The lowest order
bit corresponds to the first CPU core. Bit value '1' means the
corresponding core is available.
An upcoming patch will allow the user to configure the mask via OVSDB.
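To make the mask format concrete, here is a small Python sketch (not the
ovs-numa code; available_cores is a hypothetical helper) that turns a hex mask
string into the list of available core ids, with the lowest order bit mapping
to core 0:

```python
def available_cores(mask):
    """Return the core ids whose bit is set in a hex cpu mask string."""
    value = int(mask, 16)  # accepts "f0" as well as "0xf0"
    cores = []
    core_id = 0
    while value:
        if value & 1:  # lowest-order bit is the first core
            cores.append(core_id)
        value >>= 1
        core_id += 1
    return cores
```

For example, the mask "5" (binary 101) marks cores 0 and 2 as available.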
Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
dpif-netlink: rename linux_flow variable to datapath_flow
In the flow related functions, there's a stack variable called
'linux_flow'. Since this code is not specific to Linux anymore,
in this patch, we rename the variable to 'datapath_flow'.
Signed-off-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
datapath-windows: add OVS_DP_CMD_SET and OVS_DP_CMD_GET transaction support
In this patch, we add support for two commands, both of which are issued
as part of transaction semantics from userspace:
1. OVS_DP_CMD_SET is used to get the properties of a DP as well as set
some properties. The set operation does not seem to make much sense for
the Windows datapath right now.
2. There's already support for OVS_DP_CMD_GET command issued via the
dump semantics from userspace. Turns out that userspace can issue
OVS_DP_CMD_GET as a transaction.
There's a lot of common code between these two commands, hence combining
the implementation and the review.
Also refactored some of the code in the implementation of dump-based
OVS_DP_CMD_GET, and updated some of the comments.
Validation:
- With this series of patches, I was able to run the following command:
> .\utilities\ovs-dpctl.exe show
system@ovs-system:
lookups: hit:0 missed:22 lost:0
flows: 0
- I got so far as to hit the PORT_DUMP command which is currently not
implemented.
Signed-off-by: Nithin Raju <nithin@vmware.com> Tested-by: Nithin Raju <nithin@vmware.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/38 Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
extract-odp-netlink-windows-dp-h: add definition of IFNAMSIZ
The Windows kernel datapath needs the definition of 'IFNAMSIZ' for
specifying attribute sizes in netlink policies. Adding the definition
of 'IFNAMSIZ' to be part of OvsDpInterface.h similar to ETH_ADDR_LEN.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
lib/netlink-socket.c: add support for nl_transact() on Windows
In this patch, we add support for nl_transact() on Windows using
the OVS_IOCTL_TRANSACT ioctl that sends down the request and gets
the reply in the same call to the kernel.
This is obviously a digression from the way it is implemented in
Linux where all the sends are done at once using sendmsg() and
replies are received one at a time.
Initial implementation was in the Linux way using multiple writes
followed by reads, but decided against it since it is not efficient
and also it complicates the state machine in the kernel.
The Windows implementation has equivalent code for handling corner
cases and error conditions similar to Linux. Some of it is not
applicable yet. E.g. the Windows kernel does not embed an error
in the netlink message itself. There's userspace code nevertheless
for this.
Signed-off-by: Nithin Raju <nithin@vmware.com> Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com> Acked-by: Eitan Eliahu <eliahue@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
datapath-windows: add OvsCompareString() to compare strings
In this patch we implement a utility function to compare ANSI
strings using the Rtl* functions. As much as possible, in an
NDIS driver, we stick to Rtl* functions for memory/string
manipulation.
Signed-off-by: Nithin Raju <nithin@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Fri, 19 Sep 2014 17:38:39 +0000 (10:38 -0700)]
netdev-dpdk: Fix a bug in netdev_dpdk_set_multiq().
Commit 5a0340 (dpif-netdev: Create multiple tx/rx queues when
adding dpdk interface.) introduced a bug which causes the function
netdev_dpdk_set_multiq() to never reset the tx queues. This bug
could cause a pmd thread to access unassigned memory, resulting in
a segfault.
This commit fixes the bug.
Reported-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
Alex Wang [Fri, 19 Sep 2014 00:02:17 +0000 (17:02 -0700)]
netdev-dpdk: Pass queue id to dpdk_do_tx_copy().
Since dpdk_do_tx_copy() will be called by both pmd and
non-pmd threads, it should take the queue id as input.
The current ovs always uses NON_PMD_THREAD_TX_QUEUE
as queue id, which causes unprotected multi-access
to the same queue.
This commit fixes the issue by passing the queue id
to dpdk_do_tx_copy().
Reported-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>
datapath-windows: NetLink kernel side, Event subscription and notification
This code handles an event notification subscription for a user mode thread
which joins an MC group. The event wait handler queues an IRP which is
completed upon change in a port state.
Signed-off-by: Eitan Eliahu <eliahue@vmware.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Nithin Raju <nithin@vmware.com> Acked-by: Samuel Ghinet <sghinet@cloudbasesolutions.com>
Ethan Jackson [Wed, 17 Sep 2014 20:22:14 +0000 (13:22 -0700)]
ofproto: Warn about excessive rule counts in OpenFlow tables.
Frequently we've run into controller bugs which result in hundreds of
thousands, or even millions of rules being installed in an OpenFlow
table. This isn't something trouble-shooters naturally think of to
check for, so it's nice to have a low rate warning message to hint at
the potential problem.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
dpif-netdev: Store miniflow length in exact match cache
This optimization is done to avoid calling count_1bits(), which, if
the popcnt instruction is not available, might be slow. popcnt may not
be available because:
- We are running on old hardware
- (more likely) We're using a generic build (i.e. packaged OVS from a
distro), not tuned for the specific CPU
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
netdev_flow_key is a miniflow with the following constraints:
1) It is used only inside dpif-netdev.c.
2) It always has inline values.
3) It contains only miniflows created by miniflow_extract().
Therefore, by using these new functions instead of the miniflow_*
ones, we get the following (performance related) benefits:
- Because of (1) the functions can be inlined.
- Because of (2) and (3) the netdev_flow_key can be treated as POD.
Specifically, because of (3), we can do comparisons with memcmp,
since if the map is different the miniflow must be different.
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>