openvswitch: Find existing conntrack entry after upcall.
Add a new function ovs_ct_find_existing() to find an existing
conntrack entry to which this packet was already applied. This is
only to be called when there is evidence that the packet was already
tracked and committed, but we lost the ct reference due to a
userspace upcall.
ovs_ct_find_existing() is called from skb_nfct_cached(), which can now
hide the fact that the ct reference may have been lost due to an
upcall. This allows ovs_ct_commit() to be simplified.
This patch is needed by the later "openvswitch: Interface with NAT" patch,
as we need to be able to pass the packet through NAT using the
original ct reference even after the reference is lost in an
upcall.
openvswitch: Update the CT state key only after nf_conntrack_in().
Only a successful nf_conntrack_in() call can effect a connection state
change, so it suffices to update the key only after the
nf_conntrack_in() returns.
Remove the definition of IP_CT_NEW_REPLY from the kernel as it does
not make sense. This allows the definition of IP_CT_NUMBER to be
simplified as well.
Jarno Rajahalme [Tue, 21 Jun 2016 01:51:06 +0000 (18:51 -0700)]
acinclude: Add OVS_FIND_PARAM_IFELSE.
OVS_FIND_PARAM_IFELSE is a more robust macro for checking function
parameters, as it does not require the parameter to be on the same
line as the function name, as OVS_GREP_IFELSE does.
Use this to fix the check for struct conntrack_zone parameter, which
is on a different line on Linux 4.3 and higher.
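The difference between a same-line grep and a search over the whole parameter list can be sketched in Python. This is only an illustration of the idea, not the m4 macro itself; the helper names, the function name example_alloc and the header excerpt are all made up for the example.

```python
import re

# Hypothetical header excerpt: the parameter of interest sits on the line
# after the function name.
HEADER = """
struct thing *example_alloc(struct net *net,
                            const struct nf_conntrack_zone *zone,
                            gfp_t flags);
"""


def grep_same_line(text, func, param):
    """Line-oriented check: func and param must share one line."""
    return any(func in line and param in line for line in text.splitlines())


def find_param(text, func, param):
    """Search the whole parameter list, which may span several lines."""
    match = re.search(re.escape(func) + r"\s*\(([^)]*)\)", text)
    return bool(match) and param in match.group(1)
```

With this input the line-oriented check misses the parameter, while the parameter-list search finds it.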
Ciara Loftus [Mon, 13 Jun 2016 10:10:09 +0000 (11:10 +0100)]
netdev-dpdk: NUMA Aware vHost User
This commit allows for vHost User memory from QEMU, DPDK and OVS, as
well as the servicing PMD, to all come from the same socket.
The socket id of a vhost-user port used to be set to that of the master
lcore. Now it is possible to update the socket id if it is detected
(during VM boot) that the vhost device memory is not on this node. If
this is the case, a new mempool is created from the new node, and the
PMD thread currently servicing the port will stop doing so, in favour of a
thread from the new node (if enabled in the pmd-cpu-mask).
To avail of this functionality, one must enable the
CONFIG_RTE_LIBRTE_VHOST_NUMA DPDK configuration option.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Nithin Raju [Fri, 17 Jun 2016 17:51:52 +0000 (10:51 -0700)]
datapath-windows: use ip proto for tunnel port lookup
In Actions.c, based on the IP Protocol type and L4 port of
the outer packet, we look up the tunnel port. The function
that made this happen took the tunnel type as an argument.
Semantically, it is better to pass the IP protocol type and
let the lookup code map IP protocol type to tunnel type.
In the vport add code, we make sure that we block tunnel
port addition if there's already a tunnel port that uses
the same IP protocol type and L4 port number.
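A minimal Python sketch of the lookup and the add-time check described above. The class, its method names and the protocol-to-tunnel mapping are assumptions made for illustration; they are not the names used in the Windows datapath.

```python
# IANA protocol numbers; the mapping to tunnel types is an assumption
# made for this sketch.
IPPROTO_TCP, IPPROTO_UDP, IPPROTO_GRE = 6, 17, 47
PROTO_TO_TUNNEL = {IPPROTO_TCP: "stt", IPPROTO_UDP: "vxlan", IPPROTO_GRE: "gre"}


class TunnelPorts:
    """Hypothetical table of tunnel ports keyed by (IP protocol, L4 port)."""

    def __init__(self):
        self._by_key = {}

    def add(self, ip_proto, l4_port, name):
        """Refuse a second tunnel port on the same protocol and L4 port."""
        key = (ip_proto, l4_port)
        if key in self._by_key:
            raise ValueError("tunnel port already exists for %r" % (key,))
        self._by_key[key] = (PROTO_TO_TUNNEL[ip_proto], name)

    def lookup(self, ip_proto, l4_port):
        """The caller passes the IP protocol; the tunnel type is derived here."""
        return self._by_key.get((ip_proto, l4_port))
```

The caller never names a tunnel type; it is derived from the IP protocol inside the table.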
Benli Ye [Tue, 14 Jun 2016 08:53:34 +0000 (16:53 +0800)]
ipfix: Support tunnel information for Flow IPFIX.
Add support to export tunnel information for flow-based IPFIX.
The original steps to configure flow level IPFIX:
1) Create a new record in Flow_Sample_Collector_Set table:
'ovs-vsctl -- create Flow_Sample_Collector_Set id=1 bridge="Bridge UUID"'
2) Add IPFIX configuration which is referred by corresponding
row in Flow_Sample_Collector_Set table:
'ovs-vsctl -- set Flow_Sample_Collector_Set
"Flow_Sample_Collector_Set UUID" ipfix=@i -- --id=@i create IPFIX
targets=\"IP:4739\" obs_domain_id=123 obs_point_id=456
cache_active_timeout=60 cache_max_flows=13'
3) Add sample action to the flows:
'ovs-ofctl add-flow mybridge in_port=1,
actions=sample'('probability=65535,collector_set_id=1,
obs_domain_id=123,obs_point_id=456')',output:3'
NXAST_SAMPLE action was used in step 3. In order to support exporting tunnel
information, this patch adds the NXAST_SAMPLE2 action. With NXAST_SAMPLE2,
step 3 should be configured like below:
'ovs-ofctl add-flow mybridge in_port=1,
actions=sample'('probability=65535,collector_set_id=1,obs_domain_id=123,
obs_point_id=456,sampling_port=3')',output:3'
'sampling_port' can be equal to ingress port or one of egress ports. If sampling
port is equal to output port and the output port is a tunnel port,
OVS_USERSPACE_ATTR_EGRESS_TUN_PORT will be set in the datapath flow sample action.
When flow sample action upcall happens, tunnel information will be retrieved from
the datapath and then IPFIX can export egress tunnel port information. If
sampling_port=65535 (OFPP_NONE), flow-based IPFIX will keep the same behavior
as before.
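The sampling_port rule above can be sketched as a small Python function. The function name and the tunnel_ports argument are hypothetical; only the value of OFPP_NONE comes from the text.

```python
OFPP_NONE = 65535  # value given in the commit message


def egress_tunnel_port(sampling_port, output_port, tunnel_ports):
    """Return the port to report as the egress tunnel port, or None.

    tunnel_ports is a hypothetical set of OpenFlow port numbers that are
    tunnel ports on the bridge.
    """
    if sampling_port == OFPP_NONE:
        return None  # same behavior as before the patch
    if sampling_port == output_port and output_port in tunnel_ports:
        return output_port
    return None
```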
This patch mainly does four things:
1) Add a new flow sample action NXAST_SAMPLE2 to support exporting
tunnel information. NXAST_SAMPLE2 action has a newly added field
'sampling_port'.
2) Use 'other_config:enable-tunnel-sampling' to enable or disable
exporting tunnel information.
3) If 'sampling_port' is equal to output port and output port is a tunnel
port, the translation of OpenFlow "sample" action should first emit
set(tunnel(...)), then the sample action itself. It makes sure the
egress tunnel information can be sampled.
4) Add a test of flow-based IPFIX for tunnel set.
How to test flow-based IPFIX:
1) Set up a test environment with two Linux hosts with Docker support
2) Create a Docker container and a GRE tunnel port on each host
3) Use ovs-docker to add the container on the bridge
4) Listen on port 4739 on the collector machine and use wireshark to filter
'cflow' packets.
5) Configure flow-based IPFIX:
- 'ovs-vsctl -- create Flow_Sample_Collector_Set id=1 bridge="Bridge UUID"'
- 'ovs-vsctl -- set Flow_Sample_Collector_Set
"Flow_Sample_Collector_Set UUID" ipfix=@i -- --id=@i create IPFIX \
targets=\"IP:4739\" cache_active_timeout=60 cache_max_flows=13 \
other_config:enable-tunnel-sampling=true'
- 'ovs-ofctl add-flow mybridge in_port=1,
actions=sample'('probability=65535,collector_set_id=1,obs_domain_id=123,
obs_point_id=456,sampling_port=3')',output:3'
Note: The in_port is the container port. The output port and sampling_port
are both OpenFlow ports, and the output port is a GRE tunnel port.
6) Ping from the container on the host with flow-based IPFIX enabled.
7) Get the IPFIX template packets and IPFIX information packets.
Signed-off-by: Benli Ye <daniely@vmware.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Kevin Traynor [Fri, 10 Jun 2016 16:49:38 +0000 (17:49 +0100)]
netdev-dpdk: Remove vhost send retries when no packets have been sent.
If the guest is connected but not servicing the virt queue, this leads
to vhost send retries until timeout. This is fine in isolation but if
there are other high rate queues also being serviced by the same PMD
it can lead to a performance hit on those queues. Change to only retry
when at least some packets have been successfully sent on the previous
attempt.
Also, limit retries to avoid similar delays if packets are being sent
at a very low rate due to few available descriptors.
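The retry policy can be sketched as follows. This is a Python illustration under stated assumptions, not the netdev-dpdk code: send_burst is a hypothetical callable that returns how many packets from the front of the list were accepted, and the cap of 8 is invented because the commit message does not give the value.

```python
MAX_RETRIES = 8  # hypothetical cap; the commit message does not state it


def send_with_retries(packets, send_burst, max_retries=MAX_RETRIES):
    """Send packets, retrying only while the previous attempt made progress.

    Returns the number of packets that were left unsent (dropped).
    """
    remaining = list(packets)
    retries = 0
    while remaining:
        sent = send_burst(remaining)
        remaining = remaining[sent:]
        if not remaining:
            break
        # Stop when the guest accepted nothing, or when the cap is reached.
        if sent == 0 or retries >= max_retries:
            break
        retries += 1
    return len(remaining)
```

A guest that accepts nothing costs one attempt, and a guest that accepts one packet per attempt is bounded by the cap.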
Ben Pfaff [Mon, 13 Jun 2016 21:53:01 +0000 (14:53 -0700)]
ofp-util: Fix parsing of parenthesized values within key-value pairs.
Reported-by: james hopper <jameshopper@email.com>
Reported-at: http://openvswitch.org/pipermail/discuss/2016-June/021662.html Signed-off-by: Ben Pfaff <blp@ovn.org>
Paul Boca [Wed, 8 Jun 2016 08:40:34 +0000 (08:40 +0000)]
ovs-ofctl: Fixed PID file naming on windows
On Windows if a relative file name is given to --pidfile parameter
(not containing ':') then the application name is used for PID file,
ignoring the given name.
Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Sorin Vinturis [Wed, 1 Jun 2016 15:50:27 +0000 (15:50 +0000)]
datapath-windows: Sample action support.
This patch adds support for sampling to the OVS extension.
The following flow was used for generating sample actions:
ovs-ofctl add-flow tcp:127.0.0.1:9999 "actions=sample(
probability=12345,collector_set_id=23456,obs_domain_id=34567,
obs_point_id=45678)"
Signed-off-by: Sorin Vinturis <svinturis@cloudbasesolutions.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Benli Ye [Tue, 14 Jun 2016 03:09:45 +0000 (11:09 +0800)]
ipfix: Bug fix for not sending template packets on 32-bit OS
'last_template_set_time' in struct dpif_ipfix_exporter is declared
as time_t, and time_t is a long int. 'last_template_set_time' was
initialized to TIME_MIN, whose value is -2147483648 on a 32-bit OS
and -2^63 on a 64-bit OS. This is a problem on a 32-bit OS when
comparing 'last_template_set_time' with an unsigned int variable,
because the implicit conversion turns the negative value into a
large positive number. Fix this problem by simply initializing
'last_template_set_time' to 0.
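The effect of the implicit conversion can be reproduced in Python by masking to 32 bits. The comparison in template_due is an assumed shape of the check, written only to show the arithmetic; the function names are hypothetical.

```python
TIME_MIN_32BIT = -2**31  # TIME_MIN when time_t is a 32-bit long


def to_uint32(value):
    """Mimic C's conversion of an integer to a 32-bit unsigned int."""
    return value & 0xFFFFFFFF


def template_due(last_template_set_time, refresh_secs, now_secs):
    """Hypothetical 32-bit check with both sides compared as unsigned."""
    return to_uint32(last_template_set_time + refresh_secs) <= to_uint32(now_secs)
```

Starting from TIME_MIN_32BIT the left-hand side becomes a value above 2^31, so the check stays false for any current timestamp below that; starting from 0 it becomes true once the refresh interval has passed.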
Signed-off-by: Benli Ye <daniely@vmware.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: William Tu <u9012063@gmail.com>
Benli Ye [Mon, 13 Jun 2016 21:44:09 +0000 (14:44 -0700)]
ipfix: Add support for exporting ipfix statistics.
It is useful for users to check the stats of IPFIX.
Using IPFIX stats, a user can know how many flows the system
can support. The stats can also be used for performance checks
of IPFIX.
IPFIX stats are kept per IPFIX exporter. If bridge IPFIX is
enabled on the bridge, the whole bridge will have one exporter.
For flow IPFIX, the system keeps one exporter per id (column in
Flow_Sample_Collector_Set).
1) Add 'ovs-ofctl dump-ipfix-bridge SWITCH' to export IPFIX stats of
the bridge which enables bridge IPFIX. The output format:
NXST_IPFIX_BRIDGE reply (xid=0x2):
bridge ipfix: flows=0, current flows=0, sampled pkts=0, \
ipv4 ok=0, ipv6 ok=0, tx pkts=0
pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0
2) Add 'ovs-ofctl dump-ipfix-flow SWITCH' to export IPFIX stats of
the bridge which enables flow IPFIX. The output format:
NXST_IPFIX_FLOW reply (xid=0x2): 2 ids
id 1: flows=4, current flows=4, sampled pkts=14, ipv4 ok=13, \
ipv6 ok=0, tx pkts=0
pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0
id 2: flows=0, current flows=0, sampled pkts=0, ipv4 ok=0, \
ipv6 ok=0, tx pkts=0
pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=0
flows: the total number of flow records, including those exported.
current flows: the number of flow records currently cached.
sampled pkts: the count of successfully sampled packets.
ipv4 ok: the count of successfully sampled IPv4 flow packets.
ipv6 ok: the count of successfully sampled IPv6 flow packets.
tx pkts: the count of IPFIX exported packets sent to the collector(s).
pkts errs: the count of packets that failed sampling, because they are not supported or due to another error.
ipv4 errs: the count of IPv4 flow packets among the error packets.
ipv6 errs: the count of IPv6 flow packets among the error packets.
tx errs: the count of IPFIX exported packets that failed to be sent to the collector(s).
Signed-off-by: Benli Ye <daniely@vmware.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Fri, 10 Jun 2016 22:19:03 +0000 (15:19 -0700)]
ovs-vsctl: Support identifying Flow_Sample_Collector_Set records by id.
This allows commands like
ovs-vsctl list Flow_Sample_Collector_Set 123
if there's a record with id 123. It's not perfect, since there can be
more than one record with the same id, but it's helpful.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
Jarno Rajahalme [Mon, 13 Jun 2016 21:22:32 +0000 (14:22 -0700)]
netlink-notifier: Support multiple groups.
A netlink notifier ('nln') already supports multiple notifiers. This
patch allows each of these notifiers to subscribe to a different
multicast group. Sharing a single socket for multiple event types
(each on their own multicast group) provides serialization of events
when reordering of different event types could be problematic. For
example, if a 'create' event and 'delete' event are on different
netlink multicast groups, we may want to process those events in the
order in which kernel issued them, rather than in the order we happen
to check for them.
Moving the multicast group argument from nln_create() to
nln_notifier_create() allows each notifier to specify a different
multicast group. The parse callback needs to identify the group the
message belonged to by returning the corresponding group number, or 0
when a parse error occurs.
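A toy Python model of the new arrangement: one shared message stream, notifiers registered per group, and a parse callback that returns the group number or 0. The class and method names only echo nln_create() and nln_notifier_create(); they are not the C API, and skipping unparseable messages is a simplification of this sketch.

```python
class Nln:
    """Toy stand-in for a netlink notifier sharing one socket."""

    def __init__(self, parse):
        # parse(msg) returns the multicast group number, or 0 on a parse error.
        self._parse = parse
        self._notifiers = []

    def notifier_create(self, group, callback):
        """Subscribe callback to one multicast group."""
        self._notifiers.append((group, callback))

    def run(self, messages):
        """Dispatch messages in arrival order, whatever their group."""
        for msg in messages:
            group = self._parse(msg)
            if group == 0:
                continue  # this sketch just skips unparseable messages
            for subscribed, callback in self._notifiers:
                if subscribed == group:
                    callback(msg)
```

Because all groups are drained from one stream, a 'create' followed by a 'delete' reaches the callbacks in that order.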
Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Jesse Gross [Sat, 28 May 2016 16:56:07 +0000 (09:56 -0700)]
dpif-netdev: Print installed flows in dpif format.
When debug logging is enabled, dpif-netdev can print each flow as it is
installed, which it currently does using OpenFlow match formatting. Compared
to ODP formatting, there generally isn't too much difference since the
fields are largely the same but it is inconsistent with other logging in
dpif-netdev as well as the analogous functions that deal with the kernel.
However, in some cases there is a difference between the two formats, such
as in the cases of input port or tunnel metadata. For input port, datapath
format helped detect that the generated masks were incorrect. As for tunnels,
at the moment, it's possible to convert between the two formats on demand as
we have a global metadata table. In the future, though, this won't be possible
as the metadata table becomes per-bridge, which the datapath won't have access
to.
Signed-off-by: Jesse Gross <jesse@kernel.org> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Jesse Gross [Thu, 9 Jun 2016 20:32:50 +0000 (13:32 -0700)]
odp-util: Remove odp_in_port from struct odp_flow_key_parms.
When calling odp_flow_key_from_flow (or _mask), the in_port included
as part of the flow is ignored and must be explicitly passed as a
separate parameter. This is because the assumption was that the flow's
version would often be in OFP format, rather than ODP.
However, at this point all flows that are ready for serialization in
netlink format already have their in_port properly set to ODP format.
As a result, every caller needs to explicitly initialize the extra
parameter to the value that is in the flow. This switches to just use
the value in the flow to simplify things and avoid the possibility of
forgetting to initialize the extra parameter.
Signed-off-by: Jesse Gross <jesse@kernel.org> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Jesse Gross [Thu, 9 Jun 2016 20:18:45 +0000 (13:18 -0700)]
ofproto-dpif-upcall: Translate input port as part of upcall translation.
When we generate wildcards for upcalled flows, the flows and therefore
the wildcards, are in OpenFlow format. These are mostly the same but
one exception is the input port. We work around this problem by simply
performing an exact match on the input port when generating netlink
formatted keys. (This does not lose any information in practice because
action translation also always exact matches on input port.)
While this works fine for kernel based flows, it misses the userspace
datapath, which directly consumes the OFP format mask for the input
port. The effect of this is that the in_port mask is sometimes only
the lower 16 bits of the field. (This is because OFP format is a 16-bit
value stored in a 32-bit field. The full width of the field is initialized
with an exact match mask but certain operations result in cleaving this
down to 16 bits.) In practice this does not cause a problem because datapath
port numbers are almost always in the lower 16 bits of the range anyways.
This moves the masking of the datapath format field to translation so that
all datapaths see the same result. This also makes more sense conceptually
as the input port in the flow is also in ODP format at this stage.
Signed-off-by: Jesse Gross <jesse@kernel.org> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Justin Pettit [Thu, 9 Jun 2016 00:15:02 +0000 (17:15 -0700)]
ovn-nbctl: Update logical switch commands.
A few minor changes related to logical switch commands:
- Use "ls" instead of "lswitch" to be more consistent with other
command changes.
- Use commands where possible in ovn unit tests.
- Update references from "lswitch" to "ls" (code) or "switch" (user).
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Tue, 7 Jun 2016 23:43:34 +0000 (16:43 -0700)]
ovn-nbctl: Update logical switch port commands.
A few minor changes related to logical switch port commands:
- Use "lsp" instead of "lport" to be more consistent with later
changes.
- Use commands where possible in ovn unit tests.
- Update references from "lport" to "lsp" (code) or "port" (user).
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Wed, 18 May 2016 00:56:12 +0000 (17:56 -0700)]
ovn-nbctl: Update logical router port commands.
A few minor changes related to logical router port commands:
- Use "lrp" instead of "lrport" to be more consistent with later
changes.
- Use commands where possible in ovn unit tests.
- Move documentation to group router commands together.
- Adds mac/network/peer to lrp-add command. The existing command
doesn't require creating a mac or network address, which shouldn't
be possible.
- Drops lrport-[get|set]-mac-addresses commands in favor of
initializing them in lrp-add command.
- Update references from "lrport" to "lrp" (code) or "port" (user).
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com> Acked-by: Ben Pfaff <blp@ovn.org>
Justin Pettit [Tue, 17 May 2016 13:39:46 +0000 (06:39 -0700)]
ovn-nbctl: Update basic router commands.
A few minor changes related to router commands:
- Use "lr" instead of "lrouter" to be more consistent with later
changes.
- Use the commands where possible in ovn unit tests.
- Move documentation to group router commands together.
- Update references from "lrouter" to "router".
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org> Acked-by: Ryan Moats <rmoats@us.ibm.com>
openvswitch: use flow protocol when recalculating ipv6 checksums
When using masked actions, the ipv6_proto field of an action
to set IPv6 fields may be zero rather than the prevailing protocol,
which will result in skipping checksum recalculation.
This patch resolves the problem by relying on the protocol
in the flow key rather than that in the set field action.
Fixes: 83d2b9ba1abc ("net: openvswitch: Support masked set actions.") Cc: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
sparse: Fix conflict between netinet/in.h and linux/in.h
linux/in.h (from linux uapi headers) carries many of the same
definitions as netinet/in.h (from glibc).
If linux/in.h is included after netinet/in.h, conflicts are avoided in
two ways:
1) linux/libc-compat.h (included by linux/in.h) detects the include
guard of netinet/in.h and defines some macros (e.g.
__UAPI_DEF_IN_IPPROTO) to 0. linux/in.h avoids exporting the same
enums if those macros are 0.
2) The two files are allowed to redefine the same macros as long as the
values are the same.
include/sparse/netinet/in.h defeats both mechanisms:
1) It uses a custom include guard
2) It uses dummy values for some macros.
This commit changes include/sparse/netinet/in.h to use the same include
guard as glibc netinet/in.h, and to use the same values for some macros.
I think this problem is present with linux headers after a263653ed798
("netfilter: don't pull include/linux/netfilter.h from netns
headers"), which causes our lib/netlink-conntrack.c to include linux/in.h
after netinet/in.h.
sample output from sparse:
/usr/include/linux/in.h:29:9: warning: preprocessor token IPPROTO_IP
redefined
../include/sparse/netinet/in.h:60:9: this was the original definition
/usr/include/linux/in.h:31:9: warning: preprocessor token IPPROTO_ICMP
redefined
../include/sparse/netinet/in.h:63:9: this was the original definition
[...]
/usr/include/linux/in.h:28:3: error: bad enum definition
/usr/include/linux/in.h:28:3: error: Expected } at end of specifier
/usr/include/linux/in.h:28:3: error: got 0
/usr/include/linux/in.h:84:16: error: redefinition of struct in_addr
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Tested-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Terry Wilson [Wed, 8 Jun 2016 13:55:14 +0000 (08:55 -0500)]
Add optional C extension wrapper for Python JSON parsing
The pure Python in-tree JSON parser is *much* slower than the
in-tree C JSON parser. A local test parsing a 100Mb JSON file
showed the Python version taking 270 seconds. With the C wrapper,
it took under 4 seconds.
The C extension will be used automatically if it can be built. If
the extension fails to build, a warning is displayed and the build
is restarted without the extension.
The Serializer class is replaced with Python's built-in
JSON library since the ability to process chunked data is not
needed in that case.
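One common way to get "use the C extension if it is there" behavior at import time is a guarded import. The sketch below is an assumption about how such a fallback can look, not the code from this commit; the module name _example_fast_json is hypothetical, and the fallback here is Python's built-in json module.

```python
try:
    # Hypothetical name for a compiled extension exposing loads().
    import _example_fast_json as json_backend
    USING_C_EXTENSION = True
except ImportError:
    # Fall back to the standard library when the extension is missing.
    import json as json_backend
    USING_C_EXTENSION = False


def parse(text):
    """Parse a JSON document with whichever backend was importable."""
    return json_backend.loads(text)
```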
The extension should work with both Python 2.7 and Python 3.3+.
Signed-off-by: Terry Wilson <twilson@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Terry Wilson [Wed, 8 Jun 2016 13:55:13 +0000 (08:55 -0500)]
Ensure significand remains an integer in Python3 json parser
The / operation in Python 2 is "floor division" for int/long types
while in Python 3 it is "true division". This means that the
significand can become a float with the existing code in Python 3.
This, in turn, can result in a parse of something like [1.10e1]
returning 11 in Python 2 and 11.0 in Python 3. Switching to the
// operator resolves this difference.
The JSON tests do not catch this difference because the built-in
serializer prints floats with the %.15g format which will convert
floats with no fractional part to an integer representation.
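Both effects are easy to see in Python 3. The variable names below are chosen for the example and do not come from the parser.

```python
significand = 110

true_div = significand / 10    # a float in Python 3
floor_div = significand // 10  # an int in Python 2 and Python 3

# %.15g prints a float with no fractional part like an integer, which is
# why a serializer using this format hides the difference.
formatted_float = "%.15g" % true_div
formatted_int = "%.15g" % floor_div
```

Here true_div is the float 11.0 and floor_div is the int 11, yet both format to the same string.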
Signed-off-by: Terry Wilson <twilson@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
ofproto-dpif-upcall: Prevent memory leak on log message.
When DPIF does not support UFID (like old kernels), it may print this
message quite frequently, if using an OVS version that does not include
the upstream fix af50de800ecb ("ofproto-dpif-upcall: Pass key to
dpif_flow_get().").
Fixes: 64bb477f0568 ("dpif: Minimize memory copy for revalidation.") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Joe Stringer [Tue, 24 May 2016 01:20:31 +0000 (18:20 -0700)]
xenserver: Remove deprecated print statement.
PEP 3105 removed the print statement in favour of a print function.
Replace usage of the old statement with equivalent functionality that
works in both python2.7 and python3.
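One portable form, shown here as an assumption about the kind of replacement meant, is to call print with a single pre-formatted string. The function name and message are invented for the example.

```python
def report_vif(name, ofport):
    # Python 2 statement form, a syntax error in Python 3:
    #     print "vif %s on port %d" % (name, ofport)
    # A single parenthesized argument prints the same text in Python 2.7
    # and Python 3.
    print("vif %s on port %d" % (name, ofport))
```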
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Tue, 24 May 2016 01:20:29 +0000 (18:20 -0700)]
xenserver: Remove tuple unpacking in lambdas.
PEP 3113 removed the use of tuple parameter unpacking in conjunction
with lambdas. Replace this code with something that works in python2.7
and python3.
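A small illustration of the rewrite, using made-up data rather than the xenserver code:

```python
ports = {"eth1": 2, "eth0": 1}

# Python 2 only (tuple parameter unpacking, removed by PEP 3113):
#     sorted(ports.items(), key=lambda (name, ofport): ofport)
# Portable form: take one argument and index into it.
by_ofport = sorted(ports.items(), key=lambda item: item[1])
```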
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Joe Stringer [Tue, 24 May 2016 01:20:26 +0000 (18:20 -0700)]
xenserver: Sort vsctl port options.
In python3, dictionaries are less likely to be sorted consistently from
one run to the next, so sort port options when outputting to provide
reliable test results.
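A sketch of the idea with a hypothetical helper; the option names are invented for the example:

```python
def format_port_options(options):
    """Render key=value pairs in sorted key order for stable output."""
    return " ".join("%s=%s" % (key, value) for key, value in sorted(options.items()))
```

Sorting the items makes the output independent of the order in which the dictionary was built.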
Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
This option is used to initialize the ovs_numa module with a fake
configuration and to avoid pthread_setaffinity_np() calls. It will be
useful to test dpif-netdev with pmd threads.
Since it is only used for testing it is not documented in the man pages.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
ovs-numa: Introduce function to set current thread affinity.
This commit moves the code that sets the pmd threads affinity from
netdev-dpdk to ovs-numa. There's one small part left in netdev-dpdk, to
set the lcore_id.
Now dpif-netdev will call both modules (ovs-numa and netdev-dpdk) when
starting a pmd thread.
This change will allow having a dummy implementation of the set affinity
call, for testing purposes.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
Instead of having static inline stubs for non linux platform we can use
the implementations in ovs-numa.c. With one small change to
ovs_numa_dump_cores_on_numa(), they will behave exactly like the
stubs for the non-linux case, because 'found_numa_and_core' will be
false and the socket and cpu hmaps will be empty.
There are a few places where conditional compilation is required: the
code that parses the linux specific sysfs entries and its dependencies.
It requires opendir() and readdir() and doesn't make sense outside of
linux anyway.
This change is required to have a cross-platform ovs-numa dummy
implementation for testing.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
Paul Boca [Mon, 6 Jun 2016 16:45:00 +0000 (16:45 +0000)]
datapath-windows: Improved offloading on STT tunnel
* Added OvsExtractLayers, which populates only the layers field without
  unnecessary memory operations for the flow part.
* If the flags in the STT header are 0, force packet checksum calculation
  on receive.
* Ensure the correct pseudo checksum is set for LSO on both send and receive.
  Linux includes the segment length in the TCP pseudo-checksum, conforming to
  RFC 793, but in the case of LSO Windows expects it to cover only the
  Source IP Address, Destination IP Address, and Protocol.
* Fragment expiration on the rx side of STT was set to 30 seconds, but the
  correct timeout is the TTL of the packet.
Russell Bryant [Thu, 2 Jun 2016 19:53:46 +0000 (15:53 -0400)]
INSTALL.md: Note use of "hacking" flake8 plugin.
The automatic flake8 check that runs against Python code has some
warnings enabled that come from the "hacking" flake8 plugin. If it's
not installed, the warnings just won't occur until it's run on a system
with "hacking" installed.
Signed-off-by: Russell Bryant <russell@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Ilya Maximets [Fri, 27 May 2016 13:32:53 +0000 (16:32 +0300)]
netdev-dummy: Add multiqueue support to dummy-pmd.
All previous multi-open logic is preserved for rx queues.
Also, add a new optional parameter '--qid' for 'netdev-dummy/receive'
to let the user choose the id of the rx queue to which the packet will
be sent.
Benli Ye [Fri, 27 May 2016 15:32:40 +0000 (23:32 +0800)]
ipfix: Bug fix for configuring IPFIX for flows
There are two kinds of IPFIX: bridge level IPFIX and flow level
IPFIX. Now if we only configure flow level IPFIX, even if there
is no bridge IPFIX configuration, the datapath flow will contain
a sample action for bridge IPFIX. Fix it.
Steps to configure flow level IPFIX:
1) Create a new record in Flow_Sample_Collector_Set table:
'ovs-vsctl -- create Flow_Sample_Collector_Set id=1 bridge="Bridge UUID"'
2) Add IPFIX configuration which is referred by corresponding
row in Flow_Sample_Collector_Set table:
'ovs-vsctl -- set Flow_Sample_Collector_Set
"Flow_Sample_Collector_Set UUID" ipfix=@i -- --id=@i create IPFIX
targets=\"IP:4739\" obs_domain_id=123 obs_point_id=456
cache_active_timeout=60 cache_max_flows=13'
3) Add sample action to the flows:
'ovs-ofctl add-flow mybridge in_port=1,
actions=sample'('probability=65535,collector_set_id=1,
obs_domain_id=123,obs_point_id=456')',output:LOCAL'
Before this fix, if you only configure flow IPFIX, the datapath flow is:
id(0),in_port(2),eth_type(0x0806), packets:0, bytes:0, used:never,
actions:sample(sample=0.0%,actions(userspace(pid=4294960835,
ipfix(output_port=4294967295)))),sample(sample=100.0%,
actions(userspace(pid=4294960835,flow_sample(probability=65535,
collector_set_id=1,obs_domain_id=123,obs_point_id=456)))),
sample(sample=0.0%,actions(userspace(pid=4294960835,
ipfix(output_port=1)))),1
The datapath flow should only contain the sample action like below:
id(0),in_port(2),eth_type(0x0800),ipv4(frag=no), packets:9, bytes:871,
used:0.656s, actions:sample(sample=100.0%,actions(userspace(pid=4294962911,
flow_sample(probability=65535,collector_set_id=1,obs_domain_id=123,
obs_point_id=456)))),1
Signed-off-by: Benli Ye <daniely@vmware.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Fri, 27 May 2016 00:02:38 +0000 (17:02 -0700)]
tests: Avoid endianness sensitivity in MPLS handling test.
The test "ofproto-dpif - MPLS handling" included a test of the "multipath"
action whose results depended on the hash function in use. The OVS hash
function yields different results on little-endian and big-endian systems,
so this caused a failure.
This commit fixes the problem by changing the modulus in the multipath
action from 256 to 1; any (nonnegative) value modulo 1 is 0, so this makes
the results consistent across endianness (and across hash function
changes). I think that this is still a good enough test.
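The reasoning can be checked with a toy endian-sensitive hash. toy_hash is a stand-in invented for this sketch and has nothing to do with the OVS hash function.

```python
def toy_hash(data, byteorder):
    """Stand-in for an endian-sensitive hash: read 4 bytes natively."""
    return int.from_bytes(data[:4], byteorder)


def multipath_link(data, byteorder, modulus):
    """Pick a link as hash modulo modulus, as a multipath-like choice."""
    return toy_hash(data, byteorder) % modulus
```

With modulus 256 the two byte orders pick different links for the same bytes; with modulus 1 both pick link 0.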
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Gerhard Stenzel <gstenzel@linux.vnet.ibm.com>
Ben Pfaff [Thu, 26 May 2016 23:57:00 +0000 (16:57 -0700)]
tests: Fix select group test on big-endian systems.
This test ensures that, when the selection criteria for a select group are
the same from packet to packet, the same bucket is always selected.
However, it hardcoded the bucket that was selected to the one that happens
to be selected with the current OVS hash function on little-endian systems.
On big-endian systems, the current OVS hash function turns out to select
the other bucket. That's fine (it's consistent, it just consistently makes
the other choice), so this commit fixes the problem by allowing either
bucket to be selected.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Gerhard Stenzel <gstenzel@linux.vnet.ibm.com>
Ben Pfaff [Thu, 26 May 2016 23:53:52 +0000 (16:53 -0700)]
netdev-native-tnl: Fix treatment of GRE key on big-endian systems.
The GRE implementation used bitwise shifts to convert an ovs_be32 to an
ovs_be64 (with zero extension), but on big-endian systems these conversions
are no-ops. This fixes the problem.
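The following Python sketch simulates why a shift-based conversion only works on one byte order. It assumes the shift was a 32-bit left shift of the raw in-memory value, which is a guess at the shape of the old code; the function names are hypothetical.

```python
import struct


def be32_to_be64_portable(key_bytes):
    """Zero-extend a network-order 32-bit key to 64 bits via host order."""
    host_value = struct.unpack("!I", key_bytes)[0]
    return struct.pack("!Q", host_value)


def be32_to_be64_by_shift(key_bytes, cpu):
    """Shift the raw value as a CPU of the given byte order would see it.

    cpu is "<" for little-endian or ">" for big-endian.
    """
    raw = struct.unpack(cpu + "I", key_bytes)[0]
    shifted = (raw << 32) & 0xFFFFFFFFFFFFFFFF
    return struct.pack(cpu + "Q", shifted)
```

On the simulated little-endian CPU the shift happens to produce the same bytes as the portable conversion; on the simulated big-endian CPU it moves the key into the wrong half.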
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Gerhard Stenzel <gstenzel@linux.vnet.ibm.com>
Ben Pfaff [Fri, 3 Jun 2016 20:15:01 +0000 (13:15 -0700)]
types: Change ofp_port_t from uint16_t to uint32_t.
This fixes several tests that failed on big-endian systems because "union
flow_in_port" overlays an ofp_port_t and odp_port_t and in some cases it
is not easy to determine which one is in use.
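A Python simulation of reading the narrow member of such an overlay after the wide member was written. It assumes the previous 16-bit member and the 32-bit member both start at offset 0, as members of a C union do; the function name is hypothetical.

```python
import struct


def read_u16_after_u32_write(value, cpu):
    """Write value as 32 bits, then read the first 16 bits back.

    cpu is "<" for little-endian or ">" for big-endian.
    """
    raw = struct.pack(cpu + "I", value)
    return struct.unpack(cpu + "H", raw[:2])[0]
```

On the simulated little-endian CPU a small port number reads back unchanged through the 16-bit view, which hides the mix-up; on the big-endian one it reads back as 0.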
This commit also fixes up a few places where this broke other code.
Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: Gerhard Stenzel <gstenzel@linux.vnet.ibm.com>
route-table: If device is not there, route is still parseable.
Do not return failure to parse a route if device has been removed before we are
able to parse the route. That prevents "received bad netlink message" warnings
on the log.
This can be reproduced by simply removing interfaces.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
There are four sessions established from ovn-controller to the following:
OVN Southbound — JSONRPC based
Local ovsdb — JSONRPC based
Local vswitchd — openflow based from ofctrl
Local vswitchd — openflow based from pinctrl
All of these sessions have their own probe_interval. The last
two connections do not need a probe_timer, as they run over a unix domain
socket. This patch takes care of that.
This change has been tested by putting logs in several places, such as
ovn-controller.c and lib/rconn.c, to make sure the probe_timer is
disabled, and by checking ovn-controller's log file to make sure
that no more reconnects happen due to probes
under heavy load.
Signed-off-by: Nirapada Ghosh <nghosh@us.ibm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Nirapada Ghosh [Fri, 3 Jun 2016 18:48:49 +0000 (11:48 -0700)]
ovn-nbctl: Add lrouter and lrport related commands.
ovn-nbctl provides a shortcut to perform commands related to lswitch, lport
and such but it doesn't have similar commands related to logical routers
and logical router ports. Also, 'ovn-nbctl show' is supposed to show an
overview of database contents, which means it should show the routers
as well. "ovn-nbctl show LSWITCH" shows the switch details, similarly
"ovn-nbctl show LROUTER" should show the router details too. This patch
takes care of all of these.
Modifications:
1) ovn-nbctl show -- will now show lrouters as well
2) ovn-nbctl show <lrouter> -- will show the router now
Unit test cases have been added to test all of these modifications and
additions.
Signed-off-by: Nirapada Ghosh <nghosh@us.ibm.com>
[blp@ovn.org added features to match the lswitch and lport commands] Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
Li Wei [Thu, 2 Jun 2016 01:09:42 +0000 (09:09 +0800)]
ovn-northd.8.xml: fix sock path of NB and SB database.
Commit 60bdd01148e4 ("Separating OVN NB and SB database processes")
split the OVN NB and SB databases into separate processes, so the paths
of the sock files need to be updated.
Fixes: 60bdd01148e4 ("Separating OVN NB and SB database processes") Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: Russell Bryant <russell@ovn.org>
ovn-controller: Refactor conntrack zone allocation.
We currently allocate conntrack zones in binding.c. It fits
in nicely there because we currently only allocate conntrack
zones to logical ports and binding.c is where we figure out
the local ones.
An upcoming commit needs conntrack zone allocation for routers
in a gateway. For that reason, this commit moves conntrack zone
allocation code to ovn-controller.c where it would be easily
accessible for router zone allocation too.
Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>