Ben Pfaff [Wed, 14 Jun 2017 16:07:45 +0000 (09:07 -0700)]
netlink: Introduce helpers for 128-bit integer attributes.
Use the helpers in appropriate places. In most cases, this fixes a
misaligned reference, since ovs_be128 and ovs_u128 require 8-byte alignment
but Netlink only guarantees 4-byte.
Ben Pfaff [Tue, 13 Jun 2017 23:04:29 +0000 (16:04 -0700)]
ovs-ofctl: Avoid read overrun in ofperr_decode_msg().
vconn_add_bundle_error() was keeping at most 64 bytes of an OpenFlow
error message, then it was passing it to ofperr_decode_msg(), which assumed
that the full message was available. This led to a buffer overread.
There's no good reason why it was only keeping the first 64 bytes, so this
commit changes it to keep the whole error message, sidestepping the
problem.
struct vconn_bundle_error only existed for this special case, so remove it
in favor of a chain of ofpbufs.
Paul Blakey [Tue, 13 Jun 2017 15:03:29 +0000 (18:03 +0300)]
other-config: Add hw-offload switch to control netdev flow offloading
Add a new configuration option - hw-offload that enables netdev
flow api. Enabling this option will allow offloading flows
using netdev implementation instead of the kernel datapath.
This configuration option defaults to false - disabled.
Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Paul Blakey [Tue, 13 Jun 2017 15:03:28 +0000 (18:03 +0300)]
netdev: Adding a new netdev API to be used for offloading flows
Add a new API interface for offloading dpif flows to netdev.
The API consist on the following:
flow_put - offload a new flow
flow_get - query an offloaded flow
flow_del - delete an offloaded flow
flow_flush - flush all offloaded flows
flow_dump_* - dump all offloaded flows
In upcoming commits we will introduce an implementation of this
API for netdev-linux.
Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Roi Dayan [Tue, 13 Jun 2017 15:03:27 +0000 (18:03 +0300)]
tc: Add tc flower functions
Add tc helper functions to query and manipulate the flower classifier.
Signed-off-by: Paul Blakey <paulb@mellanox.com> Co-authored-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Paul Blakey [Tue, 13 Jun 2017 15:03:24 +0000 (18:03 +0300)]
tc: Introduce tc module
Add tc module to expose tc operations to be used by other modules.
Move some tc related functions from netdev-linux.c to tc.c
This patch doesn't change any functionality.
Signed-off-by: Paul Blakey <paulb@mellanox.com> Co-authored-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Roi Dayan [Tue, 13 Jun 2017 15:03:23 +0000 (18:03 +0300)]
netdev-linux: Refactor two tc functions
Refactor tc_make_request and tc_add_del_ingress_qdisc to accept
ifindex instead of netdev struct.
We later want to move those outside netdev-linux module to be
used by other modules.
This patch doesn't change any functionality.
Signed-off-by: Roi Dayan <roid@mellanox.com> Co-authored-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Simon Horman <simon.horman@netronome.com>
Lance Richardson [Tue, 13 Jun 2017 17:51:06 +0000 (13:51 -0400)]
sandbox: disable ssl for backup ovn southbound db
Since the sandbox environment was changed to enable SSL usage for
OVN_Southbound connections, the backup southbound server emits
the log message "socket_util|ERR|6642: bind: Address already in use"
every 2.5 seconds.
Fix by configuring the backup db server to not use remote configuration
from the database (the unix: socket can still be used, as was the
case before SSL was enabled).
Fixes: 0ced2a5c5e47 ("sandbox: use ssl for ovn-controller to sb db connection") Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Darrell Ball [Tue, 13 Jun 2017 14:46:29 +0000 (07:46 -0700)]
conntrack: Reset nat_info in un_nat conns.
Un-nat conns have no nat_info as do default conns.
However, un-nat conns are originally templated from the
corresponding default conns and therefore need to
have their nat_info explicitly nulled. This
otherwise exposes a double free if conntrack_destroy()
were to be used to destroy the connection tracker. This
would apply to cleaning the datapath after testing.
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.") Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Greg Rose <gvrose8192@gmail.com>
Aaron Conole [Tue, 13 Jun 2017 18:26:03 +0000 (14:26 -0400)]
redhat: make the rpm aware of the lock file
Currently, the db lockfile will cause the openvswitch directory to
linger after uninstall because the rpm database isn't aware that it
should be treated as part of the system. This commit informs the rpmdb
properly as a 'ghost' so that when the package is uninstalled, it will
be removed automatically. This means that if no extra files exist in
/etc/openvswitch, the whole directory will be removed from /etc/.
When __SSE4_2__ && __x86_64__, it is declared as:
static inline uint32_t hash_finish(uint64_t hash, uint64_t final)
A recent commit added an unneeded prototype in the first form, which caused
an error due to the redeclaration of a different type when the second form
was actually used. This removes the prototype, fixing the problem.
It may not be a great idea to have two different forms for this function,
but it's long standing and so I don't want to change it immediately without
proper consideration.
Reported-by: "Fischetti, Antonio" <antonio.fischetti@intel.com> Fixes: 67702b79d845 ("hash: New helper functions for adding words in a buffer to a hash.") Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Darrell Ball <dlu998@gmail.com>
Ben Pfaff [Tue, 13 Jun 2017 04:51:14 +0000 (21:51 -0700)]
byte-order: Fix undefined behavior of BYTES_TO_BE32.
A left shift that would produce a result that is not representable
by the type of the expression's result has "undefined behavior"
according to the C language standard. Avoid this by casting values
that could set the upper bit to unsigned types.
Also document and convert a macro to a function.
While we're at it, delete the unused macro BE16S_TO_BE32.
Flavio Leitner [Fri, 9 Jun 2017 15:58:57 +0000 (12:58 -0300)]
testsuite: exit gracefully if it fails.
The daemon is killed leaving resources behind when a test fails.
This fixes to first signal the daemon to exit gracefully.
Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent") Suggested-by: Joe Stringer <joe@ovn.org> Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
windows-datapath: Temporary workaround checksum issue with NAT
There is a known bug with NAT where checksum computation is wrong on
the RX path if offload is enabled. This patch works around the problem
by always computing a software checksum and should be reverted once
we figure out the root cause of checksum error.
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Ben Pfaff [Mon, 12 Jun 2017 15:35:48 +0000 (08:35 -0700)]
ofp-actions: Properly interpret "output:in_port".
It was being misinterpreted as output:NXM_OF_IN_PORT[]. This
interpretation is incorrect because of OpenFlow rules that say that only
the special form generated by output:in_port (or "in_port" on its own)
actually outputs to the input port. The interpretation here was a no-op.
Fixes: 21b2fa617126 ("ofp-parse: Allow match field names in actions and brackets in matches.") Reported-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
Aaron Conole [Thu, 8 Jun 2017 20:41:32 +0000 (16:41 -0400)]
dpdk: Deprecate vhost-user server ports.
Since vhost-user server mode ports are the preferred mechanism for
interconnecting Open vSwitch with VMs when using DPDK, and since there
are currently no known use cases for vhost-user server mode ports apart
from version incompatibilities with QEMU, announce that server mode ports
are considered deprecated and will be removed in a future release.
Cc: Ciara Loftus <ciara.loftus@intel.com> Cc: Kevin Traynor <ktraynor@redhat.com> Suggested-by: Darrell Ball <dball@vmware.com> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch is useful to debug rstp subsystem and log the
port name instead of port number. This patch will also
be used to display rstp info for next patches.
Lance Richardson [Wed, 31 May 2017 23:04:32 +0000 (19:04 -0400)]
ovsdb: add support for role-based access controls
Add suport for ovsdb RBAC (role-based access control). This includes:
- Support for "RBAC_Role" table. A db schema containing a table
by this name will enable role-based access controls using
this table for RBAC role configuration.
The "RBAC_Role" table has one row per role, with each row having a
"name" column (role name) and a "permissions" column (map of
table name to UUID of row in separate permission table.) The
permission table has one row per access control configuration,
with the following columns:
"name" - name of table to which this row applies
"authorization" - set of column names and column:key pairs
to be compared against client ID to
determine authorization status
"insert_delete" - boolean, true if insertions and
authorized deletions are allowed.
"update" - Set of columns and column:key pairs for
which authorized updates are allowed.
- Support for a new "role" column in the remote configuration
table.
- Logic for applying the RBAC role and permission tables, in
combination with session role from the remote connection table
and client id, to determine whether operations modifying database
contents should be permitted.
- Support for specifying RBAC role string as a command-line option
to ovsdb-tool (Ben Pfaff).
Signed-off-by: Lance Richardson <lrichard@redhat.com> Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Tue, 6 Jun 2017 15:39:34 +0000 (08:39 -0700)]
checkpatch: Also allow .at files to have leading tabs.
Autotest .at files often have lines with samples of expected output from
various programs, which fairly often includes leading tabs, so this warning
causes false positives there.
Signed-off-by: Ben Pfaff <blp@ovn.org> Reviewed-by: Aaron Conole <aconole@redhat.com>
Flavio Leitner [Wed, 7 Jun 2017 20:58:10 +0000 (17:58 -0300)]
testsuite: release resources when vswitch exits.
This change the testsuite macro to release the resources
configured by ovs-vswitchd when exiting as it used to be.
Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent") Fixes: fe13ccdca6a22 ("vswitchd: Add --cleanup option to the 'appctl
exit' command")
Reported-by: Eric Garver <e@erig.me> Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Joe Stringer <joe@ovn.org>
Joe Stringer [Fri, 26 May 2017 21:11:31 +0000 (14:11 -0700)]
test-hash: Fix unaligned pointer value error.
Clang 4.0 complains:
../tests/test-hash.c:160:16: error: taking address of packed member 'b' of
class or structure 'offset_ovs_u128' may result in an unaligned pointer value
[-Werror,-Waddress-of-packed-member]
in0 = &in0_data.b;
Set the bit in the aligned u128 first then copy the contents into the
offset u128 so that we don't have to take the address of the non-aligned
u128 and pass it to set_bit128.
For the 256byte_hash, fix it up so that it's actually testing the 256B
hash inside a 32-bit offset u128 as well.
Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
Copy external_ids from Logical_Switch_Port to SB database
This patch makes ovn-northd copy all string-string pairs in
external_ids column of the Logical_Switch_Port table in Northbound
database to the equivalent column of the Port_Binding table in
Southbound database.
OpenStack Neutron will add some useful data to NB database that can be
later read by networking-ovn-metadata-agent without the need of
maintaining a connection to NB database. This data would include
the CIDR's of a port or the project and device ID's which are needed
when talking to Nova to request metadata.
Signed-off-by: Daniel Alvarez <dalvarez@redhat.com> Signed-off-by: Russell Bryant <russell@ovn.org>
Add SSL protocol and cipher columns to SSL tables in northbound
and southbound databases. Start nb/sb ovsdb-server with command-
line options to use these columns. Add support to ovn-nbctl
and ovn-sbctl "set-ssl" commands for user-friendly management
of these settings.
Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Darrell Ball [Mon, 5 Jun 2017 13:34:33 +0000 (06:34 -0700)]
flow: Refactor ct_orig_tuple check in miniflow_extract().
The checks to populate ct_orig_tuple in miniflow_extract
include recirc_id being non-zero. Now, ct_orig_tuple
is only populated if the packet has passed through the
connection tracker, which is a prerequisite for having
valid ct_orig_tuple information. This is recognized by
having a non-zero ct_state. This has an added benefit
of saving some processing time.
Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Darrell Ball [Tue, 30 May 2017 21:21:33 +0000 (14:21 -0700)]
dpdk: Move tcp_payload_length to include file.
The function tcp_payload_length is moved to a private
include file to be used by other conntrack files. A
sanity check is added for general use, although
previous usage was safe in that filtering is already
done by the time it is called.
Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
tests: Export PYTHONCOERCECLOCALE=0 for python3 tests
This patch exports PYTHONCOERCECLOCALE=0 when you have Python3 tests
enabled.
This is needed since testsuite forces LC_ALL=C and Python 3, with PEP 538,
prints the following warning on stderr:
"Python runtime initialized with LC_CTYPE=C (a locale with default ASCII
encoding), which may cause Unicode compatibility problems. Using C.UTF-8,
C.utf8, or UTF-8 (if available) as alternative Unicode-compatible locales is
recommended."
AT_CHECK reports it as an error since stderr is not empty as it should be.
This patch is needed, at least, on Fedora 26 and Rawhide (backported PEP
538 on Python 3.6).
This will also be needed on any distribution with Python 3.7 (PEP 538).
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
stream: include ssl protocol/cipher options in run-time help
Include --ssl-protocols and --ssl-ciphers options in run-time
help output.
Sample output with this change:
PKI configuration (required to use SSL):
-p, --private-key=FILE file with private key
-c, --certificate=FILE file with certificate for private key
-C, --ca-cert=FILE file with peer CA certificate
--bootstrap-ca-cert=FILE file with peer CA certificate to read or create
SSL options:
--ssl-protocols=PROTOS list of SSL protocols to enable
--ssl-ciphers=CIPHERS list of SSL ciphers to enable
Output formatting options:
Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Stephen Finucane [Fri, 26 May 2017 14:12:38 +0000 (15:12 +0100)]
docs: Document dpdkr ports
I has an idea what these were but that idea was somewhat incorrect and
out-of-date. Add a minimal guide to fill in these gaps, along with a
warning about how useless these things generally are now (yay,
vhost-user).
Signed-off-by: Stephen Finucane <stephen@that.guru> Cc: Ciara Loftus <ciara.loftus@intel.com> Cc: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Stephen Finucane [Fri, 26 May 2017 14:12:37 +0000 (15:12 +0100)]
docs: Clarify the superiority of dpdkvhostuserclient
Apparently dpdkvhostuser interfaces are inferior to dpdkvhostuserclient.
Explain why.
Signed-off-by: Stephen Finucane <stephen@that.guru> Cc: Ciara Loftus <ciara.loftus@intel.com> Cc: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Kevin Traynor <ktraynor@redhat.com>
Szucs Gabor [Tue, 6 Jun 2017 15:11:54 +0000 (17:11 +0200)]
bfd: Detect Multiplier configuration
Mult value (bfd.DetectMult in RFC5880) is hard-coded and equal to 3 in
current openvswitch. As a consequence remote and local mult is the same.
In this commit the mult (Detect Multiplier/bfd.DetectMult/Detect Mult)
can be set on each interface setting the mult=<value> in bfd Column
in Interface table of ovsdb database.
Example:
ovs-vsctl set Interface p1 bfd:mult=4
sets mult=4 on p1 interface
The modification based on RFC5880 June 2010.
The relevant paragraphs are:
4.1. Generic BFD Control Packet Format
6.8.4. Calculating the Detection Time
6.8.7. Transmitting BFD Control Packets
6.8.12. Detect Multiplier Change
The mult value is set to default 3 if it is not set in ovsdb. This
provides backward compatibility to previous openvswitch behaviour.
The RFC5880 says in 6.8.1 that DetectMult shall be a non-zero integer.
In RFC5880 4.1. "Detect Mult" has 8 bit length and is declared
as a 8 bit unsigned integer in bfd.c.
Consequently mult value shall be greater than 0 and less then 256.
In case of incorrect mult value is given in ovsdb the default value (3)
will be set and a message is logged into ovs-vswitchd.log on that.
Local or remote mult value change is also logged into ovs-vswitchd.log.
Since remote and local mult is not the same calculation of detect time
has been changed. Due to RFC5880 6.8.4 Detection Time is calculated using
mult value of the remote system.
Detection time is recalculated due to remote mult change.
The BFD packet transmission jitter is different in case of mult=1
due to RFC5880 6.8.7. The maximum interval of the transmitted bfd packet
is 90% of the transmission interval.
The value of remote mult is printed in the last line of the output of
ovs-appctl bfd/show command with label: Remote Detect Mult.
There is a feature in openvswitch connected with forwarding_if_rx that
is not the part of RFC5880. This feature also uses mult value but it is
not specified if local or remote since it was the
same in original code. The relevant description in code:
/* When 'bfd->forwarding_if_rx' is set, at least one bfd control packet
* is required to be received every 100 * bfd->cfg_min_rx. If bfd
* control packet is not received within this interval, even if data
* packets are received, the bfd->forwarding will still be false. */
Due to lack of specification local mult value is used for calculation of
forwarding_if_rx_detect_time. This detect time is recalculated at mult
change if forwarding_if_rx is true and bfd is in UP state.
A new unit test has been added: "bfd - Edit the Detect Mult values"
The following cases are tested:
- Without setting mult the mult will be the default value (3).
- The setting of the lowest (1) and highest (255) valid mult value
and the detection of remote mult value.
- The setting of out of range mult value (0, 256) in ovsdb results
sets default value in ovs-vswitchd
- Clearing non default mult value from ovsdb results sets default
value in ovs-vswitchd.
Signed-off-by: Gábor Szűcs <gabor.sz.cs@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Darrell Ball [Tue, 30 May 2017 17:49:27 +0000 (10:49 -0700)]
dpdk: Userspace Datapath: Introduce NAT Support.
This patch introduces NAT support for the userspace datapath.
Most conntrack module changes are in this patch, with the
exception of icmp related handling and recent orig tuple
support.
The per packet scope of lookups for NAT and un_NAT is at
the bucket level rather than global. One hash table is
introduced to support create/delete handling. The create/delete
events may be further optimized, if the need becomes clear.
Some NAT options with limited utility (persistent, random) are
not supported yet, but will be supported in a later patch.
Signed-off-by: Darrell Ball <dlu998@gmail.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Daniele Di Proietto <diproiettod@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
This patch is based on the "datapath: enable vxlangpe creation in compat mode"
from Yi Yang. It introduces an extension option "gpe" to the vxlan port in the
netdev-dpdk datapath. Description of vxlan gpe protocoll was added to header
file lib/packets.h. In the vxlan specific methods the different packet are
introduced and handled.
Added VXLAN GPE tunnel push test.
Signed-off-by: Yi Yang <yi.y.yang at intel.com> Signed-off-by: Georg Schmuecking <georg.schmuecking@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jan Scheurich [Fri, 2 Jun 2017 16:16:21 +0000 (16:16 +0000)]
userspace: L3 tunnel support for GRE and LISP
Add a boolean "layer3" configuration option for tunnel vports.
The layer3 option defaults to false for all ports except LISP.
GRE ports accept both true and false for "layer3".
A tunnel vport configured with layer3=true receives L3 packets.
which are then converted to Ethernet packets by pushing a dummy
Ethernet heder at the ingress of the OpenFlow pipeline. The
Ethernet header of a packet is stripped before sending to a
layer3 tunnel vport.
Presently a single GRE vport cannot carry both L2 and L3 packets.
But it is possible to create two GRE vports representing the same
GRE tunel, one with layer3=false, the other with layer3=true.
L2 packet from the tunnel are received on the first vport, L3
packets on the second. The controller must send packets to the
layer3 GRE vport to tunnel them without their Ethernet header.
Units tests have been added to check the L3 tunnel handling.
LISP tunnels are not yet supported by the netdev userspace datapath.
Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Jan Scheurich [Fri, 2 Jun 2017 16:16:17 +0000 (16:16 +0000)]
userspace: Switching of L3 packets in L2 pipeline
Ports have a new layer3 attribute if they send/receive L3 packets.
The packet_type included in structs dp_packet and flow is considered in
ofproto-dpif. The classical L2 match fields (dl_src, dl_dst, dl_type, and
vlan_tci, vlan_vid, vlan_pcp) now have Ethernet as pre-requisite.
A dummy ethernet header is pushed to L3 packets received from L3 ports
before the the pipeline processing starts. The ethernet header is popped
before sending a packet to a L3 port.
For datapath ports that can receive L2 or L3 packets, the packet_type
becomes part of the flow key for datapath flows and is handled
appropriately in dpif-netdev.
In the 'else' branch in flow_put_on_pmd() function, the additional check
flow_equal(&match.flow, &netdev_flow->flow) was removed, as a) the dpcls
lookup is sufficient to uniquely identify a flow and b) it caused false
negatives because the flow in netdev->flow may not properly masked.
In dpif_netdev_flow_put() we now use the same method for constructing the
netdev_flow_key as the one used when adding the flow to the dplcs to make sure
these always match. The function netdev_flow_key_from_flow() used so far was
not only inefficient but sometimes caused mismatches and subsequent flow
update failures.
The kernel datapath does not support the packet_type match field.
Instead it encodes the packet type implictly by the presence or absence of
the Ethernet attribute in the flow key and mask.
This patch filters the PACKET_TYPE attribute out of netlink flow key and
mask to be sent to the kernel datapath.
Signed-off-by: Lorand Jakab <lojakab@cisco.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
with variables uint32_t vendor, uint16_t type, uint16_t code. By C rules,
"type << 16" has type "int", which means that it will be sign-extended to
64 bits when ORed with uint64_t. Thus, if 'type' has bit 15 set, then
the overall result will have all of its top 32 bits set, which is not
the desired result.
This commit fixes the problem.
No actual error types used in OVS or OpenFlow have bit 15 set, so this
does not fix a user-visible problem.
Found by Coverity.
Reported-at: https://scan3.coverity.com/reports.htm#v16889/p10449/fileInstanceId=14762955&defectInstanceId=4304798&mergedDefectId=180406 Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
Ben Pfaff [Sat, 27 May 2017 02:59:47 +0000 (19:59 -0700)]
ofp-print: Avoid array overread in print_table_instruction_features().
If a switch claimed to support an instruction that OVS does not know about,
then print_table_instruction_features() would read past the end of the
array of instruction names. This fixes the problem.
None of the other uses of print_table_instruction_features() appear to have
the same problem.
Found by Coverity.
Reported-at: https://scan3.coverity.com/reports.htm#v16889/p10449/fileInstanceId=14762675&defectInstanceId=4305296&mergedDefectId=179859 Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
Ben Pfaff [Sat, 27 May 2017 03:36:17 +0000 (20:36 -0700)]
ofp-util: Mask config value as intended in ofputil_decode_port_mod().
The code in ofputil_decode_port_mod() was originally meant to mask
the returned 'config' value against 'mask'. That is, only bits that are
marked in 'mask' as to be modified can be set to 1-bits in the returned
'config' value; it doesn't really entirely make sense otherwise. The
actual code to do this was dead, though. This fixes that problem.
In a quick skim of callers, I wasn't able to see an actual user-visible
bug that this fixes.
Found by Coverity.
Reported-at: https://scan3.coverity.com/reports.htm#v16889/p10449/fileInstanceId=14762682&defectInstanceId=4304822&mergedDefectId=180422 Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
Ben Pfaff [Tue, 30 May 2017 14:38:56 +0000 (07:38 -0700)]
dpif-netlink: Fix multiple-free and fd leak on error path.
This function attempts to open a bunch of new handlers. If it fails, it
attempts to close all the handlers that have already been opened.
Unfortunately, the loop to close the opened handlers used the wrong array
index: 'i' instead of 'j'. This fixes the problem.
Found by Coverity.
Reported-at: https://scan3.coverity.com/reports.htm#v16889/p10449/fileInstanceId=14762827&defectInstanceId=4305351&mergedDefectId=180429 Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
Eelco Chaudron [Thu, 1 Jun 2017 12:38:09 +0000 (14:38 +0200)]
netdev: Fix netdev_open() to adhere to class type if given
When trying to configure a system port as type=internal it could start
an infinite port creation loop. When this happens you will see the
following log messages:
2017-06-01T09:00:17.900Z|02813|dpif|WARN|system@ovs-system: failed to add ve01_1 as port: File exists
2017-06-01T09:00:17.900Z|02814|bridge|WARN|could not add network device ve01_1 to ofproto (File exists)
2017-06-01T09:00:17.907Z|02815|bridge|INFO|bridge bzb: added interface ve01_1 on port 2
2017-06-01T09:00:17.909Z|02816|bridge|INFO|bridge bzb: deleted interface ve01_1 on port 2
2017-06-01T09:00:17.914Z|02817|dpif|WARN|system@ovs-system: failed to add ve01_1 as port: File exists
2017-06-01T09:00:17.914Z|02818|bridge|WARN|could not add network device ve01_1 to ofproto (File exists)
2017-06-01T09:00:17.921Z|02819|bridge|INFO|bridge bzb: added interface ve01_1 on port 3
2017-06-01T09:00:17.923Z|02820|bridge|INFO|bridge bzb: deleted interface ve01_1 on port 3
2017-06-01T09:00:17.929Z|02821|dpif|WARN|system@ovs-system: failed to add ve01_1 as port: File exists
2017-06-01T09:00:17.929Z|02822|bridge|WARN|could not add network device ve01_1 to ofproto (File exists)
2017-06-01T09:00:17.936Z|02823|bridge|INFO|bridge bzb: added interface ve01_1 on port 4
...
...
This is how to replicate it:
ip link add name ve01_1 type veth peer name ve01_2
ovs-vsctl add-br bzb
ovs-vsctl add-port bzb ve01_1
ovs-vsctl set interface ve01_1 type=internal
ip link set dev ve01_1 up
ip link set dev ve01_2 up
When changing the type to internal, the async configuration logic get
triggered and because the type has changed it will delete the
interface and the ofproto port. Next it will call iface_do_create() to
re-create the interface as internal. Because we just deleted the
interface netdev_open() will try to recreate it as internal.
However this will fail with EEXIST as a system interface already
exists withe the name.
Up till here all is fine...
Now some ipv6 route change comes along for the ve01_1 interface, and
the route infrastructure will call netdev_open(). This will create the
interface of type system.
Next the configuration verify process gets triggered due to
if_notifier_changed() being true. We now retry the above, but because
the interface exists (although in the system class) it will use it,
and create the interface successfully.
This triggers another if notification, causing yet another config
update, and because the system != internal reconfiguration happens and
it start from the top...
So the fix as presented below is causing netdev_open() only to return
the existing device for the class type requested (if the type is
specified).
Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Tue, 30 May 2017 21:22:54 +0000 (14:22 -0700)]
checkpatch: Omit some checks on comment lines.
Comments are more freeform than code, so this patch tries to ignore many
checks on comment lines. It assumes that any line that begins with "/*"
or "* " is a comment line. (Without a following space, "*" might be
something like "*x = 1;".)
Suggested-by: Joe Stringer <joe@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com>
Ben Pfaff [Thu, 1 Jun 2017 14:21:41 +0000 (07:21 -0700)]
Replace most uses of and references to "ifconfig" by "ip".
It's becoming more common that OSes include "ip" but not "ifconfig", so
it's best to avoid using the latter. This commit removes most references
to "ifconfig" and replaces them by "ip". It also adds a build-time check
to make it harder to introduce new uses of "ifconfig".
There are important differences between "ifconfig" and "ip":
- An "ifconfig" command that sets an IP address also brings the interface
up, but a similar "ip addr add" command does not, so it is often necessary
(or at least precautionary) to add an "ip link set <dev> up" command.
- "ifconfig" can infer a netmask from an IP adddress, but "ip" always
assumes /32 if none is given.
- "ifconfig" with address 0.0.0.0 removes any configured IP address, but
"ip addr add" does not, so "ifconfig <dev> 0.0.0.0" must be replaced by
"ip addr del" or "ip addr flush".
Ilya Maximets [Fri, 19 May 2017 13:37:32 +0000 (16:37 +0300)]
netdev-dpdk: Use uint8_t for port_id.
Currently, signed integer is used for 'port_id' variable and
'-1' as identifier of bad or uninitialized 'port_id'.
This inconsistent with dpdk library and, also, in few cases,
leads to passing '-1' to dpdk functions where uint8_t expected.
Such behaviour doesn't produce any issues, but it's better to
use same type as in dpdk library for consistency.
Introduced 'dpdk_port_t' typedef for better maintainability.
Also, magic number '-1' replaced with DPDK_ETH_PORT_ID_INVALID
macro.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Ilya Maximets [Fri, 19 May 2017 13:37:31 +0000 (16:37 +0300)]
netdev-dpdk: Fix device leak on port deletion.
Currently, once created device in dpdk will exist forever
even after del-port operation untill we manually call
'ovs-appctl netdev-dpdk/detach <name>', where <name> is not
the port's name but the name of dpdk eth device or pci address.
Few issues with current implementation:
1. Different API for usual (system) and DPDK devices.
(We have to call 'ovs-appctl netdev-dpdk/detach' each
time after 'del-port' to actually free the device)
This is a big issue mostly for virtual DPDK devices.
2. Follows from 1:
For DPDK devices 'del-port' leads just to
'rte_eth_dev_stop' and subsequent 'add-port' will
just start the already existing device. Such behaviour
will not reset the device to initial state as it could
be expected. For example: virtual pcap pmd will continue
reading input file instead of reading it from the beginning.
3. Follows from 2:
After execution of the following commands 'port1' will be
configured with the 'old-options' while 'ovs-vsctl show'
will show us 'new-options' in dpdk-devargs field:
4. Follows from 1:
Not detached device consumes 'port_id'. Since we have very
limited number of 'port_id's (32 in common case) this may
lead to quick exhausting of id pool and inability to add any
other port.
To avoid above issues we need to detach all the attached devices on
port destruction.
appctl 'netdev-dpdk/detach' removed because not needed anymore.
We need to use internal 'attached' variable to track ports on
which rte_eth_dev_attach() was called and returned successfully
to avoid closing and detaching devices that do not support hotplug or
by any other reason attached using the 'dpdk-extra' cmdline options.
CC: Ciara Loftus <ciara.loftus@intel.com> Fixes: 55e075e65ef9 ("netdev-dpdk: Arbitrary 'dpdk' port naming") Fixes: 69876ed78611 ("netdev-dpdk: Add support for virtual DPDK PMDs (vdevs)") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Billy O'Mahony <billy.o.mahony@intel.com>
Ben Pfaff [Wed, 31 May 2017 23:06:12 +0000 (16:06 -0700)]
Support accepting and displaying port names in OVS tools.
Until now, most ovs-ofctl commands have not accepted names for ports, only
numbers, and have not been able to display port names either. It's a lot
easier for users if they can use and see meaningful names instead of
arbitrary numbers. This commit adds that support.
For backward compatibility, only interactive ovs-ofctl commands by default
display port names; to display them in scripts, use the new --names
option.
Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Aaron Conole <aconole@redhat.com>
stp: Add link-state checking support for stp ports.
When bridge stp enabled, we can enable the stp ports
despite ports are down. When initializing, this patch checks
link-state of ports and enable or disable them according
to their link-state. This patch also allow user to enable
and disable a port when bridge stp is running. If a stp
port is in disable state, it can forward packets. If its
link is down and this patch sets it to disable, there is
no L2 loop.
Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Wed, 17 May 2017 23:13:55 +0000 (16:13 -0700)]
ovn-controller: Fix log conditions for unexpected openflow messages.
Currently in pinctrl.c and ofctrl.c there are similar logic to log
ignored messages, which is somehow inaccurate and confusing. For example,
OFPTYPE_PACKET_IN is handled only in pinctrl.c but in ofctrl.c it
is listed as expected input and not logged as "ignored" messages, while
it is in fact unexpected and ignored there. This patch clearup the
unnecessary "if" conditions and logs all messages that are not
expected/handled honestly, so that there will be logs for debugging
if such abnormal case really happens.
Signed-off-by: Han Zhou <zhouhan@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>