Ben Pfaff [Fri, 13 Sep 2013 00:42:23 +0000 (17:42 -0700)]
guarded-list: New data structure for thread-safe queue.
We already had queues that were suitable for replacement by this data
structure, and I intend to add another one later on.
flow_miss_batch_ofproto_destroyed() did not work well with the guarded-list
structure (it required either adding a lot more functions or breaking the
abstraction) so I changed the caller to just use udpif_revalidate().
Checking reval_seq at the end of handle_miss_upcalls() also didn't work
well with the abstraction, so I decided that since this was a corner case
anyway it would be acceptable to just drop those in flow_miss_batch_next().
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Thu, 12 Sep 2013 06:23:00 +0000 (23:23 -0700)]
ofproto: Add a ref_count to "struct rule" to protect it from being freed.
Taking a read-lock on the 'rwlock' member of struct rule prevents members
of the rule from changing. This is a short-term use of the 'rwlock': one
takes the lock, reads some members, and releases the lock.
Taking a read-lock on the 'rwlock' also prevents the rule from being freed.
This is often a relatively long-term need. For example, until now flow
translation has held the rwlock in xlate_table_action() across the entire
recursive translation, which can call into a great deal of different code
across multiple files.
This commit switches to using a reference count for this second purpose
of struct rule's rwlock. This means that all the code that previously
held a read-lock on the rwlock across deep stacks of functions can now
simply keep a reference.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Mon, 9 Sep 2013 18:19:41 +0000 (11:19 -0700)]
ofproto: Remove soon-to-be-invalid optimizations.
Until now, ofproto_add_flow() and ofproto_delete_flow() assumed that the
flow table could not change between its flow table check and its later
modification to the flow table. This assumption will soon be untrue, so
remove it.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Sun, 1 Sep 2013 21:46:04 +0000 (14:46 -0700)]
ofproto-dpif: Remove vestigial "clogged" feature.
Before this commit, ofproto-dpif did not ordinarily ever defer
completion of a flow_mod, but it was possible to use "ovs-appctl
ofproto/clog" to force deferring completion, which allowed some basic
debugging of deferred completion. I only ever used that feature once,
while doing initial debugging of the deferred completion feature. I
intend in future commits to make changes that will make deferred
completion harder to implement in ofproto-dpif, so this commit removes
that feature in advance.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Wed, 11 Sep 2013 23:41:32 +0000 (16:41 -0700)]
ofproto: Eliminate 'ofproto_node' member from struct rule.
The ofproto_node member is convenient for collecting lists of rules, but
it is also challenging for concurrency because only a single thread at a
time can put a given rule on a list. This commit eliminates the
ofproto_node member and introduces a new 'struct rule_collection' that
can be use in a thread-safe manner.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
ofproto: Use proper error code when meter_id is out of range.
Use OUT_OF_METERS when given meter_id is greater than what is supported
by the datapath. Retain the INVALID_METER error code for the meter_ids
outside of the range supported by the specification.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Alex Wang [Thu, 12 Sep 2013 18:54:16 +0000 (11:54 -0700)]
bridge: Always call smap_destroy() after smap_init()
This commit fixes a place in bridge.c where smap_destroy() is not
always called after smap_init(). Though there is no memory leak
now, it is necessary to fix it and prevent memory leak in the
future when smap_init() may be modified to allocate dynamic memory.
Reported-by: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Daniel Baluta [Wed, 11 Sep 2013 21:56:53 +0000 (00:56 +0300)]
ofproto: update flow_stats flags on flow_stats_request
This is a first step in implementing 'on demand flow counters'.
We save relevant flow_mod flags (OFPUTIL_FF_SEND_FLOW_REM,
OFPUTIL_FF_NO_PKT_COUNTS, OFPUTIL_FF_NO_BYT_COUNTS) into newly created rule
when a new flow is added, and echo them back in the flow stats request.
Notice that we remove send_flow_removed flag from struct rule/struct ofoperation
and we replace it with an enum tracking all ofputil_flow_mod_flags.
Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Andy Zhou [Tue, 10 Sep 2013 20:33:08 +0000 (13:33 -0700)]
openflow-1.1+: move OFPTC11_TABLE_MISS_MASK out of enum
OFPTC11_TABLE_MISS_MASK is not a valid configuration, but to indicate
there are only 2 bits being used for table miss configuration. Move
it out of the enum definition.
Reported-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Fri, 6 Sep 2013 19:51:02 +0000 (12:51 -0700)]
ofproto-dpif: Don't hold mac rwlock while calling send_packet().
Holding the mac learning rwlock while calling send_packet() which
calls xlate_actions() is dangerous because somewhere deep in the chain
the same lock might be acquired causing a deadlock. Though we haven't
run into this problem in testing, it seems prudent to refactor it.
Reported-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
OVS has its own workq implementation for coupe of reasons. first
was to avoid system freeze due to ovs-flow rehash softlockup.
We have moved out rehash from workq, So this problem does not exist.
second was related bugs in kernel workq implementation in pre-2.6.32
kernel. But we have dropped support for older kernel.
So there is no reason to keep ovs-workq around. Following patch
removes it.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
Daniel Borkmann [Mon, 9 Sep 2013 20:27:27 +0000 (13:27 -0700)]
datapath: flow: fix potential illegal memory access in __parse_flow_nlattrs
In function __parse_flow_nlattrs(), we check for condition
(type > OVS_KEY_ATTR_MAX) and if true, print an error, but we do
not return from this function as in other checks. It seems this
has been forgotten, as otherwise, we could access beyond the
memory of ovs_key_lens, which is of ovs_key_lens[OVS_KEY_ATTR_MAX + 1].
Hence, a maliciously prepared nla_type from user space could access
beyond this upper limit.
Introduced by 03f0d916a ("openvswitch: Mega flow implementation").
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Andy Zhou <azhou@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
Guolin Yang [Mon, 9 Sep 2013 16:38:01 +0000 (09:38 -0700)]
util: Don't set thread name to empty
In monitor_daemon(), it set subprogram_name to "" which causes system crash
in some platform when trying to set the thread name to "". This change set
thread name to program name in this case.
Signed-off-by: Guolin Yang <gyang@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
datapath: Move flow table rehashing to flow install.
Rehashing in ovs-workqueue can cause ovs-mutex lock contentions
in case of heavy flow setups where both needs ovs-mutex. So by
moving rehashing to flow-setup we can eliminate contention.
This also simplify ovs locking and reduces dependence on
workqueue.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
datapath lisp: Use ovs offload compat functionality.
OVS already has compat functions to handle GSO packets.
Following patch get rid of GSO packet handling in lisp
and use ovs iptunnel_xmit() function for same.
CC: Lori Jakab <lojakab@cisco.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
CC: Lori Jakab <lojakab@cisco.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Lorand Jakab <lojakab@cisco.com> Acked-by: Jesse Gross <jesse@nicira.com>
Andy Zhou [Sat, 7 Sep 2013 10:02:32 +0000 (03:02 -0700)]
openflow-1.1+: OFPT_TABLE_MOD (part 1)
Added infrastructure to support Openflow OFPT_TABLE_MOD message. This patch
does not include the flexible table miss handling code that is necessary to
support the semantics specified in OFPT_TABLE_MOD messages.
Current flow miss behavior continues to conform to Openflow 1.0. Future
commits to add more flexible table miss support are needed to fully support
OPFT_TABLE_MOD for Openflow-1.1+.
Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
sw_flow_key alignment was declared as " __aligned(__alignof__(long))".
However, this breaks on the m68k architecture where long is 32 bit in
size but 16 bit aligned by default. This aligns to the size of a long to
ensure that we can always do comparsions in full long-sized chunks. It
also adds an additional build check to catch any reduction in alignment.
Simon Horman [Thu, 5 Sep 2013 05:19:13 +0000 (14:19 +0900)]
ovs-ofctl: Handle any number of buckets in group statistics
struct ofputil_group_stats has an arbitrary limit
of 16 buckets for which it can record statistics.
However the code does not appear to enforce this
limit and it seems to me that the code could overflow.
This patch aims to remove the arbitrary limit by
changing the 'bucket_stats' field of struct ofputil_group_stats
from a fixed length array to a pointer whose storage is allocated and freed
as necessary.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 5 Sep 2013 16:36:43 +0000 (09:36 -0700)]
configure: Remove call to OVS_CHECK_CACHE_TIME that does not exist.
Commit 31ef9f5178 (timeval: Remove CACHE_TIME scheme.) removed the
OVS_CHECK_CACHE_TIME macro but not the call to it, which leads to an error
message when one runs configure.
Alex Wang [Wed, 4 Sep 2013 22:49:19 +0000 (15:49 -0700)]
timeval: Remove CACHE_TIME scheme.
This commit removes the CACHE_TIME scheme from timeval module. This
is for eliminating the lock contention over the read/write lock of
the cached time. To get the time, the thread now will directly do
the system call 'clock_gettime()'.
As a side effect, timer can only be warpped after timer is stopped
by 'appctl time/stop' command.
Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Simon Horman [Mon, 2 Sep 2013 02:23:16 +0000 (11:23 +0900)]
ovs-ofctl: Correct formatting of instructions in manpage.
Adjust formatting in ovs-ofctl manpage so that apply_actions, clear_actions
write_metadata and goto_table appear at the same level of indentation as
actions rather being indented as if they are arguments to the learn action.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Wed, 4 Sep 2013 00:23:44 +0000 (17:23 -0700)]
ofproto: Rename struct rule's evict lock and use it more widely.
There are a few fields in struct rule which are accessible by
functions in ofproto-dpif and therefore need to accessed in a thread
safe manner. This patch achieves this by generalizing the rules evict
rwlock and requiring a writelock to be held to edit them.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Tue, 3 Sep 2013 22:37:14 +0000 (15:37 -0700)]
ofproto-dpif: Don't manage eviction groups from threads.
Managing eviction groups from threads was going to be difficult to do
in a performant thread-safe manner, so this patch punts the problem to
the main thread.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ethan Jackson [Tue, 27 Aug 2013 20:17:11 +0000 (13:17 -0700)]
ofproto-dpif: Hide struct rule_dpif internally.
By hiding struct rule_dpif inside ofproto-dpif, it becomes very clear
which attributes are accessed by multiple threads and need to be
protected by locks.
Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 30 Aug 2013 03:55:10 +0000 (20:55 -0700)]
ofproto-dpif: Do not dpif_port_del() patch ports in port_del().
Patch ports don't have datapath ports so it doesn't make sense to try to
call dpif_port_del() on them. If we do try, it will fail, which makes the
caller think that the port wasn't really deleted, which in turn keeps
ofproto from reporting the port deletion via OpenFlow. This fixes the
problem.
Without this patch, the following commands, executed under "make sandbox",
will report the patch port addition in "ovs-ofctl monitor" output, but not
the patch port deletion. With this patch, both the addition and deletion
will be reported.
Alex Wang [Wed, 4 Sep 2013 22:21:15 +0000 (15:21 -0700)]
ofproto-dpif-xlate: Fix confusion between "no stp port" and "stp port 0".
Commit 9d189a50e (ofproto-dpif-xlate: Pull STP xlation into
ofproto-dpif-xlate.) introduced the bug that considers 'stp_port_no'
of 0 as stp disabled on the port. However 'stp_port_no' is
actually the index of the stp struct's port array and ranges
between [0, STP_MAX_PORTS). So the bug allows the blocked
port keep transmitting packets and generates loop.
This commit fixes this bug.
Signed-off-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Wed, 4 Sep 2013 20:37:56 +0000 (13:37 -0700)]
ofproto: Convert units correctly in ofport_open().
netdev_features_to_bps() returns a speed in bps, but struct
ofputil_phy_port's curr_speed and max_speed are in kbps, so a conversion
is necessary. This commit fixes the problem.
Reported-by: Benjamin Lunsky <benjamin.lunsky@netronome.com> Tested-by: Benjamin Lunsky <benjamin.lunsky@netronome.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Wed, 4 Sep 2013 19:33:06 +0000 (12:33 -0700)]
flow: Fix hypothetical memory leak in miniflow_move().
Ordinarily a miniflow will use its inline_values if its values can fit, but
there is nothing to prevent a small number of values from being stored
in malloc()'d memory. If this happened, then miniflow_move() would leak
memory. This commit fixes the problem.
This is a hypothetical problem. I haven't seen it in practice.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Wed, 4 Sep 2013 19:38:12 +0000 (12:38 -0700)]
ofproto-dpif: Fix use-after-free deleting a bridge with active traffic.
When a bridge that has active traffic is deleted, the bridge's ofproto can
be referenced by in-flight "flow_miss"es. This commit fixes the problem
by destroying "flow_miss"es that reference the ofproto that is being
deleted.
I found the problem by adding and removing a bridge that had active traffic
(via hping3) while running under valgrind.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ethan Jackson [Wed, 4 Sep 2013 16:27:31 +0000 (09:27 -0700)]
ofproto-dpif: Destroy facets on ofproto removal.
Facets maintain a pointer to their owning ofproto-dpif, and therefore
when an ofproto-dpif is destroyed all of their child facets should be
destroyed along with it. If they aren't, it's possible a subfacet
looked up in the dpif-backer could access it's parent facet and
therefore a defunct parent ofproto causing a segmentation fault.
Bug #19421.
Bug #19423. Diagnosed-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Ethan Jackson <ethan@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ethan Jackson [Wed, 4 Sep 2013 00:34:00 +0000 (17:34 -0700)]
ofproto-dpif-xlate: Fix mac learning deadlock.
xlate_normal() held the mac_learning lock while calling
output_normal(). When running with patch ports, this could cause
xlate_actions() to be called again, possibly attempting to take a
write lock on the same learning table causing a deadlock. This patch
solves the problem by holding the lock for a very brief period of
time.
Bug #19423. Signed-off-by: Ethan Jackson <ethan@nicira.com> Acked-by: Justin Pettit <jpettit@nicira.com>
Guolin Yang [Fri, 30 Aug 2013 16:57:13 +0000 (09:57 -0700)]
Fix a bug in conversion between flow/mask and flow key
In odp_flow_key_from_flow__(), when converting ICMPv6 flow/mask
to flow/mask key, we should always use flow to check for whether
ND informaition is present or not. In mask case, both type and code
field should be masked, otherwise ND fields can be masked.
Similarly in reverse conversion (parse_l2_5_onward()), we should
have same check.
Signed-off-by: Guolin Yang <gyang@nicira.com>
[blp@nicira.com changed && to || in parse_l2_5_onward() Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
Ed Maste [Fri, 30 Aug 2013 14:56:08 +0000 (10:56 -0400)]
util: Include pthread_np.h on FreeBSD
On FreeBSD pthread_set_name_np's prototype is provided by pthread_np.h.
As I believe it is the only platform to provide the "set_name" (with an
underscore) variant I hope it's fine to use the existing autoconf macro
HAVE_PTHREAD_SET_NAME_NP.
Signed-off-by: Ed Maste <emaste@freebsd.org> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 29 Aug 2013 23:49:51 +0000 (16:49 -0700)]
ofproto: Avoid removing rules from meter lists twice.
Since the meter code was introduced, the oftable_remove_rule() function
(recently renamed oftable_remove_rule__()) has had two blocks of code to
remove a rule from its meter's list of rules that reference that meter.
This commit reduces that to one.
Also modifies oftable_remove_rule__() to maintain the invariant that
'rule' is in a meter's list if and only if
!list_is_empty(&rule->meter_list_node). I don't believe that that fixes
a real bug, but it seems like a good idea.
(This seemed like a serious bug at first, but in fact list_remove() is
idempotent: calling it two times in a row has the same effect as calling
it once.)
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Ben Pfaff [Thu, 15 Aug 2013 17:47:39 +0000 (10:47 -0700)]
packets: Do not assume that IPv4, TCP, or ARP headers are 32-bit aligned.
Ethernet headers are 14 bytes long, so when the beginning of such a header
is 32-bit aligned, the following data is misaligned. The usual trick to
fix that is to start the Ethernet header on an odd-numbered 16-bit
boundary. That trick works OK for Open vSwitch, but there are two
problems:
- OVS doesn't use that trick everywhere. Maybe it should, but it's
difficult to make sure that it does consistently because the CPUs
most commonly used with OVS don't care about misalignment, so we
only find problems when porting.
- Some protocols (GRE, VXLAN) don't use that trick, so in such a case
one can properly align the inner or outer L3/L4/L7 but not both. (OVS
userspace doesn't directly deal with such protocols yet, so this is
just future-proofing.)
- OpenFlow uses the alignment trick in a few places but not all of them.
This commit starts the adoption of what I hope will be a more robust way
to avoid misalignment problems and the resulting bus errors on RISC
architectures. Instead of trying to ensure that 32-bit quantities are
always aligned, we always read them as if they were misaligned. To ensure
that they are read this way, we change their types from 32-bit types to
pairs of 16-bit types. (I don't know of any protocols that offset the
next header by an odd number of bytes, so a 16-bit alignment assumption
seems OK.)
The same would be necessary for 64-bit types in protocol headers, but we
don't yet have any protocol definitions with 64-bit types.
IPv6 protocol headers need the same treatment, but for those we rely on
structs provided by system headers, so I'll leave them for an upcoming
patch.
Ben Pfaff [Wed, 14 Aug 2013 23:26:05 +0000 (16:26 -0700)]
ofproto: Remove obsolete comment and assertion.
At one time, the ofproto-dpif implementation of the 'rule_execute' member
function required, indirectly, at least struct(ofp10_packet_in) bytes of
headroom in the packet passed into it. (This allowed constructing an
OFPT_PACKET_IN without allocating and copying a new ofpbuf.) This
restriction has long been lifted, but rule_execute() had not yet caught
up. This commit updates it.