The new struct flow_tnl contains an extra four bytes of padding on
64-bit machines but we currently assert that the total struct flow
is a fixed size. The size difference isn't actually a problem
because both are multiples of 4 and the build assertion is only
intended to remind people to update FLOW_WC_SEQ when new fields are
added. This changes the assertion to fix just the non-tunnel field
size.
Suggested-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
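The build assertion described above can be sketched roughly as follows. This is only an illustration under stated assumptions: struct example_tnl, struct example_flow, their fields, and the constant 16 are hypothetical stand-ins, not the real struct flow layout, and C11 _Static_assert stands in for whatever build-assertion macro the tree uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct flow_tnl.  On targets where uint64_t is
 * 8-byte aligned, the compiler adds 4 bytes of tail padding here. */
struct example_tnl {
    uint64_t tun_id;
    uint32_t ip_src;
    uint32_t ip_dst;
    uint16_t flags;
    uint8_t ip_tos;
    uint8_t ip_ttl;
};

/* Hypothetical stand-in for struct flow. */
struct example_flow {
    struct example_tnl tunnel;
    uint32_t regs[2];
    uint32_t nw_src;
    uint16_t in_port;
    uint16_t dl_type;
};

/* Size of the non-tunnel fields only, so that padding inside the tunnel
 * member cannot trip the check. */
#define EXAMPLE_NON_TUNNEL_SIZE \
    (sizeof(struct example_flow) - sizeof(struct example_tnl))

/* Fails the build when fields are added, as a reminder to bump a
 * (hypothetical) sequence number. */
_Static_assert(EXAMPLE_NON_TUNNEL_SIZE == 16,
               "struct example_flow changed: update EXAMPLE_WC_SEQ");
```

Because only the difference of the two sizes is checked, the assertion holds on both 32-bit and 64-bit builds even though sizeof(struct example_tnl) differs between them.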
flow: Extend struct flow to contain tunnel outer header.
Soon the kernel will begin supplying the information about the outer
IP header for tunneled packets and userspace will need to be able to
track it as part of the flow. For the time being this is only used
internally by OVS and not exposed outwards to OpenFlow. As a result,
this threads the information throughout userspace but simply stores
the existing tun_id in it.
Ben Pfaff [Mon, 1 Oct 2012 20:37:47 +0000 (13:37 -0700)]
ovs-ctl: Add support for glibc malloc debugging.
Unlike valgrind, glibc's built-in features for malloc debugging are cheap
enough that one can run with them enabled all the time, at least in test
scenarios.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Kyle Mestery <kmestery@cisco.com>
Ethan Jackson [Thu, 20 Sep 2012 18:13:15 +0000 (11:13 -0700)]
idl: Optionally warn when writing to read-write columns.
ovs-vswitchd should only write to write-only columns. Furthermore,
writing to a column which is not write-only can cause serious
performance degradations. This patch causes ovs-vswitchd to log
and reject writes to read-write columns.
python/ovs/db/idl: getattr(Row) raises TypeError, not AttributeError.
In some cases getattr(Row instance, attrname) raises TypeError instead of
AttributeError:
> File "python/ovs/db/idl.py", line 554, in __getattr__
> datum = self._data[column_name]
> TypeError: 'NoneType' object has no attribute '__getitem__'
So getattr(Row instance, attrname, default value) doesn't work.
This occurs when row._changes doesn't include attrname and row._data is None.
So teach Row.__getattr__ to handle the _data=None case.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
ofp-actions: Add support for OpenFlow 1.2 "set-field" action.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com extracted this code from a larger patch by above, so:] Signed-off-by: Ben Pfaff <blp@nicira.com>
[regarding final version of patch:] Reviewed-by: Simon Horman <horms@verge.net.au>
Ben Pfaff [Mon, 24 Sep 2012 20:18:38 +0000 (13:18 -0700)]
ofp-actions: Allow OF1.1+ actions to be variable-length.
Previously there was no need for this, because all implemented standard
OpenFlow actions had a fixed length, but the OF1.2 "set-field" action (soon
to be implemented) is variable length.
Signed-off-by: Ben Pfaff <blp@nicira.com> Reviewed-by: Simon Horman <horms@verge.net.au>
Ben Pfaff [Tue, 25 Sep 2012 17:23:38 +0000 (10:23 -0700)]
ofp-actions: Prepare to treat OF1.2 actions as OF1.1 actions.
The numbering of OpenFlow 1.0 actions overlaps with the numbering
of OpenFlow 1.1+ actions, so the two sets of actions have to be
distinguished for input and output. But OpenFlow 1.1 and 1.2
actions are numbered to avoid this problem, so there is no need
to distinguish them in the same way. Therefore, this commit
prepares to treat them together.
Signed-off-by: Ben Pfaff <blp@nicira.com> Reviewed-by: Simon Horman <horms@verge.net.au>
Ben Pfaff [Mon, 24 Sep 2012 20:11:37 +0000 (13:11 -0700)]
openflow-1.2: Remove OFPAT12_* definitions that duplicate OFPAT11_* ones.
OpenFlow 1.1 and 1.2 action numbering is compatible, in that no
OpenFlow 1.2 action uses an OpenFlow 1.1 action number in a different
way from OpenFlow 1.1. So it's confusing and unnecessary to have
separate definitions for these numbers.
Signed-off-by: Ben Pfaff <blp@nicira.com> Reviewed-by: Simon Horman <horms@verge.net.au>
When unparsing the kernel tunnel configuration, TTL was incorrectly
converted to "tos". Although it leads to confusing configuration
output, actual operation is not affected.
Ben Pfaff [Fri, 7 Sep 2012 17:07:03 +0000 (10:07 -0700)]
ovsdb-server: Add support for multiple databases.
The OVSDB protocol has supported multiple databases for a long time, but
the ovsdb-server implementation only supported one database at a time.
This commit adds support for multiple databases.
Feature #12353. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
Ben Pfaff [Fri, 21 Sep 2012 18:12:39 +0000 (11:12 -0700)]
ovsdb-server: Fix null pointer deref when bool "is_connected" is empty.
The ovsdb-server supports obtaining its remote connection targets from a
database table and updating that table with connection status information.
One of the supported connection status columns is a boolean column named
"is_connected". The code in ovsdb-server blindly assigned a bool into
this column without checking that it actually had space allocated for one.
This was and is fine with the ovs-vswitchd schema, which always has exactly
one value in this column. However, if a database schema makes this column
optional, and there are actually no values in it, then this assignment
dereferences a null pointer.
This commit fixes the problem by allocating space for a bool if none has
yet been allocated.
Noticed while adding an extra test for the connection status feature.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
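A minimal sketch of the fix described above, assuming a simplified, hypothetical struct example_datum rather than the real OVSDB datum type: space for the bool is allocated only when the column is currently empty.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an OVSDB column value: 'n' values
 * stored in 'keys'.  An empty optional column has n == 0 and keys == NULL. */
struct example_datum {
    unsigned int n;
    bool *keys;
};

/* Stores 'value' as the single element of 'datum', allocating space first
 * if none has been allocated yet.  Returns false if allocation fails. */
bool
example_datum_set_bool(struct example_datum *datum, bool value)
{
    if (!datum->keys) {
        datum->keys = malloc(sizeof *datum->keys);
        if (!datum->keys) {
            return false;
        }
    }
    datum->n = 1;
    datum->keys[0] = value;
    return true;
}
```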
Ethan Jackson [Thu, 20 Sep 2012 02:21:06 +0000 (19:21 -0700)]
bridge: Omit alerts on the cfm_remote_opstate column.
This column should be write only, otherwise every call to update it
has to make a trip to the database. Since this column is updated
every time through the run loop as part of refresh_instant_stats(),
this patch fixes a significant performance degradation.
Ben Pfaff [Thu, 20 Sep 2012 15:40:29 +0000 (08:40 -0700)]
ovs-ofctl: Accept port keywords, OF1.1 port numbers, reject port number 0.
OpenFlow 1.0 has special reserved ports in the range 0xfff8 to 0xffff.
OpenFlow 1.1 and later has the same ports in the range 0xfffffff8 to
0xffffffff and allows the OF1.0 range to be used for ordinary ("physical")
switch ports. This means that, naively, the meaning of a port number in
the range 0xfff8 to 0xffff given on the ovs-ofctl command line depends on
the protocol in use. This commit implements something a little smarter:
- Accept keyword names (e.g. LOCAL) for special reserved ports
everywhere that such a port can plausibly be used (previously they
were only accepted in some places).
- Translate 0xfff8...0xffff to 0xfffffff8...0xffffffff for now, since
OF1.1+ isn't in widespread use and those particular ports aren't
likely to be in use in OF1.1+ anyway.
- Log warnings about those ports when they are specified by number, to
allow users to fix their invocations.
Also:
- Accept the OF1.1+ port numbers for these ports, without warning, for
compatibility with the upcoming OF1.1+ support.
- Stop accepting port number 0, which has never been a valid port
number in OpenFlow 1.0 and later. (This required fixing some tests
that inadvertently used this port number).
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Simon Horman <horms@verge.net.au>
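The numeric part of the port handling described above could look roughly like this. It is a sketch only: example_parse_port_number() and its signature are hypothetical, keyword parsing is left out, and the real ovs-ofctl code may represent ports differently internally.

```c
#include <stdbool.h>
#include <stdint.h>

/* Normalizes a port number given numerically on the command line.
 *
 * Returns false for port 0, which is rejected.  Otherwise stores the
 * normalized port in '*out' and returns true.  '*warn' is set to true when
 * an OF1.0 reserved port number (0xfff8...0xffff) was translated to its
 * OF1.1+ equivalent, so that the caller can log a warning. */
bool
example_parse_port_number(uint32_t port, uint32_t *out, bool *warn)
{
    *warn = false;
    if (port == 0) {
        return false;
    }
    if (port >= 0xfff8 && port <= 0xffff) {
        *out = port + 0xffff0000u;  /* 0xfff8 -> 0xfffffff8, and so on. */
        *warn = true;
        return true;
    }
    *out = port;                    /* Ordinary or OF1.1+ reserved port. */
    return true;
}
```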
Ben Pfaff [Fri, 14 Sep 2012 20:09:33 +0000 (13:09 -0700)]
jsonrpc: Fix Python implementation of inactivity logic.
When a JSON-RPC session receives bytes, or when it successfully sends
queued bytes, then it should count that as activity. However, the code
here was reversed, in that it used the wrong check in each place. That is,
when it tried to receive data, it would check whether data had just been
sent, and when it tried to send data, it would check whether data had just
been received. Neither check makes sense, and neither works.
Bug #13214. Reported-by: Luca Giraudo <lgiraudo@nicira.com> CC: James Schmidt <jschmidt@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
datapath: Add version info for out-of-tree modules.
The upstream version of the module always has the version of the running kernel
but for out-of-tree modules it can be difficult to tell the current version.
This adds the information to the module where it can be read using modinfo for
the on-disk version or from /sys/module/openvswitch/version for the currently
loaded module.
Ben Pfaff [Fri, 14 Sep 2012 20:04:15 +0000 (13:04 -0700)]
tests: Fix sensitivity to record ordering in test-netflow output.
The order of records in a NetFlow message is essentially random, but the
test case was picky about it. I started getting failures when I modified
apparently unrelated code, so here's a fix.
Simon Horman [Wed, 12 Sep 2012 04:47:27 +0000 (21:47 -0700)]
ofp-util: Allow decoding of Open Flow 1.1 & 1.2 Table Statistics Request Messages
Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com then made substantial changes that were then:] Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
In the above case, the UUIDs in "row" aren't replaced by "named-uuid" because
the function doesn't look into the elements of lists.
When a list or tuple is found, look into its elements recursively.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Wed, 5 Sep 2012 20:34:35 +0000 (13:34 -0700)]
jsonrpc: Treat receiving part of a message as activity.
Until now, the jsonrpc code has only counted receiving a full JSON-RPC
message as activity. This could theoretically time out, then, while a
very long message is in transit or if a slow link is involved. This commit
changes this code to count receiving any part of a message as activity.
This isn't a problem for OpenFlow connections because OpenFlow messages are
at most 64 kB in size.
This problem hasn't actually been observed in practice.
Bug #12789. Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Fri, 7 Sep 2012 17:50:15 +0000 (10:50 -0700)]
jsonrpc: Treat draining data from send queue as activity.
Until now, the jsonrpc module has used messages received from the
remote peer as the sole means to determine that the JSON-RPC
connection is up. This could in theory interact badly with a
remote peer that stops reading and processing messages from the
receive queue when there is a backlog in the send queue for a
given connection (ovsdb-server is an example of a program that
behaves this way). This commit fixes the problem by expanding
the definition of "activity" to include successfully sending
JSON-RPC data that was previously queued.
The above change is exactly analogous to the similar change
made to the rconn library in commit 133f2dc95454 (rconn: Treat
draining a message from the send queue as activity.).
Bug #12789. Signed-off-by: Ben Pfaff <blp@nicira.com>
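The two jsonrpc commits above broaden what counts as activity. A rough sketch of the idea, assuming a hypothetical struct example_session in which a simple counter stands in for notifying the reconnect state machine:

```c
#include <stddef.h>

/* Hypothetical session state. */
struct example_session {
    size_t backlog;             /* Bytes queued for sending. */
    unsigned int n_activity;    /* Number of times activity was noted. */
};

/* Called after a receive attempt that read 'n_received' bytes.  Any bytes
 * at all count as activity, even if they do not complete a message. */
void
example_after_recv(struct example_session *s, size_t n_received)
{
    if (n_received > 0) {
        s->n_activity++;
    }
}

/* Called after a send attempt that wrote 'n_sent' queued bytes.  Draining
 * part of the send queue also counts as activity. */
void
example_after_send(struct example_session *s, size_t n_sent)
{
    if (n_sent > s->backlog) {
        n_sent = s->backlog;
    }
    if (n_sent > 0) {
        s->backlog -= n_sent;
        s->n_activity++;
    }
}
```

Each check is tied to the operation that was just attempted: received bytes are examined after receiving, sent bytes after sending.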
Ben Pfaff [Wed, 8 Aug 2012 20:32:57 +0000 (13:32 -0700)]
reconnect: Rename reconnect_received() to reconnect_activity().
Receiving data is not the only reasonable way to verify that a connection
is up. For example, on a TCP connection, receiving an acknowledgment that
the remote side has accepted data that we sent is also a reasonable means.
Therefore, this commit generalizes the naming.
Also, similarly for the Python implementation: Reconnect.received() becomes
Reconnect.activity().
Joe Stringer [Wed, 5 Sep 2012 03:25:58 +0000 (15:25 +1200)]
third-party: Fix tcpdump patch
Other parts of OVS have moved on since the tcpdump patch was created. This
commit brings the patch up to date so that it compiles cleanly against
tcpdump-4.3.0.
Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Justin Pettit <jpettit@nicira.com>
Ben Pfaff [Wed, 5 Sep 2012 17:35:20 +0000 (10:35 -0700)]
ovsdb: Enforce immutability of immutable columns.
OVSDB has always had the ability to mark a column as "immutable", so that
its value cannot be changed in a given row after that row is initially
inserted. However, we discovered recently that ovsdb-server has never
enforced this constraint. This commit implements enforcement.
Reported-by: Paul Ingram <paul@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Kyle Mestery <kmestery@cisco.com>
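Enforcement of the constraint above amounts to a check at update time. The following is a simplified sketch, not the ovsdb-server code: struct example_column and example_check_update() are hypothetical, and a real server would compare old and new values rather than take a precomputed "modified" array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical column description. */
struct example_column {
    const char *name;
    bool is_mutable;
};

/* Checks an update to an existing row.  'modified[i]' says whether the
 * update changes column i.  Returns the index of the first immutable column
 * that would be changed, or -1 if the update is acceptable. */
int
example_check_update(const struct example_column *columns,
                     const bool *modified, size_t n_columns)
{
    size_t i;

    for (i = 0; i < n_columns; i++) {
        if (modified[i] && !columns[i].is_mutable) {
            return (int) i;
        }
    }
    return -1;
}
```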
Simon Horman [Wed, 5 Sep 2012 02:50:38 +0000 (11:50 +0900)]
ofp-errors: Use OFPERR_OFPBRC_BAD_TABLE_ID
* In the case of OpenFlow 1.1+ OFPERR_OFPBRC_BAD_TABLE_ID is defined
in the specification and seems to be the most appropriate error
to use when an unknown table id is encountered.
* In the case of OpenFlow 1.0 no appropriate error message
seems to exist. Perhaps because an invalid table id is not possible?
I'm unsure.
In any case, make use of a non-standard error code (1,512).
This was formerly known as OFPERR_NXBRC_BAD_TABLE_ID but
has been rolled into OFPERR_OFPBRC_BAD_TABLE_ID to allow the
latter to be used without concern for the prevailing Open Flow version.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Wed, 5 Sep 2012 02:50:37 +0000 (11:50 +0900)]
ofp-errors: Use OFPERR_OFPBRC_BAD_PORT
* In the case of OpenFlow 1.2+ OFPERR_OFPBRC_BAD_PORT is defined
in the specification and seems to be the most appropriate error
to use when an invalid port is encountered in a Packet Out request.
* In the case of OpenFlow 1.0 and 1.1 no appropriate error message
seems to exist. Perhaps because an invalid port is not possible?
I'm unsure.
In any case, make use of a non-standard error code (1,514).
This was formerly known as OFPERR_NXBRC_BAD_IN_PORT but
has been rolled into OFPERR_OFPBRC_BAD_PORT to allow the
latter to be used without concern for the prevailing Open Flow version.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Wed, 5 Sep 2012 17:18:56 +0000 (10:18 -0700)]
extract-ofp-errors: Check that error codes are in the expected ranges.
All real OpenFlow error codes are small numbers, and for Nicira extensions
we've intentionally chosen large numbers. This commit adds a check that
standard and extension codes are properly designated in the ofp-errors.h
header.
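The range check described above can be pictured as follows. The cutoff value 0x100 is an assumption made for illustration; the commit message only says that standard codes are small and Nicira extension codes are large. example_code_in_range() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed boundary between standard and extension error codes. */
#define EXAMPLE_EXT_CODE_MIN 0x100

/* Returns true if 'code' lies in the range expected for its designation:
 * small numbers for standard codes, large numbers for extension codes. */
bool
example_code_in_range(uint16_t code, bool is_extension)
{
    return (is_extension
            ? code >= EXAMPLE_EXT_CODE_MIN
            : code < EXAMPLE_EXT_CODE_MIN);
}
```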
Simon Horman [Wed, 5 Sep 2012 02:50:36 +0000 (11:50 +0900)]
ofp-errors: Ignore text enclosed in square brackets
Enhance extract-ofp-errors to omit text enclosed in square brackets from
the error description. This allows some commentary other than
the error description to be supplied in ofp-errors.h.
As suggested by Ben Pfaff <blp@nicira.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
[blp@nicira.com added the large comment on enum ofperr.] Signed-off-by: Ben Pfaff <blp@nicira.com>
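The bracket handling described above is done by the extract-ofp-errors script itself. Purely as an illustration of the transformation, here is a small C sketch; example_strip_brackets() is hypothetical, and its treatment of nested or unbalanced brackets is an assumption.

```c
#include <stddef.h>

/* Copies 'in' to 'out' (which has room for 'out_size' bytes, including the
 * terminating null), omitting any text enclosed in square brackets along
 * with the brackets themselves.  Nested brackets are tracked by depth; a
 * stray ']' is dropped.  Output that does not fit is truncated. */
void
example_strip_brackets(const char *in, char *out, size_t out_size)
{
    size_t n = 0;
    int depth = 0;

    if (!out_size) {
        return;
    }
    for (; *in; in++) {
        if (*in == '[') {
            depth++;
        } else if (*in == ']') {
            if (depth > 0) {
                depth--;
            }
        } else if (!depth && n + 1 < out_size) {
            out[n++] = *in;
        }
    }
    out[n] = '\0';
}
```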
python/ovs/poller: use select.select instead of select.poll.
eventlet/gevent doesn't work well with select.poll because select.poll blocks
the Python interpreter as a whole instead of switching from the current
thread, which is about to block, to another runnable thread.
So the ovsdb Python binding can't be used with eventlet/gevent.
Emulate select.poll with select.select because using Python means that
performance isn't so important.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Ben Pfaff <blp@nicira.com>
OFPERR_NXFMFC_GROUPS_NOT_SUPPORTED is currently only used in paths which
are part of non-NX extension portions of the Open Flow 1.1+
implementation.
After recent discussion it has been decided to attempt to only use
standardised, albeit less-specific, errors unless errors arise from use of
an NX extension.
With the above in mind it seems appropriate to:
* Use OFPERR_OFPFMFC_UNKNOWN in place of OFPERR_NXFMFC_GROUPS_NOT_SUPPORTED.
* Remove OFPERR_NXFMFC_GROUPS_NOT_SUPPORTED as it is no longer used.
An unfortunate side-effect of this change is that the error for
the case in question is now less-specific.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Simon Horman [Tue, 4 Sep 2012 06:51:59 +0000 (15:51 +0900)]
ofp-errors: Remove OFPERR_NXBIC_DUP_TYPE
OFPERR_NXBIC_DUP_TYPE is currently only used in
decode_openflow11_instructions() which is part of a non-NX extension
portion of the Open Flow 1.1+ implementation.
After recent discussion it has been decided to attempt to only use
standardised, albeit less-specific, errors unless errors arise from use of
an NX extension.
With the above in mind it seems appropriate to:
* Use OFPERR_OFPIT_BAD_INSTRUCTION in place of OFPERR_NXBIC_DUP_TYPE.
* Remove OFPERR_NXBIC_DUP_TYPE as it is no longer used.
An unfortunate side-effect of this change is that the error for
the case in question is now less-specific.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Tue, 4 Sep 2012 19:43:53 +0000 (12:43 -0700)]
Introduce sparse flows and masks, to reduce memory usage and improve speed.
A cls_rule is 324 bytes on i386 now. The cost of a flow table lookup is
currently proportional to this size, which is going to continue to grow.
However, the required cost of a flow table lookup, with the classifier that
we currently use, is only proportional to the number of bits that a rule
actually matches. This commit implements that optimization by replacing
the match inside "struct cls_rule" by a sparse representation.
This reduces struct cls_rule to 100 bytes on i386.
There is still some headroom for further optimization following this
commit:
- I suspect that adding an 'n' member to struct miniflow would make
miniflow operations faster, since popcount() has some cost.
- It's probably possible to replace the "struct minimatch" in cls_rule
by just a "struct miniflow", since the cls_rule's cls_table has a
copy of the minimask.
- Some of the miniflow operations aren't well-optimized.
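The sparse representation described above can be sketched as a bitmap plus a packed array of the nonzero words. Everything here is a hypothetical simplification: the names, the 32-word flow size, and the fixed-size values array (a real implementation would allocate only as many values as there are bits set in the map, which is where the memory saving comes from).

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_N_WORDS 32  /* Hypothetical flow size in 32-bit words. */

/* Bit i of 'map' is set iff word i of the full flow is nonzero.  'values'
 * holds only the nonzero words, in increasing index order. */
struct example_miniflow {
    uint32_t map;
    uint32_t values[EXAMPLE_N_WORDS];
};

/* Returns the number of 1-bits in 'x'. */
unsigned int
example_popcount(uint32_t x)
{
    x -= (x >> 1) & 0x55555555;
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0f0f0f0f;
    x += x >> 8;
    x += x >> 16;
    return x & 0x3f;
}

/* Builds the sparse form of the EXAMPLE_N_WORDS words in 'full'. */
void
example_miniflow_init(struct example_miniflow *mf, const uint32_t *full)
{
    unsigned int n = 0;
    unsigned int i;

    mf->map = 0;
    for (i = 0; i < EXAMPLE_N_WORDS; i++) {
        if (full[i]) {
            mf->map |= UINT32_C(1) << i;
            mf->values[n++] = full[i];
        }
    }
}

/* Returns word 'idx' of the flow.  The position within 'values' is the
 * number of map bits set below 'idx'. */
uint32_t
example_miniflow_get(const struct example_miniflow *mf, unsigned int idx)
{
    uint32_t bit;

    assert(idx < EXAMPLE_N_WORDS);
    bit = UINT32_C(1) << idx;
    if (!(mf->map & bit)) {
        return 0;
    }
    return mf->values[example_popcount(mf->map & (bit - 1))];
}
```

The popcount() call on every lookup is the cost that the first bullet above suggests avoiding with a cached count.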
Ben Pfaff [Tue, 21 Aug 2012 21:26:23 +0000 (14:26 -0700)]
hash: Introduce an implementation of murmurhash.
Murmurhash is generally superior to the Jenkins lookup3 hash according to
the available figures. Perhaps we should generally replace our current
hashes by murmurhash.
For now, I'm introducing a parallel implementation to allow it to be used
in cases where an incremental one-word-at-a-time hash is desirable. The
first user will be added in an upcoming commit.
Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Ethan Jackson <ethan@nicira.com>
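An incremental, one-word-at-a-time hash of the kind mentioned above might look like this. The sketch uses the 32-bit MurmurHash3 mixing and finalization steps; which murmurhash variant the commit actually implements is not stated here, so treat that choice, and all the example_* names, as assumptions.

```c
#include <stddef.h>
#include <stdint.h>

static inline uint32_t
example_rotl(uint32_t x, int k)
{
    return (x << k) | (x >> (32 - k));
}

/* Mixes one 32-bit word of 'data' into the running 'hash'. */
uint32_t
example_mhash_add(uint32_t hash, uint32_t data)
{
    data *= 0xcc9e2d51;
    data = example_rotl(data, 15);
    data *= 0x1b873593;

    hash ^= data;
    hash = example_rotl(hash, 13);
    return hash * 5 + 0xe6546b64;
}

/* Finalizes 'hash' after 'n_bytes' bytes of input have been added. */
uint32_t
example_mhash_finish(uint32_t hash, size_t n_bytes)
{
    hash ^= (uint32_t) n_bytes;
    hash ^= hash >> 16;
    hash *= 0x85ebca6b;
    hash ^= hash >> 13;
    hash *= 0xc2b2ae35;
    hash ^= hash >> 16;
    return hash;
}

/* Hashes the 'n' words at 'p', starting from 'basis'. */
uint32_t
example_hash_words(const uint32_t *p, size_t n, uint32_t basis)
{
    uint32_t hash = basis;
    size_t i;

    for (i = 0; i < n; i++) {
        hash = example_mhash_add(hash, p[i]);
    }
    return example_mhash_finish(hash, n * 4);
}
```

Splitting the hash into "add" and "finish" steps is what lets a caller feed in words as it encounters them instead of first copying them into a contiguous buffer.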
Ben Pfaff [Fri, 20 Jul 2012 21:46:15 +0000 (14:46 -0700)]
classifier: Optimize iteration with a catch-all target rule.
When cls_cursor_init() is given a NULL target, it can skip an expensive
step comparing the rule against the target for every table and every rule
in the classifier. collect_rule_loose() and other callers could take
advantage of this optimization, except that they actually pass in a rule
that matches everything instead of a NULL rule (e.g. for "ovs-ofctl
dump-flows <bridge>" without specifying a matching rule).
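One way to picture the optimization above: when the target matches everything, the cursor can treat it as if no target had been given. The types and functions below are hypothetical simplifications, not the classifier API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rule: 'n_match_bits' counts the bits the rule actually
 * matches on, so 0 means a catch-all rule. */
struct example_rule {
    unsigned int n_match_bits;
};

/* Hypothetical cursor. */
struct example_cursor {
    const struct example_rule *target;  /* NULL means "every rule". */
};

/* Initializes 'cursor'.  A catch-all 'target' is replaced by NULL so that
 * iteration can skip the per-rule comparison against the target. */
void
example_cursor_init(struct example_cursor *cursor,
                    const struct example_rule *target)
{
    cursor->target = target && target->n_match_bits ? target : NULL;
}

/* Returns true if each rule must be compared against the target. */
bool
example_cursor_needs_compare(const struct example_cursor *cursor)
{
    return cursor->target != NULL;
}
```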
Ben Pfaff [Mon, 20 Aug 2012 18:29:43 +0000 (11:29 -0700)]
classifier: Prepare for "struct cls_rule" needing to be destroyed.
Until now, "struct cls_rule" didn't own any data outside its own memory
block. An upcoming commit will make "struct cls_rule" sometimes own blocks
of memory, so it needs "destroy" and to a lesser extent "clone" functions.
This commit adds these in advance, even though they are mostly no-ops, to
make it possible to separately review the memory management.
int
popcount4(unsigned int x)
{
    x -= (x >> 1) & 0x55555555;
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0f0f0f0f;
    x += x >> 8;
    x += x >> 16;
    return x & 0x3f;
}

int
popcount5(unsigned int x)
{
    int n;

    n = 0;
    while (x) {
        if (x & 0xf) {
            /* 0xe9949440 packs popcount(nibble) - 1 into 2 bits per nibble
             * value, so the shift count is twice the nibble. */
            n += ((0xe9949440 >> ((x & 0xf) * 2)) & 3) + 1;
        }
        x >>= 4;
    }
    return n;
}

int
popcount6(unsigned int x)
{
    int n;

    n = 0;
    while (x) {
        /* 0xe994 packs popcount(0)...popcount(7) into 2 bits per value. */
        n += (0xe994 >> ((x & 7) * 2)) & 3;
        x >>= 3;
    }
    return n;
}

int
popcount7(unsigned int x)
{
    static const int table[16] = {
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
    };
Ben Pfaff [Tue, 7 Aug 2012 20:43:18 +0000 (13:43 -0700)]
flow: Simplify many functions for working with flows and wildcards.
Now that "struct flow" and "struct flow_wildcards" have the same simple
and uniform structure, it's easy to handle common operations by just
iterating over the bits inside them.
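With a flow and its wildcards sharing one uniform layout, common operations reduce to loops over 32-bit words. The sketch below assumes a hypothetical struct example_words of four words and assumes that a 1-bit in the mask means "must match"; neither detail comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_FLOW_U32S 4     /* Hypothetical size in 32-bit words. */

/* Used both for a flow and for its mask (1-bit = must match). */
struct example_words {
    uint32_t u32[EXAMPLE_FLOW_U32S];
};

/* Clears every bit of 'flow' that 'mask' wildcards. */
void
example_flow_zero_wildcards(struct example_words *flow,
                            const struct example_words *mask)
{
    size_t i;

    for (i = 0; i < EXAMPLE_FLOW_U32S; i++) {
        flow->u32[i] &= mask->u32[i];
    }
}

/* Returns true if 'mask' wildcards every bit. */
bool
example_mask_is_catchall(const struct example_words *mask)
{
    size_t i;

    for (i = 0; i < EXAMPLE_FLOW_U32S; i++) {
        if (mask->u32[i]) {
            return false;
        }
    }
    return true;
}

/* Returns true if 'a' and 'b' agree on every bit that 'mask' requires. */
bool
example_flow_equal_except(const struct example_words *a,
                          const struct example_words *b,
                          const struct example_words *mask)
{
    size_t i;

    for (i = 0; i < EXAMPLE_FLOW_U32S; i++) {
        if ((a->u32[i] ^ b->u32[i]) & mask->u32[i]) {
            return false;
        }
    }
    return true;
}
```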
Ben Pfaff [Mon, 18 Jun 2012 22:46:13 +0000 (15:46 -0700)]
flow: Remove flow_wildcards_is_exact().
It's only used in a not-very-useful assertion in some test code. In
general, exact-match flows make very little sense anymore, and they're
basically on their way out.
Ben Pfaff [Mon, 18 Jun 2012 16:55:22 +0000 (09:55 -0700)]
flow: Fully separate FWW_* from OFPFW10_*.
It might have been a useful optimization at one point to have FWW_*
correspond to OFPFW10_* where possible, but it doesn't seem worthwhile for
only 3 corresponding values. It also makes the code somewhat more
confusing.
Commit f36ff26ac08bcba28b5befba7b1dcff9b963769d (datapath: Remove
vport list node.) removed an unused list node in struct vport but
did not remove the comment describing it. This does that.
This is analogous to the change made in userspace with 2508ac16defd417b94fb69689b6b1da4fbc76282 (odp-util: Update
ODPUTIL_FLOW_KEY_BYTES for current kernel flow format.). The extra
space for vlan encapsulation was not included in the allocation for
maximum length flows.
Found by code inspection and to my knowledge has never been hit, likely
because skb allocations are padded out to a cacheline, making userspace
more susceptible to this problem than the kernel. In theory, however,
the right combination of flow and packet size could result in a kernel
panic.
Ethan Jackson [Wed, 29 Aug 2012 23:00:31 +0000 (16:00 -0700)]
vswitchd: Respect other_config:stp-enable port setting.
Commit a699f614 (lib: Utilize smaps in the idl.) broke the
other_config:stp-enable port setting in two ways. First, it
changed the default if the setting was missing to disabled.
Second, if the setting was present, it did the opposite of what the
user configured.
Bug #13122. Reported-by: Paul Ingram <paul@nicira.com> Signed-off-by: Ethan Jackson <ethan@nicira.com>
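The two failure modes above (wrong default, inverted value) both come down to how a boolean is read from a string-valued setting. A minimal sketch, assuming hypothetical helpers and case-sensitive "true"/"false" strings; the real lookup goes through the smap library and may differ in detail:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Interprets 'value' as a boolean.  A missing value (NULL) yields 'def'.
 * Anything other than the string opposite to 'def' also yields 'def'. */
bool
example_get_bool(const char *value, bool def)
{
    if (!value) {
        return def;
    }
    return def ? strcmp(value, "false") != 0 : strcmp(value, "true") == 0;
}

/* STP is enabled on a port unless stp-enable is explicitly "false". */
bool
example_stp_enabled(const char *stp_enable_value)
{
    return example_get_bool(stp_enable_value, true);
}
```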
Isaku Yamahata [Tue, 28 Aug 2012 17:19:03 +0000 (02:19 +0900)]
ofproto-dpif: Make OFPP_TABLE send packet-in on miss.
The OpenFlow specification for OFPP_TABLE implies that a miss
should generate a packet-in, but Open vSwitch has never done
that. This corrects the behavior.
This also prepares for the goto-table instruction, which will
need to generate a table-miss in some circumstances.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
[blp@nicira.com rebased and updated commit message] Signed-off-by: Ben Pfaff <blp@nicira.com>