Aliasgar Ginwala [Fri, 30 Aug 2019 15:28:34 +0000 (08:28 -0700)]
ovsdb-tool: Convert clustered db to standalone db.
Add support in ovsdb-tool for migrating clustered databases to standalone
databases. Example usage, migrating an OVN NB or SB database from Raft to standalone:
ovsdb-tool cluster-to-standalone ovnnb_db.db ovnnb_db_cluster.db
Acked-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Aliasgar Ginwala <aginwala@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
It's possible that a port added to the system with certain kinds
of invalid parameters will cause the 'could not add' log to be
triggered. When this happens, the vswitch run loop can continually
re-attempt adding the port. While the parameters remain invalid
the vswitch run loop will re-trigger the warning, flooding the
syslog.
This patch adds a simple rate limit to the log.
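The idea of rate limiting a repeated warning can be sketched with a small
token bucket. This is a standalone illustration, not the code from the patch:
OVS itself provides VLOG_RATE_LIMIT_INIT() and VLOG_WARN_RL() for this purpose,
while the names rate_limit_init() and rate_limit_allow() below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token-bucket limiter; not the OVS vlog implementation. */
struct rate_limit {
    unsigned int rate_per_sec;  /* Tokens added per second. */
    unsigned int burst;         /* Maximum tokens that may accumulate. */
    unsigned int tokens;        /* Tokens currently available. */
    long long last_fill_sec;    /* Time of the last refill, in seconds. */
};

static void
rate_limit_init(struct rate_limit *rl, unsigned int rate_per_sec,
                unsigned int burst, long long now_sec)
{
    assert(rate_per_sec > 0 && burst > 0);
    rl->rate_per_sec = rate_per_sec;
    rl->burst = burst;
    rl->tokens = burst;
    rl->last_fill_sec = now_sec;
}

/* Returns true if one message may be logged at time 'now_sec'. */
static bool
rate_limit_allow(struct rate_limit *rl, long long now_sec)
{
    if (now_sec > rl->last_fill_sec) {
        long long total = rl->tokens
                          + (now_sec - rl->last_fill_sec) * rl->rate_per_sec;

        rl->tokens = total > rl->burst ? rl->burst : (unsigned int) total;
        rl->last_fill_sec = now_sec;
    }
    if (rl->tokens == 0) {
        return false;           /* Suppress the message. */
    }
    rl->tokens--;
    return true;
}
```

A run loop that retries a failing port on every iteration would call
rate_limit_allow() before logging, so only 'burst' messages appear at once and
at most 'rate_per_sec' per second afterwards.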
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
stream_ssl: fix important memory leak in ssl_connect() function
While checking valgrind reports after running "make check-valgrind", I noticed
reports similar to the following for several tests:
....
==5345== Memcheck, a memory error detector
==5345== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==5345== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==5345== Command: ovsdb-client --private-key=/home/damijan.skvarc/doma/ovs/tests/testpki-privkey.pem --certificate=/home/damijan.skvarc/doma/ovs/tests/testpki-cert.pem --ca-cert=/home/damijan.skvarc/doma/ovs/tests/testpki-cacert.pem transact ssl:127.0.0.1:40111 \ \ \ ["ordinals",
==5345== \ \ \ \ \ \ {"op":\ "update",
==5345== \ \ \ \ \ \ \ "table":\ "ordinals",
==5345== \ \ \ \ \ \ \ "where":\ [["number",\ "==",\ 1]],
==5345== \ \ \ \ \ \ \ "row":\ {"number":\ 2,\ "name":\ "old\ two"}},
==5345== \ \ \ \ \ \ {"op":\ "update",
==5345== \ \ \ \ \ \ \ "table":\ "ordinals",
==5345== \ \ \ \ \ \ \ "where":\ [["name",\ "==",\ "two"]],
==5345== \ \ \ \ \ \ \ "row":\ {"number":\ 1,\ "name":\ "old\ one"}}]
==5345== Parent PID: 5344
==5345==
==5345==
==5345== HEAP SUMMARY:
==5345== in use at exit: 116,551 bytes in 3,341 blocks
==5345== total heap usage: 5,134 allocs, 1,793 frees, 412,290 bytes allocated
==5345==
==5345== 6,221 (184 direct, 6,037 indirect) bytes in 1 blocks are definitely lost in loss record 498 of 500
==5345== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5345== by 0x5105E77: CRYPTO_malloc (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==5345== by 0x51E1D23: ??? (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==5345== by 0x51E4861: ??? (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==5345== by 0x51E5414: ASN1_item_ex_d2i (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==5345== by 0x51E546A: ASN1_item_d2i (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==5345== by 0x4E56B27: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==5345== by 0x4E5BA11: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==5345== by 0x4E65145: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==5345== by 0x4522DF: ssl_connect (stream-ssl.c:530)
==5345== by 0x443D38: scs_connecting (stream.c:315)
==5345== by 0x443D38: stream_connect (stream.c:338)
==5345== by 0x443FA1: stream_open_block (stream.c:266)
==5345== by 0x40AB79: open_jsonrpc (ovsdb-client.c:507)
==5345== by 0x40AB79: open_rpc (ovsdb-client.c:143)
==5345== by 0x40B06B: do_transact__ (ovsdb-client.c:871)
==5345== by 0x40B245: do_transact (ovsdb-client.c:893)
==5345== by 0x405F76: main (ovsdb-client.c:282)
==5345==
==5345== LEAK SUMMARY:
==5345== definitely lost: 184 bytes in 1 blocks
==5345== indirectly lost: 6,037 bytes in 117 blocks
==5345== possibly lost: 0 bytes in 0 blocks
==5345== still reachable: 110,330 bytes in 3,223 blocks
==5345== suppressed: 0 bytes in 0 blocks
==5345== Reachable blocks (those to which a pointer was found) are not shown.
==5345== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==5345==
==5345== For counts of detected and suppressed errors, rerun with: -v
==5345== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
....
This report was extracted from the "index uniqueness checking" test and complains about
leaked memory in the ovsdb-client application. The problem is not huge, since ovsdb-client
is a CLI tool that is constantly reinvoked/restarted, so leaked memory does not accumulate.
A more serious issue is that, for the same test, valgrind reports a similar problem for
ovsdb-server:
....
==5290== Memcheck, a memory error detector
==5290== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==5290== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==5290== Command: ovsdb-server --log-file --detach --no-chdir --pidfile --private-key=/home/damijan.skvarc/doma/ovs/tests/testpki-privkey2.pem --certificate=/home/damijan.skvarc/doma/ovs/tests/testpki-cert2.pem --ca-cert=/home/damijan.skvarc/doma/ovs/tests/testpki-cacert.pem --remote=pssl:0:127.0.0.1 db
==5290== Parent PID: 5289
==5290==
==5292== Warning: noted but unhandled ioctl 0x2403 with no size/direction hints.
==5292== This could cause spurious value errors to appear.
==5292== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==5292== Warning: noted but unhandled ioctl 0x2400 with no size/direction hints.
==5292== This could cause spurious value errors to appear.
==5292== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==5290==
==5290== HEAP SUMMARY:
==5290== in use at exit: 2,066 bytes in 48 blocks
==5290== total heap usage: 87 allocs, 39 frees, 14,152 bytes allocated
==5290==
==5290== LEAK SUMMARY:
==5290== definitely lost: 0 bytes in 0 blocks
==5290== indirectly lost: 0 bytes in 0 blocks
==5290== possibly lost: 0 bytes in 0 blocks
==5290== still reachable: 2,066 bytes in 48 blocks
==5290== suppressed: 0 bytes in 0 blocks
==5290== Reachable blocks (those to which a pointer was found) are not shown.
==5290== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==5290==
==5290== For counts of detected and suppressed errors, rerun with: -v
==5290== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
==5292== Warning: noted but unhandled ioctl 0x2401 with no size/direction hints.
==5292== This could cause spurious value errors to appear.
==5292== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==5292==
==5292== HEAP SUMMARY:
==5292== in use at exit: 164,018 bytes in 4,252 blocks
==5292== total heap usage: 17,910 allocs, 13,658 frees, 1,907,468 bytes allocated
==5292==
==5292== 49,720 (1,472 direct, 48,248 indirect) bytes in 8 blocks are definitely lost in loss record 580 of 580
==5292== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5292== by 0x5105E77: CRYPTO_malloc (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==5292== by 0x51E1D23: ??? (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==5292== by 0x51E4861: ??? (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==5292== by 0x51E5414: ASN1_item_ex_d2i (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==5292== by 0x51E546A: ASN1_item_d2i (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==5292== by 0x4E53E00: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==5292== by 0x4E55727: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==5292== by 0x452C4B: ssl_connect (stream-ssl.c:530)
==5292== by 0x445B18: scs_connecting (stream.c:315)
==5292== by 0x445B18: stream_connect (stream.c:338)
==5292== by 0x445B91: stream_recv (stream.c:369)
==5292== by 0x432A9C: jsonrpc_recv.part.7 (jsonrpc.c:310)
==5292== by 0x433977: jsonrpc_recv (jsonrpc.c:1139)
==5292== by 0x433977: jsonrpc_session_recv (jsonrpc.c:1112)
==5292== by 0x40CCE3: ovsdb_jsonrpc_session_run (jsonrpc-server.c:553)
==5292== by 0x40CCE3: ovsdb_jsonrpc_session_run_all (jsonrpc-server.c:586)
==5292== by 0x40CCE3: ovsdb_jsonrpc_server_run (jsonrpc-server.c:401)
==5292== by 0x40682E: main_loop (ovsdb-server.c:209)
==5292== by 0x40682E: main (ovsdb-server.c:460)
==5292==
==5292== LEAK SUMMARY:
==5292== definitely lost: 1,472 bytes in 8 blocks
==5292== indirectly lost: 48,248 bytes in 936 blocks
==5292== possibly lost: 0 bytes in 0 blocks
==5292== still reachable: 114,298 bytes in 3,308 blocks
==5292== suppressed: 0 bytes in 0 blocks
==5292== Reachable blocks (those to which a pointer was found) are not shown.
==5292== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==5292==
==5292== For counts of detected and suppressed errors, rerun with: -v
==5292== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 1 from 1)
....
In this case ovsdb-server runs as a daemon process (--detach option), so leaked memory
accumulates each time ovsdb-client reconnects. In the observed test the ovsdb-client CLI tool
connects to ovsdb-server 8 times. ovsdb-client leaks approx. 6 KB per invocation, while
ovsdb-server leaks approx. 48 KB, which is 8 * 6 KB. Thus both ovsdb-client and ovsdb-server
leak approx. 6 KB per connection.
I did a small manual test to check whether ovsdb-server really accumulates leaked memory
by dumping ovsdb-server in a loop:
In console2 it was evident that ovsdb-server was constantly leaking memory. After a while
(i.e. after a certain number of reconnections) the OOM killer steps in and kills ovsdb-server.
A very similar situation was already noticed and described in
https://github.com/openvswitch/ovs-issues/issues/168. There, the problem shows up when a
controller connects to the ovs-vswitchd daemon.
Valgrind reports point to a problem in the openssl library; however, after studying the openssl
code for a while I found that the problem is actually in ovs. When a connection is established
over an SSL channel, the openssl library allocates memory to keep track of the peer certificate.
A reference to this memory works much like std::shared_ptr in recent C++ dialects, i.e. the
reference counter is incremented when the memory is referenced and decremented when the
reference is released. When the reference counter reaches zero, the memory is automatically
deallocated.
In the openssl library the certificate is retrieved by calling SSL_get_peer_certificate(),
which increments its reference counter. Once the retrieved certificate is no longer used, its
reference counter must be decremented by calling X509_free(). Otherwise the allocated memory is
never freed, even if the ssl connection is properly closed.
The problem was in the ssl_connect() function in stream-ssl.c, which retrieves the peer common
name by calling SSL_get_peer_certificate() without calling X509_free() afterwards.
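The reference-counting rule described above can be modeled without openssl.
The following is only a sketch under stated assumptions: 'struct cert',
cert_get(), cert_put() and peer_name() are hypothetical stand-ins for the X509
object, SSL_get_peer_certificate(), X509_free() and the name lookup in
ssl_connect(); it is not the actual fix.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted object standing in for a certificate. */
struct cert {
    int refcount;
    char *common_name;
};

static int live_certs;          /* Certificates allocated and not yet freed. */

static struct cert *
cert_create(const char *common_name)
{
    struct cert *c = malloc(sizeof *c);
    size_t len = strlen(common_name) + 1;

    if (!c || !(c->common_name = malloc(len))) {
        abort();
    }
    memcpy(c->common_name, common_name, len);
    c->refcount = 1;            /* Reference owned by the session. */
    live_certs++;
    return c;
}

/* Hands out a new reference, like a "get" accessor that bumps the count. */
static struct cert *
cert_get(struct cert *c)
{
    c->refcount++;
    return c;
}

/* Drops one reference and frees the object when the last one is gone. */
static void
cert_put(struct cert *c)
{
    assert(c->refcount > 0);
    if (--c->refcount == 0) {
        free(c->common_name);
        free(c);
        live_certs--;
    }
}

/* Copies the peer name into 'buf' and releases the reference it took. */
static void
peer_name(struct cert *session_cert, char *buf, size_t size)
{
    struct cert *c = cert_get(session_cert);

    snprintf(buf, size, "%s", c->common_name);
    cert_put(c);                /* Omitting this leaks the certificate. */
}
```

If peer_name() skipped the cert_put() call, the session's own release would
leave the count at 1 and the object would never be freed, which is the shape of
the leak valgrind reported.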
Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Mon, 16 Sep 2019 18:56:59 +0000 (11:56 -0700)]
Documentation: Work with sphinx-build for Python 3 also.
There's nothing in OVS specific to Sphinx for Python 2, but the
compile-time check only looked for a binary named "sphinx-build", which is
typically provided only for Python 2. With Python 3, the binary is
typically called "sphinx-build-3". With this commit, either name is
accepted.
Acked-by: Numan Siddique <nusididq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 11 Sep 2019 21:18:36 +0000 (14:18 -0700)]
conntrack: Validate accessing of conntrack data in pkt_metadata
Valgrind reported:
1305: ofproto-dpif - conntrack - ipv6
==26942== Conditional jump or move depends on uninitialised value(s)
==26942== at 0x587C00: check_orig_tuple (conntrack.c:1006)
==26942== by 0x587C00: process_one (conntrack.c:1141)
==26942== by 0x587C00: conntrack_execute (conntrack.c:1220)
==26942== by 0x47B00F: dp_execute_cb (dpif-netdev.c:7305)
==26942== by 0x4AF756: odp_execute_actions (odp-execute.c:794)
==26942== by 0x477532: dp_netdev_execute_actions (dpif-netdev.c:7349)
==26942== by 0x477532: handle_packet_upcall (dpif-netdev.c:6630)
==26942== by 0x477532: fast_path_processing (dpif-netdev.c:6726)
==26942== by 0x47933C: dp_netdev_input__ (dpif-netdev.c:6814)
==26942== by 0x479AB8: dp_netdev_input (dpif-netdev.c:6852)
==26942== by 0x479AB8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
==26942== by 0x47A6A9: dpif_netdev_run (dpif-netdev.c:5264)
==26942== by 0x4324E7: type_run (ofproto-dpif.c:342)
==26942== by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
==26942== by 0x40BAAC: bridge_run__ (bridge.c:2965)
==26942== by 0x410CF3: bridge_run (bridge.c:3029)
==26942== by 0x407614: main (ovs-vswitchd.c:127)
==26942== Uninitialised value was created by a heap allocation
==26942== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26942== by 0x532574: xmalloc (util.c:138)
==26942== by 0x46CD62: dp_packet_new (dp-packet.c:153)
==26942== by 0x4A0431: eth_from_flow_str (netdev-dummy.c:1644)
==26942== by 0x4A0431: netdev_dummy_receive (netdev-dummy.c:1783)
==26942== by 0x531990: process_command (unixctl.c:308)
==26942== by 0x531990: run_connection (unixctl.c:342)
==26942== by 0x531990: unixctl_server_run (unixctl.c:393)
==26942== by 0x40761E: main (ovs-vswitchd.c:128)
1316: ofproto-dpif - conntrack - tcp port reuse
==24039== Conditional jump or move depends on uninitialised value(s)
==24039== at 0x587BF5: check_orig_tuple (conntrack.c:1004)
==24039== by 0x587BF5: process_one (conntrack.c:1141)
==24039== by 0x587BF5: conntrack_execute (conntrack.c:1220)
==24039== by 0x47B02F: dp_execute_cb (dpif-netdev.c:7306)
==24039== by 0x4AF7A6: odp_execute_actions (odp-execute.c:794)
==24039== by 0x47755B: dp_netdev_execute_actions (dpif-netdev.c:7350)
==24039== by 0x47755B: handle_packet_upcall (dpif-netdev.c:6631)
==24039== by 0x47755B: fast_path_processing (dpif-netdev.c:6727)
==24039== by 0x47935C: dp_netdev_input__ (dpif-netdev.c:6815)
==24039== by 0x479AD8: dp_netdev_input (dpif-netdev.c:6853)
==24039== by 0x479AD8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
==24039== by 0x47A6C9: dpif_netdev_run (dpif-netdev.c:5264)
==24039== by 0x4324F7: type_run (ofproto-dpif.c:342)
==24039== by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
==24039== by 0x40BAAC: bridge_run__ (bridge.c:2965)
==24039== by 0x410CF3: bridge_run (bridge.c:3029)
==24039== by 0x407614: main (ovs-vswitchd.c:127)
==24039== Uninitialised value was created by a heap allocation
==24039== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24039== by 0x5325C4: xmalloc (util.c:138)
==24039== by 0x46D144: dp_packet_new (dp-packet.c:153)
==24039== by 0x46D144: dp_packet_new_with_headroom (dp-packet.c:163)
==24039== by 0x51191E: eth_from_hex (packets.c:498)
==24039== by 0x4A03B9: eth_from_packet (netdev-dummy.c:1609)
==24039== by 0x4A03B9: netdev_dummy_receive (netdev-dummy.c:1765)
==24039== by 0x5319E0: process_command (unixctl.c:308)
==24039== by 0x5319E0: run_connection (unixctl.c:342)
==24039== by 0x5319E0: unixctl_server_run (unixctl.c:393)
==24039== by 0x40761E: main (ovs-vswitchd.c:128)
According to comments in pkt_metadata_init(), conntrack data is valid
only if pkt_metadata.ct_state != 0. This patch prevents
check_orig_tuple() from being called when conntrack data is uninitialized.
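The guard can be illustrated with a reduced metadata structure. This is a
sketch, not the patch: 'struct pkt_md' and its single ct_orig_port field are
hypothetical simplifications of the conntrack fields in pkt_metadata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the conntrack part of pkt_metadata. */
struct pkt_md {
    uint32_t ct_state;          /* Zero: the field below is not initialized. */
    uint16_t ct_orig_port;      /* Only meaningful when ct_state != 0. */
};

/* Returns true if 'md' carries valid conntrack data. */
static bool
md_has_ct_data(const struct pkt_md *md)
{
    return md->ct_state != 0;
}

/* Compares the original-direction port, but only reads it when it is valid. */
static bool
orig_port_matches(const struct pkt_md *md, uint16_t port)
{
    assert(md != NULL);
    return md_has_ct_data(md) && md->ct_orig_port == port;
}
```

Because the validity check comes first in the '&&' expression, the
uninitialized field is never read when ct_state is zero.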
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 11 Sep 2019 21:18:35 +0000 (14:18 -0700)]
db-ctl-base: Free leaked ovsdb_datum
Valgrind reported:
2491: database commands -- negative checks
==19245== 36 (32 direct, 4 indirect) bytes in 1 blocks are definitely lost in loss record 36 of 53
==19245== at 0x4C2FD5F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19245== by 0x431AB4: xrealloc (util.c:149)
==19245== by 0x41656D: ovsdb_datum_reallocate (ovsdb-data.c:1883)
==19245== by 0x41656D: ovsdb_datum_union (ovsdb-data.c:1961)
==19245== by 0x4107B2: cmd_add (db-ctl-base.c:1494)
==19245== by 0x406E2E: do_vsctl (ovs-vsctl.c:2626)
==19245== by 0x406E2E: main (ovs-vsctl.c:183)
==19252== 16 bytes in 1 blocks are definitely lost in loss record 9 of 52
==19252== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19252== by 0x430F74: xmalloc (util.c:138)
==19252== by 0x414D07: clone_atoms (ovsdb-data.c:990)
==19252== by 0x4153F6: ovsdb_datum_clone (ovsdb-data.c:1012)
==19252== by 0x4104D3: cmd_remove (db-ctl-base.c:1564)
==19252== by 0x406E2E: do_vsctl (ovs-vsctl.c:2626)
==19252== by 0x406E2E: main (ovs-vsctl.c:183)
This patch fixes them.
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 11 Sep 2019 21:18:34 +0000 (14:18 -0700)]
ofproto-dpif: Free leaked 'webster'
Valgrind reported:
1122: ofproto-dpif - select group with explicit dp_hash selection method
==16884== 64 bytes in 1 blocks are definitely lost in loss record 320 of 346
==16884== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16884== by 0x532512: xcalloc (util.c:121)
==16884== by 0x4262B9: group_setup_dp_hash_table (ofproto-dpif.c:4846)
==16884== by 0x4267CB: group_set_selection_method (ofproto-dpif.c:4938)
==16884== by 0x4267CB: group_construct (ofproto-dpif.c:4984)
==16884== by 0x417250: init_group (ofproto.c:7286)
==16884== by 0x41B4FC: add_group_start (ofproto.c:7316)
==16884== by 0x42247A: ofproto_group_mod_start (ofproto.c:7589)
==16884== by 0x4250EC: handle_group_mod (ofproto.c:7744)
==16884== by 0x4250EC: handle_single_part_openflow (ofproto.c:8428)
==16884== by 0x4250EC: handle_openflow (ofproto.c:8606)
==16884== by 0x4579E2: ofconn_run (connmgr.c:1318)
==16884== by 0x4579E2: connmgr_run (connmgr.c:355)
==16884== by 0x41E0F5: ofproto_run (ofproto.c:1845)
==16884== by 0x40BA63: bridge_run__ (bridge.c:2971)
==16884== by 0x410CF3: bridge_run (bridge.c:3029)
==16884== by 0x407614: main (ovs-vswitchd.c:127)
This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 11 Sep 2019 21:18:33 +0000 (14:18 -0700)]
dns-resolve: Free 'struct ub_result' when callback returns error results
Valgrind reported:
1074: ofproto - flush flows, groups, and meters for controller change
==5499== 695 (288 direct, 407 indirect) bytes in 3 blocks are definitely lost in loss record 344 of 355
==5499== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5499== by 0x5E7F145: ??? (in /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0)
==5499== by 0x5E6EBDE: ub_resolve_async (in /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0)
==5499== by 0x55C739: resolve_async__.part.5 (dns-resolve.c:233)
==5499== by 0x55C85C: resolve_async__ (dns-resolve.c:261)
==5499== by 0x55C85C: resolve_callback__ (dns-resolve.c:262)
==5499== by 0x5E6FEF1: ub_process (in /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0)
==5499== by 0x55CAF3: dns_resolve (dns-resolve.c:153)
==5499== by 0x523864: parse_sockaddr_components_dns (socket-util.c:438)
==5499== by 0x523864: parse_sockaddr_components (socket-util.c:504)
==5499== by 0x524468: inet_parse_active (socket-util.c:541)
==5499== by 0x524564: inet_open_active (socket-util.c:579)
==5499== by 0x5959F9: tcp_open (stream-tcp.c:56)
==5499== by 0x529192: stream_open (stream.c:228)
==5499== by 0x529910: stream_open_with_default_port (stream.c:724)
==5499== by 0x595FAE: vconn_stream_open (vconn-stream.c:81)
==5499== by 0x535C9B: vconn_open (vconn.c:250)
==5499== by 0x517C59: reconnect (rconn.c:467)
==5499== by 0x5184C7: run_BACKOFF (rconn.c:492)
==5499== by 0x5184C7: rconn_run (rconn.c:660)
==5499== by 0x457FE8: ofservice_run (connmgr.c:1992)
==5499== by 0x457FE8: connmgr_run (connmgr.c:367)
==5499== by 0x41E0F5: ofproto_run (ofproto.c:1845)
==5499== by 0x40BA63: bridge_run__ (bridge.c:2971)
In the callback function of ub_resolve_async, 'struct ub_result' should
always be freed, even if there is a resolving error. This patch fixes it.
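The ownership rule can be sketched as a callback that releases its result on
every path. The code below does not use libunbound: 'struct result',
result_new(), result_free() and resolve_callback() are hypothetical stand-ins
(in libunbound the release function is ub_resolve_free()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result object owned by the callback once it is invoked. */
struct result {
    char *data;                 /* NULL when the lookup produced no answer. */
};

static int live_results;        /* Results allocated and not yet freed. */

static struct result *
result_new(const char *data)
{
    struct result *r = calloc(1, sizeof *r);

    if (!r) {
        abort();
    }
    if (data) {
        size_t len = strlen(data) + 1;

        r->data = malloc(len);
        if (!r->data) {
            abort();
        }
        memcpy(r->data, data, len);
    }
    live_results++;
    return r;
}

static void
result_free(struct result *r)
{
    if (r) {
        free(r->data);
        free(r);
        live_results--;
    }
}

/* Copies the answer to 'out' on success; frees 'r' whether or not it worked. */
static bool
resolve_callback(int err, struct result *r, char *out, size_t size)
{
    bool ok = !err && r && r->data;

    assert(size > 0);
    out[0] = '\0';
    if (ok) {
        snprintf(out, size, "%s", r->data);
    }
    result_free(r);             /* Runs on the error paths too. */
    return ok;
}
```

Placing the single result_free() call after the success branch, rather than
inside it, is what keeps the error and no-answer paths from leaking.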
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 11 Sep 2019 21:18:32 +0000 (14:18 -0700)]
ovsdb-client: Free ovsdb_schema
Valgrind reported:
1925: schema conversion online - standalone
==10727== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely lost in loss record 64 of 66
==10727== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10727== by 0x449D42: xcalloc (util.c:121)
==10727== by 0x40F45C: ovsdb_schema_create (ovsdb.c:41)
==10727== by 0x40F7F8: ovsdb_schema_from_json (ovsdb.c:217)
==10727== by 0x40FB4E: ovsdb_schema_from_file (ovsdb.c:101)
==10727== by 0x40B156: do_convert (ovsdb-client.c:1639)
==10727== by 0x4061C6: main (ovsdb-client.c:282)
This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 11 Sep 2019 21:18:31 +0000 (14:18 -0700)]
trigger: Free leaked ovsdb_schema
Valgrind reported:
1925: schema conversion online - standalone
==10884== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely lost in loss record 384 of 420
==10884== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10884== by 0x44A592: xcalloc (util.c:121)
==10884== by 0x40E2EC: ovsdb_schema_create (ovsdb.c:41)
==10884== by 0x40E688: ovsdb_schema_from_json (ovsdb.c:217)
==10884== by 0x416C6F: ovsdb_trigger_try (trigger.c:246)
==10884== by 0x40D4DE: ovsdb_jsonrpc_trigger_create (jsonrpc-server.c:1119)
==10884== by 0x40D4DE: ovsdb_jsonrpc_session_got_request (jsonrpc-server.c:986)
==10884== by 0x40D4DE: ovsdb_jsonrpc_session_run (jsonrpc-server.c:556)
==10884== by 0x40D4DE: ovsdb_jsonrpc_session_run_all (jsonrpc-server.c:586)
==10884== by 0x40D4DE: ovsdb_jsonrpc_server_run (jsonrpc-server.c:401)
==10884== by 0x406A6E: main_loop (ovsdb-server.c:209)
==10884== by 0x406A6E: main (ovsdb-server.c:460)
'new_schema' should also be freed when there is no error.
This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 11 Sep 2019 21:18:30 +0000 (14:18 -0700)]
ovs-ofctl: Free leaked minimatch
Valgrind reported:
1056: ofproto - bundle with multiple flow mods (OpenFlow 1.4)
==19220== 160 bytes in 2 blocks are definitely lost in loss record 24 of 34
==19220== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19220== by 0x4979A4: xmalloc (util.c:138)
==19220== by 0x42407D: miniflow_alloc (flow.c:3340)
==19220== by 0x4296CF: minimatch_init (match.c:1758)
==19220== by 0x46273D: parse_ofp_str__ (ofp-flow.c:1759)
==19220== by 0x465B9E: parse_ofp_str (ofp-flow.c:1790)
==19220== by 0x465CE0: parse_ofp_flow_mod_str (ofp-flow.c:1817)
==19220== by 0x465DF6: parse_ofp_flow_mod_file (ofp-flow.c:1876)
==19220== by 0x410BA3: ofctl_flow_mod_file.isra.19 (ovs-ofctl.c:1773)
==19220== by 0x417933: ovs_cmdl_run_command__ (command-line.c:223)
==19220== by 0x406F68: main (ovs-ofctl.c:179)
This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 11 Sep 2019 21:18:29 +0000 (14:18 -0700)]
dpif-netdev: Handle uninitialized value error for 'match.wc'
Valgrind reported that match.wc was not initialized, as below:
1176: ofproto-dpif - fragment handling - actions
==21214== Conditional jump or move depends on uninitialised value(s)
==21214== at 0x4B77C1: odp_flow_key_from_flow__ (odp-util.c:6143)
==21214== by 0x46DB58: dp_netdev_upcall (dpif-netdev.c:6239)
==21214== by 0x4774A7: handle_packet_upcall (dpif-netdev.c:6608)
==21214== by 0x4774A7: fast_path_processing (dpif-netdev.c:6726)
==21214== by 0x47933C: dp_netdev_input__ (dpif-netdev.c:6814)
==21214== by 0x479AB8: dp_netdev_input (dpif-netdev.c:6852)
==21214== by 0x479AB8: dp_netdev_process_rxq_port (dpif-netdev.c:4287)
==21214== by 0x47A6A9: dpif_netdev_run (dpif-netdev.c:5264)
==21214== by 0x4324E7: type_run (ofproto-dpif.c:342)
==21214== by 0x41C5FE: ofproto_type_run (ofproto.c:1734)
==21214== by 0x40BAAC: bridge_run__ (bridge.c:2965)
==21214== by 0x410CF3: bridge_run (bridge.c:3029)
==21214== by 0x407614: main (ovs-vswitchd.c:127)
==21214== Uninitialised value was created by a stack allocation
==21214== at 0x4769C3: fast_path_processing (dpif-netdev.c:6672)
'match' is allocated on stack but its 'wc' is accessed in
odp_flow_key_from_flow__ without proper initialization.
This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 11 Sep 2019 21:18:28 +0000 (14:18 -0700)]
ofproto-dpif: Uninitialize 'xlate_cache' to free resources
Valgrind reported:
1210: ofproto-dpif - continuation after clone
==32205== 4,392 (1,440 direct, 2,952 indirect) bytes in 12 blocks are definitely lost in loss record 359 of 362
==32205== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==32205== by 0x532574: xmalloc (util.c:138)
==32205== by 0x4F98CA: ofpbuf_init (ofpbuf.c:123)
==32205== by 0x42C07B: nxt_resume (ofproto-dpif.c:5110)
==32205== by 0x41796F: handle_nxt_resume (ofproto.c:3677)
==32205== by 0x424583: handle_single_part_openflow (ofproto.c:8473)
==32205== by 0x424583: handle_openflow (ofproto.c:8606)
==32205== by 0x4579E2: ofconn_run (connmgr.c:1318)
==32205== by 0x4579E2: connmgr_run (connmgr.c:355)
==32205== by 0x41E0F5: ofproto_run (ofproto.c:1845)
==32205== by 0x40BA63: bridge_run__ (bridge.c:2971)
==32205== by 0x410CF3: bridge_run (bridge.c:3029)
==32205== by 0x407614: main (ovs-vswitchd.c:127)
This is because 'xcache' was not destroyed properly. This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yifeng Sun [Wed, 11 Sep 2019 21:18:27 +0000 (14:18 -0700)]
raft: Free leaked json data
Valgrind reported:
1924: compacting online - cluster
==29312== 2,886 (240 direct, 2,646 indirect) bytes in 6 blocks are definitely lost in loss record 406 of 413
==29312== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==29312== by 0x44A5F4: xmalloc (util.c:138)
==29312== by 0x4308EA: json_create (json.c:1451)
==29312== by 0x4308EA: json_object_create (json.c:254)
==29312== by 0x430ED0: json_parser_push_object (json.c:1273)
==29312== by 0x430ED0: json_parser_input (json.c:1371)
==29312== by 0x431CF1: json_lex_input (json.c:991)
==29312== by 0x43233B: json_parser_feed (json.c:1149)
==29312== by 0x41D87F: parse_body.isra.0 (log.c:411)
==29312== by 0x41E141: ovsdb_log_read (log.c:476)
==29312== by 0x42646D: raft_read_log (raft.c:866)
==29312== by 0x42646D: raft_open (raft.c:951)
==29312== by 0x4151AF: ovsdb_storage_open__ (storage.c:81)
==29312== by 0x408FFC: open_db (ovsdb-server.c:642)
==29312== by 0x40657F: main (ovsdb-server.c:358)
This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
William Tu [Thu, 12 Sep 2019 17:07:45 +0000 (10:07 -0700)]
ovs-bugtool: Add ip -s -s to get_device_stats.out.
The patch adds 'ip -s -s' to file get_device_stats.out to collect
device statistics. When debugging tunnel related issues, the command
shows much more detailed counters, e.g. frame, crc, carrier, which helps
in understanding the root cause when packets are dropped.
Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Wed, 18 Sep 2019 14:32:52 +0000 (07:32 -0700)]
db-ctl-base: Give better error messages for ambiguous abbreviations.
Tables and columns may be abbreviated to unique prefixes, but until
now the error messages have just said there's more than one match.
This commit makes the error messages list the possibilities.
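A minimal sketch of listing the candidates for an ambiguous prefix follows.
It assumes plain case-sensitive prefix comparison as a simplification, and
match_prefix() is a hypothetical helper, not the function in db-ctl-base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Counts the entries of 'names' that start with 'prefix'.  If exactly one
 * matches, '*match' points to it; otherwise '*match' is NULL.  'candidates'
 * receives a comma-separated list of all matches for use in an error
 * message (truncated if it does not fit in 'size' bytes). */
static size_t
match_prefix(const char *const names[], size_t n_names, const char *prefix,
             const char **match, char *candidates, size_t size)
{
    size_t prefix_len = strlen(prefix);
    size_t n_matches = 0;
    size_t used = 0;

    assert(size > 0);
    candidates[0] = '\0';
    *match = NULL;

    for (size_t i = 0; i < n_names; i++) {
        if (strncmp(names[i], prefix, prefix_len)) {
            continue;
        }
        if (used < size - 1) {
            int w = snprintf(candidates + used, size - used, "%s%s",
                             n_matches ? ", " : "", names[i]);

            /* On truncation or error, stop appending further names. */
            used = (w < 0 || (size_t) w >= size - used)
                   ? size - 1 : used + (size_t) w;
        }
        *match = n_matches ? NULL : names[i];
        n_matches++;
    }
    return n_matches;
}
```

A caller that gets a count above 1 can then report something like
"Flow" matches multiple names followed by the contents of 'candidates',
instead of only saying that there is more than one match.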
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Mark Michelson [Fri, 6 Sep 2019 14:33:03 +0000 (10:33 -0400)]
Remove OVN.
OVN is separated into its own repo. This commit removes the OVN source,
OVN tests, and OVN documentation. It also removes mentions of OVN from
most documentation. The only place where OVN has been left is in
changelogs/NEWS, since we shouldn't mess with the history of the
project.
There is an exception here. The ovsdb-cluster tests rely on ovn-nbctl
and ovn-sbctl to run. Therefore those ovn utilities, as well as their
dependencies remain in the repo with this commit.
Acked-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Useful for tracking where the uninitialized memory came from.
Report example:
Thread 13 revalidator11:
Conditional jump or move depends on uninitialised value(s)
at 0x4C35D96: __memcmp_sse4_1 (in vgpreload_memcheck.so)
by 0x9D4404: ofpbuf_equal (ofpbuf.h:273)
by 0x9D4404: revalidate_ukey__ (ofproto-dpif-upcall.c:2219)
<...>
by 0x6AF488E: clone (clone.S:95)
Uninitialised value was created by a stack allocation
at 0x9D4450: compose_slow_path (ofproto-dpif-upcall.c:1062)
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Ben Pfaff <blp@ovn.org>
This makes it easy to see which core a PMD is running on just by
looking at the thread name. The thread id still makes it possible to
distinguish different threads running on the same core over time:
In gdb, top or any other utility this helps to quickly find the
needed thread without parsing logs or memory, or matching threads by
the port names they're handling.
Ilya Maximets [Tue, 13 Aug 2019 11:28:26 +0000 (14:28 +0300)]
dpdk: Use ovs-numa provided functions to manage thread affinity.
This reduces code duplication and avoids using Linux-specific
functions (this might be useful in the future if we try to allow
running OvS+DPDK on FreeBSD).
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: William Tu <u9012063@gmail.com>
dpif-netdev-perf: Fix TSC frequency for non-DPDK case.
Unlike 'rte_get_tsc_cycles()' which doesn't need any specific
initialization, 'rte_get_tsc_hz()' can be used only after a successful
call to 'rte_eal_init()'. 'rte_eal_init()' estimates the TSC frequency
for later use by 'rte_get_tsc_hz()'. Strictly speaking, we're not allowed
to use 'rte_get_tsc_cycles()' before initializing DPDK either, but it
works this way for now and provides correct results.
This patch provides TSC frequency estimation code that will be used
in two cases:
* DPDK is not compiled in, i.e. DPDK_NETDEV not defined.
* DPDK compiled in but not initialized,
i.e. other_config:dpdk-init=false
This change is mostly useful for AF_XDP netdev support, i.e. it allows
using the dpif-netdev/pmd-perf-show command and various PMD perf metrics.
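One common way to estimate a counter frequency is to sample the counter
before and after a delay of known length. The helper below shows only the
arithmetic; estimate_hz() is a hypothetical name, and whether the patch
measures the interval this way is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Estimates a counter frequency in Hz from two counter samples taken
 * 'elapsed_ns' nanoseconds apart.  The division is split into a quotient and
 * a remainder part so that the intermediate products stay within 64 bits for
 * intervals of up to several seconds. */
static uint64_t
estimate_hz(uint64_t cycles_start, uint64_t cycles_end, uint64_t elapsed_ns)
{
    const uint64_t ns_per_sec = UINT64_C(1000000000);
    uint64_t delta = cycles_end - cycles_start;

    assert(elapsed_ns > 0);
    return (delta / elapsed_ns) * ns_per_sec
           + ((delta % elapsed_ns) * ns_per_sec) / elapsed_ns;
}
```

For example, 240,000,000 cycles counted over a 100 ms sleep correspond to an
estimated frequency of 2.4 GHz.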
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: William Tu <u9012063@gmail.com>
Greg Rose [Tue, 3 Sep 2019 15:50:42 +0000 (08:50 -0700)]
rhel: Revert RHEL 7.4 comp_ver change
I looked at the wrong list of kernels when I changed the value for the
RHEL 7.4 comp_ver variable. Revert that part of commit e64c2c1
("rhel: Fix ovs-kmod-manage.sh to work with RHEL 7.3").
Fixes: e64c2c1 ("rhel: Fix ovs-kmod-manage.sh to work with RHEL 7.3") Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
Greg Rose [Thu, 29 Aug 2019 18:56:01 +0000 (11:56 -0700)]
rhel: Fix ovs-kmod-manage.sh to work with RHEL 7.3
Add case for RHEL 7.3. This also fixes commit 22abff2 where I forgot to
update the comp_ver variable for RHEL 7.5 and while I was in there I
updated comp_ver for the RHEL 7.4 case as well.
Fixes: 22abff2 ("rhel: Add case for RHEL 7.5 major version to...") Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Nitin Katiyar [Wed, 28 Aug 2019 16:42:07 +0000 (22:12 +0530)]
packets: Fix using outdated RSS hash after MPLS decapsulation.
When a packet is received, the RSS hash is calculated if it is not
already available. The Exact Match Cache (EMC) entry is then looked up
using this RSS hash.
When an MPLS encapsulated packet is received, the MPLS header is popped
and the packet is recirculated. Since the RSS hash has not been
invalidated here, the EMC lookup for all decapsulated packets will hit
the same entry even though these packets will have different tuple
values. This degrades performance severely as different inner packets
from the same MPLS tunnel would hit the same EMC entry.
This patch invalidates the RSS hash (by resetting offload flags) after the
MPLS header is popped.
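A toy model of the problem described above, assuming the NIC hash covers only the outer tunnel header. The byte-sum hash and the packet dictionaries are illustrative stand-ins, not OVS data structures.

```python
def toy_hash(headers):
    """Toy stand-in for an RSS hash: byte sum of the given headers."""
    return sum("|".join(headers).encode()) & 0xFFFFFFFF


def make_pkt(inner):
    headers = ["mpls label=22", inner]
    # Assumption: the hash delivered with the packet covers only the outer header.
    return {"headers": headers, "rss_hash": toy_hash(headers[:1])}


def pop_mpls(pkt, invalidate_hash):
    """Strip the outer MPLS header and optionally drop the now-stale hash."""
    pkt["headers"] = pkt["headers"][1:]
    if invalidate_hash:
        pkt["rss_hash"] = None


def lookup_key(pkt):
    """Return the cached hash if still valid, else recompute from current headers."""
    if pkt["rss_hash"] is None:
        pkt["rss_hash"] = toy_hash(pkt["headers"])
    return pkt["rss_hash"]
```

Without invalidation, two different inner flows from the same tunnel keep the same lookup key after the pop. With invalidation, the key is recomputed from the inner headers.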
dpif-netdev: Fail port addition if reconfiguration failed.
If the port was destroyed during the initial reconfiguration, we should
report an error to the upper layers. Otherwise, successful addition of
the port will be logged and upper layers will continue to configure
this port. For example, the 'dpif' layer will try to initialize flow
API for this device.
Fix that by checking for port existence after reconfiguration. We can't
get the real error code here, so let's assume EINVAL. 'ovs-vsctl' will
tell the user to check the logs for a real reason anyway.
Fixes: e32971b8ddb4 ("dpif-netdev: Centralized threads and queues handling code.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ian Stokes <ian.stokes@intel.com>
Make pid_exists() more robust against empty pid argument
In some of our destructive testing of ovn-dbs inside containers managed
by pacemaker we reached a situation where /var/run/openvswitch had
empty .pid files. The current code does not deal well with them
and pidfile_is_running() returns true in such a case and this confuses
the OCF resource agent.
- Before this change:
Inside a container run:
killall ovsdb-server;
echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > /var/run/openvswitch/ovnsb_db.pid
We will observe that the cluster is unable to ever recover because
it believes the ovn processes to be running when they really aren't and
eventually just fails:
podman container set: ovn-dbs-bundle [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers): Master controller-0
ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers): Stopped controller-1
ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers): Slave controller-2
Let's make sure pid_exists() returns false when the pid is an empty
string.
- After this change the cluster is able to recover from this state and
correctly start the resource:
podman container set: ovn-dbs-bundle [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers): Master controller-0
ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers): Slave controller-1
ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers): Slave controller-2
Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as non-root.") Signed-off-by: Michele Baldessari <michele@acksyn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
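The real pid_exists() lives in a shell library. The sketch below only illustrates the guard in Python, assuming the check is a directory test under /proc. The proc_root parameter is a hypothetical addition that makes the example testable.

```python
import os


def pid_exists(pid, proc_root="/proc"):
    """Return True only if `pid` is non-empty and has an entry under proc_root."""
    pid = (pid or "").strip()
    if not pid:
        # Without this guard the check degenerates to "does /proc/ exist?",
        # which is always true, so an empty pidfile looks like a live process.
        return False
    return os.path.isdir(os.path.join(proc_root, pid))
```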
Darrell Ball [Tue, 27 Aug 2019 23:59:02 +0000 (16:59 -0700)]
conntrack: Fix ICMPv4 error data L4 length check.
The ICMPv4 error data L4 length check was found to be too strict for TCP,
expecting a minimum of 20 rather than 8 bytes. This worked by
happenstance for other inner protocols. The approach is to explicitly
handle the ICMPv4 error data L4 length check and to do this for all
supported inner protocols in the same way. Making the code common
between protocols also allows the existing ICMPv4 related UDP tests to
cover TCP and ICMP inner protocol cases.
Note that ICMPv6 does not have an 8 byte limit for error L4 data.
Fixes: a489b16854b5 ("conntrack: New userspace connection tracker.") CC: Daniele Di Proietto <diproiettod@ovn.org>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361949.html Reported-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Co-authored-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
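A minimal sketch of the common length check described in the conntrack commit above, assuming the inner L4 data is handed over as raw bytes. An ICMPv4 error is only guaranteed to carry the first 8 bytes of the original datagram's payload (RFC 792), which is enough for TCP and UDP ports. The function name is hypothetical.

```python
import struct

ICMP_ERROR_DATA_L4_LEN = 8  # bytes of original L4 data guaranteed in an ICMPv4 error


def inner_ports(inner_l4: bytes):
    """Return (src, dst) ports of an inner TCP/UDP header, or None if too short."""
    if len(inner_l4) < ICMP_ERROR_DATA_L4_LEN:
        return None
    # For TCP and UDP the first four bytes are the source and destination ports.
    return struct.unpack("!HH", inner_l4[:4])
```

Requiring a full 20-byte TCP header here would reject valid ICMPv4 errors that carry only the 8-byte minimum.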
Yifeng Sun [Wed, 7 Aug 2019 22:25:33 +0000 (15:25 -0700)]
datapath: compat: Backports bugfixes for nf_conncount
This patch backports several critical bug fixes related to
locking and data consistency in nf_conncount code.
This backport is based on the following upstream net-next commits:
a007232 ("netfilter: nf_conncount: fix argument order to find_next_bit")
c80f10b ("netfilter: nf_conncount: speculative garbage collection on empty lists")
2f971a8 ("netfilter: nf_conncount: move all list iterations under spinlock")
df4a902 ("netfilter: nf_conncount: merge lookup and add functions")
e8cfb37 ("netfilter: nf_conncount: restart search when nodes have been erased")
f7fcc98 ("netfilter: nf_conncount: split gc in two phases")
4cd273b ("netfilter: nf_conncount: don't skip eviction when age is negative")
c78e781 ("netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS")
d4e7df1 ("netfilter: nf_conncount: use rb_link_node_rcu() instead of rb_link_node()")
53ca0f2 ("netfilter: nf_conncount: remove wrong condition check routine")
3c5cdb1 ("netfilter: nf_conncount: fix unexpected permanent node of list.")
31568ec ("netfilter: nf_conncount: fix list_del corruption in conn_free")
fd3e71a ("netfilter: nf_conncount: use spin_lock_bh instead of spin_lock")
This patch adds additional compat code so that it can build on
all supported kernel versions.
In addition, this patch helps OVS datapath to always choose bug-fixed
nf_conncount code. If kernel already has these fixes, then kernel's
nf_conncount is being used. Otherwise, OVS falls back to use compat
nf_conncount functions.
Travis tests are at
https://travis-ci.org/yifsun/ovs-travis/builds/569056850
On the latest RHEL kernel, 'make check-kmod' passes.
openvswitch: Clear the L4 portion of the key for "later" fragments.
Only the first fragment in a datagram contains the L4 headers. When the
Open vSwitch module parses a packet, it always sets the IP protocol
field in the key, but can only set the L4 fields on the first fragment.
The original behavior would not clear the L4 portion of the key, so
garbage values would be sent in the key for "later" fragments. This
patch clears the L4 fields in that circumstance to prevent sending those
garbage values as part of the upcall.
Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Justin Pettit <jpettit@ovn.org>
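The behavior described above can be illustrated with a small sketch. The key layout and function name are hypothetical and much simpler than the kernel's flow key.

```python
def extract_key(ip_proto, frag_offset, l4_ports, key=None):
    """Fill a flow key; only the first fragment (offset 0) carries L4 headers."""
    key = dict(key or {})
    key["ip_proto"] = ip_proto
    if frag_offset == 0:
        key["tp_src"], key["tp_dst"] = l4_ports
    else:
        # "Later" fragment: clear the L4 fields instead of leaving stale values.
        key["tp_src"] = key["tp_dst"] = 0
    return key
```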
openvswitch: Properly set L4 keys on "later" IP fragments
When IP fragments are reassembled before being sent to conntrack, the
key from the last fragment is used. Unless there are reordering
issues, the last fragment received will not contain the L4 ports, so the
key for the reassembled datagram won't contain them. This patch updates
the key once we have a reassembled datagram.
The handle_fragments() function works on L3 headers so we pull the L3/L4
flow key update code from key_extract into a new function
'key_extract_l3l4'. Then we add another new function
ovs_flow_key_update_l3l4() and export it so that it is accessible by
handle_fragments() for conntrack packet reassembly.
Co-authored-by: Justin Pettit <jpettit@ovn.org> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Justin Pettit <jpettit@ovn.org>
* check that expected bytes and packets stats are correctly read from
every flow.
* check that the expected elements are read for every field type
aggregation.
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Yanqin Wei [Thu, 22 Aug 2019 10:09:50 +0000 (18:09 +0800)]
flow: save "vlan_hdrs" memset for untagged traffic
For untagged traffic, it is unnecessary to clear vlan_hdrs, as doing so costs a
32B memset. So the patch improves it by postponing the clearing of vlan_hdrs
until the ethtype check. It can benefit both untagged and single-tagged traffic.
From testing, it does not impact the performance of dual-tagged traffic.
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com> Signed-off-by: Yanqin Wei <Yanqin.Wei@arm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Malvika Gupta [Tue, 27 Aug 2019 18:21:08 +0000 (13:21 -0500)]
flow: Reduce metadata connection state branches in miniflow_extract
This patch merges two separate if-else branches for metadata connection state
into one if-else branch to improve performance. It gives an average performance
improvement of ~3% on arm platforms and ~4.5% on x86 platforms.
Signed-off-by: Malvika Gupta <malvika.gupta@arm.com> Reviewed-by: Yanqin Wei <yanqin.wei@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Dumitru Ceara [Fri, 16 Aug 2019 14:31:28 +0000 (16:31 +0200)]
pinctrl: Fix DNS packet parsing
Due to the use of a uint8_t to index inside the DNS payload we could end
up in an infinite loop when specific (invalid) DNS packets were
processed by ovn-controller. In the infinite loop we keep increasing the
query_name dynamic string until running out of memory.
One way to replicate the issue is to configure DNS on the logical switch
and then inject a manually crafted DNS-like packet. For example, with
Scapy:
>>> p = IP(dst='10.0.0.2',src='10.0.0.3')/UDP(dport=53)/('a'*364)
>>> send(p)
Also add a sanity check on minimum L4 size of packets.
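To see why an 8-bit index is a problem, note that an index of 208 advanced past a 97-byte label wraps to 50 instead of reaching 306, so the parser revisits old data. The sketch below is an illustration in Python, not the pinctrl code: it walks the query name with an unbounded index and explicit bounds checks, assuming the standard 12-byte DNS header.

```python
DNS_HEADER_LEN = 12


def next_index_uint8(idx, label_len):
    """Index after a label if the index is stored in a uint8_t (wraps at 256)."""
    return (idx + 1 + label_len) % 256


def parse_query_name(payload: bytes) -> str:
    """Parse the first query name, rejecting labels that run past the packet."""
    idx = DNS_HEADER_LEN
    labels = []
    while True:
        if idx >= len(payload):
            raise ValueError("truncated query name")
        label_len = payload[idx]
        idx += 1
        if label_len == 0:
            break
        if idx + label_len > len(payload):
            raise ValueError("label runs past end of packet")
        labels.append(payload[idx:idx + label_len].decode("ascii", "replace"))
        idx += label_len
    return ".".join(labels)
```

With the crafted payload from the Scapy example ('a' * 364), every length byte is 0x61 (97), and this parser stops with an error once a label would extend past the end of the data.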
Lorenzo Bianconi [Mon, 26 Aug 2019 09:19:32 +0000 (11:19 +0200)]
OVN: fix memory leak in build_pre_lb
Fix memory leak of ip_address string in build_pre_lb routine if we
install logical flows for empty_lb controller event
Fixes: f49b17a6cbe3 ("OVN: use trigger_event action to report 'empty_lb_rule' events") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Flavio Leitner [Tue, 13 Aug 2019 16:34:04 +0000 (13:34 -0300)]
tnl-neigh: Use outgoing ofproto version.
When a packet needs to be encapsulated in userspace, the endpoint
address needs to be resolved to fill in the headers. If it is not,
then currently OvS sends either a Neighbor Solicitation (IPv6)
or an ARP Query (IPv4) to resolve it.
The problem is that the NS/ARP packet will go through the flow
rules in the new bridge, but inheriting the ofproto table version
from the original packet to be encapsulated. When those versions
don't match, the result is unexpected because no flow rules might
be visible, which would cause the default table rule to be used
to drop the packet. Or only part of the flow rules would be visible
and so on.
Since the NS/ARP packet is created by OvS and will be injected in
the outgoing bridge, use the corresponding ofproto version instead.
Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-By: Vasu Dasari <vdasari@gmail.com> Signed-off-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
Greg Rose [Tue, 27 Aug 2019 21:06:29 +0000 (14:06 -0700)]
rhel: Add case for RHEL 7.5 major version to kmod manage script
A CentOS 7.5 kernel with an unencountered set of minor build numbers
caused an upgrade bug. Adding the case for the rhel 7.5 kmod management
script fixes the problem.
Signed-off-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
William Tu [Tue, 27 Aug 2019 18:09:13 +0000 (11:09 -0700)]
ovs-lib: Add timeout at ovs-check-dead-ifs.
On SUSE12 SP3, we hit a case where ovs-check-dead-ifs tries to read
an entry in /proc/<pid>/fd/<some fd> but hangs forever. The pid is
a qemu-system-x86_64 process and we suspect it's an issue related to
qemu, not ovs. As a result, force-reload-kmod hangs and the OVS bridge
never gets restarted. This patch adds a timeout of 5 seconds to
ovs-check-dead-ifs.
VMware-BZ: #2408059 Signed-off-by: William Tu <u9012063@gmail.com> Cc: Ashish Varma <ashishvarma.ovs@gmail.com> Cc: Gurucharan Shetty <guru@ovn.org> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
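The same idea, bounding a helper that may hang, can be sketched in Python with subprocess.run and its timeout parameter. This is only an illustration of the technique; the patch itself changes a shell library, and run_with_timeout is a hypothetical name.

```python
import subprocess


def run_with_timeout(argv, timeout_s=5.0):
    """Return the exit status, or None if the command was killed on timeout."""
    try:
        result = subprocess.run(
            argv,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising, so nothing is left hanging.
        return None
    return result.returncode
```

A caller can then treat None as "the check could not complete" and carry on with the restart instead of blocking forever.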
Ilya Maximets [Tue, 6 Aug 2019 15:57:09 +0000 (18:57 +0300)]
netdev-dpdk: Refactor vhost custom stats for extensibility.
vHost interfaces currently have only one custom statistic, but there
might be others in the near future. This refactoring makes the code
work in the same way as it is done for dpdk and afxdp stats, to keep a
common style across the different code places, and makes it easily
extensible for adding new stats.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com>
Ilya Maximets [Tue, 6 Aug 2019 15:46:36 +0000 (18:46 +0300)]
netdev-dpdk: Fix not reporting rx_oversize_errors in stats.
There is a big code duplication issue with DPDK xstats that led to
missed "rx_oversize_errors" statistics. It's defined but not used.
Fix that by actually using this stat, along with a code refactoring that
will help us avoid making the same mistakes in the future.
Macro definitions are perfectly suitable to automate code generation
in such cases and are already used in a couple of places in OVS for similar
purposes.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Ian Stokes <ian.stokes@intel.com>
Han Zhou [Thu, 22 Aug 2019 21:08:22 +0000 (14:08 -0700)]
raft: Save and read new election timer in header snapshot.
This patch stores the latest election timer in the snapshot during log
compression, and when the server restarts it reads the value from the log.
Without this, any previous changes to the election timer will be lost
from the log, and if the server restarts, it will use the default value
instead of the changed value.
Fixes: commit 8e35461 ("ovsdb raft: Support leader election time change online.") Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Thu, 22 Aug 2019 21:08:21 +0000 (14:08 -0700)]
raft.c: Election timer initial reset with value from log.
After the election timer is changed through the cluster/change-election-timer
command, if a server restarts, it first initializes with the default
value and uses it to reset the timer. Although it reads the latest
timer value later from the log, the first timeout may be much shorter
than expected by other servers that use the latest timeout, and it would
start an election before it receives the first heartbeat from the leader.
This patch fixes it by changing the order of reading the log and resetting
the timer so that the latest value is read from the log before the initial
resetting of the timer.
Fixes: commit 8e35461 ("ovsdb raft: Support leader election time change online.") Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
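The ordering fix can be sketched as below. This is a simplified model: the log entry format and the default of 1000 ms are assumptions made for illustration, and the random jitter a real Raft timer would add is omitted.

```python
DEFAULT_ELECTION_TIMER_MS = 1000  # assumed default, for illustration only


def start_server(log_entries, now_ms):
    """Return (election_timer_ms, first_deadline_ms) for a restarting server."""
    election_timer = DEFAULT_ELECTION_TIMER_MS
    # Read the log first so the latest configured value is known...
    for entry in log_entries:
        if "election_timer" in entry:
            election_timer = entry["election_timer"]
    # ...and only then arm the timer for the first time.
    return election_timer, now_ms + election_timer
```

Arming the timer before the loop would make the first deadline use the default value even though the cluster agreed on a longer one.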
ofproto-dpif: Fix using uninitialised memory in user_action_cookie.
Designated initializers are not suitable for initializing non-packed
structures and unions which are subjects for comparison by memcmp().
Whole memory for 'struct user_action_cookie' must be explicitly cleared
before using because it will be copied with memcpy and later compared
by memcmp in ofpbuf_equal().
A few issues found by valgrind:
Thread 13 revalidator11:
Conditional jump or move depends on uninitialised value(s)
at 0x4C35D96: __memcmp_sse4_1 (in vgpreload_memcheck.so)
by 0x9D4404: ofpbuf_equal (ofpbuf.h:273)
by 0x9D4404: revalidate_ukey__ (ofproto-dpif-upcall.c:2219)
by 0x9D4404: revalidate_ukey (ofproto-dpif-upcall.c:2286)
by 0x9D62AC: revalidate (ofproto-dpif-upcall.c:2685)
by 0x9D62AC: udpif_revalidator (ofproto-dpif-upcall.c:942)
by 0xA9C732: ovsthread_wrapper (ovs-thread.c:383)
by 0x5FF86DA: start_thread (pthread_create.c:463)
by 0x6AF488E: clone (clone.S:95)
Uninitialised value was created by a stack allocation
at 0x9D4450: compose_slow_path (ofproto-dpif-upcall.c:1062)
Thread 11 revalidator16:
Conditional jump or move depends on uninitialised value(s)
at 0x4C35D96: __memcmp_sse4_1 (in vgpreload_memcheck.so)
by 0x9D4404: ofpbuf_equal (ofpbuf.h:273)
by 0x9D4404: revalidate_ukey__ (ofproto-dpif-upcall.c:2220)
by 0x9D4404: revalidate_ukey (ofproto-dpif-upcall.c:2287)
by 0x9D62BC: revalidate (ofproto-dpif-upcall.c:2686)
by 0x9D62BC: udpif_revalidator (ofproto-dpif-upcall.c:942)
by 0xA9C6D2: ovsthread_wrapper (ovs-thread.c:383)
by 0x5FF86DA: start_thread (pthread_create.c:463)
by 0x6AF488E: clone (clone.S:95)
Uninitialised value was created by a stack allocation
at 0x9DC4E0: compose_sflow_action (ofproto-dpif-xlate.c:3211)
The struct was never marked as 'packed', however it was manually
adjusted to be so in practice.
An old IPFIX-related commit first made the structure non-contiguous.
Commit 8de6ff3ea864 ("ofproto-dpif: Use a fixed size userspace cookie.")
added uninitialized parts of the additional union space and the next
one introduced new holes between structure fields for all cases.
CC: Justin Pettit <jpettit@ovn.org> Fixes: 8b7ea2d48033 ("Extend OVS IPFIX exporter to export tunnel headers") Fixes: 8de6ff3ea864 ("ofproto-dpif: Use a fixed size userspace cookie.") Fixes: fcb9579be3c7 ("ofproto: Add 'ofproto_uuid' and 'ofp_in_port' to user action cookie.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ben Pfaff <blp@ovn.org>
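The effect of uninitialised holes on a byte-wise comparison can be reproduced with ctypes. The two-field structure below is a hypothetical stand-in for 'struct user_action_cookie', chosen only because the uint32 alignment leaves padding after the first field on common ABIs.

```python
import ctypes


class Cookie(ctypes.Structure):
    # Hypothetical layout: alignment of the uint32 leaves a hole after `type`.
    _fields_ = [("type", ctypes.c_uint8), ("ofp_in_port", ctypes.c_uint32)]


def fill_cookie(memory, cookie_type, ofp_in_port, clear_first):
    """Write the fields into `memory` and return the raw bytes a memcmp would see."""
    if clear_first:
        memory[:] = bytes(len(memory))  # equivalent of memset(&cookie, 0, sizeof cookie)
    cookie = Cookie.from_buffer(memory)
    cookie.type = cookie_type
    cookie.ofp_in_port = ofp_in_port
    return bytes(memory)


def dirty_memory():
    """Simulate stack memory that still holds old data."""
    return bytearray(b"\xaa" * ctypes.sizeof(Cookie))
```

Two cookies with identical fields compare unequal when one was built on dirty memory, because setting the fields never touches the padding bytes. Clearing the whole structure first makes the comparison reliable.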
Lorenzo Bianconi [Thu, 22 Aug 2019 16:51:40 +0000 (18:51 +0200)]
Remove ageing check in run_put_mac_binding
Remove ageing check in run_put_mac_binding routine on mac-binding info,
since if the ovn-controller main thread is heavily loaded the info will be
discarded and the mac_binding table will never be updated.
Acked-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Lorenzo Bianconi [Mon, 29 Jul 2019 11:41:03 +0000 (13:41 +0200)]
OVN: fix DNAT/SNAT system-ovn unit tests
Fix conntrack checks in the following tests in tests/system-ovn.at:
- ovn -- DNAT and SNAT on distributed router - N/S
- ovn -- DNAT and SNAT on distributed router - E/W
Fixes: a6ee09882283 ("OVN: run local logical flows first in S_ROUTER_OUT_SNAT table") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Aliasgar Ginwala [Sat, 17 Aug 2019 07:21:45 +0000 (00:21 -0700)]
OVS: Containerize components
1. Start OVS components in containers so that building and shipping
of OVS components is easy.
2. Load OVS kernel modules on host from container to avoid installing ovs
on host.
3. Update documentation about how to build/run ovs in docker.
ofproto-dpif: Fix for recirc issue with mpls traffic with dp_hash
Fix infinite recirculation loop for MPLS packets sent to dp_hash-based
select group
Issue:
When an MPLS encapsulated packet is received, the MPLS header is removed,
a recirculation id assigned and then recirculated into the pipeline.
If the flow rules require the packet to be then sent over DP-HASH based
select group buckets, the packet has to be recirculated again. However,
the same recirculation id was used and this resulted in the packet being
repeatedly recirculated until it got dropped because the maximum recirculation
limit was hit.
Fix:
Include the “was_mpls” boolean which indicates whether the packet was MPLS
encapsulated when computing the hash. After popping the MPLS header this will
result in a different hash value than before and new recirculation id will
get generated.
DPCTL flows with and without the fix are shown below
Without Fix:
recirc_id(0x1),dp_hash(0x5194bf18/0xf),in_port(2),packet_type(ns=0,id=0),
eth_type(0x0800),ipv4(frag=no), packets:20, bytes:1960,
used:0.329s, actions:1
recirc_id(0x1),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),
ipv4(frag=no), packets:20, bytes:1960, used:0.329s,
actions:hash(sym_l4(0)),recirc(0x1)
recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth_type(0x8847),
mpls(label=22/0xfffff,tc=0/0,ttl=64/0x0,bos=1/1), packets:20, bytes:2040,
used:0.329s, actions:pop_mpls(eth_type=0x800),recirc(0x1)
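A toy model of the fix described above: recirculation ids are handed out per distinct frozen state, so leaving "was_mpls" out of the key makes the post-pop state reuse the earlier id. The pool class and the state fields are hypothetical simplifications, not the ofproto-dpif recirculation code.

```python
class RecircIdPool:
    """Allocate one recirculation id per distinct state key."""

    def __init__(self, include_was_mpls):
        self.include_was_mpls = include_was_mpls
        self._ids = {}

    def get_id(self, state):
        fields = dict(state)
        if not self.include_was_mpls:
            fields.pop("was_mpls", None)
        key = tuple(sorted(fields.items()))
        if key not in self._ids:
            self._ids[key] = len(self._ids) + 1
        return self._ids[key]
```

With the flag left out of the key, both recirculations map to the same id, which matches the recirc(0x1) loop visible in the flows above.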
upcall: Configure datapath min-revalidate-pps through ovs-vsctl.
This patch adds a new configuration option, "min-revalidate-pps", to the
Open_vSwitch "other-config" column. This sets the minimum pps that a flow must
have in order to be revalidated when the revalidation duration exceeds half of
the max-revalidator config variable.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
upcall: Configure datapath max-revalidator through ovs-vsctl.
This patch adds a new configuration option, "max-revalidator", to the
Open_vSwitch "other-config" column. This sets the maximum allowed revalidator
timeout. The actual timeout value is determined at runtime as the minimum of
"max-idle" and "max-revalidator".
Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
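How the two options interact can be sketched as below. This is a simplified reading of the two commit descriptions above, not the actual revalidator logic. Both function names and the exact comparison are assumptions.

```python
def revalidator_timeout_ms(max_idle_ms, max_revalidator_ms):
    """Effective timeout: the minimum of "max-idle" and "max-revalidator"."""
    return min(max_idle_ms, max_revalidator_ms)


def should_revalidate(flow_pps, duration_ms, max_revalidator_ms, min_revalidate_pps):
    """Once revalidation runs long, keep only flows that are busy enough."""
    if duration_ms <= max_revalidator_ms / 2:
        return True
    return flow_pps >= min_revalidate_pps
```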
Han Zhou [Mon, 19 Aug 2019 23:30:35 +0000 (16:30 -0700)]
ovsdb monitor: Fix crash when using non-zero last-id with standalone DB.
When a client uses monitor-cond-since with a non-zero last-id but the
server is not in cluster mode for the DB being monitored, it leads to
segmentation fault because the txn_history list is not initialized in
this case.
Program terminated with signal SIGSEGV, Segmentation fault.
1536 struct ovsdb_txn *txn = h_node->txn;
(gdb) bt
0 ovsdb_monitor_get_changes_after (txn_uuid=txn_uuid@entry=0x7ffe8605b7e0, dbmon=0x17c1b40, p_mcs=p_mcs@entry=0x17c4900) at ovsdb/monitor.c:1536
1 0x000000000040da2d in ovsdb_jsonrpc_monitor_create (request_id=0x1804630, version=<optimized out>, params=0x17ad330, db=0x18015b0, s=<optimized out>) at ovsdb/jsonrpc-server.c:1469
2 ovsdb_jsonrpc_session_got_request (request=0x17ad520, s=<optimized out>) at ovsdb/jsonrpc-server.c:1002
3 ovsdb_jsonrpc_session_run (s=<optimized out>) at ovsdb/jsonrpc-server.c:556
...
Although it doesn't happen in normal use cases, nothing prevents a
client from sending this on purpose, or in a corner case where a client first
connected to a clustered DB but the server later restarted with a
non-clustered DB.
This patch fixes it by always initializing the txn_history list to avoid
the undefined behavior in this case. It adds a test case to cover it, too.
Fixes: 695e815 ("ovsdb-server: Transaction history tracking.") Reported-by: Aliasgar Ginwala <aginwala@ebay.com> Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Mon, 19 Aug 2019 16:30:00 +0000 (09:30 -0700)]
ovsdb raft: Support leader election time change online.
A new unixctl command cluster/change-election-timer is implemented to
change leader election timeout base value according to the scale needs.
The change takes effect upon consensus of the cluster, implemented through
the append-request RPC. A new field "election-timer" is added to raft log
entry for this purpose.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Mon, 19 Aug 2019 16:29:59 +0000 (09:29 -0700)]
raft.c: Set candidate_retrying if no leader elected since last election.
candidate_retrying is used to determine if the current node is disconnected
from the cluster when the node is in the candidate role. However, a node
can flap between the candidate and follower roles before a leader is elected
when a majority of the cluster is down, so is_connected() will flap, too, which
confuses clients.
This patch avoids the flapping with the help of a new member, had_leader,
so that if no leader was elected since the last election, we know we are
still retrying and remain disconnected from the cluster.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Han Zhou [Mon, 19 Aug 2019 16:29:58 +0000 (09:29 -0700)]
raft.c: Stale leader should disconnect from cluster.
As mentioned in RAFT paper, section 6.2:
Leaders: A server might be in the leader state, but if it isn’t the current
leader, it could be needlessly delaying client requests. For example, suppose a
leader is partitioned from the rest of the cluster, but it can still
communicate with a particular client. Without additional mechanism, it could
delay a request from that client forever, being unable to replicate a log entry
to any other servers. Meanwhile, there might be another leader of a newer term
that is able to communicate with a majority of the cluster and would be able to
commit the client’s request. Thus, a leader in Raft steps down if an election
timeout elapses without a successful round of heartbeats to a majority of its
cluster; this allows clients to retry their requests with another server.
Reported-by: Aliasgar Ginwala <aginwala@ebay.com> Tested-by: Aliasgar Ginwala <aginwala@ebay.com> Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
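The step-down rule quoted above can be sketched as a simple majority check. The bookkeeping (a map from follower to the time of its last acknowledged heartbeat) and the function name are hypothetical.

```python
def leader_should_step_down(now_ms, election_timer_ms, last_ack_ms, cluster_size):
    """True if no majority acknowledged a heartbeat within one election timeout."""
    recent = 1  # the leader counts itself
    for acked_at in last_ack_ms.values():
        if now_ms - acked_at < election_timer_ms:
            recent += 1
    # A majority needs strictly more than half of the cluster.
    return recent * 2 <= cluster_size
```

In a three-node cluster the leader stays in place while at least one follower keeps answering, and steps down once both have been silent for a full election timeout.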
Han Zhou [Mon, 19 Aug 2019 16:29:57 +0000 (09:29 -0700)]
ovsdb-idl.c: Allows retry even when using a single remote.
When clustered mode is used, the client needs to retry connecting
to new servers when certain failures happen. Today it is allowed to
retry new connection only if multiple remotes are used, which prevents
using LB VIP with clustered nodes. This patch makes sure the retry
logic works when using LB VIP: although same IP is used for retrying,
the LB can actually redirect the connection to a new node.
Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Ben Pfaff [Wed, 21 Aug 2019 17:17:21 +0000 (10:17 -0700)]
sat-math: Do not use __builtin_s*_overflow() with sparse.
Some versions of sparse do not understand __builtin_saddll_overflow() and
related GCC builtins for calculations with overflow detection. This patch
avoids using them when sparse is in use.
Aliasgar Ginwala [Tue, 20 Aug 2019 01:36:13 +0000 (18:36 -0700)]
ovs-lib: Fix standalone db migration to raft
The current create-cluster code for a standalone db takes a backup of the existing
standalone db and then generates new clustered dbs from the backup dbs. Hence,
during migration, if the nb and sb dbs are still present, create-cluster will fail
saying the file exists and will not actually convert the dbs to clustered dbs. This
patch fixes that.
E.g. the message that appears while migrating from standalone to a raft cluster:
* Backing up database to /etc/openvswitch/ovnnb_db.db.backup5.13.0-1278623084
ovsdb-tool: I/O error: /etc/openvswitch/ovnnb_db.db: create failed (File exists)
* Creating cluster database /etc/openvswitch/ovnnb_db.db from existing one
Signed-off-by: aginwala <aginwala@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
Anand Kumar [Thu, 15 Aug 2019 16:39:06 +0000 (09:39 -0700)]
datapath-windows: Fix updating ct label when mask is specified
When an existing label needs to be changed by specifying the bits to be
updated using a mask, the whole label was getting overwritten by the new
label instead of only the masked bits being updated. This patch fixes this issue.
Signed-off-by: Anand Kumar <kumaranand@vmware.com> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
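The intended masked update is the usual read-modify-write on the 128-bit label. A minimal sketch follows, with the label modeled as a Python integer rather than the driver's own type.

```python
CT_LABEL_BITS = 128
FULL_MASK = (1 << CT_LABEL_BITS) - 1


def update_ct_label(old, new, mask):
    """Replace only the bits selected by `mask`; keep all other bits of `old`."""
    return (old & ~mask & FULL_MASK) | (new & mask)
```

For example, updating the low byte of 0xFF00 to 0xAB with mask 0x00FF yields 0xFFAB, whereas overwriting the whole label would yield 0x00AB.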
Ilya Maximets [Mon, 5 Aug 2019 15:14:46 +0000 (18:14 +0300)]
travis: Drop OSX workarounds.
TravisCI currently uses xcode9.4 as a default image and it
has a good version of libtool out-of-the-box.
Removing these workarounds saves 4-6 minutes of OSX build.
Ilya Maximets [Mon, 5 Aug 2019 12:26:02 +0000 (15:26 +0300)]
travis: Combine kernel builds.
A single kernel build job takes ~3 minutes on average. Most of
this time is spent on VM spawning and initial configuration.
Combining these 24 jobs into 4 allows us to better utilize workers
and not waste time on spawning VMs.
Ilya Maximets [Mon, 5 Aug 2019 07:58:18 +0000 (10:58 +0300)]
travis: Cache DPDK build.
This change enables cache for DPDK build directory, so we'll never
build same version of DPDK again. This speeds up each DPDK related
job by 4-6 minutes effectively saving 30-50 minutes of the total time.
Ex. Full TravisCI run on 'trusty' images:
Without cache:
Ran for 1 hr 9 min 29 sec
Total time 4 hrs 55 min 13 sec
With populated cache:
Ran for 1 hr 2 min 18 sec
Total time 4 hrs 20 min 9 sec
Saved:
Real time: ~7 minutes.
Total worker time: ~35 minutes.
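The "Saved" figures follow directly from the two runs listed above. A quick check of the arithmetic:

```python
def to_seconds(hours, minutes, seconds):
    return hours * 3600 + minutes * 60 + seconds


# "Ran for" difference: 1 hr 9 min 29 sec vs 1 hr 2 min 18 sec.
real_saved = to_seconds(1, 9, 29) - to_seconds(1, 2, 18)
# "Total time" difference: 4 hrs 55 min 13 sec vs 4 hrs 20 min 9 sec.
worker_saved = to_seconds(4, 55, 13) - to_seconds(4, 20, 9)

print(divmod(real_saved, 60))    # (7, 11): about 7 minutes of real time
print(divmod(worker_saved, 60))  # (35, 4): about 35 minutes of worker time
```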
Eelco Chaudron [Thu, 8 Aug 2019 13:27:05 +0000 (09:27 -0400)]
netdev-afxdp: fix corner case where umem entries were not released
If for some reason the last element in the batch was already pushed on
the stack, none of the elements were pushed. This was leading to
buffer starvation.
John Hurley [Tue, 30 Jul 2019 11:05:17 +0000 (12:05 +0100)]
ovs-tc: offload MPLS set actions to TC datapath
Recent modifications to TC allow the modification of fields within the
outermost MPLS header of a packet. OvS datapath rules implement an MPLS
set action by supplying a new MPLS header that should overwrite the
current one.
Convert the OvS datapath MPLS set action to a TC modify action and allow
such rules to be offloaded to a TC datapath.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
John Hurley [Tue, 30 Jul 2019 11:05:16 +0000 (12:05 +0100)]
ovs-tc: offload MPLS push actions to TC datapath
TC can now be used to push an MPLS header onto a packet. The MPLS label is
the only information that needs to be passed here with the rest reverting
to default values if none are supplied. OvS, however, gives the entire
MPLS header to be pushed along with the MPLS protocol to use. TC can
optionally accept these values, so it can be made to replicate the OvS datapath
rule.
Convert OvS MPLS push datapath rules to TC format and offload to a TC
datapath.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
John Hurley [Tue, 30 Jul 2019 11:05:15 +0000 (12:05 +0100)]
ovs-tc: offload MPLS pop actions to TC datapath
TC now supports an action to pop the outer MPLS header from a packet. The
next protocol after the header is required alongside this. Currently, OvS
datapath rules also supply this information.
Offload OvS MPLS pop actions to TC along with the next protocol.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
John Hurley [Tue, 30 Jul 2019 11:05:14 +0000 (12:05 +0100)]
compat: add compatibility headers for tc mpls action
OvS includes compat code for several TC actions including vlan, mirred and
tunnel key. MPLS actions have recently been added to TC in the kernel. In
preparation for adding TC offload code for MPLS, add the MPLS compat code.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
dpif-netlink: Allow offloading of flows with dl_type 0x1234.
'dpif_probe_feature()' always has DPIF_FP_PROBE flag set. Other probing
code uses dpif_execute() with DPIF_OP_EXECUTE, hence never calls
parse_flow_put().
Thus, this 'if' statement is wrong and should be removed as it only
forbids offloading of the real legitimate flows with dl_type 0x1234.
Dummy flows never reach this code.
CC: Paul Blakey <paulb@mellanox.com> Fixes: 8b668ee3f0cc ("dpif-netlink: Use netdev flow put api to insert a flow") Reported-by: Eli Britstein <elibr@mellanox.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
In case of failure of 'xsk_configure_all()', 'n_rxq' and 'xdpmode'
will remain in a new state. This will result in successful
reconfiguration (immediate return, because configuration is already
applied) if 'netdev_reconfigure()' is called again.
Same issue was fixed previously for netdev-dpdk using 'dev->started'
flag in commit: 606f66507250 ("netdev-dpdk: Don't use PMD driver if not configured successfully")
Let's use similar approach with checking the 'dev->xsks' which only
exists if configuration was successful.
Additionally implemented 'netdev_afxdp_construct()' function to
explicitly initialize all the specific fields and request the
reconfiguration.
CC: William Tu <u9012063@gmail.com> Fixes: 0de1b425962d ("netdev-afxdp: add new netdev type for AF_XDP.") Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
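The idea can be sketched as follows. This is an illustration under stated assumptions, not the netdev-afxdp implementation: struct sketch_dev, sketch_configure_all() and sketch_reconfigure() are hypothetical, the 'fail' argument only simulates a configuration failure, and the requested queue count is assumed to be greater than zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical device state; 'xsks' is non-NULL only after a
 * successful configuration. */
struct sketch_dev {
    int n_rxq;             /* Currently applied queue count. */
    int requested_n_rxq;   /* Queue count asked for by the user. */
    int *xsks;             /* Stand-in for the per-queue socket array. */
};

/* Stand-in for the real configuration step; 'fail' simulates an error. */
static int
sketch_configure_all(struct sketch_dev *dev, bool fail)
{
    if (fail) {
        return -1;
    }
    dev->xsks = calloc((size_t) dev->n_rxq, sizeof *dev->xsks);
    return dev->xsks ? 0 : -1;
}

static void
sketch_destroy_all(struct sketch_dev *dev)
{
    free(dev->xsks);
    dev->xsks = NULL;
}

/* Returns 0 on success, -1 on failure.  The early return is taken only
 * if a previous configuration really succeeded (dev->xsks exists), so a
 * failed attempt is retried instead of being reported as applied. */
static int
sketch_reconfigure(struct sketch_dev *dev, bool fail)
{
    if (dev->xsks && dev->n_rxq == dev->requested_n_rxq) {
        return 0;
    }
    sketch_destroy_all(dev);
    dev->n_rxq = dev->requested_n_rxq;
    return sketch_configure_all(dev, fail);
}
```

Without the 'dev->xsks' test, the second call after a failure would see n_rxq equal to requested_n_rxq and return 0 without configuring anything.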
Yifeng Sun [Thu, 18 Jul 2019 22:59:45 +0000 (15:59 -0700)]
test: Fix fragment-related tests that fail on 4.19+ due to small-sized packets
These fragment-related tests are failing on later kernels (4.19.x)
because kernel quietly drops any packet fragment that is not the last
but has a size smaller than IPV6_MIN_MTU. This patch fixes
them by increasing their sizes to IPV6_MIN_MTU.
Reviewed-by: Darrell Ball <dlu998@gmail.com>
Reviewed-at: https://github.com/openvswitch/ovs/pull/278 Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
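The drop rule described above can be written as a small predicate. This is only a sketch: the helper name is hypothetical, and the commit message does not say exactly which length the kernel compares, so the size is taken here as a single number. The value 1280 is the minimum IPv6 link MTU.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimum IPv6 link MTU in bytes (the value behind IPV6_MIN_MTU). */
#define SKETCH_IPV6_MIN_MTU 1280

/* Hypothetical helper: returns true if a fragment of 'frag_len' bytes
 * survives the rule "non-last fragments smaller than IPV6_MIN_MTU are
 * dropped".  The last fragment may be of any size. */
static bool
sketch_fragment_kept(size_t frag_len, bool is_last)
{
    return is_last || frag_len >= SKETCH_IPV6_MIN_MTU;
}
```

Under this rule a test that sends a small first fragment followed by a last fragment loses the first one, which is why the tests grow their non-last fragments to IPV6_MIN_MTU.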
tnl-neigh-cache: Purge learnt neighbors when port/bridge is deleted
Say an ARP entry is learnt on an OVS port. When such a port is deleted, the
learnt entry should be removed along with it. Until now it would only have been
aged out after the ARP ageout time. This code cleans it up immediately.
Added a test case (tunnel - neighbor entry add and deletion) in tunnel.at to
verify that neighbors are added, and are removed on deletion of ports and bridges.
Discussion for this addition is at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html
Signed-off-by: Vasu Dasari <vdasari@gmail.com> Reviewed-by: Flavio Fernandes <flavio@flaviof.com> Reviewed-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
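A rough sketch of purging learnt entries by port name. The table layout and the names sketch_neigh and sketch_neigh_flush() are hypothetical and much simpler than the real neighbor cache. It only illustrates removing every entry learnt on a deleted port at once instead of waiting for ageout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical neighbor entry: where it was learnt and whether the
 * slot is still valid. */
struct sketch_neigh {
    char port_name[16];
    unsigned int ip;     /* Placeholder for the neighbor address. */
    bool in_use;
};

/* Invalidates every entry learnt on 'port_name' and returns how many
 * entries were removed. */
static size_t
sketch_neigh_flush(struct sketch_neigh *tbl, size_t n, const char *port_name)
{
    size_t removed = 0;

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].in_use && !strcmp(tbl[i].port_name, port_name)) {
            tbl[i].in_use = false;
            removed++;
        }
    }
    return removed;
}
```

Calling the helper from the port or bridge deletion path would leave entries learnt on other ports untouched.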
William Tu [Fri, 19 Jul 2019 14:49:01 +0000 (07:49 -0700)]
system-traffic: Make nsh test more robust.
The patch adds '-n' to tcpdump to avoid address conversion. Add '-l' for rhel8
to avoid buffering. Since '-U' is used to output to stdout, simply use 'cat'
to search the result. Use OVS_WAIT_UNTIL instead of sleep, and also remove/add
some newlines. Finally, move the interface captured by tcpdump into the
namespace (capture p1 instead of ovs-p1). Tested using af_xdp.
Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
William Tu [Thu, 18 Jul 2019 20:11:14 +0000 (13:11 -0700)]
netdev-afxdp: add new netdev type for AF_XDP.
The patch introduces experimental AF_XDP support for OVS netdev.
AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket
type built upon the eBPF and XDP technology. It aims to have comparable
performance to DPDK but cooperate better with the existing kernel networking
stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program
attached to the netdev, by-passing a couple of the Linux kernel's subsystems.
As a result, an AF_XDP socket shows much better performance than AF_PACKET.
For more details about AF_XDP, please see the Linux kernel's
Documentation/networking/af_xdp.rst. Note that by default, this feature is
not compiled in.
Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>