Ido Schimmel [Sun, 30 Dec 2018 17:14:54 +0000 (17:14 +0000)]
bridge: fdb: Use 'struct ndmsg' for FDB dumping
Since commit aea41afcfd6d ("ip bridge: Set NETLINK_GET_STRICT_CHK on
socket") iproute2 uses strict checking on kernels that support it. This
causes FDB dumping to fail [1], as iproute2 uses 'struct ifinfomsg'
whereas the kernel expects 'struct ndmsg'.
Note that with this change iproute2 continues to work on old kernels
that do not support strict checking, but contain the fix introduced in
kernel commit bd961c9bc664 ("rtnetlink: fix rtnl_fdb_dump() for ndmsg
header").
[1]
# bridge fdb show
[ 5365.137224] netlink: 4 bytes leftover after parsing attributes in process `bridge'.
Error: bytes leftover after parsing attributes.
Dump terminated
Fixes: aea41afcfd6d ("ip bridge: Set NETLINK_GET_STRICT_CHK on socket") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Michael Guralnik [Sun, 23 Dec 2018 11:24:19 +0000 (13:24 +0200)]
rdma: Add print of link CapabilityMask2 flags
CapabilityMask2 is defined in IBTA spec as a member of PortInfo.
Add translation to string of new CapabilityMask2 expansion of link caps.
The flags are concatenated to current caps print as seen in this example
printing EXT_INFO flag:
root@server-22 $ rdma -d link
1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 2 sm_lid 2 lmc 0
state ACTIVE physical_state LINK_UP
caps: <SM, TRAP, SL_MAP, SYS_IMAGE_GUID, CABLE_INFO, EXTENDED_SPEEDS,
CAP_MASK2, CM, DEVICE_MGMT, VENDOR_CLASS, CAP_MASK_NOTICE,
CLIENT_REG, OTHER_LOCAL_CHANGES, MULT_PKER_TRAP, EXT_INFO>
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Fri, 5 Oct 2018 20:49:41 +0000 (13:49 -0700)]
ip bridge: Set NETLINK_GET_STRICT_CHK on socket
iproute2 has been updated for the new strict policy in the kernel. Add a
helper to call setsockopt to enable the feature. Add a call to ip.c and
bridge.c
The setsockopt fails on older kernels and the error can be safely ignored
- any new fields or attributes are ignored by the older kernel.
David Ahern [Wed, 19 Dec 2018 21:30:44 +0000 (13:30 -0800)]
ip address: Set device index in dump request
Add a filter function to rtnl_addrdump_req to set device index in the
address dump request if the user is filtering addresses by device. In
addition, add a new ipaddr_link_get to do a single RTM_GETLINK request
instead of a device dump yet still store the data in the linfo list.
David Ahern [Fri, 19 Oct 2018 22:41:39 +0000 (15:41 -0700)]
ip route: Add protocol, table id and device to dump request
Add protocol, table id and device to dump request if set in filter. If
kernel side filtering is supported it is used to reduce the amount of
data sent to userspace.
Older kernels do not parse attributes on a route dump request, so these
are silently ignored and ip will do the filtering in userspace.
David Ahern [Thu, 4 Oct 2018 21:12:39 +0000 (14:12 -0700)]
libnetlink: linkdump_req: Only AF_UNSPEC family expects an ext_filter_mask
Only AF_UNSPEC handled by rtnl_dump_ifinfo expects an ext_filter_mask
on a dump request. Update the linkdump request functions to only set
and send ext_filter_mask for AF_UNSPEC.
David Ahern [Fri, 19 Oct 2018 22:34:44 +0000 (15:34 -0700)]
libnetlink: Use NLMSG_LENGTH to set nlmsg_len
Change nlmsg_len from sizeof(req) to use NLMSG_LENGTH on the header.
2 of the inner headers are not 4-byte aligned, so add a 0-length buf
after the header with the __aligned(NLMSG_ALIGNTO) to ensure the size
of the request is large enough. Use NLMSG_ALIGN in NLMSG_LENGTH to set
nlmsg_len.
The classifier testbed test never worked and was always being
skipped. It depended on some files it tests/cls which never made
it into the iproute2 git repository.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Ahern <dsahern@gmail.com>
Luca Boccassi [Sun, 16 Dec 2018 20:55:40 +0000 (20:55 +0000)]
testsuite: remove gre kmods if the test loads them
The tunnel test leaves behind link devices created by the GRE kernel
modules:
$ ip -br link
...
gre0@NONE DOWN 0.0.0.0 <NOARP>
gretap0@NONE DOWN 00:00:00:00:00:00 <BROADCAST,MULTICAST>
erspan0@NONE DOWN 00:00:00:00:00:00 <BROADCAST,MULTICAST>
ip6tnl0@NONE DOWN :: <NOARP>
ip6gre0@NONE DOWN 00:00:00:00:
Check beforehand if the gre kernel module is loaded, and if not unload
them all at the end of the test. This should avoid causing problems if
a user is already using GRE for other purposes.
Signed-off-by: Luca Boccassi <bluca@debian.org> Reviewed-by: Petr Vorel <pvorel@suse.cz> Tested-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Luca Boccassi [Sun, 16 Dec 2018 20:55:38 +0000 (20:55 +0000)]
testsuite: declare dependency between $(TESTS) and generate_nlmsg
Parallel make from the top level directory fails since tests are at the
same time as generate_nlmsg:
$ make check -j4
...
cd testsuite && make && make alltests
echo "Entering iproute2" && cd iproute2 && make configure && cd ..;
Entering iproute2
make -C tools
Removing results dir ...
make[1]: ./tools/generate_nlmsg: Command not found
make[1]: ./tools/generate_nlmsg: Command not found
Makefile:64: recipe for target 'ip/netns/set_nsid_batch.t' failed
make[1]: *** [ip/netns/set_nsid_batch.t] Error 127
make[1]: ./tools/generate_nlmsg: Command not found
make[1]: *** Waiting for unfinished jobs....
Makefile:64: recipe for target 'ip/netns/set_nsid.t' failed
make[1]: *** [ip/netns/set_nsid.t] Error 127
Makefile:64: recipe for target 'ip/link/show_dev_wo_vf_rate.t' failed
make[1]: *** [ip/link/show_dev_wo_vf_rate.t] Error 127
CC generate_nlmsg
Makefile:123: recipe for target 'check' failed
make: *** [check] Error 2
Add an explicit dependency in testuite/Makefile's $(TESTS) rule so
that the tool correctly gets compiled before any test runs.
Fixes: 3537633dcf44 ("testsuite: Generate generate_nlmsg when needed") Signed-off-by: Luca Boccassi <bluca@debian.org> Reviewed-by: Petr Vorel <petr.vorel@gmail.com> Tested-by: Petr Vorel <petr.vorel@gmail.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Luca Boccassi [Sun, 16 Dec 2018 20:55:37 +0000 (20:55 +0000)]
Makefile: have check target depend on all
Otherwise it will simply fail immediately from a just-cleaned
workspace:
$ make check -j1
cd testsuite && make && make alltests
echo "Entering iproute2" && cd iproute2 && make configure && cd ..;
Entering iproute2
make -C tools
Makefile:3: ../../config.mk: No such file or directory
make[2]: *** No rule to make target '../../config.mk'. Stop.
Fixes: 8804a8c0d387 ("Makefile: Add check target") Signed-off-by: Luca Boccassi <bluca@debian.org> Reviewed-by: Petr Vorel <petr.vorel@gmail.com> Tested-by: Petr Vorel <petr.vorel@gmail.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Syrone Wong [Wed, 12 Dec 2018 11:35:08 +0000 (19:35 +0800)]
tc: fix xtables incorrect usage of LDFLAGS
The incorrect setting of LDFLAGS causes error below:
> em_ipt.o: In function `em_ipt_print_epot':
> em_ipt.c:(.text.em_ipt_print_epot+0x2e): undefined reference to
> `xtables_init_all'
em_ipt.c gets involved when TC_CONFIG_XT=y, which requires xtables,
while tc/Makefile doesn't pass flags correctly. It adds '-lxtables'
to LDFLAGS instead of LDLIBS.
Fixes: dd296215 ("tc: add em_ipt ematch for calling xtables matches from tc matching context") Signed-off-by: Syrone Wong <wong.syrone@gmail.com> Acked-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Leon Romanovsky [Tue, 11 Dec 2018 18:14:28 +0000 (20:14 +0200)]
rdma: Fix broken 32-bit compilation
Allow compilation of rdmatool on 32-bits platforms.
rdma
CC rdma.o
CC utils.o
CC dev.o
CC link.o
In file included from rdma.h:26:0,
from dev.c:12:
dev.c: In function 'dev_caps_tostr':
../include/utils.h:269:38: warning: left shift count >= width of type [-Wshift-count-overflow]
#define BIT(nr) (1UL << (nr))
^
rdma.h:32:61: note: in expansion of macro 'BIT'
#define RDMA_BITMAP_ENUM(name, bit_no) RDMA_BITMAP_##name = BIT(bit_no),
^~~
Fixes: 40df8263a0f0 ("rdma: Add dev object") Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The argument to print_0xhex is converted to unsigned long long
so the format string give for normal printout has to be some
variant of %llx. Otherwise, bogus values will be printed on
32 bit platforms.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
On some 32 bit platforms, the printf was causing warning:
ipmacsec.c: In function ‘getattr_u64’:
ipmacsec.c:655:47: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘unsigned int’ [-Wformat=]
fprintf(stderr, "invalid attribute length %lu\n",
Resolve by computing length as size_t first.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Hoang Le [Thu, 6 Dec 2018 01:40:06 +0000 (08:40 +0700)]
tipc: fix misalignment printout in non-JSON output
In the commit 1304f50a5be0ed ("tipc: JSON support for showing nametable"),
introduced misalignment in the columns of the printout in non-JSON mode
compare to the list header. Add one space per column to make alignment
with the list header.
before:
$tipc name show
Type Lower Upper Scope Port Node
1 1 1 node 4071367628
after:
$tipc name show
Type Lower Upper Scope Port Node
1 1 1 node 4071367628
Reported-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Petr Machata [Tue, 4 Dec 2018 16:07:41 +0000 (16:07 +0000)]
libnetlink: Process further iovs on no error
When no error is reported in the first iov, do not prematurely return,
but process further iovs. This fixes batch processing.
Fixes: c60389e4f9ea ("libnetlink: fix leak and using unused memory on error") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Emeric Dupont [Mon, 3 Dec 2018 11:13:06 +0000 (12:13 +0100)]
iproute2: Installation errors without libmnl
When performing make install in iproute2 (current git master),
if $(HAVE_MNL) is not selected, some Makefiles try to call
install with an empty target, which causes a non-critical make error.
Signed-off-by: Emeric Dupont <emeric.dupont@zii.aero> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Amritha Nambiar [Tue, 27 Nov 2018 22:40:03 +0000 (14:40 -0800)]
tc: flower: Classify packets based port ranges
Added support for filtering based on port ranges.
UAPI changes have been accepted into net-next.
Example:
1. Match on a port range:
-------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
prio 1 flower ip_proto tcp dst_port 20-30 skip_hw\
action drop
$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
eth_type ipv4
ip_proto tcp
dst_port 20-30
skip_hw
not_in_hw
action order 1: gact action drop
random type none pass val 0
index 1 ref 1 bind 1 installed 85 sec used 3 sec
Action statistics:
Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
2. Match on IP address and port range:
--------------------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port 100-200\
skip_hw action drop
$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0 handle 0x2
eth_type ipv4
ip_proto tcp
dst_ip 192.168.1.1
dst_port 100-200
skip_hw
not_in_hw
action order 1: gact action drop
random type none pass val 0
index 2 ref 1 bind 1 installed 58 sec used 2 sec
Action statistics:
Sent 920 bytes 20 pkt (dropped 20, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
v6:
Modified to change json output format as object for sport/dport.
Phil Sutter [Wed, 28 Nov 2018 11:12:32 +0000 (12:12 +0100)]
man: ip-route.8: Fix ENCAP references in synopsis
The different encapsulation types are described in ENCAP_*
non-terminals, but ENCAP definition lists them without the ENCAP_
prefix. Fix this for consistency.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Roopa Prabhu [Wed, 28 Nov 2018 02:02:52 +0000 (18:02 -0800)]
bridge: make -c match -compressvlans first instead of -color
commit c7c1a1ef51ae ("bridge: colorize output and use JSON print library")
broke previous use of -c to represent compressvlans. This restores
previous use of -c to represent compressvlans. Understand the original
motivation to use -c to represent color consistently everywhere but
there are apps and network interface managers out there that are already
using -c to prepresent compressed vlans.
Fixes: c7c1a1ef51ae ("bridge: colorize output and use JSON print library") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Eric Dumazet [Sat, 24 Nov 2018 06:37:24 +0000 (22:37 -0800)]
tc: fq: support ce_threshold attribute
Kernel commit 48872c11b772 ("net_sched: sch_fq: add dctcp-like marking")
added support for TCA_FQ_CE_THRESHOLD attribute.
This patch adds iproute2 support for it.
It also makes sure fq_print_xstats() can deal with smaller tc_fq_qd_stats
structures given by older kernels.
Usage :
FQATTRS="ce_threshold 4ms"
TXQS=8
for ETH in eth0
do
tc qd del dev $ETH root 2>/dev/null
tc qd add dev $ETH root handle 1: mq
for i in `seq 1 $TXQS`
do
tc qd add dev $ETH parent 1:$i fq $FQATTRS
done
done
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Sat, 24 Nov 2018 15:17:30 +0000 (07:17 -0800)]
Merge branch 'tc-gred-json-perf-vq-config' into iproute2-next
Jakub Kicinski says:
====================
This set brings GRED support up to date with recent kernel changes.
In particular the new netlink attributes for more fine-grained stats
and per-virtual queue flags.
To make GRED usable in modern deployments the patch set starts with
adding JSON output.
Patch added to iproute2-master breaks builds of -next because of a
more recent patch in -next that relies on the exports. Revert the
offending patch. Unfortunately this leaves a window where builds
break.
Quentin Monnet [Tue, 20 Nov 2018 01:26:27 +0000 (01:26 +0000)]
bpf: initialise map symbol before retrieving and comparing its type
In order to compare BPF map symbol type correctly in regard to the
latest LLVM, commit 7a04dd84a7f9 ("bpf: check map symbol type properly
with newer llvm compiler") compares map symbol type to both NOTYPE and
OBJECT. To do so, it first retrieves the type from "sym.st_info" and
stores it into a temporary variable.
However, the type is collected from the symbol "sym" before this latter
symbol is actually updated. gelf_getsym() is called after that and
updates "sym", and when comparison with OBJECT or NOTYPE happens it is
done on the type of the symbol collected in the previous passage of the
loop (or on an uninitialised symbol on the first passage). This may
eventually break map collection from the ELF file.
Fix this by assigning the type to the temporary variable only after the
call to gelf_getsym().
Fixes: 7a04dd84a7f9 ("bpf: check map symbol type properly with newer llvm compiler") Reported-by: Ron Philip <ron.philip@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Amritha Nambiar [Fri, 16 Nov 2018 00:55:13 +0000 (16:55 -0800)]
tc: flower: Classify packets based port ranges
Added support for filtering based on port ranges.
UAPI changes have been accepted into net-next.
Example:
1. Match on a port range:
-------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
prio 1 flower ip_proto tcp dst_port range 20-30 skip_hw\
action drop
$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
eth_type ipv4
ip_proto tcp
dst_port range 20-30
skip_hw
not_in_hw
action order 1: gact action drop
random type none pass val 0
index 1 ref 1 bind 1 installed 85 sec used 3 sec
Action statistics:
Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
2. Match on IP address and port range:
--------------------------------------
$ tc filter add dev enp4s0 protocol ip parent ffff:\
prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port range 100-200\
skip_hw action drop
$ tc -s filter show dev enp4s0 parent ffff:
filter protocol ip pref 1 flower chain 0 handle 0x2
eth_type ipv4
ip_proto tcp
dst_ip 192.168.1.1
dst_port range 100-200
skip_hw
not_in_hw
action order 1: gact action drop
random type none pass val 0
index 2 ref 1 bind 1 installed 58 sec used 2 sec
Action statistics:
Sent 920 bytes 20 pkt (dropped 20, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
v3:
Modified flower_port_range_attr_type calls.
v2:
Addressed Jiri's comment to sync output format with input