]> git.proxmox.com Git - mirror_iproute2.git/log
mirror_iproute2.git
6 years agoip: Fix compilation break on old systems
Leon Romanovsky [Mon, 13 Nov 2017 10:21:19 +0000 (12:21 +0200)]
ip: Fix compilation break on old systems

As was reported [1], the iproute2 fails to compile on old systems,
in Cong's case, it was Fedora 19, in our case it was RedHat 7.2, which
failed with the following errors during compilation:

ipxfrm.c: In function ‘xfrm_selector_print’:
ipxfrm.c:479:7: error: ‘IPPROTO_MH’ undeclared (first use in this
function)
  case IPPROTO_MH:
       ^
ipxfrm.c:479:7: note: each undeclared identifier is reported only once
for each function it appears in
ipxfrm.c: In function ‘xfrm_selector_upspec_parse’:
ipxfrm.c:1345:8: error: ‘IPPROTO_MH’ undeclared (first use in this
function)
   case IPPROTO_MH:
        ^                                                                                                                                                            make[1]: *** [ipxfrm.o] Error 1

The reason to it is the order of headers files. The IPPROTO_MH field is
set in kernel's UAPI header file (in6.h), but only in case
__UAPI_DEF_IPPROTO_V6 is set before. That define comes from other kernel's
header file (libc-compat.h) and is set in case there are no previous
libc relevant declarations.

In ip code, the include of <netdb.h> causes to indirect inclusion of
<netinet/in.h> and it sets __UAPI_DEF_IPPROTO_V6 to be zero and prevents from
IPPROTO_MH declaration.

This patch takes the simplest possible approach to fix the compilation
error by checking if IPPROTO_MH was defined before and in case it
wasn't, it defines it to be the same as in the kernel.

[1] https://www.spinics.net/lists/netdev/msg463980.html

Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Riad Abo Raed <riada@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
6 years agov4.14.0
Stephen Hemminger [Mon, 13 Nov 2017 00:29:43 +0000 (16:29 -0800)]
v4.14.0

6 years agodevlink: add batch command support
Ivan Vecera [Fri, 10 Nov 2017 06:20:14 +0000 (07:20 +0100)]
devlink: add batch command support

The patch adds support to batch devlink commands.

Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
6 years agolib: make resolve_hosts variable common
Ivan Vecera [Fri, 10 Nov 2017 06:20:13 +0000 (07:20 +0100)]
lib: make resolve_hosts variable common

Any iproute utility that uses any function from lib/utils.c needs
to declare its own resolve_hosts variable instance although it does
not need/use hostname resolving functionality (currently only 'ip'
and 'ss' commands uses this).
The patch declares single common instance of resolve_hosts directly
in utils.c so the existing ones can be removed (the same approach
that is used for timestamp_short).

Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
6 years agotc: distinguish Add/Replace qdisc operations
Roman Mashak [Thu, 26 Oct 2017 21:30:08 +0000 (17:30 -0400)]
tc: distinguish Add/Replace qdisc operations

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
6 years agoupdate kernel headers
Stephen Hemminger [Sun, 12 Nov 2017 23:55:49 +0000 (15:55 -0800)]
update kernel headers

To 4.14 final kernel version
Note: SPDX tag added by upstream

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoxfrm_{state, policy}: Allow to deleteall polices/states with marks
Thomas Egerer [Mon, 30 Oct 2017 18:11:46 +0000 (19:11 +0100)]
xfrm_{state, policy}: Allow to deleteall polices/states with marks

Using 'ip deleteall' with policies that have marks, fails unless you
eplicitely specify the mark values. This is very uncomfortable when
bulk-deleting policies and states. With this patch all relevant states
and policies are wiped by 'ip deleteall' regardless of their mark
values.

Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
6 years agoxfrm_policy: Do not attempt to deleteall a socket policy
Thomas Egerer [Mon, 30 Oct 2017 18:11:45 +0000 (19:11 +0100)]
xfrm_policy: Do not attempt to deleteall a socket policy

Socket polices are added to a socket using setsockopt(2). They cannot be
deleted by iproute2. The attempt to delete them causes an error
(EINVAL).
To avoid this unnecessary error message all socket policies are skipped
in xfrm_policy_keep.

Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
6 years agoxfrm_policy: Add filter option for socket policies
Thomas Egerer [Mon, 30 Oct 2017 18:11:44 +0000 (19:11 +0100)]
xfrm_policy: Add filter option for socket policies

Listing policies on systems with a lot of socket policies can be
confusing due to the number of returned polices. Even if socket polices
are not of interest, they cannot be filtered. This patch adds an option
to filter all socket policies from the output.

Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
6 years agoss: Fix width calculations when Netid or State columns are missing
Stefano Brivio [Tue, 31 Oct 2017 17:47:56 +0000 (18:47 +0100)]
ss: Fix width calculations when Netid or State columns are missing

If Netid or State columns are missing, we must not subtract one
for each of these two columns from the remaining screen width,
while distributing available space to columns. This one
character corresponding to one delimiting space has to be
subtracted only if the columns are actually printed.

Further, in the existing implementation, if the screen width is
an odd number, one additional character is added to the width of
one of the two columns.

But if both are not printed, this filling character needs to be
added somewhere else, in order to have the right spacing
allowing us to fill lines completely.

Address and port fields are printed in pairs (local and remote),
so we can't distribute the space to any of them, because it
would be doubled. Instead, print this additional space to the
right of the Send-Q column, to keep code changes to a minimum.

This is particularly visible with 'ss -f netlink -Z'. Before
this patch, with an 80 column terminal, we have:

$ ss -f netlink -Z|head -n3
Recv-Q Send-Q Local Address:Port                 Peer Address:Port
0      0            rtnl:evolution-calen/2049           *                     pr
oc_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
0      0            rtnl:clock-applet/1944              *                     pr
oc_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

and with an 81 column terminal:

$ ss -f netlink -Z|head -n3
Recv-Q Send-Q Local Address:Port                 Peer Address:Port
0      0            rtnl:evolution-calen/2049           *                     pro
c_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
0      0            rtnl:clock-applet/1944              *                     pro
c_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

After this patch, in both cases, the output is:
$ ss -f netlink -Z|head -n3
Recv-Q Send-Q Local Address:Port                 Peer Address:Port
0      0             rtnl:evolution-calen/2049            *
 proc_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
0      0             rtnl:clock-applet/1944               *
 proc_ctx=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
6 years agoss: Streamline process context printing in netlink_show_one()
Stefano Brivio [Tue, 31 Oct 2017 17:47:55 +0000 (18:47 +0100)]
ss: Streamline process context printing in netlink_show_one()

There's no need to check 'pid_context' before calling free().

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
6 years agoss: Remove useless width specifier in process context print
Stefano Brivio [Tue, 31 Oct 2017 17:47:54 +0000 (18:47 +0100)]
ss: Remove useless width specifier in process context print

Both local address and service, and remote address and service
fields are already printed out in netlink_show_one() before we
start printing process context, by calling sock_addr_print()
twice.

At this point, sock_addr_print() has already forced the remote
service field to be 'serv_width' wide -- that is, 'serv_width'
width has already been consumed, before we print process
context.

Hence, it makes no sense to force the display width of process
context to be 'serv_width' wide again: previous prints have
filled up the line already. Remove the width specifier and
prefix with a space instead, to keep this consistent with fields
which are displayed after the first output line.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
6 years agoip netns: use strtol() instead of atoi()
Roman Mashak [Tue, 31 Oct 2017 18:24:19 +0000 (14:24 -0400)]
ip netns: use strtol() instead of atoi()

Use strtol-based API to parse and validate integer input; atoi() does
not detect errors and may yield undefined behaviour if result can't be
represented.

v2: use get_unsigned() since network namespace is really an unsigned value.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
6 years agoUpdate kernel headers based on 4.14-rc7
Stephen Hemminger [Tue, 31 Oct 2017 17:01:51 +0000 (18:01 +0100)]
Update kernel headers based on 4.14-rc7

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotc: m_ife: fix match tcindex parsing
Alexander Aring [Mon, 30 Oct 2017 16:37:49 +0000 (12:37 -0400)]
tc: m_ife: fix match tcindex parsing

This patch changes ife_prio to ife_tcindex which is right variable to
assign in the argument in this case.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
6 years agoip: added missing newline in man page
Roman Mashak [Fri, 27 Oct 2017 19:05:34 +0000 (15:05 -0400)]
ip: added missing newline in man page

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
6 years agobridge: checkpatch related cleanups
Stephen Hemminger [Fri, 27 Oct 2017 07:15:23 +0000 (09:15 +0200)]
bridge: checkpatch related cleanups

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agobridge: request vlans along with link information
Roman Mashak [Fri, 8 Sep 2017 21:52:23 +0000 (17:52 -0400)]
bridge: request vlans along with link information

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
6 years agobridge: dump vlan table information for link
Roman Mashak [Fri, 8 Sep 2017 21:52:22 +0000 (17:52 -0400)]
bridge: dump vlan table information for link

Kernel also reports vlans a port is member of, so print it. Since vlan
table can be quite large, dump it only when detailed information is
requested.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
6 years agobridge: isolate vlans parsing code in a separate API
Roman Mashak [Fri, 8 Sep 2017 21:52:21 +0000 (17:52 -0400)]
bridge: isolate vlans parsing code in a separate API

IFLA_BRIDGE_VLAN_INFO parsing logic will be used in link and vlan
processing code, so it makes sense to move it in the separate function.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
6 years agoman: add additional explainations for ss
yupeng [Thu, 26 Oct 2017 07:15:31 +0000 (07:15 +0000)]
man: add additional explainations for ss

Add detail explains of -m, -o, -e and -i options, which are not documented anywhere

Signed-off-by: yupeng <yupeng0921@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
6 years agotc/actions: introduce support for jump action
Jamal Hadi Salim [Sun, 22 Oct 2017 14:48:10 +0000 (10:48 -0400)]
tc/actions: introduce support for jump action

Sample use case:

... add ingress qdisc
sudo $TC qdisc add dev $ETH ingress

 ... if we exceed rate of 1kbps (burst of 90K), do an absolute jump of 2 actions
sudo $TC actions add action police rate 1kbit burst 90k conform-exceed jump 2 / pipe

sudo $TC -s actions ls action police
 action order 0:  police 0x4 rate 1Kbit burst 23440b mtu 2Kb action jump 2/pipe overhead 0b
 ref 1 bind 0 installed 41 sec used 41 sec
 Action statistics:
  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0

... lets add a couple of marks so we can use them to mark exceed/not exceed
sudo $TC actions add action skbedit mark 11 ok index 11
sudo $TC actions add action skbedit mark 12 ok index 12

... if we dont exceed our rate we get a mark of 11, else mark of 12
sudo $TC filter add dev $ETH parent ffff: protocol ip prio 8 u32 \
match ip dst 127.0.0.8/32 flowid 1:10 \
action police index 4 \
action skbedit index 11 \
action skbedit index 12

Ok, lets keep this thing a little busy..
sudo ping -f -c 10000 127.0.0.8

... now lets see the filters..
sudo $TC -s filter ls dev $ETH parent ffff: protocol ip
filter pref 8 u32 chain 0
filter pref 8 u32 chain 0 fh 800: ht divisor 1
filter pref 8 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 not_in_hw  (rule hit 20000 success 10000)
  match 7f000008/ffffffff at 16 (success 10000 )
action order 1:  police 0x4 rate 1Kbit burst 23440b mtu 2Kb action jump 2/pipe overhead 0b
ref 2 bind 1 installed 198 sec used 2 sec
Action statistics:
Sent 840000 bytes 10000 pkt (dropped 0, overlimits 9721 requeues 0)
backlog 0b 0p requeues 0

action order 2:  skbedit mark 11 pass
 index 11 ref 2 bind 1 installed 127 sec used 2 sec
  Action statistics:
Sent 23436 bytes 279 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

action order 3:  skbedit mark 12 pass
 index 12 ref 2 bind 1 installed 127 sec used 2 sec
  Action statistics:
Sent 816564 bytes 9721 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

As can be seen 97.21% of the packets were marked as exceeding the allocated
rate; you could do something clever with the skb mark after this.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoss: initialize 'fackets' member of tcpstat structure
Roman Mashak [Wed, 18 Oct 2017 19:44:01 +0000 (15:44 -0400)]
ss: initialize 'fackets' member of tcpstat structure

'fackets' has never been initialized with kernel extracted information, thus
never really printed.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
6 years agoip maddr: fix filtering by device
Michal Kubecek [Thu, 19 Oct 2017 08:21:08 +0000 (10:21 +0200)]
ip maddr: fix filtering by device

Commit 530903dd9003 ("ip: fix igmp parsing when iface is long") uses
variable len to keep trailing colon from interface name comparison.  This
variable is local to loop body but we set it in one pass and use it in
following one(s) so that we are actually using (pseudo)random length for
comparison. This became apparent since commit b48a1161f5f9 ("ipmaddr: Avoid
accessing uninitialized data") always initializes len to zero so that the
name comparison is always true. As a result, "ip maddr show dev eth0" shows
IPv4 multicast addresses for all interfaces.

Instead of keeping the length, let's simply replace the trailing colon with
a null byte. The bonus is that we get correct interface name in ma.name.

Fixes: 530903dd9003 ("ip: fix igmp parsing when iface is long")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Phil Sutter <phil@nwl.cc>
Acked-by: Petr Vorel <pvorel@suse.cz>
6 years agoss: Detect IPPROTO_ICMPV6 sockets
Phil Sutter [Wed, 18 Oct 2017 18:08:26 +0000 (20:08 +0200)]
ss: Detect IPPROTO_ICMPV6 sockets

Prefix IPPROTO_ICMPV6 sockets with 'icmp6' instead of '???'.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoss: Distinguish between IPv4 and IPv6 wildcard sockets
Phil Sutter [Wed, 18 Oct 2017 17:58:13 +0000 (19:58 +0200)]
ss: Distinguish between IPv4 and IPv6 wildcard sockets

Commit aba9c23a6e1cb ("ss: enclose IPv6 address in brackets") unified
display of wildcard sockets in IPv4 and IPv6 to print the unspecified
address as '*'. Users then complained that they can't distinguish
between address families anymore, so change this again to what Stephen
Hemminger suggested:

| *:80    << both IPV6 and IPV4
| [::]:80 << IPV6_ONLY
| 0.0.0.0:80  << IPV4_ONLY

Note that on older kernels which don't support INET_DIAG_SKV6ONLY
attribute, pure IPv6 sockets will still show as '*'.

Cc: Humberto Alves <hjalves@live.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/shemminger...
Stephen Hemminger [Thu, 19 Oct 2017 00:11:50 +0000 (17:11 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2

6 years agocolor: Rename enum
Petr Vorel [Fri, 13 Oct 2017 13:57:19 +0000 (15:57 +0200)]
color: Rename enum

COLOR_NONE is more descriptive than COLOR_CLEAR.

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
6 years agocolor: Cleanup code to remove "magic" offset + 7
Petr Vorel [Fri, 13 Oct 2017 13:57:18 +0000 (15:57 +0200)]
color: Cleanup code to remove "magic" offset + 7

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
6 years agocolor: Fix another ip segfault when using --color switch
Petr Vorel [Fri, 13 Oct 2017 13:57:17 +0000 (15:57 +0200)]
color: Fix another ip segfault when using --color switch

Commit 959f1428 ("color: add new COLOR_NONE and disable_color function")
introducing color enum COLOR_NONE, which is not only duplicite of
COLOR_CLEAR, but also caused segfault, when running ip with --color
switch, as 'attr + 8' in color_fprintf() access array item out of
bounds. Thus removing it and restoring "magic" offset + 7.

Reproduce with:
$ ip -c a

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
6 years agocolor: Fix ip segfault when using --color switch
Petr Vorel [Fri, 13 Oct 2017 13:57:16 +0000 (15:57 +0200)]
color: Fix ip segfault when using --color switch

Commit d0e72011 ("ip: ipaddress.c: add support for json output")
introduced passing -1 as enum color_attr. This is not only wrong as no
color_attr has value -1, but also causes another segfault in color_fprintf()
on this setup as there is no item with index -1 in array of enum attr_colors[].
Using COLOR_CLEAR is valid option.

Reproduce with:
$ COLORFGBG='0;15' ip -c a

NOTE: COLORFGBG is environmental variable used for defining whether user
has light or dark background.
COLORFGBG="0;15" is used to ask for color set suitable for light background,
COLORFGBG="15;0" is used to ask for color set suitable for dark background.

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
6 years agotests: Revert back /bin/sh in shebang
Petr Vorel [Sun, 15 Oct 2017 09:59:45 +0000 (11:59 +0200)]
tests: Revert back /bin/sh in shebang

This was added by mistake in commit ecd44e68
("tests: Remove bashisms (s/source/.)")

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
6 years agonetem: fix code indentation
Stephen Hemminger [Thu, 12 Oct 2017 01:08:15 +0000 (18:08 -0700)]
netem: fix code indentation

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoss: print MD5 signature keys configured on TCP sockets
Ivan Delalande [Fri, 6 Oct 2017 23:48:20 +0000 (16:48 -0700)]
ss: print MD5 signature keys configured on TCP sockets

These keys are reported by kernel 4.14 and later under the
INET_DIAG_MD5SIG attribute, when INET_DIAG_INFO is requested (ss -i)
and we have CAP_NET_ADMIN. The additional output looks like:

md5keys:fe80::/64=signing_key,10.1.2.0/24=foobar,::1/128=Test

Signed-off-by: Ivan Delalande <colona@arista.com>
6 years agoutils: add print_escape_buf to format and print arbitrary bytes
Ivan Delalande [Fri, 6 Oct 2017 23:48:19 +0000 (16:48 -0700)]
utils: add print_escape_buf to format and print arbitrary bytes

Keep it as simple as possible for now: just escape anything that is not
isprint-able, is among the "escape" parameter or '\' as an octal escape
sequence. This should be pretty easy to extend if any other user needs
something more complex in the future.

Signed-off-by: Ivan Delalande <colona@arista.com>
6 years agolib: fix multiple strlcpy definition
Baruch Siach [Mon, 9 Oct 2017 05:49:44 +0000 (08:49 +0300)]
lib: fix multiple strlcpy definition

Some C libraries, like uClibc and musl, provide BSD compatible
strlcpy(). Add check_strlcpy() to configure, and avoid defining strlcpy
and strlcat when the C library provides them.

This fixes the following static link error with uClibc-ng:

.../sysroot/usr/lib/libc.a(strlcpy.os): In function `strlcpy':
strlcpy.c:(.text+0x0): multiple definition of `strlcpy'
../lib/libutil.a(utils.o):utils.c:(.text+0x1ddc): first defined here
collect2: error: ld returned 1 exit status

Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
6 years agotests: Remove bashisms (s/source/.)
Petr Vorel [Sun, 8 Oct 2017 14:39:16 +0000 (16:39 +0200)]
tests: Remove bashisms (s/source/.)

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
6 years agordma: move headers to uapi
Stephen Hemminger [Wed, 11 Oct 2017 17:47:28 +0000 (10:47 -0700)]
rdma: move headers to uapi

And update with version from upstream.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiproute: build more easily on Android
Lorenzo Colitti [Mon, 2 Oct 2017 17:03:37 +0000 (02:03 +0900)]
iproute: build more easily on Android

iproute2 contains a bunch of kernel headers, including uapi ones.
Android's libc uses uapi headers almost directly, and uses a
script to fix kernel types that don't match what userspace
expects.

For example: https://issuetracker.google.com/36987220 reports
that our struct ip_mreq_source contains "__be32 imr_multiaddr"
rather than "struct in_addr imr_multiaddr". The script addresses
this by replacing the uapi struct definition with a #include
<bits/ip_mreq.h> which contains the traditional userspace
definition.

Unfortunately, when we compile iproute2, this definition
conflicts with the one in iproute2's linux/in.h.

Historically we've just solved this problem by running "git rm"
on all the iproute2 include/linux headers that break Android's
libc.  However, deleting the files in this way makes it harder to
keep up with upstream, because every upstream change to
an include file causes a merge conflict with the delete.

This patch fixes the problem by moving the iproute2 linux headers
from include/linux to include/uapi/linux.

Tested: compiles on ubuntu trusty (glibc)

Signed-off-by: Elliott Hughes <enh@google.com>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
6 years agotipc: don't need custom CFLAGS
Stephen Hemminger [Wed, 11 Oct 2017 17:35:00 +0000 (10:35 -0700)]
tipc: don't need custom CFLAGS

Since libmnl CFLAGS are now handled by config.mk

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoCheck user supplied interface name lengths
Phil Sutter [Mon, 2 Oct 2017 11:46:37 +0000 (13:46 +0200)]
Check user supplied interface name lengths

The original problem was that something like:

| strncpy(ifr.ifr_name, *argv, IFNAMSIZ);

might leave ifr.ifr_name unterminated if length of *argv exceeds
IFNAMSIZ. In order to fix this, I thought about replacing all those
cases with (equivalent) calls to snprintf() or even introducing
strlcpy(). But as Ulrich Drepper correctly pointed out when rejecting
the latter from being added to glibc, truncating a string without
notifying the user is not to be considered good practice. So let's
excercise what he suggested and reject empty, overlong or otherwise
invalid interface names right from the start - this way calls to
strncpy() like shown above become safe and the user has a chance to
reconsider what he was trying to do.

Note that this doesn't add calls to check_ifname() to all places where
user supplied interface name is parsed. In many cases, the interface
must exist already and is therefore looked up using ll_name_to_index(),
so if_nametoindex() will perform the necessary checks already.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agotc: flower: No need to cache indev arg
Phil Sutter [Mon, 2 Oct 2017 11:46:36 +0000 (13:46 +0200)]
tc: flower: No need to cache indev arg

Since addattrstrz() will copy the provided string into the attribute
payload, there is no need to cache the data.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoip{6, }tunnel: Avoid copying user-supplied interface name around
Phil Sutter [Mon, 2 Oct 2017 11:46:35 +0000 (13:46 +0200)]
ip{6, }tunnel: Avoid copying user-supplied interface name around

In both files' parse_args() functions as well as in iptunnel's do_prl()
and do_6rd() functions, a user-supplied 'dev' parameter is uselessly
copied into a temporary buffer before passing it to ll_name_to_index()
or copying into a struct ifreq.  Avoid this by just caching the argv
pointer value until the later lookup/strcpy.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoip xfrm: use correct key length for netlink message
Michal Kubecek [Fri, 29 Sep 2017 11:41:05 +0000 (13:41 +0200)]
ip xfrm: use correct key length for netlink message

When SA is added manually using "ip xfrm state add", xfrm_state_modify()
uses alg_key_len field of struct xfrm_algo for the length of key passed to
kernel in the netlink message. However alg_key_len is bit length of the key
while we need byte length here. This is usually harmless as kernel ignores
the excess data but when the bit length of the key exceeds 512
(XFRM_ALGO_KEY_BUF_SIZE), it can result in buffer overflow.

We can simply divide by 8 here as the only place setting alg_key_len is in
xfrm_algo_parse() where it is always set to a multiple of 8 (and there are
already multiple places using "algo->alg_key_len / 8").

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
6 years agotc: fix ipv6 filter selector attribute for some prefix lengths
Yulia Kartseva [Sun, 1 Oct 2017 03:18:40 +0000 (20:18 -0700)]
tc: fix ipv6 filter selector attribute for some prefix lengths

Wrong TCA_U32_SEL attribute packing if prefixLen AND 0x1f equals 0x1f.
These are  /31, /63, /95 and /127 prefix lengths.

Example:
ip6 dst face:b00f::/31
filter parent b: protocol ipv6 pref 2307 u32
filter parent b: protocol ipv6 pref 2307 u32 fh 800: ht divisor 1
filter parent b: protocol ipv6 pref 2307 u32 fh 800::800 order 2048
key ht 800 bkt 0
  match faceb00f/ffffffff at 24

v2: previous patch was made with a wrong repo

Signed-off-by: Yulia Kartseva <hex@fb.com>
6 years agoip-route: Fix for listing routes with RTAX_LOCK attribute
Phil Sutter [Thu, 28 Sep 2017 17:33:56 +0000 (19:33 +0200)]
ip-route: Fix for listing routes with RTAX_LOCK attribute

This fixes a corner-case for routes with a certain metric locked to
zero:

| ip route add 192.168.7.0/24 dev eth0 window 0
| ip route add 192.168.7.0/24 dev eth0 window lock 0

Since the kernel doesn't dump the attribute if it is zero, both routes
added above would appear as if they were equal although they are not.

Fix this by taking mxlock value for the given metric into account before
skipping it if it is not present.

Reported-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agodoc: drop old ip command documentation
Stephen Hemminger [Fri, 29 Sep 2017 17:50:13 +0000 (10:50 -0700)]
doc: drop old ip command documentation

The old IP cross reference manual was very out of date, barely updated
since 1999.  The correct documentation is in the man pages.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agolib: json_print: rework 'new_json_obj' drop FILE* argument
Julien Fortin [Tue, 26 Sep 2017 23:45:39 +0000 (16:45 -0700)]
lib: json_print: rework 'new_json_obj' drop FILE* argument

As Stephen Hemminger mentioned on the last submission the new_json_obj
function is always called with fp == stdout, so right now, there's no
need of this extra argument.

The background for the rework is the following:
The ip monitor didn't call `new_json_obj` (even for in non json context),
so the static FILE* _fp variable wasn't initialized, thus raising a
SIGSEGV in ipaddress.c. This patch should fix this issue for good, new
paths won't have to call `new_json_obj`.

How to reproduce:

$ ip -t mon label link
(gdb) bt
.#0  _IO_vfprintf_internal (s=s@entry=0x0, format=format@entry=0x45460d “%d: “, ap=ap@entry=0x7fffffff7f18) at vfprintf.c:1278
.#1  0x0000000000451310 in color_fprintf (fp=0x0, attr=<optimized out>, fmt=0x45460d “%d: “) at color.c:108
.#2  0x000000000044a856 in print_color_int (t=t@entry=PRINT_ANY, color=color@entry=4294967295, key=key@entry=0x4545fc “ifindex”,
    fmt=fmt@entry=0x45460d “%d: “, value=<optimized out>) at ip_print.c:132
.#3  0x000000000040ccd2 in print_int (value=<optimized out>, fmt=0x45460d “%d: “, key=0x4545fc “ifindex”, t=PRINT_ANY) at ip_common.h:189
.#4  print_linkinfo (who=<optimized out>, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipaddress.c:1107
.#5  0x0000000000422e13 in accept_msg (who=0x7fffffff8320, ctrl=0x7fffffff8310, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipmonitor.c:89
.#6  0x000000000044c58f in rtnl_listen (rtnl=0x672160 <rth>, handler=handler@entry=0x422c70 <accept_msg>, jarg=0x7ffff77a82a0 <_IO_2_1_stdout_>)
    at libnetlink.c:761
.#7  0x00000000004233db in do_ipmonitor (argc=<optimized out>, argv=0x7fffffffe5a0) at ipmonitor.c:310
.#8  0x0000000000408f74 in do_cmd (argv0=0x7fffffffe7f5 “mon”, argc=3, argv=0x7fffffffe588) at ip.c:116
.#9  0x0000000000408a94 in main (argc=4, argv=0x7fffffffe580) at ip.c:311

Fixes: 6377572f ("ip: ip_print: add new API to print JSON or regular format output")
Reported-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
6 years agodoc: remove outdated IPv6 flow label document
Stephen Hemminger [Fri, 29 Sep 2017 17:06:50 +0000 (10:06 -0700)]
doc: remove outdated IPv6 flow label document

Not updated since Linux 2.2

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodoc: remove outdated tc-filters documentation
Stephen Hemminger [Fri, 29 Sep 2017 17:05:09 +0000 (10:05 -0700)]
doc: remove outdated tc-filters documentation

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoignore generated Config file
Stephen Hemminger [Fri, 29 Sep 2017 17:02:31 +0000 (10:02 -0700)]
ignore generated Config file

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodoc: remove outdated nstat/rtstat documentation
Stephen Hemminger [Fri, 29 Sep 2017 17:01:15 +0000 (10:01 -0700)]
doc: remove outdated nstat/rtstat documentation

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodoc: remove outdated arpd documentation
Stephen Hemminger [Fri, 29 Sep 2017 17:00:12 +0000 (10:00 -0700)]
doc: remove outdated arpd documentation

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodoc: remove outdated ss documentation
Stephen Hemminger [Fri, 29 Sep 2017 16:58:39 +0000 (09:58 -0700)]
doc: remove outdated ss documentation

The current version is well documented on man page.
The latex documentation is very old and was never upated.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodoc: remove obsolete ip-tunnels documentation
Stephen Hemminger [Fri, 29 Sep 2017 16:57:19 +0000 (09:57 -0700)]
doc: remove obsolete ip-tunnels documentation

This file has not been updated since conversion to git
and is really old and outdated.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agolib: json_print: rework 'new_json_obj' drop FILE* argument
Julien Fortin [Tue, 26 Sep 2017 23:45:39 +0000 (16:45 -0700)]
lib: json_print: rework 'new_json_obj' drop FILE* argument

As Stephen Hemminger mentioned on the last submission the new_json_obj
function is always called with fp == stdout, so right now, there's no
need of this extra argument.

The background for the rework is the following:
The ip monitor didn't call `new_json_obj` (even for in non json context),
so the static FILE* _fp variable wasn't initialized, thus raising a
SIGSEGV in ipaddress.c. This patch should fix this issue for good, new
paths won't have to call `new_json_obj`.

How to reproduce:

$ ip -t mon label link
(gdb) bt
.#0  _IO_vfprintf_internal (s=s@entry=0x0, format=format@entry=0x45460d “%d: “, ap=ap@entry=0x7fffffff7f18) at vfprintf.c:1278
.#1  0x0000000000451310 in color_fprintf (fp=0x0, attr=<optimized out>, fmt=0x45460d “%d: “) at color.c:108
.#2  0x000000000044a856 in print_color_int (t=t@entry=PRINT_ANY, color=color@entry=4294967295, key=key@entry=0x4545fc “ifindex”,
    fmt=fmt@entry=0x45460d “%d: “, value=<optimized out>) at ip_print.c:132
.#3  0x000000000040ccd2 in print_int (value=<optimized out>, fmt=0x45460d “%d: “, key=0x4545fc “ifindex”, t=PRINT_ANY) at ip_common.h:189
.#4  print_linkinfo (who=<optimized out>, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipaddress.c:1107
.#5  0x0000000000422e13 in accept_msg (who=0x7fffffff8320, ctrl=0x7fffffff8310, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipmonitor.c:89
.#6  0x000000000044c58f in rtnl_listen (rtnl=0x672160 <rth>, handler=handler@entry=0x422c70 <accept_msg>, jarg=0x7ffff77a82a0 <_IO_2_1_stdout_>)
    at libnetlink.c:761
.#7  0x00000000004233db in do_ipmonitor (argc=<optimized out>, argv=0x7fffffffe5a0) at ipmonitor.c:310
.#8  0x0000000000408f74 in do_cmd (argv0=0x7fffffffe7f5 “mon”, argc=3, argv=0x7fffffffe588) at ip.c:116
.#9  0x0000000000408a94 in main (argc=4, argv=0x7fffffffe580) at ip.c:311

Fixes: 6377572f ("ip: ip_print: add new API to print JSON or regular format output")
Reported-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
6 years agoman: fix documentation for range of route table ID
Thomas Haller [Fri, 22 Sep 2017 11:28:54 +0000 (13:28 +0200)]
man: fix documentation for range of route table ID

Signed-off-by: Thomas Haller <thaller@redhat.com>
6 years agobpf: properly output json for xdp
Daniel Borkmann [Thu, 21 Sep 2017 08:42:29 +0000 (10:42 +0200)]
bpf: properly output json for xdp

After merging net-next branch into master, Stephen asked
to fix up json dump for XDP. Thus, rework the json dump a
bit, such that 'ip -json l' looks as below.

  [{
        "ifindex": 1,
        "ifname": "lo",
        "flags": ["LOOPBACK","UP","LOWER_UP"],
        "mtu": 65536,
        "xdp": {
            "mode": 2,
            "prog": {
                "id": 5,
                "tag": "e1e9d0ec0f55d638",
                "jited": 1
            }
        },
        "qdisc": "noqueue",
        "operstate": "UNKNOWN",
        "linkmode": "DEFAULT",
        "group": "default",
        "txqlen": 1000,
        "link_type": "loopback",
        "address": "00:00:00:00:00:00",
        "broadcast": "00:00:00:00:00:00"
    },[...]
  ]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agojson: move json printer to common library
Daniel Borkmann [Thu, 21 Sep 2017 08:42:28 +0000 (10:42 +0200)]
json: move json printer to common library

Move the json printer which is based on json writer into the
iproute2 library, so it can be used by library code and tools
other than ip. Should probably have been done from the beginning
like that given json writer is in the library already anyway.
No functional changes.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Julien Fortin <julien@cumulusnetworks.com>
6 years agoip: ipaddress: fix missing space after prefixlen
Julien Fortin [Wed, 20 Sep 2017 20:26:51 +0000 (13:26 -0700)]
ip: ipaddress: fix missing space after prefixlen

Fixes: d0e720111aad2 ("ip: ipaddress.c: add support for json output")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
6 years agotc: fix typo in tc-tcindex man page
Davide Caratti [Thu, 14 Sep 2017 15:00:46 +0000 (17:00 +0200)]
tc: fix typo in tc-tcindex man page

fix mis-typed 'pass_on' keyword.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
6 years agoBPF: update headers from 4.14-rc1
Stephen Hemminger [Thu, 21 Sep 2017 01:00:36 +0000 (18:00 -0700)]
BPF: update headers from 4.14-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotc: fq: support low_rate_threshold attribute
Eric Dumazet [Fri, 8 Sep 2017 21:12:59 +0000 (14:12 -0700)]
tc: fq: support low_rate_threshold attribute

TCA_FQ_LOW_RATE_THRESHOLD sch_fq attribute was added in linux-4.9

Tested:

lpaa5:/tmp# tc -qd add dev eth1 root fq
lpaa5:/tmp# tc -s qd sh dev eth1
qdisc fq 8003: root refcnt 5 limit 10000p flow_limit 1000p buckets 4096 \
 orphan_mask 4095 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 quantum 3648 \
 initial_quantum 18240 low_rate_threshold 550Kbit refill_delay 40.0ms
 Sent 62139 bytes 395 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  116 flows (114 inactive, 0 throttled)
  1 gc, 0 highprio, 0 throttled

lpaa5:/tmp# ./netperf -H lpaa6 -t TCP_RR -l10 -- -q 500000 -r 300,300 -o P99_LATENCY
99th Percentile Latency Microseconds
7081

lpaa5:/tmp# tc qd replace dev eth1 root fq low_rate_threshold 10Mbit
lpaa5:/tmp# ./netperf -H lpaa6 -t TCP_RR -l10 -- -q 500000 -r 300,300 -o P99_LATENCY
99th Percentile Latency Microseconds
858

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
6 years agoipaddress: Fix segfault in 'addr showdump'
Phil Sutter [Tue, 12 Sep 2017 14:58:12 +0000 (16:58 +0200)]
ipaddress: Fix segfault in 'addr showdump'

Obviously, 'addr showdump' feature wasn't adjusted to json output
support. As a consequence, calls to print_string() in print_addrinfo()
tried to dereference a NULL FILE pointer.

Fixes: d0e720111aad2 ("ip: ipaddress.c: add support for json output")
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agodevlink: Add support for protocol IPv4/IPv6/Ethernet special formats
Arkadi Sharshevsky [Thu, 7 Sep 2017 14:26:43 +0000 (17:26 +0300)]
devlink: Add support for protocol IPv4/IPv6/Ethernet special formats

Add support for protocol IPv4/IPv6/Ethernet special formats.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agodevlink: Add support for special format protocol headers
Arkadi Sharshevsky [Thu, 7 Sep 2017 14:26:41 +0000 (17:26 +0300)]
devlink: Add support for special format protocol headers

In case of global header (protocol header), the header:field ids are used
to perform lookup for special format printer. In case no printer existence
fallback to plain value printing.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agodevlink: Make match/action parsing more flexible
Arkadi Sharshevsky [Thu, 7 Sep 2017 14:26:40 +0000 (17:26 +0300)]
devlink: Make match/action parsing more flexible

This patch decouples the match/action parsing from printing. This is
done as a preparation for adding the ability to print global header
values, for example print IPv4 address, which require special formatting.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agoutils: strlcpy() and strlcat() don't clobber dst
Phil Sutter [Wed, 6 Sep 2017 16:51:42 +0000 (18:51 +0200)]
utils: strlcpy() and strlcat() don't clobber dst

As David Laight correctly pointed out, the first version of strlcpy()
modified dst buffer behind the string copied into it. Fix this by
writing NUL to the byte immediately following src string instead of to
the last byte in dst. Doing so also allows to reduce overhead by using
memcpy().

Improve strlcat() by avoiding the call to strlcpy() if dst string is
already full, not just as sanity check.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoMerge branch 'net-next'
Stephen Hemminger [Tue, 5 Sep 2017 16:48:36 +0000 (09:48 -0700)]
Merge branch 'net-next'

6 years agov4.13.0 v4.13.0
Stephen Hemminger [Tue, 5 Sep 2017 16:39:32 +0000 (09:39 -0700)]
v4.13.0

6 years agoupdate headers from 4.14 merge
Stephen Hemminger [Tue, 5 Sep 2017 16:36:54 +0000 (09:36 -0700)]
update headers from 4.14 merge

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Tue, 5 Sep 2017 16:33:29 +0000 (09:33 -0700)]
Merge branch 'master' into net-next

6 years agobpf: consolidate dumps to use bpf_dump_prog_info
Daniel Borkmann [Tue, 5 Sep 2017 00:24:32 +0000 (02:24 +0200)]
bpf: consolidate dumps to use bpf_dump_prog_info

Consolidate dump of prog info to use bpf_dump_prog_info() when possible.
Moving forward, we want to have a consistent output for BPF progs when
being dumped. E.g. in cls/act case we used to dump tag as a separate
netlink attribute before we had BPF_OBJ_GET_INFO_BY_FD bpf(2) command.

Move dumping tag into bpf_dump_prog_info() as well, and only dump the
netlink attribute for older kernels. Also, reuse bpf_dump_prog_info()
for XDP case, so we can dump tag and whether program was jited, which
we currently don't show.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agobpf: minor cleanups for bpf_trace_pipe
Daniel Borkmann [Tue, 5 Sep 2017 00:24:31 +0000 (02:24 +0200)]
bpf: minor cleanups for bpf_trace_pipe

Just minor nits, e.g. no need to fflush() and instead of returning
right away, just break and close the fd.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agotc actions: store and dump correct length of user cookies
Simon Horman [Tue, 5 Sep 2017 11:06:24 +0000 (13:06 +0200)]
tc actions: store and dump correct length of user cookies

Correct two errors which cancel each other out:
* Do not send twice the length of the actual provided by the user to the kernel
* Do not dump half the length of the cookie provided by the kernel

As the cookie is now stored in the kernel at its correct length rather
than double the that length cookies of up to the maximum size of 16 bytes
may now be stored rather than a maximum of half that length.

Output of dump is the same before and after this change,
but the data stored in the kernel is now exactly the cookie
rather than the cookie + as many trailing zeros.

Before:
 # tc filter add dev eth0 protocol ip parent ffff: \
       flower ip_proto udp action drop \
       cookie 0123456789abcdef0123456789abcdef
 RTNETLINK answers: Invalid argument

After:
 # tc filter add dev eth0 protocol ip parent ffff: \
       flower ip_proto udp action drop \
       cookie 0123456789abcdef0123456789abcdef
 # tc filter show dev eth0 ingress
   eth_type ipv4
   ip_proto udp
   not_in_hw
 action order 1: gact action drop
  random type none pass val 0
  index 1 ref 1 bind 1 installed 1 sec used 1 sec
 Action statistics:
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 cookie len 16 0123456789abcdef0123456789abcdef

Fixes: fd8b3d2c1b9b ("actions: Add support for user cookies")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
6 years agolib/bpf: Fix bytecode-file parsing
Phil Sutter [Tue, 29 Aug 2017 15:09:45 +0000 (17:09 +0200)]
lib/bpf: Fix bytecode-file parsing

The signedness of char type is implementation dependent, and there are
architectures on which it is unsigned by default. In that case, the
check whether fgetc() returned EOF failed because the return value was
assigned an (unsigned) char variable prior to comparison with EOF (which
is defined to -1). Fix this by using int as type for 'c' variable, which
also matches the declaration of fgetc().

While being at it, fix the parser logic to correctly handle multiple
empty lines and consecutive whitespace and tab characters to further
improve the parser's robustness. Note that this will still detect double
separator characters, so doesn't soften up the parser too much.

Fixes: 3da3ebfca85b8 ("bpf: Make bytecode-file reading a little more robust")
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 1 Sep 2017 21:15:31 +0000 (14:15 -0700)]
Merge branch 'master' into net-next

6 years agoiplink: double the buffer size also in iplink_get()
Michal Kubecek [Fri, 1 Sep 2017 16:39:16 +0000 (18:39 +0200)]
iplink: double the buffer size also in iplink_get()

Commit 72b365e8e0fd ("libnetlink: Double the dump buffer size") increased
the buffer size for "ip link show" command to 32 KB to handle NICs with
large number of VFs. With "dev" filter, a different code path is taken and
iplink_get() still uses only 16 KB buffer.

The size of 32768 is not very future-proof as NICs supporting 120-128 VFs
are already in use so that single RTM_NEWLINK message in the dump can
exceed 30000 bytes. But it's what rtnl_talk() and rtnl_dump_filter_l() use
so let's be consistent. Once this proves insufficient, all three sizes
should be increased.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
6 years agoiplink: check for message truncation in iplink_get()
Michal Kubecek [Fri, 1 Sep 2017 16:39:11 +0000 (18:39 +0200)]
iplink: check for message truncation in iplink_get()

If message length exceeds maxlen argument of rtnl_talk(), it is truncated
to maxlen but unlike in the case of truncation to the length of local
buffer in rtnl_talk(), the caller doesn't get any indication of a problem.

In particular, iplink_get() passes the truncated message on and parsing it
results in various warnings and sometimes even a segfault (observed with
"ip link show dev ..." for a NIC with 125 VFs).

Handle message truncation in iplink_get() the same way as truncation in
rtnl_talk() would be handled: return an error.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 1 Sep 2017 19:17:48 +0000 (12:17 -0700)]
Merge branch 'master' into net-next

Needed to add JSON support to tclass.

6 years agolnstat_util: Make sure buffer is NUL-terminated
Phil Sutter [Fri, 1 Sep 2017 16:52:56 +0000 (18:52 +0200)]
lnstat_util: Make sure buffer is NUL-terminated

Can't use strlcpy() here since lnstat is not linked against libutil.

While being at it, fix coding style in that chunk as well.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agotc_util: No need to terminate an snprintf'ed buffer
Phil Sutter [Fri, 1 Sep 2017 16:52:55 +0000 (18:52 +0200)]
tc_util: No need to terminate an snprintf'ed buffer

snprintf() won't leave the buffer unterminated, so manually terminating
is not necessary here.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoipxfrm: Replace STRBUF_CAT macro with strlcat()
Phil Sutter [Fri, 1 Sep 2017 16:52:54 +0000 (18:52 +0200)]
ipxfrm: Replace STRBUF_CAT macro with strlcat()

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoConvert harmful calls to strncpy() to strlcpy()
Phil Sutter [Fri, 1 Sep 2017 16:52:53 +0000 (18:52 +0200)]
Convert harmful calls to strncpy() to strlcpy()

This patch converts spots where manual buffer termination was missing to
strlcpy() since that does what is needed.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoConvert the obvious cases to strlcpy()
Phil Sutter [Fri, 1 Sep 2017 16:52:52 +0000 (18:52 +0200)]
Convert the obvious cases to strlcpy()

This converts the typical idiom of manually terminating the buffer after
a call to strncpy().

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoutils: Implement strlcpy() and strlcat()
Phil Sutter [Fri, 1 Sep 2017 16:52:51 +0000 (18:52 +0200)]
utils: Implement strlcpy() and strlcat()

By making use of strncpy(), both implementations are really simple so
there is no need to add libbsd as additional dependency.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agolink_gre6: Print the tunnel's tclass setting
Phil Sutter [Fri, 1 Sep 2017 14:08:09 +0000 (16:08 +0200)]
link_gre6: Print the tunnel's tclass setting

Print the value analogous to flowlabel. While being at it, also break
the overlong lines to not exceed 80 characters boundary.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agolink_gre6: Fix for changing tclass/flowlabel
Phil Sutter [Fri, 1 Sep 2017 14:08:08 +0000 (16:08 +0200)]
link_gre6: Fix for changing tclass/flowlabel

When trying to change tclass or flowlabel of a GREv6 tunnel which has
the respective value set already, the code accidentally bitwise OR'ed
the old and the new value, leading to unexpected results. Fix this by
clearing the relevant bits of flowinfo variable prior to assigning the
new value.

Fixes: af89576d7a8c4 ("iproute2: GRE over IPv6 tunnel support.")
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoman: add documentation for seg6 l2encap mode
David Lebrun [Mon, 28 Aug 2017 19:26:40 +0000 (20:26 +0100)]
man: add documentation for seg6 l2encap mode

This patch adds documentation for the seg6 L2ENCAP encapsulation mode.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
6 years agoiproute: add support for seg6 l2encap mode
David Lebrun [Mon, 28 Aug 2017 19:26:39 +0000 (20:26 +0100)]
iproute: add support for seg6 l2encap mode

This patch adds support for the L2ENCAP seg6 mode, enabling to encapsulate
L2 frames within SRv6 packets.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
6 years agoman: tc-ife: add default type note
Alexander Aring [Mon, 28 Aug 2017 19:07:38 +0000 (15:07 -0400)]
man: tc-ife: add default type note

This patch updates the tc-ife man page that the default IFE ethertype
will be used if it's not specified.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
6 years agotc: m_ife: report about kernels default type
Alexander Aring [Mon, 28 Aug 2017 19:07:37 +0000 (15:07 -0400)]
tc: m_ife: report about kernels default type

This patch will report about if the ethertype for IFE is not specified
that the default IFE type is used.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
6 years agotc: m_ife: print IEEE ethertype format
Alexander Aring [Mon, 28 Aug 2017 19:07:36 +0000 (15:07 -0400)]
tc: m_ife: print IEEE ethertype format

This patch uses the usually IEEE format to display an ethertype which is
4-digits and every digit in upper case.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
6 years agotc: m_ife: allow ife type to zero
Alexander Aring [Mon, 28 Aug 2017 19:07:35 +0000 (15:07 -0400)]
tc: m_ife: allow ife type to zero

This patch allows to set an ethertype for IFE which is zero. There is no
kernel side validation which forbids a type to zero.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
6 years agoupdate headers from net-next
Stephen Hemminger [Wed, 30 Aug 2017 15:26:43 +0000 (08:26 -0700)]
update headers from net-next

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Wed, 30 Aug 2017 15:24:57 +0000 (08:24 -0700)]
Merge branch 'master' into net-next

6 years agoss: Fix for added diag support check
Phil Sutter [Mon, 28 Aug 2017 17:31:22 +0000 (19:31 +0200)]
ss: Fix for added diag support check

Commit 9f66764e308e9 ("libnetlink: Add test for error code returned from
netlink reply") changed rtnl_dump_filter_l() to return an error in case
NLMSG_DONE would contain one, even if it was ENOENT.

This in turn breaks ss when it tries to dump DCCP sockets on a system
without support for it: The function tcp_show(), which is shared between
TCP and DCCP, will start parsing /proc since inet_show_netlink() returns
an error - yet it parses /proc/net/tcp which doesn't make sense for DCCP
sockets at all.

On my system, a call to 'ss' without further arguments prints the list
of connected TCP sockets twice.

Fix this by introducing a dedicated function dccp_show() which does not
have a fallback to /proc, just like sctp_show(). And since tcp_show()
is no longer "multi-purpose", drop it's socktype parameter.

Fixes: 9f66764e308e9 ("libnetlink: Add test for error code returned from netlink reply")
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agodevlink: header update
Stephen Hemminger [Thu, 24 Aug 2017 22:31:57 +0000 (15:31 -0700)]
devlink: header update

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Thu, 24 Aug 2017 22:30:32 +0000 (15:30 -0700)]
Merge branch 'master' into net-next

6 years agotc: use named initializer for default mqprio options
Stephen Hemminger [Thu, 24 Aug 2017 22:27:35 +0000 (15:27 -0700)]
tc: use named initializer for default mqprio options

Use C99 initializer

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>