Lorenzo Bianconi [Wed, 10 Oct 2018 15:00:58 +0000 (17:00 +0200)]
utils: fix get_rtnl_link_stats_rta stats parsing
iproute2 walks through the list of available tunnels using netlink
protocol in order to get device info instead of reading
them from proc filesystem. However the kernel reports device statistics
using IFLA_INET6_STATS/IFLA_INET6_ICMP6STATS attributes nested in
IFLA_PROTINFO one but iproutes expects these info in
IFLA_STATS64/IFLA_STATS attributes.
The issue can be triggered with the following reproducer:
$ip link add ip6d0 type ip6tnl mode ip6ip6 local 1111::1 remote 2222::1
$ip -6 -d -s tunnel show ip6d0
ip6d0: ipv6/ipv6 remote 2222::1 local 1111::1 encaplimit 4 hoplimit 64
tclass 0x00 flowlabel 0x00000 (flowinfo 0x00000000)
Dump terminated
Fix the issue introducing IFLA_INET6_STATS attribute parsing
Fixes: 3e953938717f ("iptunnel/ip6tunnel: Use netlink to walk through
tunnels list")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Lorenzo Bianconi [Wed, 10 Oct 2018 15:00:57 +0000 (17:00 +0200)]
uapi: add snmp header file
Introduce snmp header file. It will be used in subsequent patch in
order to parse device statistics reported in
IFLA_INET6_STATS/IFLA_INET6_ICMP6STATS netlink attributes
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Sabrina Dubroca [Fri, 12 Oct 2018 15:34:12 +0000 (17:34 +0200)]
macsec: fix off-by-one when parsing attributes
I seem to have had a massive brainfart with uses of
parse_rtattr_nested(). The rtattr* array must have MAX+1 elements, and
the call to parse_rtattr_nested must have MAX as its bound. Let's fix
those.
Sabrina Dubroca [Fri, 12 Oct 2018 15:34:32 +0000 (17:34 +0200)]
json: make 0xhex handle u64
Stephen converted macsec's sci to use 0xhex, but 0xhex handles
unsigned int's, not 64 bits ints. Thus, the output of the "ip macsec
show" command is mangled, with half of the SCI replaced with 0s:
# ip macsec show
11: macsec0: [...]
cipher suite: GCM-AES-128, using ICV length 16
TXSC: 0000000001560001 on SA 0
# ip -d link show macsec0
11: macsec0@ens3: [...]
link/ether 52:54:00:12:01:56 brd ff:ff:ff:ff:ff:ff promiscuity 0
macsec sci 5254001201560001 [...]
where TXSC and sci should match.
Fixes: c0b904de6211 ("macsec: support JSON") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Tue, 9 Oct 2018 12:44:08 +0000 (14:44 +0200)]
bridge: fdb: Fix for missing keywords in non-JSON output
While migrating to JSON print library, some keywords were dropped from
standard output by accident. Add them back to unbreak output parsers.
Fixes: c7c1a1ef51aea ("bridge: colorize output and use JSON print library") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Vlad Buslov [Mon, 8 Oct 2018 20:52:26 +0000 (23:52 +0300)]
libnetlink: fix use-after-free of message buf
In __rtnl_talk_iov() main loop, err is a pointer to memory in dynamically
allocated 'buf' that is used to store netlink messages. If netlink message
is an error message, buf is deallocated before returning with error code.
However, on return err->error code is checked one more time to generate
return value, after memory which err points to has already been
freed. Save error code in temporary variable and use the variable to
generate return value.
Fixes: c60389e4f9ea ("libnetlink: fix leak and using unused memory on error") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Fixes: c60389e4f9ea ("libnetlink: fix leak and using unused memory on error") Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Lorenzo Bianconi [Fri, 21 Sep 2018 13:34:25 +0000 (15:34 +0200)]
iplink_vxlan: take into account preferred_family creating vxlan device
Take into account the configured preferred_family if neither saddr or
daddr are provided since otherwise vxlan kernel module will use IPv4 as
default remote inet family neglecting the one provided by userspace.
This behaviour was originally in commit 97d564b90ccb ("vxlan: use
preferred address family when neither group or remote is specified").
The issue can be triggered with the following reproducer:
$ip -6 link add vxlan1 type vxlan id 42 dev enp0s2 \
proxy nolearning l2miss l3miss
$bridge fdb add 46:47:1f:a7:1c:25 dev vxlan1 dst 2000::2
RTNETLINK answers: Address family not supported by protocol
Fixes: 1e9b8072de2c ("iplink_vxlan: Get rid of inet_get_addr()") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Hangbin Liu [Tue, 18 Sep 2018 09:48:40 +0000 (17:48 +0800)]
iplink: fix incorrect any address handling for ip tunnels
After commit d42c7891d26e4 ("utils: Do not reset family for default, any,
all addresses"), when call get_addr() for any/all addresses, we will set
addr->flags to ADDRTYPE_INET_UNSPEC if family is AF_INET/AF_INET6, which
makes is_addrtype_inet() checking passed and assigns incorrect address
to kernel. The ip link cmd will return error like:
]# ip link add ipip1 type ipip local any remote 1.1.1.1
RTNETLINK answers: Numerical result out of range
Fix it by using is_addrtype_inet_not_unspec() to avoid unspec addresses.
geneve, vxlan are not affected as they use AF_UNSPEC family when call
get_addr()
Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: d42c7891d26e4 ("utils: Do not reset family for default, any, all addresses") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Petr Vorel [Wed, 19 Sep 2018 23:36:22 +0000 (01:36 +0200)]
testsuite: Fix missing generate_nlmsg
Commit ad23e152 caused generate_nlmsg to be always missing:
$ make alltests
make: ./tools/generate_nlmsg: Command not found
Create testclean: to remove only results directory.
Fixes: ad23e152 testsuite: remove all temp files and implement make clean Signed-off-by: Petr Vorel <petr.vorel@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Leon Romanovsky [Sun, 16 Sep 2018 17:28:13 +0000 (20:28 +0300)]
rdma: Fix representation of PortInfo CapabilityMask
The port capability mask represents IBTA PortInfo specification,
but as it is written in description of kernel commit 2f944c0fbf58
("RDMA: Fix storage of PortInfo CapabilityMask in the kernel"),
the bit 26 was mistakenly overwritten.
The rdmatool followed it too and mislead users by presenting wrong
value. Since it never showed proper value, we update the whole
port_cap_mask to comply with IBTA and show real HW values.
Fixes: da990ab40a92 ("rdma: Add link object") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
libnetlink: fix leak and using unused memory on error
If an error happens in multi-segment message (tc only)
then report the error and stop processing further responses.
This also fixes refering to the buffer after free.
The sequence check is not necessary here because the
response message has already been validated to be in
the window of the sequence number of the iov.
Hangbin Liu [Wed, 12 Sep 2018 01:39:44 +0000 (09:39 +0800)]
bridge/mdb: fix missing new line when show bridge mdb
The bridge mdb show is broken on current iproute2. e.g.
]# bridge mdb show
34: br0 veth0_br 224.1.1.2 temp 34: br0 veth0_br 224.1.1.1 temp
After fix:
]# bridge mdb show
34: br0 veth0_br 224.1.1.2 temp
34: br0 veth0_br 224.1.1.1 temp
v2: Use json print lib as Stephen suggested.
v3: No need to use is_json_context() as print_string() could handle both cases.
v4: use new function print_nl() to print new line in non-json mode.
Reported-by: Ying Xu <yinxu@redhat.com> Fixes: c7c1a1ef51aea ("bridge: colorize output and use JSON print library") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When the GSO splitting was turned into dual split-gso/no-split-gso options,
the printing of the latter was left out. Add that, so output is consistent
with the options passed.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Thu, 6 Sep 2018 13:31:51 +0000 (15:31 +0200)]
ip-route: Fix segfault with many nexthops
It was possible to crash ip-route by adding an IPv6 route with 37
nexthop statements. A simple reproducer is:
| for i in `seq 37`; do
| nhs="nexthop via 1111::$i "$nhs
| done
| ip -6 route add 3333::/64 $nhs
The related code was broken in multiple ways:
* parse_one_nh() assumed that rta points to 4kB of storage but caller
provided just 1kB. Fixed by passing 'len' parameter with the correct
value.
* Error checking of rta_addattr*() calls in parse_one_nh() and called
functions was completely absent, so with above fix in place output
flood would occur due to parser looping forever.
While being at it, increase message buffer sizes to 4k. This allows for
at most 144 nexthops.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Mahesh Bandewar [Thu, 23 Aug 2018 01:01:37 +0000 (18:01 -0700)]
iproute: make clang happy
These are primarily fixes for "string is not string literal" warnings
/ errors (with -Werror -Wformat-nonliteral). This should be a no-op
change. I had to replace couple of print helper functions with the
code they call as it was becoming harder to eliminate these warnings,
however these helpers were used only at couple of places, so no
major change as such.
Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Mahesh Bandewar [Thu, 23 Aug 2018 01:01:34 +0000 (18:01 -0700)]
ipmaddr: use preferred_family when given
When creating socket() AF_INET is used irrespective of the family
that is given at the command-line (with -4, -6, or -0). This change
will open the socket with the preferred family.
Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Cong Wang [Wed, 29 Aug 2018 17:09:27 +0000 (10:09 -0700)]
ss: add UNIX_DIAG_VFS and UNIX_DIAG_ICONS for unix sockets
UNIX_DIAG_VFS and UNIX_DIAG_ICONS are never used by ss,
make them available in ss -e output.
Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stefan Bader [Tue, 28 Aug 2018 14:27:29 +0000 (16:27 +0200)]
iprule: Fix destination prefix output
When adding support for JSON output the new code for printing
the destination prefix adds a stray blank character before
the bitmask. This causes some user-space parsing to fail.
Current output:
...: from x.x.x.x/l to y.y.y.y /l
Previous output:
...: from x.x.x.x/l to y.y.y.y/l
Fixes: 0dd4ccc5 "iprule: add json support" Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
q_cake: Add description of the tc filter override mechanism to man page
Since CAKE now has three different settings that can be overridden by tc
filters (priority and host and flow hashes), documenting how they work is
probably a good idea.
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Luca Boccassi [Wed, 22 Aug 2018 18:09:02 +0000 (19:09 +0100)]
testsuite: let make compile build the netlink helper
The generate_nlmsg binary is required but make -C testsuite compile
does not build it. Add the necessary includes and C*FLAGS to the tools
Makefile and have the compile target build it.
Signed-off-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Luca Boccassi [Wed, 22 Aug 2018 18:09:01 +0000 (19:09 +0100)]
testsuite: remove all temp files and implement make clean
Some generated test files were not removed, including one executable in
the testsuite/tools directory.
Ensure make clean from the top level directory works for the testsuite
subdirs too, and that all the files are removed.
Signed-off-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stefan Bader [Wed, 22 Aug 2018 08:31:38 +0000 (10:31 +0200)]
testsuite: Handle large number of kernel options
Once there are more than a certain number of kernel config options
set (this happened for us with kernel 4.17), the method of passing
those as command line arguments exceeds the maximum number of
arguments the shell supports. This causes the whole testsuite to
fail.
Instead, create a temporary file and modify its contents so that
the config option variables are exported. Then this file can be
sourced in before running the tests.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Fri, 17 Aug 2018 16:38:46 +0000 (18:38 +0200)]
lib: Make check_enable_color() return boolean
As suggested, turn return code into true/false although it's not checked
anywhere yet.
Fixes: 4d82962cccc6a ("Merge common code for conditionally colored output") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Tue, 14 Aug 2018 12:18:07 +0000 (14:18 +0200)]
testsuite: Prepare for ss tests
This merges the shared bits from ts_tc() and ts_ip() into a common
function for being wrapped by the first ones and adds a third ts_ss()
for testing ss commands.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Tue, 14 Aug 2018 12:18:06 +0000 (14:18 +0200)]
ss: Review ssfilter
The original problem was ssfilter rejecting single expressions if
enclosed in braces, such as:
| sport = 22 or ( dport = 22 )
This is fixed by allowing 'expr' to be an 'exprlist' enclosed in braces.
The no longer required recursion in 'exprlist' being an 'exprlist'
enclosed in braces is dropped.
In addition to that, a few other things are changed:
* Remove pointless 'null' prefix in 'appled' before 'exprlist'.
* For simple equals matches, '=' operator was required for ports but not
allowed for hosts. Make this consistent by making '=' operator
optional in both cases.
Reported-by: Samuel Mannehed <samuel@cendio.se> Fixes: b2038cc0b2403 ("ssfilter: Eliminate shift/reduce conflicts") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Wed, 15 Aug 2018 09:18:26 +0000 (11:18 +0200)]
man: ip-route: Clarify referenced versions are Linux ones
Versioning scheme of Linux and iproute2 is similar, therefore the
referenced kernel versions are likely to confuse readers. Clarify this
by prefixing each kernel version by 'Linux' prefix.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Wed, 15 Aug 2018 16:21:25 +0000 (18:21 +0200)]
bridge: Fix check for colored output
There is no point in calling enable_color() conditionally if it was
already called for each time '-color' flag was parsed. Align the
algorithm with that in ip and tc by actually making use of 'color'
variable.
Fixes: e9625d6aead11 ("Merge branch 'iproute2-master' into iproute2-next") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David Ahern <dsahern@gmail.com>
sch_skbprio is a qdisc that prioritizes packets according to their skb->priority
field. Under congestion, it drops already-enqueued lower priority packets to
make space available for higher priority packets. Skbprio was conceived as a
solution for denial-of-service defenses that need to route packets with
different priorities as a means to overcome DoS attacks.
Signed-off-by: Nishanth Devarajan <ndev2021@gmail.com> Reviewed-by: Michel Machado <michel@digirati.com.br> Signed-off-by: David Ahern <dsahern@gmail.com>
Tobias Klauser [Wed, 8 Aug 2018 12:33:40 +0000 (14:33 +0200)]
tc: bpf: update list of archs with eBPF support in manpage
Update the list of architectures supporting eBPF JIT as of Linux 4.18.
Also mention the Linux version where support for a particular
architecture was introduced. Finally, reformat the list of architectures
as a bullet list in order to make it more readable.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Fix hex output for both the ip_attr and tcp_flags print functions.
Sample usage:
$ $TC qdisc add dev lo ingress
$ $TC filter add dev lo parent ffff: prio 3 proto ip flower ip_tos 0x8/32
$ $TC fitler add dev lo parent ffff: prio 5 proto ip flower ip_proto tcp \
tcp_flags 0x909/f00
Matteo Croce [Fri, 3 Aug 2018 17:49:33 +0000 (19:49 +0200)]
ip link: don't stop batch processing
When 'ip link show dev DEVICE' is processed in a batch mode, ip exits
and stop processing further commands.
This because ipaddr_list_flush_or_save() calls exit() to avoid printing
the link information twice.
Replace the exit with a classic goto out instruction.
Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
In a LXC container we're unable to umount the sysfs instance, nor mount
a read-write one. We still are able to create a new read-only instance.
Nevertheless, it still makes sense to attempt the umount() even though
the sysfs is mounted read-only. Otherwise we may end up attempting to
mount a sysfs with the same flags as is already mounted, resulting in
an EBUSY error (meaning "Already mounted").
Perhaps this is not a very likely scenario in real world, but we hit
it in NetworkManager test suite and makes netns_switch() somewhat more
robust. It also fixes the case, when /sys wasn't mounted at all.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>