git.proxmox.com Git - mirror_iproute2.git/log
6 years agoss: allow AF_FAMILY constants >32
Stefan Hajnoczi [Fri, 6 Oct 2017 15:48:39 +0000 (11:48 -0400)]
ss: allow AF_FAMILY constants >32

Linux has more than 32 address families defined in <bits/socket.h>.  Use
a 64-bit type so all of them can be represented in the filter->families
bitmask.

It's easy to introduce bugs when using (1 << AF_FAMILY) because the
value is 32-bit.  This can produce incorrect results from bitmask
operations so introduce the FAMILY_MASK() macro to eliminate these bugs.
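
A minimal sketch of the idea, not the code from this patch: the macro name
comes from the text above, but its definition and the struct and helper names
are assumptions made for illustration. The point is that the constant is
widened to 64 bits before the shift, so family numbers of 32 and above still
land on a distinct bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>

/* Assumed definition: widen to 64 bits *before* shifting, so that
 * family numbers of 32 and above still map to a distinct bit. */
#define FAMILY_MASK(family) ((uint64_t)1 << (family))

/* Hypothetical stand-in for the filter->families bitmask. */
struct family_filter {
	uint64_t families;
};

static void filter_af_set(struct family_filter *f, int af)
{
	assert(af >= 0 && af < 64);
	f->families |= FAMILY_MASK(af);
}

static bool filter_af_get(const struct family_filter *f, int af)
{
	assert(af >= 0 && af < 64);
	return (f->families & FAMILY_MASK(af)) != 0;
}
```

With a plain (1 << af) the shift is done on an int, which is undefined for
af >= 32 on platforms where int is 32 bits wide.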

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6 years agouapi: add include linux/vm_sockets_diag.h
Stephen Hemminger [Wed, 11 Oct 2017 17:49:25 +0000 (10:49 -0700)]
uapi: add include linux/vm_sockets_diag.h

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Wed, 11 Oct 2017 17:47:55 +0000 (10:47 -0700)]
Merge branch 'master' into net-next

6 years agordma: move headers to uapi
Stephen Hemminger [Wed, 11 Oct 2017 17:47:28 +0000 (10:47 -0700)]
rdma: move headers to uapi

And update with version from upstream.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoupdate uapi headers from 4.14-rc4 net-next
Stephen Hemminger [Wed, 11 Oct 2017 17:43:38 +0000 (10:43 -0700)]
update uapi headers from 4.14-rc4 net-next

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Wed, 11 Oct 2017 17:43:13 +0000 (10:43 -0700)]
Merge branch 'master' into net-next

6 years agoiproute: build more easily on Android
Lorenzo Colitti [Mon, 2 Oct 2017 17:03:37 +0000 (02:03 +0900)]
iproute: build more easily on Android

iproute2 contains a bunch of kernel headers, including uapi ones.
Android's libc uses uapi headers almost directly, and uses a
script to fix kernel types that don't match what userspace
expects.

For example: https://issuetracker.google.com/36987220 reports
that our struct ip_mreq_source contains "__be32 imr_multiaddr"
rather than "struct in_addr imr_multiaddr". The script addresses
this by replacing the uapi struct definition with a #include
<bits/ip_mreq.h> which contains the traditional userspace
definition.

Unfortunately, when we compile iproute2, this definition
conflicts with the one in iproute2's linux/in.h.

Historically we've just solved this problem by running "git rm"
on all the iproute2 include/linux headers that break Android's
libc.  However, deleting the files in this way makes it harder to
keep up with upstream, because every upstream change to
an include file causes a merge conflict with the delete.

This patch fixes the problem by moving the iproute2 linux headers
from include/linux to include/uapi/linux.

Tested: compiles on ubuntu trusty (glibc)

Signed-off-by: Elliott Hughes <enh@google.com>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
6 years agotipc: don't need custom CFLAGS
Stephen Hemminger [Wed, 11 Oct 2017 17:35:00 +0000 (10:35 -0700)]
tipc: don't need custom CFLAGS

Since libmnl CFLAGS are now handled by config.mk

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Mon, 2 Oct 2017 15:04:13 +0000 (08:04 -0700)]
Merge branch 'master' into net-next

6 years agoupdate headers from net-next rc
Stephen Hemminger [Mon, 2 Oct 2017 15:03:45 +0000 (08:03 -0700)]
update headers from net-next rc

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoCheck user supplied interface name lengths
Phil Sutter [Mon, 2 Oct 2017 11:46:37 +0000 (13:46 +0200)]
Check user supplied interface name lengths

The original problem was that something like:

| strncpy(ifr.ifr_name, *argv, IFNAMSIZ);

might leave ifr.ifr_name unterminated if length of *argv exceeds
IFNAMSIZ. In order to fix this, I thought about replacing all those
cases with (equivalent) calls to snprintf() or even introducing
strlcpy(). But as Ulrich Drepper correctly pointed out when rejecting
the latter from being added to glibc, truncating a string without
notifying the user is not to be considered good practice. So let's
exercise what he suggested and reject empty, overlong or otherwise
invalid interface names right from the start - this way calls to
strncpy() like shown above become safe and the user has a chance to
reconsider what he was trying to do.

Note that this doesn't add calls to check_ifname() to all places where
a user-supplied interface name is parsed. In many cases, the interface
must exist already and is therefore looked up using ll_name_to_index(),
so if_nametoindex() will perform the necessary checks already.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agotc: flower: No need to cache indev arg
Phil Sutter [Mon, 2 Oct 2017 11:46:36 +0000 (13:46 +0200)]
tc: flower: No need to cache indev arg

Since addattrstrz() will copy the provided string into the attribute
payload, there is no need to cache the data.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoip{6, }tunnel: Avoid copying user-supplied interface name around
Phil Sutter [Mon, 2 Oct 2017 11:46:35 +0000 (13:46 +0200)]
ip{6, }tunnel: Avoid copying user-supplied interface name around

In both files' parse_args() functions as well as in iptunnel's do_prl()
and do_6rd() functions, a user-supplied 'dev' parameter is uselessly
copied into a temporary buffer before passing it to ll_name_to_index()
or copying into a struct ifreq.  Avoid this by just caching the argv
pointer value until the later lookup/strcpy.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoip xfrm: use correct key length for netlink message
Michal Kubecek [Fri, 29 Sep 2017 11:41:05 +0000 (13:41 +0200)]
ip xfrm: use correct key length for netlink message

When SA is added manually using "ip xfrm state add", xfrm_state_modify()
uses alg_key_len field of struct xfrm_algo for the length of key passed to
kernel in the netlink message. However alg_key_len is bit length of the key
while we need byte length here. This is usually harmless as kernel ignores
the excess data but when the bit length of the key exceeds 512
(XFRM_ALGO_KEY_BUF_SIZE), it can result in buffer overflow.

We can simply divide by 8 here as the only place setting alg_key_len is in
xfrm_algo_parse() where it is always set to a multiple of 8 (and there are
already multiple places using "algo->alg_key_len / 8").
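
To illustrate the unit mismatch, a small sketch with a hypothetical helper
name: a 256-bit key occupies 32 bytes, so using the bit count as a byte count
asks for 224 bytes too many.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the number of key bytes to put into the netlink
 * message for a key whose length is given in bits. */
static size_t key_bytes_from_bits(unsigned int key_len_bits)
{
	/* The parser is described as only ever producing multiples of 8. */
	assert(key_len_bits % 8 == 0);
	return key_len_bits / 8;
}
```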

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
6 years agotc: fix ipv6 filter selector attribute for some prefix lengths
Yulia Kartseva [Sun, 1 Oct 2017 03:18:40 +0000 (20:18 -0700)]
tc: fix ipv6 filter selector attribute for some prefix lengths

The TCA_U32_SEL attribute is packed incorrectly if (prefixLen & 0x1f) == 0x1f.
These are the /31, /63, /95 and /127 prefix lengths.

Example:
ip6 dst face:b00f::/31
filter parent b: protocol ipv6 pref 2307 u32
filter parent b: protocol ipv6 pref 2307 u32 fh 800: ht divisor 1
filter parent b: protocol ipv6 pref 2307 u32 fh 800::800 order 2048
key ht 800 bkt 0
  match faceb00f/ffffffff at 24

v2: previous patch was made with a wrong repo

Signed-off-by: Yulia Kartseva <hex@fb.com>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 29 Sep 2017 19:03:16 +0000 (12:03 -0700)]
Merge branch 'master' into net-next

6 years agoip-route: Fix for listing routes with RTAX_LOCK attribute
Phil Sutter [Thu, 28 Sep 2017 17:33:56 +0000 (19:33 +0200)]
ip-route: Fix for listing routes with RTAX_LOCK attribute

This fixes a corner-case for routes with a certain metric locked to
zero:

| ip route add 192.168.7.0/24 dev eth0 window 0
| ip route add 192.168.7.0/24 dev eth0 window lock 0

Since the kernel doesn't dump the attribute if it is zero, both routes
added above would appear as if they were equal although they are not.

Fix this by taking mxlock value for the given metric into account before
skipping it if it is not present.

Reported-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 29 Sep 2017 17:51:25 +0000 (10:51 -0700)]
Merge branch 'master' into net-next

6 years agodoc: drop old ip command documentation
Stephen Hemminger [Fri, 29 Sep 2017 17:50:13 +0000 (10:50 -0700)]
doc: drop old ip command documentation

The old IP cross reference manual was very out of date, barely updated
since 1999.  The correct documentation is in the man pages.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agolib: json_print: rework 'new_json_obj' drop FILE* argument
Julien Fortin [Tue, 26 Sep 2017 23:45:39 +0000 (16:45 -0700)]
lib: json_print: rework 'new_json_obj' drop FILE* argument

As Stephen Hemminger mentioned on the last submission, the new_json_obj
function is always called with fp == stdout, so right now there is no
need for this extra argument.

The background for the rework is the following:
The ip monitor didn't call `new_json_obj` (even in a non-JSON context),
so the static FILE* _fp variable wasn't initialized, thus raising a
SIGSEGV in ipaddress.c. This patch should fix this issue for good; new
paths won't have to call `new_json_obj`.

How to reproduce:

$ ip -t mon label link
(gdb) bt
.#0  _IO_vfprintf_internal (s=s@entry=0x0, format=format@entry=0x45460d "%d: ", ap=ap@entry=0x7fffffff7f18) at vfprintf.c:1278
.#1  0x0000000000451310 in color_fprintf (fp=0x0, attr=<optimized out>, fmt=0x45460d "%d: ") at color.c:108
.#2  0x000000000044a856 in print_color_int (t=t@entry=PRINT_ANY, color=color@entry=4294967295, key=key@entry=0x4545fc "ifindex",
    fmt=fmt@entry=0x45460d "%d: ", value=<optimized out>) at ip_print.c:132
.#3  0x000000000040ccd2 in print_int (value=<optimized out>, fmt=0x45460d "%d: ", key=0x4545fc "ifindex", t=PRINT_ANY) at ip_common.h:189
.#4  print_linkinfo (who=<optimized out>, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipaddress.c:1107
.#5  0x0000000000422e13 in accept_msg (who=0x7fffffff8320, ctrl=0x7fffffff8310, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipmonitor.c:89
.#6  0x000000000044c58f in rtnl_listen (rtnl=0x672160 <rth>, handler=handler@entry=0x422c70 <accept_msg>, jarg=0x7ffff77a82a0 <_IO_2_1_stdout_>)
    at libnetlink.c:761
.#7  0x00000000004233db in do_ipmonitor (argc=<optimized out>, argv=0x7fffffffe5a0) at ipmonitor.c:310
.#8  0x0000000000408f74 in do_cmd (argv0=0x7fffffffe7f5 "mon", argc=3, argv=0x7fffffffe588) at ip.c:116
.#9  0x0000000000408a94 in main (argc=4, argv=0x7fffffffe580) at ip.c:311

Fixes: 6377572f ("ip: ip_print: add new API to print JSON or regular format output")
Reported-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
6 years agodoc: remove outdated IPv6 flow label document
Stephen Hemminger [Fri, 29 Sep 2017 17:06:50 +0000 (10:06 -0700)]
doc: remove outdated IPv6 flow label document

Not updated since Linux 2.2

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodoc: remove outdated tc-filters documentation
Stephen Hemminger [Fri, 29 Sep 2017 17:05:09 +0000 (10:05 -0700)]
doc: remove outdated tc-filters documentation

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoignore generated Config file
Stephen Hemminger [Fri, 29 Sep 2017 17:02:31 +0000 (10:02 -0700)]
ignore generated Config file

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodoc: remove outdated nstat/rtstat documentation
Stephen Hemminger [Fri, 29 Sep 2017 17:01:15 +0000 (10:01 -0700)]
doc: remove outdated nstat/rtstat documentation

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodoc: remove outdated arpd documentation
Stephen Hemminger [Fri, 29 Sep 2017 17:00:12 +0000 (10:00 -0700)]
doc: remove outdated arpd documentation

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodoc: remove outdated ss documentation
Stephen Hemminger [Fri, 29 Sep 2017 16:58:39 +0000 (09:58 -0700)]
doc: remove outdated ss documentation

The current version is well documented in the man page.
The LaTeX documentation is very old and was never updated.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodoc: remove obsolete ip-tunnels documentation
Stephen Hemminger [Fri, 29 Sep 2017 16:57:19 +0000 (09:57 -0700)]
doc: remove obsolete ip-tunnels documentation

This file has not been updated since conversion to git
and is really old and outdated.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 22 Sep 2017 17:10:01 +0000 (10:10 -0700)]
Merge branch 'master' into net-next

6 years agoman: fix documentation for range of route table ID
Thomas Haller [Fri, 22 Sep 2017 11:28:54 +0000 (13:28 +0200)]
man: fix documentation for range of route table ID

Signed-off-by: Thomas Haller <thaller@redhat.com>
6 years agobpf: properly output json for xdp
Daniel Borkmann [Thu, 21 Sep 2017 08:42:29 +0000 (10:42 +0200)]
bpf: properly output json for xdp

After merging net-next branch into master, Stephen asked
to fix up json dump for XDP. Thus, rework the json dump a
bit, such that 'ip -json l' looks as below.

  [{
        "ifindex": 1,
        "ifname": "lo",
        "flags": ["LOOPBACK","UP","LOWER_UP"],
        "mtu": 65536,
        "xdp": {
            "mode": 2,
            "prog": {
                "id": 5,
                "tag": "e1e9d0ec0f55d638",
                "jited": 1
            }
        },
        "qdisc": "noqueue",
        "operstate": "UNKNOWN",
        "linkmode": "DEFAULT",
        "group": "default",
        "txqlen": 1000,
        "link_type": "loopback",
        "address": "00:00:00:00:00:00",
        "broadcast": "00:00:00:00:00:00"
    },[...]
  ]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agojson: move json printer to common library
Daniel Borkmann [Thu, 21 Sep 2017 08:42:28 +0000 (10:42 +0200)]
json: move json printer to common library

Move the json printer which is based on json writer into the
iproute2 library, so it can be used by library code and tools
other than ip. Should probably have been done from the beginning
like that given json writer is in the library already anyway.
No functional changes.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Julien Fortin <julien@cumulusnetworks.com>
6 years agotc: flower remove unused variable
Stephen Hemminger [Thu, 21 Sep 2017 01:08:16 +0000 (18:08 -0700)]
tc: flower remove unused variable

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotc: flower: support for matching MPLS labels
Benjamin LaHaise [Tue, 12 Sep 2017 14:06:15 +0000 (16:06 +0200)]
tc: flower: support for matching MPLS labels

This patch adds support to the iproute2 tc filter command for matching MPLS
labels in the flower classifier.  The ability to match the Time To Live,
Bottom Of Stack, Traffic Class and Label fields is added as options to
the flower filter.

e.g.:
  tc filter add dev eth0 protocol 0x8847 parent ffff: \
    flower mpls_label 1 mpls_tc 2 mpls_ttl 3 mpls_bos 0 \
    action drop

Signed-off-by: Benjamin LaHaise <benjamin.lahaise@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
6 years agoip: ipaddress: fix missing space after prefixlen
Julien Fortin [Wed, 20 Sep 2017 20:26:51 +0000 (13:26 -0700)]
ip: ipaddress: fix missing space after prefixlen

Fixes: d0e720111aad2 ("ip: ipaddress.c: add support for json output")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
6 years agotc: fix typo in tc-tcindex man page
Davide Caratti [Thu, 14 Sep 2017 15:00:46 +0000 (17:00 +0200)]
tc: fix typo in tc-tcindex man page

fix mis-typed 'pass_on' keyword.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
6 years agoBPF: update headers from 4.14-rc1
Stephen Hemminger [Thu, 21 Sep 2017 01:00:36 +0000 (18:00 -0700)]
BPF: update headers from 4.14-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotc: fq: support low_rate_threshold attribute
Eric Dumazet [Fri, 8 Sep 2017 21:12:59 +0000 (14:12 -0700)]
tc: fq: support low_rate_threshold attribute

TCA_FQ_LOW_RATE_THRESHOLD sch_fq attribute was added in linux-4.9

Tested:

lpaa5:/tmp# tc -qd add dev eth1 root fq
lpaa5:/tmp# tc -s qd sh dev eth1
qdisc fq 8003: root refcnt 5 limit 10000p flow_limit 1000p buckets 4096 \
 orphan_mask 4095 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 quantum 3648 \
 initial_quantum 18240 low_rate_threshold 550Kbit refill_delay 40.0ms
 Sent 62139 bytes 395 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  116 flows (114 inactive, 0 throttled)
  1 gc, 0 highprio, 0 throttled

lpaa5:/tmp# ./netperf -H lpaa6 -t TCP_RR -l10 -- -q 500000 -r 300,300 -o P99_LATENCY
99th Percentile Latency Microseconds
7081

lpaa5:/tmp# tc qd replace dev eth1 root fq low_rate_threshold 10Mbit
lpaa5:/tmp# ./netperf -H lpaa6 -t TCP_RR -l10 -- -q 500000 -r 300,300 -o P99_LATENCY
99th Percentile Latency Microseconds
858

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
6 years agoipaddress: Fix segfault in 'addr showdump'
Phil Sutter [Tue, 12 Sep 2017 14:58:12 +0000 (16:58 +0200)]
ipaddress: Fix segfault in 'addr showdump'

Obviously, the 'addr showdump' feature wasn't adjusted to json output
support. As a consequence, calls to print_string() in print_addrinfo()
tried to dereference a NULL FILE pointer.

Fixes: d0e720111aad2 ("ip: ipaddress.c: add support for json output")
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agodevlink: Add support for protocol IPv4/IPv6/Ethernet special formats
Arkadi Sharshevsky [Thu, 7 Sep 2017 14:26:43 +0000 (17:26 +0300)]
devlink: Add support for protocol IPv4/IPv6/Ethernet special formats

Add support for protocol IPv4/IPv6/Ethernet special formats.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agodevlink: Add support for special format protocol headers
Arkadi Sharshevsky [Thu, 7 Sep 2017 14:26:41 +0000 (17:26 +0300)]
devlink: Add support for special format protocol headers

In case of a global header (protocol header), the header:field ids are used
to look up a special format printer. If no such printer exists, fall back
to plain value printing.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agodevlink: Make match/action parsing more flexible
Arkadi Sharshevsky [Thu, 7 Sep 2017 14:26:40 +0000 (17:26 +0300)]
devlink: Make match/action parsing more flexible

This patch decouples the match/action parsing from printing. This is
done as a preparation for adding the ability to print global header
values, for example an IPv4 address, which requires special formatting.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agoutils: strlcpy() and strlcat() don't clobber dst
Phil Sutter [Wed, 6 Sep 2017 16:51:42 +0000 (18:51 +0200)]
utils: strlcpy() and strlcat() don't clobber dst

As David Laight correctly pointed out, the first version of strlcpy()
modified dst buffer behind the string copied into it. Fix this by
writing NUL to the byte immediately following src string instead of to
the last byte in dst. Doing so also allows reducing overhead by using
memcpy().

Improve strlcat() by avoiding the call to strlcpy() if dst string is
already full, not just as sanity check.
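
A sketch of what the description above amounts to. The function names are
prefixed with sketch_ to make clear that this is an illustration written from
the commit text, not the code in lib/utils.c, and to avoid clashing with a
libc that already provides strlcpy()/strlcat().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity `size`), always NUL-terminating when
 * size > 0. Returns strlen(src) so callers can detect truncation. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t srclen;

	assert(dst != NULL && src != NULL);
	srclen = strlen(src);

	if (size > 0) {
		size_t n = srclen < size - 1 ? srclen : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';	/* right after the copied bytes, not at dst[size - 1] */
	}
	return srclen;
}

/* Append src to the NUL-terminated string in dst (capacity `size`). */
static size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
	size_t dstlen = strlen(dst);

	/* dst already fills the buffer: nothing can be appended. */
	if (dstlen >= size)
		return dstlen + strlen(src);

	return dstlen + sketch_strlcpy(dst + dstlen, src, size - dstlen);
}
```

In both functions a return value greater than or equal to size tells the
caller that the result was truncated.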

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoMerge branch 'net-next'
Stephen Hemminger [Tue, 5 Sep 2017 16:48:36 +0000 (09:48 -0700)]
Merge branch 'net-next'

6 years agov4.13.0 v4.13.0
Stephen Hemminger [Tue, 5 Sep 2017 16:39:32 +0000 (09:39 -0700)]
v4.13.0

6 years agoupdate headers from 4.14 merge
Stephen Hemminger [Tue, 5 Sep 2017 16:36:54 +0000 (09:36 -0700)]
update headers from 4.14 merge

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Tue, 5 Sep 2017 16:33:29 +0000 (09:33 -0700)]
Merge branch 'master' into net-next

6 years agobpf: consolidate dumps to use bpf_dump_prog_info
Daniel Borkmann [Tue, 5 Sep 2017 00:24:32 +0000 (02:24 +0200)]
bpf: consolidate dumps to use bpf_dump_prog_info

Consolidate dump of prog info to use bpf_dump_prog_info() when possible.
Moving forward, we want to have a consistent output for BPF progs when
being dumped. E.g. in cls/act case we used to dump tag as a separate
netlink attribute before we had BPF_OBJ_GET_INFO_BY_FD bpf(2) command.

Move dumping tag into bpf_dump_prog_info() as well, and only dump the
netlink attribute for older kernels. Also, reuse bpf_dump_prog_info()
for XDP case, so we can dump tag and whether program was jited, which
we currently don't show.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agobpf: minor cleanups for bpf_trace_pipe
Daniel Borkmann [Tue, 5 Sep 2017 00:24:31 +0000 (02:24 +0200)]
bpf: minor cleanups for bpf_trace_pipe

Just minor nits, e.g. no need to fflush() and instead of returning
right away, just break and close the fd.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agotc actions: store and dump correct length of user cookies
Simon Horman [Tue, 5 Sep 2017 11:06:24 +0000 (13:06 +0200)]
tc actions: store and dump correct length of user cookies

Correct two errors which cancel each other out:
* Do not send twice the actual length of the cookie provided by the user to the kernel
* Do not dump half the length of the cookie provided by the kernel

As the cookie is now stored in the kernel at its correct length rather
than double that length, cookies of up to the maximum size of 16 bytes
may now be stored rather than a maximum of half that length.

Output of dump is the same before and after this change,
but the data stored in the kernel is now exactly the cookie
rather than the cookie followed by as many trailing zeros.

Before:
 # tc filter add dev eth0 protocol ip parent ffff: \
       flower ip_proto udp action drop \
       cookie 0123456789abcdef0123456789abcdef
 RTNETLINK answers: Invalid argument

After:
 # tc filter add dev eth0 protocol ip parent ffff: \
       flower ip_proto udp action drop \
       cookie 0123456789abcdef0123456789abcdef
 # tc filter show dev eth0 ingress
   eth_type ipv4
   ip_proto udp
   not_in_hw
 action order 1: gact action drop
  random type none pass val 0
  index 1 ref 1 bind 1 installed 1 sec used 1 sec
 Action statistics:
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 cookie len 16 0123456789abcdef0123456789abcdef

Fixes: fd8b3d2c1b9b ("actions: Add support for user cookies")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
6 years agolib/bpf: Fix bytecode-file parsing
Phil Sutter [Tue, 29 Aug 2017 15:09:45 +0000 (17:09 +0200)]
lib/bpf: Fix bytecode-file parsing

The signedness of char type is implementation dependent, and there are
architectures on which it is unsigned by default. In that case, the
check whether fgetc() returned EOF failed because the return value was
assigned to an (unsigned) char variable prior to comparison with EOF (which
is defined as -1). Fix this by using int as type for 'c' variable, which
also matches the declaration of fgetc().
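
The pattern in isolation, as a sketch with a hypothetical helper name (this is
not the bytecode-file parser itself): the return value of fgetc() is kept in
an int, so that EOF stays distinguishable from every valid byte value,
including 0xff.

```c
#include <assert.h>
#include <stdio.h>

/* Count the bytes remaining in `fp`. The result of fgetc() is kept in an
 * int so that EOF stays distinguishable from every valid byte value. */
static long count_bytes(FILE *fp)
{
	long n = 0;
	int c;	/* not char: see above */

	assert(fp != NULL);
	while ((c = fgetc(fp)) != EOF)
		n++;
	return n;
}
```

With an unsigned char variable the loop condition would never become false;
with a signed char it would stop early at the first 0xff byte.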

While at it, fix the parser logic to correctly handle multiple
empty lines and consecutive whitespace and tab characters to further
improve the parser's robustness. Note that this will still detect double
separator characters, so doesn't soften up the parser too much.

Fixes: 3da3ebfca85b8 ("bpf: Make bytecode-file reading a little more robust")
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 1 Sep 2017 21:15:31 +0000 (14:15 -0700)]
Merge branch 'master' into net-next

6 years agoiplink: double the buffer size also in iplink_get()
Michal Kubecek [Fri, 1 Sep 2017 16:39:16 +0000 (18:39 +0200)]
iplink: double the buffer size also in iplink_get()

Commit 72b365e8e0fd ("libnetlink: Double the dump buffer size") increased
the buffer size for the "ip link show" command to 32 KB to handle NICs with
a large number of VFs. With a "dev" filter, a different code path is taken and
iplink_get() still uses only a 16 KB buffer.

The size of 32768 is not very future-proof as NICs supporting 120-128 VFs
are already in use, so that a single RTM_NEWLINK message in the dump can
exceed 30000 bytes. But it's what rtnl_talk() and rtnl_dump_filter_l() use
so let's be consistent. Once this proves insufficient, all three sizes
should be increased.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
6 years agoiplink: check for message truncation in iplink_get()
Michal Kubecek [Fri, 1 Sep 2017 16:39:11 +0000 (18:39 +0200)]
iplink: check for message truncation in iplink_get()

If message length exceeds maxlen argument of rtnl_talk(), it is truncated
to maxlen but unlike in the case of truncation to the length of local
buffer in rtnl_talk(), the caller doesn't get any indication of a problem.

In particular, iplink_get() passes the truncated message on and parsing it
results in various warnings and sometimes even a segfault (observed with
"ip link show dev ..." for a NIC with 125 VFs).

Handle message truncation in iplink_get() the same way as truncation in
rtnl_talk() would be handled: return an error.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 1 Sep 2017 19:17:48 +0000 (12:17 -0700)]
Merge branch 'master' into net-next

Needed to add JSON support to tclass.

6 years agolnstat_util: Make sure buffer is NUL-terminated
Phil Sutter [Fri, 1 Sep 2017 16:52:56 +0000 (18:52 +0200)]
lnstat_util: Make sure buffer is NUL-terminated

Can't use strlcpy() here since lnstat is not linked against libutil.

While at it, fix coding style in that chunk as well.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agotc_util: No need to terminate an snprintf'ed buffer
Phil Sutter [Fri, 1 Sep 2017 16:52:55 +0000 (18:52 +0200)]
tc_util: No need to terminate an snprintf'ed buffer

snprintf() won't leave the buffer unterminated, so manually terminating
is not necessary here.
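
For illustration, a sketch with a hypothetical helper (the name and the format
string are assumptions, not taken from tc_util): for a non-zero size,
snprintf() always writes the terminating NUL itself, and its return value is
enough to detect truncation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format "<major>:<minor>" in hex into buf.
 * Returns 0 on success, -1 if the output was truncated or on error. */
static int format_handle(char *buf, size_t size,
			 unsigned int major, unsigned int minor)
{
	int n;

	assert(buf != NULL && size > 0);
	n = snprintf(buf, size, "%x:%x", major, minor);

	/* No buf[size - 1] = '\0' needed: snprintf() already terminated it. */
	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```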

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoipxfrm: Replace STRBUF_CAT macro with strlcat()
Phil Sutter [Fri, 1 Sep 2017 16:52:54 +0000 (18:52 +0200)]
ipxfrm: Replace STRBUF_CAT macro with strlcat()

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoConvert harmful calls to strncpy() to strlcpy()
Phil Sutter [Fri, 1 Sep 2017 16:52:53 +0000 (18:52 +0200)]
Convert harmful calls to strncpy() to strlcpy()

This patch converts spots where manual buffer termination was missing to
strlcpy() since that does what is needed.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoConvert the obvious cases to strlcpy()
Phil Sutter [Fri, 1 Sep 2017 16:52:52 +0000 (18:52 +0200)]
Convert the obvious cases to strlcpy()

This converts the typical idiom of manually terminating the buffer after
a call to strncpy().

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoutils: Implement strlcpy() and strlcat()
Phil Sutter [Fri, 1 Sep 2017 16:52:51 +0000 (18:52 +0200)]
utils: Implement strlcpy() and strlcat()

By making use of strncpy(), both implementations are really simple so
there is no need to add libbsd as additional dependency.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agolink_gre6: Print the tunnel's tclass setting
Phil Sutter [Fri, 1 Sep 2017 14:08:09 +0000 (16:08 +0200)]
link_gre6: Print the tunnel's tclass setting

Print the value analogous to flowlabel. While at it, also break
the overlong lines so they do not exceed the 80-character boundary.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agolink_gre6: Fix for changing tclass/flowlabel
Phil Sutter [Fri, 1 Sep 2017 14:08:08 +0000 (16:08 +0200)]
link_gre6: Fix for changing tclass/flowlabel

When trying to change tclass or flowlabel of a GREv6 tunnel which has
the respective value set already, the code accidentally bitwise OR'ed
the old and the new value, leading to unexpected results. Fix this by
clearing the relevant bits of flowinfo variable prior to assigning the
new value.
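
The clear-then-set pattern, sketched under assumptions: the macro and function
names are hypothetical, and the masks are given in host byte order for
simplicity, whereas the real flowinfo field is kept in network byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, in host byte order for simplicity: 8 bits of traffic
 * class above 20 bits of flow label. */
#define FLOWINFO_TCLASS_MASK    UINT32_C(0x0ff00000)
#define FLOWINFO_FLOWLABEL_MASK UINT32_C(0x000fffff)

static uint32_t flowinfo_set_tclass(uint32_t flowinfo, uint8_t tclass)
{
	flowinfo &= ~FLOWINFO_TCLASS_MASK;	/* clear the old value first */
	flowinfo |= (uint32_t)tclass << 20;
	return flowinfo;
}

static uint32_t flowinfo_set_flowlabel(uint32_t flowinfo, uint32_t label)
{
	assert(label <= FLOWINFO_FLOWLABEL_MASK);
	flowinfo &= ~FLOWINFO_FLOWLABEL_MASK;	/* clear the old value first */
	flowinfo |= label;
	return flowinfo;
}
```

Without the clearing step, changing tclass from 0x0f to 0xf0 would leave 0xff
in the field, which is the kind of unexpected result described above.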

Fixes: af89576d7a8c4 ("iproute2: GRE over IPv6 tunnel support.")
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoman: add documentation for seg6 l2encap mode
David Lebrun [Mon, 28 Aug 2017 19:26:40 +0000 (20:26 +0100)]
man: add documentation for seg6 l2encap mode

This patch adds documentation for the seg6 L2ENCAP encapsulation mode.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
6 years agoiproute: add support for seg6 l2encap mode
David Lebrun [Mon, 28 Aug 2017 19:26:39 +0000 (20:26 +0100)]
iproute: add support for seg6 l2encap mode

This patch adds support for the L2ENCAP seg6 mode, which makes it possible to
encapsulate L2 frames within SRv6 packets.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
6 years agoman: tc-ife: add default type note
Alexander Aring [Mon, 28 Aug 2017 19:07:38 +0000 (15:07 -0400)]
man: tc-ife: add default type note

This patch updates the tc-ife man page to note that the default IFE
ethertype will be used if none is specified.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
6 years agotc: m_ife: report about kernels default type
Alexander Aring [Mon, 28 Aug 2017 19:07:37 +0000 (15:07 -0400)]
tc: m_ife: report about kernels default type

If the ethertype for IFE is not specified, this patch reports that the
default IFE type is used.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
6 years agotc: m_ife: print IEEE ethertype format
Alexander Aring [Mon, 28 Aug 2017 19:07:36 +0000 (15:07 -0400)]
tc: m_ife: print IEEE ethertype format

This patch uses the usual IEEE format to display an ethertype, which is
four digits with every digit in upper case.
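
As a sketch of that format (the helper name is hypothetical, and whether the
tool prints any prefix around the digits is not stated here):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write an ethertype as four upper-case hex digits. */
static void format_ethertype(char *buf, size_t size, uint16_t ethertype)
{
	assert(buf != NULL && size >= 5);	/* 4 digits + NUL */
	snprintf(buf, size, "%04X", (unsigned int)ethertype);
}
```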

Signed-off-by: Alexander Aring <aring@mojatatu.com>
6 years agotc: m_ife: allow ife type to zero
Alexander Aring [Mon, 28 Aug 2017 19:07:35 +0000 (15:07 -0400)]
tc: m_ife: allow ife type to zero

This patch allows setting an IFE ethertype of zero. There is no
kernel-side validation which forbids a type of zero.

Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
6 years agoupdate headers from net-next
Stephen Hemminger [Wed, 30 Aug 2017 15:26:43 +0000 (08:26 -0700)]
update headers from net-next

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Wed, 30 Aug 2017 15:24:57 +0000 (08:24 -0700)]
Merge branch 'master' into net-next

6 years agoss: Fix for added diag support check
Phil Sutter [Mon, 28 Aug 2017 17:31:22 +0000 (19:31 +0200)]
ss: Fix for added diag support check

Commit 9f66764e308e9 ("libnetlink: Add test for error code returned from
netlink reply") changed rtnl_dump_filter_l() to return an error in case
NLMSG_DONE would contain one, even if it was ENOENT.

This in turn breaks ss when it tries to dump DCCP sockets on a system
without support for it: The function tcp_show(), which is shared between
TCP and DCCP, will start parsing /proc since inet_show_netlink() returns
an error - yet it parses /proc/net/tcp which doesn't make sense for DCCP
sockets at all.

On my system, a call to 'ss' without further arguments prints the list
of connected TCP sockets twice.

Fix this by introducing a dedicated function dccp_show() which does not
have a fallback to /proc, just like sctp_show(). And since tcp_show()
is no longer "multi-purpose", drop its socktype parameter.

Fixes: 9f66764e308e9 ("libnetlink: Add test for error code returned from netlink reply")
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agodevlink: header update
Stephen Hemminger [Thu, 24 Aug 2017 22:31:57 +0000 (15:31 -0700)]
devlink: header update

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Thu, 24 Aug 2017 22:30:32 +0000 (15:30 -0700)]
Merge branch 'master' into net-next

6 years agotc: use named initializer for default mqprio options
Stephen Hemminger [Thu, 24 Aug 2017 22:27:35 +0000 (15:27 -0700)]
tc: use named initializer for default mqprio options

Use C99 initializer

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agolib/libnetlink: Don't pass NULL parameter to memcpy()
Phil Sutter [Thu, 24 Aug 2017 09:41:31 +0000 (11:41 +0200)]
lib/libnetlink: Don't pass NULL parameter to memcpy()

Both addattr_l() and rta_addattr_l() may be called with NULL data
pointer and 0 alen parameters. Avoid calling memcpy() in that case.

Signed-off-by: Phil Sutter <phil@nwl.cc>
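
The guard described above might look roughly like the following sketch.
copy_payload() is a hypothetical stand-in, not the actual addattr_l() or
rta_addattr_l() code. The point it illustrates is that passing a NULL
pointer to memcpy() is undefined behaviour in C99 and C11 even when the
length is 0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the attribute payload only when there is
 * something to copy, so memcpy() never sees a NULL source pointer. */
static void copy_payload(void *dst, const void *data, size_t alen)
{
	if (alen)
		memcpy(dst, data, alen);
}
```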
6 years agolib/fs: Fix and simplify make_path()
Phil Sutter [Thu, 24 Aug 2017 09:41:30 +0000 (11:41 +0200)]
lib/fs: Fix and simplify make_path()

Calling stat() before mkdir() is racy: The entry might change in
between. Also, the call to stat() seems to exist only to check if the
directory exists already. So simply call mkdir() unconditionally and
catch only errors other than EEXIST.

Signed-off-by: Phil Sutter <phil@nwl.cc>
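
A minimal sketch of the pattern described above, for a single path
component. ensure_dir() is a hypothetical name and this is not the
make_path() implementation itself.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create one directory and treat "already exists"
 * as success, instead of probing with stat() first. */
static int ensure_dir(const char *path, mode_t mode)
{
	if (mkdir(path, mode) != 0 && errno != EEXIST)
		return -1;
	return 0;
}
```

Note that mkdir() also reports EEXIST when the path exists but is not a
directory, so a caller that needs a directory would notice the problem
only on the next operation below that path.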
6 years agolib/bpf: Check return value of write()
Phil Sutter [Thu, 24 Aug 2017 09:41:29 +0000 (11:41 +0200)]
lib/bpf: Check return value of write()

This is merely to silence the compiler warning. If write to stderr
failed, assume that printing an error message will fail as well so don't
even try.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agonetem/maketable: Check return value of fscanf()
Phil Sutter [Thu, 24 Aug 2017 09:41:28 +0000 (11:41 +0200)]
netem/maketable: Check return value of fscanf()

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoss: Make sure scanned index value to unix_state_map is sane
Phil Sutter [Thu, 24 Aug 2017 09:41:27 +0000 (11:41 +0200)]
ss: Make sure scanned index value to unix_state_map is sane

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoss: Make struct tcpstat fields 'timer' and 'timeout' unsigned
Phil Sutter [Thu, 24 Aug 2017 09:41:26 +0000 (11:41 +0200)]
ss: Make struct tcpstat fields 'timer' and 'timeout' unsigned

Both 'timer' and 'timeout' variables of struct tcpstat are either
scanned as unsigned values from /proc/net/tcp{,6} or copied from
'idiag_timer' and 'idiag_expires' fields of struct inet_diag_msg, which
are themselves unsigned. Therefore they may be unsigned as well, which
eliminates the need to check for negative values.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agobpf: drop unused parameter to bpf_report_map_in_map
Stephen Hemminger [Thu, 24 Aug 2017 22:02:58 +0000 (15:02 -0700)]
bpf: drop unused parameter to bpf_report_map_in_map

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agolibnetlink: drop unused parameter to rtnl_dump_done
Stephen Hemminger [Thu, 24 Aug 2017 22:02:32 +0000 (15:02 -0700)]
libnetlink: drop unused parameter to rtnl_dump_done

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agordma: fix duplicate initialization in port_names
Stephen Hemminger [Thu, 24 Aug 2017 22:00:59 +0000 (15:00 -0700)]
rdma: fix duplicate initialization in port_names

Building with warnings enabled spotted this.
link.c:51:58: note: (near initialization for ‘rdma_port_names[23]’)
   rdma_port_names[] = { RDMA_PORT_FLAGS(RDMA_BITMAP_NAMES) };

Assume that fields were in order and 25 is the missing value.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agolib/ll_map: Choose size of new cache items at run-time
Phil Sutter [Thu, 24 Aug 2017 09:51:50 +0000 (11:51 +0200)]
lib/ll_map: Choose size of new cache items at run-time

Instead of having a fixed buffer of 16 bytes for the interface name,
tailor size of new ll_cache entry using the interface name's actual
length. This also makes sure the following call to strcpy() is safe.

Signed-off-by: Phil Sutter <phil@nwl.cc>
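
One way to size an entry at run-time is a C99 flexible array member, as
in the sketch below. struct name_entry and name_entry_new() are
hypothetical; the real struct ll_cache carries more fields, and the use
of a flexible array member here is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache entry, loosely modelled on the description above. */
struct name_entry {
	unsigned int index;
	char name[];	/* C99 flexible array member */
};

static struct name_entry *name_entry_new(unsigned int index,
					 const char *ifname)
{
	size_t len = strlen(ifname) + 1;	/* include the terminating NUL */
	struct name_entry *e = malloc(sizeof(*e) + len);

	if (!e)
		return NULL;
	e->index = index;
	memcpy(e->name, ifname, len);	/* fits by construction */
	return e;
}
```

Because the allocation is derived from the string length, the copy
cannot overflow regardless of how long the interface name is.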
6 years agotc/m_xt: Fix for potential string buffer overflows
Phil Sutter [Thu, 24 Aug 2017 09:51:49 +0000 (11:51 +0200)]
tc/m_xt: Fix for potential string buffer overflows

- Use strncpy() when writing to target->t->u.user.name and make sure the
  final byte remains untouched (xtables_calloc() set it to zero).
- 'tname' length sanitization was completely wrong: If its length
  exceeded the 16 bytes available in 'k', passing a length value of 16
  to strncpy() would overwrite the previously NUL'ed 'k[15]'. Also, the
  sanitization has to happen if 'tname' is exactly 16 bytes long as
  well.

Signed-off-by: Phil Sutter <phil@nwl.cc>
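
The off-by-one described above can be illustrated with a small sketch.
copy_name() and NAME_LEN are hypothetical names; only the 16-byte buffer
size comes from the commit message.

```c
#include <string.h>

#define NAME_LEN 16	/* buffer size mentioned in the commit message */

/* Hypothetical helper: bounded copy that always leaves dst
 * NUL-terminated. strncpy() is given NAME_LEN - 1, so the final byte,
 * zeroed beforehand, is never overwritten. */
static void copy_name(char dst[NAME_LEN], const char *src)
{
	memset(dst, 0, NAME_LEN);
	strncpy(dst, src, NAME_LEN - 1);
}
```

With a source string of exactly 16 characters, passing NAME_LEN instead
of NAME_LEN - 1 would fill all 16 bytes and leave no terminator; the
version above truncates to 15 characters.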
6 years agolnstat_util: Simplify alloc_and_open() a bit
Phil Sutter [Thu, 24 Aug 2017 09:51:48 +0000 (11:51 +0200)]
lnstat_util: Simplify alloc_and_open() a bit

Relying upon callers and using unsafe strcpy() is probably not the best
idea. Aside from that, using snprintf() allows formatting the string for
lf->path in one go.

Signed-off-by: Phil Sutter <phil@nwl.cc>
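
A sketch of building a path with a single snprintf() call and detecting
truncation. build_path() and its parameters are hypothetical and do not
mirror the alloc_and_open() signature.

```c
#include <stdio.h>

/* Hypothetical helper: write "<dir>/<file>" into out in one call.
 * Returns 0 on success, -1 if formatting failed or the result did not
 * fit into outlen bytes (including the terminating NUL). */
static int build_path(char *out, size_t outlen,
		      const char *dir, const char *file)
{
	int n = snprintf(out, outlen, "%s/%s", dir, file);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```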
6 years agolib/inet_proto: Review inet_proto_{a2n,n2a}()
Phil Sutter [Thu, 24 Aug 2017 09:51:47 +0000 (11:51 +0200)]
lib/inet_proto: Review inet_proto_{a2n,n2a}()

The original intent was to make sure strings written by those functions
are NUL-terminated at all times, though it was suggested to get rid of
the 15 char protocol name limit as well which this patch accomplishes.

In addition to that, simplify inet_proto_a2n() a bit: Use the error
checking in get_u8() to find out whether passed 'buf' contains a valid
decimal number instead of checking the first character's value manually.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agolib/fs: Fix format string in find_fs_mount()
Phil Sutter [Thu, 24 Aug 2017 09:51:46 +0000 (11:51 +0200)]
lib/fs: Fix format string in find_fs_mount()

A field width of 4096 allows fscanf() to store that amount of characters
into the given buffer, though that doesn't include the terminating NUL
byte. Decrease the value by one to leave space for it.

Signed-off-by: Phil Sutter <phil@nwl.cc>
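
The same rule shown on a smaller buffer, as a sketch. read_token() and
BUF_LEN are hypothetical; the 16-byte size is chosen only to keep the
example short, whereas the commit concerns a 4096-byte buffer.

```c
#include <stdio.h>

#define BUF_LEN 16

/* Hypothetical helper: read one whitespace-delimited token. The field
 * width in the format string must be BUF_LEN - 1, because the scanf
 * family stores a terminating NUL in addition to the characters counted
 * by the width. */
static int read_token(FILE *fp, char buf[BUF_LEN])
{
	return fscanf(fp, "%15s", buf) == 1 ? 0 : -1;
}
```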
6 years agoipntable: Avoid memory allocation for filter.name
Phil Sutter [Thu, 24 Aug 2017 09:51:45 +0000 (11:51 +0200)]
ipntable: Avoid memory allocation for filter.name

The original issue was that filter.name might end up unterminated if
user provided string was too long. But in fact it is not necessary to
copy the commandline parameter at all: just make filter.name point to it
instead.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agotipc/bearer: Prevent NULL pointer dereference
Phil Sutter [Thu, 24 Aug 2017 09:46:34 +0000 (11:46 +0200)]
tipc/bearer: Prevent NULL pointer dereference

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agotc/tc_filter: Make sure filter name is not empty
Phil Sutter [Thu, 24 Aug 2017 09:46:33 +0000 (11:46 +0200)]
tc/tc_filter: Make sure filter name is not empty

The later check for 'k[0] != 0' requires a non-empty filter name,
otherwise a NULL pointer dereference of 'q' might happen.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agotc/q_netem: Don't dereference possibly NULL pointer
Phil Sutter [Thu, 24 Aug 2017 09:46:32 +0000 (11:46 +0200)]
tc/q_netem: Don't dereference possibly NULL pointer

Assuming 'opt' might be NULL, move the call to RTA_PAYLOAD to after the
check since it dereferences its parameter.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoifstat, nstat: Check fdopen() return value
Phil Sutter [Thu, 24 Aug 2017 09:46:31 +0000 (11:46 +0200)]
ifstat, nstat: Check fdopen() return value

Prevent passing NULL FILE pointer to fgets() later.

Fix both tools in a single patch since the code changes are basically
identical.

Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoss: fix help/man TCP-STATE description for listening
Andreas Henriksson [Wed, 23 Aug 2017 12:47:51 +0000 (14:47 +0200)]
ss: fix help/man TCP-STATE description for listening

There's some misleading information in --help and ss(8) manpage about
TCP-STATE named 'listen'.
ss doesn't know such a state, but it knows 'listening' state.

$ ss -tua state listen
ss: wrong state name: listen

$ ss -tua state listening
[...]

Addresses: https://bugs.debian.org/872990
Reported-by: Pavel Lyulchenko <p.lyulchenko@gmail.com>
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
6 years agogre: add support for ERSPAN tunnel
William Tu [Wed, 23 Aug 2017 17:06:54 +0000 (10:06 -0700)]
gre: add support for ERSPAN tunnel

The patch adds ERSPAN type II tunnel support. The implementation is
based on the draft at
 https://tools.ietf.org/html/draft-foschiano-erspan-01.

One purpose is to let a Linux box receive ERSPAN monitoring traffic
sent from a Cisco switch by creating an ERSPAN tunnel device. In
addition, the patch also adds ERSPAN TX, so traffic can also be
encapsulated into ERSPAN and sent out.

The implementation reuses the key as the ERSPAN session ID and the
'erspan' field as the ERSPAN index:
./ip link add dev ers11 type erspan seq key 100 erspan 123 \
local 172.16.1.200 remote 172.16.1.100

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Meenakshi Vohra <mvohra@vmware.com>
6 years agoadd ERSPAN headers
Stephen Hemminger [Wed, 23 Aug 2017 17:05:08 +0000 (10:05 -0700)]
add ERSPAN headers

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoconfig: put CFLAGS/LDLIBS in config.mk
Stephen Hemminger [Fri, 11 Aug 2017 00:05:03 +0000 (17:05 -0700)]
config: put CFLAGS/LDLIBS in config.mk

This renames Config to config.mk and includes more Make input.
Now configure generates all the required CFLAGS and LDLIBS for
the optional libraries.

Also, use pkg-config to test for libelf, rather than using a test
program. This makes it consistent with other libraries.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Tue, 22 Aug 2017 00:37:15 +0000 (17:37 -0700)]
Merge branch 'master' into net-next

6 years agolib/bpf: Don't leak fp in bpf_find_mntpt()
Phil Sutter [Mon, 21 Aug 2017 14:46:51 +0000 (16:46 +0200)]
lib/bpf: Don't leak fp in bpf_find_mntpt()

If fopen() succeeded but len != PATH_MAX, the function leaks the open
FILE pointer. Fix this by checking len value before calling fopen().

Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
6 years agodevlink: Check return code of strslashrsplit()
Phil Sutter [Mon, 21 Aug 2017 16:36:52 +0000 (18:36 +0200)]
devlink: Check return code of strslashrsplit()

This function shouldn't fail because all callers of
__dl_argv_handle_port() make sure the passed string contains enough
slashes already, but if this changes in the future, make sure the
function won't access uninitialized data.

Signed-off-by: Phil Sutter <phil@nwl.cc>