]> git.proxmox.com Git - mirror_iproute2.git/log
mirror_iproute2.git
5 years agotc: Do not use addattr_nest_compat on mqprio and netem
Jesus Sanchez-Palencia [Mon, 16 Jul 2018 17:52:18 +0000 (10:52 -0700)]
tc: Do not use addattr_nest_compat on mqprio and netem

Here we are partially reverting commit c14f9d92eee107
"treewide: Use addattr_nest()/addattr_nest_end() to handle nested
attributes" .

As discussed in [1], changing from the 'manually' coded version that
used addattr_l() to addattr_nest_compat() wasn't functionally
equivalent, because now the messages have extra fields appended to it.

This introduced a regression since the implementation of parse_attr()
from both mqprio and netem can't handle this new message format.

Without this fix, mqprio returns an error. netem won't return an error
but its internal configuration ends up wrong.

As an example, this can be reproduced by the following commands when
this patch is not applied:

 1) mqprio
$ tc qdisc replace dev enp3s0 parent root handle 100 mqprio \
num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
queues 1@0 1@1 2@2 hw 0

RTNETLINK answers: Numerical result out of range

 2) netem
$ tc qdisc add dev enp3s0 root netem rate 5kbit 20 100 5 \
distribution normal latency 1 1

$ tc -s qdisc

(...)
qdisc netem 8001: dev enp3s0 root refcnt 9 limit 1000 delay 0us  0us
 Sent 402 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
(...)

With this patch applied, the tc -s qdisc command above for netem instead
reads:

(...)
qdisc netem 8002: dev enp3s0 root refcnt 9 limit 1000 delay 0us  0us \
rate 5Kbit packetoverhead 20 cellsize 100 celloverhead 5
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
(...)

[1] https://patchwork.ozlabs.org/patch/867860/#1893405

Fixes: c14f9d92eee107 ("treewide: Use addattr_nest()/addattr_nest_end() to handle nested attributes")
Reported-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoip: add support for seg6local End.BPF action
Mathieu Xhonneux [Tue, 17 Jul 2018 14:49:52 +0000 (14:49 +0000)]
ip: add support for seg6local End.BPF action

This patch adds support for the End.BPF action of the seg6local
lightweight tunnel. Functions from the BPF lightweight tunnel are
re-used in this patch. Example:

$ ip -6 route add fc00::18 encap seg6local action End.BPF endpoint
obj my_bpf.o sec my_func dev eth0

$ ip -6 route show fc00::18
fc00::18  encap seg6local action End.BPF endpoint my_bpf.o:[my_func]
dev eth0 metric 1024 pref medium

v2: - re-use of print_encap_bpf_prog instead of fprintf
    - introduction of "endpoint" keyword for more consistency with
      others parameters

Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoipaddress: Fix and make consistent label match handling
Serhey Popovych [Sat, 14 Jul 2018 21:36:34 +0000 (00:36 +0300)]
ipaddress: Fix and make consistent label match handling

Since commit 9516823051ce ("ipaddress: Improve print_linkinfo()") we
return -1 instead of 0 when ip-address(8) label does not match network
device name as we did before change. This causes regression when trying
to output ip address matching label:

     # ip addr add 192.168.192.1/24 dev lo label lo:1
     # ip addr show label lo:1
     <no output>

This is special case and return 0 from print_linkinfo() earlier to match
only filter.ifindex and filter.up if given, but not rest fields in
@filter. Then call print_selected_addrinfo() without calling
print_link_stats() in ipaddr_list_flush_or_save().

Later print_selected_addrinfo() calls print_addrinfo() that finally
matches IFA_LABEL attribute in netlink buffer with filter.label using
ifa_label_match_rta().

On the other hand there is three conditions checked in print_linkinfo()
to determine label special case:

    1) filter.label != NULL
    2) filter.family == AF_UNSPEC || filter.family == AF_PACKET
    3) fnmatch(filter.label, name, 0)

With 1) it is ok to check if filtering by label is on by given pattern
in @filter.label.

Since label is IPv4 specific and AF_PACKET is for printing ip-link(8)
information (see ipaddr_link_list()::ipaddress.c as example) checking
for AF_PACKET in 2) doesn't take much sense: better to defer these
checks to print_addrinfo() determine valid combinations before calling
ifa_label_match_rta() to finally match IFA_LABEL to pattern in
filter.label.

For 3) we have following call for test case:

    fnmatch(pattern, string, flags) ->
      fnmatch(filter.label, name, 0) ->
        fnmatch("lo:1", "lo", 0) == FNM_NOMATCH (1) or non-zero on error

To support special case in print_linkinfo() for filtering by label we
only need to check if label pattern is given in filter.label and return
0 to skip print_link_stats() in ipaddr_list_flush_or_save(): actual
filtering will be done in print_addrinfo().

Before commit 9516823051ce ("ipaddress: Improve print_linkinfo()"):
-------------------------------------------------------------------

$ ip addr sh label lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN \
group default qlen 1000
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                          fnmatch("lo", "lo", 0) == 0
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
$ ip addr show label 'lo:*'
    inet 192.168.192.1/24 scope global lo:1
       valid_lft forever preferred_lft forever
$ ip addr sh label lo:1
    inet 192.168.192.1/24 scope global lo:1
       valid_lft forever preferred_lft forever
$ ip -4 addr sh label lo:1
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN \
group default qlen 1000
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                             filter.family == AF_INET
    inet 192.168.192.1/24 scope global lo:1
       valid_lft forever preferred_lft forever

After this change applied:
--------------------------

$ ip/ip addr show label lo
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
$ ip/ip addr show label 'lo:*'
    inet 192.168.192.1/24 scope global lo:1
        valid_lft forever preferred_lft forever
$ ip/ip addr show label lo:1
    inet 192.168.192.1/24 scope global lo:1
       valid_lft forever preferred_lft forever
$ ip/ip -4 addr show label lo:1
    inet 192.168.192.1/24 scope global lo:1
       valid_lft forever preferred_lft forever

Note that we no longer show link information as we did previously:
    we are filtering by "label" pattern, not showing by "dev".

Fixes: commit 9516823051ce ("ipaddress: Improve print_linkinfo()")
Reported-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agotc: don't double print rate
Stephen Hemminger [Mon, 9 Jul 2018 16:53:45 +0000 (09:53 -0700)]
tc: don't double print rate

Conversion to print stats in JSON forgot to remove existing
fprintf.

Fixes: 4fcec7f3665b ("tc: jsonify stats2")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoman: Fix typos on tc-cbs
Jesus Sanchez-Palencia [Thu, 5 Jul 2018 15:20:09 +0000 (08:20 -0700)]
man: Fix typos on tc-cbs

Fix 2 typos on the man page of the CBS qdisc.

Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agotc: Fix the bug not to display prio and quantum options of htb
fumihiko kakuma [Wed, 4 Jul 2018 03:32:33 +0000 (12:32 +0900)]
tc: Fix the bug not to display prio and quantum options of htb

A commandline like 'tc -d class show dev dev-name' does not
display value of prio and quantum option when we use htb qdisc.
This patch fixes the bug.

Signed-off-by: Fumihiko Kakuma <kakuma@valinux.co.jp>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agotc: Fix output of ip attributes
Roi Dayan [Tue, 3 Jul 2018 12:54:32 +0000 (15:54 +0300)]
tc: Fix output of ip attributes

Example output is of tos and ttl.
Befoe this fix the format used %x caused output of the pointer
instead of the intended string created in the out variable.

Fixes: e28b88a464c4 ("tc: jsonify flower filter")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agouapi: update bpf.h
Stephen Hemminger [Sat, 7 Jul 2018 16:56:14 +0000 (09:56 -0700)]
uapi: update bpf.h

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agodevlink.8, translate unparseable callout syntax to parseable form.
Eric S. Raymond [Wed, 13 Jun 2018 21:31:12 +0000 (17:31 -0400)]
devlink.8, translate unparseable callout syntax to parseable form.

Signed-off-by: Eric S. Raymond <esr@thyrsus.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agotc: fix batch force option
Vlad Buslov [Wed, 20 Jun 2018 07:24:21 +0000 (10:24 +0300)]
tc: fix batch force option

When sending accumulated compound command results an error, check 'force'
option before exiting. Move return code check after putting batch bufs and
freeing iovs to prevent memory leak. Break from loop, instead of returning
error code to allow cleanup at the end of batch function. Don't reset ret
code on each iteration.

Fixes: 485d0c6001c4 ("tc: Add batchsize feature for filter and actions")
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoip: add rmnet initial support
Daniele Palmas [Fri, 15 Jun 2018 08:23:05 +0000 (10:23 +0200)]
ip: add rmnet initial support

This patch adds basic support for Qualcomm rmnet devices.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoipaddress: strengthen check on 'label' input
Patrick Talbert [Thu, 14 Jun 2018 13:46:57 +0000 (15:46 +0200)]
ipaddress: strengthen check on 'label' input

As mentioned in the ip-address man page, an address label must
be equal to the device name or prefixed by the device name
followed by a colon. Currently the only check on this input is
to see if the device name appears at the beginning of the label
string.

This commit adds an additional check to ensure label == dev or
continues with a colon.

Signed-off-by: Patrick Talbert <ptalbert@redhat.com>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agordma: sync some IP headers with glibc
Hoang Le [Wed, 13 Jun 2018 04:09:56 +0000 (11:09 +0700)]
rdma: sync some IP headers with glibc

In the commit 9a362cc71a45, new userspace header:
  (i.e rdma/rdma_user_cm.h -> linux/in6.h)
is included before the kernel space header:
  (i.e utils.h -> resolv.h -> netinet/in.h).

This leads to unsynchronous some IP headers and compiler got failure
with error: redefinition of some structs IP.

In this commit, just reorder this including to make them in-sync.

Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoiproute2: Add support for a few routing protocols
Donald Sharp [Fri, 8 Jun 2018 17:47:13 +0000 (13:47 -0400)]
iproute2: Add support for a few routing protocols

Add support for:

BGP
ISIS
OSPF
RIP
EIGRP

Routing protocols to iproute2.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agouapi: update headers from linux-net
Stephen Hemminger [Fri, 8 Jun 2018 17:27:13 +0000 (10:27 -0700)]
uapi: update headers from linux-net

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoMerge ../iproute2-next
Stephen Hemminger [Fri, 8 Jun 2018 17:27:04 +0000 (10:27 -0700)]
Merge ../iproute2-next

5 years agov4.17.0
Stephen Hemminger [Fri, 8 Jun 2018 17:11:50 +0000 (10:11 -0700)]
v4.17.0

5 years agouapi: update bpf.h to include padding
Stephen Hemminger [Fri, 8 Jun 2018 17:09:21 +0000 (10:09 -0700)]
uapi: update bpf.h to include padding

Last minute upstream 4.17 change.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agotipc: TIPC_NLA_LINK_NAME value pass on nesting entry TIPC_NLA_LINK
Hoang Le [Fri, 8 Jun 2018 02:19:28 +0000 (09:19 +0700)]
tipc: TIPC_NLA_LINK_NAME value pass on nesting entry TIPC_NLA_LINK

In the commit 94f6a80 on next-net, TIPC_NLA_LINK_NAME attribute should be
retrieved and validated via TIPC_NLA_LINK nesting entry in
tipc_nl_node_get_link().
According to that commit, TIPC_NLA_LINK_NAME value passing via
tipc link get command must follow above hierachy.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoiplink: enable to specify a name for the link-netns
Nicolas Dichtel [Tue, 5 Jun 2018 13:08:31 +0000 (15:08 +0200)]
iplink: enable to specify a name for the link-netns

The 'link-netnsid' argument needs a number. Add 'link-netns' when the user
wants to use the iproute2 netns name instead of the nsid.

Example:
ip link add ipip1 link-netns foo type ipip remote 10.16.0.121 local 10.16.0.249

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoip: display netns name instead of nsid
Nicolas Dichtel [Tue, 5 Jun 2018 13:08:30 +0000 (15:08 +0200)]
ip: display netns name instead of nsid

When iproute2 has a name for the nsid, let's display it. It's more
user friendly than a number.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agotc: add json support in csum action
Keara Leibovitz [Tue, 5 Jun 2018 20:44:19 +0000 (16:44 -0400)]
tc: add json support in csum action

Add json output support for checksum action.

Example output:

~$ $TC actions add action csum udp continue index 7
~$ $TC actions add action csum icmp iph igmp pipe index 200 cookie 112233
~$ $TC -j actions ls action csum

[{
    "total acts":2
}, {
    "actions": [{
        "order":0,
        "csum":"udp",
        "control_action": {
            "type":"continue"
        },
        "index":7,
        "ref":1,
        "bind":0
    }, {
        "order":1,
        "csum":"iph, icmp, igmp",
        "control_action": {
            "type":"pipe"
        },
        "index":200,
        "ref":1,
        "bind":0,
        "cookie":"112233"
    }]
}]

v2:
    Don't initialized char buf[64];
    Add output example

Signed-off-by: Keara Leibovitz <kleib@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Tue, 5 Jun 2018 21:22:15 +0000 (14:22 -0700)]
Merge branch 'iproute2-master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agodevlink: don't enforce NETLINK_{CAP,EXT}_ACK sock opts
Ivan Vecera [Fri, 1 Jun 2018 08:18:49 +0000 (10:18 +0200)]
devlink: don't enforce NETLINK_{CAP,EXT}_ACK sock opts

Since commit 049c58539f5d ("devlink: mnlg: Add support for extended ack")
devlink requires NETLINK_{CAP,EXT}_ACK. This prevents devlink from
working with older kernels that don't support these features.

host # ./devlink/devlink
Failed to connect to devlink Netlink

Fixes: 049c58539f5d ("devlink: mnlg: Add support for extended ack")
Cc: Arkadi Sharshevsky <arkadis@mellanox.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
5 years agoip: IFLA_NEW_NETNSID/IFLA_NEW_IFINDEX support
Nicolas Dichtel [Thu, 31 May 2018 14:28:48 +0000 (16:28 +0200)]
ip: IFLA_NEW_NETNSID/IFLA_NEW_IFINDEX support

Parse and display those attributes.
Example:
ip l a type dummy
ip netns add foo
ip monitor link&
ip l s dummy1 netns foo
Deleted 6: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
    link/ether 66:af:3a:3f:a0:89 brd ff:ff:ff:ff:ff:ff new-nsid 0 new-ifindex 6

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoiproute2: fix 'ip xfrm monitor all' command
Nathan Harold [Wed, 30 May 2018 19:11:32 +0000 (12:11 -0700)]
iproute2: fix 'ip xfrm monitor all' command

Currently, calling 'ip xfrm monitor all' will
actually invoke the 'all-nsid' command because the
soft-match for 'all-nsid' occurs before the precise
match for 'all'. This patch rearranges the checks
so that the 'all' command, itself an alias for
invoking 'ip xfrm monitor' with no argument, can
be called consistent with the syntax for other ip
commands that accept an 'all'.

Signed-off-by: Nathan Harold <nharold@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoiplink_vrf: Save device index from response for return code
David Ahern [Fri, 1 Jun 2018 15:50:16 +0000 (08:50 -0700)]
iplink_vrf: Save device index from response for return code

A recent commit changed rtnl_talk_* to return the response message in
allocated memory so callers need to free it. The change to name_is_vrf
did not save the device index which is pointing to a struct inside the
now allocated and freed memory resulting in garbage getting returned
in some cases.

Fix by using a stack variable to save the return value and only set
it to ifi->ifi_index after all checks are done and before the answer
buffer is freed.

Fixes: 86bf43c7c2fdc ("lib/libnetlink: update rtnl_talk to support malloc buff at run time")
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agort_protos: drop old experimental gated names
Stephen Hemminger [Fri, 1 Jun 2018 19:44:14 +0000 (15:44 -0400)]
rt_protos: drop old experimental gated names

No longer need these petroglyph values.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoiproute: ip route get support for sport, dport and ipproto match
Roopa Prabhu [Wed, 30 May 2018 17:06:18 +0000 (10:06 -0700)]
iproute: ip route get support for sport, dport and ipproto match

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agoip route: print RTA_CACHEINFO if it exists
David Ahern [Wed, 30 May 2018 15:30:09 +0000 (08:30 -0700)]
ip route: print RTA_CACHEINFO if it exists

RTA_CACHEINFO can be sent for non-cloned routes. If the attribute is
present print it. Allows route dumps to print expires times for example
which can exist on FIB entries.

Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Fri, 1 Jun 2018 15:17:23 +0000 (08:17 -0700)]
Merge branch 'iproute2-master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agoipaddress: Add support for address metric
David Ahern [Sun, 27 May 2018 15:10:00 +0000 (08:10 -0700)]
ipaddress: Add support for address metric

Add support for IFA_RT_PRIORITY using the same keywords as iproute for
RTA_PRIORITY.

Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agoUpdate kernel headers
David Ahern [Wed, 30 May 2018 15:06:19 +0000 (08:06 -0700)]
Update kernel headers

Update kernel headers to commit
ae40832e53c3 ("bpfilter: fix a build err")

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip: defer lookup interface index
Stephen Hemminger [Fri, 25 May 2018 14:48:40 +0000 (07:48 -0700)]
ip: defer lookup interface index

The ip command would always lookup the network device index
even when not necessary. This slows down operations like creating
lots of VLAN's.

David reported the original issue, this is an alternative patch
that solves it in a slightly more general method.

Using iproute2 to create a bridge and add 4094 vlans to it can take from
2 to 3 *minutes*. The reason is the extraneous call to ll_name_to_index.
ll_name_to_index results in an ioctl(SIOCGIFINDEX) call which in turn
invokes dev_load. If the index does not exist, which it won't when
creating a new link, dev_load calls modprobe twice -- once for
netdev-NAME and again for NAME. This is unnecessary overhead for each
link create.

When ip link is invoked for a new device, there is no reason to
call ll_name_to_index for the new device. With this patch, creating
a bridge and adding 4094 vlans takes less than 3 *seconds*.

old:
# time ip -batch ip-vlan.batch
real    3m13.727s
user    0m0.076s
sys     0m1.959s

new:
# time ip -batch ip-vlan.batch
real    0m3.222s
user    0m0.044s
sys     0m1.777s

Reported-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip route: Print expires as signed int
David Ahern [Wed, 23 May 2018 18:50:01 +0000 (11:50 -0700)]
ip route: Print expires as signed int

rta_expires is a signed int; print it as one.

Fixes: 663c3cb23103f ("iproute: implement JSON and color output")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoAllow to configure /var/run/netns directory
Pavel Maltsev [Fri, 18 May 2018 22:44:00 +0000 (15:44 -0700)]
Allow to configure /var/run/netns directory

Currently NETNS_RUN_DIR is hardcoded and refers to /var/run/netns.
However, some systems (e.g. Android) doesn't have /var
which results in error attempts to create network namespaces on these
systems.  This change makes NETNS_RUN_DIR configurable at build time
by allowing to pass environment variable to make command.
Also, this change makes /etc/netns directory configurable through
NETNS_ETC_DIR environment variable.

For example: ./configure && NETNS_RUN_DIR=/mnt/vendor/netns make

Tested: verified that iproute2 with configuration mentioned above
creates namespaces in /mnt/vendor/netns

Signed-off-by: Pavel Maltsev <pavelm@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodevlink: introduce support for showing port flavours
Jiri Pirko [Sun, 20 May 2018 08:15:38 +0000 (10:15 +0200)]
devlink: introduce support for showing port flavours

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoUpdate kernel headers
David Ahern [Wed, 23 May 2018 19:58:34 +0000 (12:58 -0700)]
Update kernel headers

Update kernel headers to commit
e89e59c08d1b ("Merge branch 'net-sfp-small-improvements'")

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'rdma-resource-tracking' into iproute2-next
David Ahern [Fri, 18 May 2018 16:21:42 +0000 (09:21 -0700)]
Merge branch 'rdma-resource-tracking' into iproute2-next

Steve Wise  says:

====================

This series enhances the iproute2 rdma tool to include displaying
driver-specific resource attributes.  It is the user-space part of the
kernel driver resource tracking series that has been accepted for merging
into linux-4.18 [1]

If there are no additional review comments, it can now be merged, I think.

Changes since v2:
- resync rdma_netlink.h to fix uapi break

Changes since v1:
- commit log editorial fixes
- cite kernel commits that updated rdma_netlink.h in the
  iproute2 commit syncing this header
- reorder stack definitions ala "reverse christmas tree"
- correctly handle unknown driver attributes when printing

Changes since v0/rfc:
- changed "provider" to "driver" based on kernel side changes
- updated man pages
- removed "RFC" tag

Thanks,

Steve.

[1] https://www.spinics.net/lists/linux-rdma/msg64199.html

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: update man pages
Steve Wise [Tue, 15 May 2018 20:41:15 +0000 (13:41 -0700)]
rdma: update man pages

Update the man pages for the resource attributes as well
as the driver-specific attributes.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: print driver resource attributes
Steve Wise [Tue, 15 May 2018 20:41:09 +0000 (13:41 -0700)]
rdma: print driver resource attributes

This enhancement allows printing rdma device-specific state, if provided
by the kernel. This is done in a generic manner, so rdma tool doesn't
need to know about the details of every type of rdma device.

Driver attributes for a rdma resource are in the form of <key,
[print_type], value> tuples, where the key is a string and the value can
be any supported driver attribute. The print_type attribute, if present,
provides a print format to use vs the standard print format for the type.
For example, the default print type for a PROVIDER_S32 value is "%d ",
but "0x%x " if the print_type of PRINT_TYPE_HEX is included inthe tuple.

Driver resources are only printed when the -dd flag is present.
If -p is present, then the output is formatted to not exceed 80 columns,
otherwise it is printed as a single row to be grep/awk friendly.

Example output:

# rdma resource show qp lqpn 1028 -dd -p
link cxgb4_0/- lqpn 1028 rqpn 0 type RC state RTS rq-psn 0 sq-psn 0 path-mig-state MIGRATED pid 0 comm [nvme_rdma]
    sqid 1028 flushed 0 memsize 123968 cidx 85 pidx 85 wq_pidx 106 flush_cidx 85 in_use 0
    size 386 flags 0x0 rqid 1029 memsize 16768 cidx 43 pidx 41 wq_pidx 171 msn 44 rqt_hwaddr 0x2a8a5d00
    rqt_size 256 in_use 128 size 130

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: update rdma_netlink.h to get new driver attributes
Steve Wise [Tue, 15 May 2018 20:41:02 +0000 (13:41 -0700)]
rdma: update rdma_netlink.h to get new driver attributes

Pull in the rdma_netlink.h changes from kernel
commits:

25a0ad85156a ("RDMA/nldev: Add explicit pad attribute")
da5c85078215 ("RDMA/nldev: add driver-specific resource tracking)"
0d52d803767e ("RDMA/uapi: Fix uapi breakage")

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotipc: fixed node and name table listings
Jon Maloy [Thu, 17 May 2018 14:02:42 +0000 (16:02 +0200)]
tipc: fixed node and name table listings

We make it easier for users to correlate between 128-bit node
identities and 32-bit node hash number by extending the 'node list'
command to also show the hash number.

We also improve the 'nametable show' command to show the node identity
instead of the node hash number. Since the former potentially is much
longer than the latter, we make room for it by eliminating the (to the
user) irrelevant publication key. We also reorder some of the columns so
that the node id comes last, since this looks nicer and is more logical.

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: add missing space symbol in ife output
Roman Mashak [Thu, 17 May 2018 13:28:02 +0000 (09:28 -0400)]
tc: add missing space symbol in ife output

In order to make TDC tests match the output patterns, the missing space
character must be added in the mode output string.

Fixes: 8744c5d3388e3 ("tc: jsonify ife action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: flower: add support for verbose logging
Marcelo Ricardo Leitner [Sun, 13 May 2018 20:44:28 +0000 (17:44 -0300)]
tc: flower: add support for verbose logging

Currently there is no way to log offloading errors if the rule is not
explicitly marked as skip_sw, making it hard for other applications such
as Open vSwitch to log why a given could not be offloaded.

This patch adds support for signaling the kernel that more verbose
logging is wanted, which now will include such messages.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoUpdate kernel headers
David Ahern [Fri, 18 May 2018 16:05:07 +0000 (09:05 -0700)]
Update kernel headers

Update kernel headers to commit
64a2658b58ab ("net: mscc: Add SPDX identifier")

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: allow 0% for percent options
Stephen Hemminger [Thu, 17 May 2018 23:20:50 +0000 (16:20 -0700)]
tc: allow 0% for percent options

Allowing 0% is sometimes useful for example in netem loss and drop
or perhaps dropping all traffic in a HTB bin.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199745
Reported-by: stuartmarsden@gmail.com
Fixes: 927e3cfb52b5 ("tc: B.W limits can now be specified in %.")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotc-netem: fix limit description in man page
Marcelo Ricardo Leitner [Wed, 16 May 2018 00:49:55 +0000 (21:49 -0300)]
tc-netem: fix limit description in man page

As the kernel code says, limit is actually the amount of packets it can
hold queued at a time, as per:

static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                         struct sk_buff **to_free)
{
...
        if (unlikely(sch->q.qlen >= sch->limit))
                return qdisc_drop_all(skb, sch, to_free);

So lets fix the description of the field in the man page.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotc-netem: fix limit description in man page
Marcelo Ricardo Leitner [Wed, 16 May 2018 00:49:55 +0000 (21:49 -0300)]
tc-netem: fix limit description in man page

As the kernel code says, limit is actually the amount of packets it can
hold queued at a time, as per:

static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                         struct sk_buff **to_free)
{
...
        if (unlikely(sch->q.qlen >= sch->limit))
                return qdisc_drop_all(skb, sch, to_free);

So lets fix the description of the field in the man page.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Wed, 16 May 2018 21:10:27 +0000 (14:10 -0700)]
Merge branch 'iproute2-master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip: do not drop capabilities if net_admin=i is set
Luca Boccassi [Fri, 11 May 2018 12:39:56 +0000 (13:39 +0100)]
ip: do not drop capabilities if net_admin=i is set

Users have reported a regression due to ip now dropping capabilities
unconditionally.
zerotier-one VPN and VirtualBox use ambient capabilities in their
binary and then fork out to ip to set routes and links, and this
does not work anymore.

As a workaround, do not drop caps if CAP_NET_ADMIN (the most common
capability used by ip) is set with the INHERITABLE flag.
Users that want ip vrf exec to work do not need to set INHERITABLE,
which will then only set when the calling program had privileges to
give itself the ambient capability.

Fixes: ba2fc55b99f8 ("Drop capabilities if not running ip exec vrf with libcap")
Signed-off-by: Luca Boccassi <bluca@debian.org>
6 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Thu, 10 May 2018 04:04:16 +0000 (21:04 -0700)]
Merge branch 'iproute2-master' into iproute2-next

 Conflicts:
rdma/include/uapi/rdma/rdma_netlink.h

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotipc: Add support to set and get MTU for UDP bearer
GhantaKrishnamurthy MohanKrishna [Tue, 8 May 2018 11:55:28 +0000 (13:55 +0200)]
tipc: Add support to set and get MTU for UDP bearer

In this commit we introduce the ability to set and get
MTU for UDP media and bearer.

For set and get properties such as tolerance, window and priority,
we already do:

    $ tipc media set PPROPERTY media MEDIA
    $ tipc media get PPROPERTY media MEDIA

    $ tipc bearer set OPTION media MEDIA ARGS
    $ tipc bearer get [OPTION] media MEDIA ARGS

The same has been extended for MTU, with an exception to support
only media type UDP.

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoUpdate kernel headers
David Ahern [Thu, 10 May 2018 03:52:52 +0000 (20:52 -0700)]
Update kernel headers

Update kernel headers to commit 53a7bdfb2a27
("dt-bindings: dsa: Remove unnecessary #address/#size-cells")

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoss: remove non-functional slabinfo
Stephen Hemminger [Wed, 9 May 2018 20:57:08 +0000 (13:57 -0700)]
ss: remove non-functional slabinfo

Ss was using slabinfo to try and intuit TCP statistics.
The slabinfo changed several times since 2.4 and all these statistics
are broken by renames and slab merging. Plus slabinfo does not exist
at all if kernel is compiled with SLUB option.

Rather than trying to fix kernel, just trim away the no longer
valid statistics.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agordma: add ib header files
Stephen Hemminger [Wed, 9 May 2018 15:14:55 +0000 (08:14 -0700)]
rdma: add ib header files

The iproute2 header files must be complete to allow builds on
other places where some of the headers are not present.

For example, iproute2 is built on Windows Services for Linux
as a test tool. With the partial addition of rdma it was broken.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agordma: align headers with upstream
Stephen Hemminger [Wed, 9 May 2018 15:12:13 +0000 (08:12 -0700)]
rdma: align headers with upstream

This makes rdma/include/uapi/rdma headers align with those produced
by doing make headers_install from upstream (Linus) tree.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agobpf: don't offload perf array maps
Jakub Kicinski [Sat, 5 May 2018 00:37:51 +0000 (17:37 -0700)]
bpf: don't offload perf array maps

Perf arrays are handled specially by the kernel, don't request
offload even when used by an offloaded program.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Sat, 5 May 2018 18:07:47 +0000 (11:07 -0700)]
Merge branch 'iproute2-master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: Parse last nexthop in a multipath route
Ido Schimmel [Tue, 1 May 2018 13:16:35 +0000 (16:16 +0300)]
iproute: Parse last nexthop in a multipath route

Continue parsing a multipath payload as long as another nexthop can fit
in the payload.

# ip route add 192.0.2.0/24 nexthop dev dummy0 nexthop dev dummy1

Before:
# ip route show 192.0.2.0/24
192.0.2.0/24
        nexthop dev dummy0 weight 1

After:
# ip route show 192.0.2.0/24
192.0.2.0/24
        nexthop dev dummy0 weight 1
        nexthop dev dummy1 weight 1

Fixes: f48e14880a0e ("iproute: refactor multipath print")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoarpd: remove pthread dependency
Baruch Siach [Tue, 1 May 2018 12:43:08 +0000 (15:43 +0300)]
arpd: remove pthread dependency

Explicit link with pthread is not needed when linking dynamically. Even
static link with recent libdb does not pull in the code that uses
pthread. Finally, the configure check introduced in commit a25df4887d7
(configure: Check for Berkeley DB for arpd compilation) does not add
-lpthread to its link command.

This change allows arpd build with toolchains that do not provide
threads support.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoREADME: update libdb build dependency information
Baruch Siach [Tue, 1 May 2018 12:43:07 +0000 (15:43 +0300)]
README: update libdb build dependency information

Debian does not distribute libdb4.x-dev for quite some time now. Current
stable carries libdb5.3-dev. Update the wording accordingly.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agojson_print: Fix hidden 64-bit type promotion
Toke Høiland-Jørgensen [Wed, 25 Apr 2018 15:28:57 +0000 (17:28 +0200)]
json_print: Fix hidden 64-bit type promotion

print_uint() will silently promote its variable type to uint64_t, but there
is nothing that ensures that the format string specifier passed along with
it fits (and the function name suggest to pass "%u").

Fix this by changing print_uint() to use a native 'unsigned int' type, and
introduce a separate print_u64() function for printing 64-bit values. All
call sites that were actually printing 64-bit values using print_uint() are
converted to use print_u64() instead.

Since print_int() was already using native int types, just add a
print_s64() to match, but don't convert any call sites. For symmetry,
also add a print_luint() method (with no users).

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoingress: Don't break JSON output
Toke Høiland-Jørgensen [Wed, 25 Apr 2018 09:29:46 +0000 (11:29 +0200)]
ingress: Don't break JSON output

The dash printed by the ingress qdisc breaks JSON output, so only print it
in regular output mode.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agovxlan: add ttl auto in help message
Hangbin Liu [Tue, 24 Apr 2018 02:40:17 +0000 (10:40 +0800)]
vxlan: add ttl auto in help message

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agogre/gre6: allow clearing {,i,o}{key,seq,csum} flags
Sabrina Dubroca [Fri, 20 Apr 2018 08:32:00 +0000 (10:32 +0200)]
gre/gre6: allow clearing {,i,o}{key,seq,csum} flags

Currently, iproute allows setting those flags, but it's impossible to
clear them, since their current value is fetched from the kernel and
then we OR in the additional flags passed on the command line.

Add no* variants to allow clearing them.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoman: ip link: document GRE tunnels
Sabrina Dubroca [Fri, 20 Apr 2018 08:31:59 +0000 (10:31 +0200)]
man: ip link: document GRE tunnels

GRE tunnels are currently only documented together with IPIP and SIT
tunnels, but they actually have very different configuration
options. Let's separate them.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'master' into iproute2-next
David Ahern [Tue, 24 Apr 2018 02:42:21 +0000 (19:42 -0700)]
Merge branch 'master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiplink_geneve: correct size of message to avoid spurious errors
Jakub Kicinski [Wed, 18 Apr 2018 18:06:07 +0000 (11:06 -0700)]
iplink_geneve: correct size of message to avoid spurious errors

Commit 6c4b672738ac ("iplink_geneve: Get rid of inet_get_addr()")
inadvertently changed the parameter to addattr_l() resulting in:

addattr_l ERROR: message exceeded bound of 4

when remote is specified.

Fixes: 6c4b672738ac ("iplink_geneve: Get rid of inet_get_addr()")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
6 years agobpf: fix warnings on gcc-8 about string truncation
Stephen Hemminger [Fri, 20 Apr 2018 17:38:00 +0000 (10:38 -0700)]
bpf: fix warnings on gcc-8 about string truncation

In theory, the path for BPF could exceed the 4K PATH_MAX.
In practice, not really possible. But shut up gcc.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotc: return on invalid smac or dmac in ife action
Roman Mashak [Fri, 20 Apr 2018 13:52:18 +0000 (09:52 -0400)]
tc: return on invalid smac or dmac in ife action

Return on invalid smac/dmac and use invarg consistently for invalid
arguments report.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
6 years agoflower: use 16 bit format where possible
Stephen Hemminger [Fri, 20 Apr 2018 17:04:14 +0000 (10:04 -0700)]
flower: use 16 bit format where possible

Should use print_hu not print_uint for 16 bit value.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoipneigh: fix missing format specifier
Stephen Hemminger [Fri, 20 Apr 2018 16:29:13 +0000 (09:29 -0700)]
ipneigh: fix missing format specifier

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agovxlan: fix ttl inherit behavior
Hangbin Liu [Wed, 18 Apr 2018 05:05:48 +0000 (13:05 +0800)]
vxlan: fix ttl inherit behavior

Like kernel net-next commit 72f6d71e491e6 ("vxlan: add ttl inherit support"),
vxlan ttl inherit should means inherit the inner protocol's ttl value.

But currently when we add vxlan with "ttl inherit", we only set ttl 0,
which is actually use whatever default value instead of inherit the inner
protocol's ttl value.

To make a difference with ttl inherit and ttl == 0, we add an attribute
IFLA_VXLAN_TTL_INHERIT when "ttl inherit" specified. And use "ttl auto"
to means "use whatever default value", the same behavior with ttl == 0.

Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoUpdate kernel headers
David Ahern [Thu, 19 Apr 2018 18:10:27 +0000 (11:10 -0700)]
Update kernel headers

Update kernel headers to commit 292eba02dbb4
("net-next/hinic: add arm64 support")

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoutils: Do not reset family for default, any, all addresses
David Ahern [Fri, 13 Apr 2018 16:36:33 +0000 (09:36 -0700)]
utils: Do not reset family for default, any, all addresses

Thomas reported a change in behavior with respect to autodectecting
address families. Specifically, 'ip ro add default via fe80::1'
syntax was failing to treat fe80::1 as an IPv6 address as it did in
prior releases. The root causes appears to be a change in family when
the default keyword is parsed.

'default', 'any' and 'all' are relevant outside of AF_INET. Leave the
family arg as is for these when setting addr.

Fixes: 93fa12418dc6 ("utils: Always specify family and ->bytelen in get_prefix_1()")
Reported-by: Thomas Deutschmann <whissi@gentoo.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Serhey Popovych <serhe.popovych@gmail.com>
6 years agoiproute: Abort if nexthop cannot be parsed
Jakub Sitnicki [Wed, 11 Apr 2018 09:43:11 +0000 (11:43 +0200)]
iproute: Abort if nexthop cannot be parsed

Attempt to add a multipath route where a nexthop definition refers to a
non-existent device causes 'ip' to crash and burn due to stack buffer
overflow:

  # ip -6 route add fd00::1/64 nexthop dev fake1
  Cannot find device "fake1"
  Cannot find device "fake1"
  Cannot find device "fake1"
  ...
  Segmentation fault (core dumped)

Don't ignore errors from the helper routine that parses the nexthop
definition, and abort immediately if parsing fails.

Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
6 years agotc: jsonify ife action
Roman Mashak [Fri, 13 Apr 2018 21:40:05 +0000 (17:40 -0400)]
tc: jsonify ife action

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: jsonify skbedit action
Roman Mashak [Tue, 10 Apr 2018 18:04:29 +0000 (14:04 -0400)]
tc: jsonify skbedit action

v2:
   FIxed strings format in print_string()

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agouapi/sctp: update header from 4.17-rc1
Stephen Hemminger [Tue, 10 Apr 2018 17:50:00 +0000 (10:50 -0700)]
uapi/sctp: update header from 4.17-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agouapi/tipc: update header from 4.17-rc1
Stephen Hemminger [Tue, 10 Apr 2018 17:49:41 +0000 (10:49 -0700)]
uapi/tipc: update header from 4.17-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agouapi/bpf: update kernel header from 4.17-rc1
Stephen Hemminger [Tue, 10 Apr 2018 17:48:56 +0000 (10:48 -0700)]
uapi/bpf: update kernel header from 4.17-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agobridge: fix typo in hairpin error message
Guillaume Nault [Fri, 6 Apr 2018 11:33:49 +0000 (13:33 +0200)]
bridge: fix typo in hairpin error message

No 'g' to hairpin.

Fixes: 64108901b737 ("bridge: Add support for setting bridge port attributes")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotc: jsonify tunnel_key action
Roman Mashak [Wed, 4 Apr 2018 17:21:18 +0000 (13:21 -0400)]
tc: jsonify tunnel_key action

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: jsonify connmark action
Roman Mashak [Tue, 3 Apr 2018 13:09:55 +0000 (09:09 -0400)]
tc: jsonify connmark action

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Print net device name and index for RDMA device
Leon Romanovsky [Tue, 3 Apr 2018 04:29:14 +0000 (07:29 +0300)]
rdma: Print net device name and index for RDMA device

The RDMA devices are operated in RoCE and iWARP modes have net device
underneath. Present their names in regular output and their net index
in detailed mode.

[root@nps ~]# rdma link show mlx5_3/1
4/1: mlx5_3/1: state ACTIVE physical_state LINK_UP netdev ens7
[root@nps ~]# rdma link show mlx5_3/1 -d
4/1: mlx5_3/1: state ACTIVE physical_state LINK_UP netdev ens7 netdev_index 7
    caps: <CM, IP_BASED_GIDS>

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Fri, 6 Apr 2018 16:02:02 +0000 (09:02 -0700)]
Merge branch 'iproute2-master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agol2tp: no need to export session offsets in JSON output
Guillaume Nault [Thu, 5 Apr 2018 17:24:17 +0000 (19:24 +0200)]
l2tp: no need to export session offsets in JSON output

The offset and peer_offset parameters are only printed to avoid
confusing external scripts that may parse "ip l2tp show session"
output. There's no reason to keep them in JSON.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
6 years agotc: Correct json output for actions
Yuval Mintz [Wed, 4 Apr 2018 12:24:13 +0000 (15:24 +0300)]
tc: Correct json output for actions

Commit 9fd3f0b255d9 ("tc: enable json output for actions") added JSON
support for tc-actions at the expense of breaking other use cases that
reach tc_print_action(), as the latter don't expect the 'actions' array
to be a new object.

Consider the following taken duringrun of tc_chain.sh selftest,
and see the latter command output is broken:

$ ./tc/tc -j -p actions list action gact | grep -C 3 actions
[ {
        "total acts": 1
    },{
        "actions": [ {
                "order": 0,

$ ./tc/tc -p -j -s filter show dev enp3s0np2 ingress | grep -C 3 actions
            },
            "skip_hw": true,
            "not_in_hw": true,{
                "actions": [ {
                        "order": 1,
                        "kind": "gact",
                        "control_action": {

Relocate the open/close of the JSON object to declare the object only
for the case that needs it.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Tested-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip/l2tp: remove offset and peer-offset options
Guillaume Nault [Tue, 3 Apr 2018 15:39:54 +0000 (17:39 +0200)]
ip/l2tp: remove offset and peer-offset options

Ignore options "peer-offset" and "offset" when creating sessions. Keep
them when dumping sessions in order to avoid breaking external scripts.

"peer-offset" has always been a noop in iproute2. "offset" is now
ignored in Linux 4.16 (and was broken before that).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agordma: Ignore unknown netlink attributes
Leon Romanovsky [Tue, 3 Apr 2018 07:28:42 +0000 (10:28 +0300)]
rdma: Ignore unknown netlink attributes

The check if netlink attributes supplied more than maximum supported
is to strict and may lead to backward compatibility issues with old
application with a newer kernel that supports new attribute.

CC: Steve Wise <swise@opengridcomputing.com>
Fixes: 74bd75c2b68d ("rdma: Add basic infrastructure for RDMA tool")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Mon, 2 Apr 2018 17:47:34 +0000 (10:47 -0700)]
Merge branch 'iproute2-master' into iproute2-next

Conflicts:
bridge/mdb.c
misc/ss.c
tc/tc.c

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agov4.16.0
Stephen Hemminger [Mon, 2 Apr 2018 17:06:08 +0000 (10:06 -0700)]
v4.16.0

6 years agoman: fix devlink object list
Jiri Pirko [Thu, 29 Mar 2018 14:26:16 +0000 (16:26 +0200)]
man: fix devlink object list

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agouapi/if_ether: add definition of ether type field
Stephen Hemminger [Mon, 2 Apr 2018 16:17:42 +0000 (09:17 -0700)]
uapi/if_ether: add definition of ether type field

Part of upstream commit
4bbb3e0e8239 ("net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off")

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodevlink: Print size of -1 as unlimited
David Ahern [Fri, 30 Mar 2018 16:21:44 +0000 (09:21 -0700)]
devlink: Print size of -1 as unlimited

(u64)-1  essentially means the size is unlimited. Print as 'unlimited'
as opposed to the current unsigned int range of 4294967295.

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: jsonify sample action
Roman Mashak [Sat, 31 Mar 2018 04:20:45 +0000 (00:20 -0400)]
tc: jsonify sample action

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: support oneline mode in action generic printer functions
Roman Mashak [Sat, 31 Mar 2018 04:16:45 +0000 (00:16 -0400)]
tc: support oneline mode in action generic printer functions

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'rdma-res-tracking' into iproute2-next
David Ahern [Sun, 1 Apr 2018 15:19:21 +0000 (08:19 -0700)]
Merge branch 'rdma-res-tracking' into iproute2-next

Steve Wise  says:

====================

This series enhances the iproute2 rdma tool to include dumping of
connection manager id (cm_id), completion queue (cq), memory region (mr),
and protection domain (pd) rdma resources.  It is the user-space part of
the kernel resource tracking series merged into rdma-next for 4.17 [1]
and [2].

Changes since v3:
- replaced rdma_cma.h inclusion with UAPI rdma_user_cm.h
- display only device names instead of device/port for cq, mr, and pd
since they are not associated with a specific port.

Changes since v2:
- pull in rdma-core:include/rdma/rdma_cma.h
- 80 column reformat
- add reviewed-by tags

Changes since v1/RFC:
- removed RFC tag
- initialize rd properly to avoid passing a garbage port number
- revert accidental change to qp_valid_filters
- removed cm_id dev/network/transport types
- cm_id ip addrs now passed up as __kernel_sockaddr_storage
- cm_id ip address ports printed as "address:port" strings
- only parse/display memory keys and iova if available
- filter on "users" for cqs and pds
- fixed memory leaks
- removed PD_FLAGS attribute
- filter on "mrlen" for mrs
- filter on "poll-ctx" for cqs
- don't require addrs or qp_type for parsing cm_ids
- only filter optional attrs if they are present
- remove PGSIZE MR attr to match kernel

[1] https://www.spinics.net/lists/linux-rdma/msg61720.html
[2] https://www.spinics.net/lists/linux-rdma/msg62979.html
    https://www.spinics.net/lists/linux-rdma/msg62980.html

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Add PD resource tracking information
Steve Wise [Thu, 29 Mar 2018 16:10:44 +0000 (09:10 -0700)]
rdma: Add PD resource tracking information

Sample output:

Without CAP_NET_ADMIN capability:

dev mlx4_0 users 0 pid 0 comm [ib_srpt]
dev mlx4_0 users 0 pid 0 comm [ib_srp]
dev mlx4_0 users 1 pid 0 comm [ib_core]
dev cxgb4_0 users 0 pid 0 comm [ib_srp]

With CAP_NET_ADMIN capability:
dev mlx4_0 local_dma_lkey 0x8000 users 0 pid 0 comm [ib_srpt]
dev mlx4_0 local_dma_lkey 0x8000 users 0 pid 0 comm [ib_srp]
dev mlx4_0 local_dma_lkey 0x8000 users 1 pid 0 comm [ib_core]
dev cxgb4_0 local_dma_lkey 0x0 users 0 pid 0 comm [ib_srp]

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>