git.proxmox.com Git - mirror_iproute2.git/log
7 years agotc: m_xt: Fix segfault with iptables-1.6.0
Phil Sutter [Thu, 12 Jan 2017 14:22:49 +0000 (15:22 +0100)]
tc: m_xt: Fix segfault with iptables-1.6.0

Said iptables version introduced struct xtables_globals field
'compat_rev', a function pointer. Initializing it is mandatory as
libxtables calls it without existence check.

Without this, tc segfaults when using the xt action like so:

| tc filter add dev d0 parent ffff: u32 match u32 0 0 \
| action xt -j MARK --set-mark 20
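The failure mode can be sketched in isolation. The struct and function
names below are hypothetical stand-ins, not the libxtables definitions;
the sketch only shows why a callback that the library calls without a
NULL check has to be initialised by every caller.

```c
#include <stddef.h>

/* Hypothetical stand-in for a globals struct that gained a callback field. */
struct demo_globals {
    const char *program_name;
    int (*compat_rev)(const char *name, int rev);  /* must not be left NULL */
};

/* Hypothetical default callback: accept every revision. */
static int demo_compat_rev(const char *name, int rev)
{
    (void)name;
    (void)rev;
    return 1;
}

/* Stand-in for library code that calls the pointer without checking it. */
static int demo_library_check(const struct demo_globals *g,
                              const char *name, int rev)
{
    return g->compat_rev(name, rev);  /* would crash if compat_rev were NULL */
}

/* Caller side: initialise the callback explicitly. */
static struct demo_globals demo_tc_globals = {
    .program_name = "demo",
    .compat_rev = demo_compat_rev,
};
```

Leaving .compat_rev out of the initialiser would zero it, and the call in
demo_library_check() would then dereference a NULL function pointer.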

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agowhitespace cleanup
Stephen Hemminger [Fri, 13 Jan 2017 01:29:41 +0000 (17:29 -0800)]
whitespace cleanup

Get rid of blanks at end of line and extra lines at eof

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoAdd support for rt_protos.d
David Ahern [Mon, 9 Jan 2017 23:43:09 +0000 (15:43 -0800)]
Add support for rt_protos.d

Add support for reading proto id/name mappings from rt_protos.d
directory. Allows users to have custom protocol values converted
to human friendly names.

Each file under rt_protos.d has the 'id name' format used by
rt_protos. Only .conf files are read and parsed.
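The two rules above (only .conf files, one 'id name' pair per line) could
be checked roughly as follows. This is a minimal sketch, not the code
from the patch; the helper names are made up for illustration and the
handling of comments and blank lines is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if fname ends in ".conf" and has a non-empty stem, else 0. */
static int has_conf_suffix(const char *fname)
{
    size_t len = strlen(fname);

    return len > 5 && strcmp(fname + len - 5, ".conf") == 0;
}

/* Parse one "id name" line.
 * Returns 0 on success, -1 for blank, comment or malformed lines. */
static int parse_id_name(const char *line, int *id, char *name, size_t name_len)
{
    char buf[64];

    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '#' || *line == '\n' || *line == '\0')
        return -1;
    if (sscanf(line, "%i %63s", id, buf) != 2)
        return -1;
    if (strlen(buf) >= name_len)
        return -1;
    strcpy(name, buf);
    return 0;
}
```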

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agoip vrf: Improve bpf error messages
David Ahern [Fri, 6 Jan 2017 00:22:23 +0000 (16:22 -0800)]
ip vrf: Improve bpf error messages

Next up, a non-root user gets various bpf-related error messages:

$ ip vrf exec mgmt bash
Failed to load BPF prog: 'Operation not permitted'
Kernel compiled with CGROUP_BPF enabled?

Catch the EPERM error and do not show the kernel config option.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agoip vrf: Improve cgroup2 error messages
David Ahern [Fri, 6 Jan 2017 00:22:22 +0000 (16:22 -0800)]
ip vrf: Improve cgroup2 error messages

Currently, if a non-root user attempts to run ip vrf exec an unhelpful
error is returned:

$ ip vrf exec mgmt bash
Failed to mount cgroup2. Are CGROUPS enabled in your kernel?

Only show the CGROUPS kernel hint for the ENODEV error and for the
rest show the strerror for the errno. So now:

$ ip/ip vrf exec mgmt bash
Failed to mount cgroup2: Operation not permitted
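The errno split described above can be sketched as a small helper. The
function name is hypothetical and the sketch formats into a buffer
instead of printing to stderr; only the two message texts are taken from
this commit message.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Build the message for a failed cgroup2 mount from the saved errno value. */
static void format_cgroup2_mount_error(int err, char *buf, size_t len)
{
    if (err == ENODEV)
        /* Only ENODEV suggests missing kernel support. */
        snprintf(buf, len,
                 "Failed to mount cgroup2. Are CGROUPS enabled in your kernel?");
    else
        /* Everything else reports the plain error string. */
        snprintf(buf, len, "Failed to mount cgroup2: %s", strerror(err));
}
```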

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agoip vrf: Fix run-on error message on mkdir failure
David Ahern [Fri, 6 Jan 2017 00:22:21 +0000 (16:22 -0800)]
ip vrf: Fix run-on error message on mkdir failure

Andy reported a missing newline if a non-root user attempts to run
'ip vrf exec':

$ ./ip/ip vrf exec default /bin/echo asdf
mkdir failed for /var/run/cgroup2: Permission deniedFailed to setup vrf cgroup2 directory

Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agotc: make tc linking depend on libtc.a
David Michael [Tue, 3 Jan 2017 23:32:46 +0000 (15:32 -0800)]
tc: make tc linking depend on libtc.a

There was a race condition where the command to link the tc binary
could (rarely) run before the libtc.a archive existed.

7 years agofix typo in ip-xfrm man page, rmd610 -> rmd160
Alexey Kodanev [Fri, 23 Dec 2016 11:03:16 +0000 (14:03 +0300)]
fix typo in ip-xfrm man page, rmd610 -> rmd160

Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
7 years agotc: add missing limits.h header
Baruch Siach [Thu, 22 Dec 2016 18:52:48 +0000 (20:52 +0200)]
tc: add missing limits.h header

This fixes build issues under musl such as:

f_matchall.c: In function ‘matchall_parse_opt’:
f_matchall.c:48:12: error: ‘LONG_MIN’ undeclared (first use in this function)
   if (h == LONG_MIN || h == LONG_MAX) {
            ^
f_matchall.c:48:12: note: each undeclared identifier is reported only once for each function it appears in
f_matchall.c:48:29: error: ‘LONG_MAX’ undeclared (first use in this function)
   if (h == LONG_MIN || h == LONG_MAX) {
                             ^
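For reference, a self-contained version of this kind of check looks like
the sketch below. The function name is made up; the point is only that
LONG_MIN and LONG_MAX come from <limits.h>, which has to be included
explicitly instead of relying on another header pulling it in.

```c
#include <errno.h>
#include <limits.h>   /* LONG_MIN, LONG_MAX */
#include <stdlib.h>   /* strtol */

/* Parse a handle; return 0 on success, -1 on overflow or trailing garbage. */
static int parse_handle(const char *arg, long *out)
{
    char *end;
    long h;

    errno = 0;
    h = strtol(arg, &end, 0);
    if ((h == LONG_MIN || h == LONG_MAX) && errno == ERANGE)
        return -1;          /* value did not fit in a long */
    if (end == arg || *end != '\0')
        return -1;          /* empty input or trailing characters */
    *out = h;
    return 0;
}
```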

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
7 years agotc/m_tunnel_key: Add to the usage encapsulation dest UDP port
Hadar Hen Zion [Thu, 22 Dec 2016 08:14:41 +0000 (10:14 +0200)]
tc/m_tunnel_key: Add to the usage encapsulation dest UDP port

The tunnel key set parameters also include the dest UDP port; add it to
the usage.

Fixes: 449c709c3868 ("tc/m_tunnel_key: Add dest UDP port to tunnel key action")
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reported-by: Simon Horman <simon.horman@netronome.com>
7 years agotc/cls_flower: Add to the usage encapsulation dest UDP port
Hadar Hen Zion [Thu, 22 Dec 2016 08:14:40 +0000 (10:14 +0200)]
tc/cls_flower: Add to the usage encapsulation dest UDP port

Encapsulation dest UDP port is part of the classifier matching
parameters; add it to the usage.

Fixes: 41aa17ff4668 ("tc/cls_flower: Add dest UDP port to tunnel params")
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reported-by: Simon Horman <simon.horman@netronome.com>
7 years agoRevert "tc: flower: Allow *_mac options to accept a mask"
Stephen Hemminger [Thu, 22 Dec 2016 00:06:49 +0000 (16:06 -0800)]
Revert "tc: flower: Allow *_mac options to accept a mask"

This reverts commit 0390185078dedd551028fba58d53ef303ab57a2f.

7 years agoRevert "tc: flower: document that *_ip parameters take a PREFIX as an argument."
Stephen Hemminger [Thu, 22 Dec 2016 00:06:35 +0000 (16:06 -0800)]
Revert "tc: flower: document that *_ip parameters take a PREFIX as an argument."

This reverts commit a8a1dccd2af957077aa9d975db979c39d571bb6c.

7 years agoupdate kernel headers
Stephen Hemminger [Wed, 21 Dec 2016 23:58:49 +0000 (15:58 -0800)]
update kernel headers

7 years agotc: updated man page to reflect filter-id use in filter GET command.
Roman Mashak [Sun, 18 Dec 2016 17:25:37 +0000 (12:25 -0500)]
tc: updated man page to reflect filter-id use in filter GET command.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
7 years agotc: fixed man page fonts for keywords and variable values
Roman Mashak [Sun, 18 Dec 2016 17:25:12 +0000 (12:25 -0500)]
tc: fixed man page fonts for keywords and variable values

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
7 years agoip: vfinfo: remove code duplication for IFLA_VF_RSS_QUERY_EN
Julien Fortin [Fri, 16 Dec 2016 16:36:05 +0000 (17:36 +0100)]
ip: vfinfo: remove code duplication for IFLA_VF_RSS_QUERY_EN

Fixes: 4fb4a10e120b1 ("ipaddress: Print IFLA_VF_QUERY_RSS_EN setting")
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
Acked-by: Phil Sutter <phil@nwl.cc>
7 years agotc: flower: Allow *_mac options to accept a mask
Simon Horman [Fri, 16 Dec 2016 13:54:37 +0000 (14:54 +0100)]
tc: flower: Allow *_mac options to accept a mask

* The argument to src_mac and dst_mac may now take an optional mask
  to limit the scope of matching.
* This address is documented as a LLADDR in keeping with ip-link(8).
* The formats accepted match those already output when dumping flower
  filters from the kernel.

Example of use of LLADDR with and without a mask:

tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
src_mac 52:54:01:00:00:00/ff:ff:00:00:00:01 action drop
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
src_mac 52:54:00:00:00:00/23 action drop
tc filter add dev eth0 protocol ip parent ffff: flower indev eth0 \
src_mac 52:54:00:00:00:00 action drop
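How a prefix-length mask such as /23 relates to a full byte mask can be
sketched as below. This is an illustration under the usual "leading bits
set" convention, which is assumed here and not quoted from the patch;
both helper names are hypothetical.

```c
#include <string.h>

#define MAC_LEN 6

/* Turn a prefix length (0..48) into a 6-byte mask; -1 if out of range. */
static int mac_prefix_to_mask(int bits, unsigned char mask[MAC_LEN])
{
    int i;

    if (bits < 0 || bits > MAC_LEN * 8)
        return -1;
    memset(mask, 0, MAC_LEN);
    for (i = 0; i < MAC_LEN && bits > 0; i++) {
        int n = bits >= 8 ? 8 : bits;

        mask[i] = (unsigned char)(0xff << (8 - n));  /* top n bits set */
        bits -= n;
    }
    return 0;
}

/* Compare two addresses under a mask; return 1 if they match. */
static int mac_match(const unsigned char *a, const unsigned char *b,
                     const unsigned char *mask)
{
    int i;

    for (i = 0; i < MAC_LEN; i++)
        if ((a[i] & mask[i]) != (b[i] & mask[i]))
            return 0;
    return 1;
}
```

Under that convention /23 corresponds to the mask ff:ff:fe:00:00:00, so
52:54:00:00:00:00/23 also covers addresses starting with 52:54:01.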

Signed-off-by: Simon Horman <simon.horman@netronome.com>
7 years agotc: flower: document that *_ip parameters take a PREFIX as an argument.
Simon Horman [Fri, 16 Dec 2016 13:54:36 +0000 (14:54 +0100)]
tc: flower: document that *_ip parameters take a PREFIX as an argument.

* The arguments to src_ip, dst_ip, enc_src_ip and enc_dst_ip take an
  optional prefix length which is used to provide a mask to limit the scope
  of matching.
* This is documented as a PREFIX in keeping with ip-route(8).

Example of uses of IPv4 and IPv6 prefixes

tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ip parent ffff: flower \
    indev eth0 dst_ip 192.168.1.1 action drop
tc filter add dev eth0 protocol ip parent ffff: flower \
    indev eth0 src_ip 10.0.0.0/8 action drop
tc filter add dev eth0 protocol ipv6 parent ffff: flower \
    indev eth0 src_ip 2001:DB8:1::/48 action drop
tc filter add dev eth0 protocol ipv6 parent ffff: flower \
    indev eth0 dst_ip 2001:DB8::1 action drop
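What a prefix length means for matching can be shown with a small IPv4
check. This is a userspace illustration only, with a made-up function
name; it is not how the kernel classifier or tc implements the match.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Return 1 if addr lies inside prefix/len, 0 if not, -1 on bad input. */
static int ipv4_in_prefix(const char *addr, const char *prefix, int len)
{
    struct in_addr a, p;
    uint32_t mask;

    if (len < 0 || len > 32)
        return -1;
    if (inet_pton(AF_INET, addr, &a) != 1 ||
        inet_pton(AF_INET, prefix, &p) != 1)
        return -1;
    /* Shifting a 32-bit value by 32 is undefined, so treat /0 separately. */
    mask = len == 0 ? 0 : htonl(0xffffffffu << (32 - len));
    return (a.s_addr & mask) == (p.s_addr & mask);
}
```

An address given without a prefix length behaves like a /32 here, which
corresponds to the exact-match dst_ip 192.168.1.1 example.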

Signed-off-by: Simon Horman <simon.horman@netronome.com>
7 years agoip netns: Reset vrf to default VRF on namespace switch
David Ahern [Thu, 15 Dec 2016 20:07:02 +0000 (12:07 -0800)]
ip netns: Reset vrf to default VRF on namespace switch

A vrf is local to a namespace. Drop any VRF association before trying
to exec a command in the new namespace.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agoip vrf: Fix reset to default VRF
David Ahern [Thu, 15 Dec 2016 20:07:01 +0000 (12:07 -0800)]
ip vrf: Fix reset to default VRF

Path in vrf_switch for "default" VRF is supposed to be MNT/vrf not
MNT/default. Also, default_vrf flag is redundant with ifindex. Remove
the flag in favor of ifindex != 0.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agoip vrf: Refactor ipvrf_identify
David Ahern [Thu, 15 Dec 2016 20:07:00 +0000 (12:07 -0800)]
ip vrf: Refactor ipvrf_identify

Split ipvrf_identify into arg processing and a function that does the
actual cgroup file parsing. The latter function is used in a follow
on patch.

In the process, convert the reading of the cgroups file to use fopen
and fgets just in case the file ever grows beyond 4k. Move printing
of any error message and the vrf name to the caller of the new
vrf_identify.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agoip vrf: Move kernel config hint to prog_load failure
David Ahern [Thu, 15 Dec 2016 20:06:59 +0000 (12:06 -0800)]
ip vrf: Move kernel config hint to prog_load failure

Move the hint about CGROUP_BPF enabled to prog_load failure since
it fails before the attach. Update the existing error message to
print to stderr.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agoconfigure: fix elftest when warnings enabled
Stephen Hemminger [Thu, 15 Dec 2016 03:09:55 +0000 (19:09 -0800)]
configure: fix elftest when warnings enabled

If compile testing with -W then elftest.c would fail because
of unused variables.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoFix compile warning in get_addr_1
David Ahern [Tue, 13 Dec 2016 23:34:32 +0000 (15:34 -0800)]
Fix compile warning in get_addr_1

A recent cleanup causes a compile warning on Debian jessie:

    CC       utils.o
utils.c: In function ‘get_addr_1’:
utils.c:486:21: warning: passing argument 1 of ‘ll_addr_a2n’ from incompatible pointer type
   len = ll_addr_a2n(&addr->data, sizeof(addr->data), name);
                     ^
In file included from utils.c:34:0:
../include/rt_names.h:27:5: note: expected ‘char *’ but argument is of type ‘__u32 (*)[8]’
 int ll_addr_a2n(char *lladdr, int len, const char *arg);
     ^

Revert the removal of the typecast

Fixes: e1933b928125 ("utils: cleanup style")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agotc: pass correct conversion specifier to print 'unsigned int' action index.
Roman Mashak [Tue, 13 Dec 2016 20:31:16 +0000 (15:31 -0500)]
tc: pass correct conversion specifier to print 'unsigned int' action index.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
7 years agoipvrf: cleanup style issues
Stephen Hemminger [Tue, 13 Dec 2016 18:43:24 +0000 (10:43 -0800)]
ipvrf: cleanup style issues

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoutils: cleanup style
Stephen Hemminger [Tue, 13 Dec 2016 18:41:36 +0000 (10:41 -0800)]
utils: cleanup style

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agolibnetlink: break up dump function
Stephen Hemminger [Tue, 13 Dec 2016 18:40:49 +0000 (10:40 -0800)]
libnetlink: break up dump function

Indentation is deep here.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoIntroduce ip vrf command
David Ahern [Mon, 12 Dec 2016 00:53:15 +0000 (16:53 -0800)]
Introduce ip vrf command

'ip vrf' follows the user semantics established by 'ip netns'.

The 'ip vrf' subcommand supports 3 usages:

1. Run a command against a given vrf:
       ip vrf exec NAME CMD

   Uses the recently committed cgroup/sock BPF option. vrf directory
   is added to cgroup2 mount. Individual vrfs are created under it. BPF
   filter attached to vrf/NAME cgroup2 to set sk_bound_dev_if to the VRF
   device index. From there the current process (ip's pid) is added to
   the cgroup.procs file and the given command is executed. In doing so
   all AF_INET/AF_INET6 (ipv4/ipv6) sockets are automatically bound to
   the VRF domain.

   The association is inherited parent to child allowing the command to
   be a shell from which other commands are run relative to the VRF.

2. Show the VRF a process is bound to:
       ip vrf id
   This command essentially looks at /proc/pid/cgroup for a "::/vrf/"
   entry with the VRF name following.

3. Show process ids bound to a VRF
       ip vrf pids NAME
   This command dumps the file MNT/vrf/NAME/cgroup.procs since that file
   shows the process ids in the particular vrf cgroup.
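The lookup behind 'ip vrf id' can be sketched as a string search on one
line of /proc/pid/cgroup. The helper name is hypothetical and the sketch
works on a line already read from the file; it is not the code from this
patch.

```c
#include <string.h>

/* Extract the VRF name that follows "::/vrf/" in one cgroup line.
 * Returns 0 on success, -1 if the line has no such entry. */
static int vrf_name_from_cgroup_line(const char *line, char *name, size_t len)
{
    static const char marker[] = "::/vrf/";
    const char *p = strstr(line, marker);
    size_t n;

    if (!p)
        return -1;
    p += sizeof(marker) - 1;        /* skip past the marker */
    n = strcspn(p, "/\n");          /* name ends at '/' or end of line */
    if (n == 0 || n >= len)
        return -1;
    memcpy(name, p, n);
    name[n] = '\0';
    return 0;
}
```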

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agolibnetlink: Add variant of rtnl_talk that does not display RTNETLINK answers error
David Ahern [Mon, 12 Dec 2016 00:53:14 +0000 (16:53 -0800)]
libnetlink: Add variant of rtnl_talk that does not display RTNETLINK answers error

iplink_vrf has 2 functions used to validate a user given device name is
a VRF device and to return the table id. If the user string is not a
device name ip commands with a vrf keyword show a confusing error
message: "RTNETLINK answers: No such device".

Add a variant of rtnl_talk that does not display the "RTNETLINK answers"
message and update iplink_vrf to use it.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agochange name_is_vrf to return index
David Ahern [Mon, 12 Dec 2016 00:53:13 +0000 (16:53 -0800)]
change name_is_vrf to return index

An index of 0 means the name is not a valid vrf.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agoAdd filesystem APIs to lib
David Ahern [Mon, 12 Dec 2016 00:53:12 +0000 (16:53 -0800)]
Add filesystem APIs to lib

Add make_path to recursively call mkdir as needed to create a given
path with the given mode.

Add find_cgroup2_mount to lookup path where cgroup2 is mounted. If it
is not already mounted, cgroup2 is mounted under /var/run/cgroup2 for
use by iproute2.
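One plausible way to find an existing cgroup2 mount is to scan
/proc/mounts-style lines for the cgroup2 filesystem type. The sketch
below handles a single line; the function name and the choice of
/proc/mounts as the source are assumptions, not details taken from this
commit message.

```c
#include <stdio.h>
#include <string.h>

/* Parse one "device mountpoint fstype options ..." line.
 * If the type is cgroup2, copy the mount point to path and return 1. */
static int cgroup2_mount_from_line(const char *line, char *path, size_t len)
{
    char mnt[256], fstype[64];

    /* Skip the device field, keep mount point and filesystem type. */
    if (sscanf(line, "%*s %255s %63s", mnt, fstype) != 2)
        return 0;
    if (strcmp(fstype, "cgroup2") != 0 || strlen(mnt) >= len)
        return 0;
    strcpy(path, mnt);
    return 1;
}
```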

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agomove cmd_exec to lib utils
David Ahern [Mon, 12 Dec 2016 00:53:11 +0000 (16:53 -0800)]
move cmd_exec to lib utils

Code move only; no functional change intended.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agobpf: Add BPF_ macros
David Ahern [Mon, 12 Dec 2016 00:53:10 +0000 (16:53 -0800)]
bpf: Add BPF_ macros

Based on version in kernel repo, samples/bpf/libbpf.h

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
7 years agobpf: export bpf_prog_load
David Ahern [Mon, 12 Dec 2016 00:53:09 +0000 (16:53 -0800)]
bpf: export bpf_prog_load

Code move only; no functional change intended.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
7 years agolib bpf: Add support for BPF_PROG_ATTACH and BPF_PROG_DETACH
David Ahern [Mon, 12 Dec 2016 00:53:08 +0000 (16:53 -0800)]
lib bpf: Add support for BPF_PROG_ATTACH and BPF_PROG_DETACH

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
7 years agotc: tunnel_key: Add tc-tunnel_key man page to Makefile
Roi Dayan [Tue, 13 Dec 2016 12:39:02 +0000 (14:39 +0200)]
tc: tunnel_key: Add tc-tunnel_key man page to Makefile

To be installed with the other man pages.

Fixes: d57639a475a9 ("tc/act_tunnel: Introduce ip tunnel action")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Amir Vadai <amir@vadai.me>
7 years agotc: flower: Fix typo and style in flower man page
Roi Dayan [Tue, 13 Dec 2016 12:39:01 +0000 (14:39 +0200)]
tc: flower: Fix typo and style in flower man page

Replace vlan_eth_type with vlan_ethtype.

Fixes: 745d91726006 ("tc: flower: Introduce vlan support")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
7 years agotc/m_tunnel_key: Add dest UDP port to tunnel key action
Hadar Hen Zion [Tue, 13 Dec 2016 08:07:47 +0000 (10:07 +0200)]
tc/m_tunnel_key: Add dest UDP port to tunnel key action

Enhance tunnel key action parameters by adding destination UDP port.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
7 years agotc/cls_flower: Add dest UDP port to tunnel params
Hadar Hen Zion [Tue, 13 Dec 2016 08:07:46 +0000 (10:07 +0200)]
tc/cls_flower: Add dest UDP port to tunnel params

Enhance IP tunnel parameters by adding destination UDP port.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
7 years agolwtunnel: style cleanup
Stephen Hemminger [Mon, 12 Dec 2016 23:37:00 +0000 (15:37 -0800)]
lwtunnel: style cleanup

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agolwt: BPF support for LWT
Thomas Graf [Mon, 12 Dec 2016 00:14:35 +0000 (01:14 +0100)]
lwt: BPF support for LWT

Adds support to configure BPF programs as nexthop actions via the LWT
framework.

Example:
   ip route add 192.168.253.2/32 \
     encap bpf out obj lwt_len_hist_kern.o section len_hist \
     dev veth0

Signed-off-by: Thomas Graf <tgraf@suug.ch>
7 years agoupdate to net-next headers (pre 4.10 rc)
Stephen Hemminger [Mon, 12 Dec 2016 23:26:34 +0000 (15:26 -0800)]
update to net-next headers (pre 4.10 rc)

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoMerge branch 'master' into net-next
Stephen Hemminger [Mon, 12 Dec 2016 23:24:40 +0000 (15:24 -0800)]
Merge branch 'master' into net-next

7 years agov4.9.0
Stephen Hemminger [Mon, 12 Dec 2016 23:07:42 +0000 (15:07 -0800)]
v4.9.0

7 years agoupdate to 4.9 release headers
Stephen Hemminger [Mon, 12 Dec 2016 23:05:59 +0000 (15:05 -0800)]
update to 4.9 release headers

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoMakefile: really suppress printing of directories
David Ahern [Wed, 7 Dec 2016 20:55:09 +0000 (12:55 -0800)]
Makefile: really suppress printing of directories

Makefile adds --no-print-directory to MAKEFLAGS if VERBOSE is not
defined; however, Config always defines VERBOSE. Update the check to
test whether VERBOSE is 0.

Fixes: 57bdf8b76451 ("Make builds default to quiet mode")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
7 years agotc: flower: support matching on ICMP type and code
Simon Horman [Wed, 7 Dec 2016 13:54:03 +0000 (14:54 +0100)]
tc: flower: support matching on ICMP type and code

Support matching on ICMP type and code.

Example usage:

tc qdisc add dev eth0 ingress

tc filter add dev eth0 protocol ip parent ffff: flower \
indev eth0 ip_proto icmp type 8 code 0 action drop

tc filter add dev eth0 protocol ipv6 parent ffff: flower \
indev eth0 ip_proto icmpv6 type 128 code 0 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
7 years agotc: flower: introduce enum flower_endpoint
Simon Horman [Wed, 7 Dec 2016 13:54:02 +0000 (14:54 +0100)]
tc: flower: introduce enum flower_endpoint

Introduce enum flower_endpoint and use it instead of a bool
as the type for parameterising source and destination.

This is intended to improve readability and provide some type
checking of endpoint parameters.
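The readability gain can be illustrated with a tiny sketch. The enum and
function names below are hypothetical and chosen for the example; they
are not claimed to match the identifiers introduced by this patch.

```c
/* Hypothetical enum replacing a bare bool "is this the source side?". */
enum demo_endpoint {
    DEMO_ENDPOINT_SRC,
    DEMO_ENDPOINT_DST
};

/* The call site now names the endpoint instead of passing true/false. */
static const char *demo_port_key(enum demo_endpoint ep)
{
    return ep == DEMO_ENDPOINT_SRC ? "src_port" : "dst_port";
}
```

A call such as demo_port_key(DEMO_ENDPOINT_DST) documents itself, whereas
demo_port_key(false) would force the reader to look up the parameter.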

Signed-off-by: Simon Horman <simon.horman@netronome.com>
7 years agobpf: add initial support for attaching xdp progs
Daniel Borkmann [Tue, 6 Dec 2016 01:21:57 +0000 (02:21 +0100)]
bpf: add initial support for attaching xdp progs

Now that we made the BPF loader generic as a library, reuse it
for loading XDP programs as well. This basically adds a minimal
start of a facility for iproute2 to load XDP programs. There
currently only exists the xdp1_user.c sample code in the kernel
tree that sets up netlink directly and an iovisor/bcc front-end.

Since we have all the necessary infrastructure in place already
from tc side, we can just reuse its loader back-end and thus
facilitate migration and usability among the two for people
familiar with tc/bpf already. Sharing maps, performing tail calls,
etc works the same way as with tc. Naturally, once kernel
configuration API evolves, we will extend new features for XDP
here as well, resp. extend dumping of related netlink attributes.

Minimal example:

  clang -target bpf -O2 -Wall -c prog.c -o prog.o
  ip [-force] link set dev em1 xdp obj prog.o       # attaching
  ip [-d] link                                      # dumping
  ip link set dev em1 xdp off                       # detaching

For the dump, intention is that in the first line for each ip
link entry, we'll see "xdp" to indicate that this device has an
XDP program attached. Once we dump some more useful information
via netlink (digest, etc), idea is that 'ip -d link' will then
display additional relevant program information below the "link/
ether [...]" output line for such devices, for example.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
7 years agobpf: check for owner_prog_type and notify users when differ
Daniel Borkmann [Tue, 6 Dec 2016 01:17:58 +0000 (02:17 +0100)]
bpf: check for owner_prog_type and notify users when differ

Kernel commit 21116b7068b9 ("bpf: add owner_prog_type and accounted mem
to array map's fdinfo") added support for telling the owner prog type in
case of prog arrays. Give a notification to the user when they differ,
and the program eventually fails to load.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
7 years agobpf: Fix number of retries when growing log buffer
Thomas Graf [Wed, 7 Dec 2016 09:47:59 +0000 (10:47 +0100)]
bpf: Fix number of retries when growing log buffer

The log buffer is automatically grown when the verifier output does not
fit into the default buffer size. The number of growing attempts was
not sufficient to reach the maximum buffer size so far.

Perform 9 iterations to reach max and let the 10th one fail.

j:0     i:65536         max:16777215
j:1     i:131072        max:16777215
j:2     i:262144        max:16777215
j:3     i:524288        max:16777215
j:4     i:1048576       max:16777215
j:5     i:2097152       max:16777215
j:6     i:4194304       max:16777215
j:7     i:8388608       max:16777215
j:8     i:16777216      max:16777215
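The arithmetic behind the table can be reproduced with a short sketch.
It assumes the buffer starts at 65536 bytes, doubles on each retry and is
clamped to the maximum; the clamping and the function names are
assumptions for illustration, not the loader's actual code.

```c
#include <stddef.h>

#define LOG_START 65536u
#define LOG_MAX   16777215u   /* value shown in the "max" column above */

/* Next buffer size to try, or 0 when the buffer cannot grow any further. */
static size_t next_log_size(size_t cur)
{
    if (cur == 0)
        return LOG_START;
    if (cur >= LOG_MAX)
        return 0;
    cur <<= 1;
    return cur > LOG_MAX ? LOG_MAX : cur;
}

/* Count how many sizes are tried before growth fails. */
static int count_log_attempts(void)
{
    size_t size = 0;
    int attempts = 0;

    while ((size = next_log_size(size)) != 0)
        attempts++;
    return attempts;
}
```

Under these assumptions nine sizes are tried, matching rows j:0 to j:8,
and the tenth request to grow fails.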

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
7 years agodevlink: Add option to set and show eswitch inline mode
Roi Dayan [Sun, 27 Nov 2016 11:21:03 +0000 (13:21 +0200)]
devlink: Add option to set and show eswitch inline mode

This is needed for some HWs to do proper matching and steering.
Possible values are none, link, network, transport.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
7 years agodevlink: Add usage help for eswitch subcommand
Roi Dayan [Sun, 27 Nov 2016 11:21:02 +0000 (13:21 +0200)]
devlink: Add usage help for eswitch subcommand

Add missing usage help for devlink dev eswitch subcommand.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
7 years agoupdate kernel headers from net-next
Stephen Hemminger [Fri, 9 Dec 2016 20:39:39 +0000 (12:39 -0800)]
update kernel headers from net-next

Net-next now closed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 9 Dec 2016 20:38:51 +0000 (12:38 -0800)]
Merge branch 'master' into net-next

7 years agoupdate kernel headers
Stephen Hemminger [Fri, 9 Dec 2016 20:38:35 +0000 (12:38 -0800)]
update kernel headers

7 years agoRevert "devlink: Add usage help for eswitch subcommand"
Stephen Hemminger [Fri, 9 Dec 2016 20:37:39 +0000 (12:37 -0800)]
Revert "devlink: Add usage help for eswitch subcommand"

This reverts commit 11f4cd31d2776bbffecceb6775d0210fe16cc04e.

7 years agoRevert "devlink: Add option to set and show eswitch inline mode"
Stephen Hemminger [Fri, 9 Dec 2016 20:37:19 +0000 (12:37 -0800)]
Revert "devlink: Add option to set and show eswitch inline mode"

This reverts commit b9dcf9c2826cc193937e5c337dee96a4c111e56a.

Intended for net-next

7 years agotc: flower: make use of flower_port_attr_type() safe and silent
Simon Horman [Sat, 3 Dec 2016 08:52:40 +0000 (09:52 +0100)]
tc: flower: make use of flower_port_attr_type() safe and silent

Make use of flower_port_attr_type() safe:
* flower_port_attr_type() may return a valid index into tb[] or -1.
  Only access tb[] in the case of the former.
* Do not access null entries in tb[]

Also make usage silent - it is valid for ip_proto to be invalid,
for example if it is not specified as part of the filter.
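The access pattern described above can be sketched generically. All
names and index values below are hypothetical and stand in for the real
flower helpers; the sketch only shows the two checks: a valid index
first, then a non-NULL table entry.

```c
#include <stddef.h>

#define DEMO_ATTR_MAX 8

/* Hypothetical lookup: map a protocol to an attribute index, or -1. */
static int demo_port_attr_type(int ip_proto)
{
    switch (ip_proto) {
    case 6:   return 1;   /* TCP */
    case 17:  return 2;   /* UDP */
    case 132: return 3;   /* SCTP */
    default:  return -1;  /* not specified or unsupported: not an error */
    }
}

/* Return the attribute for ip_proto, or NULL when the index is invalid
 * or the table slot is empty. No message is printed in either case. */
static const void *demo_port_attr(const void *tb[DEMO_ATTR_MAX + 1],
                                  int ip_proto)
{
    int idx = demo_port_attr_type(ip_proto);

    if (idx < 0 || idx > DEMO_ATTR_MAX)
        return NULL;
    return tb[idx];       /* may be NULL; callers must check */
}
```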

Fixes: a1fb0d484237 ("tc: flower: Support matching on SCTP ports")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
7 years agotc: flower: correct name of ip_proto parameter to flower_parse_port()
Simon Horman [Sat, 3 Dec 2016 08:52:39 +0000 (09:52 +0100)]
tc: flower: correct name of ip_proto parameter to flower_parse_port()

This corrects a typo.

Fixes: a1fb0d484237 ("tc: flower: Support matching on SCTP ports")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
7 years agotc: flower: document SCTP ip_proto
Simon Horman [Sat, 3 Dec 2016 08:52:38 +0000 (09:52 +0100)]
tc: flower: document SCTP ip_proto

Add SCTP ip_proto to help text and man page.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
7 years agotc: flower: remove references to eth_type in manpage
Simon Horman [Fri, 2 Dec 2016 22:59:43 +0000 (14:59 -0800)]
tc: flower: remove references to eth_type in manpage

Remove references to eth_type and ether_type (spelling error) in
the tc flower manpage.

Also correct formatting of boldface text with whitespace.

Cc: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoupdate kernel headers from net-next
Stephen Hemminger [Fri, 2 Dec 2016 22:54:33 +0000 (14:54 -0800)]
update kernel headers from net-next

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 2 Dec 2016 22:19:08 +0000 (14:19 -0800)]
Merge branch 'master' into net-next

7 years agoss: initialise variables outside of for loop
Simon Horman [Fri, 2 Dec 2016 11:56:05 +0000 (12:56 +0100)]
ss: initialise variables outside of for loop

Initialise for loop variables outside of for loops. GCC flags declaring
them inside the loop header as being out of spec unless C99 or C11 mode
is used.

With this change the entire tree appears to compile cleanly with -Wall.

$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2
...
$ make
...
ss.c: In function ‘unix_show_sock’:
ss.c:3128:4: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
...
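The portable form looks like the sketch below (a made-up function, not
code from ss.c). Writing 'for (int i = 0; ...)' instead is what triggers
the error above on compilers that default to a pre-C99 mode.

```c
/* Sum the first n elements. The loop counter is declared before the
 * loop, which is valid in C89 as well as in C99 and later. */
static int sum_ints(const int *v, int n)
{
    int i;
    int total = 0;

    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```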

Signed-off-by: Simon Horman <simon.horman@netronome.com>
7 years agotc/act_tunnel: Introduce ip tunnel action
Amir Vadai [Fri, 2 Dec 2016 11:25:15 +0000 (13:25 +0200)]
tc/act_tunnel: Introduce ip tunnel action

This action could be used before redirecting packets to a shared tunnel
device, or when redirecting packets arriving from such a device.

The 'unset' action is optional. It is used to explicitly unset the
metadata created by the tunnel device during decap. If not used, the
metadata will be released automatically by the kernel.
The 'set' operation will set the metadata with the specified values for
the encap.

For example, the following flower filter will forward all ICMP packets
destined to 11.11.11.2 through the shared vxlan device 'vxlan0'. Before
redirecting, metadata for the vxlan tunnel is created using the
tunnel_key action and its arguments:

$ tc filter add dev net0 protocol ip parent ffff: \
    flower \
      ip_proto 1 \
      dst_ip 11.11.11.2 \
    action tunnel_key set \
      src_ip 11.11.0.1 \
      dst_ip 11.11.0.2 \
      id 11 \
    action mirred egress redirect dev vxlan0

Signed-off-by: Amir Vadai <amir@vadai.me>
7 years agotc/cls_flower: Classify packet in ip tunnels
Amir Vadai [Fri, 2 Dec 2016 11:25:14 +0000 (13:25 +0200)]
tc/cls_flower: Classify packet in ip tunnels

Introduce classifying by metadata extracted by the tunnel device.
Outer header fields - source/dest ip and tunnel id, are extracted from
the metadata when classifying.

For example, the following adds a filter on the ingress Qdisc of the shared
vxlan device named 'vxlan0' to forward packets with outer src ip
11.11.0.2, dst ip 11.11.0.1 and tunnel id 11. The packets will be
forwarded to tap device 'vnet0':

$ tc filter add dev vxlan0 protocol ip parent ffff: \
    flower \
      enc_src_ip 11.11.0.2 \
      enc_dst_ip 11.11.0.1 \
      enc_key_id 11 \
      dst_ip 11.11.11.1 \
    action mirred egress redirect dev vnet0

Signed-off-by: Amir Vadai <amir@vadai.me>
7 years agolibnetlink: Introduce rta_getattr_be*()
Amir Vadai [Fri, 2 Dec 2016 11:25:13 +0000 (13:25 +0200)]
libnetlink: Introduce rta_getattr_be*()

Add the utility functions rta_getattr_be16() and rta_getattr_be32(), and
change existing code to use it.
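What such helpers do can be sketched on a raw payload pointer. This is
an illustration with made-up names operating on plain bytes; it does not
use struct rtattr and is not the libnetlink implementation.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 16-bit value from an attribute payload. */
static uint16_t get_be16(const void *payload)
{
    uint16_t v;

    memcpy(&v, payload, sizeof(v));   /* avoids unaligned access */
    return ntohs(v);                  /* network to host byte order */
}

/* Read a big-endian 32-bit value from an attribute payload. */
static uint32_t get_be32(const void *payload)
{
    uint32_t v;

    memcpy(&v, payload, sizeof(v));
    return ntohl(v);
}
```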

Signed-off-by: Amir Vadai <amir@vadai.me>
7 years agoss: unix_show: No need to initialize members of calloc'ed structs
Phil Sutter [Fri, 2 Dec 2016 10:40:02 +0000 (11:40 +0100)]
ss: unix_show: No need to initialize members of calloc'ed structs

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Make sstate_namel local to scan_state()
Phil Sutter [Fri, 2 Dec 2016 10:40:01 +0000 (11:40 +0100)]
ss: Make sstate_namel local to scan_state()

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Make sstate_name local to sock_state_print()
Phil Sutter [Fri, 2 Dec 2016 10:40:00 +0000 (11:40 +0100)]
ss: Make sstate_name local to sock_state_print()

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Make unix_state_map local to unix_show()
Phil Sutter [Fri, 2 Dec 2016 10:39:59 +0000 (11:39 +0100)]
ss: Make unix_state_map local to unix_show()

Also make it const, since there won't be any write access happening.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Get rid of single-fielded struct snmpstat
Phil Sutter [Fri, 2 Dec 2016 10:39:58 +0000 (11:39 +0100)]
ss: Get rid of single-fielded struct snmpstat

A struct with only a single field does not make much sense. Besides
that, it was used by print_summary() only.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Get rid of useless goto in handle_follow_request()
Phil Sutter [Fri, 2 Dec 2016 10:39:57 +0000 (11:39 +0100)]
ss: Get rid of useless goto in handle_follow_request()

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Make slabstat_ids local to get_slabstat()
Phil Sutter [Fri, 2 Dec 2016 10:39:56 +0000 (11:39 +0100)]
ss: Make slabstat_ids local to get_slabstat()

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Make some variables function-local
Phil Sutter [Fri, 2 Dec 2016 10:39:55 +0000 (11:39 +0100)]
ss: Make some variables function-local

addrp_width and screen_width are used in main() only, so no need to have
them globally available.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Make user_ent_hash_build_init local to user_ent_hash_build()
Phil Sutter [Fri, 2 Dec 2016 10:39:54 +0000 (11:39 +0100)]
ss: Make user_ent_hash_build_init local to user_ent_hash_build()

By having it statically defined, there is no need for it to be global.
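The idea can be shown with a function-local static flag. The names are
hypothetical, and the counter exists only to make the effect observable
in this sketch.

```c
/* Counts real builds; for illustration only. */
static int demo_build_calls;

/* Run the setup at most once. The flag is visible only inside this
 * function but keeps its value across calls because it is static. */
static void demo_hash_build(void)
{
    static int build_init;   /* zero-initialised on program start */

    if (build_init)
        return;
    build_init = 1;
    demo_build_calls++;
}
```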

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Make tmr_name local to tcp_timer_print()
Phil Sutter [Fri, 2 Dec 2016 10:39:53 +0000 (11:39 +0100)]
ss: Make tmr_name local to tcp_timer_print()

It's used only there, so no need to have it globally defined.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Turn generic_proc_open() wrappers into macros
Phil Sutter [Fri, 2 Dec 2016 10:39:52 +0000 (11:39 +0100)]
ss: Turn generic_proc_open() wrappers into macros

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Eliminate unix_use_proc()
Phil Sutter [Fri, 2 Dec 2016 10:39:51 +0000 (11:39 +0100)]
ss: Eliminate unix_use_proc()

This function is now used in only a single place, so replace the
call to it with its content, which makes that specific part of
unix_show() consistent with e.g. tcp_show().

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Drop list traversal from unix_stats_print()
Phil Sutter [Fri, 2 Dec 2016 10:39:50 +0000 (11:39 +0100)]
ss: Drop list traversal from unix_stats_print()

Although this complicates the dedicated procfs-based code path in
unix_show() a bit, it's the only sane way to get rid of unix_show_sock()
output diverging from other socket types in that it prints all socket
details in a new line.

As a side effect, it allows eliminating all procfs-specific code in
the same function.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: introduce proc_ctx_print()
Phil Sutter [Fri, 2 Dec 2016 10:39:49 +0000 (11:39 +0100)]
ss: introduce proc_ctx_print()

This consolidates identical code in three places. While the function
name is not quite perfect as there is different proc_ctx printing code
in netlink_show_one() as well, I sadly didn't find a more suitable one.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Use sockstat->type in all socket types
Phil Sutter [Fri, 2 Dec 2016 10:39:48 +0000 (11:39 +0100)]
ss: Use sockstat->type in all socket types

Unix sockets already used that field to hold info about the socket type.
By replicating this approach in all other socket types, we can get rid
of the protocol parameter in inet_stats_print() and have
sock_state_print() figure things out by itself.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Add missing tab when printing UNIX details
Phil Sutter [Fri, 2 Dec 2016 10:39:47 +0000 (11:39 +0100)]
ss: Add missing tab when printing UNIX details

When dumping UNIX sockets and show_details is active but not show_mem
(ss -xne), the socket details are printed without being prefixed by tab.
Fix this by printing the tab character when either one of '-e' or '-m'
has been specified.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Drop empty lines in UDP output
Phil Sutter [Fri, 2 Dec 2016 10:39:46 +0000 (11:39 +0100)]
ss: Drop empty lines in UDP output

When dumping UDP sockets and show_tcpinfo (-i) is active but not
show_mem (-m), print_tcpinfo() does not output anything leading to an
empty line being printed after every socket. Fix this by skipping the
call to print_tcpinfo() and the previous newline printing in that case.

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: Mark fall through in arg parsing switch()
Phil Sutter [Fri, 2 Dec 2016 10:39:45 +0000 (11:39 +0100)]
ss: Mark fall through in arg parsing switch()

As there is a certain chance of overlooking this, better add a comment
to draw readers' attention.
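
A minimal sketch of what such a marker looks like, using a hypothetical
option handler rather than the real ss argument parser: here 'V' is
assumed to imply 'v', so the missing break is intentional and the
comment says so.

```c
/* Hypothetical option flags, for illustration only. */
struct demo_opts {
	int verbose;
	int very_verbose;
};

/* Apply one parsed option character to the option set. */
void demo_apply_opt(struct demo_opts *o, int opt)
{
	switch (opt) {
	case 'V':
		o->very_verbose = 1;
		/* fall through */
	case 'v':
		o->verbose = 1;
		break;
	default:
		break;
	}
}
```

Besides helping readers, compilers that warn about implicit fall
through (for example GCC with -Wimplicit-fallthrough) recognize this
kind of comment.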

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoss: print new tcp_info fields: busy, rwnd-limited, sndbuf-limited times
Yuchung Cheng [Thu, 1 Dec 2016 18:21:40 +0000 (13:21 -0500)]
ss: print new tcp_info fields: busy, rwnd-limited, sndbuf-limited times

Dump some new fields added to tcp_info in v4.10: tcpi_busy_time,
tcpi_rwnd_limited, tcpi_sndbuf_limited.

Example output for a flow busy for 110ms but never measurably limited by
receive window or send buffer:
   busy:110ms

Example output for a flow usually limited by receive window:
   busy:111ms rwnd_limited:101ms(91.0%)

Example output for a flow sometimes limited by send buffer:
   busy:50ms sndbuf_limited:10ms(20.0%)
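
The percentages in these examples are the limited time divided by the
busy time. A minimal sketch of that arithmetic, assuming all three
values are already in milliseconds; demo_format_busy() and
demo_append() are hypothetical names, not functions from ss:

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append formatted text to the NUL-terminated string in buf,
 * truncating silently if it does not fit. */
static void demo_append(char *buf, size_t len, const char *fmt, ...)
{
	size_t used = strlen(buf);
	va_list ap;

	if (used >= len)
		return;
	va_start(ap, fmt);
	vsnprintf(buf + used, len - used, fmt, ap);
	va_end(ap);
}

/* Build a "busy:..." string; limited times that are zero are omitted. */
void demo_format_busy(char *buf, size_t len, unsigned long long busy_ms,
		      unsigned long long rwnd_ms,
		      unsigned long long sndbuf_ms)
{
	if (len == 0)
		return;
	buf[0] = '\0';
	if (busy_ms == 0)
		return;	/* nothing to report, and avoids dividing by zero */

	demo_append(buf, len, "busy:%llums", busy_ms);
	if (rwnd_ms)
		demo_append(buf, len, " rwnd_limited:%llums(%.1f%%)",
			    rwnd_ms, 100.0 * rwnd_ms / busy_ms);
	if (sndbuf_ms)
		demo_append(buf, len, " sndbuf_limited:%llums(%.1f%%)",
			    sndbuf_ms, 100.0 * sndbuf_ms / busy_ms);
}
```

With busy_ms = 111 and rwnd_ms = 101 the ratio is 101/111, which
rounds to 91.0%; with busy_ms = 50 and sndbuf_ms = 10 it is 20.0%.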

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
7 years agoss: print new tcp_info fields: delivery_rate and app_limited
Neal Cardwell [Thu, 1 Dec 2016 18:21:39 +0000 (13:21 -0500)]
ss: print new tcp_info fields: delivery_rate and app_limited

Dump the new delivery_rate and delivery_rate_app_limited fields that
were added to tcp_info in Linux v4.9.

Example output:
  pacing_rate 65.7Mbps delivery_rate 62.9Mbps

And for the application-limited case this looks like:
  pacing_rate 1031.1Mbps delivery_rate 87.4Mbps app_limited

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
7 years agoss: Add inet raw sockets information gathering via netlink diag interface
Cyrill Gorcunov [Wed, 2 Nov 2016 13:14:56 +0000 (16:14 +0300)]
ss: Add inet raw sockets information gathering via netlink diag interface

unix, tcp, udp[lite], packet and netlink sockets already support the
diag interface for collecting and killing them. Implement support
for raw sockets.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
7 years agolibnetlink: Add test for error code returned from netlink reply
Cyrill Gorcunov [Wed, 2 Nov 2016 13:14:55 +0000 (16:14 +0300)]
libnetlink: Add test for error code returned from netlink reply

If some diag module is not present in the system, for example because
the kernel is not modern enough, we simply ignore the error code
reported. Instead we should check the data length in NLMSG_DONE and
handle the unsupported case.
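
A minimal sketch of such a check, not the code from this commit. It
assumes the NLMSG_DONE payload, when present, starts with an int that
is negative on error; demo_dump_done_error() is a hypothetical helper
name.

```c
#include <errno.h>
#include <string.h>
#include <linux/netlink.h>

/*
 * Inspect the NLMSG_DONE message that terminates a dump.
 * Returns 0 on clean completion, or a negative errno value if the
 * kernel reported one (e.g. -ENOENT when a diag module is missing).
 */
int demo_dump_done_error(const struct nlmsghdr *h)
{
	int err;

	if (h->nlmsg_type != NLMSG_DONE)
		return -EINVAL;

	/* Too short to carry a status word: nothing to report. */
	if (h->nlmsg_len < NLMSG_LENGTH(sizeof(int)))
		return 0;

	/* memcpy avoids assumptions about payload alignment. */
	memcpy(&err, NLMSG_DATA(h), sizeof(err));
	return err < 0 ? err : 0;
}
```

A caller could then treat values such as -ENOENT or -EOPNOTSUPP as
"not supported by this kernel" instead of silently reporting an empty
dump.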

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
7 years agoUpdate kernel headers for XDP and tcp_info
Stephen Hemminger [Thu, 1 Dec 2016 18:52:30 +0000 (10:52 -0800)]
Update kernel headers for XDP and tcp_info

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
7 years agoMerge branch 'master' into net-next
Stephen Hemminger [Thu, 1 Dec 2016 18:48:05 +0000 (10:48 -0800)]
Merge branch 'master' into net-next

7 years agoman: ip-route.8: Add notes about dropped IPv4 route cache
Phil Sutter [Wed, 30 Nov 2016 08:29:48 +0000 (09:29 +0100)]
man: ip-route.8: Add notes about dropped IPv4 route cache

Signed-off-by: Phil Sutter <phil@nwl.cc>
7 years agoMerge branch 'master' into net-next
Stephen Hemminger [Thu, 1 Dec 2016 18:29:12 +0000 (10:29 -0800)]
Merge branch 'master' into net-next

7 years agodevlink: Add option to set and show eswitch inline mode
Roi Dayan [Sun, 27 Nov 2016 11:21:03 +0000 (13:21 +0200)]
devlink: Add option to set and show eswitch inline mode

This is needed for some hardware to do proper matching and steering.
Possible values are none, link, network, transport.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
7 years agodevlink: Add usage help for eswitch subcommand
Roi Dayan [Sun, 27 Nov 2016 11:21:02 +0000 (13:21 +0200)]
devlink: Add usage help for eswitch subcommand

Add missing usage help for devlink dev eswitch subcommand.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
7 years agolink: add team and team_slave link type
Zhang Shengju [Fri, 25 Nov 2016 14:01:29 +0000 (22:01 +0800)]
link: add team and team_slave link type

Add missing team and team_slave link type.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
7 years agol2tp: style cleanup
Stephen Hemminger [Tue, 29 Nov 2016 21:40:06 +0000 (13:40 -0800)]
l2tp: style cleanup

Make l2tp conform to kernel style guidelines