git.proxmox.com Git - mirror_iproute2.git/log
mirror_iproute2.git
8 years agoman: ip-link: Add vrf type
David Ahern [Tue, 21 Jun 2016 23:29:01 +0000 (16:29 -0700)]
man: ip-link: Add vrf type

Add description for vrf type to ip-link man page.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
8 years agobridge: man: fix "brige" typo
Vivien Didelot [Tue, 21 Jun 2016 19:28:50 +0000 (15:28 -0400)]
bridge: man: fix "brige" typo

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
8 years agobridge: vlan: fix a few "fdb" typos in vlan doc
Vivien Didelot [Tue, 21 Jun 2016 19:28:26 +0000 (15:28 -0400)]
bridge: vlan: fix a few "fdb" typos in vlan doc

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
8 years agoFix MAC address length check
Phil Sutter [Wed, 22 Jun 2016 10:05:38 +0000 (12:05 +0200)]
Fix MAC address length check

I forgot to change the variable in the conditional, too.

Fixes: 8fe58d58941f4 ("iplink: Check address length via netlink")
Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agoman: ip-address, ip-link: Document 'type' quirk
Phil Sutter [Fri, 24 Jun 2016 10:14:23 +0000 (12:14 +0200)]
man: ip-address, ip-link: Document 'type' quirk

This covers the fact that calling 'ip {link|addr} show type foobar' does
not return an error.

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agoif: add missing kernel headers
Stephen Hemminger [Tue, 21 Jun 2016 18:24:52 +0000 (11:24 -0700)]
if: add missing kernel headers

Add kernel headers for all headers that are included by the current source.

8 years agoiplink: Check address length via netlink
Phil Sutter [Thu, 16 Jun 2016 14:19:40 +0000 (16:19 +0200)]
iplink: Check address length via netlink

This is a feature which was lost during the conversion to netlink
interface: If the device exists and a user tries to change the link
layer address, query the kernel for the old address first and reject the
new one if sizes differ.

This patch adds the same check when setting VF address by assuming same
length as PF device.

Note that at least for VFs the check can't be done in kernel space since
struct ifla_vf_mac lacks a length field and due to netlink padding the
exact size can't be communicated to the kernel.
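
A minimal sketch of the user-space length check described above. The function name `demo_check_lladdr_len` and its parameters are hypothetical and are not taken from iplink.c; the real code queries the kernel over netlink for the current address, which is replaced here by a plain length argument.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: compare the length of the new link-layer address
 * with the length of the address currently set on the device (or, for a
 * VF, on its PF). Returns 0 when the lengths match, -1 otherwise.
 */
int demo_check_lladdr_len(size_t current_len, size_t new_len)
{
	if (current_len != new_len) {
		fprintf(stderr,
			"Invalid address length %zu - must be %zu bytes\n",
			new_len, current_len);
		return -1;
	}
	return 0;
}
```

With an Ethernet device (6-byte address), `demo_check_lladdr_len(6, 6)` accepts the new address, while an 8-byte address is rejected before anything is sent to the kernel.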

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agoiplink: Add missing variable initialization
Phil Sutter [Thu, 16 Jun 2016 14:19:39 +0000 (16:19 +0200)]
iplink: Add missing variable initialization

Without this, we might feed garbage to the kernel when the address is
shorter than expected.

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agoss: Add tcp_info fields data_segs_in/out
Martin KaFai Lau [Sat, 18 Jun 2016 00:38:53 +0000 (17:38 -0700)]
ss: Add tcp_info fields data_segs_in/out

tcp_info fields, data_segs_in and data_segs_out, have been added to the
kernel in commit a44d6eacdaf5 ("tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In")
since kernel 4.6.

This patch supports those fields in ss:

ESTAB      801736 360                            face:face:face:face::1:22                                      face:face:face:face::face:46779
         cubic wscale:9,7 rto:223 rtt:22.195/8.202 ato:40 mss:1428 cwnd:11 ssthresh:7 bytes_acked:203649 bytes_received:334034603 segs_out:18513 segs_in:241825 data_segs_out:4192 data_segs_in:241672 send 5.7Mbps lastsnd:2 lastack:3 pacing_rate 6.8Mbps unacked:10 retrans:0/1 rcv_rtt:29.375 rcv_space:1241704 minrtt:0.013

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
8 years agotc: m_action: Drop unused variable nladdr in tc_action_gd()
Phil Sutter [Wed, 15 Jun 2016 22:50:39 +0000 (00:50 +0200)]
tc: m_action: Drop unused variable nladdr in tc_action_gd()

This has been there since the introduction of tc/m_action.c back in 2004
and was apparently never in use.

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: m_action: Use C99 style initializers for struct req
Phil Sutter [Wed, 15 Jun 2016 22:50:38 +0000 (00:50 +0200)]
tc: m_action: Use C99 style initializers for struct req

Instead of initializing fields after (or sometimes even before) zeroing
the whole struct via memset(), initialize the whole thing at declaration
time.
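
The difference between the two styles can be sketched as follows. `struct demo_req` is a hypothetical layout that is only loosely modelled on a netlink request; it is not the struct used in tc/m_action.c.

```c
#include <string.h>

/* Hypothetical request layout, loosely modelled on a netlink request. */
struct demo_req {
	unsigned int len;
	unsigned short type;
	unsigned short flags;
	char buf[64];
};

/* Old style: zero the struct with memset(), then assign fields. */
struct demo_req demo_make_req_memset(unsigned short type)
{
	struct demo_req req;

	memset(&req, 0, sizeof(req));
	req.len = sizeof(req);
	req.type = type;
	return req;
}

/* C99 style: designated initializers; members not named are zeroed. */
struct demo_req demo_make_req_c99(unsigned short type)
{
	struct demo_req req = {
		.len = sizeof(req),
		.type = type,
	};

	return req;
}
```

Both functions produce the same field values; the second one cannot accidentally assign a field before the memset() wipes it.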

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: let m_ipt work with new iptables API headers
Alexander Aring [Sun, 29 May 2016 18:27:13 +0000 (20:27 +0200)]
tc: let m_ipt work with new iptables API headers

Since commit 5cd1adb ("Update to current iptables headers") the build
with m_ipt.o and the following config will fail:

TC_CONFIG_XT:=n
TC_CONFIG_XT_OLD:=n
TC_CONFIG_XT_OLD_H:=n

This patch renames "iptables_target" to "xtables_target", along with a few
other identifiers whose renames I noticed while reading the iptables git log.
Functions which are neither used in m_ipt.c nor exported by the header are
removed; those still used in m_ipt.c are made static.

Reported-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Signed-off-by: Alexander Aring <aar@pengutronix.de>
8 years agom_xt: whitespace cleanup
Stephen Hemminger [Tue, 14 Jun 2016 21:40:53 +0000 (14:40 -0700)]
m_xt: whitespace cleanup

Make it 99% checkpatch clean.

8 years agotc: m_xt: Introduce get_xtables_target_opts()
Phil Sutter [Fri, 10 Jun 2016 11:42:08 +0000 (13:42 +0200)]
tc: m_xt: Introduce get_xtables_target_opts()

This pulls common code from parse_ipt() and print_ipt() functions
together.

While here, also fix the incorrect use of the global 'optarg' variable
in print_ipt().

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: m_xt: Simplify argc adjusting in parse_ipt()
Phil Sutter [Fri, 10 Jun 2016 11:42:07 +0000 (13:42 +0200)]
tc: m_xt: Simplify argc adjusting in parse_ipt()

And while at it, also improve the error message in case too few
parameters have been given.

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: m_xt: Get rid of iargc variable in parse_ipt()
Phil Sutter [Fri, 10 Jun 2016 11:42:06 +0000 (13:42 +0200)]
tc: m_xt: Get rid of iargc variable in parse_ipt()

After dropping the unused decrement of argc in the function's tail, it
can fully take over what iargc has been used for.

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: m_xt: Get rid of rargc in parse_ipt()
Phil Sutter [Fri, 10 Jun 2016 11:42:05 +0000 (13:42 +0200)]
tc: m_xt: Get rid of rargc in parse_ipt()

No need to copy the passed parameter, it's changed only once right
before function return.

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: m_xt: Drop unused variable fw in parse_ipt()
Phil Sutter [Fri, 10 Jun 2016 11:42:04 +0000 (13:42 +0200)]
tc: m_xt: Drop unused variable fw in parse_ipt()

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: m_xt: Get rid of one indentation level in parse_ipt()
Phil Sutter [Fri, 10 Jun 2016 11:42:03 +0000 (13:42 +0200)]
tc: m_xt: Get rid of one indentation level in parse_ipt()

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: m_xt: Fix indenting
Phil Sutter [Fri, 10 Jun 2016 11:42:02 +0000 (13:42 +0200)]
tc: m_xt: Fix indenting

By exiting early if xtables_find_target() fails, one indenting level can
be dropped. Some of the wrongly indented code then happens to sit at the
right spot by accident which is why this patch is smaller than expected.

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: m_xt: Fix segfault when adding multiple actions at once
Phil Sutter [Fri, 10 Jun 2016 11:42:01 +0000 (13:42 +0200)]
tc: m_xt: Fix segfault when adding multiple actions at once

Without this, the following call to tc would segfault:

| tc filter add dev d0 parent ffff: u32 match u32 0 0 \
|  action xt -j MARK --set-mark 0x1 \
|  action xt -j MARK --set-mark 0x1

The reason is basically the same as for 6e2e5ec28bad4 ("fix print_ipt:
segfault if more then one filter with action -j MARK.") but in
parse_ipt() instead of print_ipt().

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: m_xt: Prevent segfault with standard targets
Phil Sutter [Fri, 10 Jun 2016 11:42:00 +0000 (13:42 +0200)]
tc: m_xt: Prevent segfault with standard targets

Iptables standard targets like DROP or REJECT don't implement the print
callback in libxtables. Hence the following command would segfault:

| tc filter add dev d0 parent ffff: u32 match u32 0 0 action xt -j DROP

With this patch standard targets still can't be used (and are not really
useful anyway), but at least it doesn't crash anymore.
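
One way to guard against a missing callback is sketched below. `struct demo_target` is a hypothetical, simplified stand-in for a libxtables target, and the actual change in m_xt.c may look different.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for a libxtables target. */
struct demo_target {
	const char *name;
	void (*print)(const struct demo_target *t); /* may be NULL */
};

/* Example callback for a target that does implement printing. */
void demo_mark_print(const struct demo_target *t)
{
	printf("target %s --set-mark 0x1\n", t->name);
}

/*
 * Call the print callback only if the target provides one.
 * Returns 0 on success, -1 if the target cannot be printed.
 */
int demo_print_target(const struct demo_target *t)
{
	if (!t->print) {
		fprintf(stderr, "target %s has no print callback\n", t->name);
		return -1;
	}
	t->print(t);
	return 0;
}
```

A target such as DROP, modelled here with `print` set to NULL, then yields an error return instead of a NULL pointer dereference.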

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agopedit: fix whitespace etc
Stephen Hemminger [Tue, 14 Jun 2016 21:31:37 +0000 (14:31 -0700)]
pedit: fix whitespace etc

Minor changes from checkpatch

8 years agoaction pedit: stylistic changes
Jamal Hadi Salim [Sun, 12 Jun 2016 21:40:34 +0000 (17:40 -0400)]
action pedit: stylistic changes

More modern layout.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
8 years agoman: ip-link: Document query_rss option
Phil Sutter [Fri, 10 Jun 2016 14:39:50 +0000 (16:39 +0200)]
man: ip-link: Document query_rss option

Doc text shamelessly stolen from the introducing commit's message
(6c55c8c4617c5 ['ip link set vf: Added "query_rss" command']).

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agoutils: fix hex digits parsing in hexstring_a2n()
Beniamino Galvani [Tue, 14 Jun 2016 20:55:17 +0000 (22:55 +0200)]
utils: fix hex digits parsing in hexstring_a2n()

strtoul() only modifies errno on overflow, so if errno is not zero
before calling the function its value is preserved and makes the
function fail for valid inputs; initialize it.
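
The pattern can be sketched like this. `demo_parse_hex_byte` is a hypothetical helper, not the body of hexstring_a2n(); it only illustrates why errno has to be cleared before the call.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a hex string whose value fits in one byte.
 * Returns the value (0..255) on success, -1 on failure.
 */
int demo_parse_hex_byte(const char *s)
{
	char *end;
	unsigned long v;

	errno = 0;	/* strtoul() leaves errno untouched on success */
	v = strtoul(s, &end, 16);
	if (errno != 0 || end == s || *end != '\0' || v > 0xff)
		return -1;

	return (int)v;
}
```

Without the `errno = 0` line, a stale error code left behind by an earlier, unrelated call would make the check reject a valid input such as "ff".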

Signed-off-by: Beniamino Galvani <bgalvani@redhat.com>
8 years agoipaddress: Allow listing addresses by type
Phil Sutter [Thu, 9 Jun 2016 17:20:36 +0000 (19:20 +0200)]
ipaddress: Allow listing addresses by type

Not sure why this was limited to ip-link before. It is semantically
equal to the 'master' keyword, which is not restricted at all.

The man page and help text adjustments include the 'master' keyword as
well since that is also supported but wasn't documented before.

Cc: Vadim Kochan <vadim4j@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc: f_u32 cleanup indentation and long lines
Stephen Hemminger [Wed, 8 Jun 2016 23:45:26 +0000 (16:45 -0700)]
tc: f_u32 cleanup indentation and long lines

Several long lines and too long messages here.

8 years agotc: f_u32: Add support for skip_hw and skip_sw flags
Samudrala, Sridhar [Wed, 8 Jun 2016 23:16:01 +0000 (16:16 -0700)]
tc: f_u32: Add support for skip_hw and skip_sw flags

On devices that support TC U32 offloads, these flags enable a filter to be
added only to HW or only to SW. skip_sw and skip_hw are mutually exclusive
flags. By default without any flags, the filter is added to both HW and SW,
but no error checks are done in case of failure to add to HW.
With skip-sw, failure to add to HW is treated as an error.
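
The flag semantics described above can be modelled as a small decision function. The `DEMO_*` names and bit values are assumptions made for this sketch and are not the kernel's definitions.

```c
/* Hypothetical flag bits for this sketch; not the kernel's definitions. */
#define DEMO_SKIP_HW (1u << 0)
#define DEMO_SKIP_SW (1u << 1)

enum demo_placement {
	DEMO_INVALID = -1,	/* both flags set: rejected */
	DEMO_HW_AND_SW,		/* no flags: added to SW, HW failure ignored */
	DEMO_SW_ONLY,		/* skip_hw */
	DEMO_HW_ONLY,		/* skip_sw: HW failure is an error */
};

/* Map the user-supplied flags to where the filter should be installed. */
enum demo_placement demo_resolve_flags(unsigned int flags)
{
	if ((flags & DEMO_SKIP_HW) && (flags & DEMO_SKIP_SW))
		return DEMO_INVALID;
	if (flags & DEMO_SKIP_HW)
		return DEMO_SW_ONLY;
	if (flags & DEMO_SKIP_SW)
		return DEMO_HW_ONLY;
	return DEMO_HW_AND_SW;
}
```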

Here is a sample script that adds 2 filters, one with skip_sw and the other
with skip_hw flag.

   # add ingress qdisc
   tc qdisc add dev p4p1 ingress

   # enable hw tc offload.
   ethtool -K p4p1 hw-tc-offload on

   # add u32 filter with skip_sw flag.
   tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
      handle 800:0:1 u32 ht 800: flowid 800:1 \
      skip_sw \
      match ip src 192.168.1.0/24 \
      action drop

   # add u32 filter with skip_hw flag.
   tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
      handle 800:0:2 u32 ht 800: flowid 800:2 \
      skip_hw \
      match ip src 192.168.2.0/24 \
      action drop

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
8 years agoip: add MACsec support
Sabrina Dubroca [Wed, 8 Jun 2016 16:34:21 +0000 (09:34 -0700)]
ip: add MACsec support

Extend ip-link to create MACsec devices

  ip link add link <master> <macsec> type macsec [options]

Add `ip macsec` command to configure receive-side secure channels and
secure associations within a macsec netdevice.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Phil Sutter <phil@nwl.cc>
8 years agoutils: provide get_hex to read a hex digit from a char
Sabrina Dubroca [Fri, 3 Jun 2016 14:45:47 +0000 (16:45 +0200)]
utils: provide get_hex to read a hex digit from a char

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Phil Sutter <phil@nwl.cc>
8 years agoutils: add get_be{16, 32, 64}, use them where possible
Sabrina Dubroca [Fri, 3 Jun 2016 14:45:46 +0000 (16:45 +0200)]
utils: add get_be{16, 32, 64}, use them where possible

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Phil Sutter <phil@nwl.cc>
8 years agoutils: make hexstring_a2n provide the number of hex digits parsed
Sabrina Dubroca [Fri, 3 Jun 2016 14:45:45 +0000 (16:45 +0200)]
utils: make hexstring_a2n provide the number of hex digits parsed

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Phil Sutter <phil@nwl.cc>
8 years agoip: minor checkpatch cleanup
Stephen Hemminger [Wed, 8 Jun 2016 16:15:52 +0000 (09:15 -0700)]
ip: minor checkpatch cleanup

8 years agofq_codel: add per queue memory limit
Eric Dumazet [Wed, 8 Jun 2016 15:42:00 +0000 (08:42 -0700)]
fq_codel: add per queue memory limit

This patch adds support for TCA_FQ_CODEL_MEMORY_LIMIT attribute.

..
qdisc fq_codel 8008: root refcnt 257 limit 10240p flows 1024
 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn
 Sent 2083566791363 bytes 1376214889 pkt (dropped 4994406, overlimits 0
requeues 21705223)
 rate 9841Mbit 812549pps backlog 3906120b 376p requeues 21705223
  maxpacket 68130 drop_overlimit 4994406 new_flow_count 28855414
  ecn_mark 0 memory_used 4190048 drop_overmemory 4994406
new_flows_len 1 old_flows_len 177

Signed-off-by: Eric Dumazet <edumazet@google.com>
8 years agoman: tc-ife.8: man page for ife action
Lucas Bates [Sun, 5 Jun 2016 13:17:15 +0000 (09:17 -0400)]
man: tc-ife.8: man page for ife action

Signed-off-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Phil Sutter <phil@nwl.cc>
8 years agoman: rtpr: Fix minor typo
Phil Sutter [Wed, 1 Jun 2016 19:58:21 +0000 (21:58 +0200)]
man: rtpr: Fix minor typo

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agomisc/ss: Add family list to -f option in _usage()
Fabien Siron [Mon, 6 Jun 2016 14:53:38 +0000 (14:53 +0000)]
misc/ss: Add family list to -f option in _usage()

Signed-off-by: Fabien Siron <fabien.siron@epita.fr>
8 years agoman: ip-link: Added HSR part
Peter Heise [Wed, 1 Jun 2016 07:43:15 +0000 (09:43 +0200)]
man: ip-link: Added HSR part

Added HSR part to manpage as follow-up to last commit's
feedback.

Signed-off-by: Peter Heise <peter.heise@airbus.com>
8 years agotc action policer: enable timestamp display
Jamal Hadi Salim [Wed, 25 May 2016 10:05:49 +0000 (06:05 -0400)]
tc action policer: enable timestamp display

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
8 years agotc: update headers for TCA_POLICE
Stephen Hemminger [Tue, 31 May 2016 20:02:28 +0000 (13:02 -0700)]
tc: update headers for TCA_POLICE

These are from linux-net but will be in next rc.

8 years agoman: ip, ip-link: Fix ip option location
Phil Sutter [Mon, 30 May 2016 18:46:27 +0000 (20:46 +0200)]
man: ip, ip-link: Fix ip option location

This patch drops the redundant description of some of ip's options in
ip-link.8's description of the 'show' subcommand, preserving the
description of -iec (but appending it to the list in ip.8 with minor
fixes).

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agotc filter u32: Coding style fixes
Jamal Hadi Salim [Wed, 25 May 2016 10:11:55 +0000 (06:11 -0400)]
tc filter u32: Coding style fixes

"handle" was being used several times for different things.
Fix the 80 character limit abuse and other little issues while at it.

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
8 years agotc: action result is u32
Stephen Hemminger [Tue, 31 May 2016 19:22:45 +0000 (12:22 -0700)]
tc: action result is u32

In the kernel, the action result in netlink messages is u32, not int.

8 years agotc action policer: Avoid nonsensical input
Jamal Hadi Salim [Wed, 25 May 2016 10:05:48 +0000 (06:05 -0400)]
tc action policer: Avoid nonsensical input

The user must specify at least one of token bucket policing, ewma
policing, or a late binding index. TB policing requires at minimum
a rate and a burst.

In addition fix formatting issues (80 chars etc).
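
The input rules above can be sketched as a validation function. `struct demo_police_opts` and its field names are hypothetical; the real parser in tc works on command-line arguments rather than on such a struct.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the parsed policer options. */
struct demo_police_opts {
	unsigned int rate;	/* token bucket rate, 0 if not given */
	unsigned int burst;	/* token bucket burst, 0 if not given */
	unsigned int avrate;	/* ewma rate, 0 if not given */
	unsigned int index;	/* late binding index, 0 if not given */
};

bool demo_police_opts_valid(const struct demo_police_opts *o)
{
	bool has_tb = o->rate || o->burst;

	/* At least one of token bucket, ewma or index is required. */
	if (!has_tb && !o->avrate && !o->index)
		return false;

	/* Token bucket policing needs both a rate and a burst. */
	if (has_tb && !(o->rate && o->burst))
		return false;

	return true;
}
```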

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
8 years agoMake builds default to quiet mode
David Ahern [Tue, 24 May 2016 22:04:49 +0000 (15:04 -0700)]
Make builds default to quiet mode

Similar to the Linux kernel and perf, add infrastructure to reduce the
amount of output tossed to a user during a build. Full build output
can be obtained with 'make V=1'.

Builds go from:

make[1]: Leaving directory `/home/dsa/iproute2.git/lib'
make[1]: Entering directory `/home/dsa/iproute2.git/ip'
gcc -Wall -Wstrict-prototypes  -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE    -c -o ip.o ip.c
gcc -Wall -Wstrict-prototypes  -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE    -c -o ipaddress.o ipaddress.c

to:

...
    AR       libutil.a

ip
    CC       ip.o
    CC       ipaddress.o
...

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
8 years agotc simple action: bug fix
Jamal Hadi Salim [Tue, 24 May 2016 11:52:48 +0000 (07:52 -0400)]
tc simple action: bug fix

Failed compile
m_simple.c: In function ‘parse_simple’:
m_simple.c:154:6: warning: too many arguments for format [-Wformat-extra-args]
      *argv);
      ^
m_simple.c:103:14: warning: unused variable ‘maybe_bind’ [-Wunused-variable]

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
8 years agoip, token: add del command
Daniel Borkmann [Mon, 23 May 2016 22:47:38 +0000 (00:47 +0200)]
ip, token: add del command

For convenience also add a del command for deleting a token and
update the man page accordingly.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
8 years agoAdded support for selection of new HSR version
Peter Heise [Mon, 30 May 2016 13:32:07 +0000 (15:32 +0200)]
Added support for selection of new HSR version

A new HSR version was added in 4.7 that can be enabled
via iproute2. By default the old version is selected;
however, with "ip link add [..] type hsr [..] version 1"
the newer version can be enabled.

Signed-off-by: Peter Heise <peter.heise@airbus.com>
8 years agotc fix ife late binding
Jamal Hadi Salim [Sun, 22 May 2016 17:11:16 +0000 (13:11 -0400)]
tc fix ife late binding

The following late binding didn't work:

sudo tc actions add action ife encode \
type 0xDEAD allow mark dst 02:15:15:15:15:15 index 1

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
8 years agof_bpf: fix filling of handle when no further arg is provided
Daniel Borkmann [Wed, 18 May 2016 09:58:41 +0000 (11:58 +0200)]
f_bpf: fix filling of handle when no further arg is provided

We need to fill handle when provided by the user, even if no further
argument is provided. Thus, move the test for arg to the correct location,
so that it works correctly:

  # tc filter show dev foo egress
  filter protocol all pref 1 bpf
  filter protocol all pref 1 bpf handle 0x1 bpf.o:[classifier] direct-action
  filter protocol all pref 1 bpf handle 0x2 bpf.o:[classifier] direct-action
  # tc filter del dev foo egress prio 1 handle 2 bpf
  # tc filter show dev foo egress
  filter protocol all pref 1 bpf
  filter protocol all pref 1 bpf handle 0x1 bpf.o:[classifier] direct-action

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
8 years agoipaddress: fix build with musl libc
Kylie McClain [Sun, 22 May 2016 23:52:02 +0000 (19:52 -0400)]
ipaddress: fix build with musl libc

MIN() is defined within sys/param.h.
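
A minimal sketch of relying on that header, assuming the fix amounts to including it explicitly instead of depending on another header to pull it in. `demo_copy_len` is a hypothetical helper and does not appear in ipaddress.c.

```c
#include <sys/param.h>	/* provides MIN() on both glibc and musl */

/* Hypothetical helper: clamp a requested copy length to the buffer size. */
unsigned int demo_copy_len(unsigned int want, unsigned int bufsize)
{
	return MIN(want, bufsize);
}
```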

Signed-off-by: Kylie McClain <somasis@exherbo.org>
8 years agoadd if_macsec header
Stephen Hemminger [Mon, 23 May 2016 23:10:43 +0000 (16:10 -0700)]
add if_macsec header

Current version from 4.7-pre-rc1

8 years agoupdate kernel headers (from 4.7-rc1)
Stephen Hemminger [Mon, 23 May 2016 16:06:11 +0000 (09:06 -0700)]
update kernel headers (from 4.7-rc1)

8 years agoMerge branch 'master' into net-next
Stephen Hemminger [Wed, 18 May 2016 18:57:28 +0000 (11:57 -0700)]
Merge branch 'master' into net-next

8 years agovv4.6.0
Stephen Hemminger [Wed, 18 May 2016 18:56:02 +0000 (11:56 -0700)]
vv4.6.0

8 years agoip link: Add support for kernel side filtering
David Ahern [Wed, 11 May 2016 13:51:58 +0000 (06:51 -0700)]
ip link: Add support for kernel side filtering

Kernel gained support for filtering link dumps with commit dc599f76c22b
("net: Add support for filtering link dump by master device and kind").
Add support to ip link command. If a user passes master device or
kind to ip link command they are added to the link dump request message.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
8 years agodevlink: implement shared buffer occupancy control
Jiri Pirko [Sat, 14 May 2016 13:21:02 +0000 (15:21 +0200)]
devlink: implement shared buffer occupancy control

Use kernel shared buffer occupancy control commands to take a snapshot and
clear occupancy watermarks. Also, allow showing occupancy values in a
nice way.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agodevlink: implement shared buffer support
Jiri Pirko [Sat, 14 May 2016 13:21:01 +0000 (15:21 +0200)]
devlink: implement shared buffer support

Implement kernel devlink shared buffer interface. Introduce new object
"sb" and allow browsing the shared buffer parameters and changing the
configuration.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agoingress, clsact: don't add TCA_OPTIONS to nl msg
Daniel Borkmann [Sun, 15 May 2016 16:36:03 +0000 (18:36 +0200)]
ingress, clsact: don't add TCA_OPTIONS to nl msg

In ingress and clsact qdisc TCA_OPTIONS are ignored, since it's
parameterless. In tc, we add an empty addattr_l(... TCA_OPTIONS,
NULL, 0) to the netlink message nevertheless. This has the
side effect that when someone tries a 'tc qdisc replace' and such
a qdisc is already present, tc fails with EINVAL.

Reason is that in the kernel, this invokes qdisc_change() when
such requested qdisc is already present. When TCA_OPTIONS are
passed to modify parameters, it looks whether qdisc implements
.change() callback, and if not present (like in both cases here)
it returns with error. Rather than adding an empty stub to the
kernel that ignores TCA_OPTIONS again, just don't add TCA_OPTIONS
to the netlink message in the first place.
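
The kernel-side behaviour described above can be modelled roughly as follows. This is a heavily simplified sketch, not kernel code: `struct demo_qdisc_ops` and `demo_qdisc_change` are hypothetical names, and a present (even empty) TCA_OPTIONS attribute is modelled as a non-NULL `opts` pointer.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of a qdisc's operations. */
struct demo_qdisc_ops {
	int (*change)(const void *opts);	/* NULL for parameterless qdiscs */
};

/*
 * Model of the replace path for an already existing qdisc:
 * options passed to a qdisc without a change callback are
 * rejected with -EINVAL; no options means nothing to do.
 */
int demo_qdisc_change(const struct demo_qdisc_ops *ops, const void *opts)
{
	if (opts) {
		if (!ops->change)
			return -EINVAL;
		return ops->change(opts);
	}
	return 0;
}
```

In this model, leaving TCA_OPTIONS out of the message corresponds to passing NULL for `opts`, which succeeds for ingress and clsact even though they have no change callback.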

Before:

  # tc qdisc replace dev foo clsact    # first try
  # tc qdisc replace dev foo clsact    # second one
  RTNETLINK answers: Invalid argument

After:

  # tc qdisc replace dev foo clsact
  # tc qdisc replace dev foo clsact
  # tc qdisc replace dev foo clsact

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
8 years agoMerge branch 'master' into net-next
Stephen Hemminger [Mon, 16 May 2016 18:20:40 +0000 (11:20 -0700)]
Merge branch 'master' into net-next

8 years agotc simple action update and breakage
Jamal Hadi Salim [Sun, 8 May 2016 15:02:06 +0000 (11:02 -0400)]
tc simple action update and breakage

Brings it closer to more serious actions (adding branching
and allowing for late binding)

Unfortunately this breaks the old syntax of the simple action.
But because simple is a pedagogical example unlikely to be used
in production environments (i.e., its role is to serve as an example
of how to write actions), this is OK.

New syntax for simple has new keyword "sdata". Example usage is:

sudo tc actions add action simple sdata "foobar" index 1
or
tc filter add dev $DEV parent ffff: protocol ip prio 1 u32\
match ip dst 17.0.0.1/32 flowid 1:10 action simple sdata "foobar"

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
8 years agotc: don't ignore ok as an action branch
Jamal Hadi Salim [Sat, 7 May 2016 13:39:36 +0000 (09:39 -0400)]
tc: don't ignore ok as an action branch

This is what used to happen before:

tc filter add dev tap1 parent ffff: protocol 0xfefe prio 10 \
     u32 match u32 0 0 flowid 1:16 \
     action ife decode allow mark ok

tc -s filter ls dev tap1 parent ffff:
filter protocol [65278] pref 10 u32
filter protocol [65278] pref 10 u32 fh 800: ht divisor 1
filter protocol [65278] pref 10 u32 fh 800::800 order 2048 key ht 800
bkt 0 flowid 1:16
  match 00000000/00000000 at 0
        action order 1: ife decode action pipe
         index 2 ref 1 bind 1 installed 4 sec used 4 sec
         type: 0x0
         Metadata: allow mark
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: gact action pass
         random type none pass val 0
         index 1 ref 1 bind 1 installed 4 sec used 4 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

Note the extra action added at the end..

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
8 years agotc: introduce IFE action
Jamal Hadi Salim [Sat, 7 May 2016 13:35:23 +0000 (09:35 -0400)]
tc: introduce IFE action

This action allows for a sending side to encapsulate arbitrary metadata
which is decapsulated by the receiving end.
The sender runs in encoding mode and the receiver in decode mode.
Both sender and receiver must specify the same ethertype.
At some point we hope to have a registered ethertype and we'll
then provide a default so the user doesn't have to specify it.
For now we require the user to specify it.

Described in netdev01 paper:
   "Distributing Linux Traffic Control Classifier-Action Subsystem"
    Authors: Jamal Hadi Salim and Damascene M. Joachimpillai

Also refer to IETF draft-ietf-forces-interfelfb-04.txt

Let's show example usage where we encode icmp from a sender towards
a receiver with an skbmark of 17; both sender and receiver use
an ethertype of 0xdead to interop.

YYYY: Let's start with Receiver-side policy config:
xxx: add an ingress qdisc
sudo tc qdisc add dev $ETH ingress

xxx: any packets with ethertype 0xdead will be subjected to ife decoding
xxx: we then restart the classification so we can match on icmp at prio 3
sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xdead \
u32 match u32 0 0 flowid 1:1 \
action ife decode reclassify

xxx: on restarting the classification from above if it was an icmp
xxx: packet, then match it here and continue to the next rule at prio 4
xxx: which will match based on skb mark of 17
sudo tc filter add dev $ETH parent ffff: prio 3 protocol ip \
u32 match ip protocol 1 0xff flowid 1:1 \
action continue

xxx: match on skbmark of 0x11 (decimal 17) and accept
sudo tc filter add dev $ETH parent ffff: prio 4 protocol ip \
handle 0x11 fw flowid 1:1 \
action ok

xxx: Let's show the decoding policy
sudo tc -s filter ls dev $ETH parent ffff: protocol 0xdead
xxx:
filter pref 2 u32
filter pref 2 u32 fh 800: ht divisor 1
filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1  (rule hit 0 success 0)
  match 00000000/00000000 at 0 (success 0 )
action order 1: ife decode action reclassify type 0x0
 allow mark allow prio
 index 11 ref 1 bind 1 installed 45 sec used 45 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

xxx:
Observe that the above lists all the metadata it can decode. Typically these
submodules will already be compiled into a monolithic kernel or
loaded as modules.

YYYY: Let's show the sender side now ..
xxx: Add an egress qdisc on the sender netdev
sudo tc qdisc add dev $ETH root handle 1: prio
xxx:
xxx: Match all icmp packets to 192.168.122.237/24, then
xxx: tag the packet with skb mark of decimal 17, then
xxx: Encode it with:
xxx:    ethertype 0xdead
xxx:    add skb->mark to whitelist of metadatum to send
xxx:    rewrite target dst MAC address to 02:15:15:15:15:15
xxx:
sudo $TC filter add dev $ETH parent 1: protocol ip prio 10  u32 \
match ip dst 192.168.122.237/24 \
match ip protocol 1 0xff \
flowid 1:2 \
action skbedit mark 17 \
action ife encode \
type 0xDEAD \
allow mark \
dst 02:15:15:15:15:15

xxx: Let's show the encoding policy
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:2  (rule hit 118 success 0)
  match c0a87a00/ffffff00 at 16 (success 0 )
  match 00010000/00ff0000 at 8 (success 0 )
action order 1:  skbedit mark 17
 index 11 ref 1 bind 1 installed 3 sec used 3 sec
  Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

action order 2: ife encode action pipe type 0xDEAD
 allow mark dst 02:15:15:15:15:15
 index 12 ref 1 bind 1 installed 3 sec used 3 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
xxx:

Now test by sending ping from sender to destination

Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
8 years agoadd tc_ife.h
Stephen Hemminger [Mon, 16 May 2016 18:13:05 +0000 (11:13 -0700)]
add tc_ife.h

8 years agoupdate kernel headers from net-next
Stephen Hemminger [Fri, 13 May 2016 21:56:31 +0000 (14:56 -0700)]
update kernel headers from net-next

Take sanitized headers for davem net-next

8 years agodevlink: update uapi header
Stephen Hemminger [Fri, 13 May 2016 21:49:40 +0000 (14:49 -0700)]
devlink: update uapi header

Get sanitized version from net-next

8 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 13 May 2016 21:48:53 +0000 (14:48 -0700)]
Merge branch 'master' into net-next

8 years agodevlink: remove more unused code
Stephen Hemminger [Fri, 13 May 2016 21:48:32 +0000 (14:48 -0700)]
devlink: remove more unused code

8 years agoss: Remove unused argument from kill_inet_sock
subashab@codeaurora.org [Mon, 9 May 2016 20:54:36 +0000 (14:54 -0600)]
ss: Remove unused argument from kill_inet_sock

addr is not used here.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
8 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 13 May 2016 21:44:48 +0000 (14:44 -0700)]
Merge branch 'master' into net-next

8 years agodevlink: remove unused code
Stephen Hemminger [Fri, 13 May 2016 21:42:06 +0000 (14:42 -0700)]
devlink: remove unused code

Unused code causes warnings, removed.

8 years agoupdate kernel headers to 4.6-rc6
Stephen Hemminger [Fri, 13 May 2016 21:41:45 +0000 (14:41 -0700)]
update kernel headers to 4.6-rc6

Close to final upstream headers

8 years agoRevert "devlink: implement shared buffer support"
Stephen Hemminger [Fri, 13 May 2016 21:38:47 +0000 (14:38 -0700)]
Revert "devlink: implement shared buffer support"

This reverts commit b56700bf8add4ebb2fe451c85f50602b58a886a2.

8 years agoRevert "devlink: implement shared buffer occupancy control"
Stephen Hemminger [Fri, 13 May 2016 21:38:38 +0000 (14:38 -0700)]
Revert "devlink: implement shared buffer occupancy control"

This reverts commit a60ebcb6f34f4c43cba092f52b1150d7fb1deec5.

8 years agogeneve: fix IPv6 remote address reporting
Edward Cree [Fri, 6 May 2016 14:28:25 +0000 (15:28 +0100)]
geneve: fix IPv6 remote address reporting

Since we can only configure unicast, we probably want to be able to
display unicast, rather than multicast.

Fixes: 906ac5437ab8 ("geneve: add support for IPv6 link partners")
Signed-off-by: Edward Cree <ecree@solarflare.com>
8 years agoip link gre: print only relevant info in external mode
Jiri Benc [Wed, 27 Apr 2016 14:11:14 +0000 (16:11 +0200)]
ip link gre: print only relevant info in external mode

Display only attributes that are relevant when a GRE interface is in
'external' mode instead of the default values (which are ignored by the
kernel even if passed back).

Fixes: 926b39e1feffd ("gre: add support for collect metadata flag")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
8 years agoip link gre: create interfaces in external mode correctly
Jiri Benc [Wed, 27 Apr 2016 14:11:13 +0000 (16:11 +0200)]
ip link gre: create interfaces in external mode correctly

For GRE interfaces in 'external' mode, the kernel ignores all manual
settings like remote IP address or TTL. However, for some of those
attributes, kernel checks their value and does not allow them to be zero
(even though they're ignored later).

Currently, 'ip link' always includes all attributes in the netlink message.
This leads to a problem when creating interfaces in 'external' mode. For
example, this command does not work:

ip link add gre1 type gretap external

and needs a bogus remote IP address to be specified, as the kernel requires
the remote IP address to be either absent or non-zero.

Ignore the parameters that do not make sense in 'external' mode.
Unfortunately, we cannot error out, as there may be existing deployments
that worked around the bug by specifying bogus values.

Fixes: 926b39e1feffd ("gre: add support for collect metadata flag")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
8 years agotc: add bash-completion function
Quentin Monnet [Tue, 3 May 2016 07:39:08 +0000 (09:39 +0200)]
tc: add bash-completion function

Add function for command completion for tc in bash, and update Makefile
to install it under /usr/share/bash-completion/completions/.

Inside iproute2 repository, the completion code is in a new
`bash-completion` toplevel directory.

v2: Remove `if` statement in Makefile: do not try to install in
    /etc/bash_completion.d/ if /usr/share/bash-completion/completions/
    is not found; instead, the user can override the installation path
    with the specific environment variable.

Signed-off-by: Quentin Monnet <quentin.monnet@6wind.com>
8 years agoupdate kernel headers from net-next
Stephen Hemminger [Mon, 25 Apr 2016 05:30:46 +0000 (22:30 -0700)]
update kernel headers from net-next

8 years agoss: add SK_MEMINFO_DROPS display
Eric Dumazet [Thu, 21 Apr 2016 12:19:04 +0000 (05:19 -0700)]
ss: add SK_MEMINFO_DROPS display

SK_MEMINFO_DROPS is added in linux-4.7 for TCP, UDP and SCTP

skmem will display the socket drop count using the d prefix, as in:

$ ss -tm src :22 | more
State      Recv-Q Send-Q Local Address:Port    Peer Address:Port
ESTAB      0      52     10.246.7.151:ssh      172.20.10.101:50759
 skmem:(r0,rb8388608,t0,tb8388608,f1792,w2304,o0,bl0,d0)
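
A script consuming this output could pick out the new counter as in the
sketch below. This is only an illustration: skmem_drops() is a hypothetical
helper, not part of ss, and it assumes the field order shown in the sample
line above.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of ss): extract the trailing d<N> drop
 * counter from one skmem line formatted like the sample above.
 * Returns 0 on success and -1 if the line does not match, for example
 * when the output comes from a kernel that does not report drops.
 */
int skmem_drops(const char *line, unsigned int *drops)
{
	unsigned int r, rb, t, tb, f, w, o, bl;
	int n;

	/* The leading space in the format skips any indentation. */
	n = sscanf(line, " skmem:(r%u,rb%u,t%u,tb%u,f%u,w%u,o%u,bl%u,d%u)",
		   &r, &rb, &t, &tb, &f, &w, &o, &bl, drops);
	return n == 9 ? 0 : -1;
}
```

Matching the full field list, rather than searching for ",d", keeps the
helper from misreading other fields that might contain the letter d.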

Signed-off-by: Eric Dumazet <edumazet@google.com>
8 years agoupdate kernel headers from net-next
Stephen Hemminger [Fri, 22 Apr 2016 17:01:12 +0000 (10:01 -0700)]
update kernel headers from net-next

8 years agoupdate inet_diag.h header
Stephen Hemminger [Tue, 19 Apr 2016 15:06:11 +0000 (08:06 -0700)]
update inet_diag.h header

8 years agoMerge branch 'master' into net-next
Stephen Hemminger [Tue, 19 Apr 2016 15:01:55 +0000 (08:01 -0700)]
Merge branch 'master' into net-next

8 years agodevlink: add manpage for shared buffer
Jiri Pirko [Fri, 15 Apr 2016 07:51:53 +0000 (09:51 +0200)]
devlink: add manpage for shared buffer

Manpage for devlink "sb" object.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agodevlink: implement shared buffer occupancy control
Jiri Pirko [Fri, 15 Apr 2016 07:51:52 +0000 (09:51 +0200)]
devlink: implement shared buffer occupancy control

Use kernel shared buffer occupancy control commands to take a snapshot and
clear occupancy watermarks. Also allow occupancy values to be shown in a
readable way.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agodevlink: implement shared buffer support
Jiri Pirko [Fri, 15 Apr 2016 07:51:51 +0000 (09:51 +0200)]
devlink: implement shared buffer support

Implement kernel devlink shared buffer interface. Introduce new object
"sb" and allow to browse the shared buffer parameters and also change
configuration.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agodevlink: allow to parse both devlink and port handle in the same time
Jiri Pirko [Fri, 15 Apr 2016 07:51:50 +0000 (09:51 +0200)]
devlink: allow to parse both devlink and port handle in the same time

For filtering purposes, it makes sense for the user to specify either a
devlink handle or a port handle.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agodevlink: introduce dump filtering function
Jiri Pirko [Fri, 15 Apr 2016 07:51:49 +0000 (09:51 +0200)]
devlink: introduce dump filtering function

This function is to be used from dump callbacks to decide if the current
output should be filtered out or not. Filtering is based on
previously parsed and stored command line options.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agodevlink: split dl_argv_parse_put to parse and put parts
Jiri Pirko [Fri, 15 Apr 2016 07:51:48 +0000 (09:51 +0200)]
devlink: split dl_argv_parse_put to parse and put parts

It is handy to have parsed cmdline data stored so they can be used for
dumps filtering. So split original dl_argv_parse_put into parse and put
parts.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agodevlink: introduce helper to print out nice names (ifnames)
Jiri Pirko [Fri, 15 Apr 2016 07:51:47 +0000 (09:51 +0200)]
devlink: introduce helper to print out nice names (ifnames)

By default, ifnames will be printed out. User can turn that off using
"-n" option on the command line.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agodevlink: introduce pr_out_port_handle helper
Jiri Pirko [Fri, 15 Apr 2016 07:51:46 +0000 (09:51 +0200)]
devlink: introduce pr_out_port_handle helper

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agolist: add list_add_tail helper
Jiri Pirko [Fri, 15 Apr 2016 07:51:45 +0000 (09:51 +0200)]
list: add list_add_tail helper
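
The commit carries no description. For orientation, a kernel-style
list_add_tail on a circular doubly linked list can be sketched as below.
The struct layout and the list_init() helper are assumptions made for this
illustration; the code is not copied from the iproute2 tree.

```c
/* Assumed layout: a circular doubly linked list with a sentinel head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Illustrative initialiser: an empty list points back at itself. */
void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert new_node just before head, i.e. at the tail of the list. */
void list_add_tail(struct list_head *new_node, struct list_head *head)
{
	struct list_head *last = head->prev;

	new_node->next = head;
	new_node->prev = last;
	last->next = new_node;
	head->prev = new_node;
}
```

Because the list is circular, appending needs no traversal: the current
tail is always reachable as head->prev.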

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agolist: add list_for_each_entry_reverse macro
Jiri Pirko [Fri, 15 Apr 2016 07:51:44 +0000 (09:51 +0200)]
list: add list_for_each_entry_reverse macro
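
The commit carries no description. A reverse iteration macro of this kind
is commonly written as in the following sketch, which assumes a circular
list with a sentinel head and the __typeof__ extension of GCC and Clang.
struct item and digits_in_reverse() are hypothetical and exist only to
exercise the macro.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Walk the list from the last entry back to the first. */
#define list_for_each_entry_reverse(pos, head, member) \
	for (pos = container_of((head)->prev, __typeof__(*pos), member); \
	     &(pos)->member != (head); \
	     pos = container_of((pos)->member.prev, __typeof__(*pos), member))

/* Hypothetical payload type used only for this illustration. */
struct item {
	int value;
	struct list_head node;
};

/* Concatenate the single-digit values in visiting order (tail first). */
int digits_in_reverse(struct list_head *head)
{
	struct item *it;
	int out = 0;

	list_for_each_entry_reverse(it, head, node)
		out = out * 10 + it->value;
	return out;
}
```

The loop stops when the embedded node of the cursor is the sentinel head
itself, so an empty list yields no iterations.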

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agodevlink: fix "devlink port" help message
Jiri Pirko [Fri, 15 Apr 2016 07:51:43 +0000 (09:51 +0200)]
devlink: fix "devlink port" help message

"dl" -> "devlink"

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
8 years agoss: take care of unknown min_rtt
Eric Dumazet [Wed, 13 Apr 2016 22:18:38 +0000 (15:18 -0700)]
ss: take care of unknown min_rtt

Kernel sets info->tcpi_min_rtt to ~0U when no RTT sample was ever
taken for the session, thus min_rtt is unknown.
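
A consumer of this field can guard against the sentinel as in the sketch
below. format_min_rtt() and the "minrtt:" label are hypothetical choices
for this illustration, and the conversion assumes the kernel reports the
value in microseconds.

```c
#include <stdio.h>
#include <stddef.h>

/* Sentinel described above: no RTT sample was ever taken. */
#define MIN_RTT_UNKNOWN (~0U)

/*
 * Hypothetical helper (not the ss implementation): write "minrtt:<ms>"
 * into buf when the value is known, or an empty string otherwise.
 * Returns 1 if a value was written and 0 if it was skipped.
 */
int format_min_rtt(unsigned int min_rtt_us, char *buf, size_t len)
{
	if (len == 0)
		return 0;
	if (min_rtt_us == MIN_RTT_UNKNOWN) {
		buf[0] = '\0';
		return 0;
	}
	/* Assumption: the input is in microseconds; print milliseconds. */
	snprintf(buf, len, "minrtt:%g", min_rtt_us / 1000.0);
	return 1;
}
```

Without the check, the sentinel would be printed as a very large and
meaningless RTT.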

Signed-off-by: Eric Dumazet <edumazet@google.com>
8 years agoss: Fix accidental state filter override
Phil Sutter [Wed, 13 Apr 2016 20:07:05 +0000 (22:07 +0200)]
ss: Fix accidental state filter override

Passing a filter expression and selecting an address family using the
'-f' flag would overwrite the state filter by accident. Therefore
calling e.g. "ss -nl -f inet '(sport = :22)'" would print not only
listening sockets (as requested by the '-l' flag) but connected ones as
well.

Fix this by reusing the formerly ineffective call to filter_states_set()
to restore the state filter as it was before the call to
filter_af_set().
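
The save-and-restore idea can be sketched as follows. The function names
filter_af_set() and filter_states_set() are taken from the text above, but
their bodies, the struct layout and the state bits are simplified
assumptions made for this illustration and do not reproduce the ss source.

```c
/* Simplified, assumed stand-in for the filter structure in ss. */
struct filter {
	unsigned int states;
	unsigned int families;
};

/* Illustrative state bits. */
#define ST_LISTEN (1u << 0)
#define ST_ESTAB  (1u << 1)
#define ST_ALL    (ST_LISTEN | ST_ESTAB)

/* Assumed behaviour: selecting a family overwrites the state mask. */
void filter_af_set(struct filter *f, unsigned int family_bit)
{
	f->families |= family_bit;
	f->states = ST_ALL;
}

/* Assumed behaviour: a non-zero mask replaces the current state mask. */
void filter_states_set(struct filter *f, unsigned int states)
{
	if (states)
		f->states = states;
}

/* Select a family while keeping the states the user already asked for. */
void select_family_keep_states(struct filter *f, unsigned int family_bit)
{
	unsigned int saved = f->states;

	filter_af_set(f, family_bit);
	filter_states_set(f, saved);
}
```

In this model, a filter that held only ST_LISTEN before the family was
selected still holds only ST_LISTEN afterwards.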

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agoss: Drop silly assignment
Phil Sutter [Wed, 13 Apr 2016 20:07:04 +0000 (22:07 +0200)]
ss: Drop silly assignment

An expression of the form '(a | b) & b' will evaluate to the value of b
for any value of a or b.
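
This is the absorption law of Boolean algebra applied bitwise: every bit
set in b is also set in (a | b), and the final AND clears every other bit.
A small check, with absorption_holds() as a hypothetical name used only
for this illustration:

```c
#include <stdbool.h>

/*
 * Returns true when '(a | b) & b' equals b. By the absorption law this
 * holds for every pair of inputs, so the expression reduces to plain b.
 */
bool absorption_holds(unsigned int a, unsigned int b)
{
	return ((a | b) & b) == b;
}
```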

Signed-off-by: Phil Sutter <phil@nwl.cc>
8 years agoip: neigh: Fix leftover attributes message during flush
Jeff Harris [Thu, 14 Apr 2016 18:15:03 +0000 (14:15 -0400)]
ip: neigh: Fix leftover attributes message during flush

Use the same rtnl_dump_request_n call as the show command. rtnl_wilddump_request
assumes the type uses an ifinfomsg, which is not the case for the neighbor
table.

Signed-off-by: Jeff Harris <jefftharris@gmail.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
8 years agovxlan: add support for VXLAN-GPE
Jiri Benc [Thu, 7 Apr 2016 12:36:29 +0000 (14:36 +0200)]
vxlan: add support for VXLAN-GPE

Adds support to create a VXLAN-GPE interface.

Signed-off-by: Jiri Benc <jbenc@redhat.com>