git.proxmox.com Git - mirror_iproute2.git/log
Phil Sutter [Tue, 9 Oct 2018 12:44:08 +0000 (14:44 +0200)]
bridge: fdb: Fix for missing keywords in non-JSON output

While migrating to JSON print library, some keywords were dropped from
standard output by accident. Add them back to unbreak output parsers.

Fixes: c7c1a1ef51aea ("bridge: colorize output and use JSON print library")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Tue, 9 Oct 2018 16:46:11 +0000 (09:46 -0700)]
libnetlink: use local variable

Now that err->error is in local variable, use it consistently.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Vlad Buslov [Mon, 8 Oct 2018 20:52:26 +0000 (23:52 +0300)]
libnetlink: fix use-after-free of message buf

In __rtnl_talk_iov() main loop, err is a pointer to memory in dynamically
allocated 'buf' that is used to store netlink messages. If netlink message
is an error message, buf is deallocated before returning with error code.
However, on return err->error code is checked one more time to generate
return value, after memory which err points to has already been
freed. Save error code in temporary variable and use the variable to
generate return value.

Fixes: c60389e4f9ea ("libnetlink: fix leak and using unused memory on error")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
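
The use-after-free fix described in this commit can be sketched in isolation. This is a minimal illustration and not the actual __rtnl_talk_iov() code: struct fake_err, handle_reply() and make_reply() are hypothetical stand-ins for the netlink error payload and the receive loop.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the error payload that lives inside the
 * dynamically allocated receive buffer. */
struct fake_err {
	int error;
};

/* Takes ownership of buf. Returns 0 on success and -1 if the reply
 * carried an error. The code is copied into a local variable before
 * the buffer is freed, so nothing reads freed memory afterwards. */
int handle_reply(char *buf)
{
	struct fake_err *err = (struct fake_err *)buf;
	int error = err->error;	/* save before free */

	free(buf);
	return error ? -1 : 0;
}

/* Demonstration helper: allocate a reply buffer carrying 'code'. */
char *make_reply(int code)
{
	struct fake_err *err = malloc(sizeof(*err));

	if (!err)
		return NULL;
	err->error = code;
	return (char *)err;
}
```

Reading err->error after free(buf) would be the bug; keeping the value in a local makes the return path independent of the buffer's lifetime.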
Jakub Kicinski [Fri, 5 Oct 2018 00:08:34 +0000 (17:08 -0700)]
tc: jsonify output of q_fifo

Print limits correctly in JSON context.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Hangbin Liu [Thu, 27 Sep 2018 07:28:36 +0000 (15:28 +0800)]
vxlan: show correct ttl inherit info

We should only show ttl inherit when IFLA_VXLAN_TTL_INHERIT is supplied.
Otherwise show the ttl number, or auto when it is 0.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Tue, 25 Sep 2018 08:08:48 +0000 (10:08 +0200)]
libnetlink: don't return error on success

Change to error handling broke normal code.

Fixes: c60389e4f9ea ("libnetlink: fix leak and using unused memory on error")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Tue, 25 Sep 2018 07:59:37 +0000 (09:59 +0200)]
testsuite: add libmnl

Supporting external ack requires libmnl now.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Petr Vorel [Fri, 21 Sep 2018 20:29:16 +0000 (22:29 +0200)]
Makefile: Add check target

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Lorenzo Bianconi [Fri, 21 Sep 2018 13:34:25 +0000 (15:34 +0200)]
iplink_vxlan: take into account preferred_family creating vxlan device

Take into account the configured preferred_family if neither saddr nor
daddr is provided, since otherwise the vxlan kernel module will use IPv4
as the default remote inet family, neglecting the one provided by userspace.
This behaviour was originally in commit 97d564b90ccb ("vxlan: use
preferred address family when neither group or remote is specified").
The issue can be triggered with the following reproducer:

$ip -6 link add vxlan1 type vxlan id 42 dev enp0s2 \
     proxy nolearning l2miss l3miss
$bridge fdb add 46:47:1f:a7:1c:25 dev vxlan1 dst 2000::2
RTNETLINK answers: Address family not supported by protocol

Fixes: 1e9b8072de2c ("iplink_vxlan: Get rid of inet_get_addr()")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Hangbin Liu [Tue, 18 Sep 2018 09:48:40 +0000 (17:48 +0800)]
iplink: fix incorrect any address handling for ip tunnels

After commit d42c7891d26e4 ("utils: Do not reset family for default, any,
all addresses"), calling get_addr() for any/all addresses sets
addr->flags to ADDRTYPE_INET_UNSPEC if family is AF_INET/AF_INET6, which
makes the is_addrtype_inet() check pass and hands an incorrect address
to the kernel. The ip link command then returns an error like:

]# ip link add ipip1 type ipip local any remote 1.1.1.1
RTNETLINK answers: Numerical result out of range

Fix it by using is_addrtype_inet_not_unspec() to avoid unspec addresses.

geneve and vxlan are not affected, as they use the AF_UNSPEC family when
calling get_addr().

Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: d42c7891d26e4 ("utils: Do not reset family for default, any, all addresses")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
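
The difference between the two checks named in this commit can be sketched as follows. The flag values and the sk_ prefixed names are assumptions made for illustration; the real definitions live in iproute2's utils.h and may be laid out differently.

```c
#include <stdbool.h>

/* Assumed bit layout: an "any" address parsed with an explicit family
 * carries both the INET bit and the UNSPEC bit. */
enum {
	SK_ADDRTYPE_INET        = 1 << 0,
	SK_ADDRTYPE_UNSPEC      = 1 << 1,
	SK_ADDRTYPE_INET_UNSPEC = SK_ADDRTYPE_INET | SK_ADDRTYPE_UNSPEC,
};

/* Passes for any address with the INET bit, including "any". */
bool sk_is_inet(unsigned int flags)
{
	return (flags & SK_ADDRTYPE_INET) != 0;
}

/* Passes only for a concrete address: INET set and UNSPEC clear. */
bool sk_is_inet_not_unspec(unsigned int flags)
{
	return (flags & SK_ADDRTYPE_INET_UNSPEC) == SK_ADDRTYPE_INET;
}
```

Under these assumptions, "local any" passes the first check but not the second, which is why switching checks keeps the unspecified address from being sent to the kernel.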
Stephen Hemminger [Fri, 21 Sep 2018 16:15:26 +0000 (09:15 -0700)]
Makefile: add help target

Add help target to Makefile

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Petr Vorel [Wed, 19 Sep 2018 23:36:24 +0000 (01:36 +0200)]
testsuite: Warn about empty $(IPVERS)

alltests target requires having symlink created by configure target
(default target). Without that there is no test being run.

Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Petr Vorel [Wed, 19 Sep 2018 23:36:23 +0000 (01:36 +0200)]
testsuite: Generate generate_nlmsg when needed

Commit 886f2c43 added generate_nlmsg.c. Running the alltests target,
which uses the resulting binary, required running 'make -C tools' first.

Fixes: 886f2c43 testsuite: Generate nlmsg blob at runtime
Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Petr Vorel [Wed, 19 Sep 2018 23:36:22 +0000 (01:36 +0200)]
testsuite: Fix missing generate_nlmsg

Commit ad23e152 caused generate_nlmsg to be always missing:

$ make alltests
make: ./tools/generate_nlmsg: Command not found

Create testclean: to remove only results directory.

Fixes: ad23e152 testsuite: remove all temp files and implement make clean
Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Leon Romanovsky [Sun, 16 Sep 2018 17:28:13 +0000 (20:28 +0300)]
rdma: Fix representation of PortInfo CapabilityMask

The port capability mask represents IBTA PortInfo specification,
but as it is written in description of kernel commit 2f944c0fbf58
("RDMA: Fix storage of PortInfo CapabilityMask in the kernel"),
the bit 26 was mistakenly overwritten.

The rdmatool followed it too and misled users by presenting a wrong
value. Since it never showed the proper value, we update the whole
port_cap_mask to comply with IBTA and show real HW values.

Fixes: da990ab40a92 ("rdma: Add link object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Thu, 13 Sep 2018 19:33:38 +0000 (12:33 -0700)]
libnetlink: fix leak and using unused memory on error

If an error happens in multi-segment message (tc only)
then report the error and stop processing further responses.
This also fixes referring to the buffer after free.

The sequence check is not necessary here because the
response message has already been validated to be in
the window of the sequence number of the iov.

Reported-by: Mahesh Bandewar <mahesh@bandewar.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Toke Høiland-Jørgensen [Fri, 14 Sep 2018 13:51:39 +0000 (15:51 +0200)]
q_cake: Also print nonat, nowash and no-ack-filter keywords

Similar to the previous patch for no-split-gso, the negative keywords for
'nat', 'wash' and 'ack-filter' were not printed either. Add those as well.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Hangbin Liu [Wed, 12 Sep 2018 01:39:44 +0000 (09:39 +0800)]
bridge/mdb: fix missing new line when show bridge mdb

The bridge mdb show is broken on current iproute2. e.g.
]# bridge mdb show
34: br0  veth0_br  224.1.1.2  temp 34: br0  veth0_br  224.1.1.1  temp

After fix:
]# bridge mdb show
34: br0  veth0_br  224.1.1.2  temp
34: br0  veth0_br  224.1.1.1  temp

v2: Use json print lib as Stephen suggested.
v3: No need to use is_json_context() as print_string() could handle both cases.
v4: use new function print_nl() to print new line in non-json mode.

Reported-by: Ying Xu <yinxu@redhat.com>
Fixes: c7c1a1ef51aea ("bridge: colorize output and use JSON print library")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Toke Høiland-Jørgensen [Tue, 11 Sep 2018 22:32:16 +0000 (00:32 +0200)]
q_cake: Add printing of no-split-gso option

When the GSO splitting was turned into dual split-gso/no-split-gso options,
the printing of the latter was left out. Add that, so output is consistent
with the options passed.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Tue, 11 Sep 2018 15:29:33 +0000 (08:29 -0700)]
lib: introduce print_nl

Common pattern in iproute commands is to print a line separator
in non-json mode. Make that a simple function.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
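
The helper introduced here can be sketched roughly as below. This is an assumption-laden illustration: print_nl_sketch() and its explicit parameters are hypothetical, whereas the real print_nl() consults the library's internal JSON state.

```c
#include <stdbool.h>
#include <stdio.h>

/* Emit a line separator only when not producing JSON.
 * Returns the number of characters written (0 or 1). */
int print_nl_sketch(FILE *fp, bool json_mode)
{
	if (json_mode)
		return 0;	/* JSON output has no use for line breaks */
	return fputc('\n', fp) == '\n' ? 1 : 0;
}
```

Callers then drop their own "if not JSON, print newline" branches and call the helper unconditionally.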
Phil Sutter [Thu, 6 Sep 2018 13:31:51 +0000 (15:31 +0200)]
ip-route: Fix segfault with many nexthops

It was possible to crash ip-route by adding an IPv6 route with 37
nexthop statements. A simple reproducer is:

| for i in `seq 37`; do
|  nhs="nexthop via 1111::$i "$nhs
| done
| ip -6 route add 3333::/64 $nhs

The related code was broken in multiple ways:

* parse_one_nh() assumed that rta points to 4kB of storage but caller
  provided just 1kB. Fixed by passing 'len' parameter with the correct
  value.

* Error checking of rta_addattr*() calls in parse_one_nh() and called
  functions was completely absent, so with above fix in place output
  flood would occur due to parser looping forever.

While at it, increase message buffer sizes to 4k. This allows for
at most 144 nexthops.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
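
The two problems listed in this commit (a wrong size assumption and unchecked append calls) can be illustrated with a small bounded buffer. Everything here is hypothetical: struct attr_buf, buf_add() and add_nexthops() are not iproute2 functions, and the 28-byte entry size is an assumption (an 8-byte nexthop header plus a 20-byte IPv6 gateway attribute).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute buffer with an explicit capacity. */
struct attr_buf {
	size_t len;
	size_t cap;			/* must not exceed sizeof(data) */
	unsigned char data[1024];
};

/* Append n bytes; fail instead of writing past the capacity. */
int buf_add(struct attr_buf *b, const void *payload, size_t n)
{
	if (n > b->cap - b->len)
		return -1;
	memcpy(b->data + b->len, payload, n);
	b->len += n;
	return 0;
}

/* Append 'count' dummy nexthop entries, stopping at the first error.
 * Returns the number of entries added, or -1 on overflow. */
int add_nexthops(struct attr_buf *b, int count)
{
	unsigned char nh[28] = { 0 };	/* assumed per-nexthop size */
	int i;

	for (i = 0; i < count; i++)
		if (buf_add(b, nh, sizeof(nh)))
			return -1;	/* propagate the error to the caller */
	return i;
}
```

With a 1024-byte capacity and 28-byte entries, 36 entries fit and the 37th is rejected, which lines up with the reproducer above under the stated size assumption.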
Caleb Raitto [Thu, 6 Sep 2018 21:01:17 +0000 (14:01 -0700)]
tc/mqprio: Print extra info on invalid args.

Print the name of the argument that wasn't understood.

Signed-off-by: Caleb Raitto <caraitto@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Mon, 10 Sep 2018 18:53:07 +0000 (11:53 -0700)]
genl: remove unnecessary extern

extern not necessary on function prototype.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Mon, 10 Sep 2018 18:50:22 +0000 (11:50 -0700)]
tc/fifo: remove unnecessary prototype

The prototype for prio_print_opt is already in tc_util.h

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Thu, 6 Sep 2018 13:42:46 +0000 (14:42 +0100)]
bridge: fix vlan show formatting

The output of vlan show was broken by the previous change to use
json_print. Clean the code up and return to the original format.

Note: the JSON syntax has changed to make the bridge vlan
show more like other outputs (e.g. ip -j li show).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Thu, 6 Sep 2018 13:15:36 +0000 (14:15 +0100)]
bridge: use print_json for some outputs

Rather than using is_json_context(), use the print_string functions
which handle both cases.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Thu, 6 Sep 2018 13:14:46 +0000 (14:14 +0100)]
bridge: minor change to mdb print

Get port ifname once rather than on both sides of if(is_json_context).

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Caleb Raitto [Wed, 5 Sep 2018 20:23:33 +0000 (13:23 -0700)]
man: Change numtc to num_tc

The argument parser only accepts num_tc:

https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/tree/tc/q_mqprio.c#n55

Signed-off-by: Caleb Raitto <caraitto@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Fri, 31 Aug 2018 22:03:49 +0000 (15:03 -0700)]
uapi: update ib_verbs

Merge current uapi from 4.19-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Florent Fourcot [Thu, 30 Aug 2018 14:38:54 +0000 (16:38 +0200)]
tc/htb: remove unused variable

Since introduction of htb module, this variable has never been used.

Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Mahesh Bandewar [Thu, 23 Aug 2018 01:01:37 +0000 (18:01 -0700)]
iproute: make clang happy

These are primarily fixes for "string is not string literal" warnings
/ errors (with -Werror -Wformat-nonliteral). This should be a no-op
change. I had to replace couple of print helper functions with the
code they call as it was becoming harder to eliminate these warnings,
however these helpers were used only at couple of places, so no
major change as such.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Mahesh Bandewar [Thu, 23 Aug 2018 01:01:34 +0000 (18:01 -0700)]
ipmaddr: use preferred_family when given

When creating socket() AF_INET is used irrespective of the family
that is given at the command-line (with -4, -6, or -0). This change
will open the socket with the preferred family.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
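
A sketch of the idea behind this change, assuming a fallback to AF_INET when no family flag was given. pick_family() and open_maddr_socket() are hypothetical names, and the fallback rule is an assumption rather than a statement of what ipmaddr.c does.

```c
#include <sys/socket.h>

/* Choose the socket family: honour the command-line preference,
 * falling back to AF_INET when none was given (assumed default). */
int pick_family(int preferred_family)
{
	return preferred_family == AF_UNSPEC ? AF_INET : preferred_family;
}

/* Open the datagram socket with the chosen family. Returns the file
 * descriptor, or -1 with errno set as socket(2) does. */
int open_maddr_socket(int preferred_family)
{
	return socket(pick_family(preferred_family), SOCK_DGRAM, 0);
}
```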
Stephen Hemminger [Thu, 30 Aug 2018 14:55:49 +0000 (07:55 -0700)]
uapi: update bpf headers

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Cong Wang [Wed, 29 Aug 2018 17:09:27 +0000 (10:09 -0700)]
ss: add UNIX_DIAG_VFS and UNIX_DIAG_ICONS for unix sockets

UNIX_DIAG_VFS and UNIX_DIAG_ICONS are never used by ss,
make them available in ss -e output.

Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stefan Bader [Tue, 28 Aug 2018 14:27:29 +0000 (16:27 +0200)]
iprule: Fix destination prefix output

When adding support for JSON output the new code for printing
the destination prefix adds a stray blank character before
the bitmask. This causes some user-space parsing to fail.

Current output:
  ...: from x.x.x.x/l to y.y.y.y /l
Previous output:
  ...: from x.x.x.x/l to y.y.y.y/l

Fixes: 0dd4ccc5 "iprule: add json support"
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Toke Høiland-Jørgensen [Thu, 23 Aug 2018 10:05:05 +0000 (12:05 +0200)]
q_cake: Add description of the tc filter override mechanism to man page

Since CAKE now has three different settings that can be overridden by tc
filters (priority and host and flow hashes), documenting how they work is
probably a good idea.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Luca Boccassi [Wed, 22 Aug 2018 18:09:03 +0000 (19:09 +0100)]
testsuite: run dmesg with sudo

Some distributions like Debian nowadays restrict the dmesg command to
root-only. Run it with sudo in the testsuite.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Luca Boccassi [Wed, 22 Aug 2018 18:09:02 +0000 (19:09 +0100)]
testsuite: let make compile build the netlink helper

The generate_nlmsg binary is required but make -C testsuite compile
does not build it. Add the necessary includes and C*FLAGS to the tools
Makefile and have the compile target build it.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Luca Boccassi [Wed, 22 Aug 2018 18:09:01 +0000 (19:09 +0100)]
testsuite: remove all temp files and implement make clean

Some generated test files were not removed, including one executable in
the testsuite/tools directory.
Ensure make clean from the top level directory works for the testsuite
subdirs too, and that all the files are removed.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stefan Bader [Wed, 22 Aug 2018 08:31:38 +0000 (10:31 +0200)]
testsuite: Handle large number of kernel options

Once there are more than a certain number of kernel config options
set (this happened for us with kernel 4.17), the method of passing
those as command line arguments exceeds the maximum number of
arguments the shell supports. This causes the whole testsuite to
fail.
Instead, create a temporary file and modify its contents so that
the config option variables are exported. Then this file can be
sourced in before running the tests.

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Mon, 20 Aug 2018 23:01:31 +0000 (16:01 -0700)]
tc: drop extern from function prototypes

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Mon, 20 Aug 2018 23:01:01 +0000 (16:01 -0700)]
genl: drop extern from function prototypes

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Mon, 20 Aug 2018 23:00:38 +0000 (16:00 -0700)]
bridge: drop extern from function prototypes

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Mon, 20 Aug 2018 22:58:50 +0000 (15:58 -0700)]
ip: drop extern from function prototype

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Fri, 17 Aug 2018 16:38:46 +0000 (18:38 +0200)]
lib: Make check_enable_color() return boolean

As suggested, turn return code into true/false although it's not checked
anywhere yet.

Fixes: 4d82962cccc6a ("Merge common code for conditionally colored output")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Fri, 17 Aug 2018 16:38:45 +0000 (18:38 +0200)]
Make colored output configurable

Allow for -color={never,auto,always} to have colored output disabled,
enabled only if stdout is a terminal or enabled regardless of stdout
state.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
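
The three-way option described in this commit could be modelled as below. This is a sketch under assumptions: the enum and function names are hypothetical, and the terminal check is passed in as a parameter (a caller would typically obtain it from isatty()).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical representation of -color={never,auto,always}. */
enum color_when { COLOR_WHEN_NEVER, COLOR_WHEN_AUTO, COLOR_WHEN_ALWAYS };

/* Parse the option argument. Returns 0 on success, -1 if unknown. */
int parse_color_when(const char *arg, enum color_when *out)
{
	if (strcmp(arg, "never") == 0)
		*out = COLOR_WHEN_NEVER;
	else if (strcmp(arg, "auto") == 0)
		*out = COLOR_WHEN_AUTO;
	else if (strcmp(arg, "always") == 0)
		*out = COLOR_WHEN_ALWAYS;
	else
		return -1;
	return 0;
}

/* Decide whether to colorize: "auto" follows the terminal state,
 * the other two modes ignore it. */
bool color_enabled(enum color_when when, bool stdout_is_tty)
{
	switch (when) {
	case COLOR_WHEN_ALWAYS:
		return true;
	case COLOR_WHEN_AUTO:
		return stdout_is_tty;
	default:
		return false;
	}
}
```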
Phil Sutter [Thu, 16 Aug 2018 10:27:59 +0000 (12:27 +0200)]
ip: Add missing -M flag to help text

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Thu, 16 Aug 2018 17:30:10 +0000 (10:30 -0700)]
Merge branch 'ipmonitor'

Stephen Hemminger [Thu, 16 Aug 2018 17:28:13 +0000 (10:28 -0700)]
genl: code cleanup

Run through checkpatch and cleanup line wraps etc.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Thu, 16 Aug 2018 10:28:02 +0000 (12:28 +0200)]
man: ss.8: Describe --events option

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Thu, 16 Aug 2018 10:28:01 +0000 (12:28 +0200)]
rtmon: List options in help text

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Thu, 16 Aug 2018 10:28:00 +0000 (12:28 +0200)]
man: rtacct.8: Fix nstat options

Add missing --pretty and --json options, correct --zero to --zeros and
correct the mess around --scan/--interval including broken man page
formatting.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Thu, 16 Aug 2018 10:27:58 +0000 (12:27 +0200)]
man: ifstat.8: Document --json and --pretty options

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Thu, 16 Aug 2018 10:27:57 +0000 (12:27 +0200)]
genl: Fix help text

The '| help' part was misleading: In fact, 'genl help' does not work but
'genl <OBJECT> help' does. Fix the help text to make that clear.

In addition to that, list -Version and -help flags as well.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Thu, 16 Aug 2018 10:27:56 +0000 (12:27 +0200)]
man: devlink.8: Document -verbose option

This was the only bit missing in comparison to devlink help text.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Thu, 16 Aug 2018 10:27:55 +0000 (12:27 +0200)]
devlink: trivial: Make help text consistent

Typically the part of the flag in brackets completes the leading part
instead of repeating it.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Thu, 16 Aug 2018 10:27:54 +0000 (12:27 +0200)]
bridge: trivial: Make help text consistent

Change curly braces into brackets for -json option in help text to be
consistent with the rest.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Thu, 16 Aug 2018 10:27:53 +0000 (12:27 +0200)]
man: bridge.8: Document -oneline option

Copied the description from ip.8.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Wed, 15 Aug 2018 21:29:42 +0000 (14:29 -0700)]
ipmonitor: decode DELNETCONF message

When device is deleted DELNETCONF is sent, but ipmonitor
was unable to decode it.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Wed, 15 Aug 2018 21:29:41 +0000 (14:29 -0700)]
ip: convert monitor to switch

The decoding of netlink message types is natural for a C
switch statement.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
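
As a rough illustration of why a switch suits this decoding, the sketch below maps a few rtnetlink message types to a category name. describe_msg_type() is a hypothetical function, not the ipmonitor code; the RTM_* constants come from <linux/rtnetlink.h>, and RTM_DELNETCONF requires reasonably recent kernel headers.

```c
#include <string.h>
#include <linux/rtnetlink.h>

/* Map a netlink message type to a short category name. Grouping the
 * NEW/DEL pairs under one case keeps each decoder in a single place. */
const char *describe_msg_type(unsigned short type)
{
	switch (type) {
	case RTM_NEWLINK:
	case RTM_DELLINK:
		return "link";
	case RTM_NEWADDR:
	case RTM_DELADDR:
		return "address";
	case RTM_NEWROUTE:
	case RTM_DELROUTE:
		return "route";
	case RTM_NEWNETCONF:
	case RTM_DELNETCONF:
		return "netconf";
	default:
		return "unknown";
	}
}
```

Adding support for a newly decoded type, such as DELNETCONF, then amounts to one extra case label.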
Phil Sutter [Tue, 14 Aug 2018 12:18:08 +0000 (14:18 +0200)]
testsuite: Add a first ss test validating ssfilter

This tests a few ssfilter expressions by selecting sockets from a TCP
dump file. The dump was created using the following command:

| ss -ntaD testsuite/tests/ss/ss1.dump

It is fed into ss via TCPDIAG_FILE environment variable.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Tue, 14 Aug 2018 12:18:07 +0000 (14:18 +0200)]
testsuite: Prepare for ss tests

This merges the shared bits from ts_tc() and ts_ip() into a common
function for being wrapped by the first ones and adds a third ts_ss()
for testing ss commands.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Tue, 14 Aug 2018 12:18:06 +0000 (14:18 +0200)]
ss: Review ssfilter

The original problem was ssfilter rejecting single expressions if
enclosed in braces, such as:

| sport = 22 or ( dport = 22 )

This is fixed by allowing 'expr' to be an 'exprlist' enclosed in braces.
The no longer required recursion in 'exprlist' being an 'exprlist'
enclosed in braces is dropped.

In addition to that, a few other things are changed:

* Remove pointless 'null' prefix in 'applet' before 'exprlist'.
* For simple equals matches, '=' operator was required for ports but not
  allowed for hosts. Make this consistent by making '=' operator
  optional in both cases.

Reported-by: Samuel Mannehed <samuel@cendio.se>
Fixes: b2038cc0b2403 ("ssfilter: Eliminate shift/reduce conflicts")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Wed, 15 Aug 2018 09:18:26 +0000 (11:18 +0200)]
man: ip-route: Clarify referenced versions are Linux ones

Versioning scheme of Linux and iproute2 is similar, therefore the
referenced kernel versions are likely to confuse readers. Clarify this
by prefixing each kernel version by 'Linux' prefix.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Wed, 15 Aug 2018 21:21:45 +0000 (14:21 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/network/iproute2/iproute2-next

David Ahern [Wed, 15 Aug 2018 16:56:30 +0000 (09:56 -0700)]
Merge branch 'iproute2-master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
Phil Sutter [Wed, 15 Aug 2018 16:21:26 +0000 (18:21 +0200)]
Merge common code for conditionally colored output

Instead of calling enable_color() conditionally with identical check in
three places, introduce check_enable_color() which does it in one place.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@gmail.com>
Phil Sutter [Wed, 15 Aug 2018 16:21:25 +0000 (18:21 +0200)]
bridge: Fix check for colored output

There is no point in calling enable_color() conditionally if it was
already called for each time '-color' flag was parsed. Align the
algorithm with that in ip and tc by actually making use of 'color'
variable.

Fixes: e9625d6aead11 ("Merge branch 'iproute2-master' into iproute2-next")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@gmail.com>
Phil Sutter [Wed, 15 Aug 2018 16:21:24 +0000 (18:21 +0200)]
tc: Fix typo in check for colored output

The check used binary instead of boolean AND, which means colored output
was enabled only if the number of specified '-color' flags was odd.

Fixes: 2d165c0811058 ("tc: implement color output")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David Ahern <dsahern@gmail.com>
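
The odd/even behaviour follows directly from how the two operators treat a counter. The sketch below assumes the flag count was combined with a 0/1 condition such as "not JSON"; the function names and the second operand are illustrative assumptions, not the original tc code.

```c
#include <stdbool.h>

/* Buggy form: bitwise AND of a counter with a 0/1 value only looks
 * at the counter's lowest bit, so an even count yields false. */
bool color_check_bitwise(int color_flags, bool json)
{
	return (color_flags & !json) != 0;
}

/* Fixed form: logical AND treats any non-zero count as true. */
bool color_check_logical(int color_flags, bool json)
{
	return color_flags && !json;
}
```

For example, passing '-color' twice gives a count of 2: the bitwise form computes 2 & 1, which is 0, while the logical form still enables color.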
Nishanth Devarajan [Tue, 14 Aug 2018 02:57:21 +0000 (08:27 +0530)]
Add SKB Priority qdisc support in tc(8)

sch_skbprio is a qdisc that prioritizes packets according to their skb->priority
field. Under congestion, it drops already-enqueued lower priority packets to
make space available for higher priority packets. Skbprio was conceived as a
solution for denial-of-service defenses that need to route packets with
different priorities as a means to overcome DoS attacks.

Signed-off-by: Nishanth Devarajan <ndev2021@gmail.com>
Reviewed-by: Michel Machado <michel@digirati.com.br>
Signed-off-by: David Ahern <dsahern@gmail.com>
Stephen Hemminger [Mon, 13 Aug 2018 19:17:53 +0000 (12:17 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/network/iproute2/iproute2-next

Stephen Hemminger [Mon, 13 Aug 2018 19:11:32 +0000 (12:11 -0700)]
v4.18.0

Tobias Klauser [Wed, 8 Aug 2018 12:33:40 +0000 (14:33 +0200)]
tc: bpf: update list of archs with eBPF support in manpage

Update the list of architectures supporting eBPF JIT as of Linux 4.18.
Also mention the Linux version where support for a particular
architecture was introduced. Finally, reformat the list of architectures
as a bullet list in order to make it more readable.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
David Ahern [Mon, 13 Aug 2018 14:46:26 +0000 (07:46 -0700)]
Merge branch 'iproute2-master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
Toke Høiland-Jørgensen [Mon, 13 Aug 2018 11:36:17 +0000 (13:36 +0200)]
sch_cake: Make gso-splitting configurable

This patch makes sch_cake's gso/gro splitting configurable
from userspace.

To disable breaking apart superpackets in sch_cake:

tc qdisc replace dev whatever root cake no-split-gso

to enable:

tc qdisc replace dev whatever root cake split-gso

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Stephen Hemminger [Fri, 27 Jul 2018 20:43:50 +0000 (13:43 -0700)]
ip: show min and max mtu

Add min/max MTU to the link details

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Sun, 12 Aug 2018 21:23:31 +0000 (14:23 -0700)]
Update kernel headers

Update kernel headers to commit
78cbac647e61 ("Merge branch 'ip-faster-in-order-IP-fragments'")

Signed-off-by: David Ahern <dsahern@gmail.com>
Guillaume Nault [Fri, 27 Jul 2018 10:26:32 +0000 (12:26 +0200)]
l2tp: drop lns_mode

This option is never set.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agol2tp: drop mtu
Guillaume Nault [Fri, 27 Jul 2018 10:26:31 +0000 (12:26 +0200)]
l2tp: drop mtu

This option can't be set by user and is never printed.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agol2tp: drop data_seq
Guillaume Nault [Fri, 27 Jul 2018 10:26:30 +0000 (12:26 +0200)]
l2tp: drop data_seq

This option can't be set by user and is never printed. Furthermore,
L2TP_ATTR_DATA_SEQ has always been a noop in Linux.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agotc: fix bugs for tcp_flags and ip_attr hex output
Keara Leibovitz [Thu, 26 Jul 2018 13:45:30 +0000 (09:45 -0400)]
tc: fix bugs for tcp_flags and ip_attr hex output

Fix hex output for both the ip_attr and tcp_flags print functions.

Sample usage:

$ $TC qdisc add dev lo ingress
$ $TC filter add dev lo parent ffff: prio 3 proto ip flower ip_tos 0x8/32
$ $TC filter add dev lo parent ffff: prio 5 proto ip flower ip_proto tcp \
tcp_flags 0x909/f00

$ $TC filter show dev lo parent ffff:

filter protocol ip pref 3 flower chain 0
filter protocol ip pref 3 flower chain 0 handle 0x1
  eth_type ipv4
  ip_tos 0x8/32
  not_in_hw
filter protocol ip pref 5 flower chain 0
filter protocol ip pref 5 flower chain 0 handle 0x1
  eth_type ipv4
  ip_proto tcp
  tcp_flags 0x909/f00
  not_in_hw

$ $TC -j filter show dev lo parent ffff:

[{
    "protocol":"ip",
    "pref":3,
    "kind":"flower",
    "chain":0
},{
    "protocol":"ip",
    "pref":3,
    "kind":"flower",
    "chain":0,
    "options": {
        "handle":1,
        "keys": {
            "eth_type":"ipv4",
            "ip_tos":"0x8/32"
        },
        "not_in_hw":true
    }
},{
    "protocol":"ip",
    "pref":5,
    "kind":"flower",
    "chain":0
},{
    "protocol":"ip",
    "pref":5,
    "kind":"flower",
    "chain":0,
    "options": {
        "handle":1,
        "keys": {
            "eth_type":"ipv4",
            "ip_proto":"tcp",
            "tcp_flags":"0x909/f00"
        },
        "not_in_hw":true
    }
}]

Signed-off-by: Keara Leibovitz <kleib@mojatatu.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
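
A minimal sketch of how a script might consume the JSON form shown in the commit above. The sample is trimmed from that output; the helper names (split_value_mask, tcp_flag_matches) are hypothetical, and the sketch assumes that both halves of the tcp_flags "value/mask" string are hexadecimal, as the sample suggests.

```python
import json

# Trimmed copy of the `tc -j filter show` sample from the commit message.
SAMPLE = """
[{"protocol":"ip","pref":5,"kind":"flower","chain":0},
 {"protocol":"ip","pref":5,"kind":"flower","chain":0,
  "options":{"handle":1,
             "keys":{"eth_type":"ipv4","ip_proto":"tcp",
                     "tcp_flags":"0x909/f00"},
             "not_in_hw":true}}]
"""


def split_value_mask(text):
    """Split a 'value/mask' string, reading both halves as hexadecimal (assumed)."""
    value, _, mask = text.partition("/")
    return int(value, 16), int(mask, 16)


def tcp_flag_matches(filters):
    """Yield (pref, value, mask) for each entry that carries a tcp_flags key."""
    for entry in filters:
        # Entries without "options" are the per-chain header objects; skip them.
        keys = entry.get("options", {}).get("keys", {})
        if "tcp_flags" in keys:
            value, mask = split_value_mask(keys["tcp_flags"])
            yield entry["pref"], value, mask


matches = list(tcp_flag_matches(json.loads(SAMPLE)))
print(matches)
```

Entries that lack an "options" object are skipped rather than treated as errors, since the sample output lists each filter twice and only the second object carries the keys.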
5 years agoip link: don't stop batch processing
Matteo Croce [Fri, 3 Aug 2018 17:49:33 +0000 (19:49 +0200)]
ip link: don't stop batch processing

When 'ip link show dev DEVICE' is processed in batch mode, ip exits
and stops processing further commands.
This is because ipaddr_list_flush_or_save() calls exit() to avoid printing
the link information twice.
Replace the exit() with a classic 'goto out'.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
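
The effect described above can be illustrated with a small Python analogue. This is not the ip code (which is C); the handler and loop names are hypothetical. It only shows why a handler that terminates the process cuts a batch short, while one that returns lets the loop continue.

```python
import sys


def show_link_exiting(dev, output):
    """Handler that prints once and then terminates the process."""
    output.append(f"link {dev}")
    sys.exit(0)  # ends the whole program, not just this command


def show_link_returning(dev, output):
    """Handler that prints once and hands control back to the batch loop."""
    output.append(f"link {dev}")
    return 0


def run_batch(devices, handler):
    """Run one 'show' command per device and collect what was printed."""
    output = []
    try:
        for dev in devices:
            handler(dev, output)
    except SystemExit:
        # Caught only so this sketch can report what a real exit would cut off.
        pass
    return output


batch = ["eth0", "eth1", "lo"]
print(run_batch(batch, show_link_exiting))    # only the first command ran
print(run_batch(batch, show_link_returning))  # all three commands ran
```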
5 years agotc: flush after each command in batch mode
Stephen Hemminger [Wed, 8 Aug 2018 16:23:48 +0000 (09:23 -0700)]
tc: flush after each command in batch mode

After each command flush output.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agolib/namespace: avoid double-mounting a /sys
Lubomir Rintel [Tue, 24 Jul 2018 17:26:38 +0000 (19:26 +0200)]
lib/namespace: avoid double-mounting a /sys

This partly reverts 8f0807023d067e2bb585a2ae8da93e59689d10f1, bringing
back the umount(/sys) attempt.

In an LXC container we're unable to umount the sysfs instance, nor mount
a read-write one. We still are able to create a new read-only instance.

Nevertheless, it still makes sense to attempt the umount() even though
the sysfs is mounted read-only. Otherwise we may end up attempting to
mount a sysfs with the same flags as is already mounted, resulting in
an EBUSY error (meaning "Already mounted").

Perhaps this is not a very likely scenario in the real world, but we hit
it in the NetworkManager test suite, and the change makes netns_switch()
somewhat more robust. It also fixes the case when /sys wasn't mounted at all.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoip: show min and max mtu
Stephen Hemminger [Fri, 27 Jul 2018 20:30:19 +0000 (13:30 -0700)]
ip: show min and max mtu

Add min/max MTU to the link details

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoip/address: fix bracketing in help message
Stephen Hemminger [Fri, 27 Jul 2018 20:26:21 +0000 (13:26 -0700)]
ip/address: fix bracketing in help message

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Wed, 25 Jul 2018 17:08:04 +0000 (10:08 -0700)]
Merge branch 'iproute2-master' into iproute2-next

Conflicts:
include/uapi/linux/bpf.h

Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agotc: introduce support for chain templates
Jiri Pirko [Mon, 23 Jul 2018 07:24:40 +0000 (09:24 +0200)]
tc: introduce support for chain templates

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agoip: Add violation counters to VF statistics
Eran Ben Elisha [Sun, 22 Jul 2018 10:31:12 +0000 (13:31 +0300)]
ip: Add violation counters to VF statistics

Extend VF statistics with receive and transmit violation counters.

Example: "ip -s link show dev enp5s0f0"

6: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 24:8a:07:a5:28:f0 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    0          0        0       0       0       2
    TX: bytes  packets  errors  dropped carrier collsns
    1406       17       0       0       0       0
    vf 0 MAC 00:00:ca:fe:ca:fe, vlan 5, spoof checking off, link-state auto, trust off, query_rss off
    RX: bytes  packets  mcast   bcast   dropped
    1666       29       14      32      0
    TX: bytes  packets  dropped
    2880       44       2412

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agoUpdate kernel headers
David Ahern [Wed, 25 Jul 2018 16:58:00 +0000 (09:58 -0700)]
Update kernel headers

Update kernel headers to commit
aea5f654e6b7 ("net/sched: add skbprio scheduler")

Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agordma: uapi update ib_user_verbs.h
Stephen Hemminger [Mon, 23 Jul 2018 20:49:20 +0000 (13:49 -0700)]
rdma: uapi update ib_user_verbs.h

Merge in the latest sanitized kernel header.
Add the sanitized version of the current ib_user_verbs.h.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agouapi: fix tcp.h repair
Stephen Hemminger [Mon, 23 Jul 2018 20:47:22 +0000 (13:47 -0700)]
uapi: fix tcp.h repair

Upstream define for TCP_REPAIR changed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agodevlink: CTRL_ATTR_FAMILY_ID is a u16
David Ahern [Fri, 20 Jul 2018 16:35:26 +0000 (09:35 -0700)]
devlink: CTRL_ATTR_FAMILY_ID is a u16

CTRL_ATTR_FAMILY_ID is a u16, not a u32. Update devlink accordingly.

Fixes: a3c4b484a1edd ("add devlink tool")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
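
A small sketch of why the width matters, assuming the usual netlink attribute layout (a 4-byte header holding length and type, then the payload, padded to a 4-byte boundary). It is not devlink's code; the helper names are hypothetical, and the zero padding bytes are an assumption of this example. Reading the 2-byte payload as a u32 also consumes the two padding bytes, which happens to give the right number on little-endian hosts and a wrong one on big-endian hosts.

```python
import struct

CTRL_ATTR_FAMILY_ID = 1  # attribute type number from linux/genetlink.h
NLA_HDRLEN = 4           # u16 length + u16 type


def build_u16_attr(attr_type, value, byteorder):
    """Build one attribute with a u16 payload; byteorder is '<' or '>'."""
    payload = struct.pack(byteorder + "H", value)
    header = struct.pack(byteorder + "HH", NLA_HDRLEN + len(payload), attr_type)
    padding = b"\x00" * (-len(payload) % 4)  # assumed zero-filled
    return header + payload + padding


def read_u16(attr, byteorder):
    """Read the payload with the correct width."""
    return struct.unpack_from(byteorder + "H", attr, NLA_HDRLEN)[0]


def read_u32(attr, byteorder):
    """Read the payload with the wrong width: also swallows the padding."""
    return struct.unpack_from(byteorder + "I", attr, NLA_HDRLEN)[0]


family_id = 0x15  # illustrative value only
for order, name in (("<", "little-endian"), (">", "big-endian")):
    attr = build_u16_attr(CTRL_ATTR_FAMILY_ID, family_id, order)
    print(name, hex(read_u16(attr, order)), hex(read_u32(attr, order)))
```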
5 years agoMerge branch 'tc-tunnels-tos-ttl' into iproute2-next
David Ahern [Fri, 20 Jul 2018 15:59:43 +0000 (08:59 -0700)]
Merge branch 'tc-tunnels-tos-ttl' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agotc/flower: Add match on encapsulating tos/ttl
Or Gerlitz [Thu, 19 Jul 2018 11:02:15 +0000 (14:02 +0300)]
tc/flower: Add match on encapsulating tos/ttl

Add matching on tos/ttl of the IP tunnel headers.

For example, here is a decap rule that matches on the tunnel tos:

tc filter add dev vxlan_sys_4789 protocol ip parent ffff: prio 10 flower \
   enc_src_ip 192.168.10.2 enc_dst_ip 192.168.10.1 enc_key_id 100 enc_dst_port 4789 enc_tos 0x30 \
   src_mac e4:11:22:33:44:70 dst_mac e4:11:22:33:44:50  \
   action tunnel_key unset \
   action mirred egress redirect dev eth0_0

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agotc/act_tunnel_key: Enable setup of tos and ttl
Or Gerlitz [Thu, 19 Jul 2018 11:02:14 +0000 (14:02 +0300)]
tc/act_tunnel_key: Enable setup of tos and ttl

Allow setting tos and ttl for the tunnel.

For example, here is an encap rule that sets the tunnel tos:

tc filter add dev eth0_0 protocol ip parent ffff: prio 10 flower \
   src_mac e4:11:22:33:44:50 dst_mac e4:11:22:33:44:70 \
   action tunnel_key set src_ip 192.168.10.1 dst_ip 192.168.10.2 id 100 dst_port 4789 tos 0x30 \
   action mirred egress redirect dev vxlan_sys_4789

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agoUpdate kernel headers
David Ahern [Fri, 20 Jul 2018 15:57:23 +0000 (08:57 -0700)]
Update kernel headers

Update kernel headers to
a3eed83a1895 ("Merge branch 'qed-Add-support-for-phy-module-query'")

Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agoq_cake: Rename autorate_ingress parameter to use dash as word separator
Toke Høiland-Jørgensen [Thu, 19 Jul 2018 16:55:29 +0000 (18:55 +0200)]
q_cake: Rename autorate_ingress parameter to use dash as word separator

This is consistent with the other multi-word parameters. Also change the
JSON output to be consistent with the way it is formatted for the other
options.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>
5 years agotc: Do not use addattr_nest_compat on mqprio and netem
Jesus Sanchez-Palencia [Mon, 16 Jul 2018 17:52:18 +0000 (10:52 -0700)]
tc: Do not use addattr_nest_compat on mqprio and netem

Here we are partially reverting commit c14f9d92eee107
("treewide: Use addattr_nest()/addattr_nest_end() to handle nested
attributes").

As discussed in [1], changing from the 'manually' coded version that
used addattr_l() to addattr_nest_compat() wasn't functionally
equivalent, because the messages now have extra fields appended to them.

This introduced a regression, since the parse_attr() implementations
in both mqprio and netem can't handle this new message format.

Without this fix, mqprio returns an error. netem won't return an error
but its internal configuration ends up wrong.

As an example, this can be reproduced by the following commands when
this patch is not applied:

 1) mqprio
$ tc qdisc replace dev enp3s0 parent root handle 100 mqprio \
num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
queues 1@0 1@1 2@2 hw 0

RTNETLINK answers: Numerical result out of range

 2) netem
$ tc qdisc add dev enp3s0 root netem rate 5kbit 20 100 5 \
distribution normal latency 1 1

$ tc -s qdisc

(...)
qdisc netem 8001: dev enp3s0 root refcnt 9 limit 1000 delay 0us  0us
 Sent 402 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
(...)

With this patch applied, the tc -s qdisc command above for netem instead
reads:

(...)
qdisc netem 8002: dev enp3s0 root refcnt 9 limit 1000 delay 0us  0us \
rate 5Kbit packetoverhead 20 cellsize 100 celloverhead 5
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
(...)

[1] https://patchwork.ozlabs.org/patch/867860/#1893405

Fixes: c14f9d92eee107 ("treewide: Use addattr_nest()/addattr_nest_end() to handle nested attributes")
Reported-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
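
The layout difference can be sketched in a few lines of Python. This is a toy model, not libnetlink: it assumes that the manual version places the attributes directly after the fixed options struct, and that addattr_nest_compat() emits the struct and then opens a second attribute header of the same type around them. The type numbers and payload bytes are illustrative placeholders, and all function names are hypothetical.

```python
import struct

RTA_HDRLEN = 4   # u16 length + u16 type
OUTER_TYPE = 2   # placeholder for the options attribute
INNER_TYPE = 6   # placeholder for one qdisc-specific attribute


def rta_align(n):
    """Round n up to the 4-byte attribute alignment."""
    return (n + 3) & ~3


def attr(attr_type, payload):
    """Encode one attribute: header, payload, zero padding."""
    length = RTA_HDRLEN + len(payload)
    return (struct.pack("=HH", length, attr_type) + payload
            + b"\x00" * (rta_align(length) - length))


def manual_layout(fixed, attrs):
    """Old layout: fixed struct followed directly by the attributes."""
    fixed = fixed.ljust(rta_align(len(fixed)), b"\x00")
    return attr(OUTER_TYPE, fixed + attrs)


def nest_compat_layout(fixed, attrs):
    """Assumed new layout: an extra same-type header wraps the attributes."""
    fixed = fixed.ljust(rta_align(len(fixed)), b"\x00")
    return attr(OUTER_TYPE, fixed + attr(OUTER_TYPE, attrs))


def types_after_struct(message, fixed_len):
    """List attribute types found right after the struct (old-style parser)."""
    end = struct.unpack_from("=H", message, 0)[0]
    offset = RTA_HDRLEN + rta_align(fixed_len)
    found = []
    while offset + RTA_HDRLEN <= end:
        length, attr_type = struct.unpack_from("=HH", message, offset)
        if length < RTA_HDRLEN:
            break  # malformed attribute; stop walking
        found.append(attr_type)
        offset += rta_align(length)
    return found


fixed = bytes(8)                           # stand-in for the options struct
attrs = attr(INNER_TYPE, b"\x01\x02\x03\x04")
old = manual_layout(fixed, attrs)
new = nest_compat_layout(fixed, attrs)
print(types_after_struct(old, len(fixed)))  # the expected inner attribute
print(types_after_struct(new, len(fixed)))  # only the extra wrapper is seen
```

Under these assumptions a parser written for the old layout finds the wrapper header where it expects the first real attribute, which is consistent with the mqprio error and the silently wrong netem configuration reported above.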
5 years agoAdd support for CAKE qdisc
Toke Høiland-Jørgensen [Thu, 19 Jul 2018 13:56:17 +0000 (15:56 +0200)]
Add support for CAKE qdisc

sch_cake is intended to squeeze the most bandwidth and latency out of even
the slowest ISP links and routers, while presenting an API simple enough
that even an ISP can configure it.

Example of use on a cable ISP uplink:

tc qdisc add dev eth0 cake bandwidth 20Mbit nat docsis ack-filter

To shape a cable download link (ifb and tc-mirred setup elided)

tc qdisc add dev ifb0 cake bandwidth 200mbit nat docsis ingress wash besteffort

Cake is filled with:

* A hybrid Codel/Blue AQM algorithm, "Cobalt", tied to an FQ_Codel
  derived Flow Queuing system, which autoconfigures based on the bandwidth.
* A novel "triple-isolate" mode (the default) which balances per-host
  and per-flow FQ even through NAT.
* A deficit-based shaper that can also be used in an unlimited mode.
* 8 way set associative hashing to reduce flow collisions to a minimum.
* A reasonable interpretation of various diffserv latency/loss tradeoffs.
* Support for zeroing diffserv markings for entering and exiting traffic.
* Support for interacting well with Docsis 3.0 shaper framing.
* Support for DSL framing types and shapers.
* Support for ack filtering.
* Extensive statistics for measuring loss, ECN markings, and latency variation.

Various versions baking have been available as an out of tree build for
kernel versions going back to 3.10, as the embedded router world has been
running a few years behind mainline Linux. A stable version has been
generally available on lede-17.01 and later.

sch_cake replaces a combination of iptables, tc filter, htb and fq_codel
in the sqm-scripts, with sane defaults and vastly simpler configuration.

Cake's principal author is Jonathan Morton, with contributions from
Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen, Sebastian Moeller,
Ryan Mounce, Tony Ambardar, Dean Scarff, Nils Andreas Svee, Dave Täht,
and Loganaden Velvindron.

Testing from Pete Heist, Georgios Amanakis, and the many other members of
the cake@lists.bufferbloat.net mailing list.

Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David Ahern <dsahern@gmail.com>