git.proxmox.com Git - mirror_iproute2.git/log
6 years agoUpdate kernel headers to 4.15-rc8
David Ahern [Fri, 19 Jan 2018 20:33:41 +0000 (12:33 -0800)]
Update kernel headers to 4.15-rc8

Update kernel headers to commit 30c3e9d47035
("l2tp: remove switch block in l2tp_nl_cmd_session_create()")

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'tc-batch' into net-next
David Ahern [Mon, 15 Jan 2018 16:25:30 +0000 (08:25 -0800)]
Merge branch 'tc-batch' into net-next

Chris Mi says:

====================
Currently in tc batch mode, only one command is read from the batch
file and sent to kernel to process. With this patchset, at most 128
commands can be accumulated before sending to kernel.

We introduced a new function in patch 1 to support sending
multiple messages. In patch 2, we add this support for filter
add/delete/change/replace and actions add/change/replace commands.

But please note that kernel still processes the requests one by one.
To process the requests in parallel in kernel is another effort.
The time we're saving in this patchset is the user mode and kernel mode
context switch. So this patchset works on top of the current kernel.

Using the following script in kernel, we can generate 1,000,000 rules.
tools/testing/selftests/tc-testing/tdc_batch.py

Without this patchset, 'tc -b $file' execution time is:

real    0m15.555s
user    0m7.211s
sys     0m8.284s

With this patchset, 'tc -b $file' execution time is:

real    0m12.360s
user    0m6.082s
sys     0m6.213s

The insertion rate is improved more than 10%.
====================
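
For reference, the timings quoted in the cover letter can be turned into
percentages with a small calculation. This is only a worked sketch; the helper
names are invented here and are not part of the patchset.

```c
/* Percentage by which wall-clock time shrinks: (before - after) / before. */
static double time_saved_pct(double before_s, double after_s)
{
	return (before_s - after_s) / before_s * 100.0;
}

/* Percentage by which the insertion rate grows for a fixed number of
 * rules: before / after - 1. */
static double rate_gain_pct(double before_s, double after_s)
{
	return (before_s / after_s - 1.0) * 100.0;
}
```

With the "real" times above (15.555s and 12.360s), the first helper gives
roughly 20.5 and the second roughly 25.8, consistent with the "more than 10%"
claim.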

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: Add batchsize feature for filter and actions
Chris Mi [Fri, 12 Jan 2018 05:13:16 +0000 (14:13 +0900)]
tc: Add batchsize feature for filter and actions

Currently in tc batch mode, only one command is read from the batch
file and sent to kernel to process. With this support, at most 128
commands can be accumulated before sending to kernel.

Now it only works for the following successive commands:
1. filter add/delete/change/replace
2. actions add/change/replace
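
The accumulate-then-send idea can be sketched as below. This is a minimal
illustration, not the code added to tc: struct batch, batch_add() and
batch_flush() are hypothetical names, and the flush only counts messages where
a real caller would hand the iovec array to a multi-message send function.

```c
#include <stddef.h>
#include <sys/uio.h>

#define BATCH_MAX 128	/* upper bound quoted in the commit message */

/* Hypothetical accumulator; not the structure used by tc itself. */
struct batch {
	struct iovec iov[BATCH_MAX];
	size_t count;	/* messages currently queued */
	size_t flushes;	/* how many times the queue was handed off */
	size_t sent;	/* total messages handed off so far */
};

/* Hand all queued messages off in one go. A real caller would pass
 * b->iov and b->count to its send function at this point. */
static void batch_flush(struct batch *b)
{
	if (b->count == 0)
		return;
	b->sent += b->count;
	b->flushes++;
	b->count = 0;
}

/* Queue one message and flush automatically once BATCH_MAX is reached. */
static void batch_add(struct batch *b, void *msg, size_t len)
{
	b->iov[b->count].iov_base = msg;
	b->iov[b->count].iov_len = len;
	if (++b->count == BATCH_MAX)
		batch_flush(b);
}
```

A final batch_flush() after the last line of the batch file is still needed,
since the last group is usually smaller than BATCH_MAX.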

Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agolib/libnetlink: Add a new function rtnl_talk_iov
Chris Mi [Fri, 12 Jan 2018 05:13:15 +0000 (14:13 +0900)]
lib/libnetlink: Add a new function rtnl_talk_iov

rtnl_talk can only send a single message to kernel. Add a new function
rtnl_talk_iov that can send multiple messages to kernel.
rtnl_talk_iov takes struct iovec * and iovlen as arguments.
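
As a rough illustration of the underlying mechanism, several buffers described
by a struct iovec array can be handed to the kernel with one sendmsg() call.
The helper below is a sketch under that assumption; send_iov() is a made-up
name, it works on any socket, and it leaves out the reply handling that a
netlink request needs.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send every buffer described by iov[0..iovlen-1] with a single sendmsg()
 * call. Returns the number of bytes sent, or -1 on error. */
static ssize_t send_iov(int fd, struct iovec *iov, size_t iovlen)
{
	struct msghdr msg;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = iov;
	msg.msg_iovlen = iovlen;

	return sendmsg(fd, &msg, 0);
}
```

The point of the single call is that the user/kernel transition is paid once
per group of messages rather than once per message.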

Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'master' into net-next
David Ahern [Mon, 8 Jan 2018 18:10:45 +0000 (10:10 -0800)]
Merge branch 'master' into net-next

 Conflicts:
man/man8/ip-link.8.in

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agolink_iptnl: Open "encap" JSON object
Serhey Popovych [Tue, 2 Jan 2018 21:27:59 +0000 (23:27 +0200)]
link_iptnl: Open "encap" JSON object

A pair of open_json_object()/close_json_object() calls seems to be
missing from the iptnl implementation.

Note that we open "encap" JSON object in ip6tnl.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agolink_iptnl: Print tunnel mode
Serhey Popovych [Tue, 2 Jan 2018 21:27:58 +0000 (23:27 +0200)]
link_iptnl: Print tunnel mode

Tunnel mode does not appear in parameters print for iptnl
supported tunnels like ipip and sit, while printed for
ip6tnl.

Print tunnel mode as "proto" field name for JSON and
without any name when printing to cli to follow ip6tnl
behaviour.

For non JSON output we have:

   $ ip -d link show dev sit1

Before:
-------
17: sit1@NONE: <NOARP> mtu 1480 qdisc noop state DOWN ...
    link/sit X.X.X.X brd 0.0.0.0 promiscuity 0
    sit remote any local X.X.X.X ...
        ~~~

After:
------
17: sit1@NONE: <NOARP> mtu 1480 qdisc noop state DOWN ...
    link/sit X.X.X.X brd 0.0.0.0 promiscuity 0
    sit any remote any local X.X.X.X ...
        ^^^

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agolink_iptnl: Kill code duplication
Serhey Popovych [Tue, 2 Jan 2018 21:27:57 +0000 (23:27 +0200)]
link_iptnl: Kill code duplication

The "mode" parameter handling for sit and ipip is nearly the same,
except that sit also has the "ip6ip" mode: check it only when
configuring sit.

Note that there is no need for strcmp(lu->id, "ipip"): if it is
not sit it is "ipip" because only these two link utils are
defined in the module.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agodevlink, rdma, tipc: properly define TARGETS without HAVE_MNL
Matthias Schiffer [Wed, 3 Jan 2018 15:28:52 +0000 (16:28 +0100)]
devlink, rdma, tipc: properly define TARGETS without HAVE_MNL

Leaving a variable with a generic name such as TARGETS undefined would lead
to Make picking up its value from the environment. Avoid this by always
defining TARGETS in the Makefiles.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
6 years agoip: link: add support for netdevsim device type
Jakub Kicinski [Tue, 2 Jan 2018 22:54:52 +0000 (14:54 -0800)]
ip: link: add support for netdevsim device type

netdevsim is a new software device for testing kernel APIs
without any hardware attached.  Allow users to create such
devices.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
6 years agoman: fix small formatting errors
Luca Boccassi [Tue, 2 Jan 2018 17:42:16 +0000 (18:42 +0100)]
man: fix small formatting errors

Lintian detected the following formatting errors:

 man/man8/devlink-sb.8.gz 230: warning: macro `b' not defined
 man/man8/ip-link.8.gz 1243: warning: macro `in-8' not defined
  (possibly missing space after `in')
 man/man8/tc-u32.8.gz `R' is a string (producing the registered sign),
  not a macro.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoman: routel/routef: don't mention filesystem paths
Luca Boccassi [Sat, 30 Dec 2017 10:31:17 +0000 (11:31 +0100)]
man: routel/routef: don't mention filesystem paths

The filesystem paths to these scripts might be different on various
distros, so don't mention it in the manpages. It is not really useful
information anyway.

Originally submitted as Debian bug:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561424

Reported-by: jidanni@jidanni.org
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoman: ip-address: document 15-char limit for LABEL
Luca Boccassi [Sat, 30 Dec 2017 10:31:16 +0000 (11:31 +0100)]
man: ip-address: document 15-char limit for LABEL

Trying to set a label longer than 15 characters returns an error:
 RTNETLINK answers: Numerical result out of range

Document the limit in the manpage.

Originally reported as a Debian bug:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=661886

Reported-by: Gabor Kiss <kissg@ssg.ki.iif.hu>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoman: add more keywords to ip.8 short description
Luca Boccassi [Sat, 30 Dec 2017 10:31:15 +0000 (11:31 +0100)]
man: add more keywords to ip.8 short description

A Debian user suggested adding more network-related keywords to the
ip manpage, so that manpage-scraping and indexing software like
apropos can do a better job of categorizing the programs.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=877983

Suggested-by: Lynoure Braakman <lynoure@gmail.com>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoman: drop references to Debian-specific paths
Luca Boccassi [Sat, 30 Dec 2017 10:31:14 +0000 (11:31 +0100)]
man: drop references to Debian-specific paths

Documentation should be distribution-agnostic - any specific quirks
should be handled by downstream maintainers, if necessary.
Remove mentions of Debian paths and package names.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip/tunnel: Document "external" parameter
Serhey Popovych [Thu, 28 Dec 2017 11:11:42 +0000 (13:11 +0200)]
ip/tunnel: Document "external" parameter

Add it to ip-link(8) "type gre" output help message
as well as to ip-link(8) page.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agovxcan,veth: Forbid "type" for peer device
Serhey Popovych [Thu, 28 Dec 2017 11:01:04 +0000 (13:01 +0200)]
vxcan,veth: Forbid "type" for peer device

It is already given for original device we configure this
peer for.

Results from following command before/after change applied
are shown below:

  $ ip link add dev veth1a type veth peer name veth1b \
                           type veth peer name veth1c

Before:
-------

<no output, no netdevs created>

After:
------

Error: duplicate "type": "veth" is the second value.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoqdisc: print offload indication
Yuval Mintz [Tue, 26 Dec 2017 09:48:45 +0000 (11:48 +0200)]
qdisc: print offload indication

Use the newly added TCA_HW_OFFLOAD indication from kernel
to print a consistent 'offloaded' message to user when listing qdiscs.

Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agogre6/tunnel: Do not submit garbage in flowinfo
Serhey Popovych [Wed, 27 Dec 2017 11:28:15 +0000 (13:28 +0200)]
gre6/tunnel: Do not submit garbage in flowinfo

We always send flowinfo to the kernel. If flowlabel/tclass
was set first to non-inherit value and then reset to
inherit we do not clear flowlabel/tclass part in flowinfo,
send it to kernel and can get from the kernel back.

Even if we check for IP6_TNL_F_USE_ORIG_TCLASS and
IP6_TNL_F_USE_ORIG_FLOWLABEL when printing options
sending invalid flowlabel/tclass to the kernel seems
like a bad idea.

Note that ip6tnl always cleans the corresponding flowinfo
parts on inherit.
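
A sketch of the intended behaviour, assuming the usual layout of the IPv6 flow
info word (4-bit version, 8-bit traffic class, 20-bit flow label). The masks
are written in host byte order for readability and the function names are
hypothetical; the real code works on network-order values.

```c
#include <stdint.h>

/* Host-order masks assumed for illustration only. */
#define FLOWINFO_TCLASS		0x0FF00000u
#define FLOWINFO_FLOWLABEL	0x000FFFFFu

/* Store an explicit traffic class, replacing any previous one. */
static uint32_t set_tclass(uint32_t flowinfo, uint8_t tclass)
{
	flowinfo &= ~FLOWINFO_TCLASS;
	return flowinfo | ((uint32_t)tclass << 20);
}

/* Switching to "inherit" must drop the stale traffic class bits. */
static uint32_t inherit_tclass(uint32_t flowinfo)
{
	return flowinfo & ~FLOWINFO_TCLASS;
}

/* Likewise for the flow label. */
static uint32_t inherit_flowlabel(uint32_t flowinfo)
{
	return flowinfo & ~FLOWINFO_FLOWLABEL;
}
```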

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agogre,ip6tnl/tunnel: Fix noencap- support
Serhey Popovych [Wed, 27 Dec 2017 11:28:14 +0000 (13:28 +0200)]
gre,ip6tnl/tunnel: Fix noencap- support

We must clear bit, not set all but given bit.
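
The difference between the two operations can be shown with a small sketch.
The flag values and function names below are hypothetical and chosen only for
illustration.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define ENCAP_FLAG_CSUM		0x1
#define ENCAP_FLAG_CSUM6	0x2
#define ENCAP_FLAG_REMCSUM	0x4

/* Buggy pattern: ORs in the complement, which sets every bit except 'flag'. */
static uint16_t noencap_buggy(uint16_t flags, uint16_t flag)
{
	return (uint16_t)(flags | ~flag);
}

/* Fixed pattern: ANDs with the complement, which clears only 'flag'. */
static uint16_t noencap_fixed(uint16_t flags, uint16_t flag)
{
	return (uint16_t)(flags & ~flag);
}
```

Starting from all three flags set (0x7), clearing ENCAP_FLAG_CSUM6 with the
fixed pattern leaves 0x5, whereas the buggy pattern turns on every other bit
of the 16-bit word.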

Fixes: 858dbb208e39 ("ip link: Add support for remote checksum offload to IP tunnels")
Fixes: 73516e128a5a ("ip6tnl: Support for fou encapsulation")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agordma: Move link execution logic to common code
Leon Romanovsky [Wed, 27 Dec 2017 07:57:59 +0000 (09:57 +0200)]
rdma: Move link execution logic to common code

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Rename rd_free_devmap to be rd_free
Leon Romanovsky [Wed, 27 Dec 2017 07:57:58 +0000 (09:57 +0200)]
rdma: Rename rd_free_devmap to be rd_free

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Rename free function to be rd_cleanup
Leon Romanovsky [Wed, 27 Dec 2017 07:57:57 +0000 (09:57 +0200)]
rdma: Rename free function to be rd_cleanup

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Get rid of dev_map_free call
Leon Romanovsky [Wed, 27 Dec 2017 07:57:56 +0000 (09:57 +0200)]
rdma: Get rid of dev_map_free call

The dev_map_free() is called once only and it is short,
so it is better to integrate it into the caller's site.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Print supplied device name in case of wrong name
Leon Romanovsky [Wed, 27 Dec 2017 07:57:55 +0000 (09:57 +0200)]
rdma: Print supplied device name in case of wrong name

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Check that port index exists before operate on link layer
Leon Romanovsky [Wed, 27 Dec 2017 07:57:54 +0000 (09:57 +0200)]
rdma: Check that port index exists before operate on link layer

The link layer operates on the port layer, hence it should check
its existence before executing commands.

Fixes: da990ab40a92 ("rdma: Add link object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Fix misspelled SYS_IMAGE_GUID
Leon Romanovsky [Wed, 27 Dec 2017 07:57:53 +0000 (09:57 +0200)]
rdma: Fix misspelled SYS_IMAGE_GUID

SYS_IMAGE_GUIG is actually SYS_IMAGE_GUID.

Fixes: da990ab40a92 ("rdma: Add link object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Move per-device handler function to generic code
Leon Romanovsky [Wed, 27 Dec 2017 07:57:52 +0000 (09:57 +0200)]
rdma: Move per-device handler function to generic code

Most of the proposed objects are working in the scope "dev"
and will implement the same logic. Move the code to utils.c,
so other objects will be able to reuse the code.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Protect dev_map_lookup from wrong input
Leon Romanovsky [Wed, 27 Dec 2017 07:57:51 +0000 (09:57 +0200)]
rdma: Protect dev_map_lookup from wrong input

Despite the fact that all callers to dev_map_lookup are ensuring that
there is always device name prior to call to that function, it is better
and safer to check that in the dev_map_lookup itself.

Fixes: 40df8263a0f0 ("rdma: Add dev object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agordma: Reduce scope of _dev_map_lookup call
Leon Romanovsky [Wed, 27 Dec 2017 07:57:50 +0000 (09:57 +0200)]
rdma: Reduce scope of _dev_map_lookup call

There are no external users of the _dev_map_lookup function,
so let's limit its scope to be local.

Fixes: 40df8263a0f0 ("rdma: Add dev object")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoerspan: add erspan usage description
William Tu [Tue, 26 Dec 2017 18:31:16 +0000 (10:31 -0800)]
erspan: add erspan usage description

The patch adds erspan usage description, so 'ip link help erspan'
and 'ip link help ip6erspan' shows the options.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip/tunnel: No need to free answer after rtnl_talk() on error
Serhey Popovych [Wed, 20 Dec 2017 07:57:10 +0000 (09:57 +0200)]
ip/tunnel: No need to free answer after rtnl_talk() on error

Since rtnl_talk() never returns with answer buffer allocated
on error we do not need to release it manually. After this
initializing answer with NULL before rtnl_talk() is useless.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agoutils: ll_addr: Handle ARPHRD_IP6GRE in ll_addr_n2a()
Serhey Popovych [Wed, 20 Dec 2017 07:57:09 +0000 (09:57 +0200)]
utils: ll_addr: Handle ARPHRD_IP6GRE in ll_addr_n2a()

ll_addr_n2a() correctly prints tunnel endpoints for gre, ipip, sit
and ip6tnl, but not for ip6gre. Fix this by adding ARPHRD_IP6GRE to
IPv6 tunnel endpoint address conversion.

Before:
-------

$ ip link show
...
18: ip6tnl0: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default
    link/tunnel6 :: brd ::
19: ip6gre0: <NOARP> mtu 1456 qdisc noop state DOWN mode DEFAULT group default
    link/gre6 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 brd \
00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00

After:
------

$ ip link show
...
18: ip6tnl0: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default
    link/tunnel6 :: brd ::
19: ip6gre0: <NOARP> mtu 1456 qdisc noop state DOWN mode DEFAULT group default
    link/gre6 :: brd ::
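
The kind of dispatch involved can be sketched as follows. This is not the
ll_addr_n2a() source: hw_addr_to_str() is a hypothetical helper, and the two
type constants are redefined locally with the values of ARPHRD_TUNNEL6 and
ARPHRD_IP6GRE so that the sketch does not depend on kernel headers.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>

/* Local copies of the ARPHRD_TUNNEL6 and ARPHRD_IP6GRE values. */
#define HW_TUNNEL6	769
#define HW_IP6GRE	823

/* Format a link-layer address: IPv6 notation for IPv6 tunnel types,
 * colon-separated hex bytes for everything else. */
static const char *hw_addr_to_str(const unsigned char *addr, size_t alen,
				  int type, char *buf, size_t blen)
{
	size_t i, off = 0;

	if (alen == 16 && (type == HW_TUNNEL6 || type == HW_IP6GRE))
		return inet_ntop(AF_INET6, addr, buf, (socklen_t)blen);

	if (blen == 0)
		return NULL;
	buf[0] = '\0';
	/* Each byte needs up to three characters plus the terminator. */
	for (i = 0; i < alen && off + 3 < blen; i++)
		off += (size_t)snprintf(buf + off, blen - off, "%s%02x",
					i ? ":" : "", (unsigned int)addr[i]);
	return buf;
}
```

Without the HW_IP6GRE case an all-zero ip6gre endpoint falls through to the
hex branch, which matches the "Before" output shown above.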

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agoerspan: add erspan version II support
William Tu [Wed, 20 Dec 2017 02:01:06 +0000 (18:01 -0800)]
erspan: add erspan version II support

The patch adds support for configuring the erspan v2, for both
ipv4 and ipv6 erspan implementation.  Three additional fields
are added: 'erspan_ver' for distinguishing v1 or v2, 'erspan_dir'
for specifying direction of the mirrored traffic, and 'erspan_hwid'
for users to set ERSPAN engine ID within a system.

As for manpage, the ERSPAN descriptions used to be under GRE, IPIP,
SIT Type paragraph.  Since IP6GRE/IP6GRETAP also supports ERSPAN,
the patch removes the old one, creates a separate ERSPAN paragraph,
and adds an example.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoUpdate headers from 4.15-rc3
David Ahern [Tue, 19 Dec 2017 20:58:58 +0000 (12:58 -0800)]
Update headers from 4.15-rc3

Update kernel headers to commit f39a5c01c3d2 ("Merge branch
'nfp-flower-add-Geneve-tunnel-support'")

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: "list/flush/save default" selected all of the routes
Alexander Zubkov [Sun, 17 Dec 2017 11:09:00 +0000 (12:09 +0100)]
iproute: "list/flush/save default" selected all of the routes

When running "ip route list default" and not specifying address family,
one will get all of the routes instead of just default only. The same
is for "exact default" and "match default".

It behaves this way because a default route with unspecified family
has the same all-zeroes value as no prefix specified at all. Thus
the following code blindly ignores the fact that a prefix was actually
specified.

This patch adds the flag PREFIXLEN_SPECIFIED to the default route too.
And then checks its value when filtering routes.
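
A much simplified sketch of the idea follows. struct prefix and both match
functions are hypothetical stand-ins, and the flag value is assumed; the real
filter also considers the address family and the address bits.

```c
#include <stdbool.h>

#define PREFIXLEN_SPECIFIED 1	/* flag name from this patch; value assumed */

/* Simplified, hypothetical stand-in for a parsed prefix. */
struct prefix {
	int flags;
	int bitlen;	/* prefix length; 0 for "default" */
};

/* Old check: an all-zero prefix is treated as "no filter given". */
static bool exact_match_old(const struct prefix *filter, int route_len)
{
	if (filter->bitlen == 0)
		return true;
	return filter->bitlen == route_len;
}

/* New check: skip filtering only when no prefix was given at all. */
static bool exact_match_new(const struct prefix *filter, int route_len)
{
	if (!(filter->flags & PREFIXLEN_SPECIFIED))
		return true;
	return filter->bitlen == route_len;
}
```

With a filter parsed from "default" (flag set, length 0), the old check lets a
/24 route through while the new one accepts only zero-length routes.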

Signed-off-by: Alexander Zubkov <green@msu.ru>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiproute: list/flush/save filter also by metric
Alexander Zubkov [Sun, 17 Dec 2017 12:02:11 +0000 (13:02 +0100)]
iproute: list/flush/save filter also by metric

Metric is one of the "unique key" fields of the route in Linux. But
still one can not use its value in filter while running ip list.
Because of this, writing checks in scripts, for example, is inconvenient.

Signed-off-by: Alexander Zubkov <green@msu.ru>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agolink_vti6: Always add local/remote endpoint attributes
Serhey Popovych [Mon, 18 Dec 2017 17:48:05 +0000 (19:48 +0200)]
link_vti6: Always add local/remote endpoint attributes

All tunnels already support parsing/adding zero
endpoints and vti6 isn't an exception.

This check was added as part of commit 2a80154fde40
(vti6: fix local/remote any addr handling) and looks
too restrictive as purpose of change is to avoid
endpoint configuration from uninitialized data.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agolink_ip6tnl: Use IN6ADDR_ANY_INIT to initialize local/remote endpoints
Serhey Popovych [Mon, 18 Dec 2017 17:48:04 +0000 (19:48 +0200)]
link_ip6tnl: Use IN6ADDR_ANY_INIT to initialize local/remote endpoints

Use specialized helper to initialize endpoint addresses with
zeros instead of open coding this. This unifies initialization
style with other ipv6 tunnel variants (i.e. gre6 and vti6).

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip/tunnel: Use tnl_parse_key() to parse tunnel key
Serhey Popovych [Mon, 18 Dec 2017 17:48:03 +0000 (19:48 +0200)]
ip/tunnel: Use tnl_parse_key() to parse tunnel key

It is added with
commit a7ed1520ee96 ("ip/tunnel: introduce tnl_parse_key()")
to avoid code duplication in ip6?tunnel.c.

Reuse it for gre/gre6 and vti/vti6 tunnel rtnl
configuration interface with the same purpose
it is used in tunnel ioctl interface in ip6?tunnel.c.

While there change type of key variables from
unsigned integer to __be32 to reflect nature of the
value they store and place error message in
tnl_parse_key() on a single line to make single
call to fprintf().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiplink: Kill redundant network device name checks
Serhey Popovych [Mon, 18 Dec 2017 18:54:08 +0000 (20:54 +0200)]
iplink: Kill redundant network device name checks

Since commit 625df645b703 (Check user supplied interface name lengths)
iplink_parse() validates network device name using check_ifname()
helpers.

Remove redundant "name" length checks from iplink_parse() callers.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiplink: Process "alias" parameter correctly
Serhey Popovych [Mon, 18 Dec 2017 18:54:07 +0000 (20:54 +0200)]
iplink: Process "alias" parameter correctly

Do not stop parameters processing after "alias" parameter: it might
not be a last one. Seems copy pasted from "type" parameter code.

Check that its length does not exceed IFALIASZ - 1. Better we warn
than get RTNL error.
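
The length check amounts to something like the sketch below. The constant is
defined locally and assumed to mirror IFALIASZ; alias_len_ok() is a
hypothetical name.

```c
#include <stdbool.h>
#include <string.h>

#define ALIAS_BUF_SIZE 256	/* assumed to mirror IFALIASZ */

/* An alias is acceptable if it fits with room for the terminating NUL. */
static bool alias_len_ok(const char *alias)
{
	return strlen(alias) <= ALIAS_BUF_SIZE - 1;
}
```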

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiplink: Improve index parameter handling
Serhey Popovych [Mon, 18 Dec 2017 18:54:06 +0000 (20:54 +0200)]
iplink: Improve index parameter handling

Correctly check for valid network device index supplied on
command line: indexes are always greater than zero. Check
for duplicate "index" argument.

Initialize @index to 0 to simplify handling it in iplink_modify().
Other callers (link_veth.c, iplink_vxcan.c) already did so.

No need to initialize ifi_index with 0 since it is already
initialized at the @struct req initialization time and not
modified in iplink_parse().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoutils: fix makeargs stack overflow
Stephen Hemminger [Mon, 18 Dec 2017 19:10:53 +0000 (11:10 -0800)]
utils: fix makeargs stack overflow

The makeargs() function did not handle end of string correctly
and would reference past end of string.

Found by fuzzing with ASAN.

Reported-by: Bug Basher <iamliketohack@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoss: fix crash with invalid command input file
Stephen Hemminger [Mon, 18 Dec 2017 17:51:02 +0000 (09:51 -0800)]
ss: fix crash with invalid command input file

If given an invalid input file with -F flag, ss would crash.
Examples of invalid input are line too long, or null file.

Found by fuzzing with ASAN.

Reported-by: Bug Basher <iamliketohack@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip: validate vlan value for vlan info
Stephen Hemminger [Fri, 15 Dec 2017 02:17:43 +0000 (18:17 -0800)]
ip: validate vlan value for vlan info

The VLAN tag must be 0..4095 to be valid.
Better to trap it here.
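
A range check of this kind could look like the sketch below. parse_vlan_id()
is a hypothetical helper written with plain strtoul(); it is not the function
used in ip.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse a VLAN tag and accept it only if it lies in 0..4095. */
static bool parse_vlan_id(const char *arg, unsigned int *vid)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(arg, &end, 0);
	/* Reject overflow, empty input, trailing junk and out-of-range tags. */
	if (errno || end == arg || *end != '\0' || val > 4095)
		return false;
	*vid = (unsigned int)val;
	return true;
}
```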

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip: gre: fix IFLA_GRE_LINK attribute sizing
Serhey Popovych [Wed, 13 Dec 2017 19:36:02 +0000 (21:36 +0200)]
ip: gre: fix IFLA_GRE_LINK attribute sizing

Attribute IFLA_GRE_LINK is 32 bit long, not 8 bit.
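
The effect of treating a 32-bit interface index as an 8-bit value can be
sketched like this; both helpers are hypothetical and only model the width of
the accessor, not the netlink attribute code itself.

```c
#include <stdint.h>

/* Keeps only the low 8 bits, as an 8-bit attribute accessor would. */
static uint32_t roundtrip_u8(uint32_t ifindex)
{
	return (uint8_t)ifindex;
}

/* Keeps all 32 bits. */
static uint32_t roundtrip_u32(uint32_t ifindex)
{
	return ifindex;
}
```

Any interface index above 255 is silently changed by the 8-bit variant, for
example 300 becomes 44.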

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agoip/tunnel: Use get_addr() instead of get_prefix() for local/remote endpoints
Serhey Popovych [Wed, 13 Dec 2017 19:36:01 +0000 (21:36 +0200)]
ip/tunnel: Use get_addr() instead of get_prefix() for local/remote endpoints

Manual page ip-link(8) states that both local and remote accept
IPADDR not PREFIX. Use get_addr() instead of get_prefix() to
parse local/remote endpoint address correctly.

Force corresponding address family instead of using preferred_family
to catch weird cases as shown below.

Before this patch it is possible to create tunnel with commands:

  ip    li add dev ip6gre2 type ip6gre local fe80::1/64 remote fe80::2/64
  ip -4 li add dev ip6gre2 type ip6gre local 10.0.0.1/24 remote 10.0.0.2/24
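
The stricter behaviour can be approximated with inet_pton(), which accepts
only a plain address of the requested family. parse_endpoint() is a
hypothetical helper used for illustration and is not get_addr() itself.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Accept a plain address of exactly the given family. inet_pton() rejects
 * "ADDR/LEN" prefix notation as well as addresses of the other family. */
static bool parse_endpoint(int family, const char *arg, void *dst)
{
	return inet_pton(family, arg, dst) == 1;
}
```

Under this check both commands above are refused for an ip6gre device: the
first because of the "/64" suffix, the second because the addresses are IPv4.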

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agoip/tunnel: Unify setup and accept zero address for local/remote endpoints
Serhey Popovych [Wed, 13 Dec 2017 19:36:00 +0000 (21:36 +0200)]
ip/tunnel: Unify setup and accept zero address for local/remote endpoints

It is fully legal to submit zero (INADDR_ANY/IN6ADDR_ANY_INIT)
value for local and/or remote endpoints for all tunnel drivers:
no need to additionally check this in userspace.

Note that all tunnel specific code already can pass zero address
to the kernel.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agoip: add vxcan/veth to ip-link man page
Oliver Hartkopp [Sat, 16 Dec 2017 11:38:57 +0000 (12:38 +0100)]
ip: add vxcan/veth to ip-link man page

veth and vxcan both create a virtual tunnel between a pair of virtual network
devices. This patch adds the content for the now supported vxcan netdevices
and the documentation to create peer devices for vxcan and veth.

Additionally, remove 'can', which was accidentally on the list of link types that
can be created by 'ip link add', as 'can' devices are real network devices.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoss: add missing path MTU parameter
Roman Mashak [Fri, 15 Dec 2017 14:27:42 +0000 (09:27 -0500)]
ss: add missing path MTU parameter

v3:
   Rebase and use out() instead of printf().
v2:
   Print the path MTU immediately after the MSS, as it is easier to parse
   for humans (suggested by Neal Cardwell).

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoinclude: qdisc offload defines
Stephen Hemminger [Sat, 16 Dec 2017 18:00:43 +0000 (10:00 -0800)]
include: qdisc offload defines

UAPI changes from upstream:
net: sched: Add TCA_HW_OFFLOAD
pkt_sched: Remove TC_RED_OFFLOADED from uapi

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Fri, 15 Dec 2017 05:19:54 +0000 (21:19 -0800)]
Merge branch 'master' into net-next

6 years agogre6: add collect metadata support
William Tu [Wed, 13 Dec 2017 02:22:52 +0000 (18:22 -0800)]
gre6: add collect metadata support

The patch adds 'external' option to support collect metadata
gre6 tunnel.  The 'external' keyword is already used to set the
device into collect metadata mode such as vxlan, geneve, ipip,
etc.  This patch extends support for ipv6 gre and gretap.
Example of L3 and L2 gre device:
bash:~# ip link add dev ip6gre123 type ip6gre external
bash:~# ip link add dev ip6gretap123 type ip6gretap external

Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
6 years agotc: fix command "tc actions del" hang issue
Chris Mi [Thu, 14 Dec 2017 09:09:00 +0000 (18:09 +0900)]
tc: fix command "tc actions del" hang issue

If command is RTM_DELACTION, a non-NULL pointer is passed to rtnl_talk().
Then flag NLM_F_ACK is not set on n->nlmsg_flags and netlink_ack() will
not be called. Command tc will wait for the reply forever.
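
The flag logic described here can be sketched as follows, assuming a Linux
build environment with <linux/netlink.h>. prepare_request() is a hypothetical
name and only models the decision, not the send and receive path.

```c
#include <linux/netlink.h>
#include <stddef.h>

/* If the caller does not ask for the reply message, request an explicit
 * acknowledgement so there is always something to wait for. With a
 * non-NULL answer pointer no ACK is requested, so a command the kernel
 * does not otherwise reply to would leave the caller waiting. */
static void prepare_request(struct nlmsghdr *n, void *answer)
{
	if (answer == NULL)
		n->nlmsg_flags |= NLM_F_ACK;
}
```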

Fixes: 86bf43c7c2fd ("lib/libnetlink: update rtnl_talk to support malloc buff at run time")
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiplink: add definitions for GSO_MAX
Stephen Hemminger [Fri, 15 Dec 2017 02:22:56 +0000 (18:22 -0800)]
iplink: add definitions for GSO_MAX

Until kernel exports these, add GSO_MAX values into iplink
rather than assuming they are UINT_MAX + 1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiplink: validate maximum gso_max_size
Solio Sarabia [Tue, 12 Dec 2017 22:25:22 +0000 (14:25 -0800)]
iplink: validate maximum gso_max_size

Validate the upper limit for gso_max_size, valid range is [0-65,536]
inclusive. Fix minor whitespace in iplink man page.

Signed-off-by: Solio Sarabia <solio.sarabia@intel.com>
6 years agotc: fix json array closing
Jiri Pirko [Wed, 13 Dec 2017 19:56:16 +0000 (20:56 +0100)]
tc: fix json array closing

Fixes: 2704bd625583 ("tc: jsonify actions core")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agoip: add vxcan to help text
Oliver Hartkopp [Wed, 13 Dec 2017 20:21:28 +0000 (21:21 +0100)]
ip: add vxcan to help text

Add missing tag 'vxcan' inside the help text which was missing in commit
efe459c76d35f ('ip: link add vxcan support').

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
6 years agoShow 'external' link mode in output
Phil Dibowitz [Tue, 12 Dec 2017 21:54:06 +0000 (13:54 -0800)]
Show 'external' link mode in output

Recently `external` support was added to the tunnel drivers, but there is no way
to introspect this from userspace. This adds support for that.

Now `ip -details link` shows it:

```
7: tunl60@NONE: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group
default qlen 1
    link/tunnel6 :: brd :: promiscuity 0
    ip6tnl external any remote :: local :: encaplimit 0 hoplimit 0 tclass 0x00 flowlabel 0x00000 (flowinfo 0x00000000) addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
```

Signed-off-by: Phil Dibowitz <phil@ipom.com>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Tue, 12 Dec 2017 20:12:20 +0000 (12:12 -0800)]
Merge branch 'master' into net-next

6 years agotc: bash-completion: add missing 'classid' keyword
Davide Caratti [Tue, 12 Dec 2017 15:45:15 +0000 (16:45 +0100)]
tc: bash-completion: add missing 'classid' keyword

users of 'matchall' filter can specify a value for the class id: update
bash-completion accordingly.

Fixes: b32c0b64fa2b ("tc: bash-completion: Add support for matchall")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
6 years agoss: Implement automatic column width calculation
Stefano Brivio [Tue, 12 Dec 2017 00:46:33 +0000 (01:46 +0100)]
ss: Implement automatic column width calculation

Group fitting fields into lines and space them equally using the
remaining screen width for each line. If columns don't fit on
one line, break them into the least possible amount of lines and
keep them aligned across lines.

This is done by:
 - recording the length of the longest item in each column during
   formatting and buffering (which was added in the previous patch)
 - fitting as many fields as possible on each line of output
 - distributing the remaining padding space equally between the
   columns
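
The fitting and padding steps listed above can be sketched as below. This is
a simplified illustration with invented function names; it ignores delimiters
and the alignment across lines that ss also takes care of.

```c
#include <stddef.h>

/* Count how many leading columns fit into 'screen' characters, given the
 * longest item recorded for each column. */
static size_t columns_that_fit(const int *width, size_t n, int screen)
{
	size_t i;
	int used = 0;

	for (i = 0; i < n; i++) {
		if (used + width[i] > screen)
			break;
		used += width[i];
	}
	return i;
}

/* Extra padding each of the first 'count' columns receives when the
 * leftover space is shared equally (integer division, remainder unused). */
static int padding_per_column(const int *width, size_t count, int screen)
{
	size_t i;
	int used = 0;

	if (count == 0)
		return 0;
	for (i = 0; i < count; i++)
		used += width[i];
	return (screen - used) / (int)count;
}
```

For column widths 10, 20, 30 and 40 on an 80-character screen, the first three
columns share a line with 6 extra characters each, and the fourth goes on the
next line.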

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
6 years agoss: Buffer raw fields first, then render them as a table
Stefano Brivio [Tue, 12 Dec 2017 00:46:32 +0000 (01:46 +0100)]
ss: Buffer raw fields first, then render them as a table

This allows us to measure the maximum field length for each
column before printing fields and will permit us to apply
optimal field spacing and distribution. Structure of the output
buffer with chunked allocation is described in comments.

Output is still unchanged, original spacing is used.

Running over one million sockets with -tul options by simply
modifying main() to loop 50,000 times over the *_show()
functions, buffering the whole output and rendering it at the
end, with 10 UDP sockets, 10 TCP sockets, while throwing
output away, doesn't show significant changes in execution time
on my laptop with an Intel i7-6600U CPU:

- before this patch:
$ time ./ss -tul > /dev/null
real 0m29.899s
user 0m2.017s
sys 0m27.801s

- after this patch:
$ time ./ss -tul > /dev/null
real 0m29.827s
user 0m1.942s
sys 0m27.812s

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
6 years agoss: Introduce columns lightweight abstraction
Stefano Brivio [Tue, 12 Dec 2017 00:46:31 +0000 (01:46 +0100)]
ss: Introduce columns lightweight abstraction

Instead of embedding spacing directly while printing contents,
logically declare columns and functions to buffer their content,
to print left and right spacing around fields, to flush them to
screen, and to print headers.

This makes it a bit easier to handle layout changes and prepares
for full output buffering, needed for optimal spacing in field
output layout.

Columns are currently set up to retain exactly the same output
as before. This needs some slight adjustments of the values
previously calculated in main(), as the width value introduced
here already includes the width of left delimiters and spacing
is not explicitly printed anymore whenever a field is printed.
These calculations will go away altogether once automatic width
calculation is implemented.

We can also remove explicit printing of newlines after the final
content for a given line is printed: flushing the last field on
a line causes field_flush() to print newlines where appropriate.

No changes in output expected here.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
6 years agoss: Replace printf() calls for "main" output by calls to helper
Stefano Brivio [Tue, 12 Dec 2017 00:46:30 +0000 (01:46 +0100)]
ss: Replace printf() calls for "main" output by calls to helper

This is preparation work for output buffering, which will allow
us to use optimal spacing and alignment of logical "columns".

The new out() function is just a re-implementation of a typical
libc's printf(), except that the return value of vfprintf() is
ignored as no callers use it. This implementation will be
replaced in the next patches to provide column width adjustment
and adequate spacing.
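
A helper of the kind described above could look roughly like this. It
is a minimal sketch based only on the description in this message (a
printf()-like wrapper that ignores the vfprintf() return value); the
actual out() in ss may differ in details.

```c
#include <stdarg.h>
#include <stdio.h>

/* Sketch of a printf()-like helper whose return value is deliberately
 * dropped, since no caller uses it. */
static void out(const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vfprintf(stdout, fmt, args);	/* return value ignored on purpose */
	va_end(args);
}
```

Routing all socket-list output through one function like this gives
later patches a single place to add buffering.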

All printf() calls that output parts of the socket list are now
replaced by calls to out(). Output of summary and version is
excluded from this.

No functional differences here, output not affected.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Tue, 12 Dec 2017 00:06:11 +0000 (16:06 -0800)]
Merge branch 'master' into net-next

6 years agouapi: tun add eBPF based queue selection method
Stephen Hemminger [Tue, 12 Dec 2017 00:03:27 +0000 (16:03 -0800)]
uapi: tun add eBPF based queue selection method

Upstream commit 96f84061620c6325a2ca9a9a05b410e6461d03c3
    tun: add eBPF based queue selection method

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agouapi: add access to snd_cwnd and other sock_ops
Stephen Hemminger [Tue, 12 Dec 2017 00:01:17 +0000 (16:01 -0800)]
uapi: add access to snd_cwnd and other sock_ops

From upstream kernel commit f19397a5c65665d66e3866b42056f1f58b7a366b
    bpf: Add access to snd_cwnd and others in sock_ops

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoss: remove duplicate assignment
Roman Mashak [Mon, 11 Dec 2017 21:24:31 +0000 (16:24 -0500)]
ss: remove duplicate assignment

Fixes: 8250bc9ff4e5 ("ss: Unify inet sockets output")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiplink: allow configuring GSO max values
Stephen Hemminger [Fri, 1 Dec 2017 19:52:34 +0000 (11:52 -0800)]
iplink: allow configuring GSO max values

This allows sending GSO maximum values when configuring a device.
The values are advisory. Most devices will ignore them but for some
pseudo devices such as veth pairs they can be set.

Example:
# ip link add dev vm1 type veth peer name vm2 gso_max_size 32768

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Sat, 9 Dec 2017 05:32:33 +0000 (21:32 -0800)]
Merge branch 'master' into net-next

6 years agotc: util: Don't call NEXT_ARG_FWD() in __parse_action_control()
Michal Privoznik [Fri, 8 Dec 2017 10:18:07 +0000 (11:18 +0100)]
tc: util: Don't call NEXT_ARG_FWD() in __parse_action_control()

Not all callers want parse_action_control*() to advance the
arguments. For instance act_parse_police() does the argument
advancing itself.

Fixes: e67aba559581 ("tc: actions: add helpers to parse and print control actions")
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
6 years agoss: print tcpi_rcv_ssthresh
Wei Wang [Fri, 8 Dec 2017 00:12:00 +0000 (16:12 -0800)]
ss: print tcpi_rcv_ssthresh

tcpi_rcv_ssthresh is an important statistic when debugging receive-side
behavior. Add it to the ss output.

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
6 years agoupdate headers from 4.15-rc2
Stephen Hemminger [Wed, 6 Dec 2017 01:30:22 +0000 (17:30 -0800)]
update headers from 4.15-rc2

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoman: tc-csum.8: Fix inconsistency in example description
Phil Sutter [Wed, 29 Nov 2017 17:34:09 +0000 (18:34 +0100)]
man: tc-csum.8: Fix inconsistency in example description

Commit 6bbe5e6290db5 ("man: tc-csum.8: Fix example") changed both source
and destination IP addresses in example code but failed to update the
example's description accordingly.

Fixes: 6bbe5e6290db5 ("man: tc-csum.8: Fix example")
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoupdate bpf header from net-next
Stephen Hemminger [Wed, 29 Nov 2017 02:16:51 +0000 (18:16 -0800)]
update bpf header from net-next

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoMerge branch 'master' into net-next
Stephen Hemminger [Tue, 28 Nov 2017 17:53:28 +0000 (09:53 -0800)]
Merge branch 'master' into net-next

6 years agoman: add -json option to tc manpage
Jiri Pirko [Mon, 27 Nov 2017 08:09:04 +0000 (09:09 +0100)]
man: add -json option to tc manpage

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agovxlan: Make id optional when modifying a link
Robert Shearman [Tue, 28 Nov 2017 11:16:50 +0000 (11:16 +0000)]
vxlan: Make id optional when modifying a link

Specifying the IFLA_VXLAN_ID attribute on a vxlan link modify is
optional in the kernel, so make the id argument optional for "ip link
set ..." so that a user need not specify it when changing another
attribute.

Signed-off-by: Robert Shearman <rs823p@att.com>
6 years agogre: Fix ttl inherit option
Robert Shearman [Tue, 28 Nov 2017 11:16:21 +0000 (11:16 +0000)]
gre: Fix ttl inherit option

Specifying "... ttl inherit" currently does nothing on a GRE link
modify since the previous ttl value is retrieved up front. Fix this by
explicitly setting ttl to 0 when "inherit" is specified for the
option, since 0 represents the semantics of inherit.
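
The intended behaviour can be illustrated with a small stand-alone
parser. This is a hedged sketch only: parse_ttl() is a hypothetical
helper, not the code in the GRE link handling, and it assumes the
caller has already loaded the previously configured ttl into *ttl.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 0 and updates *ttl on success,
 * -1 on invalid input (leaving *ttl untouched). */
static int parse_ttl(const char *arg, uint8_t *ttl)
{
	char *end;
	unsigned long val;

	if (strcmp(arg, "inherit") == 0) {
		*ttl = 0;	/* 0 means inherit; overwrite the old value */
		return 0;
	}

	val = strtoul(arg, &end, 0);
	if (*arg == '\0' || *end != '\0' || val > 255)
		return -1;

	*ttl = (uint8_t)val;
	return 0;
}
```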

Signed-off-by: Robert Shearman <rs823p@att.com>
6 years agolink_gre6: Detect invalid encaplimit values
Phil Sutter [Tue, 28 Nov 2017 15:49:58 +0000 (16:49 +0100)]
link_gre6: Detect invalid encaplimit values

Looks like a typo: get_u8() returns 0 on success and -1 on error, so the
error checking here was ineffective.
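
With that return convention, an effective check treats any non-zero
result as failure. The sketch below uses a stand-in parser, parse_u8(),
that follows the same convention; it is not the iproute2 get_u8()
implementation, and parse_encaplimit() is a hypothetical wrapper.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in with the return convention described above:
 * 0 on success, -1 on error. */
static int parse_u8(uint8_t *val, const char *arg, int base)
{
	char *end;
	unsigned long res;

	if (!arg || !*arg)
		return -1;

	errno = 0;
	res = strtoul(arg, &end, base);
	if (errno || *end != '\0' || res > 0xFF)
		return -1;

	*val = (uint8_t)res;
	return 0;
}

/* Returns 0 and stores the limit on success, -1 on invalid input. */
static int parse_encaplimit(const char *arg, uint8_t *limit)
{
	/* Any non-zero return means failure, so test the result directly. */
	if (parse_u8(limit, arg, 0))
		return -1;
	return 0;
}
```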

Fixes: a11b7b71a6eba ("link_gre6: really support encaplimit option")
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agom_mirred: style cleanups
Stephen Hemminger [Sun, 26 Nov 2017 20:42:17 +0000 (12:42 -0800)]
m_mirred: style cleanups

Fix whitespace and long lines.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agom_gact: whitespace cleanup
Stephen Hemminger [Sun, 26 Nov 2017 20:38:21 +0000 (12:38 -0800)]
m_gact: whitespace cleanup

Fix whitespace errors reported by checkpatch.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agom_action: style cleanup
Stephen Hemminger [Sun, 26 Nov 2017 20:36:15 +0000 (12:36 -0800)]
m_action: style cleanup

Break long lines, and use bool where possible.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agom_vlan: style cleanups
Stephen Hemminger [Sun, 26 Nov 2017 20:28:55 +0000 (12:28 -0800)]
m_vlan: style cleanups

Break long lines and factor duplicated code into a function.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotc: jsonify vlan action
Jiri Pirko [Sat, 25 Nov 2017 14:48:35 +0000 (15:48 +0100)]
tc: jsonify vlan action

Add json output to vlan action.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: jsonify mirred action
Jiri Pirko [Sat, 25 Nov 2017 14:48:34 +0000 (15:48 +0100)]
tc: jsonify mirred action

Add json output to mirred action.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: jsonify gact action
Jiri Pirko [Sat, 25 Nov 2017 14:48:33 +0000 (15:48 +0100)]
tc: jsonify gact action

Add json output to gact action.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: jsonify actions core
Jiri Pirko [Sat, 25 Nov 2017 14:48:32 +0000 (15:48 +0100)]
tc: jsonify actions core

Add json output to actions core.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: jsonify matchall filter
Jiri Pirko [Sat, 25 Nov 2017 14:48:31 +0000 (15:48 +0100)]
tc: jsonify matchall filter

Add json output to matchall filter.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: jsonify flower filter
Jiri Pirko [Sat, 25 Nov 2017 14:48:30 +0000 (15:48 +0100)]
tc: jsonify flower filter

Add json output to flower filter.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: jsonify filter core
Jiri Pirko [Sat, 25 Nov 2017 14:48:29 +0000 (15:48 +0100)]
tc: jsonify filter core

Add json output to filter core.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: jsonify htb qdisc
Jiri Pirko [Sat, 25 Nov 2017 14:48:28 +0000 (15:48 +0100)]
tc: jsonify htb qdisc

Add json output to htb qdisc.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: jsonify fq_codel qdisc
Jiri Pirko [Sat, 25 Nov 2017 14:48:27 +0000 (15:48 +0100)]
tc: jsonify fq_codel qdisc

Add json output to fq_codel qdisc.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: jsonify stats2
Jiri Pirko [Sat, 25 Nov 2017 14:48:26 +0000 (15:48 +0100)]
tc: jsonify stats2

Add json output to stats2.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: jsonify qdisc core
Jiri Pirko [Sat, 25 Nov 2017 14:48:25 +0000 (15:48 +0100)]
tc: jsonify qdisc core

Add json output to qdisc core.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: remove action cookie len from printout
Jiri Pirko [Sat, 25 Nov 2017 10:07:57 +0000 (11:07 +0100)]
tc: remove action cookie len from printout

Make the output the same as the input and avoid printing the unnecessary len.

Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Fixes: fd8b3d2c1b9b ("actions: Add support for user cookies")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agotc: move action cookie print out of the stats if
Jiri Pirko [Sat, 25 Nov 2017 10:07:56 +0000 (11:07 +0100)]
tc: move action cookie print out of the stats if

Cookie print was made dependent on show_stats for no good reason. Fix
this by pushing cookie print out of the stats if.

Fixes: fd8b3d2c1b9b ("actions: Add support for user cookies")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
6 years agoiplink: communicate ifindex for xdp offload
Jakub Kicinski [Fri, 24 Nov 2017 02:12:08 +0000 (18:12 -0800)]
iplink: communicate ifindex for xdp offload

When xdpoffload option is used, communicate the ifindex down
to the kernel to trigger device-specific load.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>