]> git.proxmox.com Git - mirror_iproute2.git/log
mirror_iproute2.git
6 years agoutils: Introduce and use inet_prefix_reset()
Serhey Popovych [Mon, 12 Feb 2018 20:17:56 +0000 (22:17 +0200)]
utils: Introduce and use inet_prefix_reset()

Initializing @inet_prefix using C initializers or memset() seems
inefficient and unnecessary: only small part of ->data[] field will be
used to store address corresponding to ->family.

Instead initialize ->flags with zero and assume no other fields accessed
before checking corresponding bits in ->flags. For example special
helpers (e.g. is_addrtype_*()) can be used to ensure that @inet_prefix
contains valid ip or ipv6 address.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'iproute-json-color' into next
David Ahern [Sat, 10 Feb 2018 16:24:57 +0000 (08:24 -0800)]
Merge branch 'iproute-json-color' into next

Stephen Hemminger  says:

====================

From: Stephen Hemminger <stephen@networkplumber.org>

This set of patches adds JSON output to route printing.
Tested for the simple cases, but there are many variations and there
such as lw tunnels which have not be tested.

The color formatting may need some additional tweaks. It looks
like for some tags the tag is also showing up in color.
This should be fixed in print_color_string rather than having
to do special case handling in so many places.

This patchset also changes the default JSON output to be compressed
(since the purpose of JSON is to make output machine readable);
but do optional pretty print formatting with -p flag.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: implement JSON and color output
Stephen Hemminger [Thu, 8 Feb 2018 16:26:25 +0000 (08:26 -0800)]
iproute: implement JSON and color output

Add JSON and color output formatting to ip route command.
Similar to existing address and link output.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agojson: fix newline at end of array
Stephen Hemminger [Thu, 8 Feb 2018 16:26:24 +0000 (08:26 -0800)]
json: fix newline at end of array

The json print library was toggling pretty print at the end of
an array to workaround a bug in underlying json_writer.
Instead, just fix json_writer to pretty print array correctly.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoman: add documentation for json and pretty flags
Stephen Hemminger [Thu, 8 Feb 2018 16:26:23 +0000 (08:26 -0800)]
man: add documentation for json and pretty flags

Add description for -json and -pretty options.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agojson: make pretty printing optional
Stephen Hemminger [Thu, 8 Feb 2018 16:26:22 +0000 (08:26 -0800)]
json: make pretty printing optional

Since JSON is intended for programmatic consumption, it makes
sense for the default output format to be concise as possible.

For programmer and other uses, it is helpful to keep the pretty
whitespace format; therefore enable it with -p flag.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip: Use print_0xhex() where appropriate
Serhey Popovych [Thu, 8 Feb 2018 16:04:05 +0000 (18:04 +0200)]
ip: Use print_0xhex() where appropriate

In gre/gre6 for non-JSON output 0x%x format is used: use print_0xhex()
to get the same value for JSON.

Get rid of custom _print_hex() in bridge slave code: print_0xhex() can
be used perfectly.

Break long print_uint() with long argument list to fit into 80 columns.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip/tunnel: Minor cleanups
Serhey Popovych [Thu, 8 Feb 2018 10:57:02 +0000 (12:57 +0200)]
ip/tunnel: Minor cleanups

Few minor changes to reduce diffs between ip and ipv6 tunnel code:

  1) reduce intendation by one level when adding attributes in gre and
     gre6; reorder addattr*() calls to simplify diff

  2) reorder local variables definition; change their type (e.g. for
     IFLA_LINK) to match ones returned by rta_getattr_*()

  3) move "mode" parameter parsing in link_iptnl.c to the similar
     position as in link_ip6tnl.c

  4) handle "tc" as shortcut for "tclass"/"tos" in link_iptnl.c

  5) add whitespace where required

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'unify_tunnel_help' into next
David Ahern [Fri, 9 Feb 2018 16:04:39 +0000 (08:04 -0800)]
Merge branch 'unify_tunnel_help' into next

Serhey Popovych  says:

====================

To show only relevant diffs of ip and ipv6 variants help message print
routines needs to be unified and improved.

Get rid of print_usage() and usage() wrappers: use single function to
output help message. As side effect we return -1 from parse function
instead of calling exit(2) in case of "... tunnel <help|garbage>" is
found.

Additionally we get pointer to @struct link_util and can directly access
->id information to prepare customized help message.

Split calls to fprintf() two group: one that contains format string with
specifiers (thus requiring parameters) and another one that does not.
This helps compiler to optimize calls to fprintf() with fputs() when no
format specifiers in string. Do not use fputs() directly to keep code
formatting nice.

After this series applied following diffs:

  # diff -urN ip/link_gre{,6}.c
  # diff -urN ip/link_vti{,6}.c
  # diff -urN ip/link_ip{,6}tnl.c

in scope of help print routines reduced to necessary minimum.

Tested minimally by compiling and executing "ip link help <kind>" and
"ip link add type help" commands. Looks correct.

See individual patch description for more information.

Reviews, commands and suggestions are welcome.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiptnl/ip6tnl: Unify iptunnel_print_help()
Serhey Popovych [Fri, 9 Feb 2018 06:58:42 +0000 (08:58 +0200)]
iptnl/ip6tnl: Unify iptunnel_print_help()

Reduce diff lines between iptnl and ip6tnl help printing code.

Use @struct link_util ->id field to print correct link help: all callers
now pass this data structure to iptunnel_print_help().

Get rid of custom print_usage() and usage() functions and use
iptunnel_print_help() directly, return from function on "... type
<help|garbage>" instead of exit(2).

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agogre/gre6: Unify gre_print_help()
Serhey Popovych [Fri, 9 Feb 2018 06:58:41 +0000 (08:58 +0200)]
gre/gre6: Unify gre_print_help()

Reduce diff lines between gre and gre6 help printing code.

Use @struct link_util ->id field to print correct link help: all callers
now pass this data structure to gre_print_help().

Get rid of custom print_usage() and usage() functions and use
gre_print_help() directly, return from function on "... type
<help|garbage>" instead of exit(2).

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agovti/vti6: Unify vti_print_help()
Serhey Popovych [Fri, 9 Feb 2018 06:58:40 +0000 (08:58 +0200)]
vti/vti6: Unify vti_print_help()

Reduce diff lines between vti and vti6 help printing code.

Use @struct link_util ->id field to print correct link help: all callers
now pass this data structure to vti_print_help().

Get rid of custom print_usage() and usage() functions and use
vti_print_help() directly, return from function on "... type
<help|garbage>" instead of exit(2).

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: make flush a separate function
Stephen Hemminger [Thu, 8 Feb 2018 00:25:06 +0000 (16:25 -0800)]
iproute: make flush a separate function

Minor refactoring to move flush into separate function to improve
readability and reduce depth of nesting.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: don't do assignment in condition
Stephen Hemminger [Thu, 8 Feb 2018 00:25:05 +0000 (16:25 -0800)]
iproute: don't do assignment in condition

Fix checkpatch complaints about assignment in conditions.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: whitespace fixes
Stephen Hemminger [Thu, 8 Feb 2018 00:25:04 +0000 (16:25 -0800)]
iproute: whitespace fixes

Add whitespace around operators for consistency.
Use tabs for indentation.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'dev_walk' into iproute2-next
David Ahern [Thu, 8 Feb 2018 00:19:12 +0000 (16:19 -0800)]
Merge branch 'dev_walk' into iproute2-next

Serhey Popovych  says:

====================

In this seris I replace /proc/net/dev and /sys/class/net usage for walk
through network device list in iptunnel/ip6tunnel and iptuntap with
netlink dump.

Following changed since RFC was sent:

  1) Treat @struct rtnl_link_stats and @struct rtnl_link_stats64 as
     array with __u32 and __u64 elements respectively in
     copy_rtnl_link_stats64() as suggested by Stephen Hemminger.

  2) Remove @name and @size parameters from @struct tnl_print_nlmsg_info
     since we can get them easily from other data.

Testing.
========

Following script is used to ensure I didn't broke things too much:

\#!/bin/bash

iproute2_dir="$1"
iface='gre1'

pushd "$iproute2_dir" &>/dev/null

for i in new old; do
DIR="/tmp/$i"
mkdir -p "$DIR"

ln -snf ip.$i ip/ip

for o in '' -s -d; do
ip/ip $o tunnel show           >"$DIR/ip${o}-tunnel-show"
ip/ip -4 $o tunnel show        >"$DIR/ip-4${o}-tunnel-show"
ip/ip -6 $o tunnel show        >"$DIR/ip-6${o}-tunnel-show"
ip/ip $o tunnel show dev "$iface" \
>"$DIR/ip${o}-tunnel-show-$iface"
ip/ip $o tuntap show           >"$DIR/ip${o}-tuntap-show"
done
done
rm -f ip/ip

diff -urN /tmp/{old,new} |sed -n -Ee'/^(-{3}|\+{3})[[:space:]]+/!p'
rc=$?

popd &>/dev/null
exit $rc

Results:
========

...
fopen /sys/class/net/ipip1/tun_flags: No such file or directory
fopen /sys/class/net/ipip2/tun_flags: No such file or directory
fopen /sys/class/net/gre10/tun_flags: No such file or directory
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note that this comes from ip.old
...
diff -urN /tmp/old/ip-d-tuntap-show /tmp/new/ip-d-tuntap-show
@@ -1,4 +1,4 @@
-tun1: tap user 1004 group 27
- Attached to processes:
 tun0: tun user 1000 group 27
  Attached to processes:
+tun1: tap user 1004 group 27
+ Attached to processes:
diff -urN /tmp/old/ip-s-tuntap-show /tmp/new/ip-s-tuntap-show
@@ -1,2 +1,2 @@
-tun1: tap user 1004 group 27
 tun0: tun user 1000 group 27
+tun1: tap user 1004 group 27
diff -urN /tmp/old/ip-tuntap-show /tmp/new/ip-tuntap-show
@@ -1,2 +1,2 @@
-tun1: tap user 1004 group 27
 tun0: tun user 1000 group 27
+tun1: tap user 1004 group 27

So basically only print order for ip tuntap get changes. Rest is intact.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotuntap: Use netlink to walk through tuntap list
Serhey Popovych [Wed, 7 Feb 2018 06:30:56 +0000 (08:30 +0200)]
tuntap: Use netlink to walk through tuntap list

It seems bad idea to depend on sysfs being mounted and reflected to the
current network namespace. Same applies to procfs.

Instead netlink should be used to talk to the kernel and get list of
specific network devices among with their parameters.

Support for kernel netlink message filtering by passing IFLA_INFO_KIND
in RTM_GETLINK request: if kernel does not support filtering by the kind
we will check it in reply anyway. Check for ifi->ifi_type to be either
ARPHRD_NONE or ARPHRD_ETHER to seed up things a bit without kernel level
filtering.

Unfortunately tun driver does not implement dumping it's configuration
via netlink and we still need to use read_prop() which depends on sysfs
to get additional tun device information.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiptunnel/ip6tunnel: Use netlink to walk through tunnels list
Serhey Popovych [Wed, 7 Feb 2018 06:30:55 +0000 (08:30 +0200)]
iptunnel/ip6tunnel: Use netlink to walk through tunnels list

Both tunnels use legacy /proc/net/dev interface to get tunnel device and
it's statistics. This may cause problems for cases when procfs either
not mounted or not unshare(2)d for given network namespace.

Use netlink to walk through list of tunnel devices which is network
namespace aware and provides additional information such as statistics
in the dump message.

Since both address family specific variants of do_tunnels_list() nearly
the same, except for tunnel parameters structure initialization,
matching and printing we can introduce common one in tunnel.c.

To implement address family specific parts introduce new data structure
@struct tnl_print_nlmsg_info what contains all necessary information as
well as pointers to ->init(), ->match() and ->print() callbacks.

Annotate data structures by const where appropriate.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiptunnel/ip6tunnel: Code cleanups
Serhey Popovych [Wed, 7 Feb 2018 06:30:54 +0000 (08:30 +0200)]
iptunnel/ip6tunnel: Code cleanups

Use switch () instead of if () to compare tunnel type to fit into 80
columns and make code more readable. Print "\n" using fputc().

In iptunnel.c abstract tunnel parameters matching code in iptunnel.c
into ip_tunnel_parm_match() helper to conform with ip6tunnel.c. Use
memset() to initialize @p1.

In ip6tunnel.c no need to call ll_name_to_index() with name twice: just
use found previously index. Do not initialize @p1: this is done in
ip6_tnl_parm_init().

This is to show real differences between ip and ipv6 do_tunnels_list()
implementations and prepare for upcoming unification of them.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotunnel: Split statistic getting and printing
Serhey Popovych [Wed, 7 Feb 2018 06:30:53 +0000 (08:30 +0200)]
tunnel: Split statistic getting and printing

This is first step to move tunnel code to use rtnl dump interface
instead of /proc/net/dev read.

Make tnl_print_stats() to accept @struct rtnl_link_stats64 parameter,
introduce tnl_get_stats() that will parse line from /proc/net/dev into
@struct rtnl_link_stats64.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip: Introduce get_rtnl_link_stats_rta() to get link statistics
Serhey Popovych [Wed, 7 Feb 2018 06:30:52 +0000 (08:30 +0200)]
ip: Introduce get_rtnl_link_stats_rta() to get link statistics

Assume all statistics in ip(8) represented either by IFLA_STATS64 or
IFLA_STATS is 64 bit. It is clean that we can store __u32 counters of
@struct rtnl_link_stats in __u64 counters in @struct rtnl_link_stats64.

New get_rtnl_link_stats_rta() follows __print_link_stats() behaviour on
handling of stats attribute: copy no more than size of data structure
and no less than attribute length zeroing rest.

Drop print_link_stats32() as it's functionality can be handled by 64bit
variant. Move code from __print_link_stats() to print_link_stats64() and
finally rename print_link_stats64() to __print_link_stats().

More users of introduced function will come in future.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoipaddress: Unify print_link_stats() and print_link_stats64()
Serhey Popovych [Wed, 7 Feb 2018 06:30:51 +0000 (08:30 +0200)]
ipaddress: Unify print_link_stats() and print_link_stats64()

To show real differences between these two variants adjust whitespace
intendation and use print_uint() instead of print_int() as all members
in both @struct rtnl_link_stats and @struct rtnl_link_stats64 are
unsigned.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'route_print_refactor' into iproute2-next
David Ahern [Thu, 8 Feb 2018 00:12:52 +0000 (16:12 -0800)]
Merge branch 'route_print_refactor' into iproute2-next

Stephen Hemminger  says:

====================

This patch set breaks up the big print_route function into
smaller pieces for readability and to make later changes
to support JSON and color output easier.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: refactor printing of interface
Stephen Hemminger [Wed, 7 Feb 2018 17:10:17 +0000 (09:10 -0800)]
iproute: refactor printing of interface

For JSON and colorization, make common code a function.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: refactor multipath print
Stephen Hemminger [Wed, 7 Feb 2018 17:10:16 +0000 (09:10 -0800)]
iproute: refactor multipath print

Make printing of multipath attributes a function to improve
readability.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: refactor newdst, gateway and via printing
Stephen Hemminger [Wed, 7 Feb 2018 17:10:15 +0000 (09:10 -0800)]
iproute: refactor newdst, gateway and via printing

Since these fields are printed in both route and multipath case;
avoid duplicating code.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: refactor printing flow info
Stephen Hemminger [Wed, 7 Feb 2018 17:10:14 +0000 (09:10 -0800)]
iproute: refactor printing flow info

Use common code for printing flow info.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: refactor metrics print
Stephen Hemminger [Wed, 7 Feb 2018 17:10:13 +0000 (09:10 -0800)]
iproute: refactor metrics print

Make a separate function to improve readability and enable
easier JSON conversion.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: refactor cacheinfo printing
Stephen Hemminger [Wed, 7 Feb 2018 17:10:12 +0000 (09:10 -0800)]
iproute: refactor cacheinfo printing

Make common function for decoding cacheinfo.
This code may print more info than old version in some cases.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: make printing IPv4 cache flags a function
Stephen Hemminger [Wed, 7 Feb 2018 17:10:11 +0000 (09:10 -0800)]
iproute: make printing IPv4 cache flags a function

More refactoring prior to JSON support.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: make printing icmpv6 a function
Stephen Hemminger [Wed, 7 Feb 2018 17:10:10 +0000 (09:10 -0800)]
iproute: make printing icmpv6 a function

Refactor to reduce size of print_route and improve
readability.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiproute: refactor printing flags
Stephen Hemminger [Wed, 7 Feb 2018 17:10:09 +0000 (09:10 -0800)]
iproute: refactor printing flags

Both next hop and route need to decode flags.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotreewide: Use addattr_nest()/addattr_nest_end() to handle nested attributes
Serhey Popovych [Wed, 31 Jan 2018 08:15:08 +0000 (10:15 +0200)]
treewide: Use addattr_nest()/addattr_nest_end() to handle nested attributes

We have helper routines to support nested attribute addition into
netlink buffer: use them instead of open coding.

Use addattr_nest_compat()/addattr_nest_compat_end() where appropriate.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip: Minor cleanups
Serhey Popovych [Wed, 31 Jan 2018 08:15:07 +0000 (10:15 +0200)]
ip: Minor cleanups

  1) Rename @hdr parameter to @n to be coherent with rest of the parsing
     code.

  2) Use NLMSG_DATA() to get pointer to the data after nlmsghdr instead
     of calculating it directly in ip/tunnel code.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip: Consolidate ip, xdp and lwtunnel parse/dump prototypes in ip_common.h
Serhey Popovych [Wed, 31 Jan 2018 08:15:06 +0000 (10:15 +0200)]
ip: Consolidate ip, xdp and lwtunnel parse/dump prototypes in ip_common.h

Having iplink_parse() and @struct iplink_req in include/utils.h does not
reflect it's IP nature: move to ip/ip_common.h.

Move contents of ip/iplink_xdp.h and ip/iproute_lwtunnel.h to
ip/ip_common.h since they are small (i.e. only two function prototypes):
ip/iplink_bridge.c and ip/iplink_vrf.c prototypes already there.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoRevert "ip address: Change print_linkinfo_brief to take filter as an input"
Serhey Popovych [Thu, 1 Feb 2018 11:25:52 +0000 (13:25 +0200)]
Revert "ip address: Change print_linkinfo_brief to take filter as an input"

This reverts commit 63891c70137f200105c539c92eb73abade2c05d5.

It seems print_linkinfo_brief() never accepts filter different than
default one and David Ahern suggests to revert it instead of making
new change that actually do revert.

Conflicts:
ip/ipaddress.c
ip/iplink.c

These are caused by JSON support addition after commit we reverting.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Mon, 29 Jan 2018 16:24:57 +0000 (08:24 -0800)]
Merge branch 'iproute2-master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agov4.15.0 v4.15.0
Stephen Hemminger [Mon, 29 Jan 2018 16:08:52 +0000 (08:08 -0800)]
v4.15.0

6 years agotc: fix second printing of requeues
Jakub Kicinski [Sat, 27 Jan 2018 09:19:04 +0000 (01:19 -0800)]
tc: fix second printing of requeues

Non-JSON tc qdisc output used to print the "requeues" statistic
twice.  Commit 4fcec7f3665b ("tc: jsonify stats2") tried to preserve
this behaviour for both standard output and JSON, but used the wrong
statistic (q.qlen).  Also duplicating keys in JSON is not allowed,
so the second occurrence should be completely skipped with JSON.

Fixes: 4fcec7f3665b ("tc: jsonify stats2")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip: address: fix stats64 JSON object name
Jakub Kicinski [Fri, 26 Jan 2018 19:30:35 +0000 (11:30 -0800)]
ip: address: fix stats64 JSON object name

The JSON object name for statistics in ip link show is "stats644".
Looks like a typo, commit d0e720111aad ("ip: ipaddress.c: add support
for json output") contains an example with the expected "stats64" name.

The fact that no one has noticed until now is probably an indication
that no one is using this object.  Hopefully it's not too late to fix
this, although IIUC this has already been in 4.13 and 4.14 releases :S

Fixes: d0e720111aad ("ip: ipaddress.c: add support for json output")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotc: prio: JSON-ify prio output
Jakub Kicinski [Fri, 26 Jan 2018 19:27:57 +0000 (11:27 -0800)]
tc: prio: JSON-ify prio output

Make JSON output work with prio Qdiscs.  This will also make
other qdiscs which reuse the print_qopt work, like mqprio or
pfifo_fast.

Note that there is a double space between "priomap" and first
prio number.  Keep this original behaviour.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: red: JSON-ify RED output
Jakub Kicinski [Fri, 26 Jan 2018 19:27:56 +0000 (11:27 -0800)]
tc: red: JSON-ify RED output

Make JSON output work with RED Qdiscs.  Float/double printing
helpers have to be added/uncommented to print the probability.
Since TC stats in general are not split out to a separate object
the xstats printed by this patch are not separated either.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'get_addr_rta' into iproute2-next
David Ahern [Thu, 25 Jan 2018 17:32:27 +0000 (09:32 -0800)]
Merge branch 'get_addr_rta' into iproute2-next

Serhey Popovych  says:

====================

Now we enhance get_addr() to return additional information about address
(e.g. if it unspecified or multicast) we want to have same functionality
for attributes in netlink message.

Introduce and use get_addr_rta() that parses given netlink attribute
into @inet_prefix data structure in the same way similar get_addr()
parses address from it's string representation.

Use attribute length to guess address family: force it by giving non
AF_UNSPEC @family to get_addr_rta() to ensure address is of expected
family.

Introduce and use inet_addr_match_rta() to further simplify and unify
code where get_addr_rta() intended to be used together with
inet_addr_match().

This is next step in ipv4 and ipv6 modules unification to prepare for
merge in the future.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip/tunnel: Unify local/remote endpoint address printing
Serhey Popovych [Wed, 24 Jan 2018 18:56:40 +0000 (20:56 +0200)]
ip/tunnel: Unify local/remote endpoint address printing

Introduce and use tnl_print_endpoint() helper to print of tunnel
endpoint address.

Note that for AF_INET and AF_INET6 inet_ntop(3) is used that may return
NULL in case of failure and while unlikely format_host_rta() might
return NULL too. Handle this case when passing local/remote to
print_string().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotcp_metric: Use get_addr_rta()
Serhey Popovych [Wed, 24 Jan 2018 18:56:39 +0000 (20:56 +0200)]
tcp_metric: Use get_addr_rta()

While there remove & from inet_prefix.data when since it is array.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoipl2tp: Use get_addr_rta()
Serhey Popovych [Wed, 24 Jan 2018 18:56:38 +0000 (20:56 +0200)]
ipl2tp: Use get_addr_rta()

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoipneigh: Use inet_addr_match_rta()
Serhey Popovych [Wed, 24 Jan 2018 18:56:37 +0000 (20:56 +0200)]
ipneigh: Use inet_addr_match_rta()

While there check return from get_prefix() for filter address.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoipmroute: Use inet_addr_match_rta()
Serhey Popovych [Wed, 24 Jan 2018 18:56:36 +0000 (20:56 +0200)]
ipmroute: Use inet_addr_match_rta()

While there check return from get_prefix() for filter address.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiprule: Use inet_addr_match_rta()
Serhey Popovych [Wed, 24 Jan 2018 18:56:35 +0000 (20:56 +0200)]
iprule: Use inet_addr_match_rta()

While there check return from get_prefix() for filter address.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoipaddress: Use inet_addr_match_rta()
Serhey Popovych [Wed, 24 Jan 2018 18:56:34 +0000 (20:56 +0200)]
ipaddress: Use inet_addr_match_rta()

While there check return from get_prefix() for filter address.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoutils: Introduce get_addr_rta() and inet_addr_match_rta()
Serhey Popovych [Wed, 24 Jan 2018 18:56:33 +0000 (20:56 +0200)]
utils: Introduce get_addr_rta() and inet_addr_match_rta()

First is used to get address from netlink attribute to
inet_prefix data structure. Use memcpy() with constant
value to let complier optimize by replacing a call by
inlining load/store instructions.

Second is used to match address in given netlink attribute
with one given as reference. It matches successfully if
no attribute is given (@rta is NULL), reference address
family is AF_UNSPEC or it's length isn't given; fails if
get_attr_rta() can't get attribute or it's family does
not match reference; calls inet_addr_match() to get final
verdict.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'unify_external' into iproute2-next
David Ahern [Wed, 24 Jan 2018 18:02:27 +0000 (10:02 -0800)]
Merge branch 'unify_external' into iproute2-next

Serhey Popovych  says:

====================

With this series I want to unify collect metadata
handling in tunnels:

  1) Use "external" name for JSON and non-JSON output.

     Do not *print* any options when tunnel in
     collect metadata mode: gre6 already do
     this, so just apply to others.

  2) Do not *add* any attributes when configuring
     gre tunnel in collect metadata mode.

     Other tunnels (e.g. gre6, iptnl, ip6tnl)
     alredy do that.

This is next step in ipv4 and ipv6 modules
unification to prepare for merge in the future.

Any comments, suggestions and criticism as always
welcome.

v2
  For all tunnels implementing collect metadata
  use "external" keyword for both JSON. Thanks
  to Jiri Benc for detailed explanation.
====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agogre/gre6: Unify attribute addition to netlink buffer
Serhey Popovych [Mon, 22 Jan 2018 17:23:46 +0000 (19:23 +0200)]
gre/gre6: Unify attribute addition to netlink buffer

There are couple of minor improvements:

  1) Check erspan_ver == 2 in gre6. It still could
     be 1 if erspan_idx is 0.

  2) Add tunnel encapsulation attributes only when
     collect metadata not in effect in gre.

  3) Trivial: address checkpatch issues.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip/tunnel: Be consistent when printing tunnel collect metadata
Serhey Popovych [Mon, 22 Jan 2018 17:23:45 +0000 (19:23 +0200)]
ip/tunnel: Be consistent when printing tunnel collect metadata

Print only "external" if collect meta data attribute
is given: rest of parameters are irrelevant. This is
to follow gre6.

For both JSON and non-JSON output use "external" for
all tunnels including vxlan and geneve.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'iproute2-master' into iproute2-next
David Ahern [Wed, 24 Jan 2018 17:59:03 +0000 (09:59 -0800)]
Merge branch 'iproute2-master' into iproute2-next

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc/lexer: let quotes actually start strings
Wolfgang Bumiller [Mon, 22 Jan 2018 10:53:46 +0000 (11:53 +0100)]
tc/lexer: let quotes actually start strings

The lexer will go with the longest match, so previously
the starting double quotes of a string would be swallowed by
the [^ \t\r\n()]+ pattern leaving the user no way to
actually use strings with escape sequences.
Fix this by not allowing this case to start with double
quotes.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
6 years agoiplink: Use ll_name_to_index() instead of if_nametoindex()
Serhey Popovych [Fri, 19 Jan 2018 16:44:03 +0000 (18:44 +0200)]
iplink: Use ll_name_to_index() instead of if_nametoindex()

While benefit from using ll_name_to_index() with populated
cache can potentially be exploited only in few places
(e.g. bridge fdb/mdb/vlan show routines) there is another
advantage of ll_name_to_index() over plain if_nametoindex():

  in case of if_nametoindex() failure ll_name_to_index()
  will attempt to get index from common name in form "if%d"
  that may be returned from ll_index_to_name().

This makes output from ip(8) coherent with it's input.

Note that most of the code already switched from plain
if_nametoindex() to ll_name_to_index() to cached variant.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agovti/vti6: Minor improvements
Serhey Popovych [Fri, 19 Jan 2018 16:44:02 +0000 (18:44 +0200)]
vti/vti6: Minor improvements

In prepare of link_vti.c and link_vti6.c merge:

  1) Make @fwmark of __u32 type instead of unsigned int
     in vti to match with rest tunneling code.

  2) Report when unable to translate @link network device
     name to index instead of silently exiting in vti6.

  3) Remove newline separating local/remote attributes
     from the ikey/okey in vti6 to match vti module.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiptnl/ip6tnl: Unify ttl/hoplimit parsing routines
Serhey Popovych [Fri, 19 Jan 2018 16:44:01 +0000 (18:44 +0200)]
iptnl/ip6tnl: Unify ttl/hoplimit parsing routines

Handle "inherit" case properly for gre6 and ip6tnl.

Use get_u8() in gre to parse ttl/hoplimit.

Be consistent about "hlim" alias to ttl/hoplimit
support.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotunnel: Add space between encap-dport and encap-sport in non-JSON output
Serhey Popovych [Fri, 19 Jan 2018 16:44:00 +0000 (18:44 +0200)]
tunnel: Add space between encap-dport and encap-sport in non-JSON output

Fixes: bad76e6b1f44 ("ip/tunnel: Abstract tunnel encapsulation options printing")
Fixes: e2d4588331fc ("ip: link_gre.c: add json output support")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agogre/gre6: Post merge fixes
Serhey Popovych [Mon, 22 Jan 2018 14:50:08 +0000 (16:50 +0200)]
gre/gre6: Post merge fixes

Few minor changes after merge of 'master' into 'net-next' branch:

  1) Follow 80 line length for printing erspan_index parameter
     as we did in master with commit 2a8d0f6e9c3f ("gre/tunnel:
     Print erspan_index using print_uint()").

  2) Remove remnants of encapsulation option printing: now it
     is done using tnl_print_encap() helper in commit bad76e6b1f44
     ("ip/tunnel: Abstract tunnel encapsulation options printing").

Fixes: 8c75f69411bc ("Merge branch 'master' into net-next")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agoMerge branch 'shared_block' into net-next
David Ahern [Sun, 21 Jan 2018 19:20:56 +0000 (11:20 -0800)]
Merge branch 'shared_block' into net-next

Jiri Pirko  says:

====================

From: Jiri Pirko <jiri@mellanox.com>

Kernel allows to share all filters between qdiscs with use
of shared block.

Example:

block number 22. "22" is just an identification:
$ tc qdisc add dev ens7 ingress_block 22 ingress
                        ^^^^^^^^^^^^^^^^
$ tc qdisc add dev ens8 ingress_block 22 ingress
                        ^^^^^^^^^^^^^^^^

If we don't specify "block" command line option, no shared block would
be created:
$ tc qdisc add dev ens9 ingress

Now if we list the qdiscs, we will see the block index in the output:

$ tc qdisc
qdisc ingress ffff: dev ens7 parent ffff:fff1 ingress_block 22
qdisc ingress ffff: dev ens8 parent ffff:fff1 ingress_block 22
qdisc ingress ffff: dev ens9 parent ffff:fff1

To make is more visual, the situation looks like this:

   ens7 ingress qdisc                 ens7 ingress qdisc
          |                                  |
          |                                  |
          +---------->  block 22  <----------+

Unlimited number of qdiscs may share the same block.

Block sharing is also supported for clsact qdisc:
$ tc qdisc add dev ens10 ingress_block 23 egress_block 24 clsact
$ tc qdisc show dev ens10
qdisc clsact ffff: dev ens10 parent ffff:fff1 ingress_block 23 egress_block 24

We can add filter using the block index:

$ tc filter add block 22 protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop

Note we cannot use the qdisc for filter manipulations of shared blocks:

$ tc filter add dev ens8 ingress protocol ip pref 1 flower dst_ip 192.168.100.2 action drop
Error: This filter block is shared. Please use the block index to manipulate the filters.

We will see the same output if we list filters for ingress qdisc of
ens7 and ens8, also for the block 22:

$ tc filter show block 22
filter protocol ip pref 25 flower chain 0
filter protocol ip pref 25 flower chain 0 handle 0x1
...

$ tc filter show dev ens7 ingress
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...

$ tc filter show dev ens8 ingress
filter block 22 protocol ip pref 25 flower chain 0
filter block 22 protocol ip pref 25 flower chain 0 handle 0x1
...

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: implement ingress/egress block index attributes for qdiscs
Jiri Pirko [Sat, 20 Jan 2018 10:00:29 +0000 (11:00 +0100)]
tc: implement ingress/egress block index attributes for qdiscs

During qdisc creation it is possible to specify shared block for bot
ingress and egress. Pass this values to kernel according to the command
line options.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: introduce support for block-handle for filter operations
Jiri Pirko [Sat, 20 Jan 2018 10:00:28 +0000 (11:00 +0100)]
tc: introduce support for block-handle for filter operations

So far, qdisc was the only handle that could be used to manipulate
filters. Kernel added support for using block to manipulate it. So add
the support to use block index to manipulate filters. The magic
TCM_IFINDEX_MAGIC_BLOCK indicates the block index is in use.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: introduce tc_qdisc_block_exists helper
Jiri Pirko [Sat, 20 Jan 2018 10:00:27 +0000 (11:00 +0100)]
tc: introduce tc_qdisc_block_exists helper

This hepler used qdisc dump to list all qdisc and find if block index in
question is used by any of them. That means the block with specified
index exists.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'inet_get_addr' into net-next
David Ahern [Sun, 21 Jan 2018 18:11:07 +0000 (10:11 -0800)]
Merge branch 'inet_get_addr' into net-next

Serhey Popovych  says:

====================

It looks confusing to have multiple independent
routines to get internet address from it's string
representation: get_addr() and inet_get_addr().

Most complicated users of inet_get_addr() is
iplink_geneve.c and iplink_vxlan.c because they
required to handle both AF_INET and AF_INET6
for their local/remote endpoints.

On the other hand get_addr() does not provide
additional information like address type: need
to address this. to get rid of current and
possible future code duplications. Note that
this functionality is first step to make proto
independent handling of local/remote endpoints
in ip/tunnel code (there will be additional
series based on this one).

Also fix get_addr_1() and get_prefix() to make
sure it always provide correct ->family and
->bitlen.

As always comments, suggestions and criticism
are welcome.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoip: Get rid of inet_get_addr()
Serhey Popovych [Thu, 18 Jan 2018 18:13:47 +0000 (20:13 +0200)]
ip: Get rid of inet_get_addr()

Both geneve and vxlan modules are converted to
use get_addr() we can replace inet_get_addr()
in less problematic places and finally get
rid of inet_get_addr().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiplink_vxlan: Get rid of inet_get_addr()
Serhey Popovych [Thu, 18 Jan 2018 18:13:46 +0000 (20:13 +0200)]
iplink_vxlan: Get rid of inet_get_addr()

Now we have additional information about address
class from get_addr() we can use it in place of
inet_get_addr().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoiplink_geneve: Get rid of inet_get_addr()
Serhey Popovych [Thu, 18 Jan 2018 18:13:45 +0000 (20:13 +0200)]
iplink_geneve: Get rid of inet_get_addr()

Now we have additional information about address
class from get_addr() we can use it in place of
inet_get_addr().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoutils: Fast inet address classification after get_addr()
Serhey Popovych [Thu, 18 Jan 2018 18:13:44 +0000 (20:13 +0200)]
utils: Fast inet address classification after get_addr()

It looks very useful to receive additional information
from get_addr_1() and get_addr() about address to simplify
caller and get rid of code duplications.

For now following information can be returned:

  1) address is unspecified (zero)
  2) address is multicast
  3) address is internet: family is either AF_INET or
     AF_INET6.

More information can be added in the future.

Introduce inline helpers to make code using this new
address classification interface more self explaining:

  bool is_addrtype_inet(inet_prefix *addr)
    true if @addr is inet address

  bool is_addrtype_inet_unspec(inet_prefix *addr)
    true if @addr is unspecified inet address

  bool is_addrtype_inet_multi(inet_prefix *addr)
    true if @addr is multicast inet address

  bool is_addrtype_inet_not_unspec(inet_prefix *addr)
    true if @addr is not unspecified inet address
    false if @addr is not inet or unspecified inet

  bool is_addrtype_inet_not_multi(inet_prefix *addr)
    true if @addr is not multicast inet address
    false if @addr is not inet or multicast inet

Last two are useful for case when we need inet address
that is not unspecified or multicast.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoutils: Always specify family and ->bytelen in get_prefix_1()
Serhey Popovych [Thu, 18 Jan 2018 18:13:43 +0000 (20:13 +0200)]
utils: Always specify family and ->bytelen in get_prefix_1()

Handle default/all/any special case in get_addr_1() to setup
->family and ->bytelen correctly.

Make get_addr_1() return ->bitlen == -2 instead of -1 to
distinguish default/all/any special case from the rest:
it is safe because all callers check ->bitlen < 0, not
explicit value -1.

Reduce intendation by one level and get rid of goto/label
to make code more readable.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoutils: Always specify family for address in get_addr_1()
Serhey Popovych [Thu, 18 Jan 2018 18:13:42 +0000 (20:13 +0200)]
utils: Always specify family for address in get_addr_1()

Set ->family correctly when string representing address
is "default", "all" or "any": get_addr_1() might be called
with AF_UNSPEC (e.g. get_addr() -> get_addr_1()).

Extend support for zero address to all address families,
not only AF_INET and AF_INET6 when one explicitly given
as @family: use af_byte_len() to correctly set address length.

Still assume AF_INET when @family is AF_UNSPEC.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoMerge branch 'master' into net-next
David Ahern [Sun, 21 Jan 2018 17:37:01 +0000 (09:37 -0800)]
Merge branch 'master' into net-next

Conflicts:
ip/link_gre.c
ip/link_gre6.c

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agobpf: support map offload
Jakub Kicinski [Wed, 17 Jan 2018 07:50:54 +0000 (23:50 -0800)]
bpf: support map offload

When program is loaded with a specified ifindex, use that
ifindex also when creating maps.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: red: allow setting th_min and th_max to the same value
Jakub Kicinski [Tue, 16 Jan 2018 23:08:50 +0000 (15:08 -0800)]
tc: red: allow setting th_min and th_max to the same value

Setting th_min and th_max to the same value may be useful for DCTCP
deployments.  The original DCTCP paper describes it as a simplest way
of achieving simple ECN threshold marking.  Indeed, there doesn't seem
to be any simpler qdisc in Linux which would allow such a setup today.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agoUpdate kernel headers to 4.15-rc8
David Ahern [Fri, 19 Jan 2018 20:33:41 +0000 (12:33 -0800)]
Update kernel headers to 4.15-rc8

Update kernel headers to commit 30c3e9d47035
("l2tp: remove switch block in l2tp_nl_cmd_session_create()")

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotunnel: Return constant string without copying it
Serhey Popovych [Thu, 18 Jan 2018 14:04:36 +0000 (16:04 +0200)]
tunnel: Return constant string without copying it

We return constant string from tnl_strproto(), no need
to copy it to temporary buffer and then return such
buffer as const: return constant string instead.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agovti6/tunnel: Unify and simplify link type help functions
Serhey Popovych [Thu, 18 Jan 2018 14:04:35 +0000 (16:04 +0200)]
vti6/tunnel: Unify and simplify link type help functions

Both of these two changes are missing for link_vti6.c:

  commit 8b47135474cd ("ip: link: Unify link type help functions a bit")
  commit 561e650eff67 ("ip link: Shortify printing the usage of link type")

Replay them on link_vti6.c to bring link type help functions
inline with other tunneling code.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agovti/tunnel: Unify ikey/okey printing
Serhey Popovych [Thu, 18 Jan 2018 14:04:34 +0000 (16:04 +0200)]
vti/tunnel: Unify ikey/okey printing

For vti6 tunnel we print [io]key in dotted-quad notation
(ipv4 address) while in vti we do that in hex format.

For vti tunnel we print [io]key only if value is not
zero while for vti6 we miss such check.

Unify vti and vti6 tunnel [io]key output.

While here enlarge s2 buffer to the same size as in rest
of tunnel support code (64 bytes) and check return from
inet_ntop().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agogre/tunnel: Print erspan_index using print_uint()
Serhey Popovych [Thu, 18 Jan 2018 14:04:33 +0000 (16:04 +0200)]
gre/tunnel: Print erspan_index using print_uint()

One is missing in JSON output because fprintf()
is used instead of print_uint().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip/tunnel: Abstract tunnel encapsulation options printing
Serhey Popovych [Thu, 18 Jan 2018 14:04:32 +0000 (16:04 +0200)]
ip/tunnel: Abstract tunnel encapsulation options printing

Get rid of code duplications and consolidate encapsulation
options printing in single function - tnl_print_encap().

Introduce and use tnl_encap_str() to format encapsulation
option string according to tempate and given values to avoid
code duplication and simplify it.

Use print_string() instead of fputs() and fprintf() to
print encapsulation for !is_json_context().

Print "unknown" parameter for "encap" type in PRINT_FP
context using "%s " format specifier and benefit from
complite time string merge.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip/tunnel: Use print_0xhex() instead of print_string()
Serhey Popovych [Thu, 18 Jan 2018 14:04:31 +0000 (16:04 +0200)]
ip/tunnel: Use print_0xhex() instead of print_string()

No need for custom SPRINT_BUF() and snprintf() 0x%x
value to this buffer: we can use print_0xhex() instead
of print_string().

In link_iptnl.c use s2 instead of s1 buffer and remove
s1.

While there adjust fwmark option print order in iptnl
and ip6tnl to get it match each other.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip/tunnel: Simplify and unify tos printing
Serhey Popovych [Thu, 18 Jan 2018 14:04:30 +0000 (16:04 +0200)]
ip/tunnel: Simplify and unify tos printing

For ip tunnels tos can be 0 when not configured, 1 when
inherited from encapsulated packet and rest specifying
diffserv (rfc2474) or tos (rfc1349) bits. It is stored
in packet tos/diffserv field and returned in tos
netlink attribute to userspace.

Simplify and unify tos printing by using print_0xhex()
and print_string() instead of fprintf() to output values.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoip/tunnel: Correct and unify ttl/hoplimit printing
Serhey Popovych [Thu, 18 Jan 2018 14:04:29 +0000 (16:04 +0200)]
ip/tunnel: Correct and unify ttl/hoplimit printing

Both ttl/hoplimit is from 1 to 255. Zero has special meaning:
use encapsulated packet value. In ip-link(8) -d output this
looks like "ttl/hoplimit inherit". In JSON we have "int" type
for ttl and therefore values from 0 (inherit) to 255.

To do the best in handling ttl/hoplimit we need to accept
both cases: missing attribute in netlink dump and zero value
for "inherit"ed case. Last one is broken since JSON output
introduction for gre/iptnl versions and was never true for
gre6/ip6tnl.

For all tunnels, except ip6tnl change JSON type from "int" to
"uint" to reflect true nature of the ttl.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiplink: Use ll_index_to_name() instead of if_indextoname()
Serhey Popovych [Thu, 18 Jan 2018 14:04:28 +0000 (16:04 +0200)]
iplink: Use ll_index_to_name() instead of if_indextoname()

There are two reasons for switching to cached variant:

  1) ll_index_to_name() may return result from cache,
     eliminating expensive ioctl() to the kernel.

     Note that most of the code already switched from plain
     if_indextoname() to ll_index_to_name() to cached variant
     in print path because in most cases cache populated.

  2) It always return name in the form "if%d", even if
     entry is not in cache and ioctl() fails. This drops
     "link_index" from JSON output.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodevlink: Ignore unknown attributes
Arkadi Sharshevsky [Wed, 17 Jan 2018 13:28:00 +0000 (15:28 +0200)]
devlink: Ignore unknown attributes

In case of extending the UAPI old packages would break.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoiplink: Fix "alias" parameter length calculations
Serhey Popovych [Thu, 18 Jan 2018 14:24:40 +0000 (16:24 +0200)]
iplink: Fix "alias" parameter length calculations

We need NEXT_ARG() to get *argv pointing to "alias"
parameter value. Overwise we get and check "alias"
string length.

Fixes: f88becf35e08 ("iplink: Process "alias" parameter correctly")
Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoman: Document the meaning of zero in min/max_tx_rate parameters
Gal Pressman [Tue, 16 Jan 2018 13:42:00 +0000 (15:42 +0200)]
man: Document the meaning of zero in min/max_tx_rate parameters

Zero value in min/max_tx_rate has a special meaning of no rate limit,
document it.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
6 years agoipaddress: Make sure VF min/max rate API is supported before using it
Gal Pressman [Tue, 16 Jan 2018 13:41:59 +0000 (15:41 +0200)]
ipaddress: Make sure VF min/max rate API is supported before using it

When using the new minimum rate API and providing only one parameter
(minimum rate/maximum rate), we query the VF min and max rate regardless
of kernel support.
This resulted in segmentation fault in ipaddr_loop_each_vf, which tries
to access NULL pointer.

This patch identifies such cases by testing the VF table for NULL
pointer in IFLA_VF_RATE, and aborts the operation.
Aborting on the first VF is valid since if the kernel does not support
the new API for the first VF, it will not support it for the other VFs
as well.

Fixes: f89a2a05ffa9 ("Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool")
Cc: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
6 years agoiplink: Validate minimum tx rate is less than maximum tx rate
Gal Pressman [Tue, 16 Jan 2018 13:41:58 +0000 (15:41 +0200)]
iplink: Validate minimum tx rate is less than maximum tx rate

According to the documentation (man ip-link), the minimum TXRATE should
be always <= Maximum TXRATE, but commit f89a2a05ffa9 ("Add support to
configure SR-IOV VF minimum and maximum Tx rate through ip tool") didn't
enforce it.

Fixes: f89a2a05ffa9 ("Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool")
Cc: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
6 years agoipaddress: Use family_name() for better code reuse
Serhey Popovych [Fri, 12 Jan 2018 18:03:39 +0000 (20:03 +0200)]
ipaddress: Use family_name() for better code reuse

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
6 years agotc: Optimize gact action lookup
Phil Sutter [Fri, 12 Jan 2018 11:20:21 +0000 (12:20 +0100)]
tc: Optimize gact action lookup

When adding a filter with a gact action such as 'drop', tc first tries
to open a shared object with equivalent name (m_drop.so in this case)
before trying gact. Avoid this by matching the action name against those
handled by gact prior to calling get_action_kind().

Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
6 years agoMerge branch 'tc-batch' into net-next
David Ahern [Mon, 15 Jan 2018 16:25:30 +0000 (08:25 -0800)]
Merge branch 'tc-batch' into net-next

Chris Mi says:

====================
Currently in tc batch mode, only one command is read from the batch
file and sent to kernel to process. With this patchset, at most 128
commands can be accumulated before sending to kernel.

We introduced a new function in patch 1 to support for sending
multiple messages. In patch 2, we add this support for filter
add/delete/change/replace and actions add/change/replace commands.

But please note that kernel still processes the requests one by one.
To process the requests in parallel in kernel is another effort.
The time we're saving in this patchset is the user mode and kernel mode
context switch. So this patchset works on top of the current kernel.

Using the following script in kernel, we can generate 1,000,000 rules.
tools/testing/selftests/tc-testing/tdc_batch.py

Without this patchset, 'tc -b $file' exection time is:

real    0m15.555s
user    0m7.211s
sys     0m8.284s

With this patchset, 'tc -b $file' exection time is:

real    0m12.360s
user    0m6.082s
sys     0m6.213s

The insertion rate is improved more than 10%.
====================

Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotc: Add batchsize feature for filter and actions
Chris Mi [Fri, 12 Jan 2018 05:13:16 +0000 (14:13 +0900)]
tc: Add batchsize feature for filter and actions

Currently in tc batch mode, only one command is read from the batch
file and sent to kernel to process. With this support, at most 128
commands can be accumulated before sending to kernel.

Now it only works for the following successive commands:
1. filter add/delete/change/replace
2. actions add/change/replace

Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agolib/libnetlink: Add a new function rtnl_talk_iov
Chris Mi [Fri, 12 Jan 2018 05:13:15 +0000 (14:13 +0900)]
lib/libnetlink: Add a new function rtnl_talk_iov

rtnl_talk can only send a single message to kernel. Add a new function
rtnl_talk_iov that can send multiple messages to kernel.
rtnl_talk_iov takes struct iovec * and iovlen as arguments.

Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
6 years agotests: make sure rand_dev suffix has 6 chars
Christian Ehrhardt [Wed, 10 Jan 2018 15:11:37 +0000 (16:11 +0100)]
tests: make sure rand_dev suffix has 6 chars

The change to limit the read size from /dev/urandom is a tradeoff.
Reading too much can trigger an issue, but so it could if the
suggested 250 random chars would not contain enough [:alpha:] chars.
If they contain:
 a) >=6 all is ok
 b) [1-5] the devname would be shorter than expected (non fatal).
 c) 0 things would break

In loops of hundreds of thousands it always was (a) for my, but since
if occuring in a test it might be hard to track what happened avoid
this issue by retrying on the length condition.

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agotests: read limited amount from /dev/urandom
Christian Ehrhardt [Wed, 10 Jan 2018 15:11:36 +0000 (16:11 +0100)]
tests: read limited amount from /dev/urandom

In some test environments like e.g. Ubuntu & Debian autopkgtest it
can happen that while generating random device names the pipes
between tr and head are considered dead while processing.
That prints (non fatal) issues like:
  Running ip/link/new_link.t [iproute2-this/4.13.0-17-generic]: tr:
write error: Broken pipe
  tr: write error
  PASS

This only happens if reading an infinite amount of chars with the
read from urandom, so reading a defined amount fixes the issue.

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoifcfg/rtpr: convert to POSIX shell
Mike Frysinger [Tue, 9 Jan 2018 23:19:59 +0000 (18:19 -0500)]
ifcfg/rtpr: convert to POSIX shell

These files are already mostly written in POSIX shell, so convert their
shebangs to /bin/sh and tweak the few bashisms in here.

URL: https://crbug.com/756559
Reported-by: Pat Erley <perley@chromium.org>
Signed-off-by: Mike Frysinger <vapier@chromium.org>
6 years agomark shell scripts +x
Mike Frysinger [Tue, 9 Jan 2018 23:20:03 +0000 (18:20 -0500)]
mark shell scripts +x

This makes it easier to execute locally for testing.

Signed-off-by: Mike Frysinger <vapier@chromium.org>
6 years agotc: remove no longer relevant README
Stephen Hemminger [Wed, 10 Jan 2018 16:21:22 +0000 (08:21 -0800)]
tc: remove no longer relevant README

This document described how kernel and tc used to handle
timing. In last two years, kernel has switched over to using
ktime. Nothing to see here, move along.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>