mirror_iproute2.git
Davide Caratti [Mon, 7 Oct 2019 10:16:44 +0000 (12:16 +0200)]
ss: allow dumping kTLS info

Now that INET_DIAG_INFO requests can dump TCP ULP information, extend 'ss'
to allow diagnosing kTLS when it is attached to a TCP socket. While at it,
import kTLS uAPI definitions from the latest net-next tree.

CC: Andrea Claudi <aclaudi@redhat.com>
Co-developed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Tue, 15 Oct 2019 02:54:12 +0000 (19:54 -0700)]
Update kernel headers and import tls.h

Update kernel headers to commit:
    85a83a8fca7f ("Merge branch 'PTP-driver-refactoring-for-SJA1105-DSA'")

and add tls.h.

Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Mon, 7 Oct 2019 22:02:36 +0000 (22:02 +0000)]
Merge branch 'master' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
Jiri Pirko [Thu, 3 Oct 2019 09:51:15 +0000 (11:51 +0200)]
devlink: extend reload command to add support for network namespace change

Extend the existing devlink reload command with a "netns" option, which
lets the user instruct the kernel to reload the devlink instance into the
specified network namespace.

Example:

$ ip netns add testns1
$ devlink dev reload netdevsim/netdevsim10 netns testns1

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Jiri Pirko [Thu, 3 Oct 2019 09:51:14 +0000 (11:51 +0200)]
devlink: introduce cmdline option to switch to a different namespace

Similar to the ip tool, add an option to devlink to operate under a given
network namespace. Unfortunately, "-n" is already taken, so use "-N"
instead.

Example:

$ devlink -N testns1 dev show

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Leon Romanovsky [Wed, 2 Oct 2019 13:49:34 +0000 (16:49 +0300)]
rdma: Relax requirement to have PID for HW objects

RDMA has a weak connection between PIDs and HW objects, because
the latter are tied to file descriptors for their lifetime management.

As a result of this connection, in the following scenario
the returned PID will be 0 (not valid):
 1. Create FD and context
 2. Share it with ephemeral child
 3. Create any object and exit that child

This flow was revealed in a testing environment, and of course real users
do not run such a scenario, because it makes no sense at all in the RDMA
world.

Let's make two changes in the code to support such a workflow anyway:
 1. Remove the need to provide a PID/kernel name. The code already supports
    it; only the extra validation has to be removed.
 2. Bail out in case the PID is 0.

Link: https://lore.kernel.org/linux-rdma/20191002123245.18153-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
David Ahern [Mon, 7 Oct 2019 20:43:13 +0000 (20:43 +0000)]
Update kernel headers

Update kernel headers to commit:
    940f13821528 ("Merge branch 'dpaa2-eth-misc-cleanup'")

Signed-off-by: David Ahern <dsahern@kernel.org>
Stephen Hemminger [Tue, 1 Oct 2019 15:55:01 +0000 (08:55 -0700)]
uapi: update btf from 5.4-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Roopa Prabhu [Tue, 1 Oct 2019 04:52:23 +0000 (21:52 -0700)]
ipneigh: neigh get support

This patch adds support for looking up a neigh entry with RTM_GETNEIGH,
using recently added support in the kernel.

example:
$ ip neigh get 10.0.2.4 dev test-dummy0
10.0.2.4 dev test-dummy0 lladdr de:ad:be:ef:13:37 PERMANENT

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Roopa Prabhu [Tue, 1 Oct 2019 04:52:22 +0000 (21:52 -0700)]
bridge: fdb get support

This patch adds support for looking up a bridge fdb entry with RTM_GETNEIGH
(and the AF_BRIDGE family), using recently added support in the kernel.

example:
$ bridge fdb get 02:02:00:00:00:03 dev test-dummy0 vlan 1002
02:02:00:00:00:03 dev test-dummy0 vlan 1002 master bridge

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Julien Fortin [Thu, 26 Sep 2019 15:29:34 +0000 (17:29 +0200)]
ip: fix ip route show json output for multipath nexthops

print_rta_multipath doesn't support JSON output:

{
    "dst":"27.0.0.13",
    "protocol":"bgp",
    "metric":20,
    "flags":[],
    "gateway":"169.254.0.1"dev uplink-1 weight 1 ,
    "flags":["onlink"],
    "gateway":"169.254.0.1"dev uplink-2 weight 1 ,
    "flags":["onlink"]
},

Since RTA_MULTIPATH has nested objects, we should print them
in a JSON array.

With the patch we have the following output:

{
    "flags": [],
    "dst": "36.0.0.13",
    "protocol": "bgp",
    "metric": 20,
    "nexthops": [
        {
            "weight": 1,
            "flags": [
                "onlink"
            ],
            "gateway": "169.254.0.1",
            "dev": "uplink-1"
        },
        {
            "weight": 1,
            "flags": [
                "onlink"
            ],
            "gateway": "169.254.0.1",
            "dev": "uplink-2"
        }
    ]
}

Fixes: 663c3cb23103f4 ("iproute: implement JSON and color output")
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Thomas Haller [Wed, 25 Sep 2019 10:24:03 +0000 (12:24 +0200)]
man: add note to ip-macsec manual about necessary key management

The man page of ip-macsec and the existence of the tool make it seem like
the user could just configure static keys once, and be done with it. That is
not the case. Some form of key management must be done in user space.

Add a note about that.

Signed-off-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
David Ahern [Wed, 18 Sep 2019 16:42:21 +0000 (09:42 -0700)]
ip vrf: Add json support for show command

Add json support to 'ip vrf sh':
$ ip -j -p vrf ls
[ {
        "name": "mgmt",
        "table": 1001
    } ]

Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Wed, 25 Sep 2019 02:34:34 +0000 (19:34 -0700)]
Merge branch 'master' into next

Signed-off-by: David Ahern <dsahern@gmail.com>
Stephen Hemminger [Tue, 24 Sep 2019 19:38:57 +0000 (12:38 -0700)]
uapi: update headers from 5.4-rc

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Tue, 24 Sep 2019 19:37:33 +0000 (12:37 -0700)]
Merge ../iproute2-next

Stephen Hemminger [Tue, 24 Sep 2019 19:32:05 +0000 (12:32 -0700)]
v5.3.0

Joe Stringer [Fri, 20 Sep 2019 02:04:47 +0000 (19:04 -0700)]
bpf: Fix race condition with map pinning

If two processes attempt to invoke bpf_map_attach() at the same time,
then they will both create maps, then the first will successfully pin
the map to the filesystem and the second will not pin the map, but will
continue operating with a reference to its own copy of the map. As a
result, the sharing of the same map will be broken from the two programs
that were concurrently loaded via loaders using this library.

Fix this by adding a retry in the case where the pinning fails because
the map already exists on the filesystem. In that case, re-attempt
opening a fd to the map on the filesystem as it shows that another
program already created and pinned a map at that location.

Signed-off-by: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
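
The retry described in the commit above can be sketched as follows. This is a minimal illustration and not the iproute2 code: `struct pin_ops`, `pin_or_reuse()` and the `fake_*` stand-ins are hypothetical names, and closing the caller's private copy after losing the race is an assumption.

```c
#include <errno.h>

/* Hypothetical callbacks standing in for the real pin/open operations. */
struct pin_ops {
    int (*pin)(int fd, const char *path);   /* 0 on success, -1 and errno on failure */
    int (*open_pinned)(const char *path);   /* fd >= 0 on success, -1 on failure */
    void (*close_fd)(int fd);
};

/* Pin a freshly created map at path. If another process won the race and
 * the path already exists, use the pinned map instead of our own copy.
 * Returns the fd the caller should keep using, or -1 on error. */
static int pin_or_reuse(const struct pin_ops *ops, int new_fd, const char *path)
{
    int fd;

    if (ops->pin(new_fd, path) == 0)
        return new_fd;              /* we pinned first: keep our map */
    if (errno != EEXIST)
        return -1;                  /* unrelated failure */

    fd = ops->open_pinned(path);    /* someone else pinned: share theirs */
    if (fd < 0)
        return -1;
    ops->close_fd(new_fd);          /* assumption: drop our private copy */
    return fd;
}

/* In-memory stand-ins, used only to exercise the logic above. */
static int fake_pinned_fd = -1;

static int fake_pin(int fd, const char *path)
{
    (void)path;
    if (fake_pinned_fd >= 0) {
        errno = EEXIST;             /* the path is already taken */
        return -1;
    }
    fake_pinned_fd = fd;
    return 0;
}

static int fake_open(const char *path)
{
    (void)path;
    return fake_pinned_fd;
}

static void fake_close(int fd)
{
    (void)fd;
}
```

With the stand-ins, the first caller keeps its own descriptor and a second caller that loses the race ends up with the descriptor of the already pinned map, so both share one map.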
David Ahern [Thu, 19 Sep 2019 14:55:53 +0000 (07:55 -0700)]
Merge branch 'master' into next

Conflicts:
devlink/devlink.c

Fixed the conflict by updating the numbering for all new attributes
after the ones in master branch.

Signed-off-by: David Ahern <dsahern@gmail.com>
Jiri Pirko [Mon, 16 Sep 2019 09:44:48 +0000 (11:44 +0200)]
devlink: add reload failed indication

Add an indication of a previously failed devlink reload.

Example outputs:

$ devlink dev
netdevsim/netdevsim10: reload_failed true
$ devlink dev -j -p
{
    "dev": {
        "netdevsim/netdevsim10": {
            "reload_failed": true
        }
    }
}

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Andrea Claudi [Mon, 16 Sep 2019 13:00:55 +0000 (15:00 +0200)]
bpf: replace snprintf with asprintf when dealing with long buffers

This reduces stack usage, as asprintf allocates memory on the heap.

This indirectly fixes a snprintf truncation warning (from gcc v9.2.1):

bpf.c: In function ‘bpf_get_work_dir’:
bpf.c:784:49: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=]
  784 |  snprintf(bpf_wrk_dir, sizeof(bpf_wrk_dir), "%s/", mnt);
      |                                                 ^
bpf.c:784:2: note: ‘snprintf’ output between 2 and 4097 bytes into a destination of size 4096
  784 |  snprintf(bpf_wrk_dir, sizeof(bpf_wrk_dir), "%s/", mnt);
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: e42256699cac ("bpf: make tc's bpf loader generic and move into lib")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
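
The snprintf-to-asprintf change described above can be sketched like this. It is a minimal illustration under assumptions: `make_work_dir()` is a hypothetical helper, not the function in bpf.c, and the caller is assumed to own and free the result.

```c
#define _GNU_SOURCE     /* asprintf() is a GNU/BSD extension */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a heap-allocated "<mnt>/" string, or NULL
 * on allocation failure. The caller must free() the result. */
static char *make_work_dir(const char *mnt)
{
    char *dir;

    /* asprintf() sizes the buffer itself, so the output cannot be truncated. */
    if (asprintf(&dir, "%s/", mnt) < 0)
        return NULL;    /* dir is undefined on failure, do not use it */
    return dir;
}
```

Because the buffer is sized to fit, there is no fixed-size destination for the compiler to warn about, at the cost of one allocation that has to be released.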
Nicolas Dichtel [Mon, 16 Sep 2019 15:36:27 +0000 (17:36 +0200)]
link_xfrm: don't force to set phydev

Since linux commit 22d6552f827e ("xfrm interface: fix management of
phydev"), phydev is not mandatory anymore.

Note that it also could be useful before the above commit to not force the
user to put a phydev (the kernel was checking it anyway).
For example, it was useful to not set it in case of x-netns, because the
phydev is not available in the current netns:

Before the patch:
$ ip netns add foo
$ ip link add xfrm1 type xfrm dev eth1 if_id 1
$ ip link set xfrm1 netns foo
$ ip -n foo link set xfrm1 type xfrm dev eth1 if_id 2
Cannot find device "eth1"
$ ip -n foo link set xfrm1 type xfrm if_id 2
must specify physical device

Fixes: 286446c1e8c7 ("ip: support for xfrm interfaces")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Matt Ellison <matt@arroyo.io>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Andrea Claudi [Wed, 11 Sep 2019 10:19:29 +0000 (12:19 +0200)]
man: ss.8: add documentation for drop counter

After commit 6df9c7a06a845 ("ss: add SK_MEMINFO_DROPS display") ss -m
also displays a drop counter for each socket.

This commit properly documents it in the man page.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Mark Zhang [Wed, 11 Sep 2019 08:12:43 +0000 (11:12 +0300)]
rdma: Check comm string before print in print_comm()

Broken (non-upstream) kernels can provide a wrong, empty "comm" field,
which causes a segfault while printing in JSON format.

Fixes: 8ecac46a60ff ("rdma: Add QP resource tracking information")
Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
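
A defensive check of the kind described above might look like the sketch below. `print_comm_checked()` is a hypothetical function, not the rdma tool's print_comm(); skipping the field entirely when the string is missing or empty is an assumption about the desired behaviour.

```c
#include <stdio.h>

/* Hypothetical printer: emit "comm <name> " only when a name is present.
 * Returns the number of characters written, or 0 if nothing was printed. */
static int print_comm_checked(FILE *out, const char *comm)
{
    if (!comm || !*comm)
        return 0;       /* missing or empty string: print nothing */
    return fprintf(out, "comm %s ", comm);
}
```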
Jiri Pirko [Thu, 12 Sep 2019 11:29:38 +0000 (13:29 +0200)]
devlink: implement flash status monitoring

Listen to status notifications coming from the kernel during flashing and
put them on stdout to inform the user about the status.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Jiri Pirko [Thu, 12 Sep 2019 11:29:37 +0000 (13:29 +0200)]
devlink: implement flash update status monitoring

Kernel sends notifications about flash update status, so implement these
messages for monitoring.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Dirk van der Merwe [Wed, 11 Sep 2019 14:56:29 +0000 (15:56 +0100)]
devlink: unknown 'fw_load_policy' string validation

The 'fw_load_policy' devlink parameter now supports an unknown value.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Dirk van der Merwe [Wed, 11 Sep 2019 13:05:17 +0000 (14:05 +0100)]
devlink: add 'reset_dev_on_drv_probe' devlink param

Add support for the new devlink parameter along with string to uint
conversion.

Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
David Dai [Wed, 4 Sep 2019 15:06:51 +0000 (10:06 -0500)]
iproute2-next: police: support 64bit rate and peakrate in tc utility

High speed adapters such as the Mellanox CX-5 card can reach up to
100 Gbits per second of bandwidth. Currently htb already supports 64bit rate
in the tc utility. However, the police action rate and peakrate are still
limited to 32bit values (up to 32 Gbits per second). By taking advantage of
the 2 new kernel attributes TCA_POLICE_RATE64 and TCA_POLICE_PEAKRATE64,
tc can break the 32bit limit and still keep backward
binary compatibility.

Tested-by: David Dai <zdai@linux.vnet.ibm.com>
Signed-off-by: David Dai <zdai@linux.vnet.ibm.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
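
One way to keep the old 32-bit field while adding a 64-bit attribute, as the commit above describes, is sketched here. The struct and function names are hypothetical, no netlink code is shown, and saturating the legacy field when the rate does not fit is an assumption rather than a statement about the actual tc implementation.

```c
#include <stdint.h>

/* Hypothetical result of encoding a rate for the kernel. */
struct rate_encoding {
    uint32_t legacy_rate;   /* value for the existing 32-bit field */
    int need_rate64;        /* nonzero: also send the 64-bit attribute */
    uint64_t rate64;        /* value for the 64-bit attribute */
};

/* Encode a rate given in bytes per second. */
static struct rate_encoding encode_rate(uint64_t bytes_per_sec)
{
    struct rate_encoding enc = { 0 };

    if (bytes_per_sec >= (UINT64_C(1) << 32)) {
        enc.legacy_rate = UINT32_MAX;   /* assumption: saturate the old field */
        enc.need_rate64 = 1;
        enc.rate64 = bytes_per_sec;
    } else {
        /* Fits in 32 bits: old kernels and new kernels see the same value. */
        enc.legacy_rate = (uint32_t)bytes_per_sec;
    }
    return enc;
}
```

Rates that fit in 32 bits are encoded exactly as before, which is what keeps the binary interface compatible; only larger rates need the additional attribute.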
David Ahern [Sun, 15 Sep 2019 17:32:58 +0000 (10:32 -0700)]
Update kernel headers

Update kernel headers to commit:
    aa2eaa8c272a ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")

Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Wed, 4 Sep 2019 15:09:52 +0000 (08:09 -0700)]
nexthop: Add space after blackhole

A space after 'blackhole' is missing. Add it to properly separate the
protocol when it is given.

Fixes: 63df8e8543b0 ("Add support for nexthop objects")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Andrea Claudi [Wed, 4 Sep 2019 17:26:14 +0000 (19:26 +0200)]
devlink: fix segfault on health command

devlink segfaults when using grace_period without a reporter:

$ devlink health set pci/0000:00:09.0 grace_period 3500
Segmentation fault

devlink is instead supposed to fail gracefully, printing a warning
message:

$ devlink health set pci/0000:00:09.0 grace_period 3500
Reporter's name is expected.

This happens because DL_OPT_HEALTH_REPORTER_NAME and
DL_OPT_HEALTH_REPORTER_GRACEFUL_PERIOD are both defined as BIT(27).
When dl_opts_put() parses the options and grace_period is set, it erroneously
tries to set the reporter name to null.

Fix this by shifting the enumeration by 1 bit, starting with
DL_OPT_HEALTH_REPORTER_GRACEFUL_PERIOD.

Fixes: b18d89195b16 ("devlink: Add devlink health set command")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
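
The flag collision described above can be reproduced in a few lines. The sketch uses hypothetical `OPT_*` names rather than the devlink definitions, and `wants_reporter_name()` only models the "is the reporter-name option present" test.

```c
#include <stdint.h>

#define BIT(nr) (UINT64_C(1) << (nr))

/* Hypothetical option flags; the names are illustrative only. */
#define OPT_REPORTER_NAME       BIT(27)
#define OPT_GRACE_PERIOD_BAD    BIT(27)     /* collides with OPT_REPORTER_NAME */
#define OPT_GRACE_PERIOD_FIXED  BIT(28)     /* shifted by one bit */

/* Returns nonzero when the reporter-name branch would run for this mask. */
static int wants_reporter_name(uint64_t present)
{
    return (present & OPT_REPORTER_NAME) != 0;
}
```

With the colliding value, a mask that only carries the grace period still triggers the reporter-name branch, which then works on a name that was never supplied. After the shift the two options are distinguishable again.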
Donald Sharp [Sat, 10 Aug 2019 00:18:43 +0000 (20:18 -0400)]
ip nexthop: Allow flush|list operations to specify a specific protocol

In the case where we have a large number of nexthops from a specific
protocol, allow the flush and list operations to take a protocol
to limit the commands' scope.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Wed, 4 Sep 2019 14:48:15 +0000 (07:48 -0700)]
Merge branch 'master' into next

Signed-off-by: David Ahern <dsahern@gmail.com>
Stephen Hemminger [Thu, 29 Aug 2019 23:20:21 +0000 (16:20 -0700)]
uapi: update bpf.h header

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
David Ahern [Sun, 18 Aug 2019 18:51:38 +0000 (11:51 -0700)]
Merge branch 'devlink-trap' into next

Ido Schimmel says:

====================

From: Ido Schimmel <idosch@mellanox.com>

This patchset adds devlink-trap support in iproute2.

Patch #1 increases the number of options devlink can handle.

Patches #2-#3 gradually add support for all devlink-trap commands.

Patch #4 adds a man page for devlink-trap.

See individual commit messages for example usage and output.

Changes in v2:
* Remove report option and monitor command since monitoring is done
  using drop monitor

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
Ido Schimmel [Tue, 13 Aug 2019 08:31:43 +0000 (11:31 +0300)]
devlink: Add man page for devlink-trap

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Ido Schimmel [Tue, 13 Aug 2019 08:31:42 +0000 (11:31 +0300)]
devlink: Add devlink trap group set and show commands

These commands are similar to the trap set and show commands, but
operate on a trap group and not individual traps. Example:

# devlink trap group set netdevsim/netdevsim10 group l3_drops action trap
# devlink -jps trap group show netdevsim/netdevsim10 group l3_drops
{
    "trap_group": {
        "netdevsim/netdevsim10": [ {
                "name": "l3_drops",
                "generic": true,
                "stats": {
                    "rx": {
                        "bytes": 0,
                        "packets": 0
                    }
                }
            } ]
    }
}

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Ido Schimmel [Tue, 13 Aug 2019 08:31:41 +0000 (11:31 +0300)]
devlink: Add devlink trap set and show commands

The trap set command allows the user to set the action of an individual
trap. Example:

# devlink trap set netdevsim/netdevsim10 trap blackhole_route action trap

The trap show command allows the user to get the current status of an
individual trap or a dump of all traps in case one is not specified.
When '-s' is specified the trap's statistics are shown. When '-v' is
specified the metadata types the trap can provide are shown. Example:

# devlink -jvps trap show netdevsim/netdevsim10 trap blackhole_route
{
    "trap": {
        "netdevsim/netdevsim10": [ {
                "name": "blackhole_route",
                "type": "drop",
                "generic": true,
                "action": "trap",
                "group": "l3_drops",
                "metadata": [ "input_port" ],
                "stats": {
                    "rx": {
                        "bytes": 0,
                        "packets": 0
                    }
                }
            } ]
    }
}

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Ido Schimmel [Tue, 13 Aug 2019 08:31:40 +0000 (11:31 +0300)]
devlink: Increase number of supported options

Currently, the number of supported options is capped at 32 which is a
problem given we are about to add a few more and go over the limit.

Increase the limit to 64 options.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
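
Assuming the cap comes from keeping one bit per option in a 32-bit mask, widening the mask to 64 bits can be sketched as below. `BIT_U64`, `opt_set()`, `opt_is_set()` and the option numbers are hypothetical and not taken from devlink.c.

```c
#include <stdint.h>

/* 64-bit bit macro: the shift is performed in a 64-bit type. */
#define BIT_U64(nr) (UINT64_C(1) << (nr))

/* Hypothetical option numbers; 40 would not fit in a 32-bit mask. */
enum { OPT_LOW = 3, OPT_HIGH = 40 };

static uint64_t opt_set(uint64_t mask, unsigned int nr)
{
    return mask | BIT_U64(nr);
}

static int opt_is_set(uint64_t mask, unsigned int nr)
{
    return (mask & BIT_U64(nr)) != 0;
}
```

Every variable that stores or passes the mask has to be 64 bits wide as well; truncating it to 32 bits silently drops all options numbered 32 and above.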
David Ahern [Sun, 18 Aug 2019 18:48:02 +0000 (11:48 -0700)]
Update kernel headers

Update kernel headers to commit:
    d83d508b74c4 ("Merge branch 'stmmac-next'")

Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Sun, 18 Aug 2019 18:40:30 +0000 (11:40 -0700)]
Merge branch 'master' into next

Signed-off-by: David Ahern <dsahern@gmail.com>
Donald Sharp [Sat, 10 Aug 2019 00:18:42 +0000 (20:18 -0400)]
ip nexthop: Add space to display properly when showing a group

When displaying a nexthop group made up of other nexthops, the display
line shows this when you have additional data at the end:

id 42 group 43/44/45/46/47/48/49/50/51/52/53/54/55/56/57/58/59/60/61/62/63/64/65/66/67/68/69/70/71/72/73/74proto zebra

Modify code so that it shows:

id 42 group 43/44/45/46/47/48/49/50/51/52/53/54/55/56/57/58/59/60/61/62/63/64/65/66/67/68/69/70/71/72/73/74 proto zebra

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Tue, 13 Aug 2019 01:21:10 +0000 (18:21 -0700)]
lib: fix spelling errors

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Tue, 13 Aug 2019 01:18:51 +0000 (18:18 -0700)]
tc: fix spelling errors

Minor spelling errors found by codespell

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Mon, 12 Aug 2019 17:58:49 +0000 (10:58 -0700)]
uapi: update socket.h

Upstream change to resolve gcc-9 issues.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Ido Schimmel [Mon, 12 Aug 2019 10:17:06 +0000 (13:17 +0300)]
tc: Fix block-handle support for filter operations

The revert of batchsize accidentally reverted more than it should
and broke shared block functionality.  Fix this by restoring the
original functionality.

To reproduce:

dst_ip 192.0.2.0/24 action drop
Unknown filter "block", hence option "10" is unparsable

Fixes: e991c04d64c0 ("Revert "tc: Add batchsize feature for filter and actions"")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Andrea Claudi [Fri, 2 Aug 2019 17:38:10 +0000 (19:38 +0200)]
ip tunnel: add json output

Add json support on iptunnel and ip6tunnel.
The plain text output format should remain the same.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Gal Pressman [Sun, 4 Aug 2019 08:07:56 +0000 (11:07 +0300)]
rdma: Add driver QP type string

The RDMA resource tracker now tracks driver QPs as well, so add the driver
QP type string to the qp_types_to_str function.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Wed, 7 Aug 2019 18:59:19 +0000 (11:59 -0700)]
Merge branch 'master' into next

Signed-off-by: David Ahern <dsahern@gmail.com>
Patrick Talbert [Sat, 3 Aug 2019 08:47:08 +0000 (10:47 +0200)]
ss: sctp: Formatting tweak in sctp_show_info for locals

'locals' output does not include a leading space so it runs up against
skmem:() output. Add a leading space to fix it.

Signed-off-by: Patrick Talbert <ptalbert@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Patrick Talbert [Sat, 3 Aug 2019 08:37:41 +0000 (10:37 +0200)]
ss: sctp: fix typo for nodelay

nodealy should be nodelay.

Signed-off-by: Patrick Talbert <ptalbert@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Jiri Pirko [Mon, 5 Aug 2019 09:56:56 +0000 (11:56 +0200)]
devlink: finish queue.h to list.h transition

Lose the "q" from the names and name the structure fields in the same
way the rest of the code does. Also, fix the list_add arg order, which
leads to a segfault.

Fixes: 33267017faf1 ("iproute2: devlink: port from sys/queue.h to list.h")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
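
Why the argument order of list_add matters can be seen with a minimal kernel-style list. The sketch below is an illustration that assumes the usual "new entry first, list head second" convention; it is not a copy of the list.h used by iproute2, and `list_init()` and `list_len()` are hypothetical helpers.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. Only head has to be initialised. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Count the entries on the list, not including the head itself. */
static size_t list_len(const struct list_head *head)
{
    const struct list_head *pos;
    size_t n = 0;

    for (pos = head->next; pos != head; pos = pos->next)
        n++;
    return n;
}
```

If the two arguments are swapped, the function reads `next` from the new, uninitialised element and writes through that garbage pointer, which is the kind of bug that shows up as a segfault.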
Stephen Hemminger [Fri, 2 Aug 2019 16:33:39 +0000 (09:33 -0700)]
tc: fflush after each command in batch mode

Restore behaviour of tc batch mode.
Flush stdout after each command.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Thu, 1 Aug 2019 00:27:59 +0000 (17:27 -0700)]
Revert "tc: Add batchsize feature for filter and actions"

This reverts commit 485d0c6001c4aa134b99c86913d6a7089b7b2ab0.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Thu, 1 Aug 2019 00:19:33 +0000 (17:19 -0700)]
Revert "tc: fix batch force option"

This reverts commit b133392468d1f404077a8f3554d1f63d48bb45e8.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Thu, 1 Aug 2019 00:19:18 +0000 (17:19 -0700)]
Revert "tc: flush after each command in batch mode"

This reverts commit d66fdfda71e4a30c1ca0ddb7b1a048bef30fe79e.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Thu, 1 Aug 2019 00:16:54 +0000 (17:16 -0700)]
Revert "tc: Remove pointless assignments in batch()"

This reverts commit 6358bbc381c6e38465838370bcbbdeb77ec3565a.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Yamin Friedman [Mon, 29 Jul 2019 07:42:26 +0000 (10:42 +0300)]
rdma: Document adaptive-moderation

Add documentation for setting adaptive-moderation for the ib device.

Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Yamin Friedman [Mon, 29 Jul 2019 07:42:25 +0000 (10:42 +0300)]
rdma: Control CQ adaptive moderation (DIM)

In order to set adaptive-moderation for an ib device the command is:
rdma dev set [DEV] adaptive-moderation [on|off]

rdma dev show -d
0: mlx5_0: node_type ca fw 16.25.0319 node_guid 248a:0703:00a5:29d0
sys_image_guid 248a:0703:00a5:29d0 adaptive-moderation on
caps: <BAD_PKEY_CNTR, BAD_QKEY_CNTR, AUTO_PATH_MIG, CHANGE_PHY_PORT,
PORT_ACTIVE_EVENT, SYS_IMAGE_GUID, RC_RNR_NAK_GEN, MEM_WINDOW, XRC,
MEM_MGT_EXTENSIONS, BLOCK_MULTICAST_LOOPBACK, MEM_WINDOW_TYPE_2B,
RAW_IP_CSUM, CROSS_CHANNEL, MANAGED_FLOW_STEERING, SIGNATURE_HANDOVER,
ON_DEMAND_PAGING, SG_GAPS_REG, RAW_SCATTER_FCS, PCI_WRITE_END_PADDING>

rdma resource show cq
dev mlx5_0 cqn 0 cqe 1023 users 4 poll-ctx UNBOUND_WORKQUEUE
adaptive-moderation off comm [ib_core]

Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Stephen Hemminger [Mon, 29 Jul 2019 15:45:32 +0000 (08:45 -0700)]
json_print: drop extra semi-colons

The _PRINT_FUNC() macro expands to a function call.
Putting a semi-colon after it is unnecessary and causes warnings with -pedantic.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Kurt Kanzenbach [Thu, 4 Jul 2019 12:24:27 +0000 (14:24 +0200)]
utils: Fix get_s64() function

get_s64() internally uses strtoll() to parse the value out of a given
string. strtoll() returns a long long. However, the intermediate variable is
only a long, which might be 32 bits wide on some systems. So, fix it.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
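
The truncation described above goes away once the intermediate variable has the type that strtoll() returns. The sketch below is an illustration, not the iproute2 function: `parse_s64()` is a hypothetical name and its error handling is an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a signed 64-bit integer from a string.
 * Returns 0 on success and -1 on malformed or out-of-range input. */
static int parse_s64(int64_t *val, const char *arg, int base)
{
    long long res;      /* must be long long: a plain long may be 32 bits */
    char *end;

    if (!arg || !*arg)
        return -1;

    errno = 0;
    res = strtoll(arg, &end, base);
    if (end == arg || *end != '\0')
        return -1;      /* no digits, or trailing garbage */
    if (errno == ERANGE)
        return -1;      /* value does not fit in a long long */

    *val = res;
    return 0;
}
```

With a `long` intermediate on a system where `long` is 32 bits wide, a value such as 5000000000 would be cut down before it ever reaches the 64-bit result.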
Stephen Hemminger [Fri, 26 Jul 2019 21:59:59 +0000 (14:59 -0700)]
iplink: document 'change' option to ip link

Add the command alias "change" to man page.
Don't show it on usage, since it is not commonly used.

Reported-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Matteo Croce <mcroce@redhat.com>
Antonio Borneo [Fri, 26 Jul 2019 13:06:09 +0000 (15:06 +0200)]
iplink_can: fix format output of clock with flag -details

The command
ip -details link show can0
prints in the last line the value of the clock frequency attached
to the name of the following value "numtxqueues", e.g.
clock 49500000numtxqueues 1 numrxqueues 1 gso_max_size
 65536 gso_max_segs 65535

Add the missing space after the clock value.

Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com>
Sergei Trofimovich [Fri, 26 Jul 2019 21:01:05 +0000 (22:01 +0100)]
iproute2: devlink: port from sys/queue.h to list.h

sys/queue.h does not exist on linux-musl targets and the build fails with:

    devlink.c:28:10: fatal error: sys/queue.h: No such file or directory
       28 | #include <sys/queue.h>
          |          ^~~~~~~~~~~~~

The change ports the code to the list.h API and drops the dependency on
'sys/queue.h'. The API maps one-to-one.

Build-tested on linux-musl and linux-glibc.

Bug: https://bugs.gentoo.org/690486
CC: Stephen Hemminger <stephen@networkplumber.org>
CC: netdev@vger.kernel.org
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Mon, 22 Jul 2019 16:45:09 +0000 (09:45 -0700)]
uapi: update kernel headers from 5.3-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Mark Zhang [Wed, 17 Jul 2019 14:31:56 +0000 (17:31 +0300)]
rdma: Document counter statistic

Add documentation for accessing the QP counter, including binding/unbinding
a QP to a counter manually or automatically, and dumping counter statistics.

Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Mark Zhang [Wed, 17 Jul 2019 14:31:55 +0000 (17:31 +0300)]
rdma: Add default counter show support

Show the default counter statistics, which are the same as those exposed
through the sysfs interface: /sys/class/infiniband/<dev>/ports/<port>/hw_counters/

Example:
$ rdma stat show link mlx5_2/1
link mlx5_2/1 rx_write_requests 8 rx_read_requests 4 rx_atomic_requests 0
out_of_buffer 0 out_of_sequence 0 duplicate_request 0 rnr_nak_retry_err 0
packet_seq_err 0 implied_nak_seq_err 0 local_ack_timeout_err 0
resp_local_length_error 0 resp_cqe_error 0 req_cqe_error 0
req_remote_invalid_request 0 req_remote_access_errors 0
resp_remote_access_errors 0 resp_cqe_flush_error 0 req_cqe_flush_error 0
rp_cnp_ignored 0 rp_cnp_handled 0 np_ecn_marked_roce_packets 0
np_cnp_sent 0 rx_icrc_encapsulated 0

Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agordma: Add stat manual mode support
Mark Zhang [Wed, 17 Jul 2019 14:31:54 +0000 (17:31 +0300)]
rdma: Add stat manual mode support

In manual mode a QP can be manually bound to a counter. If the counter
id (cntn) is not specified, the kernel will allocate one. After a
successful bind, the cntn can be seen through "rdma statistic qp show".
On unbind, if lqpn is not specified, all QPs on this counter will
be unbound.
The manual and auto modes are mutually exclusive.

Examples:
$ rdma statistic qp bind link mlx5_2/1 lqpn 178
$ rdma statistic qp bind link mlx5_2/1 lqpn 178 cntn 4
$ rdma statistic qp unbind link mlx5_2/1 cntn 4
$ rdma statistic qp unbind link mlx5_2/1 cntn 4 lqpn 178

Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agordma: Make get_port_from_argv() returns valid port in strict port mode
Mark Zhang [Wed, 17 Jul 2019 14:31:53 +0000 (17:31 +0300)]
rdma: Make get_port_from_argv() returns valid port in strict port mode

When strict_port is set, make get_port_from_argv() return failure if
no valid port is specified.

Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agordma: Add rdma statistic counter per-port auto mode support
Mark Zhang [Wed, 17 Jul 2019 14:31:52 +0000 (17:31 +0300)]
rdma: Add rdma statistic counter per-port auto mode support

With per-QP statistic counter support, a user is allowed to monitor
specific QP categories, which are bound to/unbound from counters
that are dynamically allocated/deallocated.

In per-port "auto" mode, QPs are bound to counters automatically
according to common criteria. For example, in a per-"type" (qp type)
scheme, all QPs in a process that have the same qp type are bound
automatically to a single counter.
Currently only "type" (qp type) is supported. Examples:

$ rdma statistic qp set link mlx5_2/1 auto type on
$ rdma statistic qp set link mlx5_2/1 auto off

Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agordma: Add get per-port counter mode support
Mark Zhang [Wed, 17 Jul 2019 14:31:51 +0000 (17:31 +0300)]
rdma: Add get per-port counter mode support

Add an interface to show which mode is active. Two modes are supported:
- "auto": In this mode all QPs belonging to one category are bound automatically
  to a single counter set. Currently only "qp type" is supported;
- "manual": In this mode QPs are bound to a counter manually.

Examples:
$ rdma statistic qp mode
0/1: mlx5_0/1: qp auto off
1/1: mlx5_1/1: qp auto off
2/1: mlx5_2/1: qp auto type on
3/1: mlx5_3/1: qp auto off

$ rdma statistic qp mode link mlx5_0
0/1: mlx5_0/1: qp auto off

Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agordma: Add "stat qp show" support
Mark Zhang [Wed, 17 Jul 2019 14:31:50 +0000 (17:31 +0300)]
rdma: Add "stat qp show" support

This patch presents link, id, task name, lqpn, as well as all sub
counters of a QP counter.
A QP counter is a dynamically allocated statistic counter that is
bound to one or more QPs. It has several sub-counters, each of which is
used for a different purpose.

Examples:
$ rdma stat qp show
link mlx5_2/1 cntn 5 pid 31609 comm client.1 rx_write_requests 0
rx_read_requests 0 rx_atomic_requests 0 out_of_buffer 0 out_of_sequence 0
duplicate_request 0 rnr_nak_retry_err 0 packet_seq_err 0
implied_nak_seq_err 0 local_ack_timeout_err 0 resp_local_length_error 0
resp_cqe_error 0 req_cqe_error 0 req_remote_invalid_request 0
req_remote_access_errors 0 resp_remote_access_errors 0
resp_cqe_flush_error 0 req_cqe_flush_error 0
    LQPN: <178>
$ rdma stat show link rocep1s0f5/1
link rocep1s0f5/1 rx_write_requests 0 rx_read_requests 0 rx_atomic_requests 0 out_of_buffer 0 duplicate_request 0
rnr_nak_retry_err 0 packet_seq_err 0 implied_nak_seq_err 0 local_ack_timeout_err 0 resp_local_length_error 0 resp_cqe_error 0
req_cqe_error 0 req_remote_invalid_request 0 req_remote_access_errors 0 resp_remote_access_errors 0 resp_cqe_flush_error 0
req_cqe_flush_error 0 rp_cnp_ignored 0 rp_cnp_handled 0 np_ecn_marked_roce_packets 0 np_cnp_sent 0
$ rdma stat show link rocep1s0f5/1 -p
link rocep1s0f5/1
    rx_write_requests 0
    rx_read_requests 0
    rx_atomic_requests 0
    out_of_buffer 0
    duplicate_request 0
    rnr_nak_retry_err 0
    packet_seq_err 0
    implied_nak_seq_err 0
    local_ack_timeout_err 0
    resp_local_length_error 0
    resp_cqe_error 0
    req_cqe_error 0
    req_remote_invalid_request 0
    req_remote_access_errors 0
    resp_remote_access_errors 0
    resp_cqe_flush_error 0
    req_cqe_flush_error 0
    rp_cnp_ignored 0
    rp_cnp_handled 0
    np_ecn_marked_roce_packets 0
    np_cnp_sent 0

Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agouapi: fix bpf comment typo
Stephen Hemminger [Fri, 19 Jul 2019 17:49:36 +0000 (10:49 -0700)]
uapi: fix bpf comment typo

From upstream.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agojson: fix backslash escape typo in jsonw_puts
Ivan Delalande [Thu, 18 Jul 2019 01:15:31 +0000 (18:15 -0700)]
json: fix backslash escape typo in jsonw_puts

Fixes: fcc16c22 ("provide common json output formatter")
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agotc: taprio: Update documentation
Vedang Patel [Thu, 18 Jul 2019 19:55:43 +0000 (12:55 -0700)]
tc: taprio: Update documentation

Add documentation for the latest options, flags and txtime-delay, to the
taprio manpage.

This also adds an example to run tc in txtime offload mode.

Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agotc: etf: Add documentation for skip_sock_check.
Vedang Patel [Thu, 18 Jul 2019 19:55:42 +0000 (12:55 -0700)]
tc: etf: Add documentation for skip_sock_check.

Document the newly added option (skip_sock_check) on the etf man-page.

Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agotaprio: add support for setting txtime_delay.
Vedang Patel [Thu, 18 Jul 2019 19:55:41 +0000 (12:55 -0700)]
taprio: add support for setting txtime_delay.

This adds support for setting the txtime_delay parameter which is useful
for the txtime offload mode of taprio.

Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agotaprio: Add support for setting flags
Vinicius Costa Gomes [Thu, 18 Jul 2019 19:55:40 +0000 (12:55 -0700)]
taprio: Add support for setting flags

This allows a new parameter, flags, to be passed to taprio. Currently, it
only supports enabling the txtime-assist mode. But, we plan to add
different modes for taprio (e.g. hardware offloading) and this parameter
will be useful in enabling those modes.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agoetf: Add skip_sock_check
Vedang Patel [Thu, 18 Jul 2019 19:55:39 +0000 (12:55 -0700)]
etf: Add skip_sock_check

The ETF Qdisc currently checks for a socket with the SO_TXTIME socket option. If
either is not present, the packet is dropped. In future commits, we
want other Qdiscs to add packets with a launchtime to the ETF Qdisc. Also,
there are some packets (e.g. ICMP packets) which may not have a socket
associated with them. So, add an option to skip this check.

Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agoMerge branch 'tc-conntrack' into next
David Ahern [Thu, 18 Jul 2019 22:42:13 +0000 (15:42 -0700)]
Merge branch 'tc-conntrack' into next

Paul Blakey  says:

====================

This patch series adds connection tracking capabilities to tc.
It does so via a new tc action, called act_ct, and new tc flower classifier matching.
act_ct and the relevant flower matches are still under review on the net-next mailing list.

Usage is as follows:
$ tc qdisc add dev ens1f0_0 ingress
$ tc qdisc add dev ens1f0_1 ingress

$ tc filter add dev ens1f0_0 ingress \
  prio 1 chain 0 proto ip \
  flower ip_proto tcp ct_state -trk \
  action ct zone 2 pipe \
  action goto chain 2
$ tc filter add dev ens1f0_0 ingress \
  prio 1 chain 2 proto ip \
  flower ct_state +trk+new \
  action ct zone 2 commit mark 0xbb nat src addr 5.5.5.7 pipe \
  action mirred egress redirect dev ens1f0_1
$ tc filter add dev ens1f0_0 ingress \
  prio 1 chain 2 proto ip \
  flower ct_zone 2 ct_mark 0xbb ct_state +trk+est \
  action ct nat pipe \
  action mirred egress redirect dev ens1f0_1

$ tc filter add dev ens1f0_1 ingress \
  prio 1 chain 0 proto ip \
  flower ip_proto tcp ct_state -trk \
  action ct zone 2 pipe \
  action goto chain 1
$ tc filter add dev ens1f0_1 ingress \
  prio 1 chain 1 proto ip \
  flower ct_zone 2 ct_mark 0xbb ct_state +trk+est \
  action ct nat pipe \
  action mirred egress redirect dev ens1f0_0

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agotc: flower: Add matching on conntrack info
Paul Blakey [Thu, 11 Jul 2019 08:14:27 +0000 (11:14 +0300)]
tc: flower: Add matching on conntrack info

Matches on conntrack state, zone, mark, and label.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agotc: Introduce tc ct action
Paul Blakey [Thu, 11 Jul 2019 08:14:26 +0000 (11:14 +0300)]
tc: Introduce tc ct action

New tc action to send packets to conntrack module, commit
them, and set a zone, labels, mark, and nat on the connection.

It can also clear the packet's conntrack state by using clear.

Usage:
   ct clear
   ct commit [force] [zone] [mark] [label] [nat]
   ct [nat] [zone]

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agoImport tc_act/tc_ct.h uapi file
David Ahern [Thu, 18 Jul 2019 22:40:07 +0000 (15:40 -0700)]
Import tc_act/tc_ct.h uapi file

Import include/uapi/linux/tc_act/tc_ct.h header from commit of last
kernel headers sync.

Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agotc: add NLA_F_NESTED flag to all actions options nested block
Paul Blakey [Thu, 11 Jul 2019 08:14:25 +0000 (11:14 +0300)]
tc: add NLA_F_NESTED flag to all actions options nested block

Strict netlink validation now requires this flag on all nested
attributes, add it for action options.
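
As a rough illustration of what the flag means on the wire, the sketch
below ORs NLA_F_NESTED into the type of a nested attribute header. The
helper names nest_start_flagged() and attr_base_type() are hypothetical
and are not taken from the tc sources; only struct nlattr, NLA_F_NESTED,
NLA_TYPE_MASK and NLA_HDRLEN come from <linux/netlink.h>.

```c
#include <assert.h>
#include <linux/netlink.h>
#include <stdint.h>

/*
 * Hypothetical helper: fill in the header of a nested attribute and mark
 * it with NLA_F_NESTED, as strict netlink validation expects.
 */
void nest_start_flagged(struct nlattr *nla, uint16_t type)
{
	/* The flag bits must not already be part of the plain type. */
	assert((type & ~NLA_TYPE_MASK) == 0);

	nla->nla_type = type | NLA_F_NESTED;
	nla->nla_len = NLA_HDRLEN;	/* grows as child attributes are added */
}

/* Hypothetical helper: recover the plain attribute type, flags removed. */
uint16_t attr_base_type(const struct nlattr *nla)
{
	return nla->nla_type & NLA_TYPE_MASK;
}
```

A receiver that masks with NLA_TYPE_MASK sees the same attribute type
whether or not the flag is set, which is presumably why adding the flag
is safe for older kernels.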

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agotunnel: factorize printout of GRE key and flags
Andrea Claudi [Fri, 12 Jul 2019 17:02:14 +0000 (19:02 +0200)]
tunnel: factorize printout of GRE key and flags

The print_tunnel() functions in ip6tunnel.c and iptunnel.c contain
the same code to print out the GRE key and flags.

This commit factors that code out into a helper function in tunnel.c.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agoip tunnel: warn when changing IPv6 tunnel without tunnel name
Andrea Claudi [Tue, 9 Jul 2019 13:16:51 +0000 (15:16 +0200)]
ip tunnel: warn when changing IPv6 tunnel without tunnel name

Tunnel change fails if a tunnel name is not specified while using
'ip -6 tunnel change'. However, no warning message is printed and
no error code is returned.

$ ip -6 tunnel add ip6tnl1 mode ip6gre local fd::1 remote fd::2 tos inherit ttl 127 encaplimit none dev dummy0
$ ip -6 tunnel change dev dummy0 local 2001:1234::1 remote 2001:1234::2
$ ip -6 tunnel show ip6tnl1
ip6tnl1: gre/ipv6 remote fd::2 local fd::1 dev dummy0 encaplimit none hoplimit 127 tclass inherit flowlabel 0x00000 (flowinfo 0x00000000)

This commit checks if the tunnel interface name is an empty
string: in this case, it prints a warning message to the user.
It intentionally avoids returning an error so as not to break existing
script setups.

This is the output after this commit:
$ ip -6 tunnel add ip6tnl1 mode ip6gre local fd::1 remote fd::2 tos inherit ttl 127 encaplimit none dev dummy0
$ ip -6 tunnel change dev dummy0 local 2001:1234::1 remote 2001:1234::2
Tunnel interface name not specified
$ ip -6 tunnel show ip6tnl1
ip6tnl1: gre/ipv6 remote fd::2 local fd::1 dev dummy0 encaplimit none hoplimit 127 tclass inherit flowlabel 0x00000 (flowinfo 0x00000000)

Reviewed-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agoRevert "ip6tunnel: fix 'ip -6 {show|change} dev <name>' cmds"
Andrea Claudi [Tue, 9 Jul 2019 13:16:50 +0000 (15:16 +0200)]
Revert "ip6tunnel: fix 'ip -6 {show|change} dev <name>' cmds"

This reverts commit ba126dcad20e6d0e472586541d78bdd1ac4f1123.
It breaks tunnel creation when using 'dev' parameter:

$ ip link add type dummy
$ ip -6 tunnel add ip6tnl1 mode ip6ip6 remote 2001:db8:ffff:100::2 local 2001:db8:ffff:100::1 hoplimit 1 tclass 0x0 dev dummy0
add tunnel "ip6tnl0" failed: File exists

The dev parameter must be used to specify the device to which
the tunnel is bound, and not the tunnel itself.

Reported-by: Jianwen Ji <jiji@redhat.com>
Reviewed-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agouapi: rdma netlink.h update
Stephen Hemminger [Tue, 16 Jul 2019 18:58:44 +0000 (11:58 -0700)]
uapi: rdma netlink.h update

From upstream 5.3-rc

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agouapi: update uapi/magic.h
Stephen Hemminger [Tue, 16 Jul 2019 18:56:58 +0000 (11:56 -0700)]
uapi: update uapi/magic.h

From upstream

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agodevlink: Remove enclosing array brackets binary print with json format
Aya Levin [Wed, 10 Jul 2019 11:03:21 +0000 (14:03 +0300)]
devlink: Remove enclosing array brackets binary print with json format

Keep pr_out_binary_value function only for printing. Inner relations
like array grouping should be done outside the function.

Fixes: 844a61764c6f ("devlink: Add helper functions for name and value separately")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agodevlink: Fix binary values print
Aya Levin [Wed, 10 Jul 2019 11:03:20 +0000 (14:03 +0300)]
devlink: Fix binary values print

Fix function pr_out_binary_value() to start printing the binary buffer
from offset 0 instead of offset 1. Remove the redundant newline at the
beginning of the output.

Example:
With patch:
 mlx5e_txqsq:
   05 00 00 00 05 00 00 00 01 00 00 00 00 00 00 00
   00 00 00 00 00 00 00 00 8e 6e 3a 13 07 00 00 00
   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   c0
Without patch:
  mlx5e_txqsq:

  00 00 00 05 00 00 00 01 00 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 8e 6e 3a 13 07 00 00 00 00
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0
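
The off-by-one described above can be sketched as follows. This is a
minimal illustration, not the devlink code: format_binary() is a
hypothetical name, and writing into a caller-supplied string (rather than
printing directly) is an assumption made to keep the example easy to
check. The loop starts at index 0 and emits no separator before the first
byte, so there is neither a dropped byte nor a leading newline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical sketch: format `len` bytes of `buf` as hex into `out`,
 * 16 bytes per line, starting at offset 0. Returns the number of
 * characters written (excluding the NUL), or -1 if `out` is too small.
 */
int format_binary(const uint8_t *buf, size_t len, char *out, size_t outsz)
{
	size_t used = 0;
	size_t i;

	assert(out != NULL && (buf != NULL || len == 0));
	if (outsz == 0)
		return -1;
	out[0] = '\0';

	for (i = 0; i < len; i++) {	/* starting at 1 would drop byte 0 */
		/* Nothing before byte 0, a newline before every 16th byte. */
		const char *sep = i == 0 ? "" : (i % 16 == 0 ? "\n" : " ");
		int n = snprintf(out + used, outsz - used, "%s%02x", sep, buf[i]);

		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```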

Fixes: 844a61764c6f ("devlink: Add helper functions for name and value separately")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agodevlink: Change devlink health dump show command to dumpit
Aya Levin [Wed, 10 Jul 2019 11:03:19 +0000 (14:03 +0300)]
devlink: Change devlink health dump show command to dumpit

Although devlink health dump show command is given per reporter, it
returns large amounts of data. Trying to use the doit cb results in
OUT-OF-BUFFER error. This complementary patch raises the DUMP flag in
order to invoke the dumpit cb. We're safe as no existing drivers
implement the dump health reporter option yet.

Fixes: 041e6e651a8e ("devlink: Add devlink health dump show command")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agoutils: don't match empty strings as prefixes
Matteo Croce [Mon, 15 Jul 2019 18:04:30 +0000 (20:04 +0200)]
utils: don't match empty strings as prefixes

iproute has a utility function which checks if a string is a prefix for
another one, to allow use of abbreviated commands, e.g. 'addr' or 'a'
instead of 'address'.

This routine unfortunately considers an empty string as prefix
of any pattern, leading to undefined behaviour when an empty
argument is passed to ip:

    # ip ''
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever

    # tc ''
    qdisc noqueue 0: dev lo root refcnt 2

    # ip address add 192.0.2.0/24 '' 198.51.100.1 dev dummy0
    # ip addr show dev dummy0
    6: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
        link/ether 02:9d:5e:e9:3f:c0 brd ff:ff:ff:ff:ff:ff
        inet 192.0.2.0/24 brd 198.51.100.1 scope global dummy0
           valid_lft forever preferred_lft forever

Rewrite matches() so it takes care of an empty input, and doesn't
scan the input strings three times: the current implementation
does 2 strlen and a memcpy to accomplish the same task.
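
A single-pass prefix check along these lines might look like the sketch
below. It is not the iproute2 matches() itself: the name
is_nonempty_prefix() and the boolean return convention are assumptions
made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch: return true when `prefix` is a non-empty prefix
 * of `string`, walking both strings only once.
 */
bool is_nonempty_prefix(const char *prefix, const char *string)
{
	assert(prefix != NULL && string != NULL);

	/* An empty argument never counts as an abbreviation. */
	if (*prefix == '\0')
		return false;

	/* Advance through both strings until they differ or one ends. */
	while (*string != '\0' && *prefix == *string) {
		prefix++;
		string++;
	}

	/* Match only if every character of the prefix was consumed. */
	return *prefix == '\0';
}
```

With this shape, 'a' and 'addr' are accepted as abbreviations of
'address', while '' and 'addresses' are rejected.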

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agotc: util: constrain percentage in 0-100 interval
Andrea Claudi [Sat, 13 Jul 2019 09:44:07 +0000 (11:44 +0200)]
tc: util: constrain percentage in 0-100 interval

parse_percent() currently allows specifying negative percentages
or values above 100%. However, this does not seem to make sense,
as the function is used for probabilities or bandwidth rates.

Moreover, using negative values leads to erroneous results
(using Bernoulli loss model as example):

$ ip link add test type dummy
$ ip link set test up
$ tc qdisc add dev test root netem loss gemodel -10% limit 10
$ tc qdisc show dev test
qdisc netem 800c: root refcnt 2 limit 10 loss gemodel p 90% r 10% 1-h 100% 1-k 0%

Using values above 100% we have instead:

$ ip link add test type dummy
$ ip link set test up
$ tc qdisc add dev test root netem loss gemodel 140% limit 10
$ tc qdisc show dev test
qdisc netem 800f: root refcnt 2 limit 10 loss gemodel p 40% r 60% 1-h 100% 1-k 0%

This commit changes parse_percent() with a check to ensure
percentage values stay between 0.0 and 1.0.
parse_percent_rate() function, which already employs a similar
check, is adjusted accordingly.
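
The kind of range check described here can be sketched as below. This is
an illustration under stated assumptions, not the tc parse_percent():
the name parse_percent_checked(), the mandatory '%' suffix and the
0/-1 return convention are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: parse "<number>%" into a fraction in [0.0, 1.0].
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
int parse_percent_checked(double *val, const char *str)
{
	char *end;
	double pct;

	assert(val != NULL && str != NULL);

	errno = 0;
	pct = strtod(str, &end);
	if (end == str || errno == ERANGE)
		return -1;		/* no digits, or overflow/underflow */
	if (end[0] != '%' || end[1] != '\0')
		return -1;		/* this sketch requires a '%' suffix */
	if (!(pct >= 0.0 && pct <= 100.0))
		return -1;		/* negative, above 100%, or NaN */

	*val = pct / 100.0;
	return 0;
}
```

Writing the range test as a negated conjunction also rejects NaN, which
would slip through a pair of separate "< 0" and "> 100" comparisons.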

With this check in place, we have:

$ ip link add test type dummy
$ ip link set test up
$ tc qdisc add dev test root netem loss gemodel -10% limit 10
Illegal "loss gemodel p"

Fixes: 927e3cfb52b58 ("tc: B.W limits can now be specified in %.")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agouapi: fix bpf.h link
Stephen Hemminger [Thu, 11 Jul 2019 22:36:29 +0000 (15:36 -0700)]
uapi: fix bpf.h link

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agotc: print all error messages to stderr
Stephen Hemminger [Tue, 9 Jul 2019 21:25:14 +0000 (14:25 -0700)]
tc: print all error messages to stderr

Many tc modules were printing error messages to stdout.
This is problematic if using JSON or other output formats.
Change all these places to use fprintf(stderr, ...) instead.

Also, remove unnecessary initialization and places
where else is used after error return.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agoMerge branch 'master' into next
David Ahern [Wed, 10 Jul 2019 21:41:13 +0000 (14:41 -0700)]
Merge branch 'master' into next

Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agoMerge branch 'tc-mpls-action' into next
David Ahern [Wed, 10 Jul 2019 21:07:42 +0000 (14:07 -0700)]
Merge branch 'tc-mpls-action' into next

John Hurley  says:

====================

Recent kernel additions to TC allow the manipulation of MPLS headers as
filter actions.

The following patchset creates an iproute2 interface to the new actions
and includes documentation on how to use it.

v1->v2:
- change error from print_string() to fprintf(stderr,) (Stephen Hemminger)
- split long line in explain() message (David Ahern)
- use _SL_ instead of \n in print message (David Ahern)

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
4 years agoman: update man pages for TC MPLS actions
John Hurley [Wed, 10 Jul 2019 12:40:40 +0000 (13:40 +0100)]
man: update man pages for TC MPLS actions

Add a man page describing the newly added TC mpls manipulation actions.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsahern@gmail.com>