git.proxmox.com Git - mirror_iproute2.git/log
mirror_iproute2.git
3 years ago rdma: fix spelling error in comment
Stephen Hemminger [Sun, 8 Nov 2020 18:44:19 +0000 (10:44 -0800)]
rdma: fix spelling error in comment

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago man: fix spelling errors
Stephen Hemminger [Sun, 8 Nov 2020 18:40:30 +0000 (10:40 -0800)]
man: fix spelling errors

Lots of little typo errors on man pages.
Found by running codespell

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago tc/m_gate: fix spelling errors
Stephen Hemminger [Sun, 8 Nov 2020 18:34:23 +0000 (10:34 -0800)]
tc/m_gate: fix spelling errors

Fix spelling errors in error messages.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago uapi: updates from 5.10-rc1
Stephen Hemminger [Tue, 3 Nov 2020 16:29:14 +0000 (08:29 -0800)]
uapi: updates from 5.10-rc1

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago libnetlink: define __aligned conditionally
Johannes Berg [Mon, 26 Oct 2020 11:32:52 +0000 (12:32 +0100)]
libnetlink: define __aligned conditionally

On some systems (e.g. current Debian/stable) the inclusion
of utils.h pulled in some other things that may end up
defining __aligned, in a possibly different way than what
we had here.

Use our own definition only if there isn't one already.

Fixes: d5acae244f9d ("libnetlink: add nl_print_policy() helper")
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago Merge branch 'main' into next
David Ahern [Sun, 25 Oct 2020 21:08:01 +0000 (15:08 -0600)]
Merge branch 'main' into next

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago m_mpls: test the 'mac_push' action after 'modify'
Guillaume Nault [Thu, 22 Oct 2020 09:11:44 +0000 (11:11 +0200)]
m_mpls: test the 'mac_push' action after 'modify'

Commit 02a261b5ba1c ("m_mpls: add mac_push action") added a matches()
test for the "mac_push" string before the test for "modify".
This changes the previous behaviour as 'action m' used to match
"modify" while it now matches "mac_push".

Revert to the original behaviour by moving the "mac_push" test after
"modify".

Fixes: 02a261b5ba1c ("m_mpls: add mac_push action")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
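
The ordering problem fixed by the commit above can be sketched in a few lines. This is only an illustration, not the code in m_mpls.c: it assumes a prefix test in the spirit of iproute2's matches() helper (returning a boolean here for readability), and the two keyword orderings are hypothetical lists chosen to show the effect.

```python
from typing import List, Optional


def matches(arg: str, keyword: str) -> bool:
    """Return True when `arg` is a non-empty abbreviation (prefix) of `keyword`."""
    return bool(arg) and keyword.startswith(arg)


def first_match(arg: str, keywords: List[str]) -> Optional[str]:
    """Return the first keyword, in test order, that `arg` abbreviates."""
    for keyword in keywords:
        if matches(arg, keyword):
            return keyword
    return None


# Hypothetical test orders: "mac_push" tested before or after "modify".
ORDER_BEFORE_FIX = ["pop", "push", "mac_push", "modify", "dec_ttl"]
ORDER_AFTER_FIX = ["pop", "push", "modify", "dec_ttl", "mac_push"]

if __name__ == "__main__":
    print(first_match("m", ORDER_BEFORE_FIX))
    print(first_match("m", ORDER_AFTER_FIX))
```

Because the first successful prefix test wins, any new keyword sharing a prefix with an older one has to be tested after it to keep existing abbreviations stable.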
3 years ago Merge branch 'tipc-encryption' into next
David Ahern [Tue, 20 Oct 2020 15:05:00 +0000 (09:05 -0600)]
Merge branch 'tipc-encryption' into next

Tuong Lien says:

====================

This series adds two new options in the 'iproute2/tipc' command, enabling users
to use the new TIPC encryption features, i.e. the master key and rekeying which
have been recently merged in kernel.

The help menu of the "tipc node set key" command is also updated accordingly:

 # tipc node set key --help
Usage: tipc node set key KEY [algname ALGNAME] [PROPERTIES]
       tipc node set key rekeying REKEYING

KEY
  Symmetric KEY & SALT as a composite ASCII or hex string (0x...) in form:
  [KEY: 16, 24 or 32 octets][SALT: 4 octets]

ALGNAME
  Cipher algorithm [default: "gcm(aes)"]

PROPERTIES
  master                - Set KEY as a cluster master key
  <empty>               - Set KEY as a cluster key
  nodeid NODEID         - Set KEY as a per-node key for own or peer

REKEYING
  INTERVAL              - Set rekeying interval (in minutes) [0: disable]
  now                   - Trigger one (first) rekeying immediately

EXAMPLES
  tipc node set key this_is_a_master_key master
  tipc node set key 0x746869735F69735F615F6B657931365F73616C74
  tipc node set key this_is_a_key16_salt algname "gcm(aes)" nodeid 1001002
  tipc node set key rekeying 600

====================
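
The KEY format described in the help text above (a composite ASCII or 0x-prefixed hex string made of a 16, 24 or 32 octet key followed by a 4 octet salt) can be sketched as follows. The function name parse_key and the error messages are hypothetical and only illustrate the layout; this is not the parser used by the tipc tool.

```python
from typing import Tuple

SALT_LEN = 4                   # [SALT: 4 octets]
VALID_KEY_LENS = (16, 24, 32)  # [KEY: 16, 24 or 32 octets]


def parse_key(text: str) -> Tuple[bytes, bytes]:
    """Split a composite KEY string into (key, salt) byte strings."""
    if text.startswith("0x"):
        raw = bytes.fromhex(text[2:])  # hex form: 0x...
    else:
        raw = text.encode("ascii")     # plain ASCII form
    key, salt = raw[:-SALT_LEN], raw[-SALT_LEN:]
    if len(key) not in VALID_KEY_LENS:
        raise ValueError("key must be 16, 24 or 32 octets followed by a 4 octet salt")
    return key, salt


if __name__ == "__main__":
    print(parse_key("this_is_a_key16_salt"))
    print(parse_key("0x746869735F69735F615F6B657931365F73616C74"))
```

Both example strings from the help text decode to the same 16 octet key and 4 octet salt, which is why they are interchangeable on the command line.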

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago tipc: add option to set rekeying for encryption
Tuong Lien [Fri, 16 Oct 2020 16:02:01 +0000 (23:02 +0700)]
tipc: add option to set rekeying for encryption

The kernel allows TIPC encryption rekeying to be tuned through the
netlink attribute 'TIPC_NLA_NODE_REKEYING'. Add a corresponding
'rekeying' option to the 'tipc node set key' command so that the user
can perform that tuning:

tipc node set key rekeying REKEYING

where the 'REKEYING' value can be:

INTERVAL              - Set rekeying interval (in minutes) [0: disable]
now                   - Trigger one (first) rekeying immediately

For example:
$ tipc node set key rekeying 60
$ tipc node set key rekeying now
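
A minimal sketch of how the REKEYING argument could be mapped to a single 32-bit value: an interval in minutes, 0 to disable, or a sentinel for "now". The name REKEYING_NOW and its all-ones value are assumptions made for this illustration; the authoritative definition lives in the kernel's TIPC UAPI header.

```python
REKEYING_NOW = 0xFFFFFFFF  # assumed sentinel meaning "trigger one rekeying now"


def parse_rekeying(arg: str) -> int:
    """Map 'now' or an interval in minutes to the value sent to the kernel."""
    if arg == "now":
        return REKEYING_NOW
    minutes = int(arg)  # raises ValueError on non-numeric input
    if minutes < 0 or minutes >= REKEYING_NOW:
        raise ValueError("rekeying interval out of range")
    return minutes      # 0 disables rekeying


if __name__ == "__main__":
    print(parse_rekeying("60"))
    print(parse_rekeying("now") == REKEYING_NOW)
```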

The command's help menu is also updated with these descriptions for the
new command option.

Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago tipc: add option to set master key for encryption
Tuong Lien [Fri, 16 Oct 2020 16:02:00 +0000 (23:02 +0700)]
tipc: add option to set master key for encryption

To complement master key support in the kernel, add the 'master'
option to the 'tipc node set key' command so that the user can mark a
key as the master key when setting it. This is done by turning on the
new netlink flag 'TIPC_NLA_NODE_KEY_MASTER'.
For example:

$ tipc node set key "this_is_a_master_key" master

The command's help menu is also updated to give a better description of
all the available options.

Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago Merge branch 'tc-mpls-l2-vpn' into next
David Ahern [Tue, 20 Oct 2020 14:57:47 +0000 (08:57 -0600)]
Merge branch 'tc-mpls-l2-vpn' into next
Guillaume Nault says:

====================

This patch series adds the possibility for TC to tunnel Ethernet frames
over MPLS.

Patch 1 allows adding or removing the Ethernet header.
Patch 2 allows pushing an MPLS LSE before the MAC header.

By combining these actions, it becomes possible to encapsulate an
entire Ethernet frame into MPLS, then add an outer Ethernet header
and send the resulting frame to the next hop.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago m_mpls: add mac_push action
Guillaume Nault [Mon, 19 Oct 2020 15:23:08 +0000 (17:23 +0200)]
m_mpls: add mac_push action

Add support for the new TCA_MPLS_ACT_MAC_PUSH action (kernel commit
a45294af9e96 ("net/sched: act_mpls: Add action to push MPLS LSE before
Ethernet header")). This action lets TC push an MPLS header before the
MAC header of a frame.

Example (encapsulate all outgoing frames with label 20, then add an
outer Ethernet header):
 # tc filter add dev ethX matchall \
       action mpls mac_push label 20 ttl 64 \
       action vlan push_eth dst_mac 0a:00:00:00:00:02 \
                            src_mac 0a:00:00:00:00:01
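
To make the resulting frame layout concrete, here is a rough sketch of what the two chained actions in the example produce: an outer Ethernet header, one MPLS label stack entry (20-bit label, 3-bit traffic class, bottom-of-stack bit, 8-bit TTL), then the original frame. The helper names are hypothetical, and setting the bottom-of-stack bit for a single pushed label is an assumption of this sketch.

```python
import struct

ETH_P_MPLS_UC = 0x8847  # Ethertype for MPLS unicast


def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def mpls_lse(label: int, ttl: int, tc: int = 0, bottom_of_stack: bool = True) -> bytes:
    """Encode one 32-bit MPLS label stack entry in network byte order."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or not 0 <= ttl < 256:
        raise ValueError("tc must fit in 3 bits and ttl in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)


def encapsulate(inner_frame: bytes, label: int, ttl: int,
                dst_mac: str, src_mac: str) -> bytes:
    """Outer Ethernet header + MPLS LSE + the original Ethernet frame."""
    outer_eth = mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", ETH_P_MPLS_UC)
    return outer_eth + mpls_lse(label, ttl) + inner_frame


if __name__ == "__main__":
    frame = encapsulate(b"\xaa" * 14, 20, 64,
                        "0a:00:00:00:00:02", "0a:00:00:00:00:01")
    print(frame[:18].hex())
```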

This patch also adds an alias for ETH_P_TEB, since it is useful when
decapsulating MPLS packets that contain an Ethernet frame.

With MAC_PUSH, there's no previous Ethertype to modify. However, the
"protocol" option is still needed, because the kernel uses it to set
skb->protocol. So rename can_modify_ethtype() to can_set_ethtype().

Also add a test suite for m_mpls, which covers the new action and the
pre-existing ones.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago m_vlan: add pop_eth and push_eth actions
Guillaume Nault [Mon, 19 Oct 2020 15:23:01 +0000 (17:23 +0200)]
m_vlan: add pop_eth and push_eth actions

Add support for the new TCA_VLAN_ACT_POP_ETH and TCA_VLAN_ACT_PUSH_ETH
actions (kernel commit 19fbcb36a39e ("net/sched: act_vlan:
Add {POP,PUSH}_ETH actions")). These actions let TC remove or add the
Ethernet header at the head of a frame.

Drop an Ethernet header:
 # tc filter add dev ethX matchall action vlan pop_eth

Push an Ethernet header (the original frame must have no MAC header):
 # tc filter add dev ethX matchall action vlan \
       push_eth dst_mac 0a:00:00:00:00:02 src_mac 0a:00:00:00:00:01

Also add a test suite for m_vlan, which covers these new actions and
the pre-existing ones.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago devlink: display elapsed time during flash update
Jacob Keller [Wed, 14 Oct 2020 22:31:04 +0000 (15:31 -0700)]
devlink: display elapsed time during flash update

For some devices, updating the flash can take significant time during
operations where no status can meaningfully be reported. This can be
somewhat confusing to a user who sees devlink appear to hang on the
terminal waiting for the device to update.

Recent changes to the kernel interface allow such long running commands
to provide a timeout value indicating some upper bound on how long the
relevant action could take.

Provide a ticking counter of the time elapsed since the previous status
message in order to make it clear that the program is not simply stuck.

Display this message whenever the status message from the kernel
indicates a timeout value. Additionally, display the message if
we've received no status for more than a couple of seconds. If more
time elapses than the timeout provided by the status message, replace
the timeout display with "timeout reached".
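
The display rules described above can be sketched as a small formatting function. The exact output layout, the function name and the two-second threshold constant are assumptions made for illustration; they are not taken from the devlink source.

```python
QUIET_THRESHOLD_S = 2  # assumed meaning of "a couple of seconds"


def format_progress(elapsed_s: int, timeout_s: int = 0) -> str:
    """Render elapsed time, optionally against a kernel-provided timeout."""
    def mmss(seconds: int) -> str:
        minutes, secs = divmod(seconds, 60)
        return f"{minutes}m {secs}s"

    if timeout_s == 0:
        # No timeout reported: only show the counter after a quiet period.
        if elapsed_s <= QUIET_THRESHOLD_S:
            return ""
        return f"( {mmss(elapsed_s)} )"
    if elapsed_s > timeout_s:
        return f"( {mmss(elapsed_s)} : timeout reached )"
    return f"( {mmss(elapsed_s)} : {mmss(timeout_s)} )"


if __name__ == "__main__":
    print(format_progress(65, 120))
    print(format_progress(130, 120))
```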

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago v5.9.0
Stephen Hemminger [Thu, 15 Oct 2020 22:18:35 +0000 (15:18 -0700)]
v5.9.0

3 years ago tc: fq: clarify the length of orphan_mask.
zhangkaiheb@126.com [Tue, 13 Oct 2020 05:26:40 +0000 (05:26 +0000)]
tc: fq: clarify the length of orphan_mask.

Signed-off-by: kai zhang <zhangkaiheb@126.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago ip: add error reporting when RTM_GETNSID failed
Jan Engelhardt [Mon, 12 Oct 2020 13:55:55 +0000 (15:55 +0200)]
ip: add error reporting when RTM_GETNSID failed

`ip addr` when run under qemu-user-riscv64, fails. This likely is due
to qemu-5.1 not doing translation of RTM_GETNSID calls. Aborting ip
completely is not helpful for the user however. This patch reworks
the error handling.

Before:

rtest:/ # ip a
2: host0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
request send failed: Operation not supported
    link/ether 46:3f:2d:88:3d:db brd ff:ff:ff:ff:ff:ffrtest:/ #

Afterwards:

rtest:/ # ip a
2: host0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
rtnl_send(RTM_GETNSID): Operation not supported. Continuing anyway.
    link/ether 46:3f:2d:88:3d:db brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.72.147/28 brd 192.168.72.159 scope global host0
       valid_lft forever preferred_lft forever
    inet6 fe80::443f:2dff:fe88:3ddb/64 scope link
       valid_lft forever preferred_lft forever

Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago lib: ignore invalid mounts in cg_init_map
Dmitry Yakunin [Thu, 8 Oct 2020 17:59:27 +0000 (20:59 +0300)]
lib: ignore invalid mounts in cg_init_map

In case of bad entries in /proc/mounts just skip cgroup cache initialization.
Cgroups in output will be shown as "unreachable:cgroup_id".

Fixes: d5e6ee0dac64 ("ss: introduce cgroup2 cache and helper functions")
Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Reported-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago uapi: add new SNMP entry
Stephen Hemminger [Mon, 12 Oct 2020 05:50:22 +0000 (22:50 -0700)]
uapi: add new SNMP entry

Update to snmp.h from 5.9

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago Merge branch 'main' into next
David Ahern [Mon, 12 Oct 2020 02:11:09 +0000 (20:11 -0600)]
Merge branch 'main' into next
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago genl: ctrl: print op -> policy idx mapping
Johannes Berg [Sat, 3 Oct 2020 08:45:32 +0000 (10:45 +0200)]
genl: ctrl: print op -> policy idx mapping

Newer kernels can dump per-op policies, so print out the new
mapping attribute to indicate which op has which policy.

v2:
 * print out both do/dump policy idx
v3:
 * fix userspace API which renumbered after patch rebasing

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago Merge branch 'bridge-igmpv3-mldv2' into next
David Ahern [Mon, 12 Oct 2020 02:09:14 +0000 (20:09 -0600)]
Merge branch 'bridge-igmpv3-mldv2' into next
Nikolay Aleksandrov says:

====================
This set adds support for IGMPv3/MLDv2 attributes; they're mostly
read-only at the moment. The only new "set" option is the source address
for S,G entries. It is added in patch 01 (see the patch commit message for
an example). Patch 02 shows a missing flag (fast_leave) for
completeness, then patch 03 shows the new IGMPv3/MLDv2 flags:
added_by_star_ex and blocked. Patches 04-06 show the new extra
information about the entry's state when IGMPv3/MLDv2 are enabled. That
includes its filter mode (include/exclude), source list with timers and
origin protocol (currently only static/kernel). To show the new
information the user must use "-d"/show_details.
Here's the output of a few IGMPv3 entries:
 dev bridge port ens12 grp 239.0.0.1 src 20.21.22.23 temp filter_mode include proto kernel  blocked    0.00
 dev bridge port ens12 grp 239.0.0.1 src 8.9.10.11 temp filter_mode include proto kernel  blocked    0.00
 dev bridge port ens12 grp 239.0.0.1 src 1.2.3.1 temp filter_mode include proto kernel  blocked    0.00
 dev bridge port ens12 grp 239.0.0.1 temp filter_mode exclude source_list 20.21.22.23/0.00,8.9.10.11/0.00,1.2.3.1/0.00 proto kernel    26.65

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago bridge: mdb: print protocol when available
Nikolay Aleksandrov [Thu, 8 Oct 2020 13:50:24 +0000 (16:50 +0300)]
bridge: mdb: print protocol when available

Print the mdb entry's protocol (i.e. who added it) when it's available if
the user requested to show details (-d). Currently the only possible
values are RTPROT_STATIC (user-space added) or RTPROT_KERNEL
(automatically added by kernel). The value is kernel controlled.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago bridge: mdb: print source list when available
Nikolay Aleksandrov [Thu, 8 Oct 2020 13:50:23 +0000 (16:50 +0300)]
bridge: mdb: print source list when available

Print the mdb entry's source list when it's available if the user
requested to show details (-d). Each source has an associated timer
which controls if traffic should be forwarded to that S,G entry (if the
timer is non-zero traffic is forwarded, otherwise it's not).
Currently the source list is kernel controlled and can't be changed by
user-space.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago bridge: mdb: print filter mode when available
Nikolay Aleksandrov [Thu, 8 Oct 2020 13:50:22 +0000 (16:50 +0300)]
bridge: mdb: print filter mode when available

Print the mdb entry's filter mode when it's available if the user
requested to show details (-d). It can be either include or exclude.
Currently it's kernel controlled and can't be changed by user-space.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago bridge: mdb: show igmpv3/mldv2 flags
Nikolay Aleksandrov [Thu, 8 Oct 2020 13:50:21 +0000 (16:50 +0300)]
bridge: mdb: show igmpv3/mldv2 flags

With IGMPv3/MLDv2 support we have 2 new flags:
 - added_by_star_ex: set when the S,G entry was automatically created
                     because of a *,G entry in EXCLUDE mode
 - blocked: set when traffic for the S,G entry for that port has to be
            blocked
Both flags are used only on the new S,G entries and are currently kernel
managed, i.e. similar to other flags which can't be set from user-space.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago bridge: mdb: print fast_leave flag
Nikolay Aleksandrov [Thu, 8 Oct 2020 13:50:20 +0000 (16:50 +0300)]
bridge: mdb: print fast_leave flag

We're not showing the fast_leave flag when it's set. Currently that can
happen only when an mdb entry is being deleted due to fast leave, so it
will only affect mdb monitor.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago bridge: mdb: add support for source address
Nikolay Aleksandrov [Thu, 8 Oct 2020 13:50:19 +0000 (16:50 +0300)]
bridge: mdb: add support for source address

This patch adds user-space control and dump of the mdb entry source
address. When setting, the new MDBA_SET_ENTRY_ATTRS nested attribute is
used, and MDBE_ATTR_SOURCE is added inside it based on the address family.
When dumping, we look for MDBA_MDB_EATTR_SOURCE and, if present, add the
"src x.x.x.x" output. The source address is always shown, as it's
needed to match the entry in order to modify it from user-space.

Example:
 $ bridge mdb add dev bridge port ens13 grp 239.0.0.1 src 1.2.3.4 permanent vid 100
 $ bridge mdb show
 dev bridge port ens13 grp 239.0.0.1 src 1.2.3.4 permanent vid 100

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago Update kernel headers
David Ahern [Mon, 12 Oct 2020 02:04:57 +0000 (20:04 -0600)]
Update kernel headers

Update kernel headers to commit:
    bc081a693a56 ("Merge branch 'Offload-tc-vlan-mangle-to-mscc_ocelot-switch'")

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago ip xfrm: support setting XFRMA_SET_MARK_MASK attribute in states
Antony Antony [Fri, 2 Oct 2020 13:22:38 +0000 (15:22 +0200)]
ip xfrm: support setting XFRMA_SET_MARK_MASK attribute in states

The XFRMA_SET_MARK_MASK attribute can be set in states (4.19+).
It is optional and the kernel default is 0xffffffff.
It is the mask of XFRMA_SET_MARK (a.k.a. XFRMA_OUTPUT_MARK in 4.18).

e.g.
./ip/ip xfrm state add output-mark 0x6 mask 0xab proto esp \
 auth digest_null 0 enc cipher_null ''
ip xfrm state
src 0.0.0.0 dst 0.0.0.0
proto esp spi 0x00000000 reqid 0 mode transport
replay-window 0
output-mark 0x6/0xab
auth-trunc digest_null 0x30 0
enc ecb(cipher_null)
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
sel src 0.0.0.0/0 dst 0.0.0.0/0
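
As a rough illustration of what a mark/mask pair such as 0x6/0xab means, the sketch below combines an existing packet mark with the configured value under the mask. It assumes the usual masked-update rule (bits inside the mask come from the value, bits outside are preserved); the function name is hypothetical and the kernel source remains the reference for the exact behaviour.

```python
def apply_output_mark(current_mark: int, value: int, mask: int = 0xFFFFFFFF) -> int:
    """Replace the masked bits of `current_mark` with the same bits of `value`."""
    return ((current_mark & ~mask) | (value & mask)) & 0xFFFFFFFF


if __name__ == "__main__":
    # output-mark 0x6 mask 0xab applied to a packet already marked 0xff00
    print(hex(apply_output_mark(0xFF00, 0x6, 0xAB)))
    # default mask 0xffffffff: the value replaces the whole mark
    print(hex(apply_output_mark(0x1234, 0x6)))
```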

Signed-off-by: Antony Antony <antony@phenome.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago devlink: Add health reporter test command support
Jiri Pirko [Thu, 1 Oct 2020 07:21:13 +0000 (09:21 +0200)]
devlink: Add health reporter test command support

Add health reporter test command and allow user to trigger a test event.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago devlink: support setting the overwrite mask attribute
Jacob Keller [Wed, 30 Sep 2020 21:05:47 +0000 (14:05 -0700)]
devlink: support setting the overwrite mask attribute

The recently added DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK allows
userspace to indicate how a device should handle subsections of a flash
component when updating. For example, a flash component might contain
vital data such as PCIe serial number or configuration fields such as
settings that control device bootup.

The overwrite mask allows specifying whether the device should overwrite
these subsections when updating from the provided image. If nothing is
specified, then the update is expected to preserve all vital fields and
configuration.

Add support for specifying the overwrite mask using the new "overwrite"
option to the flash command line.

By specifying "overwrite identifiers", the user requests that the flash
update should overwrite any identifiers in the updated flash component with
identifiers from the provided flash image.

  $ devlink dev flash pci/0000:af:00.0 file flash_image.bin overwrite identifiers

By specifying "overwrite settings", the user requests that the flash update
should overwrite any settings in the updated flash component with setting
values from the provided flash image.

  $ devlink dev flash pci/0000:af:00.0 file flash_image.bin overwrite settings

These options may be combined, in which case both subsections will be sent
in the overwrite mask, resulting in a request to overwrite all settings and
identifiers stored in the updated flash components.

  $ devlink dev flash pci/0000:af:00.0 file flash_image.bin overwrite settings overwrite identifiers
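
How repeated "overwrite" options accumulate into one mask can be sketched as below. The bit positions (settings as bit 0, identifiers as bit 1) and the helper name are assumptions made for this illustration; the authoritative values are in the kernel's devlink UAPI header.

```python
from typing import List

# Assumed bit positions for the two subsections.
OVERWRITE_BITS = {"settings": 1 << 0, "identifiers": 1 << 1}


def overwrite_mask(argv: List[str]) -> int:
    """OR together the bit of every 'overwrite SECTION' pair on the command line."""
    mask = 0
    tokens = iter(argv)
    for token in tokens:
        if token == "overwrite":
            section = next(tokens, None)
            if section not in OVERWRITE_BITS:
                raise ValueError(f"unknown overwrite section: {section!r}")
            mask |= OVERWRITE_BITS[section]
    return mask


if __name__ == "__main__":
    args = "file flash_image.bin overwrite settings overwrite identifiers".split()
    print(bin(overwrite_mask(args)))
```

With no "overwrite" option the mask stays 0, which corresponds to the default of preserving all vital fields and configuration.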

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago Update kernel headers
David Ahern [Wed, 7 Oct 2020 06:01:26 +0000 (00:01 -0600)]
Update kernel headers

Update kernel headers to commit:
    9faebeb2d800 ("Merge branch 'ethtool-allow-dumping-policies-to-user-space'")

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago addr: Fix noprefixroute and autojoin for IPv4
Stephen Hemminger [Tue, 6 Oct 2020 22:15:56 +0000 (15:15 -0700)]
addr: Fix noprefixroute and autojoin for IPv4

These were reported as IPv6-only and ignored:

     # ip address add 192.0.2.2/24 dev dummy5 noprefixroute
     Warning: noprefixroute option can be set only for IPv6 addresses
     # ip address add 224.1.1.10/24 dev dummy5 autojoin
     Warning: autojoin option can be set only for IPv6 addresses

This re-enables them for IPv4.

Fixes: 9d59c86e575b5 ("iproute2: ip addr: Organize flag properties structurally")
Signed-off-by: Adel Belhouane <bugs.a.b@free.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago ipntable: add missing ndts_table_fulls ntable stat
Eyal Birger [Fri, 2 Oct 2020 09:34:28 +0000 (12:34 +0300)]
ipntable: add missing ndts_table_fulls ntable stat

Used for tracking neighbour table overflows.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago ip: iplink_ipoib.c: Remove extra spaces
Kamal Heib [Sun, 27 Sep 2020 12:06:56 +0000 (15:06 +0300)]
ip: iplink_ipoib.c: Remove extra spaces

Remove the extra space between the reported ipoib attrs - use only one
space instead of two.

Fixes: de0389935f8c ("iplink: Added support for the kernel IPoIB RTNL ops")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago ss: add support for xdp statistics
Ciara Loftus [Thu, 24 Sep 2020 07:03:27 +0000 (07:03 +0000)]
ss: add support for xdp statistics

The patch exposes statistics for XDP sockets which can be useful for
debugging purposes.

The stats exposed are:
    rx dropped
    rx invalid
    rx queue full
    rx fill ring empty
    tx invalid
    tx ring empty

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago Update kernel headers
David Ahern [Tue, 29 Sep 2020 15:13:21 +0000 (09:13 -0600)]
Update kernel headers

Update kernel headers to commit:
    280095713ce2 ("Merge branch 'ibmvnic-refactor-some-send-handle-functions'")

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago uapi: update headers from 5.9-rc7
Stephen Hemminger [Mon, 28 Sep 2020 20:50:36 +0000 (13:50 -0700)]
uapi: update headers from 5.9-rc7

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago build: avoid make jobserver warnings
Jan Engelhardt [Mon, 28 Sep 2020 19:08:01 +0000 (21:08 +0200)]
build: avoid make jobserver warnings

I observe:

» make -j8 CCOPTS=-ggdb3
lib
make[1]: warning: -j8 forced in submake: resetting jobserver mode.
make[1]: Nothing to be done for 'all'.
ip
make[1]: warning: -j8 forced in submake: resetting jobserver mode.
    CC       ipntable.o

MFLAGS is a historic variable of some kind; removing it fixes the
jobserver issue.

Signed-off-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago ip: promote missed packets to the -s row
Jakub Kicinski [Wed, 16 Sep 2020 19:42:49 +0000 (12:42 -0700)]
ip: promote missed packets to the -s row

missed_packet_errors are much more commonly reported:

linux$ git grep -c '[.>]rx_missed_errors ' -- drivers/ | wc -l
64
linux$ git grep -c '[.>]rx_over_errors ' -- drivers/ | wc -l
37

Plus those drivers are generally more modern than those
using rx_over_errors.

Since recently merged kernel documentation makes this
preference official, let's make ip -s output more informative
and let rx_missed_errors take the place of rx_over_errors.

Before:

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0a:f7:c1:4d:38 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    6.04T      4.67G    0       0       0       67.7M
    RX errors: length   crc     frame   fifo    missed
               0        0       0       0       7
    TX: bytes  packets  errors  dropped carrier collsns
    3.13T      2.76G    0       0       0       0
    TX errors: aborted  fifo   window heartbeat transns
               0        0       0       0       6

After:

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0a:f7:c1:4d:38 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped missed  mcast
    6.04T      4.67G    0       0       7       67.7M
    RX errors: length   crc     frame   fifo    overrun
               0        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    3.13T      2.76G    0       0       0       0
    TX errors: aborted  fifo   window heartbeat transns
               0        0       0       0       6

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago Merge branch 'devlink-controller-external-info' into next
David Ahern [Wed, 23 Sep 2020 02:17:48 +0000 (20:17 -0600)]
Merge branch 'devlink-controller-external-info' into next
Parav Pandit says:

====================

For certain devlink port flavours, the controller number and, optionally,
the external attribute are reported by the kernel.

(a) The controller number indicates which local or external controller a
given port belongs to.
(b) The external port attribute indicates whether a given port is for an
external or a local controller.

This short series shows these attributes to the user.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago devlink: Show controller number of a devlink port
Parav Pandit [Fri, 18 Sep 2020 10:16:49 +0000 (13:16 +0300)]
devlink: Show controller number of a devlink port

Show the controller number of the devlink port whenever kernel reports
it.

Example of a PCI VF port for an external controller number 1:

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0c1pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0c1pf0vf1",
            "flavour": "pcivf",
            "controller": 1,
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago devlink: Show external port attribute
Parav Pandit [Fri, 18 Sep 2020 10:16:48 +0000 (13:16 +0300)]
devlink: Show external port attribute

If a port is for an external controller, the port's external attribute
is set. Show this external attribute.

An example of an external controller port for PCI VF:

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0c1pf0vf1 flavour pcivf pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0c1pf0vf1",
            "flavour": "pcivf",
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago Update kernel headers
David Ahern [Wed, 23 Sep 2020 02:10:43 +0000 (20:10 -0600)]
Update kernel headers

Update kernel headers to commit:
    748d1c8a425e ("Merge branch 'devlink-Use-nla_policy-to-validate-range'")

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago ip: updated ip-link man page
Roman Mashak [Tue, 1 Sep 2020 19:46:12 +0000 (15:46 -0400)]
ip: updated ip-link man page

Added description of link flags allmulticast, promisc and trailers.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago iproute2: ss: add support to expose various inet sockopts
Wei Wang [Wed, 19 Aug 2020 21:13:54 +0000 (14:13 -0700)]
iproute2: ss: add support to expose various inet sockopts

This commit adds support to expose the following inet socket options:
-- recverr
-- is_icsk
-- freebind
-- hdrincl
-- mc_loop
-- transparent
-- mc_all
-- nodefrag
-- bind_address_no_port
-- recverr_rfc4884
-- defer_connect
with the option --inet-sockopt. The individual option is only shown
when set.
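
The "only shown when set" rule can be sketched as a filter over the option names listed above. The output layout and the function name are hypothetical and chosen only for illustration; they are not copied from ss.

```python
from typing import Dict

# Option names as listed in the commit message, in that order.
SOCKOPT_NAMES = [
    "recverr", "is_icsk", "freebind", "hdrincl", "mc_loop", "transparent",
    "mc_all", "nodefrag", "bind_address_no_port", "recverr_rfc4884",
    "defer_connect",
]


def format_inet_sockopts(flags: Dict[str, bool]) -> str:
    """List only the options whose flag is set, in a fixed order."""
    shown = [name for name in SOCKOPT_NAMES if flags.get(name, False)]
    return "inet-sockopt: (" + " ".join(shown) + ")"


if __name__ == "__main__":
    print(format_inet_sockopts({"freebind": True, "mc_loop": True, "hdrincl": False}))
```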

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago Update kernel headers
David Ahern [Wed, 9 Sep 2020 02:35:28 +0000 (20:35 -0600)]
Update kernel headers

Update kernel headers to commit:
4349abdb409b ("net: dsa: don't print non-fatal MTU error if not supported")

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago tipc: support 128bit node identity for peer removing
Hoang Le [Thu, 27 Aug 2020 02:30:37 +0000 (09:30 +0700)]
tipc: support 128bit node identity for peer removing

Problem:
The upstream kernel supports setting a 128-bit node identity. However,
the 'tipc peer remove' command still accepts only the legacy address
format, so removing an offline node fails, e.g.:

$ tipc node list
Node Identity                    Hash     State
d6babc1c1c6d                     1cbcd7ca down

$ tipc peer remove address d6babc1c1c6d
invalid network address, syntax: Z.C.N
error: No such device or address

Solution:
Add support for removing a specific down node by its 128-bit node
identifier, as an alternative to the legacy 32-bit node address.
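
The difference between the two address forms can be sketched as follows. The legacy Z.C.N form packs an 8-bit zone, a 12-bit cluster and a 12-bit node into 32 bits; the node identity is treated here as a hex string of at most 128 bits. The function names and the hex-only restriction are assumptions of this sketch, not the parsing rules of the tipc tool.

```python
from typing import Tuple, Union


def parse_legacy_addr(text: str) -> int:
    """Pack 'Z.C.N' into the legacy 32-bit TIPC address."""
    zone, cluster, node = (int(part) for part in text.split("."))
    if not (0 <= zone <= 0xFF and 0 <= cluster <= 0xFFF and 0 <= node <= 0xFFF):
        raise ValueError("invalid network address, syntax: Z.C.N")
    return (zone << 24) | (cluster << 12) | node


def parse_peer(text: str) -> Tuple[str, Union[int, bytes]]:
    """Accept a legacy Z.C.N address or a hex node identity (up to 128 bits)."""
    if text.count(".") == 2:
        return ("addr", parse_legacy_addr(text))
    if len(text) > 32:
        raise ValueError("node identity longer than 128 bits")
    return ("id", bytes.fromhex(text))  # raises ValueError on non-hex input


if __name__ == "__main__":
    print(parse_peer("1.1.10"))
    print(parse_peer("d6babc1c1c6d"))
```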

Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago iplink: add support for protodown reason
Roopa Prabhu [Sat, 29 Aug 2020 03:42:56 +0000 (20:42 -0700)]
iplink: add support for protodown reason

This patch adds support for recently
added link IFLA_PROTO_DOWN_REASON attribute.
IFLA_PROTO_DOWN_REASON enumerates reasons
for the already existing IFLA_PROTO_DOWN link
attribute.

$ cat /etc/iproute2/protodown_reasons.d/r.conf
0 mlag
1 evpn
2 vrrp
3 psecurity

$ ip link set dev vx10 protodown on protodown_reason vrrp on
$ip link show dev vx10
14: vx10: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
    link/ether f2:32:28:b8:35:ff brd ff:ff:ff:ff:ff:ff protodown on
protodown_reason <vrrp>
$ip -p -j link show dev vx10
[ {
<snip>
        "proto_down": true,
        "proto_down_reason": [ "vrrp" ]
} ]
$ip link set dev vx10 protodown_reason mlag on
$ip link show dev vx10
14: vx10: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
    link/ether f2:32:28:b8:35:ff brd ff:ff:ff:ff:ff:ff protodown on
protodown_reason <mlag,vrrp>
$ ip -p -j link show dev vx10
[ {
<snip>
        "proto_down": true,
        "proto_down_reason": [ "mlag","vrrp" ]
} ]

$ ip -p -j link show dev vx10
$ ip link set dev vx10 protodown off protodown_reason vrrp off
Error: Cannot clear protodown, active reasons.
$ ip link set dev vx10 protodown off protodown_reason mlag off
$

Note: for some reason the JSON and non-JSON keys for protodown
are different (proto_down and protodown). I have kept the
same convention for protodown reason for consistency
(proto_down_reason and protodown_reason).
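
How the names in the conf file relate to the reasons shown on a link
can be sketched as follows. This is a Python illustration only; it
assumes that each number in the file is a bit position in a 32-bit
reason mask, and load_reasons()/decode_reasons() are hypothetical names.

```python
CONF = """\
0 mlag
1 evpn
2 vrrp
3 psecurity
"""


def load_reasons(text):
    """Parse 'BIT NAME' lines such as those in protodown_reasons.d/*.conf."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # tolerate comments and blank lines
        if not line:
            continue
        bit, name = line.split()
        table[int(bit)] = name
    return table


def decode_reasons(mask, table):
    """Names of all reason bits set in mask, lowest bit first."""
    # Bits without a configured name fall back to their number.
    return [table.get(bit, str(bit)) for bit in range(32) if mask & (1 << bit)]
```

Under that assumption, a mask with bits 0 and 2 set decodes to mlag and
vrrp, in the same order as the <mlag,vrrp> output above.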

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago ip xfrm: support printing XFRMA_SET_MARK_MASK attribute in states
Antony Antony [Fri, 28 Aug 2020 14:59:07 +0000 (16:59 +0200)]
ip xfrm: support printing XFRMA_SET_MARK_MASK attribute in states

The XFRMA_SET_MARK_MASK attribute is set in states (4.19+).
It is the mask of XFRMA_SET_MARK (a.k.a. XFRMA_OUTPUT_MARK in 4.18).

Sample output (note the output-mark mask):
$ ip xfrm state
src 192.1.2.23 dst 192.1.3.33
proto esp spi 0xSPISPI reqid REQID mode tunnel
replay-window 32 flag af-unspec
output-mark 0x3/0xffffff
aead rfc4106(gcm(aes)) 0xENCAUTHKEY 128
if_id 0x1
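
As an illustration of what the value/mask pair printed above means,
here is a small Python sketch. It assumes the usual mark semantics (the
masked bits of the packet mark are replaced by the value, the rest are
kept); parse_mark() and apply_mark() are hypothetical helpers, and the
all-ones default mask is an assumption.

```python
def parse_mark(text):
    """Split 'VALUE/MASK' as printed after output-mark into two integers."""
    value, _, mask = text.partition("/")
    # Assumption: a missing mask means "all 32 bits".
    return int(value, 16), (int(mask, 16) if mask else 0xFFFFFFFF)


def apply_mark(old_mark, value, mask):
    """Replace the masked bits of old_mark with value, keep the others."""
    return (old_mark & ~mask & 0xFFFFFFFF) | (value & mask)
```

For the sample state, only the low 24 bits of an existing mark would be
overwritten with 0x3; the top byte is left alone.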

Signed-off-by: Antony Antony <antony@phenome.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago Merge branch 'main' into next
David Ahern [Wed, 2 Sep 2020 01:46:20 +0000 (19:46 -0600)]
Merge branch 'main' into next

Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago ip link: Fix indenting in help text
Phil Sutter [Sat, 29 Aug 2020 10:18:35 +0000 (12:18 +0200)]
ip link: Fix indenting in help text

Indenting of 'ip link set' options below 'link-netns' was wrong; they
should be on the same level as the options above.

While at it, fix closing brackets in vf-specific options. Also
write node/port_guid parameters in upper-case without curly braces: They
are supposed to be replaced by values, not put literally.

Fixes: 8589eb4efdf2a ("treewide: refactor help messages")
Fixes: 5a3ec4ba64783 ("iplink: Update usage in help message")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago genl: ctrl: support dumping netlink policy
Johannes Berg [Mon, 24 Aug 2020 17:51:08 +0000 (19:51 +0200)]
genl: ctrl: support dumping netlink policy

Support dumping the netlink policy of a given generic netlink
family. The policy (with any sub-policies if appropriate) is
exported by the kernel in a general fashion.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago libnetlink: add nl_print_policy() helper
Johannes Berg [Mon, 24 Aug 2020 17:51:07 +0000 (19:51 +0200)]
libnetlink: add nl_print_policy() helper

This prints out the data from the given nested attribute
to the given FILE pointer, interpreting the format that
the kernel uses for showing netlink policies.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago libnetlink: add rtattr_for_each_nested() iteration macro
Johannes Berg [Mon, 24 Aug 2020 17:51:06 +0000 (19:51 +0200)]
libnetlink: add rtattr_for_each_nested() iteration macro

This is useful for iterating elements in a nested attribute,
if they're not parsed with a strict length limit or such.
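
The kind of walk such a macro performs can be sketched in Python. This
is not the macro itself; it assumes the usual rtattr layout (a 16-bit
length and a 16-bit type in host byte order, each attribute padded to
4 bytes), and iter_nested()/pack_attr() are illustrative names.

```python
import struct

RTA_ALIGNTO = 4
RTA_HDR = struct.Struct("=HH")  # rta_len, rta_type


def rta_align(length):
    """Round length up to the attribute alignment."""
    return (length + RTA_ALIGNTO - 1) & ~(RTA_ALIGNTO - 1)


def iter_nested(payload):
    """Yield (type, data) for every attribute packed inside payload."""
    off = 0
    while off + RTA_HDR.size <= len(payload):
        rta_len, rta_type = RTA_HDR.unpack_from(payload, off)
        if rta_len < RTA_HDR.size or off + rta_len > len(payload):
            break  # malformed or truncated attribute, stop walking
        yield rta_type, payload[off + RTA_HDR.size:off + rta_len]
        off += rta_align(rta_len)


def pack_attr(rta_type, data):
    """Build one padded attribute, used here only to create sample input."""
    rta_len = RTA_HDR.size + len(data)
    return RTA_HDR.pack(rta_len, rta_type) + data + b"\0" * (rta_align(rta_len) - rta_len)
```

The loop needs no table of expected attribute types, which is the point
of iterating instead of parsing into a fixed-size array.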

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David Ahern <dsahern@gmail.com>
3 years ago ip: iplink: prp: update man page for new parameter
Murali Karicheri [Mon, 17 Aug 2020 21:17:37 +0000 (17:17 -0400)]
ip: iplink: prp: update man page for new parameter

PRP support requires a proto parameter which is 0 for hsr and 1 for
prp. Default is hsr and is backward compatible.

Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago iplink: hsr: add support for creating PRP device similar to HSR
Murali Karicheri [Mon, 17 Aug 2020 21:17:36 +0000 (17:17 -0400)]
iplink: hsr: add support for creating PRP device similar to HSR

This patch enhances the iplink command by adding a proto parameter to
create a PRP device/interface similar to HSR. Both protocols are
quite similar and require a pair of Ethernet interfaces. So re-use
the existing HSR iplink command to create a PRP device/interface as
well. Use the proto parameter to differentiate the two protocols.

Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago devlink: Add fflush() in cmd_mon_show_cb()
Amit Cohen [Thu, 20 Aug 2020 13:51:13 +0000 (16:51 +0300)]
devlink: Add fflush() in cmd_mon_show_cb()

Similar to other print functions we need to flush buffered data
in order to work with pipes and output redirects.

Without it, stdout output is buffered and not written to the disk.

This is useful when writing scripts that rely on devlink-monitor output.
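
The same idea in a short Python sketch (the patch itself is C and uses
fflush()): flush after every event so that a reader on the other end of
a pipe sees it immediately. emit_event() is a hypothetical name.

```python
import sys


def emit_event(line, stream=sys.stdout):
    """Write one monitor event and push it out of the buffer right away."""
    stream.write(line + "\n")
    # When the stream is a pipe or a file it is usually block-buffered,
    # so without this the consumer may not see the event for a long time.
    stream.flush()
```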

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago iproute2: ip maddress: Check multiaddr length
Sascha Hauer [Mon, 17 Aug 2020 11:25:19 +0000 (13:25 +0200)]
iproute2: ip maddress: Check multiaddr length

ip maddress add|del takes a MAC address as argument, so insist on
getting a length of ETH_ALEN bytes. This makes sure the passed argument
is actually a MAC address and especially not an IPv4 address which
was previously accepted and silently taken as a MAC address.

While at it, do not print *argv in the error path as this has been
modified by ll_addr_a2n() and doesn't contain the full string anymore,
which can lead to misleading error messages.

Also while at it, replace the hardcoded buffer size with the actual
buffer size using sizeof().
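
The length check described above can be sketched in Python as follows.
It is only an illustration: ll_addr_bytes() loosely imitates a parser
that accepts both dotted-quad and colon-separated input, and the error
text is made up rather than copied from ip.

```python
ETH_ALEN = 6  # length of an Ethernet MAC address in bytes


def ll_addr_bytes(text):
    """Parse a dotted-quad (4 bytes) or colon-separated hex address."""
    if "." in text:
        return bytes(int(part) for part in text.split("."))
    return bytes(int(part, 16) for part in text.split(":"))


def parse_multiaddr(text):
    """Accept only addresses that are exactly ETH_ALEN bytes long."""
    addr = ll_addr_bytes(text)
    if len(addr) != ETH_ALEN:
        # Report the caller's original string, not a modified copy.
        raise ValueError("not a MAC address: " + text)
    return addr
```

An IPv4 address parses to four bytes and is therefore rejected instead
of being silently used as a MAC address.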

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago uapi: update bpf.h
Stephen Hemminger [Sun, 16 Aug 2020 23:09:52 +0000 (16:09 -0700)]
uapi: update bpf.h

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago rdma: Properly print device and link names in CLI output
Leon Romanovsky [Tue, 11 Aug 2020 07:32:01 +0000 (10:32 +0300)]
rdma: Properly print device and link names in CLI output

The cited commit broke the CLI output and printed ifindex/ifname
instead of dev/link.

Before:
[leonro@vm ~]$ rdma res show qp
link mlx5_0/lqpn 1 type GSI state RTS sq-psn 0 comm ib_core
[leonro@vm ~]$ rdma res show cq
ifindex 0 ifname rocep0s9 cqn 0 cqe 1023 users 2 poll-ctx WORKQUEUE adaptive-moderation on comm ib_core

After:
[leonro@vm ~]$ rdma res show qp
link mlx5_0/- lqpn 1 type GSI state RTS sq-psn 0 comm [ib_core]
[leonro@vm ~]$ rdma res show cq
dev rocep0s9 cqn 0 cqe 1023 users 2 poll-ctx WORKQUEUE adaptive-moderation on comm [ib_core]

It was missed because rdmatool is mostly used in JSON mode.

Fixes: b0a688a542cd ("rdma: Rewrite custom JSON and prints logic to use common API")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago rdma: Fix owner name for the kernel resources
Leon Romanovsky [Tue, 11 Aug 2020 07:32:00 +0000 (10:32 +0300)]
rdma: Fix owner name for the kernel resources

The owner of kernel resources is printed in a different format than
that of user resources, so that the reader can tell them apart simply
by looking at the name. The kernel owner will have "[ ]" around the name.

Before this change:
[leonro@vm ~]$ rdma res show qp
link rocep0s9/1 lqpn 1 type GSI state RTS sq-psn 58 comm ib_core

After this change:
[leonro@vm ~]$ rdma res show qp
link rocep0s9/1 lqpn 1 type GSI state RTS sq-psn 58 comm [ib_core]
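
The formatting rule amounts to the following, shown as a Python sketch
with a hypothetical format_owner() helper (the real code is C):

```python
def format_owner(comm, is_kernel):
    """Bracket the owner name for kernel resources, leave user names as-is."""
    return "[" + comm + "]" if is_kernel else comm
```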

Fixes: b0a688a542cd ("rdma: Rewrite custom JSON and prints logic to use common API")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago uapi: update kernel headers
Stephen Hemminger [Tue, 11 Aug 2020 20:18:41 +0000 (13:18 -0700)]
uapi: update kernel headers

pre-rc1 version of Linux kernel headers.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago rdma: Document the new "pid" criteria for auto mode
Mark Zhang [Tue, 4 Aug 2020 08:49:09 +0000 (11:49 +0300)]
rdma: Document the new "pid" criteria for auto mode

Document the new supported criteria of auto mode. Examples:
$ rdma statistic qp set link mlx5_2/1 auto pid on
$ rdma statistic qp set link mlx5_2/1 auto pid,type on

Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Ido Kalir <idok@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago rdma: Add "PID" criteria support for statistic counter auto mode
Mark Zhang [Tue, 4 Aug 2020 08:49:08 +0000 (11:49 +0300)]
rdma: Add "PID" criteria support for statistic counter auto mode

With this new criteria, QPs that have different PIDs will be bound to
different counters in auto mode. This can be used in combination with
other criteria like "type". Examples:

$ rdma statistic qp set link mlx5_2/1 auto pid on
$ rdma statistic qp set link mlx5_2/1 auto type,pid on
$ rdma statistic qp set link mlx5_2/1 auto off
$ rdma statistic qp show link mlx5_0 qp-type UD

Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Ido Kalir <idok@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago rdma: update uapi headers
Mark Zhang [Tue, 4 Aug 2020 08:49:07 +0000 (11:49 +0300)]
rdma: update uapi headers

Update rdma_netlink.h file up to kernel commit 76251e15ea73
("RDMA/counter: Add PID category support in auto mode")

Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Ido Kalir <idok@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago Merge branch 'main' into next
David Ahern [Thu, 6 Aug 2020 16:21:35 +0000 (16:21 +0000)]
Merge branch 'main' into next

Conflicts:
bridge/fdb.c
man/man8/bridge.8

Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago v5.8.0
Stephen Hemminger [Mon, 3 Aug 2020 17:03:42 +0000 (10:03 -0700)]
v5.8.0

3 years ago lnstat: use same version as iproute2
Stephen Hemminger [Mon, 3 Aug 2020 16:27:48 +0000 (09:27 -0700)]
lnstat: use same version as iproute2

Lnstat was trying to be different and have its own version.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago replace SNAPSHOT with auto-generated version string
Stephen Hemminger [Sat, 1 Aug 2020 17:26:41 +0000 (10:26 -0700)]
replace SNAPSHOT with auto-generated version string

Replace the iproute2 snapshot with a version string which is
autogenerated as part of the build process using git describe.

This will also allow seeing whether the version of the command
is built from the same sources as upstream.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago devlink: Add board.serial_number to info subcommand.
Vasundhara Volam [Fri, 31 Jul 2020 10:46:43 +0000 (03:46 -0700)]
devlink: Add board.serial_number to info subcommand.

Add support for reading board serial_number to devlink info
subcommand. Example:

$ devlink dev info pci/0000:af:00.0 -jp
{
    "info": {
        "pci/0000:af:00.0": {
            "driver": "bnxt_en",
            "serial_number": "00-10-18-FF-FE-AD-1A-00",
            "board.serial_number": "433551F+172300000",
            "versions": {
                "fixed": {
                    "board.id": "7339763 Rev 0.",
                    "asic.id": "16D7",
                    "asic.rev": "1"
                },
                "running": {
                    "fw": "216.1.216.0",
                    "fw.psid": "0.0.0",
                    "fw.mgmt": "216.1.192.0",
                    "fw.mgmt.api": "1.10.1",
                    "fw.ncsi": "0.0.0.0",
                    "fw.roce": "216.1.16.0"
                }
            }
        }
    }
}

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago ip-xfrm: add support for oseq-may-wrap extra flag
Petr Vaněk [Fri, 31 Jul 2020 07:12:59 +0000 (09:12 +0200)]
ip-xfrm: add support for oseq-may-wrap extra flag

If set, this flag allows creating an SA whose sequence number can
cycle in outbound packets.

Signed-off-by: Petr Vaněk <pv@excello.cz>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago Update kernel headers
David Ahern [Mon, 3 Aug 2020 14:56:28 +0000 (14:56 +0000)]
Update kernel headers

Update kernel headers to commit:
    bd0b33b24897 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")

Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago devlink: Expose port split ability
Danielle Ratson [Thu, 30 Jul 2020 14:33:18 +0000 (17:33 +0300)]
devlink: Expose port split ability

Add a new attribute that indicates the port split ability to devlink port.

Expose the attribute to user space as a read-only value, for example:

$ devlink port show swp1
pci/0000:03:00.0/61: type eth netdev swp1 flavour physical port 1
splittable false lanes 1

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago devlink: Expose number of port lanes
Danielle Ratson [Thu, 30 Jul 2020 14:33:17 +0000 (17:33 +0300)]
devlink: Expose number of port lanes

Add a new attribute that indicates the port's number of lanes to devlink port.

Expose the attribute to user space as a read-only value, for example:

$ devlink port show swp1
pci/0000:03:00.0/61: type eth netdev swp1 flavour physical port 1 lanes 1

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago bridge: fdb show: fix fdb entry state output for json context
Julien Fortin [Wed, 29 Jul 2020 13:04:25 +0000 (15:04 +0200)]
bridge: fdb show: fix fdb entry state output for json context

bridge json fdb show is printing an incorrect / non-machine readable
value. When using -j (json output) we are expecting machine readable
data that shouldn't require special handling/parsing.

$ bridge -j fdb show | \
python -c \
'import sys,json;print(json.dumps(json.loads(sys.stdin.read()),indent=4))'
[
    {
        "master": "br0",
        "mac": "56:23:28:4f:4f:e5",
        "flags": [],
        "ifname": "vx0",
        "state": "state=0x80"  <<<<<<<<< with the patch: "state": "0x80"
    }
]
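
With the "state=" prefix gone, a consumer can use the value directly.
A Python sketch of such a consumer, assuming the field holds either a
symbolic name or a bare hex number as in the example above; fdb_state()
and SAMPLE are illustrative, not part of bridge:

```python
import json

SAMPLE = (
    '[{"master": "br0", "mac": "56:23:28:4f:4f:e5", '
    '"flags": [], "ifname": "vx0", "state": "0x80"}]'
)


def fdb_state(entry):
    """Return the state as an int when it is a hex number, else as a name."""
    state = entry["state"]
    if state.startswith("0x"):
        return int(state, 16)  # would fail on the old "state=0x80" form
    return state


entries = json.loads(SAMPLE)
```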

Fixes: c7c1a1ef51aea7c ("bridge: colorize output and use JSON print library")
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago tc: Add space after format specifier
Briana Oursler [Tue, 28 Jul 2020 05:20:48 +0000 (22:20 -0700)]
tc: Add space after format specifier

Add space after format specifier in print_string call. Fixes broken
qdisc tests within tdc testing suite. Per suggestion from Petr Machata,
remove a space and change spacing in tc/q_event.c to complete the fix.

Tested fix in tdc using:
./tdc.py -c qdisc

All qdisc RED tests return ok.

Fixes: d0e450438571 ("tc: q_red: Add support for qevents "mark" and "early_drop")
Signed-off-by: Briana Oursler <briana.oursler@gmail.com>
Tested-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago bridge: fdb: the 'dynamic' option in the show/get commands
Anton Danilov [Mon, 27 Jul 2020 13:26:07 +0000 (16:26 +0300)]
bridge: fdb: the 'dynamic' option in the show/get commands

In most cases a user wants to see only the dynamic MAC addresses
in the fdb output. But currently 'fdb show' displays tons of
various self entries, which only clutter the output without serving
any useful purpose.

The new 'dynamic' option for the 'show' and 'get' commands restricts
the output to the relevant records.

Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago mptcp: show all endpoints when no ID is specified
Matthieu Baerts [Fri, 24 Jul 2020 12:17:18 +0000 (14:17 +0200)]
mptcp: show all endpoints when no ID is specified

According to 'ip mptcp help', 'endpoint show' can accept no argument:

  ip mptcp endpoint show [ id ID ]

It makes sense to print all endpoints when no filter is used.

So here if the following command is used, all endpoints are printed:

  ip mptcp endpoint show

Same as:

  ip mptcp endpoint

Fixes: 7e0767cd ("add support for mptcp netlink interface")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago Merge branch 'devlink-port-health' into next
David Ahern [Thu, 23 Jul 2020 00:34:07 +0000 (00:34 +0000)]
Merge branch 'devlink-port-health' into next

Moshe Shemesh says:

====================

Implement commands for interaction with per-port devlink health
reporters. To do this, adapt devlink-health for usage of port handles
with any existing devlink-health subcommands. Add devlink-port health
subcommand as an alias for devlink-health.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago devlink: Update devlink-health and devlink-port manpages
Vladyslav Tarasiuk [Sun, 19 Jul 2020 13:36:03 +0000 (16:36 +0300)]
devlink: Update devlink-health and devlink-port manpages

Describe support for per-port reporters in devlink-health and
devlink-port commands.

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago devlink: Add devlink port health command
Vladyslav Tarasiuk [Sun, 19 Jul 2020 13:36:02 +0000 (16:36 +0300)]
devlink: Add devlink port health command

Add a devlink port health show subcommand which displays information
about the specified port reporter, or about all present port reporters,
as in the example. Device and port reporters can be distinguished by
the handle being used.

Make other devlink-health subcommands be aliased by devlink port health.
Refactor devlink-health commands for usage of port handles in order to
interact with port reporters.

Change devlink health show output to dump information about both device
and port reporters with correct handles.

Example:
$ devlink health show
pci/0000:00:0b.0:
  reporter fw
    state healthy error 0 recover 0 auto_dump true
  reporter fw_fatal
    state healthy error 0 recover 0 grace_period 1200000 auto_recover true auto_dump true
pci/0000:00:0b.0/1:
  reporter tx
    state healthy error 0 recover 0 grace_period 10000 auto_recover true auto_dump true
  reporter rx
    state healthy error 0 recover 0 grace_period 10000 auto_recover true auto_dump true

$ devlink health show pci/0000:00:0b.0/1 reporter rx
Which is equivalent to:
$ devlink port health show pci/0000:00:0b.0/1 reporter rx
pci/0000:00:0b.0/1:
  reporter rx
    state healthy error 0 recover 0 grace_period 10000 auto_recover true auto_dump true

$ devlink port health show pci/0000:00:0b.0/1 reporter rx -j --pretty
{
    "health": {
         "pci/0000:00:0b.0/1": [ {
                 "reporter": "rx",
                 "state": "healthy",
                 "error": 0,
                 "recover": 0,
                 "grace_period": 500,
                 "auto_recover": true,
                 "auto_dump": true
              } ]
    }
}

$ devlink health set pci/0000:00:0b.0/1 reporter rx grace_period 5000
Which is equivalent to:
$ devlink port health set pci/0000:00:0b.0/1 reporter rx grace_period 5000

$ devlink port health show pci/0000:00:0b.0/1 reporter rx
pci/0000:00:0b.0/1:
  reporter rx
    state healthy error 0 recover 0 grace_period 5000 auto_recover true auto_dump true

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago devlink: Add a possibility to print arrays of devlink port handles
Vladyslav Tarasiuk [Sun, 19 Jul 2020 13:36:01 +0000 (16:36 +0300)]
devlink: Add a possibility to print arrays of devlink port handles

Add the capability to print port handles for arrays in non-JSON format,
in the same manner as devlink-health.

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago uapi: update bpf.h
Stephen Hemminger [Tue, 21 Jul 2020 16:18:15 +0000 (09:18 -0700)]
uapi: update bpf.h

Upstream 5.8-rc6 changes.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago testsuite: Add tests for bareudp tunnels
Guillaume Nault [Fri, 17 Jul 2020 14:39:46 +0000 (16:39 +0200)]
testsuite: Add tests for bareudp tunnels

Test the plain MPLS (unicast and multicast) and IP (v4 and v6) modes.
Also test the multiproto option for MPLS and for IP.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago misc: make the pattern matching case-insensitive
Anton Danilov [Thu, 9 Jul 2020 15:03:43 +0000 (18:03 +0300)]
misc: make the pattern matching case-insensitive

To improve usability, use case-insensitive pattern matching
in the ifstat, nstat and ss tools.
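
The effect can be sketched in Python by folding both the name and the
pattern to lower case before glob matching. match_ci() is a hypothetical
helper; the tools themselves are written in C.

```python
from fnmatch import fnmatchcase


def match_ci(name, patterns):
    """True if name matches any glob pattern, ignoring case."""
    if not patterns:
        return True  # no filter given: everything matches
    lowered = name.lower()
    return any(fnmatchcase(lowered, pat.lower()) for pat in patterns)
```

A user can then type a pattern such as tcpext* without having to
remember the exact capitalisation of the counter names.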

Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago tc/m_estimator: Print proper value for estimator interval in raw.
Jamie Gloudon [Fri, 17 Jul 2020 15:05:30 +0000 (11:05 -0400)]
tc/m_estimator: Print proper value for estimator interval in raw.

While looking at the estimator code, I noticed an incorrect interval
number printed in raw for the handles. This patch fixes the formatting.

Before patch:

root@bytecenter.fr:~# tc -r filter add dev eth0 ingress estimator
250ms 999ms matchall action police avrate 12mbit conform-exceed drop
[estimator i=4294967294 e=2]

After patch:

root@bytecenter.fr:~# tc -r filter add dev eth0 ingress estimator
250ms 999ms matchall action police avrate 12mbit conform-exceed drop
[estimator i=-2 e=2]
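
The two numbers are the same bit pattern read in two ways, which a few
lines of Python can show. This assumes the interval is a small signed
value that was being formatted as a 32-bit unsigned integer; the helper
names are illustrative.

```python
def shown_as_unsigned(value, bits=32):
    """What a (possibly negative) value looks like when printed as unsigned."""
    return value & ((1 << bits) - 1)


def shown_as_signed(value, bits=32):
    """Reinterpret an unsigned bit pattern as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value
```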

Signed-off-by: Jamie Gloudon <jamie.gloudon@gmx.fr>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago Merge branch 'tc-qevent-block' into next
David Ahern [Mon, 20 Jul 2020 16:36:41 +0000 (16:36 +0000)]
Merge branch 'tc-qevent-block' into next

Petr Machata says:

====================

When a list of filters at a given block is requested, tc first validates
that the block exists before doing the filter query. Currently the
validation routine checks ingress and egress blocks. But now that blocks
can be bound to qevents as well, qevent blocks should be looked for as
well:

    # ip link add up type dummy
    # tc qdisc add dev dummy1 root handle 1: \
         red min 30000 max 60000 avpkt 1000 qevent early_drop block 100
    # tc filter add block 100 pref 1234 handle 102 matchall action drop
    # tc filter show block 100
    Cannot find block "100"

This patchset fixes this issue:

    # tc filter show block 100
    filter protocol all pref 1234 matchall chain 0
    filter protocol all pref 1234 matchall chain 0 handle 0x66
      not_in_hw
            action order 1: gact action drop
             random type none pass val 0
             index 2 ref 1 bind 1

In patch #1, the helpers and necessary infrastructure are introduced,
including a new qdisc_util callback that implements sniffing out bound
blocks in a given qdisc.

In patch #2, RED implements the new callback.

v3:
- Patch #1:
    - Do not pass &ctx->found directly to has_block. Do it through a
      helper variable, so that the callee does not overwrite the result
      already stored in ctx->found.

v2:
- Patch #1:
    - In tc_qdisc_block_exists_cb(), do not initialize 'q'.
    - Propagate upwards errors from q->has_block.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago tc: q_red: Implement has_block for RED
Petr Machata [Thu, 16 Jul 2020 16:47:08 +0000 (19:47 +0300)]
tc: q_red: Implement has_block for RED

In order for "tc filter show block X" to find a given block, implement the
has_block callback.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago tc: Look for blocks in qevents
Petr Machata [Thu, 16 Jul 2020 16:47:07 +0000 (19:47 +0300)]
tc: Look for blocks in qevents

When a list of filters at a given block is requested, tc first validates
that the block exists before doing the filter query. Currently the
validation routine checks ingress and egress blocks. But now that blocks
can be bound to qevents as well, qevent blocks should be looked for as
well.

In order to support that, extend struct qdisc_util with a new callback,
has_block. That should report whether, given the attributes in TCA_OPTIONS,
a block with a given number is bound to a qevent. In
tc_qdisc_block_exists_cb(), invoke that callback when set.

Add a helper to the tc_qevent module that walks the list of qevents and
looks for a given block. This is meant to be used by the individual qdiscs.
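
The lookup order described above can be sketched in Python. The
dictionaries stand in for the C structures and are purely hypothetical;
only the idea is taken from this patch: check the ingress and egress
blocks first, then ask the qdisc's optional has_block callback.

```python
def red_has_block(options, block_idx):
    """Stand-in for a per-qdisc callback: is block_idx bound to a qevent?"""
    return any(qe.get("block") == block_idx for qe in options.get("qevents", []))


# Hypothetical registry playing the role of struct qdisc_util.
QDISC_UTILS = {"red": {"has_block": red_has_block}}


def block_exists(qdiscs, block_idx):
    """True if any qdisc binds block_idx directly or through a qevent."""
    for qdisc in qdiscs:
        if block_idx in (qdisc.get("ingress_block"), qdisc.get("egress_block")):
            return True
        has_block = QDISC_UTILS.get(qdisc["kind"], {}).get("has_block")
        if has_block and has_block(qdisc.get("options", {}), block_idx):
            return True
    return False
```

Qdiscs that do not provide the callback are simply skipped for the
qevent part of the check.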

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago ss: mptcp: add msk diag interface support
Paolo Abeni [Fri, 10 Jul 2020 13:52:35 +0000 (15:52 +0200)]
ss: mptcp: add msk diag interface support

This implements support for the MPTCP socket type, including
extended socket info. Note that we need to add an extended
attribute carrying the actual protocol number to the diag
request.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago Update kernel headers
David Ahern [Tue, 14 Jul 2020 23:56:53 +0000 (23:56 +0000)]
Update kernel headers

Update kernel headers to commit:
    81adcd65b685 ("ksz884x: switch from 'pci_' to 'dma_' API")

Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago Merge branch 'main' into next
David Ahern [Tue, 14 Jul 2020 23:52:43 +0000 (23:52 +0000)]
Merge branch 'main' into next

Signed-off-by: David Ahern <dsahern@kernel.org>
3 years ago ip xfrm: policy: support policies with IF_ID in get/delete/deleteall
Eyal Birger [Thu, 9 Jul 2020 06:29:48 +0000 (09:29 +0300)]
ip xfrm: policy: support policies with IF_ID in get/delete/deleteall

The XFRMA_IF_ID attribute is set in policies for them to be
associated with an XFRM interface (4.19+).

Add support for getting/deleting policies with this attribute.

For supporting 'deleteall' the XFRMA_IF_ID attribute needs to be
explicitly copied.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago ip xfrm: update man page on setting/printing XFRMA_IF_ID in states/policies
Eyal Birger [Thu, 9 Jul 2020 06:29:47 +0000 (09:29 +0300)]
ip xfrm: update man page on setting/printing XFRMA_IF_ID in states/policies

In commit aed63ae1acb9 ("ip xfrm: support setting/printing XFRMA_IF_ID attribute in states/policies")
I added the ability to set/print the xfrm interface ID without updating
the man page.

Fixes: aed63ae1acb9 ("ip xfrm: support setting/printing XFRMA_IF_ID attribute in states/policies")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago tipc: fixed a compile warning in tipc/link.c
Hoang Huu Le [Thu, 9 Jul 2020 04:25:55 +0000 (11:25 +0700)]
tipc: fixed a compile warning in tipc/link.c

Fixes: 5027f233e35b ("tipc: add link broadcast get")
Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago bridge: fdb get: add missing json init (new_json_obj)
Julien Fortin [Fri, 10 Jul 2020 00:53:02 +0000 (02:53 +0200)]
bridge: fdb get: add missing json init (new_json_obj)

'bridge fdb get' has json support but the json object is never initialized

before patch:

$ bridge -j fdb get 56:23:28:4f:4f:e5 dev vx0
56:23:28:4f:4f:e5 dev vx0 master br0 permanent
$

after patch:

$ bridge -j fdb get 56:23:28:4f:4f:e5 dev vx0 | \
python -c \
'import sys,json;print(json.dumps(json.loads(sys.stdin.read()),indent=4))'
[
    {
        "master": "br0",
        "mac": "56:23:28:4f:4f:e5",
        "flags": [],
        "ifname": "vx0",
        "state": "permanent"
    }
]
$

Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago configure: support ipset version 7 with kernel version 5
Tony Ambardar [Tue, 7 Jul 2020 07:58:33 +0000 (00:58 -0700)]
configure: support ipset version 7 with kernel version 5

The configure script checks for ipset v6 availability but doesn't test
for v7, which is backward compatible and used on kernel v5.x systems.
Update the script to test for both ipset versions. Without this change,
the tc ematch function em_ipset will be disabled.

Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
3 years ago ip address: remove useless include
Andrea Claudi [Tue, 7 Jul 2020 19:49:47 +0000 (21:49 +0200)]
ip address: remove useless include

utils.h is included twice in ipaddress.c; there is no need for that.

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>