Thomas Haller [Tue, 23 Apr 2019 07:16:14 +0000 (09:16 +0200)]
iprule: refactor print_rule() to use leading space before printing attribute
When printing the actions, we avoid adding the trailing space after the
attribute. Possibly because we expect the action to be the last output
on the line and not end with a space.
But for FR_ACT_TO_TBL nothing is printed. That means, we add double
spaces if a protocol is printed as well:
# ip rule add priority 10 protocol 10 type 1
will be printed as
10: from all lookup 1  proto mrt
The only visible effect of the patch is to avoid the double-space and
avoid a trailing space if the action is FR_ACT_TO_TBL.
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
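The difference between the two conventions can be sketched as follows. This is a simplified model, not print_rule() itself: the function names and the way the rule is split into prefix, table, action and proto are assumptions made for illustration.

```python
def print_mixed(prefix, table, action=None, proto=None):
    """Old behaviour (modelled): table appends a trailing space,
    actions print no trailing space, proto prepends a leading space."""
    out = prefix + " "
    out += f"lookup {table} "
    if action:
        out += action
    if proto:
        out += f" proto {proto}"
    return out


def print_leading(prefix, table, action=None, proto=None):
    """New behaviour (modelled): every attribute brings its own
    leading space, so an attribute that prints nothing leaves no gap."""
    out = prefix
    out += f" lookup {table}"
    if action:
        out += f" {action}"
    if proto:
        out += f" proto {proto}"
    return out
```

With an empty action (the FR_ACT_TO_TBL case), the mixed variant yields a double space before "proto" and a trailing space when no protocol follows; the leading-space variant yields neither.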
Thomas Haller [Tue, 23 Apr 2019 07:16:13 +0000 (09:16 +0200)]
iprule: avoid trailing space in print_rule() after printing protocol
It seems print_rule() tries to avoid a trailing space at the end
of the line. At least, when printing details about the actions,
they no longer append the space. Probably expecting to be the
last attribute that will be printed.
Don't let the protocol add the trailing space. The space at the end
of the line should be printed consistently (or not).
Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Each of the commits below broke the vlan stats output in a different
way:
- 45fca4ed9412 ("bridge: fix vlan show stats formatting")
Added a second print of an interface name (e.g. eth4eth4)
- c7c1a1ef51ae ("bridge: colorize output and use JSON print library")
Broke normal vlan stats output by not printing a new line after them
Also printed interfaces without any vlans when printing stats
This fix is not pretty but it brings back the previous behaviour.
Before this fix:
$ bridge -s vlan show
port vlan id
br0br0 1 PVID Egress Untagged
RX: 0 bytes 0 packets
TX: 0 bytes 0 packets 4
RX: 0 bytes 0 packets
TX: 0 bytes 0 packetseth4eth4 4
RX: 0 bytes 0 packets
TX: 0 bytes 0 packetsroot@debian:~/
After this fix:
$ bridge -s vlan show
port vlan id
br0 1 PVID Egress Untagged
RX: 0 bytes 0 packets
TX: 0 bytes 0 packets
4
RX: 0 bytes 0 packets
TX: 0 bytes 0 packets
eth4 4
RX: 0 bytes 0 packets
TX: 0 bytes 0 packets
Fixes: 45fca4ed9412 ("bridge: fix vlan show stats formatting") Fixes: c7c1a1ef51ae ("bridge: colorize output and use JSON print library") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Since the commit below, mdb's JSON output has been invalid and its format
has changed. Restore it to valid JSON in the previous format.
Also take care of a double "Deleted" print when monitoring for changes.
Fixes: c7c1a1ef51ae ("bridge: colorize output and use JSON print library") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This adds support for the newly added fwmark option to CAKE, which allows
overriding the tin selection from the per-packet firewall marks. The fwmark
field is a bitmask that is applied to the fwmark to select the tin.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
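A minimal sketch of how a bitmask could turn a firewall mark into a tin. Two points are assumptions not stated in the message above: that the masked value is shifted down by the position of the mask's lowest set bit, and that the result is treated as a 1-based tin number where 0 (or an out-of-range value) means "no override". The function name is hypothetical.

```python
def tin_from_fwmark(mark, mask, tin_count):
    """Return a 0-based tin index selected by the mark, or None.

    Assumed model: value = (mark & mask) >> (lowest set bit of mask),
    and a value in 1..tin_count selects tin (value - 1).
    """
    if mask == 0:
        return None  # option not configured, no override
    shift = (mask & -mask).bit_length() - 1  # index of lowest set bit
    value = (mark & mask) >> shift
    if 1 <= value <= tin_count:
        return value - 1
    return None  # 0 or out of range: fall back to normal tin selection
```

Under these assumptions, a mask of 0xf0 lets the second hex digit of the mark carry the tin number while the remaining bits stay free for other uses.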
Fixes: 663c3cb23103 ("iproute: implement JSON and color output") Acked-by: Phil Sutter <phil@nwl.cc> Reviewed-and-tested-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Fixes: 663c3cb23103 ("iproute: implement JSON and color output") Acked-by: Phil Sutter <phil@nwl.cc> Reviewed-and-tested-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
action m_connmark returns error messages identifying itself as the
'simple' action instead of 'connmark' action. e.g.
tc filter add dev eth0 protocol all u32 match u32 0 0 flowid 1:1 \
action connmark index wrong
simple: Illegal "index"
bad action parsing
parse_action: bad value (3:connmark)!
Illegal "action"
In what is most likely a copy/paste error from the simple action example
code, fix connmark error messages to identify themselves as coming from
connmark.
tc filter add dev eth0 protocol all u32 match u32 0 0 flowid 1:1 \
action connmark index wrong
connmark: Illegal "index"
bad action parsing
parse_action: bad value (3:connmark)!
Illegal "action"
While we're here, also fix up the error message for an illegal "zone"
so that it says 'Illegal "zone"' instead of 'Illegal "index"'.
Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
David Ahern [Fri, 15 Mar 2019 21:01:35 +0000 (14:01 -0700)]
Merge branch 'bond-bridge-xstats-json' into next
Nikolay Aleksandrov says:
====================
This set adds json output support to the xstats API (patch 01) and then
adds json support to the bridge xstats output (patch 02) and adds xstats
output support (both plain text and json) for the bonding (patch 03).
It doesn't change the bridge's plain text output, but it fixes an
inconsistency that could happen if new bridge xstats attributes were
added (print the interface name once for each group of xstats attrs).
Add json support for bridge's xstats output.
The plain text output format should remain the same.
Note that this patch pulls the interface out of the attribute
loop, this was an oversight when the set was upstreamed. This does not
change the output format, but fixes it when new xstats attributes show
up.
Roopa Prabhu [Mon, 4 Mar 2019 05:26:32 +0000 (21:26 -0800)]
bridge: fdb: add support for src_vni option
We already print src_vni for a fdb entry when present.
This patch adds the ability to set src_vni on a fdb
entry. When not specified, kernel will use vni specified
on the vxlan device. This can be used on a vxlan fdb entry
when the vxlan device is in external or collect metadata
mode.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Dmytro Linkin [Wed, 27 Feb 2019 12:10:17 +0000 (12:10 +0000)]
tc/pedit: Fix wrong pedit ipv6 structure id
A tc pedit action with more than two ip6 munge in a row causes an
infinite loop.
Example:
$ tc filter add dev eth0 protocol ipv6 parent ffff: \
flower ip_proto sctp \
action pedit ex \
munge ip6 hoplimit set 0x1 \
munge ip6 src set 2001:0db8:0:f101::1 \
munge that cause infinite loop
The example command never returns, instead of failing with a parse error
as expected. The pedit ipv6 structure has the wrong id, which leads to the
creation of a linked list with one node in tc/m_pedit.c:get_pedit_kind(),
referring to itself. This node is created if the command has two ip6 munge
in a row, and any third ip6 munge will cause an infinite loop.
Changing this id from "ipv6" to "ip6" solves the problem.
Fixes: f3e1b2448a95 ("pedit: Introduce ipv6 support") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
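The failure mode can be reproduced with a small model. This is a sketch under assumptions: the class, the dictionary-based state and the step limit are hypothetical, and the lookup-then-prepend logic only imitates what the message describes for get_pedit_kind(). The step limit exists so the sketch raises instead of hanging.

```python
class PeditUtil:
    """Stand-in for a statically allocated parser descriptor."""

    def __init__(self, ident):
        self.id = ident
        self.next = None


def get_pedit_kind(state, name, builtin, max_steps=100):
    # Walk the list of parsers that were already registered.
    node, steps = state["head"], 0
    while node is not None:
        if node.id == name:
            return node
        steps += 1
        if steps > max_steps:
            raise RuntimeError("cycle in parser list")
        node = node.next
    # Not found: prepend the built-in descriptor for this name.
    parser = builtin[name]
    parser.next = state["head"]
    state["head"] = parser
    return parser
```

With a descriptor registered under "ip6" whose id is "ipv6", the first lookup prepends it, the second lookup misses it (the ids differ) and prepends the same object again so that it points to itself, and the third lookup never terminates. With the id set to "ip6" the second lookup finds the node and nothing is prepended twice.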
David Ahern [Thu, 28 Feb 2019 16:00:19 +0000 (08:00 -0800)]
Merge branch 'devlink-health' into next
Aya Levin says:
====================
This series adds support for devlink health commands:
devlink health show [ DEV reporter REPORTER_NAME ]
devlink health recover DEV reporter REPORTER_NAME
devlink health diagnose DEV reporter REPORTER_NAME
devlink health dump show DEV reporter REPORTER_NAME
devlink health dump clear DEV reporter REPORTER_NAME
devlink health set DEV reporter REPORTER_NAME { grace_period | auto_recover } { msec | boolean }
The first patch refactors the validation of input parameters, which
grew way too long. The second and third patches fix bugs that were
discovered during the devlink health development. The fourth patch adds
helper functions which enable output of values and labels separately.
Patches 5-10 add the devlink health functionality by command; the last
is the man page.
Aya Levin [Thu, 28 Feb 2019 12:13:03 +0000 (14:13 +0200)]
devlink: Add devlink health set command
Add devlink set command which enables the user to configure parameters
related to the devlink health mechanism per reporter.
1) grace_period [msec] time interval between auto recoveries.
2) auto_recover [true/false] whether the devlink should execute automatic
recover on error.
Add a helper function to retrieve a boolean value as an input parameter.
Example:
$ devlink health set pci/0000:00:09.0 reporter tx grace_period 3500
$ devlink health set pci/0000:00:09.0 reporter tx auto_recover false
Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Aya Levin [Thu, 28 Feb 2019 12:13:02 +0000 (14:13 +0200)]
devlink: Add devlink health dump clear command
Add devlink dump clear command which deletes the last saved dump file.
Clearing the last saved dump enables a new dump file to be saved.
Example:
$ devlink health dump clear pci/0000:00:09.0 reporter tx
Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Aya Levin [Thu, 28 Feb 2019 12:13:01 +0000 (14:13 +0200)]
devlink: Add devlink health dump show command
Add devlink dump show command which displays the last saved dump.
Devlink health saves a single dump. If a dump is not already stored
by the devlink for this reporter, devlink generates a new dump. The dump
can be generated automatically when a reporter reports on an
error or manually by user's request.
The dump's output is defined by the reporter. The command uses the
infrastructure for flexible format output introduced in the previous patch.
Example:
$ devlink health dump show pci/0000:00:09.0 reporter tx
Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Aya Levin [Thu, 28 Feb 2019 12:13:00 +0000 (14:13 +0200)]
devlink: Add devlink health diagnose command
Add devlink health diagnose command, which enables the user to retrieve
diagnostics data from a reporter on a device. The command's output is
free text defined by the reporter.
This patch also introduces an infrastructure for flexible format
output. This allows the command to display different data fields
according to the reporter.
Example:
$ devlink health diagnose pci/0000:00:0a.0 reporter tx
SQs:
sqn: 4403 HW state: 1 stopped: false
sqn: 4408 HW state: 1 stopped: false
sqn: 4413 HW state: 1 stopped: false
sqn: 4418 HW state: 1 stopped: false
sqn: 4423 HW state: 1 stopped: false
Aya Levin [Thu, 28 Feb 2019 12:12:59 +0000 (14:12 +0200)]
devlink: Add devlink health recover command
Add devlink health recover command which enables the user to initiate a
recovery on a reporter (if a recovery cb was supplied by the reporter).
This operation will increment the recoveries counter displayed in the
show command.
Example:
$ devlink health recover pci/0000:00:09.0 reporter tx
Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Aya Levin [Thu, 28 Feb 2019 12:12:58 +0000 (14:12 +0200)]
devlink: Add devlink health show command
Add devlink health show command which displays status and configuration
info on a specific reporter on a device or dump the info on all
reporters on all devices. Add helper functions to display status and
dump's time stamp.
Example:
$ devlink health show pci/0000:00:09.0 reporter tx
pci/0000:00:09.0:
name tx
state healthy error 0 recover 1 last_dump_date 2019-02-14 last_dump_time 10:10:10 grace_period 600 auto_recover true
$ devlink health show pci/0000:00:09.0 reporter tx -jp
{
"health":{
"pci/0000:00:0a.0":[
{
"name":"tx",
"state":"healthy",
"error":0,
"recover":1,
"last_dump_date":"2019-Feb-14",
"last_dump_time":"10:10:10",
"grace_period":600,
"auto_recover":true
}
]
}
}
Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Aya Levin [Thu, 28 Feb 2019 12:12:57 +0000 (14:12 +0200)]
devlink: Add helper functions for name and value separately
Add new helper functions which output only values (without the name
label) for different types: boolean, uint, uint64, string and binary.
In addition, add a helper function which prints only the name label.
Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Aya Levin [Thu, 28 Feb 2019 12:12:54 +0000 (14:12 +0200)]
devlink: Refactor validation of finding required arguments
Introduce an argument metadata structure that pairs the bitmap flag of
each required argument with the error message to print if it is missing.
Use this static array to refactor the validation of required arguments
on the devlink command line and to ease further maintenance.
Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
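The table-driven validation described above can be sketched like this. The flag names, bit positions and messages are illustrative assumptions, not the ones used in devlink.c.

```python
# Hypothetical option bits; the real tool defines its own set.
OPT_HANDLE = 1 << 0
OPT_REPORTER_NAME = 1 << 1
OPT_GRACE_PERIOD = 1 << 2

# One entry per argument: (flag, message to print when it is missing).
REQUIRED_ARGS = [
    (OPT_HANDLE, "Device handle expected."),
    (OPT_REPORTER_NAME, "Reporter name expected."),
    (OPT_GRACE_PERIOD, "Grace period expected."),
]


def check_required(required, found):
    """Return the error message of the first required but missing
    argument, or None when all required arguments were found."""
    for flag, message in REQUIRED_ARGS:
        if (required & flag) and not (found & flag):
            return message
    return None
```

Adding a new required argument then means adding one table row instead of another hand-written if block.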
Leon Romanovsky [Wed, 27 Feb 2019 06:41:51 +0000 (08:41 +0200)]
rdma: Add the prefix for driver attributes
There is a need to distinguish between driver-specific and generally
exposed attributes. The most common use case is to expose some internal
garbage under an extremely common and sexy name, e.g. pi, ci, etc.
In order to achieve that, we will add "drv_" prefix to all strings
which were received through RDMA_NLDEV_ATTR_DRIVER_* attributes.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Tested-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Sun, 24 Feb 2019 15:13:21 +0000 (07:13 -0800)]
Merge branch 'rdma-object-ids' into next
Leon Romanovsky says:
====================
This series adds the ability to present and query all objects known to
rdmatool by their respective, unique IDs (e.g. pdn, mrn, cqn, etc.).
All objects which have a "parent" object have this information too.
Leon Romanovsky [Sat, 23 Feb 2019 09:15:28 +0000 (11:15 +0200)]
rdma: Provide and reuse filter functions
Globally replace all filter functions with safer variants of those
is_filtered functions, which take into account the availability/lack
of netlink attributes.
This conversion fixes a number of places in the code where
the previous implementation didn't honor filter requests if the netlink
attribute wasn't present.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
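A sketch of a filter check that is aware of missing attributes. It assumes that "honoring the filter" means hiding an object when the user filtered on a key for which the kernel sent no attribute; the function name and the dictionary representation of filters and attributes are hypothetical.

```python
def is_filtered(filters, key, attrs):
    """Return True when the object should be hidden from the output.

    filters: key -> set of accepted values requested by the user
    attrs:   key -> value, only for attributes present in the message
    """
    if key not in filters:
        return False  # the user did not filter on this key
    if key not in attrs:
        return True   # filter requested, but the attribute is absent
    return attrs[key] not in filters[key]
```

The second branch is the case the message refers to: without it, an object lacking the attribute would slip past a filter that the user explicitly asked for.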
Leon Romanovsky [Sat, 23 Feb 2019 09:15:27 +0000 (11:15 +0200)]
rdma: Perform single .doit call to query specific objects
If the user provides a specific index, we can speed up the query
by using the .doit callback and skip the full dump and the filtering
after it.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:26 +0000 (11:15 +0200)]
rdma: Unify netlink attribute checks prior to prints
Check whether a netlink attribute is available in one common place,
instead of doing the same check in many places.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:25 +0000 (11:15 +0200)]
rdma: Move QP code to separate function
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:24 +0000 (11:15 +0200)]
rdma: Place PD parsing print routine into separate function
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:23 +0000 (11:15 +0200)]
rdma: Move MR code to be suitable for per-line parsing
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:22 +0000 (11:15 +0200)]
rdma: Refactor CQ prints
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:21 +0000 (11:15 +0200)]
rdma: Simplify CM_ID print code
Refactor out the CM_ID print code.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:20 +0000 (11:15 +0200)]
rdma: Simplify code to reuse existing functions
Remove duplicated functions in favour of the general res_print_uint() call.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:19 +0000 (11:15 +0200)]
rdma: Properly mark RDMAtool license
The RDMA subsystem is dual-licensed under "GPL-2.0 OR Linux-OpenIB", and
Mellanox submissions are supposed to carry this type of license.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:18 +0000 (11:15 +0200)]
rdma: Move resource QP logic to separate file
Logically separate resource QP logic to separate file,
in order to make QP specific logic self-contained.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:17 +0000 (11:15 +0200)]
rdma: Move out resource CM-ID logic to separate file
Logically separate resource CM-ID logic to separate file,
in order to make CM-ID specific logic self-contained.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:16 +0000 (11:15 +0200)]
rdma: Move out resource CQ logic to separate file
Logically separate resource CQ logic to separate file,
in order to make CQ specific logic self-contained.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:15 +0000 (11:15 +0200)]
rdma: Refactor out resource MR logic to separate file
Logically separate resource MR logic to separate file,
in order to make MR specific logic self-contained.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:14 +0000 (11:15 +0200)]
rdma: Move resource PD logic to separate file
Logically separate resource PD logic to separate file,
in order to make PD specific logic self-contained.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:13 +0000 (11:15 +0200)]
rdma: Provide parent context index for all objects except CM_ID
Allow users to correlate an allocated object with its relevant parent.
[leonro@server ~]$ rdma res show pd
dev mlx5_0 users 5 pid 0 comm [ib_core] pdn 1
dev mlx5_0 users 7 pid 0 comm [ib_ipoib] pdn 2
dev mlx5_0 users 0 pid 0 comm [mlx5_ib] pdn 3
dev mlx5_0 users 2 pid 548 comm ibv_rc_pingpong ctxn 0 pdn 4
[leonro@server ~]$ rdma res show cq cqn 0-100
dev mlx5_0 cqe 2047 users 6 poll-ctx UNBOUND_WORKQUEUE pid 0 comm [ib_core] cqn 2
dev mlx5_0 cqe 255 users 2 poll-ctx SOFTIRQ pid 0 comm [mlx5_ib] cqn 3
dev mlx5_0 cqe 511 users 1 poll-ctx DIRECT pid 0 comm [ib_ipoib] cqn 4
dev mlx5_0 cqe 255 users 1 poll-ctx DIRECT pid 0 comm [ib_ipoib] cqn 5
dev mlx5_0 cqe 255 users 0 poll-ctx SOFTIRQ pid 0 comm [mlx5_ib] cqn 6
dev mlx5_0 cqe 511 users 2 pid 548 comm ibv_rc_pingpong cqn 7 ctxn 0
[leonro@server ~]$ rdma res show mr
dev mlx5_0 mrlen 4096 pid 548 comm ibv_rc_pingpong mrn 4 pdn 0
[leonro@nps-server-14-015 ~]$ /images/leonro/src/iproute2/rdma/rdma res show qp
link mlx5_0/1 lqpn 0 type SMI state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_0/1 lqpn 1 type GSI state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_0/1 lqpn 7 type UD state RTS sq-psn 0 pid 0 comm [ib_core]
link mlx5_0/1 lqpn 8 type UD state RTS sq-psn 0 pid 0 comm [ib_ipoib]
link mlx5_0/1 lqpn 9 pdn 4 rqpn 0 type RC state INIT rq-psn 0 sq-psn 0 path-mig-state MIGRATED pid 548 comm ibv_rc_pingpong
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:12 +0000 (11:15 +0200)]
rdma: Provide unique indexes for all visible objects
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:11 +0000 (11:15 +0200)]
rdma: Remove duplicated print code
There is no need to keep same print functions for
uint32_t and uint64_t, unify them into one function.
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Leon Romanovsky [Sat, 23 Feb 2019 09:15:10 +0000 (11:15 +0200)]
rdma: update uapi headers
Update rdma_netlink.h file up to kernel commit f2a0e45f36b0 RDMA/nldev: Don't expose number of not-visible entries
Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Wed, 13 Feb 2019 23:56:30 +0000 (15:56 -0800)]
Improve batch and dump times by caching link lookups
ip route uses ll_name_to_index and ll_index_to_name to convert between
device names and indices. At the moment both use the ioctl-based glibc
functions if_nametoindex and if_indextoname and do not cache the result.
When using a batch file or dumping a large number of routes, this means the
same device lookups can be done repeatedly, adding unnecessary overhead
(socket + ioctl + close for each device lookup).
Add a new function, ll_link_get, to send a netlink based RTM_GETLINK. If
successful, cache the result in idx_head and name_head so future lookups
can re-use the entry. Update ll_name_to_index and ll_index_to_name to use
ll_link_get and only fall back to the glibc functions if it fails.
With this change the time to install 720,022 routes with 2 ecmp nexthops
where the nexthop device is given is reduced from 31.4 seconds to 19.2
seconds. A dump of those routes drops from 13.3 to 2.8 seconds.
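The caching scheme can be sketched as follows. This is a model, not the ll_map code: query_link stands in for the netlink RTM_GETLINK request, slow_name_to_index stands in for the glibc fallback, and both are hypothetical callables supplied by the caller.

```python
class LinkCache:
    """Cache name<->index mappings so repeated lookups cost nothing."""

    def __init__(self, query_link, slow_name_to_index):
        self.by_name = {}    # name  -> index
        self.by_index = {}   # index -> name
        self.query_link = query_link              # returns (index, name) or None
        self.slow_name_to_index = slow_name_to_index

    def _remember(self, index, name):
        self.by_name[name] = index
        self.by_index[index] = name

    def name_to_index(self, name):
        if name in self.by_name:
            return self.by_name[name]
        link = self.query_link(name=name)
        if link is None:
            # Query failed: fall back to the uncached slow path.
            return self.slow_name_to_index(name)
        self._remember(*link)
        return link[0]

    def index_to_name(self, index):
        if index in self.by_index:
            return self.by_index[index]
        link = self.query_link(index=index)
        if link is None:
            return None
        self._remember(*link)
        return link[1]
```

One successful query fills both maps, so a later lookup in either direction for the same device is served from memory.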
Stefano Brivio [Thu, 14 Feb 2019 00:58:32 +0000 (01:58 +0100)]
ss: Render buffer to output every time a number of chunks are allocated
Eric reported that, with 10 million sockets, ss -emoi (about 1000 bytes
output per socket) can easily lead to OOM (buffer would grow to 10GB of
memory).
Limit the maximum size of the buffer to five chunks, 1M each. Render and
flush buffers whenever we reach that.
This might make the resulting blocks slightly unaligned between them, with
occasional loss of readability on lines occurring every 5k to 50k sockets
approximately. Something like (from ss -tu):
However, I don't actually expect any human user to scroll through that
amount of sockets, so readability should be preserved when it matters.
The bulk of the diffstat comes from moving field_next() around, as we now
call render() from it. Functionally, this is implemented by six lines of
code, most of them in field_next().
Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: 691bd854bf4a ("ss: Buffer raw fields first, then render them as a table") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
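A minimal sketch of the bounded buffering described above, assuming a simple byte-oriented buffer. The class and method names are hypothetical; in this model render() only hands the chunks to a sink, whereas the real tool also lays the fields out as a table for each batch, which is why alignment can differ between batches.

```python
CHUNK_SIZE = 1 << 20   # 1M per chunk, as in the commit message
MAX_CHUNKS = 5         # render and flush once five chunks are full


class FieldBuffer:
    def __init__(self, sink, chunk_size=CHUNK_SIZE, max_chunks=MAX_CHUNKS):
        self.sink = sink
        self.chunk_size = chunk_size
        self.max_chunks = max_chunks
        self.chunks = [bytearray()]

    def add(self, data):
        # Does the current chunk have room for this field?
        if len(self.chunks[-1]) + len(data) > self.chunk_size:
            if len(self.chunks) == self.max_chunks:
                self.render()              # limit reached: flush everything
            else:
                self.chunks.append(bytearray())
        self.chunks[-1] += data

    def render(self):
        for chunk in self.chunks:
            self.sink(bytes(chunk))
        self.chunks = [bytearray()]        # start over with one empty chunk
```

Memory use is then bounded by roughly max_chunks * chunk_size regardless of how many sockets are listed.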
Commit c759116a0b2b6da8df9687b0a40ac69050132c77 introduced support for
AF_VSOCK. This define is only provided since glibc version 2.18, so
compilation fails when using older toolchains.
Provide the necessary definitions if needed.
Signed-off-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Vivien Didelot [Wed, 20 Feb 2019 16:33:57 +0000 (11:33 -0500)]
bridge: make mcast_flood description consistent
This patch simply changes the description of the mcast_flood flag
with "flood" instead of "be flooded with" to avoid confusion, and be
consistent with the description of the flooding flag, which "Controls
whether a given port will *flood* unicast traffic for which there is
no FDB entry."
At the same time, fix the documentation for the "flood" flag which
is incorrectly described as "flooding on" or "flooding off".
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Jiri Pirko [Thu, 21 Feb 2019 10:55:56 +0000 (11:55 +0100)]
devlink: relax dpipe table show dependency on resources
Dpipe table show command has a dependency on getting resources.
If resource get command is not supported by the driver, dpipe table
show fails. However, resource is only additional information
in dpipe table show output. So relax the dependency and let
the dpipe tables be shown even if resources get command fails.
Fixes: ead180274caf ("devlink: Add support for resource/dpipe relation") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Matteo Croce [Wed, 13 Feb 2019 14:40:30 +0000 (15:40 +0100)]
iplink: document XDP subcommand to force the XDP mode.
When attaching an eBPF program to a device, ip link can force the XDP mode
by using the xdp{generic,drv,offload} keyword instead of just 'xdp'.
Document this behaviour also in the help output.
ss: add option --tos for requesting ipv4 tos and ipv6 tclass
Also show socket class_id/priority used by classful qdisc.
The kernel reports this together with tclass since commit
("inet_diag: fix reporting cgroup classid and fallback to priority")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Eric Dumazet [Wed, 13 Feb 2019 01:58:41 +0000 (17:58 -0800)]
lib/libnetlink: ensure a minimum of 32KB for the buffer used in rtnl_recvmsg()
In the past, we tried to increase the buffer size up to 32 KB in order
to reduce number of syscalls per dump.
Commit 2d34851cd341 ("lib/libnetlink: re malloc buff if size is not enough")
brought the size back to 4KB because the kernel can not know the application
is ready to receive bigger requests.
See kernel commits 9063e21fb026 ("netlink: autosize skb lengthes") and d35c99ff77ec ("netlink: do not enter direct reclaim from netlink_dump()")
for more details.
Fixes: 2d34851cd341 ("lib/libnetlink: re malloc buff if size is not enough") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hangbin Liu <liuhangbin@gmail.com> Cc: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
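The sizing rule in the title can be written as a one-line helper. This is a sketch: the function name is hypothetical, and peeked_len stands for the message length learned by peeking at the socket before allocating the receive buffer.

```python
MIN_RECV_BUF = 32 * 1024  # 32KB floor from the commit title


def recv_buffer_size(peeked_len):
    """Never offer the kernel less than 32KB, but grow when the
    pending message is larger than that."""
    return max(peeked_len, MIN_RECV_BUF)
```

Offering a larger buffer lets a dump return more entries per receive call, which is the syscall reduction mentioned above.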
As /sys/class/net/<iface>/speed indicates a value in Mbits/sec, the
conversion is necessary to create the correct limits.
This guarantees the same result for the following commands on a
1000Mbit/sec device:
tc class add ... htb rate 500Mbit
tc class add ... htb rate 50%
Fixes: 927e3cfb52b5 ("tc: B.W limits can now be specified in %.") Signed-off-by: Marcos Antonio Moraes <marcos.antonio@digirati.com.br> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
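The arithmetic behind the two equivalent commands, as a worked sketch. The function names are hypothetical; the only facts used are that the sysfs speed value is in Mbit/sec and that tc's "Mbit" suffix means 10^6 bits per second.

```python
def percent_rate_bits(speed_mbit, percent):
    """Convert a percentage of the link speed (Mbit/sec) to bits/sec."""
    return int(speed_mbit * 1_000_000 * percent / 100)


def mbit_rate_bits(mbit):
    """Convert an explicit 'NMbit' rate to bits/sec."""
    return mbit * 1_000_000
```

For a 1000Mbit/sec device, percent_rate_bits(1000, 50) and mbit_rate_bits(500) both give 500000000 bits/sec. Without the factor of 10^6 the percentage form would be off by six orders of magnitude.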
v5:
- remove spurious new line.
v4:
- more commit message improvements.
v3:
- show up-to-date output in the commit message.
v2 (Jiri):
- remove filtering;
- add example in the commit message.
RFCv2:
- make info subcommand of dev.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Yonghong Song [Fri, 25 Jan 2019 00:41:07 +0000 (16:41 -0800)]
bpf: add btf func and func_proto kind support
The issue was discovered with the bpf selftest test_skb_cgroup_id.sh.
Currently we have,
$ ./test_skb_cgroup_id.sh
Wait for testing link-local IP to become available ... OK
Object has unknown BTF type: 13!
[PASS]
In the above the BTF type 13 refers to BTF kind
BTF_KIND_FUNC_PROTO.
This patch added support of BTF_KIND_FUNC_PROTO and
BTF_KIND_FUNC during type parsing.
With this patch, I got
$ ./test_skb_cgroup_id.sh
Wait for testing link-local IP to become available ... OK
[PASS]
Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Ido Schimmel [Fri, 25 Jan 2019 17:09:17 +0000 (17:09 +0000)]
bridge: fdb: Fix FDB dump with strict checking disabled
While iproute2 correctly uses the ifinfomsg struct as the ancillary header
when requesting an FDB dump on old kernels, it sets the message type to
RTM_GETLINK. This results in a wrong reply being returned.
Chris Mi [Fri, 25 Jan 2019 10:37:07 +0000 (10:37 +0000)]
libnetlink: linkdump_req: AF_PACKET family also expects ext_filter_mask
Without this fix, the VF info can't be shown using the command
"ip link".
146: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 24:8a:07:ad:78:52 brd ff:ff:ff:ff:ff:ff
vf 0 MAC 02:25:d0:12:01:01, spoof checking off, link-state auto, trust off, query_rss off
vf 1 MAC 02:25:d0:12:01:02, spoof checking off, link-state auto, trust off, query_rss off
Fixes: d97b16b2c906 ("libnetlink: linkdump_req: Only AF_UNSPEC family expects an ext_filter_mask") Signed-off-by: Chris Mi <chrism@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Davide Caratti [Thu, 31 Jan 2019 17:58:41 +0000 (18:58 +0100)]
tc: add 'kind' property to 'csum' action
Unlike other TC actions already supporting JSON printout, 'csum' does not
print the value of TCA_KIND in the 'kind' property: remove 'csum' word
from 'csum' property, and add a separate 'kind' property containing the
action name. The human-readable printout is preserved.
Tested with:
# ./tdc.py -c csum
Cc: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Matteo Croce [Tue, 29 Jan 2019 15:01:15 +0000 (16:01 +0100)]
netns: add subcommand to attach an existing network namespace
ip tracks namespaces with dummy files in /var/run/netns/, but can't see
namespaces created with other tools.
Creating the dummy file and bind mounting the correct procfs entry will
make ip aware of that namespace.
Add an ip netns subcommand to automate this task.
Signed-off-by: Matteo Croce <mcroce@redhat.com> Reviewed-by: Andrea Claudi <aclaudi@redhat.com> Tested-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: David Ahern <dsahern@gmail.com>
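The manual steps that the new subcommand automates can be spelled out as follows. This sketch only builds the equivalent command lines and does not run them; the function name is hypothetical, and it assumes the namespace is identified by the PID of a process living in it.

```python
NETNS_RUN_DIR = "/var/run/netns"


def attach_steps(name, pid, run_dir=NETNS_RUN_DIR):
    """Return the commands that make an existing namespace visible to ip:
    create the dummy file, then bind mount the procfs entry over it."""
    target = f"{run_dir}/{name}"
    source = f"/proc/{pid}/ns/net"
    return [
        ["touch", target],
        ["mount", "--bind", source, target],
    ]
```

Running the two returned commands as root would be the manual equivalent; the subcommand spares the user from doing this by hand.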