Phil Sutter [Tue, 14 Aug 2018 12:18:07 +0000 (14:18 +0200)]
testsuite: Prepare for ss tests
This merges the shared bits from ts_tc() and ts_ip() into a common
function that both now wrap, and adds a third wrapper, ts_ss(), for
testing ss commands.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Tue, 14 Aug 2018 12:18:06 +0000 (14:18 +0200)]
ss: Review ssfilter
The original problem was ssfilter rejecting single expressions if
enclosed in braces, such as:
| sport = 22 or ( dport = 22 )
This is fixed by allowing 'expr' to be an 'exprlist' enclosed in braces.
The no longer required recursion in 'exprlist' being an 'exprlist'
enclosed in braces is dropped.
In addition to that, a few other things are changed:
* Remove pointless 'null' prefix in 'applet' before 'exprlist'.
* For simple equals matches, '=' operator was required for ports but not
allowed for hosts. Make this consistent by making '=' operator
optional in both cases.
Reported-by: Samuel Mannehed <samuel@cendio.se> Fixes: b2038cc0b2403 ("ssfilter: Eliminate shift/reduce conflicts") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
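The grammar change above can be sketched with a small hand-written parser. This is only an illustration, not the yacc grammar from ssfilter.y: the function name parse_filter, the reduced key set and the tuple output are assumptions, and operator precedence is not modelled (operators simply associate to the left).

```python
def parse_filter(text):
    """Parse a tiny ss-like filter into a nested tuple tree (illustrative only)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of filter")
        pos += 1
        return tok

    def expr():
        tok = take()
        if tok == "(":
            # An expr may be a whole exprlist enclosed in braces.
            inner = exprlist()
            if take() != ")":
                raise ValueError("missing ')'")
            return inner
        if tok not in ("sport", "dport", "src", "dst"):
            raise ValueError("unknown key: " + tok)
        if peek() == "=":
            take()  # '=' is optional for ports and hosts alike
        return (tok, take())

    def exprlist():
        node = expr()
        while peek() in ("or", "and"):
            op = take()
            node = (op, node, expr())
        return node

    tree = exprlist()
    if peek() is not None:
        raise ValueError("trailing input: " + peek())
    return tree
```

With this sketch, the expression from the report, "sport = 22 or ( dport = 22 )", parses to a single 'or' node, and "dst 10.0.0.1" is accepted without the '=' operator.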
Phil Sutter [Wed, 15 Aug 2018 09:18:26 +0000 (11:18 +0200)]
man: ip-route: Clarify referenced versions are Linux ones
The versioning schemes of Linux and iproute2 are similar, so the
referenced kernel versions are likely to confuse readers. Clarify this
by prefixing each kernel version with 'Linux'.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Wed, 15 Aug 2018 16:21:25 +0000 (18:21 +0200)]
bridge: Fix check for colored output
There is no point in calling enable_color() conditionally if it was
already called each time the '-color' flag was parsed. Align the
algorithm with that in ip and tc by actually making use of the 'color'
variable.
Fixes: e9625d6aead11 ("Merge branch 'iproute2-master' into iproute2-next") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David Ahern <dsahern@gmail.com>
sch_skbprio is a qdisc that prioritizes packets according to their skb->priority
field. Under congestion, it drops already-enqueued lower priority packets to
make space available for higher priority packets. Skbprio was conceived as a
solution for denial-of-service defenses that need to route packets with
different priorities as a means to overcome DoS attacks.
Signed-off-by: Nishanth Devarajan <ndev2021@gmail.com> Reviewed-by: Michel Machado <michel@digirati.com.br> Signed-off-by: David Ahern <dsahern@gmail.com>
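The drop policy described above can be illustrated with a toy queue. This is a sketch under stated assumptions, not the kernel implementation: the class name PrioDropQueue, the default of 64 priority levels, and the choice to drop from the tail of the lowest non-empty band are all assumptions made for illustration.

```python
from collections import deque


class PrioDropQueue:
    """Toy model: under congestion, evict a lower-priority packet for a higher one."""

    def __init__(self, limit, levels=64):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.queues = [deque() for _ in range(levels)]
        self.length = 0

    def enqueue(self, prio, packet):
        """Return the dropped packet, or None if nothing was dropped."""
        if self.length < self.limit:
            self.queues[prio].append(packet)
            self.length += 1
            return None
        # Queue is full: find the lowest priority that currently holds packets.
        lowest = next(p for p, q in enumerate(self.queues) if q)
        if prio <= lowest:
            return packet  # the newcomer is not more important, so drop it
        victim = self.queues[lowest].pop()  # evict from the lowest band
        self.queues[prio].append(packet)
        return victim

    def dequeue(self):
        """Return the oldest packet of the highest non-empty priority."""
        for q in reversed(self.queues):
            if q:
                self.length -= 1
                return q.popleft()
        return None
```

With a limit of two packets, enqueueing a priority-5 packet into a queue holding two priority-1 packets evicts one of the priority-1 packets rather than the newcomer.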
Tobias Klauser [Wed, 8 Aug 2018 12:33:40 +0000 (14:33 +0200)]
tc: bpf: update list of archs with eBPF support in manpage
Update the list of architectures supporting eBPF JIT as of Linux 4.18.
Also mention the Linux version where support for a particular
architecture was introduced. Finally, reformat the list of architectures
as a bullet list in order to make it more readable.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Fix hex output for both the ip_attr and tcp_flags print functions.
Sample usage:
$ $TC qdisc add dev lo ingress
$ $TC filter add dev lo parent ffff: prio 3 proto ip flower ip_tos 0x8/32
$ $TC filter add dev lo parent ffff: prio 5 proto ip flower ip_proto tcp \
tcp_flags 0x909/f00
Matteo Croce [Fri, 3 Aug 2018 17:49:33 +0000 (19:49 +0200)]
ip link: don't stop batch processing
When 'ip link show dev DEVICE' is processed in batch mode, ip exits
and stops processing further commands.
This is because ipaddr_list_flush_or_save() calls exit() to avoid printing
the link information twice.
Replace the exit with a classic goto out instruction.
Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
In a LXC container we're unable to umount the sysfs instance, nor mount
a read-write one. We still are able to create a new read-only instance.
Nevertheless, it still makes sense to attempt the umount() even though
the sysfs is mounted read-only. Otherwise we may end up attempting to
mount a sysfs with the same flags as is already mounted, resulting in
an EBUSY error (meaning "Already mounted").
Perhaps this is not a very likely scenario in the real world, but we hit
it in the NetworkManager test suite, and the change makes netns_switch()
somewhat more robust. It also fixes the case when /sys wasn't mounted at all.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
tc: Do not use addattr_nest_compat on mqprio and netem
Here we are partially reverting commit c14f9d92eee107
"treewide: Use addattr_nest()/addattr_nest_end() to handle nested
attributes" .
As discussed in [1], changing from the 'manually' coded version that
used addattr_l() to addattr_nest_compat() wasn't functionally
equivalent, because now the messages have extra fields appended to them.
This introduced a regression since the implementation of parse_attr()
from both mqprio and netem can't handle this new message format.
Without this fix, mqprio returns an error. netem won't return an error
but its internal configuration ends up wrong.
As an example, this can be reproduced by the following commands when
this patch is not applied:
Fixes: c14f9d92eee107 ("treewide: Use addattr_nest()/addattr_nest_end() to handle nested attributes") Reported-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
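To make the "extra fields" concrete, here is a rough byte-level sketch of the two encodings. The rtattr framing (u16 length, u16 type, payload padded to 4 bytes) is standard; the exact shape of the compat form (the same payload followed by an empty nested attribute of the same type, both counted in the outer length) is an assumption about what addattr_nest_compat() emits, and the helper names are hypothetical.

```python
import struct

RTA_ALIGNTO = 4


def rta_align(n):
    """Round n up to the attribute alignment."""
    return (n + RTA_ALIGNTO - 1) & ~(RTA_ALIGNTO - 1)


def rtattr(attr_type, payload=b""):
    """Encode one attribute: u16 length, u16 type, payload, zero padding."""
    length = 4 + len(payload)
    pad = b"\0" * (rta_align(length) - length)
    return struct.pack("=HH", length, attr_type) + payload + pad


def plain_options(attr_type, opts):
    # Manually coded form: one attribute whose payload is the option struct.
    return rtattr(attr_type, opts)


def nest_compat_options(attr_type, opts, nested=b""):
    # Assumed compat form: the payload, then a nested attribute of the same
    # type, with the outer length covering both.
    padded = opts + b"\0" * (rta_align(len(opts)) - len(opts))
    body = padded + rtattr(attr_type, nested)
    return struct.pack("=HH", 4 + len(body), attr_type) + body
```

Under these assumptions, an 8-byte option struct encodes to 12 bytes in the plain form and 16 bytes in the compat form; a parser that expects the payload to be exactly the option struct would be confused by the trailing 4-byte header.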
sch_cake is intended to squeeze the most bandwidth and latency out of even
the slowest ISP links and routers, while presenting an API simple enough
that even an ISP can configure it.
* A hybrid Codel/Blue AQM algorithm, "Cobalt", tied to an FQ_Codel
derived Flow Queuing system, which autoconfigures based on the bandwidth.
* A novel "triple-isolate" mode (the default) which balances per-host
and per-flow FQ even through NAT.
* A deficit-based shaper that can also be used in an unlimited mode.
* 8-way set-associative hashing to reduce flow collisions to a minimum.
* A reasonable interpretation of various diffserv latency/loss tradeoffs.
* Support for zeroing diffserv markings for entering and exiting traffic.
* Support for interacting well with Docsis 3.0 shaper framing.
* Support for DSL framing types and shapers.
* Support for ack filtering.
* Extensive statistics for measuring loss, ECN markings, and latency variation.
Various versions have been baking as an out-of-tree build for
kernel versions going back to 3.10, as the embedded router world has been
running a few years behind mainline Linux. A stable version has been
generally available on lede-17.01 and later.
sch_cake replaces a combination of iptables, tc filter, htb and fq_codel
in the sqm-scripts, with sane defaults and vastly simpler configuration.
Cake's principal author is Jonathan Morton, with contributions from
Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen, Sebastian Moeller,
Ryan Mounce, Tony Ambardar, Dean Scarff, Nils Andreas Svee, Dave Täht,
and Loganaden Velvindron.
Testing from Pete Heist, Georgios Amanakis, and the many other members of
the cake@lists.bufferbloat.net mailing list.
Signed-off-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David Ahern <dsahern@gmail.com>
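The "8-way set-associative hashing" item in the feature list can be sketched as follows. This is a generic illustration of the technique, not sch_cake's code: the function name find_slot, the use of None for a free slot, and the fallback to the direct-mapped slot when a set is full are assumptions. The table length is assumed to be a multiple of 8.

```python
WAYS = 8


def find_slot(tags, flow_hash):
    """Return (slot index, collided) for a flow in an 8-way set-associative table."""
    base = (flow_hash % len(tags)) // WAYS * WAYS  # first slot of the flow's set
    candidates = range(base, base + WAYS)
    for i in candidates:
        if tags[i] == flow_hash:
            return i, False  # the flow already owns a slot
    for i in candidates:
        if tags[i] is None:
            tags[i] = flow_hash
            return i, False  # claim a free way in the set
    # The set is full: share the direct-mapped slot and report a collision.
    return flow_hash % len(tags), True
```

In a 16-slot table, hashes 3 and 19 would collide in a direct-mapped scheme (both map to slot 3), whereas here they land in slots 0 and 1 of the same set; a collision is only reported once all eight ways of the set are taken.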
Alex Vesker [Tue, 17 Jul 2018 08:34:27 +0000 (11:34 +0300)]
devlink: Add support for devlink-region access
Devlink region allows access to driver defined address regions.
Each device can create its supported address regions and register
them. A device which exposes a region will allow access to it
using devlink.
This support allows reading and dumping region snapshots as well
as presenting information such as region size and currently available
snapshots.
A snapshot represents a memory image of a region taken by the driver.
If a device collects a snapshot of an address region it can be later
exposed using devlink region read or dump commands.
This functionality allows for future analyses on the snapshots.
The dump command reads the full address space of a region or of a
snapshot, whereas the read command reads only a specific section of a
region/snapshot, indicated by an address and a length. Currently,
reading and dumping are supported only for a previously taken
snapshot ID.
New commands added:
devlink region show [ DEV/REGION ]
devlink region delete DEV/REGION snapshot SNAPSHOT_ID
devlink region dump DEV/REGION [ snapshot SNAPSHOT_ID ]
devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ]
address ADDRESS length LENGTH
Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Qiaobin Fu [Thu, 19 Jul 2018 16:07:18 +0000 (12:07 -0400)]
net:sched: add action inheritdsfield to skbedit
The new action inheritdsfield copies the DS field of
IPv4 and IPv6 packets into skb->priority. This enables
later classification of packets based on the DS field.
v4:
* Make tc use netlink helper functions
v3:
* Make flag represented in JSON output as a null value
v2:
* Align the output syntax with the input syntax
* Fix the style issues
Original idea by Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Qiaobin Fu <qiaobinf@bu.edu> Reviewed-by: Michel Machado <michel@digirati.com.br> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com>
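As a rough illustration of where the DS field lives in each header, the sketch below extracts it from raw IPv4 and IPv6 headers. The function names are hypothetical, and the final step (using the 6-bit DSCP, i.e. the DS byte shifted right by two to discard the ECN bits, as the priority) is an assumption about the kernel side rather than something this commit message states.

```python
def ds_field(packet):
    """Return the 8-bit DS/traffic-class byte of a raw IPv4 or IPv6 header."""
    version = packet[0] >> 4
    if version == 4:
        return packet[1]  # IPv4: second header byte (the former TOS byte)
    if version == 6:
        # IPv6: the traffic class straddles the low nibble of byte 0
        # and the high nibble of byte 1.
        return ((packet[0] & 0x0F) << 4) | (packet[1] >> 4)
    raise ValueError("not an IP packet")


def inherited_priority(packet):
    # Assumption: the priority is the DSCP, i.e. the DS byte without ECN bits.
    return ds_field(packet) >> 2
```

For example, an IPv4 header with TOS 0xb8 and an IPv6 header with traffic class 0xb8 both yield DSCP 46 (Expedited Forwarding) under this sketch.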
This patch adds support for the End.BPF action of the seg6local
lightweight tunnel. Functions from the BPF lightweight tunnel are
re-used in this patch. Example:
$ ip -6 route add fc00::18 encap seg6local action End.BPF endpoint
obj my_bpf.o sec my_func dev eth0
$ ip -6 route show fc00::18
fc00::18 encap seg6local action End.BPF endpoint my_bpf.o:[my_func]
dev eth0 metric 1024 pref medium
v2: - re-use of print_encap_bpf_prog instead of fprintf
- introduction of "endpoint" keyword for more consistency with
other parameters
Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
ipaddress: Fix and make consistent label match handling
Since commit 9516823051ce ("ipaddress: Improve print_linkinfo()"), we
return -1 instead of 0 when the ip-address(8) label does not match the
network device name. This causes a regression when trying to output the
IP address matching a label:
# ip addr add 192.168.192.1/24 dev lo label lo:1
# ip addr show label lo:1
<no output>
Filtering by label is a special case: print_linkinfo() returns 0 early
after matching only filter.ifindex and filter.up, if given, and not the
remaining fields in @filter. ipaddr_list_flush_or_save() then calls
print_selected_addrinfo() without calling print_link_stats().
Later, print_selected_addrinfo() calls print_addrinfo(), which finally
matches the IFA_LABEL attribute in the netlink buffer against
filter.label using ifa_label_match_rta().
On the other hand, there are three conditions checked in print_linkinfo()
to determine the label special case:
With 1), it is fine to check whether filtering by label is enabled by a
pattern given in @filter.label.
Since the label is IPv4-specific and AF_PACKET is for printing ip-link(8)
information (see ipaddr_link_list()::ipaddress.c as an example), checking
for AF_PACKET in 2) doesn't make much sense: it is better to defer these
checks to print_addrinfo(), which determines valid combinations before
calling ifa_label_match_rta() to finally match IFA_LABEL against the
pattern in filter.label.
For 3), we have the following call chain for the test case:
fnmatch(pattern, string, flags) ->
fnmatch(filter.label, name, 0) ->
fnmatch("lo:1", "lo", 0) == FNM_NOMATCH (1) or non-zero on error
To support the special case in print_linkinfo() for filtering by label, we
only need to check whether a label pattern is given in filter.label and
return 0 to skip print_link_stats() in ipaddr_list_flush_or_save(): the
actual filtering will be done in print_addrinfo().
Before commit 9516823051ce ("ipaddress: Improve print_linkinfo()"):
-------------------------------------------------------------------
$ ip addr sh label lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN \
group default qlen 1000
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
fnmatch("lo", "lo", 0) == 0
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
$ ip addr show label 'lo:*'
inet 192.168.192.1/24 scope global lo:1
valid_lft forever preferred_lft forever
$ ip addr sh label lo:1
inet 192.168.192.1/24 scope global lo:1
valid_lft forever preferred_lft forever
$ ip -4 addr sh label lo:1
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN \
group default qlen 1000
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
filter.family == AF_INET
inet 192.168.192.1/24 scope global lo:1
valid_lft forever preferred_lft forever
After this change is applied:
-----------------------------
$ ip/ip addr show label lo
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
$ ip/ip addr show label 'lo:*'
inet 192.168.192.1/24 scope global lo:1
valid_lft forever preferred_lft forever
$ ip/ip addr show label lo:1
inet 192.168.192.1/24 scope global lo:1
valid_lft forever preferred_lft forever
$ ip/ip -4 addr show label lo:1
inet 192.168.192.1/24 scope global lo:1
valid_lft forever preferred_lft forever
Note that we no longer show link information as we did previously:
we are filtering by "label" pattern, not showing by "dev".
Fixes: commit 9516823051ce ("ipaddress: Improve print_linkinfo()") Reported-by: Vincent Bernat <vincent@bernat.im> Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
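The fnmatch() results quoted in this message can be reproduced with Python's fnmatch module, which implements the same shell-style wildcard matching. The wrapper name label_matches is hypothetical; note that Python's fnmatchcase() takes (name, pattern), the reverse of the C fnmatch(pattern, string, flags) order used above.

```python
from fnmatch import fnmatchcase


def label_matches(pattern, label):
    """Shell-style match of an address label against a filter pattern."""
    # Python takes (name, pattern); C fnmatch() takes (pattern, string).
    return fnmatchcase(label, pattern)
```

Matching the pattern "lo:1" against the device name "lo" fails, which is why the name check in print_linkinfo() rejected the link, while "lo:*" and "lo:1" both match the label "lo:1" once the comparison is made against IFA_LABEL.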
David Ahern [Wed, 18 Jul 2018 02:39:06 +0000 (19:39 -0700)]
Merge branch 'bpf-btf' into iproute2-next
Daniel Borkmann says:
====================
Main part of this set is to: i) avoid strict af_alg kernel dependency,
ii) add loader support for bpf to bpf calls and iii) add btf loader
support with an option to annotate maps. For details please see the
individual patches. Thanks!
Daniel Borkmann [Tue, 17 Jul 2018 23:31:22 +0000 (01:31 +0200)]
bpf: implement btf handling and map annotation
Implement loading of the .BTF section from the object file and build up
an internal table for retrieving the key/value type ids related to maps
in the BPF program. The latter is done by setting up a struct btf_type
table.
One of the issues is that there's a disconnect between the data
types used in the map and struct bpf_elf_map, meaning the underlying
types are unknown from the map description. One way to overcome
this is to add an annotation such that the loader will recognize
the relation to both. BPF_ANNOTATE_KV_PAIR(map_foo, struct key,
struct val); has been added to the API that programs can use.
The loader will then pick the corresponding key/value type ids and
attach them to the maps for creation. This can later on be dumped via
bpftool for introspection.
Example with test_xdp_noinline.o from kernel selftests:
Daniel Borkmann [Tue, 17 Jul 2018 23:31:21 +0000 (01:31 +0200)]
bpf: implement bpf to bpf calls support
Implement missing bpf to bpf calls support. The loader will
recognize .text section and handle relocation entries that
are emitted by LLVM.
First step is processing of map related relocation entries
for .text section, and in a second step loader will copy .text
section into program section and adjust call instruction
offset accordingly.
Example with test_xdp_noinline.o from kernel selftests:
1) Every function as __attribute__ ((always_inline)), rest
left unchanged:
# ip -force link set dev lo xdp obj test_xdp_noinline.o sec xdp-test
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 xdpgeneric/id:233 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
[...]
# bpftool prog dump xlated id 233
[...]
1669: (2d) if r3 > r2 goto pc+4
1670: (79) r2 = *(u64 *)(r10 -136)
1671: (61) r2 = *(u32 *)(r2 +0)
1672: (63) *(u32 *)(r1 +0) = r2
1673: (b7) r0 = 1
1674: (95) exit <-- 1674 insns total
2) Every function as __attribute__ ((noinline)), rest
left unchanged:
Daniel Borkmann [Tue, 17 Jul 2018 23:31:20 +0000 (01:31 +0200)]
bpf: remove strict dependency on af_alg
Do not bail out when AF_ALG is not supported by the kernel; only
do so when a map is requested in object ns, where we're
calculating the hash. Otherwise, the loader can operate just
fine, so let's not fail early when it's not needed.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David Ahern <dsahern@gmail.com>
David Ahern [Wed, 18 Jul 2018 02:37:50 +0000 (19:37 -0700)]
Import btf.h from kernel headers
Import btf.h from kernel headers at commit 2aa4a3378ad0 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next")
which is the last sync point.
ipneigh: exclude NTF_EXT_LEARNED from default filter
NUD_NOARP entries are filtered out by default by iproute2.
We don't want NUD_NOARP with NTF_EXT_LEARNED flag filtered out.
This patch extends the default filter check for ip neigh show
to include the NTF_EXT_LEARNED flag.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Jakub Kicinski [Fri, 13 Jul 2018 22:54:51 +0000 (15:54 -0700)]
iplink: add support for reporting multiple XDP programs
Kernel now supports attaching XDP programs in the driver
and hardware at the same time. Print that information
correctly.
In case there are multiple programs attached kernel will
not provide IFLA_XDP_PROG_ID, so don't expect it to be
there (this also improves the printing for very old kernels
slightly, as it avoids unnecessary "prog/xdp" line).
In short mode preserve the current outputs but don't print
IDs if there are multiple.
6: netdevsim0: <BROADCAST,NOARP> mtu 1500 xdpoffload/id:11 qdisc [...]
and:
6: netdevsim0: <BROADCAST,NOARP> mtu 1500 xdpmulti qdisc [...]
ip link output will keep using prog/xdp prefix if only one program
is attached, but can also print multiple program lines:
David Ahern [Thu, 12 Jul 2018 00:51:46 +0000 (17:51 -0700)]
Merge branch 'tc-etf' into iproute2-next
Jesus Sanchez-Palencia says:
====================
fixes since v3:
- Add support for clock names with the "CLOCK_" prefix;
- Print clock name on print_opt();
- Use strcasecmp() instead of strncasecmp().
The ETF (earliest txtime first) qdisc was recently merged into net-next
[1], so this patchset adds support for it through the tc command line
tool.
An initial man page is also provided.
The first commit in this series is adding an updated version of
include/uapi/linux/pkt_sched.h and is not meant to be merged. It's
provided here just as a convenience for those who want to easily build
this patchset.
The "Earliest TxTime First" (ETF) queueing discipline allows precise
control of the transmission time of packets by providing a sorted
time-based scheduling of packets.
The syntax is:
tc qdisc add dev DEV parent NODE etf delta <DELTA>
clockid <CLOCKID> [offload] [deadline_mode]
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
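The "sorted time-based scheduling" idea can be sketched with a min-heap keyed on transmission time. This is a toy model, not the qdisc: the class name EtfQueue, the rule of rejecting packets whose txtime has already passed, and releasing a packet once the clock reaches txtime minus delta are assumptions made for illustration.

```python
import heapq


class EtfQueue:
    """Toy earliest-txtime-first queue."""

    def __init__(self, delta):
        self.delta = delta
        self.heap = []
        self.seq = 0  # tie-breaker keeps equal txtimes in arrival order

    def enqueue(self, txtime, packet, now):
        """Queue a packet; return False if its txtime is already in the past."""
        if txtime < now:
            return False
        heapq.heappush(self.heap, (txtime, self.seq, packet))
        self.seq += 1
        return True

    def dequeue(self, now):
        """Release the earliest packet once now >= txtime - delta."""
        if self.heap and now >= self.heap[0][0] - self.delta:
            return heapq.heappop(self.heap)[2]
        return None
```

With delta set to 5, a packet stamped for time 50 is held back at time 40 and released at time 45, regardless of the order in which packets were enqueued.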
Roi Dayan [Tue, 3 Jul 2018 12:54:32 +0000 (15:54 +0300)]
tc: Fix output of ip attributes
Example output is of tos and ttl.
Before this fix, the %x format caused the pointer to be printed
instead of the intended string built in the out variable.
Fixes: e28b88a464c4 ("tc: jsonify flower filter") Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Simon Horman [Fri, 6 Jul 2018 00:12:00 +0000 (17:12 -0700)]
tc: m_tunnel_key: Add tunnel option support to act_tunnel_key
Allow setting tunnel options using the act_tunnel_key action.
Options are expressed as class:type:data and multiple options
may be listed using a comma delimiter.
# ip link add name geneve0 type geneve dstport 0 external
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: \
flower indev eth0 \
ip_proto udp \
action tunnel_key \
set src_ip 10.0.99.192 \
dst_ip 10.0.99.193 \
dst_port 6081 \
id 11 \
geneve_opts 0102:80:00800022,0102:80:00800022 \
action mirred egress redirect dev geneve0
Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David Ahern <dsahern@gmail.com>
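The class:type:data format with a comma delimiter can be illustrated by a small parser. This is a sketch only: the function name parse_geneve_opts is hypothetical, and treating all three fields as hexadecimal is an assumption based on the example values above.

```python
def parse_geneve_opts(text):
    """Split 'class:type:data,...' into (class, type, data) tuples."""
    options = []
    for item in text.split(","):
        opt_class, opt_type, data = item.split(":")
        # Assumption: every field is written in hexadecimal.
        options.append((int(opt_class, 16), int(opt_type, 16), bytes.fromhex(data)))
    return options
```

Applied to the geneve_opts argument from the example, this yields two identical options with class 0x0102, type 0x80 and four bytes of data.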
Add support for configuration parameters set and show.
Each parameter can be either generic or driver-specific.
The user can retrieve data on these configuration parameters by devlink
param show command and can set new value to a configuration parameter
by devlink param set command.
The configuration parameters can be set in different configuration
modes:
runtime - set while driver is running, no reset required.
driverinit - applied while driver initializes, requires restarting
the driver with the devlink reload command.
permanent - written to device's non-volatile memory, hard reset
required to apply.
New commands added:
devlink dev param show [DEV name PARAMETER]
devlink dev param set DEV name PARAMETER value VALUE
cmode { permanent | driverinit | runtime }
This patch adds support for the new isolated port option which, if set,
would allow the isolated ports to communicate only with non-isolated
ports and the bridge device. The option can be set via the bridge or ip
link type bridge_slave commands, e.g.:
$ ip link set dev eth0 type bridge_slave isolated on
$ bridge link set dev eth0 isolated on
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@gmail.com>
Vlad Buslov [Wed, 20 Jun 2018 07:24:21 +0000 (10:24 +0300)]
tc: fix batch force option
When sending an accumulated compound command results in an error, check
the 'force' option before exiting. Move the return code check after putting
batch bufs and freeing iovs to prevent a memory leak. Break from the loop
instead of returning an error code, to allow cleanup at the end of the batch
function. Don't reset the ret code on each iteration.
Fixes: 485d0c6001c4 ("tc: Add batchsize feature for filter and actions") Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Chris Mi <chrism@mellanox.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch adds support for OUTPUT_MARK in xfrm state to exercise the
functionality added by kernel commit 077fbac405bf
("net: xfrm: support setting an output mark.").
Sample output-
(with mark and output-mark)
src 192.168.1.1 dst 192.168.1.2
proto esp spi 0x00004321 reqid 0 mode tunnel
replay-window 0 flag af-unspec
mark 0x10000/0x3ffff output-mark 0x20000
auth-trunc xcbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b211 96
enc cbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b233
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
(with mark only)
src 192.168.1.1 dst 192.168.1.2
proto esp spi 0x00004321 reqid 0 mode tunnel
replay-window 0 flag af-unspec
mark 0x10000/0x3ffff
auth-trunc xcbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b211 96
enc cbc(aes) 0x3ed0af408cf5dcbf5d5d9a5fa806b233
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
Patrick Talbert [Thu, 14 Jun 2018 13:46:57 +0000 (15:46 +0200)]
ipaddress: strengthen check on 'label' input
As mentioned in the ip-address man page, an address label must
be equal to the device name or prefixed by the device name
followed by a colon. Currently the only check on this input is
to see if the device name appears at the beginning of the label
string.
This commit adds an additional check to ensure that the label either
equals the device name or continues with a colon after it.
Signed-off-by: Patrick Talbert <ptalbert@redhat.com> Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
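The strengthened rule can be written down in a few lines. This is a sketch of the rule as described in the message, not the C code from the patch; the function name label_is_valid is hypothetical.

```python
def label_is_valid(label, dev):
    """A label must equal the device name or be the device name plus ':suffix'."""
    if not label.startswith(dev):
        return False
    rest = label[len(dev):]
    return rest == "" or rest.startswith(":")
```

The old prefix-only check would have accepted a label such as "eth01" on device "eth0"; with the additional condition it is rejected, while "eth0" and "eth0:1" remain valid.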
Hoang Le [Wed, 13 Jun 2018 04:09:56 +0000 (11:09 +0700)]
rdma: sync some IP headers with glibc
In commit 9a362cc71a45, a new userspace header
(i.e. rdma/rdma_user_cm.h -> linux/in6.h)
is included before the kernel space header
(i.e. utils.h -> resolv.h -> netinet/in.h).
This leaves some IP headers out of sync, and compilation fails
with an error about redefinition of some IP structs.
In this commit, just reorder the includes to keep them in sync.
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Hoang Le [Fri, 8 Jun 2018 02:19:28 +0000 (09:19 +0700)]
tipc: TIPC_NLA_LINK_NAME value pass on nesting entry TIPC_NLA_LINK
In commit 94f6a80 on net-next, the TIPC_NLA_LINK_NAME attribute is
retrieved and validated via the TIPC_NLA_LINK nesting entry in
tipc_nl_node_get_link().
According to that commit, the TIPC_NLA_LINK_NAME value passed via the
tipc link get command must follow the above hierarchy.
Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Since commit 049c58539f5d ("devlink: mnlg: Add support for extended ack")
devlink requires NETLINK_{CAP,EXT}_ACK. This prevents devlink from
working with older kernels that don't support these features.
host # ./devlink/devlink
Failed to connect to devlink Netlink
Fixes: 049c58539f5d ("devlink: mnlg: Add support for extended ack") Cc: Arkadi Sharshevsky <arkadis@mellanox.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com>
Nicolas Dichtel [Thu, 31 May 2018 14:28:48 +0000 (16:28 +0200)]
ip: IFLA_NEW_NETNSID/IFLA_NEW_IFINDEX support
Parse and display those attributes.
Example:
ip l a type dummy
ip netns add foo
ip monitor link&
ip l s dummy1 netns foo
Deleted 6: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 66:af:3a:3f:a0:89 brd ff:ff:ff:ff:ff:ff new-nsid 0 new-ifindex 6
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Nathan Harold [Wed, 30 May 2018 19:11:32 +0000 (12:11 -0700)]
iproute2: fix 'ip xfrm monitor all' command
Currently, calling 'ip xfrm monitor all' will
actually invoke the 'all-nsid' command because the
soft-match for 'all-nsid' occurs before the precise
match for 'all'. This patch rearranges the checks
so that the 'all' command, itself an alias for
invoking 'ip xfrm monitor' with no argument, can
be called consistent with the syntax for other ip
commands that accept an 'all'.
Signed-off-by: Nathan Harold <nharold@google.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
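The ordering problem is easy to see in a small model of abbreviation matching. The helper below mirrors the idea of iproute2's matches() (a command matches if it is a prefix of the keyword) but returns a boolean instead of the C convention of 0 on match; the two dispatch functions and their return strings are hypothetical.

```python
def matches(cmd, pattern):
    """True when cmd is an abbreviation (prefix) of pattern."""
    return pattern.startswith(cmd)


def dispatch_old(arg):
    # Soft match first: 'all' is a prefix of 'all-nsid' and gets captured here.
    if matches(arg, "all-nsid"):
        return "all-nsid"
    if arg == "all":
        return "all"
    return "other"


def dispatch_new(arg):
    # Exact match first, then the soft match.
    if arg == "all":
        return "all"
    if matches(arg, "all-nsid"):
        return "all-nsid"
    return "other"
```

In the old order "all" never reaches its own branch; after the checks are rearranged, "all" is handled exactly and abbreviations such as "all-n" still select 'all-nsid'.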
David Ahern [Fri, 1 Jun 2018 15:50:16 +0000 (08:50 -0700)]
iplink_vrf: Save device index from response for return code
A recent commit changed rtnl_talk_* to return the response message in
allocated memory so callers need to free it. The change to name_is_vrf
did not save the device index which is pointing to a struct inside the
now allocated and freed memory resulting in garbage getting returned
in some cases.
Fix by using a stack variable to save the return value and only set
it to ifi->ifi_index after all checks are done and before the answer
buffer is freed.
Fixes: 86bf43c7c2fdc ("lib/libnetlink: update rtnl_talk to support malloc buff at run time") Cc: Hangbin Liu <liuhangbin@gmail.com> Cc: Phil Sutter <phil@nwl.cc> Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
David Ahern [Wed, 30 May 2018 15:30:09 +0000 (08:30 -0700)]
ip route: print RTA_CACHEINFO if it exists
RTA_CACHEINFO can be sent for non-cloned routes. If the attribute is
present, print it. This allows route dumps to print, for example, the
expiry times that can exist on FIB entries.
The ip command would always look up the network device index
even when not necessary. This slows down operations like creating
lots of VLANs.
David reported the original issue; this is an alternative patch
that solves it in a slightly more general way.
Using iproute2 to create a bridge and add 4094 vlans to it can take from
2 to 3 *minutes*. The reason is the extraneous call to ll_name_to_index.
ll_name_to_index results in an ioctl(SIOCGIFINDEX) call which in turn
invokes dev_load. If the index does not exist, which it won't when
creating a new link, dev_load calls modprobe twice -- once for
netdev-NAME and again for NAME. This is unnecessary overhead for each
link create.
When ip link is invoked for a new device, there is no reason to
call ll_name_to_index for the new device. With this patch, creating
a bridge and adding 4094 vlans takes less than 3 *seconds*.
old:
# time ip -batch ip-vlan.batch
real 3m13.727s
user 0m0.076s
sys 0m1.959s
new:
# time ip -batch ip-vlan.batch
real 0m3.222s
user 0m0.044s
sys 0m1.777s
Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
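For a sense of scale, the two "real" timings above can be compared directly. The helper name seconds is hypothetical; the input values are the ones quoted in the old and new runs.

```python
def seconds(stamp):
    """Convert a time(1) value like '3m13.727s' to seconds."""
    minutes, rest = stamp.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


old = seconds("3m13.727s")  # wall-clock time before the patch
new = seconds("0m3.222s")   # wall-clock time after the patch
speedup = old / new
```

This works out to roughly a sixtyfold reduction in wall-clock time, while user and sys time barely change, which is consistent with the time having been spent waiting on the modprobe calls rather than in ip itself.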