Phil Sutter [Wed, 2 Mar 2016 18:19:49 +0000 (19:19 +0100)]
ip: align help text with manpage
Although the ip command accepts both "neighbor" and "neighbour" as
subcommand, I assume it's sufficient to list it in help text as just
"neigh" like ip.8 does.
Phil Sutter [Thu, 25 Feb 2016 12:07:35 +0000 (13:07 +0100)]
iprule: Align help text with man page synopsis
The help text was misleading: One could think it is possible to list
rules by selector, which would be nice but isn't. This change also
clarifies that 'ip rule' defaults to 'list' if no further arguments are
given.
Add IFLA_VF_TRUST message to trust the VF.
PF can accept some privileged operation from the trusted VF.
For example, ixgbe PF doesn't allow to enable VF promiscuous mode until
the VF is trusted because it may hurt performance.
To trust VF.
# ip link set dev eth0 vf 1 trust on
To untrust VF.
# ip link set dev eth0 vf 1 trust off
Roopa Prabhu [Sat, 20 Feb 2016 05:34:52 +0000 (21:34 -0800)]
bridge: add support for dynamic fdb entries
This patch is a follow up to the recently added
'static' fdb option.
It introduces a new option 'dynamic' which adds
dynamic fdb entries with NUD_REACHABLE.
$bridge fdb add 00:01:02:03:04:06 dev eth0 master dynamic
$bridge fdb show
00:01:02:03:04:06 dev eth0
This patch also documents all fdb types. Removes 'temp'
from usage message since it is now replaced by 'static'.
'temp' still works and is synonymous with static.
Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
htb: remove printing of a deprecated overhead value
Remove printing according to the previously used encoding of mpu and
overhead values within the tc_ratespec's mpu field. This encoding is
no longer being used as a separate 'overhead' field in the ratespec
structure has been introduced.
Export all the read-only values that get returned about a bridge port
such as the timers, the ids, designated_port and cost,
topology_change_ack and config_pending. For the bridge ids the
br_dump_bridge_id function is exported from iplink_bridge.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
netns: Fix an off-by-one strcpy() in netns_map_add().
netns_map_add() does a malloc of (sizeof (struct nsid_cache) +
strlen(name)) and then proceed with strcpy() of name into the
zero-length member at the end of the nsid_cache structure. The
nul-terminator is written outside of the allocated memory and may
overwrite the allocator's internal structure.
This can trigger a segmentation fault on i386 uclibc with names of size 8:
after the corruption occurs, the call to closedir() on netns_map_init()
crashes while freeing the DIR structure.
Here is the relevant valgrind output:
==1251== Memcheck, a memory error detector
==1251== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1251== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright
info
==1251== Command: ./ip netns
==1251==
==1251== Invalid write of size 1
==1251== at 0x4011975: strcpy (in
/usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==1251== by 0x8058B00: netns_map_add (ipnetns.c:181)
==1251== by 0x8058E2A: netns_map_init (ipnetns.c:226)
==1251== by 0x8058E79: do_netns (ipnetns.c:776)
==1251== by 0x804D9FF: do_cmd (ip.c:110)
==1251== by 0x804D814: main (ip.c:300)
It seems that I've made a mistake when I exported these, instead of a
space in the end I've put a newline character which is wrong and breaks
the single line output.
Fixes: 7d6bc3b87abad ("bonding: export 3ad actor and partner port state") Reported-by: Sam Tannous <stannous@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Roopa Prabhu [Wed, 27 Jan 2016 17:09:37 +0000 (09:09 -0800)]
bridge: support for static fdb entries
There is no intuitive option to add static fdb entries today.
'temp' seems to have a side effect of adding
'static' fdb entries. But the name and intent
of 'temp' does not say anything about it being static.
example:
bridge fdb add operates as follows:
$bridge fdb add 00:01:02:03:04:05 dev eth0 master
$bridge fdb add 00:01:02:03:04:06 dev eth0 master temp
$bridge fdb add 00:01:02:03:04:07 dev eth0 master local
$bridge fdb show
00:01:02:03:04:05 dev eth0 permanent
00:01:02:03:04:06 dev eth0 static
00:01:02:03:04:07 dev eth0 permanent
00:01:02:03:04:08 dev eth0 <<== dynamic, ageable learned mac
This patch adds a new bridge fdb type 'static' which
makes sure NUD_NOARP and NUD_REACHABLE is set for static
entries. This effectively is nothing but what 'temp'
does today. But the name 'temp' is misleading.
After the patch:
$bridge fdb add 00:01:02:03:04:06 dev eth0 master static
$bridge fdb show
00:01:02:03:04:06 dev eth0 static
'temp' could ideally be a dynamic mac that can age (ie just
NUD_REACHABLE). But, 'temp' sets 'NUD_NOARP' and 'NUD_REACHABLE'.
Too late to change 'temp' now. But, we are thinking of introduing a
'dynamic' keyword after this patch that only sets NUD_REACHABLE.
Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Daniel Borkmann [Sun, 7 Feb 2016 01:11:52 +0000 (02:11 +0100)]
tc, bpf: give some more hints wrt false relos
Provide some more hints to the user/developer when relos have been found
that don't point to ld64 imm instruction. Ran couple of times into relos
generated by clang [1], where the compiler tried to uninline inlined
functions with eBPF and emitted BPF_JMP | BPF_CALL opcodes. If this seems
the case, give a hint that the user should do a work-around to use
always_inline annotation.
Daniel Borkmann [Sun, 7 Feb 2016 01:11:51 +0000 (02:11 +0100)]
tc, bpf: improve verifier logging
With a bit larger, branchy eBPF programs f.e. already ~BPF_MAXINSNS/7 in
size, it happens rather quickly that bpf(2) rejects also valid programs
when only the verifier log buffer size we have in tc is too small.
Change that, so by default we don't do any logging, and only in error
case we retry with logging enabled. If we should fail providing a
reasonable dump of the verifier analysis, retry few times with a larger
log buffer so that we can at least give the user a chance to debug the
program.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.r.fastabend@intel.com>
Paolo Abeni [Thu, 28 Jan 2016 13:48:55 +0000 (14:48 +0100)]
geneve: add support for lwt tunnel creation and dst port selection
This change add the ability to create lwt/flow based/externally
controlled geneve device and to select the udp destination port used
by a full geneve tunnel.
Roopa Prabhu [Wed, 3 Feb 2016 00:53:40 +0000 (16:53 -0800)]
ipmonitor: match user option 'all' before 'all-nsid'
'ip monitor all' is broken on older kernels.
This patch fixes 'ip monitor all' to match
'all' and not 'all-nsid'.
It moves parsing arg 'all-nsid' to after parsing
'all'.
Before:
$ip monitor all
NETLINK_LISTEN_ALL_NSID: Protocol not available
After:
$ip monitor all
[NEIGH]Deleted 10.0.0.1 dev eth1 lladdr c4:54:44:4f:b2:dd STALE
Fixes: 449b824ad196 ("ipmonitor: allows to monitor in several netns") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Daniel Borkmann [Thu, 21 Jan 2016 23:46:28 +0000 (00:46 +0100)]
tc, bpf: make sure relo is in relation with map section
Add a test that symbol from relocation entry is actually related
to map section and bail out with an error message if it's not the
case; in relation to [1].
[1] https://llvm.org/bugs/show_bug.cgi?id=26243
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>
Zhang Shengju [Thu, 21 Jan 2016 02:23:49 +0000 (02:23 +0000)]
ip-link: remove warning message
the warning was:
iproute.c:301:12: warning: 'val' may be used uninitialized in this
function [-Wmaybe-uninitialized]
features &= ~RTAX_FEATURE_ECN;
^
iproute.c:575:10: note: 'val' was declared here
__u32 val;
^
Lorenzo Colitti [Fri, 8 Jan 2016 08:32:37 +0000 (17:32 +0900)]
ss: support closing inet sockets via SOCK_DESTROY.
This patch adds a -K / --kill option to ss that attempts to
forcibly close matching sockets using SOCK_DESTROY.
Because ss typically prints sockets instead of acting on them,
and because the kernel only supports forcibly closing some types
of sockets, the output of -K is as follows:
- If closing the socket succeeds, the socket is printed.
- If the kernel does not support forcibly closing this type of
socket (e.g., if it's a UDP socket, or a TIME_WAIT socket),
the socket is silently skipped.
- If an error occurs (e.g., permission denied), the error is
reported and ss exits.
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Lorenzo Colitti [Fri, 8 Jan 2016 08:32:36 +0000 (17:32 +0900)]
libnetlink: don't print NETLINK_SOCK_DIAG errors in rtnl_talk
This change is a no-op, as currently no code uses rtnl_talk on
NETLINK_SOCK_DIAG_BY_FAMILY sockets. It is needed to suppress
spurious errors when using SOCK_DESTROY via rtnl_talk.
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Daniel Borkmann [Tue, 12 Jan 2016 01:03:08 +0000 (02:03 +0100)]
tc, bpf: more header checks on loading elf
eBPF llvm backend can support different BPF formats, make sure the object
we're trying to load matches with regards to endiannes and while at it, also
check for other attributes related to BPF ELFs.
# llc --version
LLVM (http://llvm.org/):
LLVM version 3.8.0svn
Optimized build.
Built Jan 9 2016 (02:08:10).
Default target: x86_64-unknown-linux-gnu
Host CPU: ivybridge
Daniel Borkmann [Tue, 12 Jan 2016 01:03:07 +0000 (02:03 +0100)]
tc, bpf: check section names and type everywhere
When extracting sections, we better check for name and type. Noticed
that some llvm versions emit .strtab and .shstrtab (e.g. saw it on pre
3.7), while more recent ones only seem to emit .strtab. Thus, make sure
we get the right sections.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>
Richard Alpe [Tue, 5 Jan 2016 09:57:40 +0000 (10:57 +0100)]
tipc: add peer remove functionality
This enables a user to remove an offline peer from the kernel data
structures. This could for example be useful when deliberately scaling
in peer nodes in a cloud environment.
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com>
Jamal Hadi Salim [Sun, 10 Jan 2016 19:56:31 +0000 (14:56 -0500)]
tc: flower no need to specify the ethertype
since all tc classifiers are required to specify ethertype as part of grammar
By not allowing eth_type to be specified we remove contradiction for
example when a user specifies:
tc filter add ... priority xxx protocol ip flower eth_type ipv6
This patch removes that contradiction
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Richard Alpe [Tue, 5 Jan 2016 09:57:40 +0000 (10:57 +0100)]
tipc: add peer remove functionality
This enables a user to remove an offline peer from the kernel data
structures. This could for example be useful when deliberately scaling
in peer nodes in a cloud environment.
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com>
Bjørn Mork [Mon, 4 Jan 2016 09:58:05 +0000 (10:58 +0100)]
iplink: support show and set of "addrgenmode random"
"random" is a new IPv6 addrgenmode, enabling "stable_secret" type
addresses with an auto-generated secret.
$ ip link set eth0 addrgenmode random
$ ip -d link show dev eth0
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000
link/ether 00:21:86:a3:25:7d brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode random
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Bjørn Mork <bjorn@mork.no>
Paolo Abeni [Fri, 18 Dec 2015 09:50:38 +0000 (10:50 +0100)]
lwtunnel: implement support for ip6 encap
Currently ip6 encap support for lwtunnel is missing.
This patch implement it, mostly duplicating the ipv4 parts.
Also be sure to insert a space after the encap type, when
showing lwtunnel, to avoid the tunnel type and the following
argument being merged into a single word.
Paolo Abeni [Tue, 15 Dec 2015 11:18:04 +0000 (12:18 +0100)]
lwtunnel: fix argument parsing
Currently parse_encap_ip() does not update correctly argv/argc;
if multiple lwtunnel arguments are provided, the parsing fails after
the first one, i.e.
ip route add 172.16.101.0/24 dev vxlan1 encap ip id 42 dst 192.168.255.1
fails with:
Error: either "to" is duplicate, or "dst" is a garbage.
This commit addresses the issue, stepping to next argument at each iteration
of the parsing loop.
Fixes: 1e5293056a02 ("lwtunnel: Add encapsulation support to ip route") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tom Herbert [Mon, 30 Nov 2015 22:57:28 +0000 (14:57 -0800)]
ila: Add support for ILA lwtunnels
This patch:
- Adds a utility function for parsing a 64 bit address
- Adds a utility function for converting a 64 bit address to ASCII
- Adds and ILA encap type in lwt tunnels