Phil Sutter [Thu, 24 Aug 2017 09:51:50 +0000 (11:51 +0200)]
lib/ll_map: Choose size of new cache items at run-time
Instead of having a fixed buffer of 16 bytes for the interface name,
tailor size of new ll_cache entry using the interface name's actual
length. This also makes sure the following call to strcpy() is safe.
Phil Sutter [Thu, 24 Aug 2017 09:51:49 +0000 (11:51 +0200)]
tc/m_xt: Fix for potential string buffer overflows
- Use strncpy() when writing to target->t->u.user.name and make sure the
final byte remains untouched (xtables_calloc() set it to zero).
- 'tname' length sanitization was completely wrong: If it's length
exceeded the 16 bytes available in 'k', passing a length value of 16
to strncpy() would overwrite the previously NULL'ed 'k[15]'. Also, the
sanitization has to happen if 'tname' is exactly 16 bytes long as
well.
Phil Sutter [Thu, 24 Aug 2017 09:51:48 +0000 (11:51 +0200)]
lnstat_util: Simplify alloc_and_open() a bit
Relying upon callers and using unsafe strcpy() is probably not the best
idea. Aside from that, using snprintf() allows to format the string for
lf->path in one go.
Phil Sutter [Thu, 24 Aug 2017 09:51:47 +0000 (11:51 +0200)]
lib/inet_proto: Review inet_proto_{a2n,n2a}()
The original intent was to make sure strings written by those functions
are NUL-terminated at all times, though it was suggested to get rid of
the 15 char protocol name limit as well which this patch accomplishes.
In addition to that, simplify inet_proto_a2n() a bit: Use the error
checking in get_u8() to find out whether passed 'buf' contains a valid
decimal number instead of checking the first character's value manually.
Phil Sutter [Thu, 24 Aug 2017 09:51:46 +0000 (11:51 +0200)]
lib/fs: Fix format string in find_fs_mount()
A field width of 4096 allows fscanf() to store that amount of characters
into the given buffer, though that doesn't include the terminating NULL
byte. Decrease the value by one to leave space for it.
Phil Sutter [Thu, 24 Aug 2017 09:51:45 +0000 (11:51 +0200)]
ipntable: Avoid memory allocation for filter.name
The original issue was that filter.name might end up unterminated if
user provided string was too long. But in fact it is not necessary to
copy the commandline parameter at all: just make filter.name point to it
instead.
ss: fix help/man TCP-STATE description for listening
There's some misleading information in --help and ss(8) manpage about
TCP-STATE named 'listen'.
ss doesn't know such a state, but it knows 'listening' state.
$ ss -tua state listen
ss: wrong state name: listen
$ ss -tua state listening
[...]
Addresses: https://bugs.debian.org/872990 Reported-by: Pavel Lyulchenko <p.lyulchenko@gmail.com> Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Phil Sutter [Mon, 21 Aug 2017 16:36:52 +0000 (18:36 +0200)]
devlink: Check return code of strslashrsplit()
This function shouldn't fail because all callers of
__dl_argv_handle_port() make sure the passed string contains enough
slashes already, but better make sure if this changes in future the
function won't access uninitialized data.
Phil Sutter [Mon, 21 Aug 2017 09:27:00 +0000 (11:27 +0200)]
iplink_can: Prevent overstepping array bounds
can_state_names array contains at most CAN_STATE_MAX fields, so allowing
an index to it to be equal to that number is wrong. While here, also
make sure the array is indeed that big so nothing bad happens if
CAN_STATE_MAX ever increases.
Phil Sutter [Thu, 17 Aug 2017 17:09:31 +0000 (19:09 +0200)]
tc/m_gact: Drop dead code
The use of 'ok' variable in parse_gact() is ineffective: The second
conditional increments it either if *argv is 'gact' or if
parse_action_control() doesn't fail (in which case exit() is called).
So this is effectively an unconditional increment and since no decrement
happens anywhere, all remaining checks for 'ok != 0' can be dropped.
Phil Sutter [Thu, 17 Aug 2017 17:09:27 +0000 (19:09 +0200)]
iproute: Fix for missing 'Oifs:' display
Covscan complained about dead code but after reading it, I assume the
author's intention was to prefix the interface list with 'Oifs: '.
Initializing first to 1 and setting it to 0 after above prefix was
printed should fix it.
Leon Romanovsky [Sun, 20 Aug 2017 09:58:22 +0000 (12:58 +0300)]
rdma: Add basic infrastructure for RDMA tool
RDMA devices are cross-functional devices from one side,
but very tailored for the specific markets from another.
Such diversity caused to spread of RDMA related configuration
across various tools, e.g. devlink, ip, ethtool, ib specific and
vendor specific solutions.
This patch adds ability to fill device and port information
by reading RDMA netlink.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Phil Sutter [Thu, 17 Aug 2017 17:09:30 +0000 (19:09 +0200)]
iproute_lwtunnel: csum_mode value checking was ineffective
ila_csum_name2mode() returning -1 on error but being declared as
returning __u8 doesn't make much sense. Change the code to correctly
detect this issue. Checking for __u8 overruns shouldn't be necessary
though since ila_csum_name2mode() return values are well-defined.
Phil Sutter [Thu, 17 Aug 2017 17:09:31 +0000 (19:09 +0200)]
examples: Some shell fixes to cbq.init
This addresses the following issues:
- $@ is an array, so don't use it in quoted strings - use $* instead.
- Add missing quotes to components of [ ] expressions. These are not
strictly necessary since the output of 'wc -l' should be a single word
only, but in case of errors, bash prints "integer expression expected"
instead of "too many arguments".
- Use -print0/-0 when piping from find to xargs to allow for filenames
which contain whitespace.
- Quote arguments to 'eval' to prevent word-splitting.
Daniel Borkmann [Wed, 9 Aug 2017 22:15:41 +0000 (00:15 +0200)]
bpf: unbreak libelf linkage for bpf obj loader
Commit 69fed534a533 ("change how Config is used in Makefile's") moved
HAVE_MNL specific CFLAGS/LDLIBS for building with libmnl out of the
top level Makefile into sub-Makefiles. However, it also removed the
HAVE_ELF specific CFLAGS/LDLIBS entirely, which breaks the BPF object
loader for tc and ip with "No ELF library support compiled in." despite
having libelf detected in configure script. Fix it similarly as in 69fed534a533 for HAVE_ELF.
Fixes: 69fed534a533 ("change how Config is used in Makefile's") Reported-by: Jeffrey Panneman <jeffrey.panneman@tno.nl> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
David Ahern [Wed, 9 Aug 2017 15:43:27 +0000 (08:43 -0700)]
lib: Dump ext-ack string by default
In time, errfn can be implemented for link, route, etc commands to
give a much more detailed response (e.g., point to the attribute
that failed). Doing so is much more complicated to process the
message and convert attribute ids to names.
In any case the error string returned by the kernel should be dumped
to the user, so make that happen now.
The ikey and okey value are normal u32 values. The input accepts
them in dotted, hex or decimal form. For output, hex seems like
the best form since they are not really addresses.
Suggested-by: Christian Langrock <christian.langrock@secunet.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The recent LIBMNL changes was made more difficult to debug because
of how Config is handle in clean make. The Config file is generated
by top level make, but since it is not recursive, the values generated
would not be visible on a clean make.
The change is to not include Config in top level make, and move
all the conditionals down into sub makefiles. Not ideal, but beter
than going full autoconf route. Or forcing separate configure
step.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
According to the IPv4 behavior of 'ip' it should be possible
to omit the arguments for local and remote address.
Without this patch omitting these parameters would lead to
uninitialized memory being interpreted as IPv6 addresses.
Reported-by: Christian Langrock <christian.langrock@secunet.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When ip netns {add|delete} is first run, it bind-mounts /var/run/netns
on top of itself, then marks it as shared. However, if there are already
bind-mounts in the directory from other tools, these would not be
propagated. Fix this by recursively bind-mounting.
Signed-off-by: Casey Callendrello <casey.callendrello@coreos.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
iproute: Add support for extended ack to rtnl_talk
Add support for extended ack error reporting via libmnl.
Add a new function rtnl_talk_extack that takes a callback as an input
arg. If a netlink response contains extack attributes, the callback is
is invoked with the the err string, offset in the message and a pointer
to the message returned by the kernel.
If iproute2 is built without libmnl, it will still work but
extended error reports from kernel will not be available.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Phil Sutter [Tue, 1 Aug 2017 16:36:11 +0000 (18:36 +0200)]
Really fix get_addr() and get_prefix() error messages
Both functions take the desired address family as a parameter. So using
that to notify the user what address family was expected is correct,
unlike using dst->family which will tell the user only what address
family was specified.
The situation which commit 334af76143368 tried to fix was when 'ip'
would accept addresses from multiple families. In that case, the family
parameter is set to AF_UNSPEC so that get_addr_1() may accept any valid
address.
This patch introduces a wrapper around family_name() which returns the
string "any valid" for AF_UNSPEC instead of the three question marks
unsuitable for use in error messages.
Tests for AF_UNSPEC:
| # ip a a 256.10.166.1/24 dev d0
| Error: any valid prefix is expected rather than "256.10.166.1/24".
| # ip neighbor add proxy 2001:db8::g dev d0
| Error: any valid address is expected rather than "2001:db8::g".
Tests for explicit address family:
| # ip -6 addrlabel add prefix 1.1.1.1/24 label 123
| Error: inet6 prefix is expected rather than "1.1.1.1/24".
| # ip -4 addrlabel add prefix dead:beef::1/24 label 123
| Error: inet prefix is expected rather than "dead:beef::1/24".
Reported-by: Jaroslav Aster <jaster@redhat.com> Fixes: 334af76143368 ("fix get_addr() and get_prefix() error messages") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Wed, 2 Aug 2017 12:57:56 +0000 (14:57 +0200)]
bpf: Make bytecode-file reading a little more robust
bpf_parse_string() will now correctly handle:
- Extraneous whitespace,
- OPs on multiple lines and
- overlong file names.
The added feature of allowing to have OPs on multiple lines (like e.g.
tcpdump prints them) is rather a side effect of fixing detection of
malformed bytecode files having random content on a second line, like
e.g.:
Hangbin Liu [Thu, 27 Jul 2017 09:44:15 +0000 (17:44 +0800)]
utils: return default family when rtm_family is not RTNL_FAMILY_IPMR/IP6MR
When we get a multicast route, the rtm_type is RTN_MULTICAST, but the
rtm_family may be AF_INET. If we only check the type with RTNL_FAMILY_IPMR,
we will get malformed address. e.g.
+ ip -4 route add multicast 172.111.1.1 dev em1 table main
Before fix:
+ ip route list type multicast table main
multicast ac6f:101:800:400:400:0:3c00:0 dev em1 scope link
After fix:
+ ip route list type multicast table main
multicast 172.111.1.1 dev em1 scope link
Fixes: 56e3eb4c3400 ("ip: route: fix multicast route dumps") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Phil Sutter <phil@nwl.cc>
ip netns accepts invalid input as namespace name like an empty string or a
string longer than the maximum file name length.
Check that the netns name is not empty and less than or equal to NAME_MAX.
Ability to change geneve device attributes was added to kernel through
commit 5b861f6baa3a ("geneve: add rtnl changelink support"), however one
cannot do the same through ip-link(8) command. Changing the allowed
geneve device attributes using 'ip link set <geneve_name> type geneve id
<geneve_id> <allowed_attributes>' currently fails with 'operation not
supported' error. This patch adds support for it.
Daniel Borkmann [Sat, 22 Jul 2017 23:22:18 +0000 (01:22 +0200)]
bpf: improve error reporting around tail calls
Currently, it's still quite hard to figure out if a prog passed the
verifier, but later gets rejected due to different tail call ownership.
Figure out whether that is the case and provide appropriate error
messages to the user.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
In the presence of firewalls which improperly block ICMP Unreachable
(including Fragmentation Required) messages, Path MTU Discovery is
prevented from working.
The workaround is to handle IPv4 payloads opaquely, ignoring the DF
bit.
Kernel commit 22a59be8b7693eb2d0897a9638f5991f2f8e4ddd ("net: ipv4:
Add ability to have GRE ignore DF bit in IPv4 payloads") is
complemented by this user-space changeset which exposes control of
this setting.
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>
ip netns keeps track of created namespaces with bind mounts named
/var/run/netns/<namespace>. No input sanitization is done, allowing creation and
deletion of files relatives to /var/run/netns or, if the path is non existent or
invalid, allows to create "untracked" namespaces (invisible to the tool).
This commit denies creation or deletion of namespaces with names contaning
"/" or matching exactly "." or "..".
Daniel Borkmann [Mon, 17 Jul 2017 15:18:52 +0000 (17:18 +0200)]
bpf: dump id/jited info for cls/act programs
Make use of TCA_BPF_ID/TCA_ACT_BPF_ID that we exposed and print the ID
of the programs loaded and use the new BPF_OBJ_GET_INFO_BY_FD command
for dumping further information about the program, currently whether
the attached program is jited.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Daniel Borkmann [Mon, 17 Jul 2017 15:18:51 +0000 (17:18 +0200)]
bpf: support loading map in map from obj
Add support for map in map in the loader and add a small example program.
The outer map uses inner_id to reference a bpf_elf_map with a given ID
as the inner type. Loading maps is done in three passes, i) all non-map
in map maps are loaded, ii) all map in map maps are loaded based on the
inner_id map spec of a non-map in map with corresponding id, and iii)
related inner maps are attached to the map in map with given inner_idx
key. Pinned objetcs are assumed to be managed externally, so they are
only retrieved from BPF fs.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Daniel Borkmann [Mon, 17 Jul 2017 15:18:50 +0000 (17:18 +0200)]
bpf: remove obsolete samples
Remove old samples that have been added in pre BPF fs days which were
using file descriptor passing. It's long obsolete and not encouraged
to use this method given BPF fs is the default way like in the other
samples.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch extends route get to support mpls specific
route attributes like RTA_NEWDST.
Input:
RTA_DST - input label
RTA_NEWDST - labels in packet for multipath selection
By default the getroute handler returns matched
nexthop label, via and oif
With fibmatch keyword (RTM_F_FIB_MATCH flag), full matched
route is returned.
example:
$ip -f mpls route show
101
nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
201
nexthop as to 202/203 via inet6 2001:db8:2::2 dev virt1-2
nexthop as to 402/403 via inet6 2001:db8:12::2 dev virt1-12
$ip -f mpls route get 103
RTNETLINK answers: Network is unreachable
$ip -f mpls route get 101
101 as to 102/103 via inet 172.16.2.2 dev virt1-2
$ip -f mpls route get as to 302/303 101
101 as to 302/303 via inet 172.16.12.2 dev virt1-12
$ip -f mpls route get fibmatch 103
RTNETLINK answers: Network is unreachable
$ip -f mpls route get fibmatch 101
101
nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
Daniel Borkmann [Tue, 27 Jun 2017 00:48:36 +0000 (02:48 +0200)]
bpf: indicate lderr when bpf_apply_relo_data fails
When LLVM wrongly generates a rodata relo entry (llvm BZ #33599),
then just bail out instead of probing for prog w/o reloc, which
will fail in this case anyway.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>