Julien Fortin [Thu, 17 Aug 2017 17:35:52 +0000 (10:35 -0700)]
ip: ipaddress.c: add support for json output
This patch converts all output (mostly fprintfs) to the new ip_print api
which handle both regular and json output.
Initialize a json_writer and open an array object if -json was specified.
Note that the JSON attribute naming follows the NETLINK_ATTRIBUTE naming.
In many places throughout the code, IP, matches integer values with
hardcoded strings tables, such as link mode, link operstate or link
family.
In JSON context, this will result in a named string field. In the
very unlikely event that the requested index is out of bound, IP
displays the raw integer value. For JSON context this result in
having a different integer field example bellow:
The "_index" suffix is open to discussion and it is something that I came
up with. The bottom line is that you can't have a string field that may
become an int field in specific cases. Programs written in strongly type
languages (like C) might break if they are expecting a string value and
got an integer instead. We don't want to confuse anybody or make the code
even more complicated handling these specifics cases.
Hence the extra "_index" field that is easy to check for and deal with.
JSON schema, followed by live example:
Live config used:
$ ip link add dev vxlan42 type vxlan id 42
$ ip link add dev bond0 type bond
$ ip link add name swp1.50 link swp1 type vlan id 50
$ ip link add dev br0 type bridge
$ ip link set dev vxlan42 master br0
$ ip link set dev bond0 master br0
$ ip link set dev swp1.50 master br0
$ ip link set dev br0 up
$ ip -d link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode
DEFAULT group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0
addrgenmode eui64
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 1000
link/ether 08:00:27:db:31:88 brd ff:ff:ff:ff:ff:ff promiscuity 0
addrgenmode eui64
3: swp1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
link/ether 08:00:27:5b:b1:75 brd ff:ff:ff:ff:ff:ff promiscuity 0
addrgenmode eui64
10: vxlan42: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master br0 state
DOWN mode DEFAULT group default
link/ether 4a:d9:91:42:a2:d2 brd ff:ff:ff:ff:ff:ff promiscuity 1
vxlan id 42 srcport 0 0 dstport 8472 ageing 300
bridge_slave state disabled priority 8 cost 100 hairpin off guard off
root_block off fastleave off learning on flood on port_id 0x8001 port_no
0x1 designated_port 32769 designated_cost 0 designated_bridge
8000.8:0:27:5b:b1:75 designated_root 8000.8:0:27:5b:b1:75 hold_timer
0.00 message_age_timer 0.00 forward_delay_timer 0.00
topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off
mcast_router 1 mcast_fast_leave off mcast_flood on neigh_suppress off
addrgenmode eui64
11: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop master br0
state DOWN mode DEFAULT group default
link/ether e2:aa:7b:17:c5:14 brd ff:ff:ff:ff:ff:ff promiscuity 1
bond mode 802.3ad miimon 100 updelay 0 downdelay 0 use_carrier 1
arp_interval 0 arp_validate none arp_all_targets any primary_reselect
always fail_over_mac none xmit_hash_policy layer3+4 resend_igmp 1
num_grat_arp 1 all_slaves_active 0 min_links 1 lp_interval 1
packets_per_slave 1 lacp_rate fast ad_select stable ad_actor_sys_prio
65535 ad_user_port_key 0 ad_actor_system 00:00:00:00:00:00
bridge_slave state disabled priority 8 cost 100 hairpin off guard off
root_block off fastleave off learning on flood on port_id 0x8002 port_no
0x2 designated_port 32770 designated_cost 0 designated_bridge
8000.8:0:27:5b:b1:75 designated_root 8000.8:0:27:5b:b1:75 hold_timer
0.00 message_age_timer 0.00 forward_delay_timer 0.00
topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off
mcast_router 1 mcast_fast_leave off mcast_flood on neigh_suppress off
addrgenmode eui64
12: swp1.50@swp1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop master
br0 state DOWN mode DEFAULT group default
link/ether 08:00:27:5b:b1:75 brd ff:ff:ff:ff:ff:ff promiscuity 1
vlan protocol 802.1Q id 50 <REORDER_HDR>
bridge_slave state disabled priority 8 cost 100 hairpin off guard off
root_block off fastleave off learning on flood on port_id 0x8003 port_no
0x3 designated_port 32771 designated_cost 0 designated_bridge
8000.8:0:27:5b:b1:75 designated_root 8000.8:0:27:5b:b1:75 hold_timer
0.00 message_age_timer 0.00 forward_delay_timer 0.00
topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off
mcast_router 1 mcast_fast_leave off mcast_flood on neigh_suppress off
addrgenmode eui64
13: br0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state
DOWN mode DEFAULT group default
link/ether 08:00:27:5b:b1:75 brd ff:ff:ff:ff:ff:ff promiscuity 0
bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time
30000 stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1Q
bridge_id 8000.8:0:27:5b:b1:75 designated_root 8000.8:0:27:5b:b1:75
root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0
hello_timer 0.00 tcn_timer 0.00 topology_change_timer 0.00
gc_timer 244.44 vlan_default_pvid 1 vlan_stats_enabled 0 group_fwd_mask 0
group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1
mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 4096
mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2
mcast_last_member_interval 100 mcast_membership_interval 26000
mcast_querier_interval 25500 mcast_query_interval 12500
mcast_query_response_interval 1000 mcast_startup_query_interval 3125
mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1
nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 addrgenmode
eui64
Julien Fortin [Thu, 17 Aug 2017 17:35:51 +0000 (10:35 -0700)]
ip: ip_print: add new API to print JSON or regular format output
To avoid code duplication and have a ligther impact on most of the files,
these functions were made to handle both stdout (FP context) or JSON
output. Using this api, the changes are easier to read and the code
stays as compact as possible.
includes json_writer.h in ip_common.h to make the lib/json_writer.c
functions available to the new "ip_print" api.
Daniel Borkmann [Wed, 9 Aug 2017 22:15:41 +0000 (00:15 +0200)]
bpf: unbreak libelf linkage for bpf obj loader
Commit 69fed534a533 ("change how Config is used in Makefile's") moved
HAVE_MNL specific CFLAGS/LDLIBS for building with libmnl out of the
top level Makefile into sub-Makefiles. However, it also removed the
HAVE_ELF specific CFLAGS/LDLIBS entirely, which breaks the BPF object
loader for tc and ip with "No ELF library support compiled in." despite
having libelf detected in configure script. Fix it similarly as in 69fed534a533 for HAVE_ELF.
Fixes: 69fed534a533 ("change how Config is used in Makefile's") Reported-by: Jeffrey Panneman <jeffrey.panneman@tno.nl> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
David Ahern [Wed, 9 Aug 2017 15:43:27 +0000 (08:43 -0700)]
lib: Dump ext-ack string by default
In time, errfn can be implemented for link, route, etc commands to
give a much more detailed response (e.g., point to the attribute
that failed). Doing so is much more complicated to process the
message and convert attribute ids to names.
In any case the error string returned by the kernel should be dumped
to the user, so make that happen now.
The ikey and okey value are normal u32 values. The input accepts
them in dotted, hex or decimal form. For output, hex seems like
the best form since they are not really addresses.
Suggested-by: Christian Langrock <christian.langrock@secunet.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The recent LIBMNL changes was made more difficult to debug because
of how Config is handle in clean make. The Config file is generated
by top level make, but since it is not recursive, the values generated
would not be visible on a clean make.
The change is to not include Config in top level make, and move
all the conditionals down into sub makefiles. Not ideal, but beter
than going full autoconf route. Or forcing separate configure
step.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
According to the IPv4 behavior of 'ip' it should be possible
to omit the arguments for local and remote address.
Without this patch omitting these parameters would lead to
uninitialized memory being interpreted as IPv6 addresses.
Reported-by: Christian Langrock <christian.langrock@secunet.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
tc actions: Improved batching and time filtered dumping
dump more than TCA_ACT_MAX_PRIO actions per batch when the kernel
supports it.
Introduced keyword "since" for time based filtering of actions.
Some example (we have 400 actions bound to 400 filters); at
installation time. Using updated when tc setting the time of
interest to 120 seconds earlier (we see 400 actions):
prompt$ hackedtc actions ls action gact since 120000| grep index | wc -l
400
go get some coffee and wait for > 120 seconds and try again:
prompt$ hackedtc actions ls action gact since 120000 | grep index | wc -l
0
Lets see a filter bound to one of these actions:
....
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 2 success 1)
match 7f000002/ffffffff at 12 (success 1 )
action order 1: gact action pass
random type none pass val 0
index 23 ref 2 bind 1 installed 1145 sec used 802 sec
Action statistics:
Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
...
that coffee took long, no? It was good.
Now lets ping -c 1 127.0.0.2, then run the actions again:
prompt$ hackedtc actions ls action gact since 120 | grep index | wc -l
1
More details please:
prompt$ hackedtc -s actions ls action gact since 120000
action order 0: gact action pass
random type none pass val 0
index 23 ref 2 bind 1 installed 1270 sec used 30 sec
Action statistics:
Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
And the filter?
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 4 success 2)
match 7f000002/ffffffff at 12 (success 2 )
action order 1: gact action pass
random type none pass val 0
index 23 ref 2 bind 1 installed 1324 sec used 84 sec
Action statistics:
Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
When ip netns {add|delete} is first run, it bind-mounts /var/run/netns
on top of itself, then marks it as shared. However, if there are already
bind-mounts in the directory from other tools, these would not be
propagated. Fix this by recursively bind-mounting.
Signed-off-by: Casey Callendrello <casey.callendrello@coreos.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
iproute: Add support for extended ack to rtnl_talk
Add support for extended ack error reporting via libmnl.
Add a new function rtnl_talk_extack that takes a callback as an input
arg. If a netlink response contains extack attributes, the callback is
is invoked with the the err string, offset in the message and a pointer
to the message returned by the kernel.
If iproute2 is built without libmnl, it will still work but
extended error reports from kernel will not be available.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Ido Schimmel [Thu, 3 Aug 2017 06:13:55 +0000 (09:13 +0300)]
iproute: Display offload indication per-nexthop
Since kernel commit 475abbf1ef67 ("ipv4: fib: Set offload indication
according to nexthop flags") offload indication is reported on a
per-nexthop basis.
Phil Sutter [Tue, 1 Aug 2017 16:36:11 +0000 (18:36 +0200)]
Really fix get_addr() and get_prefix() error messages
Both functions take the desired address family as a parameter. So using
that to notify the user what address family was expected is correct,
unlike using dst->family which will tell the user only what address
family was specified.
The situation which commit 334af76143368 tried to fix was when 'ip'
would accept addresses from multiple families. In that case, the family
parameter is set to AF_UNSPEC so that get_addr_1() may accept any valid
address.
This patch introduces a wrapper around family_name() which returns the
string "any valid" for AF_UNSPEC instead of the three question marks
unsuitable for use in error messages.
Tests for AF_UNSPEC:
| # ip a a 256.10.166.1/24 dev d0
| Error: any valid prefix is expected rather than "256.10.166.1/24".
| # ip neighbor add proxy 2001:db8::g dev d0
| Error: any valid address is expected rather than "2001:db8::g".
Tests for explicit address family:
| # ip -6 addrlabel add prefix 1.1.1.1/24 label 123
| Error: inet6 prefix is expected rather than "1.1.1.1/24".
| # ip -4 addrlabel add prefix dead:beef::1/24 label 123
| Error: inet prefix is expected rather than "dead:beef::1/24".
Reported-by: Jaroslav Aster <jaster@redhat.com> Fixes: 334af76143368 ("fix get_addr() and get_prefix() error messages") Signed-off-by: Phil Sutter <phil@nwl.cc>
Phil Sutter [Wed, 2 Aug 2017 12:57:56 +0000 (14:57 +0200)]
bpf: Make bytecode-file reading a little more robust
bpf_parse_string() will now correctly handle:
- Extraneous whitespace,
- OPs on multiple lines and
- overlong file names.
The added feature of allowing to have OPs on multiple lines (like e.g.
tcpdump prints them) is rather a side effect of fixing detection of
malformed bytecode files having random content on a second line, like
e.g.:
Hangbin Liu [Thu, 27 Jul 2017 09:44:15 +0000 (17:44 +0800)]
utils: return default family when rtm_family is not RTNL_FAMILY_IPMR/IP6MR
When we get a multicast route, the rtm_type is RTN_MULTICAST, but the
rtm_family may be AF_INET. If we only check the type with RTNL_FAMILY_IPMR,
we will get malformed address. e.g.
+ ip -4 route add multicast 172.111.1.1 dev em1 table main
Before fix:
+ ip route list type multicast table main
multicast ac6f:101:800:400:400:0:3c00:0 dev em1 scope link
After fix:
+ ip route list type multicast table main
multicast 172.111.1.1 dev em1 scope link
Fixes: 56e3eb4c3400 ("ip: route: fix multicast route dumps") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Phil Sutter <phil@nwl.cc>
ip netns accepts invalid input as namespace name like an empty string or a
string longer than the maximum file name length.
Check that the netns name is not empty and less than or equal to NAME_MAX.
Ability to change geneve device attributes was added to kernel through
commit 5b861f6baa3a ("geneve: add rtnl changelink support"), however one
cannot do the same through ip-link(8) command. Changing the allowed
geneve device attributes using 'ip link set <geneve_name> type geneve id
<geneve_id> <allowed_attributes>' currently fails with 'operation not
supported' error. This patch adds support for it.
Daniel Borkmann [Sat, 22 Jul 2017 23:22:18 +0000 (01:22 +0200)]
bpf: improve error reporting around tail calls
Currently, it's still quite hard to figure out if a prog passed the
verifier, but later gets rejected due to different tail call ownership.
Figure out whether that is the case and provide appropriate error
messages to the user.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
In the presence of firewalls which improperly block ICMP Unreachable
(including Fragmentation Required) messages, Path MTU Discovery is
prevented from working.
The workaround is to handle IPv4 payloads opaquely, ignoring the DF
bit.
Kernel commit 22a59be8b7693eb2d0897a9638f5991f2f8e4ddd ("net: ipv4:
Add ability to have GRE ignore DF bit in IPv4 payloads") is
complemented by this user-space changeset which exposes control of
this setting.
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>
ip netns keeps track of created namespaces with bind mounts named
/var/run/netns/<namespace>. No input sanitization is done, allowing creation and
deletion of files relatives to /var/run/netns or, if the path is non existent or
invalid, allows to create "untracked" namespaces (invisible to the tool).
This commit denies creation or deletion of namespaces with names contaning
"/" or matching exactly "." or "..".
bridge: this patch adds json support for bridge mdb show
This patch adds json output to bridge mdb show
Normal Output:
$ bridge -d -s mdb show
dev br0 port swp3 grp 239.0.0.1 temp vid 128 172.26
dev br0 port swp3 grp 239.0.0.1 temp vid 64 172.26
dev br0 port swp2 grp 239.0.0.2 temp vid 1024 172.26
dev br0 port swp2 grp 239.0.0.2 temp vid 256 172.26
dev br0 port swp2 grp 239.0.0.2 temp vid 1 172.26
dev br0 port swp3 grp 239.0.0.1 temp vid 1 172.26
router ports on br0: swp4 0.00 permanent
router ports on br0: swp5 0.00 permanent
Daniel Borkmann [Mon, 17 Jul 2017 15:18:52 +0000 (17:18 +0200)]
bpf: dump id/jited info for cls/act programs
Make use of TCA_BPF_ID/TCA_ACT_BPF_ID that we exposed and print the ID
of the programs loaded and use the new BPF_OBJ_GET_INFO_BY_FD command
for dumping further information about the program, currently whether
the attached program is jited.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Daniel Borkmann [Mon, 17 Jul 2017 15:18:51 +0000 (17:18 +0200)]
bpf: support loading map in map from obj
Add support for map in map in the loader and add a small example program.
The outer map uses inner_id to reference a bpf_elf_map with a given ID
as the inner type. Loading maps is done in three passes, i) all non-map
in map maps are loaded, ii) all map in map maps are loaded based on the
inner_id map spec of a non-map in map with corresponding id, and iii)
related inner maps are attached to the map in map with given inner_idx
key. Pinned objetcs are assumed to be managed externally, so they are
only retrieved from BPF fs.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Daniel Borkmann [Mon, 17 Jul 2017 15:18:50 +0000 (17:18 +0200)]
bpf: remove obsolete samples
Remove old samples that have been added in pre BPF fs days which were
using file descriptor passing. It's long obsolete and not encouraged
to use this method given BPF fs is the default way like in the other
samples.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch extends route get to support mpls specific
route attributes like RTA_NEWDST.
Input:
RTA_DST - input label
RTA_NEWDST - labels in packet for multipath selection
By default the getroute handler returns matched
nexthop label, via and oif
With fibmatch keyword (RTM_F_FIB_MATCH flag), full matched
route is returned.
example:
$ip -f mpls route show
101
nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
201
nexthop as to 202/203 via inet6 2001:db8:2::2 dev virt1-2
nexthop as to 402/403 via inet6 2001:db8:12::2 dev virt1-12
$ip -f mpls route get 103
RTNETLINK answers: Network is unreachable
$ip -f mpls route get 101
101 as to 102/103 via inet 172.16.2.2 dev virt1-2
$ip -f mpls route get as to 302/303 101
101 as to 302/303 via inet 172.16.12.2 dev virt1-12
$ip -f mpls route get fibmatch 103
RTNETLINK answers: Network is unreachable
$ip -f mpls route get fibmatch 101
101
nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
Daniel Borkmann [Tue, 27 Jun 2017 00:48:36 +0000 (02:48 +0200)]
bpf: indicate lderr when bpf_apply_relo_data fails
When LLVM wrongly generates a rodata relo entry (llvm BZ #33599),
then just bail out instead of probing for prog w/o reloc, which
will fail in this case anyway.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Phil Sutter [Tue, 27 Jun 2017 19:00:25 +0000 (21:00 +0200)]
man: Collect names of man pages automatically
As it turned out, forgetting to add a man page to the respective
Makefile when introducing it is a common mistake. Overcome this once and
for all by using $(wildcard) function in Makefiles.
Fixes: 7124942942e53 ("genl: add manpage") Fixes: 958cd210942c8 ("ifcfg: add manpage") Fixes: e1b7f883e50de ("man: add documentation for IPv6 SR commands") Fixes: 1949f82cdf62c ("Introduce ip vrf command") Fixes: 535194a172d23 ("tipc: add peer remove functionality") Signed-off-by: Phil Sutter <phil@nwl.cc>
David Lebrun [Fri, 16 Jun 2017 13:54:28 +0000 (15:54 +0200)]
iproute: fix compilation issue with older glibc
If a header that includes linux/in6.h is included before
iproute's utils.h, then iproute2 fails to compile on older
glibc versions.
Fixes: e8493916a8ede9970732e33ea52d30b83071f401 ("iproute: add support for SR-IPv6 lwtunnel encapsulation") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Hangbin Liu [Fri, 16 Jun 2017 03:31:52 +0000 (11:31 +0800)]
ip neigh: allow flush FAILED neighbour entry
After upstream commit 5071034e4af7 ('neigh: Really delete an arp/neigh entry
on "ip neigh delete" or "arp -d"'), we could delete a single FAILED neighbour
entry now. But `ip neigh flush` still skip the FAILED entry.
Move the filter after first round flush so we can flush FAILED entry on fixed
kernel and also do not keep retrying on old kernel.
Donald Sharp [Wed, 14 Jun 2017 12:08:12 +0000 (08:08 -0400)]
ip: mroute: Add table output to show command
When the user specifies `table all` or `table 0` to
the `ip mroute show` command we dump the entirety of
the known mroute tables. Without some sort of
divisor to tell us what table we are looking at
the command is useless.
Add `Table: <vrf name>` to the output of 'ip mroute show table 0'
Follow the convention established by 'ip route show table 0'
for when to display
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Jakub Sitnicki [Wed, 7 Jun 2017 13:23:13 +0000 (15:23 +0200)]
iproute: Remove useless check for nexthop keyword when setting RTA_OIF
When modifying a route we set the RTA_OIF attribute only if a device was
specified with "dev" or "oif" keyword. But for some unknown reason we
earlier alternatively check also for the presence of "nexthop" keyword,
even though it has no effect. So remove the pointless check.