iptuntap: allow creation of multi-queue tun/tap device
This patch adds multi_queue option to ip tuntap.
This allows IFF_MULTI_QUEUE flag to be specified during
tun/tap device creation enabling multi-queue support in tun/tap
device.
Example: ip tuntap add dev tap0 mode tap multi_queue
Nicolas Dichtel [Fri, 17 May 2013 15:42:34 +0000 (08:42 -0700)]
ss: allow to retrieve AF_PACKET info via netlink
This patch add support of netlink messages for AF_PACKET and thus it allows
to get filter information of this kind of sockets.
To dump these filters info the option --bfp must be specified and the user
must have admin rights.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Nicolas Dichtel [Fri, 17 May 2013 08:47:33 +0000 (01:47 -0700)]
ipnetconf: by default dump all entries
This is now possible, because the dump function has been added in kernel.
Note that IPv4 and IPv6 entries are displayed.
Before this patch, only all entries were displayed.
Example:
$ ip netconf
ipv4 dev lo forwarding on rp_filter off mc_forwarding 0
ipv4 dev eth0 forwarding on rp_filter off mc_forwarding 1
ipv4 all forwarding on rp_filter off mc_forwarding 1
ipv4 default forwarding on rp_filter off mc_forwarding 0
ipv6 dev lo forwarding on mc_forwarding 0
ipv6 dev eth0 forwarding on mc_forwarding 0
ipv6 all forwarding on mc_forwarding 0
ipv6 default forwarding on mc_forwarding 0
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This change shifts burden onto the users to choose the UDP port value.
Kernel default value is incorrect UDP port 5287 but now there is
an official assigned port for VXLAN.
The kernel can't change because of legacy compatibility
but new deployments should not use the legacy port value.
David L Stevens [Mon, 6 May 2013 12:11:46 +0000 (05:11 -0700)]
iproute2: support NTF_ROUTER flag in VXLAN fdb entries
This patch allows setting the "NTF_ROUTER" flag in VXLAN forwarding table
entries to enable L3 switching for router destinations while still allowing
L2 redirection appliances for non-router MAC destinations.
Signed-Off-By: David L Stevens <dlstevens@us.ibm.com>
Add ability to set UDP destination port on a per device basis.
If no port is assigned, the default IANA assigned port will be used.
If you want the kernel default value, then use port 0.
Source port range option is now called 'srcport', to avoid
confusion. The old option syntax is accepted for compatiablity.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Daniel Borkmann [Tue, 30 Apr 2013 06:22:50 +0000 (06:22 +0000)]
ip: ipv6: add tokenized interface identifier support
This patch adds support for tokenized IIDs, that enable
administrators to assign well-known host-part addresses
to nodes whilst still obtaining global network prefix
from Router Advertisements. This is the iproute2 part for
the kernel patch f53adae4eae5 (``net: ipv6: add tokenized
interface identifier support'').
Example commands with iproute2:
Setting a device token:
# ip token set ::1a:2b:3c:4d/64 dev eth1
Getting a device token:
# ip token get dev eth1
token ::1a:2b:3c:4d dev eth1
Listing all tokens:
# ip token list (or: ip token)
token :: dev eth0
token ::1a:2b:3c:4d dev eth1
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Alexander Duyck [Mon, 29 Apr 2013 05:50:13 +0000 (05:50 +0000)]
iproute2: act_ipt fix xtables breakage on older versions.
In trying to build on a RHEL6.3 I ran into several build issues that are
addressed in this patch.
The first is that xtables_merge_options only has 3 parameters. It appears
this is how this code was originally. As such for the case where the version
is less than 6 I am assuming it would be correct to maintain the original
setup that only had 3 parameters being passed instead of 4.
I also ran into an issue with the define for __ALIGN_KERNEL not being present.
I believe this may be due to the fact that __ALIGN_KERNEL was moved into a
separate header from ALIGN after the UAPI changes. In order to just cover all
of the bases I have moved the main definition for the macros into
__ALIGN_KERNEL_MASK and __ALIGN_KERNEL and if ALIGN is also needed then it is
just a direct redefine to __ALIGN_KERNEL.
Cc: Hasan Chowdhury <shemonc@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Alexander Duyck [Thu, 25 Apr 2013 12:07:10 +0000 (12:07 +0000)]
libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter
This change corrects a kernel incompatibility that was resulting in the
ext_filter_mask not being correctly discovered by the kernel as it is buried
somewhere in the ifinfomsg.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: David S. Miller <davem@davemloft.net>
add short description of batch mode in tc man page
The tc command is missing documentation of -batch and -force switches
that are returned by "tc -help".
Add short description on their syntax and usage.
David Ward [Mon, 25 Mar 2013 04:23:18 +0000 (04:23 +0000)]
ip/xfrm: Improve usage text and documentation
Change ALGO-KEY to ALGO-KEYMAT to make it more obvious that the
keying material might need to contain more than just the key (such
as a salt or nonce value).
List the algorithm names that currently exist in the kernel.
Indicate that for IPComp, the Compression Parameter Index (CPI) is
used as the SPI.
Group the list of mode values by transform protocol.
David Ward [Mon, 25 Mar 2013 04:23:13 +0000 (04:23 +0000)]
ip/xfrm: Extend SPI validity checking
A Security Policy Index (SPI) is not used with Mobile IPv6. IPComp
uses a smaller 16-bit Compression Parameter Index (CPI) which is
passed as the SPI value. Perform checks whenever specifying an ID.
James Chapman [Tue, 26 Mar 2013 06:49:22 +0000 (06:49 +0000)]
iproute2: add l2spec_type param to l2tp add session
When unmanaged L2TP sessions are created using "ip l2tp add session",
there is no option to allow the session's Layer2SpecificHeader type to
be selected - the kernel's default setting is always used. For
interopability with some vendor equipment, it might be necessary to
use a different setting. So add a new l2spec_type parameter to the "ip
l2tp add session" parameter list, allowing operators to set a specific
Layer2SpecificHeader type. The kernel already exposes the setting as a
netlink attribute so it is straightforward to add support for it in
iproute2.
This change allows unmanaged L2TP sessions to be configured between
Linux and some Cisco equipment by specifying "l2spec_type none" in "ip
l2tp add session" command parameters.
Signed-off-by: James Chapman <jchapman@katalix.com>
In older versions of traffic shaping the Alpha kernel was special
and had higher HZ. This no longer matters, TC is based on high
resoulution timers in kernel.
Thomas Egerer [Wed, 20 Mar 2013 09:18:43 +0000 (02:18 -0700)]
ip xfrm state: Allow different selector family
My previous commit introduced a patch to allow for states with different
ip address families for selector and id. The must have somehow been a
mixup of the patch I tested and the one I send, so the patch sent breaks
the iproute2 build. This patch fixes this. My apologies.
Petr Šabata [Thu, 14 Mar 2013 14:10:44 +0000 (15:10 +0100)]
iproute2: Mention the 'up' argument in documentation
Both ip-link and ip-address support the 'up' argument, however this
isn't documented in neither their help outputs or ip-address' manpage.
This patch fixes that.
Signed-off-by: Petr Šabata <contyk@redhat.com> Reported-by: Jiří Popelka <jpopelka@redhat.com>
Vlad Yasevich [Thu, 28 Feb 2013 10:04:05 +0000 (10:04 +0000)]
bridge: Add vlan configuration support
Recent kernel patches added support for VLAN filtering on the bridge.
This functionality allows one to turn a basic bridge into a VLAN bridge,
where VLANs dicatate packet forwarding and header transformation.
To configure the VLANs on the bridge and its ports a new command is
added to the 'bridge' utility.
# bridge vlan add dev eth0 vid 10 pvid untagged brdev
# bridge vlan add
# bridge vlan delete dev eth0 vid 10
# bridge vlan show
This command supports the following flags:
master - peform the operation on the software bridge device. This is
the default behavior.
self - perform the operation on the hardware associated with the port.
This flag is required when the device is the bridge device and
the configuration is desired on the bridge device itself (not
one of the ports).
pvid - Set the PVID (port vlan id) for a given port. Any untagged
frames arriving on the port will be assigned to this vlan.
untagged - Sets the egress policy of for a given vlan. Default port
egress policy is tagged. Set this flag if you wish traffic
associated with this VLAN to exit the port untagged.
fix dependency on sizeof(__u64) == sizeof(unsigned long long)
Some platforms like ppc64 have unsigned long long as 128 bits, and
the printf format string would cause errors. Resolve this by using
unsigned long long where necessary (or unsigned long).
iproute2: clearer error messages for fifo and tbf qdiscs
Clearer error messages for fifo and tbf qdiscs:
- Say who is complaining
- Don't just say a parameter is bad, show the offending parameter
- Be clearer about duplicate parameters vs illegal pairs of parameters
- Try to give multiple error messages rather than let the user discover the errors one by one
- When there are parameter aliases, try to use the variant that was used, or at least mention them all
Note that in the old version an empty parameter list to tbf would just cause an explain() message
without a specific error message. By simply removing the relevant error check, the code now
handles this error more gracefully by printing an error message for all mandatory parameters.
It still prints the explain() message.
Signed-off-by: Kees van Reeuwijk <reeuwijk@few.vu.nl>
Lutz Jaenicke [Thu, 30 Aug 2012 05:01:34 +0000 (05:01 +0000)]
rtnl_wilddump_request: fix alignment issue for embedded platforms
Platforms have different alignment requirements which need to be
fulfilled by the compiler. If the structure elements are already
4 byte (NLMGS_ALIGNTO) aligned by the compiler adding an explicit
padding element (align_rta) is not allowed.
Use __attribute__ ((aligned (NLMSG_ALIGNTO))) in order to achieve
the required alignment.
Experienced on ARM (xscale) with symptom
netlink: 12 bytes leftover after parsing attributes
Tested on:
ARM (32bit Big Endian)
PowerPC (32bit Big Endian)
x86_64 (64bit Little Endian)
Each with different aligment requirments.
Fixes Debian bug #700434
Need to table id in filter to be unsigned to avoid conversion to -1
The documentation for "ip" suggests that, when using multiple routing tables, the table ID can be an arbitrary 32 bit number. I've been writing a script that calculates a table Id based on an IP addresses and sets up tables accordingly based on it. This seems to work for everything I've tried except "ip route flush". If you specify a table to flush with an ID over 2^31, it flushes all IPv4 routing tables. For example:
Will delete all routing tables, including the default one. Needless to say, this is quite annoying. I think this is an upstream bug, but your opinions will be greatly appreciated.
This patch improves many error messages as follows:
- For incorrect parameters, show the value of the offending parameter, rather than just say that it is incorrect
- Rephrased messages for clarity
- Rephrased to more `mainstream' english
Signed-off-by: Kees van Reeuwijk <reeuwijk@few.vu.nl>
Note that in ip-rule.8 I rephrased a sentence to "The RPDB is scanned
in order of decreasing priority." The original version talked about
*in*creasing priority, but from the context that didn't make sense.
Signed-off-by: Kees van Reeuwijk <reeuwijk@few.vu.nl>
On openSUSE 12.2 (at least) xtables.h is not installed in the system-wide
include dir but in /usr/include/iptables-1.4.16.3/. This results in the
following build failure:
em_ipset.c:26:21: fatal error: xtables.h: No such file or directory
Other includers of xtables.h already call out to pkg-config
Nicolas Dichtel [Tue, 5 Feb 2013 08:38:34 +0000 (00:38 -0800)]
iplink: display the value of IFLA_PROMISCUITY
This is useful to know the 'real' status of an interface (the flag IFF_PROMISC
is exported by the kernel only when the user set it explicitly, for example it
will not be exported when a tcpdump is running).
This information will be displayed when '-details' is provided by the user.
Example:
$ ip -d l l tun10
6: tun10: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN mode DEFAULT
link/sit 10.16.0.249 peer 10.16.0.121
sit remote 10.16.0.121 local 10.16.0.249 ttl inherit pmtudisc 6rd-prefix 2002::/16
promiscuity 2
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
David Ward [Sun, 27 Jan 2013 13:04:59 +0000 (13:04 +0000)]
ip/iptunnel: Extend TOS syntax
The 'inherit/STRING' or 'inherit/00..ff' syntax indicates that the
TOS field of tunneled packets should be copied from the original IP
header, but for non-IP packets the value STRING or 00..ff should be
used instead. (This syntax is already used by 'ip tunnel show'.)
Also clarify the man page and the command usage text (particularly
that the TOS is not specified as a decimal number).
iproute2: Add "ip netns pids" and "ip netns identify"
Add command that go between network namespace names and process
identifiers. The code builds and runs agains older kernels but
only works on Linux 3.8+ kernels where I have fixed stat to work
properly.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
iproute2: Make "ip netns delete" more likely to succeed
Sometimes "ip netns delete" fails because it can not delete the file a
network namespace was mounted on. If this only happened when a
network namespace was really in use this would be fine, but today it
is possible to pin all network namespaces by simply having a long
running process started with "ip netns exec".
Every mount is copied when a network namespace is created so it is
impossible to prevent the mounts from getting into other mount
namespaces. Modify all mounts in the files and subdirectories of
/var/run/netns to be shared mount points so that unmount events can
propogate, making it unlikely that "ip netns delete" will fail because
a directory is mounted in another mount namespace.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Ben Hutchings pointed out that the return value of do_netns is passed
to exit and the current convention of returning -1 for failure is
inconsitent with that reality.
Return EXIT_FAILURE instead of -1 and EXIT_SUCCESS instead of 0. To make
it clear that the return codes are expected to be passed to exit.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Some systems are now following the advice in
linux/Documentation/sharedsubtrees.txt and running with all mount
points shared between all mount namespaces by default.
After creating the mount namespace call mount on / with
MS_SLAVE|MS_REC to modify all mounts in the new mount namespace to
slave mounts if they are shared or private mounts otherwise.
Guarnateeing that changes to the mount namespace created with
"ip netns exec" don't propgate to other namespaces.
Reported-by: Petr Šabata <contyk@redhat.com> Tested-by: Petr Šabata <contyk@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>