example usage:
tc filter add dev $dev parent $id: basic match not ipset'(foobar src)' ..
also updates iproute2/ematch_map, else tc complains:
Error: Unable to find ematch "ipset" in /etc/iproute2/ematch_map
Please assign a unique ID to the ematch kind the suggested entry is:
8 ipset
when trying to use this ematch.
(text ematch (5) only exists in kernel, a vlan ematch (6) exists neither in
kernel nor userspace, but kernel headers define TCF_EM_VLAN == 6).
Mike Frysinger [Mon, 13 Aug 2012 15:09:52 +0000 (08:09 -0700)]
Fix regression with 'ip address show'
`ip a s` no longer shows addresses: 3.4.0 works, but 3.5.0 does not.
the simple test case:
make clean && make -j -s && ./ip/ip a s lo
before that change, i would get:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
but after, i now get:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
seems like the bug was introduced in the middle of that patch:
- if (filter.family != AF_PACKET) {
+ if (filter.family && filter.family != AF_PACKET) {
+ if (filter.oneline)
+ no_link = 1;
+
if (rtnl_wilddump_request(&rth, filter.family, RTM_GETADDR) < 0) {
perror("Cannot send dump request");
exit(1);
if i revert the change to the if statement there, `ip a s` works for me again.
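The effect of the changed condition can be sketched in isolation. The helper names below are hypothetical and not part of iproute2; only the two conditions are taken from the diff. The sketch assumes that a plain `ip a s` leaves filter.family at AF_UNSPEC (0), in which case the new test skips the RTM_GETADDR dump.

```c
#include <stdbool.h>
#include <sys/socket.h>	/* AF_UNSPEC, AF_INET, AF_PACKET (Linux) */

/* Condition before the change: dump addresses unless the family is AF_PACKET. */
static bool wants_addr_dump_old(int family)
{
	return family != AF_PACKET;
}

/* Condition after the change: an unset family (AF_UNSPEC == 0) is excluded too. */
static bool wants_addr_dump_new(int family)
{
	return family && family != AF_PACKET;
}
```

With AF_UNSPEC the old predicate is true and the new one is false, which matches the missing inet/inet6 lines in the output.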
Alternative solution to problem reported by Pravin B Shelar <pshelar@nicira.com>
Split large function ipaddr_list_or_flush into components.
Fix memory leak of address and link nlmsg info.
Avoid fetching address info if only flushing.
Li Wei [Tue, 10 Jul 2012 08:45:28 +0000 (16:45 +0800)]
tc: filter: validate filter priority in userspace.
Because the high 16 bits of tcm_info are used to pass the prio value to
the kernel, its range is [0, 0xffff]. Without validation in tc, when the
user passes a larger (>65535) priority, the actual priority set in the
kernel would confuse the user.
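A minimal sketch of such a check, assuming the priority arrives as a string and that the low 16 bits of the word carry the protocol. The function names parse_prio and pack_info are illustrative only and are not the names used in tc.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a filter priority string; accept only values that fit in 16 bits. */
static int parse_prio(const char *arg, uint32_t *prio)
{
	char *end;
	unsigned long v;

	if (!arg || !*arg)
		return -1;
	errno = 0;
	v = strtoul(arg, &end, 0);
	if (errno || *end != '\0' || v > 0xffff)
		return -1;
	*prio = (uint32_t)v;
	return 0;
}

/* Pack priority (high 16 bits) and protocol (low 16 bits) into one 32-bit word. */
static uint32_t pack_info(uint32_t prio, uint16_t protocol)
{
	return (prio << 16) | protocol;
}
```

Without the range check, a priority of 65536 shifted left by 16 no longer fits in 32 bits, so the packed priority silently becomes 0.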
This makes 2 changes:
1: Add fq_codel to SEE ALSO section in tc manpage.
2: Reorder the SEE ALSO section to make the order alphabetical
(suggested by Jan Ceuleers).
tc(8): Negative indent and missing "-" after an escape
<groff: tc.8>:51: warning: total indent cannot be negative
<groff: tc.8>:57: warning: escape character ignored before `i'
*********************
Space at end of line removed
General considerations
a) Manuals should usually only be left justified. Use ".ad l"
as the first regular command.
b) Each sentence should begin on a new line. The conventions
about the amount of space between sentences are different. This
also makes a check on the number of space characters between
words easier.
c) Separate numbers from units with a (no-break) space. A
no-break space can be code 0xA0, "\ " (\<space>), or "\~"
(groff).
d) Use macros "TS/TE" for tables with more than two columns.
Then use
'\" t
as the first line in the source to tell "man" to use the "tbl"
preprocessor.
e) Protect last period (full stop) in abbreviations with "\&",
if it is or might be (through new formatting of source) at the
end of line, if it is also not an end of sentence.
*********************
Originally filed at: http://bugs.debian.org/674704
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Chris Elston [Tue, 1 May 2012 04:25:22 +0000 (04:25 +0000)]
iproute2: allow IPv6 addresses for l2tp local and remote parameters
Adds support for parsing IPv6 addresses to the parameters local and
remote in the l2tp commands. Requires netlink attributes L2TP_ATTR_IP6_SADDR
and L2TP_ATTR_IP6_DADDR, added in a required kernel patch already submitted
to netdev.
Also enables printing of IPv6 addresses returned by the L2TP_CMD_TUNNEL_GET
request.
Signed-off-by: Chris Elston <celston@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Eric Dumazet [Fri, 11 May 2012 09:49:50 +0000 (09:49 +0000)]
fq_codel: Fair Queue Codel AQM
Fair Queue Codel packet scheduler
Principles :
- Packets are classified (internal classifier or external) on flows.
- This is a Stochastic model (as we use a hash, several flows might
be hashed on same slot)
- Each flow has a CoDel managed queue.
- Flows are linked onto two (Round Robin) lists,
so that new flows have priority on old ones.
- For a given flow, packets are not reordered (CoDel uses a FIFO)
- head drops only.
- ECN capability is on by default.
- Very low memory footprint (64 bytes per flow)
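The stochastic classification in the list above can be sketched as hashing a flow key into one of a fixed number of slots. Everything here is an assumption made for illustration: the flow_key struct, the FNV-1a hash, and the function names are not what the kernel qdisc uses.

```c
#include <stdint.h>

/* Hypothetical flow key; the real qdisc derives its hash from the packet. */
struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t  proto;
};

/* Mix one 32-bit value into an FNV-1a hash, byte by byte (stand-in hash). */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v)
{
	for (int i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xff;
		h *= 16777619u;
	}
	return h;
}

/* Map a flow to a slot in [0, flows). Distinct flows may share a slot. */
static uint32_t flow_slot(const struct flow_key *k, uint32_t flows)
{
	uint32_t h = 2166136261u;

	h = fnv1a_mix(h, k->saddr);
	h = fnv1a_mix(h, k->daddr);
	h = fnv1a_mix(h, ((uint32_t)k->sport << 16) | k->dport);
	h = fnv1a_mix(h, k->proto);
	return h % flows;
}
```

Packets of one flow always land in the same slot, so per-flow ordering is preserved; two different flows can collide, which is what makes the model stochastic.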
tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ]
[ target TIME ] [ interval TIME ] [ noecn ]
[ quantum BYTES ]
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Van Jacobson <van@pollere.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Changli Gao <xiaosuo@gmail.com>
This AQM's main input is no longer queue size in bytes or packets, but the
delay packets experience in the (FIFO) queue.
As we don't have infinite memory, we can still drop packets in enqueue()
in case of massive load, but the point of CoDel is to drop packets in
dequeue(), using a control law based on two simple parameters:
target : target sojourn time (default 5ms)
interval : width of moving time window (default 100ms)
Selected packets are dropped, unless ECN is enabled and packets can get
ECN mark instead.
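How target and interval interact at dequeue time can be sketched as follows. This is a simplified illustration, not the kernel code: the struct and function names are made up, times are plain microsecond counters, and the sketch leaves out both the control law that spaces successive drops and the ECN marking path.

```c
#include <stdbool.h>
#include <stdint.h>

struct codel_sketch {
	uint64_t target_us;      /* target sojourn time, e.g. 5000 (5ms) */
	uint64_t interval_us;    /* moving window width, e.g. 100000 (100ms) */
	uint64_t first_above_us; /* 0 while the sojourn time is below target */
};

/*
 * Called at dequeue with the current time and the packet's enqueue time.
 * Returns true once the sojourn time has stayed at or above target for a
 * full interval.
 */
static bool codel_ok_to_drop(struct codel_sketch *c, uint64_t now_us,
			     uint64_t enqueue_us)
{
	uint64_t sojourn = now_us - enqueue_us;

	if (sojourn < c->target_us) {
		c->first_above_us = 0;	/* delay is fine, forget history */
		return false;
	}
	if (c->first_above_us == 0) {
		/* First packet above target: arm the window, do not drop yet. */
		c->first_above_us = now_us + c->interval_us;
		return false;
	}
	return now_us >= c->first_above_us;
}
```

A single slow packet never triggers a drop; only delay that persists for a whole interval does.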
Usage: tc qdisc ... codel [ limit PACKETS ] [ target TIME ]
[ interval TIME ] [ ecn ]
CoDel must be seen as a base module, and should be used keeping in mind
there is still a FIFO queue. So a typical setup will probably need a
hierarchy of several qdiscs and packet classifiers to be able to meet
whatever constraints a user might have.
One possible example would be to use fq_codel, which combines Fair
Queueing and CoDel, in replacement of sfq / sfq_red.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dave Taht <dave.taht@bufferbloat.net>
This patch provides support for marking packets with ECN instead of
dropping them with netem. This makes it possible to make use of the
netem ECN marking feature that was added recently to the kernel.
iproute2: man page and /bin/ip disagree on del vs delete
Reported by Robert Henney:
> the 'ip' man page does not mention the command "del" at all but does
> claim, "As a rule, it is possible to add, delete and show (or list ) objects".
> however, 'ip' does not always recognize "delete" as a command.
>
> robh@debian:~$ ip tunnel delete
> Command "delete" is unknown, try "ip tunnel help".
Let's use "delete" in all calls to matches() for consistency. This will
make both "del" and "delete" work everywhere.
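Why comparing against "delete" accepts both spellings can be shown with a small prefix matcher. This is a sketch written from the behaviour described here; the name prefix_matches is hypothetical and the body is an assumption about how such a helper works, not a copy of the iproute2 source.

```c
#include <string.h>

/* Returns 0 when cmd is a prefix of pattern (or equal to it), non-zero otherwise. */
static int prefix_matches(const char *cmd, const char *pattern)
{
	size_t len = strlen(cmd);

	if (len > strlen(pattern))
		return -1;	/* user input is longer than the keyword */
	return memcmp(pattern, cmd, len);
}
```

With the keyword "delete", the inputs "del" and "delete" both match. With the keyword "del", the input "delete" is longer than the keyword and is rejected, which is the failure shown in the report.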
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
iproute2: trivial fix of ip link syntax in manpage
Reported by Ivan Vilata i Balaguer <ivan@selidor.net>
found that the description of the `ip link add` command in the manpage
is outdated regarding the compulsory `link DEVICE` option.
For instance, `ip link help` says:
Usage: ip link add [link DEV] [ name ] NAME
...
But the manpage still says:
ip link add link DEVICE [ name ] NAME
(Trying to provide a `link` option e.g. under an LXC container can frustrate
the creation of dummy devices which don't need an actual device.)
The syntax of the "ip link help" output was fixed in commit
"iproute2: Fix usage and man page for 'ip link'" (a22e92951d).
This updates the manpage to mark "link DEVICE" as an optional
argument there as well.
http://bugs.debian.org/673171
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Commit (761a1e60 iproute2 - Split up manual page installation )
introduced man/man8/Makefile but did not add all the man pages.
This patch adds the missing man pages for installation.
Chris Elston [Fri, 20 Apr 2012 01:29:42 +0000 (01:29 +0000)]
iproute2: allow IPv6 addresses for l2tp local and remote parameters
Adds support for parsing IPv6 addresses to the parameters local and
remote in the l2tp commands. Requires netlink attributes L2TP_ATTR_IP6_SADDR
and L2TP_ATTR_IP6_DADDR, added in a required kernel patch already submitted
to netdev.
Also enables printing of IPv6 addresses returned by the L2TP_CMD_TUNNEL_GET
request.
Signed-off-by: Chris Elston <celston@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Generate manual pages based on where the config files are installed.
Add missing manual pages for utilities which are links to other binaries.
Make tc-pfifo.8 a real file that points to tc-bfifo.8 instead of symlink
which causes problems with compressing manual pages.
Signed-off-by: Christoph J. Thompson <cjsthompson@gmail.com>
Rose, Gregory V [Tue, 21 Feb 2012 10:43:09 +0000 (10:43 +0000)]
iproute2: Add netlink attribute to filter dump requests
Add a new netlink attribute type to the dump request to allow
filtering of the information returned for the respective matching
interfaces. At this time the only filter defined is to request
virtual function (VF) device info for interfaces that have attached VFs.
It will also be possible to extend the request with other yet to be
defined netlink attributes in the future.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
The kernel supports a link mode attribute (which can be dormant or default).
This attribute is used to control how the link watch engine
handles operstate transitions.
This adds a new parameter to ip link command to allow setting and
displaying the value.
---
There is nothing in the standard that says 0 can't be used as a key.
It makes sense to allow it. Also fix typo where ikey was printed
when printing okey.
Kenyon Ralph [Thu, 15 Mar 2012 21:39:12 +0000 (14:39 -0700)]
Update ip address manual page
* update synopsis to match "ip address help" output
* specify IPv4, since "IP" is ambiguous
* remove deprecated site scope
* document lifetimes, home, and nodad
* update wording to make sense since page was split from the ip(8) page
* get rid of extra spaces
As reported by Thomas Mühlgrabner <muehltom@cable.vol.at>
in http://bugs.debian.org/662979 :
When showing htb class configuration with "tc -iec class show",
the output for Mibit is actually the value for bit.
Example: configure a class with a ceil of 1000Mibit.
Output states 1048576000 Mibit.
The cause is missing parentheses in the display code of tc....
(Please also note that a lower value of 100Mibit will be displayed
as 102400 Kibit, which I think is kind of ugly.)
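The precedence problem can be reproduced with two one-line functions. The exact expression in tc is an assumption here; the function names are illustrative, and only the reported numbers (1000Mibit shown as 1048576000) come from the bug report.

```c
/* Buggy form: '/' and '*' associate left to right, so this is (bits / 1024) * 1024. */
static double mibit_missing_parens(double bits)
{
	return bits / 1024.0 * 1024.0;
}

/* Intended form: divide by the number of bits in one Mibit. */
static double mibit_with_parens(double bits)
{
	return bits / (1024.0 * 1024.0);
}
```

For a ceil of 1000Mibit (1048576000 bit), the first function returns the bit value unchanged, while the second returns 1000.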
Reported-by: Thomas Mühlgrabner <muehltom@cable.vol.at>
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Yegor Yefremov [Mon, 27 Feb 2012 14:21:15 +0000 (15:21 +0100)]
iproute2: cleanup dependencies
LIBNETLINK will be defined in the main Makefile, so
both ../lib/libnetlink.a ../lib/libutil.a will be
automatically appended during linking. Otherwise
../lib/libnetlink.a ../lib/libutil.a will appear
twice during linking.
A new option -p is added to the arpd command that accepts
a time indicating the number of seconds
to wait between kernel arp table polling attempts.
The minimum value is .1 (100ms).
If not specified, polling defaults to 30 seconds.
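The rules for the interval (fractional seconds, minimum .1, default 30) can be sketched as a small parser. The function and macro names are hypothetical, and rejecting a too-small value is an assumption: the text does not say whether arpd rejects such a value or clamps it.

```c
#include <stdlib.h>

#define POLL_DEFAULT_SEC 30.0
#define POLL_MIN_SEC     0.1

/*
 * Returns 0 and stores the interval, or -1 if arg is malformed or below
 * the minimum. A NULL arg means the option was not given.
 */
static int parse_poll_interval(const char *arg, double *seconds)
{
	char *end;
	double v;

	if (!arg) {
		*seconds = POLL_DEFAULT_SEC;
		return 0;
	}
	v = strtod(arg, &end);
	/* The negated comparison also rejects NaN. */
	if (end == arg || *end != '\0' || !(v >= POLL_MIN_SEC))
		return -1;
	*seconds = v;
	return 0;
}
```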
Patch by Erik Hugne <erik.hugne@ericsson.com> with
modifications
Based on patch by Vasiliy Kulikov <segoon@openwall.com>
Don't use /tmp since it is dangerous, instead put temporary files
from configure script in build directory. This is what autoconf
generated configure does.
Tony Zelenoff [Thu, 26 Jan 2012 04:50:04 +0000 (04:50 +0000)]
Modify neighbour proxy show
New "ip neigh show proxy" command now can show proxies which
were added with "ip neigh add proxy" command. Kernel code to
support this feature was sent a bit earlier to netdev.
Signed-off-by: Tony Zelenoff <antonz@parallels.com>
The syntax used in the example on reordering in the manpage is inconsistent with
the usage syntax. Moreover, the text does not describe the reordering process
in the kernel correctly. This patch fixes these two issues.
(Resending patch since it looks like my earlier mail did not make it to
netdev).
netem reordering requires that the delay parameter be given. Currently, if no
delay is given, tc prints the error message but still installs the qdisc. Fix
this by printing the usage and failing cleanly.
Eric Dumazet [Fri, 20 Jan 2012 11:17:43 +0000 (12:17 +0100)]
sfq: add optional RED on top of SFQ
Adds an optional Random Early Detection on each SFQ flow queue.
Traditional SFQ limits the count of packets, while RED also permits
controlling the number of bytes per flow, and adds ECN capability as well.
1) We don't handle the idle time management in this RED implementation,
since each 'new flow' begins with a null qavg. We really want to address
backlogged flows.
2) If headdrop is selected, we try to ECN mark the first packet instead of
the currently enqueued packet. This gives faster feedback for TCP flows
compared to traditional RED [ marking the last packet in queue ]
Example of use :
tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 4sec sfq \
limit 3000 headdrop flows 512 divisor 16384 \
redflowlimit 100000 min 8000 max 60000 probability 0.20 ecn
In this test, with 64 netperf TCP_STREAM sessions, 50% using ECN enabled
flows, we can see that the number of CE marked packets is smaller than the
number of drops (for non ECN flows).
If the same test is run without RED, we can see that the backlog is much bigger.