.TH CAKE 8 "19 July 2018" "iproute2" "Linux"
.SH NAME
CAKE \- Common Applications Kept Enhanced (CAKE)
.SH SYNOPSIS
.B tc qdisc ... cake
.br
[
.BR bandwidth
RATE |
.BR unlimited*
|
.BR autorate-ingress
]
.br
[
.BR rtt
TIME |
.BR datacentre
|
.BR lan
|
.BR metro
|
.BR regional
|
.BR internet*
|
.BR oceanic
|
.BR satellite
|
.BR interplanetary
]
.br
[
.BR besteffort
|
.BR diffserv8
|
.BR diffserv4
|
.BR diffserv3*
]
.br
[
.BR flowblind
|
.BR srchost
|
.BR dsthost
|
.BR hosts
|
.BR flows
|
.BR dual-srchost
|
.BR dual-dsthost
|
.BR triple-isolate*
]
.br
[
.BR nat
|
.BR nonat*
]
.br
[
.BR wash
|
.BR nowash*
]
.br
[
.BR split-gso*
|
.BR no-split-gso
]
.br
[
.BR ack-filter
|
.BR ack-filter-aggressive
|
.BR no-ack-filter*
]
.br
[
.BR memlimit
LIMIT ]
.br
[
.BR fwmark
MASK ]
.br
[
.BR ptm
|
.BR atm
|
.BR noatm*
]
.br
[
.BR overhead
N |
.BR conservative
|
.BR raw*
]
.br
[
.BR mpu
N ]
.br
[
.BR ingress
|
.BR egress*
]
.br
(* marks defaults)


.SH DESCRIPTION
CAKE (Common Applications Kept Enhanced) is a shaping-capable queue discipline
which uses both AQM and FQ. It combines three components: COBALT, an AQM
algorithm that merges Codel and BLUE; a shaper which operates in deficit mode;
and a variant of DRR++ for flow isolation. 8-way set-associative hashing is used
to virtually eliminate hash collisions. Priority queuing is available through a
simplified diffserv implementation. Overhead compensation for various
encapsulation schemes is tightly integrated.

All settings are optional; the default settings are chosen to be sensible in
most common deployments. Most people will only need to set the
.B bandwidth
parameter to get useful results, but reading the
.B Overhead Compensation
and
.B Round Trip Time
sections is strongly encouraged.

.SH SHAPER PARAMETERS
CAKE uses a deficit-mode shaper, which does not exhibit the initial burst
typical of token-bucket shapers. It will automatically burst precisely as much
as required to maintain the configured throughput. As such, it is very
straightforward to configure.
.PP
.B unlimited
(default)
.br
    No limit on the bandwidth.
.PP
.B bandwidth
RATE
.br
    Set the shaper bandwidth. See
.BR tc (8)
or examples below for details of the RATE value.
.PP
.B autorate-ingress
.br
    Automatic capacity estimation based on traffic arriving at this qdisc.
This is most likely to be useful with cellular links, which tend to change
quality randomly. A
.B bandwidth
parameter can be used in conjunction to specify an initial estimate. The shaper
will periodically be set to a bandwidth slightly below the estimated rate. This
estimator cannot estimate the bandwidth of links downstream of itself.

.SH OVERHEAD COMPENSATION PARAMETERS
The size of each packet on the wire may differ from that seen by Linux. The
following parameters allow CAKE to compensate for this difference by internally
adjusting the packet size that Linux reports, usually upwards. To assist users
who are not expert network engineers, keywords have been provided to represent a
number of common link technologies.

.SS Manual Overhead Specification
.B overhead
BYTES
.br
    Adds BYTES to the size of each packet. BYTES may be negative; values
between -64 and 256 (inclusive) are accepted.
.PP
.B mpu
BYTES
.br
    Rounds each packet (including overhead) up to a minimum length
BYTES. BYTES may not be negative; values between 0 and 256 (inclusive)
are accepted.
.PP
.B atm
.br
    Compensates for ATM cell framing, which is normally found on ADSL links.
This is performed after the
.B overhead
parameter above. ATM uses fixed 53-byte cells, each of which can carry 48 bytes
of payload.
.PP
.B ptm
.br
    Compensates for PTM encoding, which is normally found on VDSL2 links and
uses a 64b/65b encoding scheme. Alternatively, it is even more efficient to
simply derate the specified shaper bandwidth by a factor of 64/65 or 0.984. See
ITU G.992.3 Annex N and IEEE 802.3 Section 61.3 for details.
.PP
.B noatm
.br
    Disables ATM and PTM compensation.
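.PP
As a rough illustration of how these parameters combine, the following Python
sketch estimates the size charged for a single packet. The helper name
adjusted_size() is hypothetical and is not part of CAKE or tc. The order of
operations (overhead, then mpu, then framing) follows the descriptions above;
charging one extra byte per started 64-byte block for PTM is an assumption about
how the 64b/65b encoding is accounted.
.nf
```python
ATM_CELL = 53      # bytes on the wire per ATM cell
ATM_PAYLOAD = 48   # payload bytes carried by each cell


def adjusted_size(size, overhead=0, mpu=0, framing="noatm"):
    """Estimate the size charged for a packet of `size` bytes.

    Hypothetical helper: applies overhead, then mpu, then the framing
    compensation, in that order.
    """
    if not -64 <= overhead <= 256:
        raise ValueError("overhead must be between -64 and 256")
    if not 0 <= mpu <= 256:
        raise ValueError("mpu must be between 0 and 256")

    # Add the per-packet overhead, then round up to the minimum length.
    length = max(size + overhead, mpu)

    if framing == "atm":
        # Every started cell is sent in full: 48 payload bytes cost 53.
        cells = (length + ATM_PAYLOAD - 1) // ATM_PAYLOAD
        length = cells * ATM_CELL
    elif framing == "ptm":
        # Assumption: one extra byte per started 64-byte block (64b/65b).
        length += (length + 63) // 64
    elif framing != "noatm":
        raise ValueError("framing must be 'atm', 'ptm' or 'noatm'")
    return length


# overhead 38 mpu 84 noatm, applied to a small and a full-size packet
print(adjusted_size(28, overhead=38, mpu=84))
print(adjusted_size(1500, overhead=38, mpu=84))
```
.fi
With overhead 38 and mpu 84, a 28-byte packet is charged 84 bytes and a
1500-byte packet 1538 bytes. With overhead 10 and atm, a 1500-byte packet
occupies 32 cells and is charged 1696 bytes.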

.SS Failsafe Overhead Keywords
These two keywords are provided for quick-and-dirty setup. Use them if you
can't be bothered to read the rest of this section.
.PP
.B raw
(default)
.br
    Turns off all overhead compensation in CAKE. The packet size reported
by Linux will be used directly.
.PP
    Other overhead keywords may be added after "raw". The effect of this is
to make the overhead compensation operate relative to the reported packet size,
not the underlying IP packet size.
.PP
.B conservative
.br
    Compensates for more overhead than is likely to occur on any
widely-deployed link technology.
.br
    Equivalent to
.B overhead 48 atm.

.SS ADSL Overhead Keywords
Most ADSL modems have a way to check which framing scheme is in use. Often this
is also specified in the settings document provided by the ISP. The keywords in
this section are intended to correspond with these sources of information. All
of them implicitly set the
.B atm
flag.
.PP
.B pppoa-vcmux
.br
    Equivalent to
.B overhead 10 atm
.PP
.B pppoa-llc
.br
    Equivalent to
.B overhead 14 atm
.PP
.B pppoe-vcmux
.br
    Equivalent to
.B overhead 32 atm
.PP
.B pppoe-llcsnap
.br
    Equivalent to
.B overhead 40 atm
.PP
.B bridged-vcmux
.br
    Equivalent to
.B overhead 24 atm
.PP
.B bridged-llcsnap
.br
    Equivalent to
.B overhead 32 atm
.PP
.B ipoa-vcmux
.br
    Equivalent to
.B overhead 8 atm
.PP
.B ipoa-llcsnap
.br
    Equivalent to
.B overhead 16 atm
.PP
See also the Ethernet Overhead Keywords section below.

.SS VDSL2 Overhead Keywords
ATM was dropped from VDSL2 in favour of PTM, which is a much more
straightforward framing scheme. Some ISPs retained PPPoE for compatibility with
their existing back-end systems.
.PP
.B pppoe-ptm
.br
    Equivalent to
.B overhead 30 ptm

.br
    PPPoE: 2B PPP + 6B PPPoE +
.br
    ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame Check Sequence +
.br
    PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC (PTM-FCS)
.br
.PP
.B bridged-ptm
.br
    Equivalent to
.B overhead 22 ptm
.br
    ETHERNET: 6B dest MAC + 6B src MAC + 2B ethertype + 4B Frame Check Sequence +
.br
    PTM: 1B Start of Frame (S) + 1B End of Frame (Ck) + 2B TC-CRC (PTM-FCS)
.br
.PP
See also the Ethernet Overhead Keywords section below.
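.PP
The overhead figures for the two PTM keywords can be checked by adding up the
per-layer byte counts itemised above. The short Python sketch below does only
that arithmetic; the dictionary and function names are illustrative.
.nf
```python
# Per-layer byte counts, as itemised for the two PTM keywords.
PPPOE = {"PPP": 2, "PPPoE": 6}
ETHERNET = {"dest MAC": 6, "src MAC": 6, "ethertype": 2, "FCS": 4}
PTM = {"start of frame": 1, "end of frame": 1, "TC-CRC": 2}


def total(*layers):
    """Sum the byte counts of every field in the given layers."""
    return sum(sum(layer.values()) for layer in layers)


pppoe_ptm = total(PPPOE, ETHERNET, PTM)   # PPPoE carried over PTM
bridged_ptm = total(ETHERNET, PTM)        # plain bridged Ethernet over PTM

print("pppoe-ptm overhead:", pppoe_ptm)
print("bridged-ptm overhead:", bridged_ptm)
```
.fi
The sums come to 30 and 22 bytes, matching the overhead values the two keywords
are documented as equivalent to.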

.SS DOCSIS Cable Overhead Keyword
DOCSIS is the universal standard for providing Internet service over cable-TV
infrastructure.

In this case, the actual on-wire overhead is less important than the packet size
the head-end equipment uses for shaping and metering. This is specified to be
an Ethernet frame including the CRC (aka FCS).
.PP
.B docsis
.br
    Equivalent to
.B overhead 18 mpu 64 noatm

.SS Ethernet Overhead Keywords
.PP
.B ethernet
.br
    Accounts for Ethernet's preamble, inter-frame gap, and Frame Check
Sequence. Use this keyword when the bottleneck being shaped for is an
actual Ethernet cable.
.br
    Equivalent to
.B overhead 38 mpu 84 noatm
.PP
.B ether-vlan
.br
    Adds 4 bytes to the overhead compensation, accounting for an IEEE 802.1Q
VLAN header appended to the Ethernet frame header. NB: Some ISPs use one or
even two of these within PPPoE; this keyword may be repeated as necessary to
express this.

.SH ROUND TRIP TIME PARAMETERS
Active Queue Management (AQM) consists of embedding congestion signals in the
packet flow, which receivers use to instruct senders to slow down when the queue
is persistently occupied. CAKE uses ECN signalling when available, and packet
drops otherwise, according to a combination of the Codel and BLUE AQM algorithms
called COBALT.

Very short RTTs require a very rapid AQM response to adequately control
latency. However, such a rapid response tends to impair throughput when the
actual RTT is relatively long. CAKE allows specifying the RTT it assumes for
tuning various parameters. Actual RTTs within an order of magnitude of this
will generally work well for both throughput and latency management.

At the 'lan' setting and below, the time constants are similar in magnitude to
the jitter in the Linux kernel itself, so congestion might be signalled
prematurely. The flows will then become sparse and total throughput reduced,
leaving little or no back-pressure for the fairness logic to work against. Use
the "metro" setting for local LANs unless you have a custom kernel.
.PP
.B rtt
TIME
.br
    Manually specify an RTT.
.PP
.B datacentre
.br
    For extremely high-performance 10GigE+ networks only. Equivalent to
.B rtt 100us.
.PP
.B lan
.br
    For pure Ethernet (not Wi-Fi) networks, at home or in the office. Don't
use this when shaping for an Internet access link. Equivalent to
.B rtt 1ms.
.PP
.B metro
.br
    For traffic mostly within a single city. Equivalent to
.B rtt 10ms.
.PP
.B regional
.br
    For traffic mostly within a European-sized country. Equivalent to
.B rtt 30ms.
.PP
.B internet
(default)
.br
    This is suitable for most Internet traffic. Equivalent to
.B rtt 100ms.
.PP
.B oceanic
.br
    For Internet traffic with generally above-average latency, such as that
suffered by Australasian residents. Equivalent to
.B rtt 300ms.
.PP
.B satellite
.br
    For traffic via geostationary satellites. Equivalent to
.B rtt 1000ms.
.PP
.B interplanetary
.br
    So named because Jupiter is about 1 light-hour from Earth. Use this to
(almost) completely disable AQM actions. Equivalent to
.B rtt 3600s.
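.PP
The "within an order of magnitude" guideline can be turned into a quick check.
The Python sketch below is only an illustration: the table restates the keyword
values listed above, and presets_covering() is a hypothetical helper that lists
every keyword whose assumed RTT lies within a factor of ten of a measured RTT.
It does not account for the advice above to avoid 'lan' and below on stock
kernels.
.nf
```python
# Assumed RTT for each keyword, in microseconds (values from the list above).
RTT_PRESETS_US = {
    "datacentre": 100,
    "lan": 1_000,
    "metro": 10_000,
    "regional": 30_000,
    "internet": 100_000,
    "oceanic": 300_000,
    "satellite": 1_000_000,
    "interplanetary": 3_600_000_000,
}


def presets_covering(actual_rtt_us):
    """List keywords whose assumed RTT is within 10x of the actual RTT."""
    return [
        name
        for name, assumed in RTT_PRESETS_US.items()
        if assumed <= actual_rtt_us * 10 and actual_rtt_us <= assumed * 10
    ]


# A path with a typical RTT of 20 ms
print(presets_covering(20_000))
```
.fi
For a 20 ms path this lists metro, regional and internet, which suggests the
default is already a reasonable choice in that case.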

.SH FLOW ISOLATION PARAMETERS
With flow isolation enabled, CAKE places packets from different flows into
different queues, each of which carries its own AQM state. Packets from each
queue are then delivered fairly, according to a DRR++ algorithm which minimises
latency for "sparse" flows. CAKE uses a set-associative hashing algorithm to
minimise flow collisions.

These keywords specify whether fairness based on source address, destination
address, individual flows, or any combination of those is desired.
.PP
.B flowblind
.br
    Disables flow isolation; all traffic passes through a single queue for
each tin.
.PP
.B srchost
.br
    Flows are defined only by source address. Could be useful on the egress
path of an ISP backhaul.
.PP
.B dsthost
.br
    Flows are defined only by destination address. Could be useful on the
ingress path of an ISP backhaul.
.PP
.B hosts
.br
    Flows are defined by source-destination host pairs. This is host
isolation, rather than flow isolation.
.PP
.B flows
.br
    Flows are defined by the entire 5-tuple of source address, destination
address, transport protocol, source port and destination port. This is the type
of flow isolation performed by SFQ and fq_codel.
.PP
.B dual-srchost
.br
    Flows are defined by the 5-tuple, and fairness is applied first over
source addresses, then over individual flows. Good for use on egress traffic
from a LAN to the internet, where it'll prevent any one LAN host from
monopolising the uplink, regardless of the number of flows they use.
.PP
.B dual-dsthost
.br
    Flows are defined by the 5-tuple, and fairness is applied first over
destination addresses, then over individual flows. Good for use on ingress
traffic to a LAN from the internet, where it'll prevent any one LAN host from
monopolising the downlink, regardless of the number of flows they use.
.PP
.B triple-isolate
(default)
.br
    Flows are defined by the 5-tuple, and fairness is applied over source
*and* destination addresses intelligently (i.e. not merely by host-pairs), and
also over individual flows. Use this if you're not certain whether to use
dual-srchost or dual-dsthost; it'll do both jobs at once, preventing any one
host on *either* side of the link from monopolising it with a large number of
flows.
.PP
.B nat
.br
    Instructs CAKE to perform a NAT lookup before applying flow-isolation
rules, to determine the true addresses and port numbers of the packet, to
improve fairness between hosts "inside" the NAT. This has no practical effect
in "flowblind" or "flows" modes, or if NAT is performed on a different host.
.PP
.B nonat
(default)
.br
    CAKE will not perform a NAT lookup. Flow isolation will be performed
using the addresses and port numbers directly visible to the interface CAKE is
attached to.

.SH PRIORITY QUEUE PARAMETERS
CAKE can divide traffic into "tins" based on the Diffserv field. Each tin has
its own independent set of flow-isolation queues, and is serviced based on a WRR
algorithm. To avoid perverse Diffserv marking incentives, tin weights have a
"priority sharing" value when bandwidth used by that tin is below a threshold,
and a lower "bandwidth sharing" value when above. Bandwidth is compared against
the threshold using the same algorithm as the deficit-mode shaper.

Detailed customisation of tin parameters is not provided. The following presets
perform all necessary tuning, relative to the current shaper bandwidth and RTT
settings.
.PP
.B besteffort
.br
    Disables priority queuing by placing all traffic in one tin.
.PP
.B precedence
.br
    Enables legacy interpretation of TOS "Precedence" field. Use of this
preset on the modern Internet is firmly discouraged.
.PP
.B diffserv4
.br
    Provides a general-purpose Diffserv implementation with four tins:
.br
        Bulk (CS1), 6.25% threshold, generally low priority.
.br
        Best Effort (general), 100% threshold.
.br
        Video (AF4x, AF3x, CS3, AF2x, CS2, TOS4, TOS1), 50% threshold.
.br
        Voice (CS7, CS6, EF, VA, CS5, CS4), 25% threshold.
.PP
.B diffserv3
(default)
.br
    Provides a simple, general-purpose Diffserv implementation with three tins:
.br
        Bulk (CS1), 6.25% threshold, generally low priority.
.br
        Best Effort (general), 100% threshold.
.br
        Voice (CS7, CS6, EF, VA, TOS4), 25% threshold, reduced Codel interval.
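.PP
Because the thresholds are expressed relative to the shaper bandwidth, the
absolute figure for each tin follows directly from the percentages above. The
Python sketch below only performs that multiplication; tin_thresholds() and the
table name are illustrative and not part of tc.
.nf
```python
from fractions import Fraction

# Threshold of each tin as a share of the shaper bandwidth.
TIN_SHARES = {
    "diffserv3": {
        "Bulk": Fraction(1, 16),       # 6.25%
        "Best Effort": Fraction(1),    # 100%
        "Voice": Fraction(1, 4),       # 25%
    },
    "diffserv4": {
        "Bulk": Fraction(1, 16),       # 6.25%
        "Best Effort": Fraction(1),    # 100%
        "Video": Fraction(1, 2),       # 50%
        "Voice": Fraction(1, 4),       # 25%
    },
}


def tin_thresholds(bandwidth_bps, preset="diffserv3"):
    """Return the threshold of each tin in bits per second."""
    return {
        tin: int(bandwidth_bps * share)
        for tin, share in TIN_SHARES[preset].items()
    }


# Thresholds for a 100 Mbit/s shaper with the default preset
print(tin_thresholds(100_000_000))
```
.fi
For a 100Mbit shaper with diffserv3 this gives 6250Kbit, 100Mbit and 25Mbit, the
same values reported in the "thresh" row of the statistics output in the
EXAMPLES section.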

.PP
.B fwmark
MASK
.br
    This option turns on fwmark-based overriding of CAKE's tin selection.
If set, the option specifies a bitmask that will be applied to the fwmark
associated with each packet. If the result of this masking is non-zero, the
result will be right-shifted by the number of least-significant unset bits in
the mask value, and the result will be used as the tin number for that packet.
This can be used to set policies in a firewall script that will override CAKE's
built-in tin selection.
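.PP
The mask-and-shift rule can be written out in a few lines. The Python sketch
below mirrors the description above and nothing more; fwmark_tin() is a
hypothetical name, and how CAKE treats a result larger than the number of
configured tins is not covered here.
.nf
```python
def fwmark_tin(fwmark, mask):
    """Return the tin number selected by a firewall mark, or None.

    None means the masked value was zero, so the normal tin
    selection would apply.
    """
    masked = fwmark & mask
    if masked == 0:
        return None
    # Number of least-significant unset bits in the mask:
    # (mask & -mask) isolates the lowest set bit of the mask.
    shift = (mask & -mask).bit_length() - 1
    return masked >> shift


# Mask 0x0f00 selects bits 8..11 of the mark.
print(fwmark_tin(0x1234, 0x0F00))
print(fwmark_tin(0x1034, 0x0F00))
```
.fi
With mask 0x0f00, a mark of 0x1234 yields tin number 2, while a mark of 0x1034
masks to zero and leaves the tin selection untouched.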

.SH OTHER PARAMETERS
.B memlimit
LIMIT
.br
    Limit the memory consumed by CAKE to LIMIT bytes. Note that this does
not translate directly to queue size (so do not size this based on
bandwidth-delay product considerations, but rather on worst case acceptable
memory consumption), as there is some overhead in the data structures containing
the packets, especially for small packets.

    By default, the limit is calculated based on the bandwidth and RTT
settings.

.PP
.B wash

.br
    Traffic entering your diffserv domain is frequently mis-marked in
transit from the perspective of your network, and traffic exiting yours may be
mis-marked from the perspective of the transiting provider.

Apply the wash option to clear all extra diffserv bits (but not the ECN bits)
after priority queuing has taken place.

If you are shaping inbound, and cannot trust the diffserv markings (as is the
case for Comcast Cable, among others), it is best to use a single queue
"besteffort" mode with wash.

.PP
.B split-gso

.br
    This option controls whether CAKE will split Generic Segmentation
Offload (GSO) super-packets into their on-the-wire components and
dequeue them individually.

.br
Super-packets are created by the networking stack to improve efficiency.
However, because they are larger they take longer to dequeue, which
translates to higher latency for competing flows, especially at lower
bandwidths. CAKE defaults to splitting GSO packets to achieve the lowest
possible latency. At link speeds higher than 10 Gbps, setting the
no-split-gso parameter can increase the maximum achievable throughput by
retaining the full GSO packets.

.SH OVERRIDING CLASSIFICATION WITH TC FILTERS

CAKE supports overriding of its internal classification of packets through the
tc filter mechanism. Packets can be assigned to different priority tins by
setting the
.B priority
field on the skb, and the flow hashing can be overridden by setting the
.B classid
parameter.

.PP
.B Tin override

.br
    To assign a priority tin, the major number of the priority field needs
to match the qdisc handle of the cake instance; if it does, the minor number
will be interpreted as the tin index. For example, to classify all ICMP packets
as 'bulk', the following filter can be used:

.br
    # tc qdisc replace dev eth0 handle 1: root cake diffserv3
    # tc filter add dev eth0 parent 1: protocol ip prio 1 \\
        u32 match icmp type 0 0 action skbedit priority 1:1
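.PP
The matching rule for the priority field can be sketched as follows. This
Python fragment is only an illustration of the rule as described above;
parse_handle() and tin_override() are hypothetical helpers, and the sketch
relies on tc writing handles as hexadecimal major:minor pairs.
.nf
```python
def parse_handle(text):
    """Split a tc handle such as "1:1" or "1:" into (major, minor)."""
    major, _, minor = text.partition(":")
    # tc handles are hexadecimal; an omitted part counts as zero.
    return int(major or "0", 16), int(minor or "0", 16)


def tin_override(priority, qdisc_handle):
    """Return the tin index requested by a priority value, or None.

    The override applies only when the major number of the priority
    matches the major number of the qdisc handle.
    """
    prio_major, prio_minor = parse_handle(priority)
    qdisc_major, _ = parse_handle(qdisc_handle)
    if prio_major == qdisc_major:
        return prio_minor
    return None


# Filter from the example: priority 1:1 on a qdisc with handle 1:
print(tin_override("1:1", "1:"))
# A priority meant for another qdisc is ignored
print(tin_override("2:1", "1:"))
```
.fi
In the example filter, priority 1:1 matches handle 1: and therefore selects tin
index 1, which the text above identifies as the 'bulk' tin under diffserv3.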

.PP
.B Flow hash override

.br
    To override flow hashing, the classid can be set. CAKE will interpret
the major number of the classid as the host hash used in host isolation mode,
and the minor number as the flow hash used for flow-based queueing. One or both
of those can be set, and will be used if the relevant flow isolation parameter
is set (i.e., the major number will be ignored if CAKE is not configured in
hosts mode, and the minor number will be ignored if CAKE is not configured in
flows mode).

.br
This example will assign all ICMP packets to the first queue:

.br
    # tc qdisc replace dev eth0 handle 1: root cake
    # tc filter add dev eth0 parent 1: protocol ip prio 1 \\
        u32 match icmp type 0 0 classid 0:1

.br
If only one of the host and flow overrides is set, CAKE will compute the other
hash from the packet as normal. Note, however, that the host isolation mode
works by assigning a host ID to the flow queue; so if overriding both host and
flow, the same flow cannot have more than one host assigned. In addition, it is
not possible to assign different source and destination host IDs through the
override mechanism; if a host ID is assigned, it will be used as both source and
destination host.

.SH EXAMPLES
# tc qdisc delete root dev eth0
.br
# tc qdisc add root dev eth0 cake bandwidth 100Mbit ethernet
.br
# tc -s qdisc show dev eth0
.br
qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate rtt 100.0ms noatm overhead 38 mpu 84
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 0b of 5000000b
 capacity estimate: 100Mbit
 min/max network layer size:        65535 /       0
 min/max overhead-adjusted size:    65535 /       0
 average network hdr offset:            0

                    Bulk  Best Effort        Voice
  thresh        6250Kbit      100Mbit       25Mbit
  target           5.0ms        5.0ms        5.0ms
  interval       100.0ms      100.0ms      100.0ms
  pk_delay           0us          0us          0us
  av_delay           0us          0us          0us
  sp_delay           0us          0us          0us
  pkts                 0            0            0
  bytes                0            0            0
  way_inds             0            0            0
  way_miss             0            0            0
  way_cols             0            0            0
  drops                0            0            0
  marks                0            0            0
  ack_drop             0            0            0
  sp_flows             0            0            0
  bk_flows             0            0            0
  un_flows             0            0            0
  max_len              0            0            0
  quantum            300         1514          762

After some use:
.br
# tc -s qdisc show dev eth0

qdisc cake 1: root refcnt 2 bandwidth 100Mbit diffserv3 triple-isolate rtt 100.0ms noatm overhead 38 mpu 84
 Sent 44709231 bytes 31931 pkt (dropped 45, overlimits 93782 requeues 0)
 backlog 33308b 22p requeues 0
 memory used: 292352b of 5000000b
 capacity estimate: 100Mbit
 min/max network layer size:           28 /    1500
 min/max overhead-adjusted size:       84 /    1538
 average network hdr offset:           14

                    Bulk  Best Effort        Voice
  thresh        6250Kbit      100Mbit       25Mbit
  target           5.0ms        5.0ms        5.0ms
  interval       100.0ms      100.0ms      100.0ms
  pk_delay         8.7ms        6.9ms        5.0ms
  av_delay         4.9ms        5.3ms        3.8ms
  sp_delay         727us        1.4ms        511us
  pkts              2590        21271         8137
  bytes          3081804     30302659     11426206
  way_inds             0           46            0
  way_miss             3           17            4
  way_cols             0            0            0
  drops               20           15           10
  marks                0            0            0
  ack_drop             0            0            0
  sp_flows             2            4            1
  bk_flows             1            2            1
  un_flows             0            0            0
  max_len           1514         1514         1514
  quantum            300         1514          762

.SH SEE ALSO
.BR tc (8),
.BR tc-codel (8),
.BR tc-fq_codel (8),
.BR tc-htb (8)

.SH AUTHORS
Cake's principal author is Jonathan Morton, with contributions from
Tony Ambardar, Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen,
Sebastian Moeller, Ryan Mounce, Dean Scarff, Nils Andreas Svee, and Dave Täht.

This manual page was written by Loganaden Velvindron. Please report corrections
to the Linux Networking mailing list <netdev@vger.kernel.org>.