.. _flowspec:

Flowspec
========

.. _features-of-the-current-implementation-flowspec:

Overview
--------

Flowspec introduces a new :abbr:`NLRI (Network Layer Reachability Information)`
encoding format that is used to distribute traffic flow specifications. Instead
of relying only on the destination IP address, as IP prefixes do, the NLRI
carries an n-tuple that forms a rule. That rule can be a more or less complex
combination of the following:

- Network source/destination (can be one or the other, or both).
- Layer 4 information for UDP/TCP: source port, destination port, or any port.
- Layer 4 information for ICMP type and ICMP code.
- Layer 4 information for TCP flags.
- Layer 3 information: DSCP value, protocol type, packet length, fragmentation.

Flowspec was originally defined for IPv4 rules, but it can also be used with
the IPv6 address family. The same set of combinations as defined for IPv4 can
be used.

A combination of the above rules is applied for traffic filtering. The action
to apply is encoded in specific BGP extended communities and can range from
rerouting (to a nexthop or to a separate VRF) to shaping or discarding.

The following IETF drafts and RFCs have been used to implement FRR Flowspec:

- :rfc:`5575`
- [Draft-IETF-IDR-Flowspec-redirect-IP]_
- [Draft-IETF-IDR-Flow-Spec-V6]_

.. _design-principles-flowspec:

Design Principles
-----------------

FRR implements the Flowspec client side: BGP is able to receive Flowspec
entries, but is not able to act as a manager and send Flowspec entries.

Linux provides the following mechanisms to implement policy based routing:

- Filtering the traffic with ``Netfilter``.
  ``Netfilter`` provides a set of tools like ``ipset`` and ``iptables`` that are
  powerful enough to express Flowspec filter rules.

- Using non-standard routing tables via ``iproute2`` (through the ``ip rule``
  command provided by ``iproute2``).
  ``iproute2`` is already used by FRR's :ref:`pbr` daemon, which provides basic
  policy based routing based on IP source and destination criteria.

The example below illustrates what Flowspec will inject into the underlying
system:

.. code-block:: shell

   # linux shell
   # Set of (source network, destination network) pairs, with counters.
   ipset create match0x102 hash:net,net counters
   ipset add match0x102 32.0.0.0/16,40.0.0.0/16
   # User chain that marks matching packets and accepts them.
   iptables -N match0x102 -t mangle
   iptables -A match0x102 -t mangle -j MARK --set-mark 102
   iptables -A match0x102 -t mangle -j ACCEPT
   # Send packets received on ntfp3 that match the set to that chain.
   iptables -i ntfp3 -t mangle -I PREROUTING -m set --match-set match0x102 \
             src,dst -g match0x102
   # Route marked packets using the dedicated routing table.
   ip rule add fwmark 102 lookup 102
   ip route add 40.0.0.0/16 via 44.0.0.2 table 102

For handling an incoming Flowspec entry, the following workflow is applied:

- Incoming Flowspec entries are handled by *bgpd* and stored in the BGP RIB.
- The Flowspec entry is installed according to its complexity.

  It will be installed if one of the following filtering actions is seen in the
  BGP extended community: either redirect IP or redirect VRF, in conjunction
  with the rate option, for redirecting traffic; or the rate option set to 0,
  for discarding traffic.

  Depending on the degree of complexity of the Flowspec entry, it will be
  installed in the *zebra* RIB. For more information about which rules the FRR
  implementation supports, see the :ref:`flowspec-known-issues` chapter. The
  Flowspec entry is split into several parts before being sent to *zebra*.

- The *zebra* daemon receives the policy routing configuration.

  The Policy Based Routing entities necessary to policy route the traffic in
  the underlying system are received by *zebra*. Two filtering contexts will be
  created or appended in ``Netfilter``: an ``ipset`` context and an ``iptable``
  context. The former is used to define an IP filter based on multiple
  criteria. For instance, an ipset ``net,net`` is based on two IP addresses,
  while ``net,port,net`` is based on two IP addresses and one port (for ICMP,
  UDP, or TCP). The way the filter is applied (for example, whether the source
  port or the destination port is used) is defined by the latter filtering
  context. The ``iptable`` command references the ``ipset`` context and tells
  how to filter and what to do. In our case, a marker is set to tell
  ``iproute2`` where to forward the traffic. Sometimes, for the drop action,
  there is no need to add a marker; the ``iptable`` will simply drop all
  packets matching the ``ipset`` entry.
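
The drop case described above can be sketched as follows. This is only an
illustration under stated assumptions, not the exact commands that *zebra*
issues: the set name ``match0x103``, the prefixes and the port are
hypothetical, and the ``run`` helper is introduced here so that the commands
are printed rather than executed by default (executing them requires root).

.. code-block:: shell

   # Hypothetical helper: print each command unless DRY_RUN=0 is set.
   run() {
       if [ "${DRY_RUN:-1}" = "1" ]; then
           echo "$*"
       else
           "$@"
       fi
   }

   # Sketch of a drop action: no mark and no ip rule are needed, the
   # iptables rule drops matching packets directly.
   flowspec_drop_sketch() {
       set_name=$1   # hypothetical ipset object name
       src_net=$2    # source prefix
       dst_port=$3   # protocol:port, for example udp:53
       dst_net=$4    # destination prefix

       # net,port,net set: two prefixes and one port.
       run ipset create "$set_name" hash:net,port,net counters
       run ipset add "$set_name" "$src_net,$dst_port,$dst_net"
       # src,dst,dst tells which packet field each set dimension is
       # compared with: source address, destination port, destination address.
       run iptables -t mangle -I PREROUTING -m set \
           --match-set "$set_name" src,dst,dst -j DROP
   }

   flowspec_drop_sketch match0x103 32.0.0.0/16 udp:53 40.0.0.0/16

With the default dry-run setting, the last line only prints the three
commands, which makes it easy to compare them with the redirect example shown
earlier: the ``MARK`` chain and the ``ip rule`` entry are absent.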
104
105 Configuration Guide
106 -------------------
107
108 In order to configure an IPv4 Flowspec engine, use the following configuration.
109 As of today, it is only possible to configure Flowspec on the default VRF.
110
111 .. code-block:: frr
112
113 router bgp <AS>
114 neighbor <A.B.C.D> remote-as <remoteAS>
115 neighbor <A:B::C:D> remote-as <remoteAS2>
116 address-family ipv4 flowspec
117 neighbor <A.B.C.D> activate
118 exit
119 address-family ipv6 flowspec
120 neighbor <A:B::C:D> activate
121 exit
122 exit
123
124 You can see Flowspec entries, by using one of the following show commands:
125
126 .. clicmd:: show bgp ipv4 flowspec [detail | A.B.C.D]
127
128 .. clicmd:: show bgp ipv6 flowspec [detail | A:B::C:D]
129
Per-interface configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^

A useful feature is the ability to apply Flowspec to a specific interface,
instead of applying it to the whole machine. Although the IETF draft
[Draft-IETF-IDR-Flowspec-Interface-Set]_ is not implemented, it is possible to
manually limit Flowspec application to some incoming interfaces. Not doing so
can result in unexpected behaviour, such as accounting for the traffic twice or
slowing the traffic down (filtering has a cost). To limit Flowspec to one
specific interface, use the following command under the
`flowspec address-family` node.

.. clicmd:: local-install <IFNAME | any>

By default, Flowspec is activated on all interfaces. Installing it on a named
interface restricts Flowspec to that interface. Conversely, enabling ``any``
interface flushes all previously configured interfaces.

VRF redirection
^^^^^^^^^^^^^^^

Another useful feature is the ability to redirect traffic to a separate VRF.
This does not conflict with the restriction that Flowspec can only be
configured on the default VRF: when you receive incoming BGP flowspec entries
on the default VRF, you can redirect traffic to another VRF.

As a reminder, BGP flowspec entries have a BGP extended community that contains
a Route Target. A local VRF is found from that Route Target as follows:

- Each VRF must be configured with its Route Target set.
  Each VRF is configured within a BGP VRF instance with its own Route Target
  list. The accepted Route Target formats are ``A.B.C.D:U16``, ``U16:U32``, and
  ``U32:U16``.

- The first VRF with the matching Route Target will be selected to route traffic
  to. Use the following command under the ipv4 unicast address-family node:

.. clicmd:: rt redirect import RTLIST...

To illustrate, if the Route Target configured in the Flowspec entry is
``E.F.G.H:II``, then the BGP VRF instance configured with the same Route Target
will be selected. The full configuration example below shows how Route Targets
are configured and how VRFs and cross-VRF configuration are set up. Note that
the VRFs are mapped onto Linux network namespaces. For data traffic to cross
VRF boundaries, virtual ethernet interfaces are created with a private IP
addressing scheme.

.. code-block:: frr

   router bgp <ASx>
    neighbor <A.B.C.D> remote-as <ASz>
    address-family ipv4 flowspec
     neighbor <A.B.C.D> activate
    exit
   exit
   router bgp <ASy> vrf vrf2
    address-family ipv4 unicast
     rt redirect import <E.F.G.H:II>
    exit
   exit
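
The three accepted Route Target formats can be told apart with a small shell
helper. This is a hedged sketch and not FRR's own parser: the function names
``is_uint_max``, ``is_ipv4`` and ``rt_format`` are hypothetical, and when both
fields fit in 16 bits the sketch arbitrarily reports ``U16:U32``.

.. code-block:: shell

   # Hypothetical helper: succeed if $1 is a decimal integer no larger than $2.
   is_uint_max() {
       case $1 in
           ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
       esac
       [ "${#1}" -le 10 ] && [ "$1" -le "$2" ]
   }

   # Hypothetical helper: succeed if $1 is a dotted quad with octets in 0..255.
   is_ipv4() {
       case $1 in
           *[!0-9.]*) return 1 ;;     # only digits and dots are allowed
       esac
       old_ifs=$IFS; IFS=.
       set -- $1                      # split the address on dots
       IFS=$old_ifs
       [ $# -eq 4 ] || return 1
       for octet in "$@"; do
           is_uint_max "$octet" 255 || return 1
       done
   }

   # Print the format name of a Route Target string, or "invalid".
   rt_format() {
       admin=${1%:*}     # part before the last colon
       value=${1##*:}    # part after the last colon
       if [ "$admin" = "$1" ]; then   # no colon at all
           echo invalid
       elif is_ipv4 "$admin" && is_uint_max "$value" 65535; then
           echo A.B.C.D:U16
       elif is_uint_max "$admin" 65535 && is_uint_max "$value" 4294967295; then
           echo U16:U32
       elif is_uint_max "$admin" 4294967295 && is_uint_max "$value" 65535; then
           echo U32:U16
       else
           echo invalid
       fi
   }

   rt_format 10.0.0.1:100     # prints A.B.C.D:U16
   rt_format 65000:100000     # prints U16:U32
   rt_format 100000:10        # prints U32:U16

Such a check can be handy before pasting a Route Target into
``rt redirect import``, but the authoritative validation remains the one done
by the FRR CLI.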

Similarly, it is possible to do the same for IPv6 flowspec rules by using an
IPv6 extended community. The format is defined in :rfc:`5701`. That community
contains an IPv6 address encoded in the attribute, which is matched against the
IPv6 import route target configured locally under the appropriate BGP VRF
instance. The example below defines an IPv6 extended community containing the
`E:F::G:H` address followed by 2 bytes chosen by the administrator (here `JJ`).

.. code-block:: frr

   router bgp <ASx>
    neighbor <A:B::C:D> remote-as <ASz>
    address-family ipv6 flowspec
     neighbor <A:B::C:D> activate
    exit
   exit
   router bgp <ASy> vrf vrf2
    address-family ipv6 unicast
     rt6 redirect import <E:F::G:H:JJ>
    exit
   exit


Flowspec monitoring & troubleshooting
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can monitor policy-routing objects by using one of the following commands.
Those commands rely on the filtering contexts configured from BGP, and report
the statistics retrieved from the underlying system. In other words, those
statistics are retrieved from ``Netfilter``.

.. clicmd:: show pbr ipset IPSETNAME | iptable

   ``IPSETNAME`` is the policy routing object name created by ``ipset``. For
   rule contexts, it is possible to know which rule has been configured to
   policy-route some specific traffic. The :clicmd:`show pbr iptable` command
   displays, for forwarded traffic, which table is used. It is then easy to use
   that table identifier to dump the routing table that the forwarded traffic
   will match.

.. clicmd:: show ip route table TABLEID

   ``TABLEID`` is the table number identifier referencing the non-standard
   routing table used in this example.

.. clicmd:: debug bgp flowspec

   You can troubleshoot Flowspec, or BGP policy based routing. For instance, if
   you encounter issues when decoding a Flowspec entry, you should enable
   :clicmd:`debug bgp flowspec`.

.. clicmd:: debug bgp pbr [error]

   If a flowspec entry fails to be applied in *zebra*, the cause is likely
   related to the policy routing mechanism. Here,
   :clicmd:`debug bgp pbr error` could help.

   To get information about the policy routing contexts that are created or
   removed, use the :clicmd:`debug bgp pbr` command alone.

You can verify that a Flowspec entry has been correctly installed and that
incoming traffic is policy-routed correctly as demonstrated below. First, check
whether the Flowspec entry has been installed.

.. code-block:: frr

   CLI# show bgp ipv4 flowspec 5.5.5.2/32
    BGP flowspec entry: (flags 0x418)
      Destination Address 5.5.5.2/32
      IP Protocol = 17
      Destination Port >= 50 , <= 90
      FS:redirect VRF RT:255.255.255.255:255
      received for 18:41:37
      installed in PBR (match0x271ce00)

This means that the Flowspec entry has been installed in an ``iptable`` named
``match0x271ce00``. Once you have confirmed that it is installed, you can look
up the associated entry by executing the following command. You can also check
whether incoming traffic has been matched by looking at the counter line.

.. code-block:: frr

   CLI# show pbr ipset match0x271ce00
   IPset match0x271ce00 type net,port
        to 5.5.5.0/24:proto 6:80-120 (8)
           pkts 1000, bytes 1000000
        to 5.5.5.2:proto 17:50-90 (5)
           pkts 1692918, bytes 157441374

As you can see, the entry is present. Note that an ``iptable`` entry can be
used to host several Flowspec entries. To know where the matching traffic is
redirected to, you have to look at the policy routing rules. Policy routing is
done by forwarding traffic to a routing table number. That routing table number
is reached by using an ``iptable``. The relationship between the routing table
number and the incoming traffic is a ``MARKER`` that is set by the ``iptable``
referencing the ``ipset``. In the Flowspec case, the ``iptable`` and the
``ipset`` context it references have the same name, so it is easy to know which
routing table is used by issuing the following command:

.. code-block:: frr

   CLI# show pbr iptable
      IPtable match0x271ce00 action redirect (5)
        pkts 1700000, bytes 158000000
        table 257, fwmark 257
   ...

As you can see by using the following Linux commands, the MARKER ``0x101``
(257 in decimal, the ``fwmark`` value shown above) is present in both the
``iptable`` and ``ip rule`` contexts.

.. code-block:: shell

   # iptables -t mangle --list match0x271ce00 -v
   Chain match0x271ce00 (1 references)
    pkts bytes target     prot opt in     out     source       destination
   1700K  158M MARK       all  --  any    any     anywhere     anywhere
        MARK set 0x101
   1700K  158M ACCEPT     all  --  any    any     anywhere     anywhere

   # ip rule list
   0:from all lookup local
   0:from all fwmark 0x101 lookup 257
   32766:from all lookup main
   32767:from all lookup default

This allows us to see where the traffic is forwarded to.
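
The ``fwmark 257`` reported by ``show pbr iptable`` and the ``0x101`` shown by
``iptables`` and ``ip rule`` are the same value, written in decimal and in
hexadecimal. As a convenience that is not part of FRR, the conversion can be
done in a shell with ``printf``:

.. code-block:: shell

   # Decimal fwmark (as shown by "show pbr iptable") to hexadecimal.
   printf '0x%x\n' 257     # prints 0x101

   # Hexadecimal mark (as shown by iptables and ip rule) to decimal.
   printf '%d\n' 0x101     # prints 257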

.. _flowspec-known-issues:

Limitations / Known Issues
--------------------------

As you can see, Flowspec is rich and can be very complex. As of today, not all
Flowspec rules can be converted into Policy Based Routing actions.

- The ``Netfilter`` driver is not integrated into FRR yet. Without this piece
  of code, flowspec entries cannot be injected into the underlying system.

- There are some limitations around filtering contexts.

  Take UDP or TCP ports in Flowspec as an example: the information can be a
  range of ports or a unique value. This case is handled. However, a flow can
  also be a combination of a list of port ranges and an enumeration of unique
  values; this more complex case is not handled. Similarly, it is not possible
  to create a filter on both source port and destination port, for instance a
  filter on source ports in [1-1000] and destination port = 80. The same kind
  of combination is not possible for packet length, ICMP type, or ICMP code.

There are some other known issues:

- The validation procedure depicted in :rfc:`5575` is not available.

  This validation procedure has not been implemented, as this feature was not
  used in the existing setups.

- The filtering action shaper value, if positive, is not used to apply shaping.

  If the value is positive, the traffic is redirected to the desired
  destination, without any other action configured by Flowspec. It is
  recommended to configure Quality of Service if needed, more globally on a
  per-interface basis.

- Upon an unexpected crash or other event, *zebra* may not have time to flush
  PBR contexts.

  That is to say, the ``ipset``, ``iptable`` and ``ip rule`` contexts. This is
  also a consequence of the fact that ip rule / ipset / iptables contexts are
  not discovered at startup (the contexts previously created for Flowspec
  cannot be read back).

Appendix
--------

More information is available in a public presentation that explains the
design of Flowspec inside FRRouting.

[Presentation]_

.. [Draft-IETF-IDR-Flowspec-redirect-IP] <https://tools.ietf.org/id/draft-ietf-idr-flowspec-redirect-ip-02.txt>
.. [Draft-IETF-IDR-Flowspec-Interface-Set] <https://tools.ietf.org/id/draft-ietf-idr-flowspec-interfaceset-03.txt>
.. [Draft-IETF-IDR-Flow-Spec-V6] <https://tools.ietf.org/id/draft-ietf-idr-flow-spec-v6-10.txt>
.. [Presentation] <https://docs.google.com/presentation/d/1ekQygUAG5yvQ3wWUyrw4Wcag0LgmbW1kV02IWcU4iUg/edit#slide=id.g378f0e1b5e_1_44>