.. _flowspec:

Flowspec
========

.. _features-of-the-current-implementation-flowspec:

Overview
---------

Flowspec introduces a new :abbr:`NLRI (Network Layer Reachability Information)`
encoding format that is used to distribute traffic rule flow specifications.
Instead of relying only on the destination IP address, as plain IP prefixes do,
the IP prefix is replaced by an n-tuple that forms a rule. That rule can be a
more or less complex combination of the following:

- Network source/destination (can be one or the other, or both).
- Layer 4 information for UDP/TCP: source port, destination port, or any port.
- Layer 4 information for ICMP type and ICMP code.
- Layer 4 information for TCP Flags.
- Layer 3 information: DSCP value, Protocol type, packet length, fragmentation.
- Misc layer 4 TCP flags.

A combination of the above rules is applied for traffic filtering. The action to
take is encoded in specific BGP extended communities and can range from
rerouting (to a nexthop or to a separate VRF) to shaping or discarding.

The following IETF drafts and RFCs have been used to implement FRR Flowspec:

- :rfc:`5575`
- [Draft-IETF-IDR-Flowspec-redirect-IP]_

.. _design-principles-flowspec:

Design Principles
-----------------

FRR implements the Flowspec client side: BGP is able to receive Flowspec
entries, but is not able to act as a manager and send Flowspec entries.

Linux provides the following mechanisms to implement policy based routing:

- Filtering the traffic with ``Netfilter``.
  ``Netfilter`` provides a set of tools like ``ipset`` and ``iptables`` that are
  powerful enough to filter traffic according to such Flowspec rules.

- Using non-standard routing tables via ``iproute2`` (via the ``ip rule``
  command provided by ``iproute2``).
  ``iproute2`` is already used by FRR's :ref:`pbr` daemon, which provides basic
  policy based routing based on IP source and destination criteria.

The example below illustrates what Flowspec injects into the underlying
system:

.. code-block:: shell

   # linux shell
   ipset create match0x102 hash:net,net counters
   ipset add match0x102 32.0.0.0/16,40.0.0.0/16
   iptables -N match0x102 -t mangle
   iptables -A match0x102 -t mangle -j MARK --set-mark 102
   iptables -A match0x102 -t mangle -j ACCEPT
   iptables -i ntfp3 -t mangle -I PREROUTING -m set --match-set match0x102 \
       src,dst -g match0x102
   ip rule add fwmark 102 lookup 102
   ip route add 40.0.0.0/16 via 44.0.0.2 table 102

An incoming Flowspec entry is handled with the following workflow:

- Incoming Flowspec entries are handled by *bgpd* and stored in the BGP RIB.
- The Flowspec entry is installed according to its complexity.

  It will be installed if one of the following filtering actions is seen on the
  BGP extended community: either redirect IP, or redirect VRF, in conjunction
  with rate option, for redirecting traffic. Or rate option set to 0, for
  discarding traffic.

  Depending on the degree of complexity of the Flowspec entry, it will be
  installed in the *zebra* RIB. For more information about what is supported in
  the FRR implementation as a rule, see the :ref:`flowspec-known-issues`
  chapter. The Flowspec entry is split into several parts before being sent to
  *zebra*.

- The *zebra* daemon receives the policy routing configuration.

  The Policy Based Routing entities needed to policy route the traffic in the
  underlying system are received by *zebra*. Two filtering contexts will be
  created or appended in ``Netfilter``: the ``ipset`` and ``iptable`` contexts.
  The former is used to define an IP filter based on multiple criteria. For
  instance, an ipset ``net,net`` is based on two IP addresses, while
  ``net,port,net`` is based on two IP addresses and one port (for ICMP, UDP, or
  TCP). The way the filtering is used (for example, is src port or dst port
  used?) is defined by the latter filtering context. The ``iptable`` command
  references the ``ipset`` context and tells how to filter and what to do. In
  our case, a marker is set to tell ``iproute2`` where to forward the traffic.
  For a drop action, a marker is sometimes not needed; the ``iptable`` simply
  drops all packets matching the ``ipset`` entry.

Configuration Guide
-------------------

In order to configure an IPv4 Flowspec engine, use the following configuration.
As of today, it is only possible to configure Flowspec on the default VRF.

.. code-block:: frr

   router bgp <AS>
    neighbor <A.B.C.D> remote-as <remoteAS>
    address-family ipv4 flowspec
     neighbor <A.B.C.D> activate
    exit
   exit

You can display Flowspec entries with the following show command:

.. index:: show bgp ipv4 flowspec [detail | A.B.C.D]
.. clicmd:: show bgp ipv4 flowspec [detail | A.B.C.D]


Per-interface configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Flowspec can be applied to a specific interface instead of the whole machine.
Although the IETF draft [Draft-IETF-IDR-Flowspec-Interface-Set]_ is not
implemented, it is possible to manually limit Flowspec application to some
incoming interfaces. Not doing so can lead to unexpected behaviour, such as
traffic being accounted twice or being slowed down (filtering costs). To limit
Flowspec to one specific interface, use the following command under the
`flowspec address-family` node.

.. index:: [no] local-install <IFNAME | any>
.. clicmd:: [no] local-install <IFNAME | any>

By default, Flowspec is activated on all interfaces. Installing it on a named
interface restricts Flowspec to that interface. Conversely, enabling ``any``
interface flushes all previously configured interfaces.

VRF redirection
^^^^^^^^^^^^^^^

Traffic can also be redirected to a separate VRF. This does not conflict with
the restriction that Flowspec can only be configured on the default VRF: BGP
flowspec entries are received on the default VRF, and the matching traffic can
then be redirected to another VRF.

As a reminder, BGP flowspec entries have a BGP extended community that contains
a Route Target. The local VRF is found from that Route Target as follows:

- Each VRF must be configured with its Route Target set.
  Each VRF is configured within a BGP VRF instance with its own Route Target
  list. The accepted Route Target formats are ``A.B.C.D:U16``, ``U16:U32`` and
  ``U32:U16``.

- The first VRF with the matching Route Target is selected to route traffic
  to. Use the following command under the ipv4 unicast address-family node.

.. index:: [no] rt redirect import RTLIST...
.. clicmd:: [no] rt redirect import RTLIST...

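As a rough illustration of the three Route Target formats listed above, the
following Python sketch classifies a Route Target string. It is not taken from
FRR: the function name ``rt_formats`` is hypothetical, and it assumes that
``U16`` and ``U32`` denote unsigned 16-bit and 32-bit decimal integers.

.. code-block:: python

   import ipaddress

   U16_MAX = 0xFFFF
   U32_MAX = 0xFFFFFFFF


   def _is_uint(text):
       """True if text is a non-empty string of ASCII decimal digits."""
       return text.isascii() and text.isdigit()


   def rt_formats(rt):
       """Return the documented formats that the string rt fits (hypothetical helper)."""
       admin, sep, local = rt.partition(":")
       if not sep or not _is_uint(local):
           return []
       local_val = int(local)
       if "." in admin:
           # Dotted form: the administrator part must be a valid IPv4 address.
           try:
               ipaddress.IPv4Address(admin)
           except ipaddress.AddressValueError:
               return []
           return ["A.B.C.D:U16"] if local_val <= U16_MAX else []
       if not _is_uint(admin):
           return []
       admin_val = int(admin)
       matches = []
       if admin_val <= U16_MAX and local_val <= U32_MAX:
           matches.append("U16:U32")
       if admin_val <= U32_MAX and local_val <= U16_MAX:
           matches.append("U32:U16")
       return matches

A value such as ``100:200`` fits both ``U16:U32`` and ``U32:U16``, so the
sketch reports both rather than guessing which one applies.
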
For example, if the Route Target configured in the Flowspec entry is
``E.F.G.H:II``, then a BGP VRF instance with the same Route Target must be
set. That VRF will then be selected. The full configuration example below
shows how Route Targets are configured and how VRFs and cross-VRF
configuration are done. Note that the VRFs are mapped on Linux Network
Namespaces. For data traffic to cross VRF boundaries, virtual ethernet
interfaces are created with a private IP addressing scheme.

.. code-block:: frr

   router bgp <ASx>
    neighbor <A.B.C.D> remote-as <ASz>
    address-family ipv4 flowspec
     neighbor <A.B.C.D> activate
    exit
   exit
   router bgp <ASy> vrf vrf2
    address-family ipv4 unicast
     rt redirect import <E.F.G.H:II>
    exit
   exit

Flowspec monitoring & troubleshooting
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can monitor policy-routing objects by using one of the following commands.
These commands rely on the filtering contexts configured from BGP and report
statistics retrieved from the underlying system. In other words, the
statistics are retrieved from ``Netfilter``.

.. index:: show pbr ipset IPSETNAME | iptable
.. clicmd:: show pbr ipset IPSETNAME | iptable

   ``IPSETNAME`` is the policy routing object name created by ``ipset``. For
   rule contexts, it is possible to know which rule has been configured to
   policy-route some specific traffic. The :clicmd:`show pbr iptable` command
   displays which table is used for forwarded traffic. That table identifier
   can then be used to dump the routing table that the forwarded traffic will
   match.

.. index:: show ip route table TABLEID
.. clicmd:: show ip route table TABLEID

   ``TABLEID`` is the table number identifier referencing the non standard
   routing table used in this example.

.. index:: [no] debug bgp flowspec
.. clicmd:: [no] debug bgp flowspec

   You can troubleshoot Flowspec, or BGP policy based routing. For instance, if
   you encounter issues when decoding a Flowspec entry, you should enable
   :clicmd:`debug bgp flowspec`.

.. index:: [no] debug bgp pbr [error]
.. clicmd:: [no] debug bgp pbr [error]

   If a Flowspec entry fails to be applied in *zebra*, the problem is likely
   related to the policy routing mechanism. In that case,
   :clicmd:`debug bgp pbr error` can help.

   To get information about policy routing contexts being created or removed,
   use the :clicmd:`debug bgp pbr` command alone.

You can check that a Flowspec entry has been correctly installed and that
incoming traffic is policy-routed correctly as demonstrated below. First,
check whether the Flowspec entry has been installed.

.. code-block:: frr

   CLI# show bgp ipv4 flowspec 5.5.5.2/32
   BGP flowspec entry: (flags 0x418)
   Destination Address 5.5.5.2/32
   IP Protocol = 17
   Destination Port >= 50 , <= 90
   FS:redirect VRF RT:255.255.255.255:255
   received for 18:41:37
   installed in PBR (match0x271ce00)

This means that the Flowspec entry has been installed in an ``iptable`` named
``match0x271ce00``. Once you have confirmed that it is installed, you can look
for the associated entry by executing the following command. You can also
check whether incoming traffic has been matched by looking at the counter
line.

.. code-block:: frr

   CLI# show pbr ipset match0x271ce00
   IPset match0x271ce00 type net,port
   to 5.5.5.0/24:proto 6:80-120 (8)
   pkts 1000, bytes 1000000
   to 5.5.5.2:proto 17:50-90 (5)
   pkts 1692918, bytes 157441374

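If the counters need to be checked from a script, the sample output above can
be parsed with a few lines of Python. This is only a sketch: the layout is
inferred from the sample shown here and may differ in other FRR versions, and
the names ``COUNTER_RE``, ``ipset_counters`` and ``SAMPLE`` are hypothetical.

.. code-block:: python

   import re

   # Counter line as it appears in the sample output above.
   COUNTER_RE = re.compile(r"pkts\s+(\d+),\s+bytes\s+(\d+)")

   SAMPLE = """\
   IPset match0x271ce00 type net,port
   to 5.5.5.0/24:proto 6:80-120 (8)
   pkts 1000, bytes 1000000
   to 5.5.5.2:proto 17:50-90 (5)
   pkts 1692918, bytes 157441374
   """


   def ipset_counters(output):
       """Pair each counter line with the entry line printed just before it."""
       results = []
       previous = None
       for raw in output.splitlines():
           line = raw.strip()
           if not line:
               continue
           match = COUNTER_RE.fullmatch(line)
           if match and previous is not None:
               results.append((previous, int(match.group(1)), int(match.group(2))))
           else:
               previous = line
       return results

An entry whose packet counter is zero in the returned list has not matched any
incoming traffic yet.
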
As you can see, the entry is present. Note that an ``iptable`` entry can host
several Flowspec entries. To find out where the matching traffic is redirected
to, look at the policy routing rules. Policy routing is done by forwarding
traffic to a routing table number, which is reached through an ``iptable``.
The link between the incoming traffic and the routing table number is a
``MARKER`` set by the ``iptable`` that references the ``ipset``. In the
Flowspec case, the ``iptable`` and the ``ipset`` context it references have the
same name, so it is easy to find out which routing table is used by issuing the
following command:

.. code-block:: frr

   CLI# show pbr iptable
   IPtable match0x271ce00 action redirect (5)
   pkts 1700000, bytes 158000000
   table 257, fwmark 257
   ...

The following Linux commands show that the MARKER ``0x101`` is present in both
the ``iptable`` and ``ip rule`` contexts.

.. code-block:: shell

   # iptables -t mangle --list match0x271ce00 -v
   Chain match0x271ce00 (1 references)
   pkts bytes target prot opt in out source destination
   1700K 158M MARK all -- any any anywhere anywhere
   MARK set 0x101
   1700K 158M ACCEPT all -- any any anywhere anywhere

   # ip rule list
   0:from all lookup local
   0:from all fwmark 0x101 lookup 257
   32766:from all lookup main
   32767:from all lookup default

This shows where the traffic is forwarded to.

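Note that in the samples above ``show pbr iptable`` prints the mark in decimal
(``fwmark 257``), while ``iptables`` and ``ip rule`` print it in hexadecimal
(``0x101``). A minimal Python sketch for comparing the two notations follows;
the helper name ``mark_matches`` is hypothetical.

.. code-block:: python

   def mark_matches(iptables_mark, fwmark):
       """Compare a mark string such as '0x101' with a decimal fwmark."""
       # Base 0 lets int() honour the 0x prefix used by iptables and ip rule.
       return int(iptables_mark, 0) == fwmark

For the values shown above, ``mark_matches("0x101", 257)`` is true, which
confirms that both outputs refer to the same mark.
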
.. _flowspec-known-issues:

Limitations / Known Issues
--------------------------

As you can see, Flowspec is rich and can be very complex. As of today, not all
Flowspec rules can be converted into Policy Based Routing actions.

- The ``Netfilter`` driver is not integrated into FRR yet. Without this piece
  of code, Flowspec entries cannot be injected into the underlying system.

- There are some limitations around filtering contexts.

  Taking UDP or TCP ports in Flowspec as an example, the information can be a
  range of ports or a unique value. This case is handled. However, a flow can
  be more complex and combine a list of port ranges with an enumeration of
  unique values. This case is not handled. Similarly, it is not possible to
  create a filter for both src port and dst port, for instance a filter on src
  port in [1-1000] and dst port = 80. The same kind of complexity is not
  possible for packet length, ICMP type, and ICMP code.

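The port limitations described above can be summarised in a small Python
sketch. It only models the rules stated in this section and is not FRR code:
the function name ``is_supported_port_match`` and the representation of ports
as lists of ``(low, high)`` tuples are assumptions made for illustration.

.. code-block:: python

   def is_supported_port_match(src_ports, dst_ports):
       """Model of the documented limits (hypothetical helper).

       Each argument is a list of (low, high) tuples; a unique value is
       written as (port, port).
       """
       # Filtering on both src port and dst port is not handled.
       if src_ports and dst_ports:
           return False
       ports = src_ports or dst_ports
       # Only one range or one unique value is handled, not a list of them.
       return len(ports) <= 1

With this model, a single destination range such as ``[(50, 90)]`` is accepted,
while a source range combined with a destination port, or a list of several
ranges and values, is rejected.
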
There are some other known issues:

- The validation procedure depicted in :rfc:`5575` is not available.

  This validation procedure has not been implemented, as this feature was not
  used in the existing setups.

- The filtering action shaper value, if positive, is not used to apply shaping.

  If the value is positive, the traffic is redirected to the desired
  destination, without any other action configured by Flowspec.
  If needed, it is recommended to configure Quality of Service more globally,
  on a per-interface basis.

- Upon an unexpected crash or other event, *zebra* may not have time to flush
  PBR contexts.

  That is to say the ``ipset``, ``iptable`` and ``ip rule`` contexts. This is
  also a consequence of the fact that ip rule / ipset / iptables contexts are
  not discovered at startup (*zebra* is not able to read the contexts
  previously created by Flowspec).

Appendix
--------

A public presentation explains the design of Flowspec inside FRRouting in more
detail.

[Presentation]_

.. [Draft-IETF-IDR-Flowspec-redirect-IP] <https://tools.ietf.org/id/draft-ietf-idr-flowspec-redirect-ip-02.txt>
.. [Draft-IETF-IDR-Flowspec-Interface-Set] <https://tools.ietf.org/id/draft-ietf-idr-flowspec-interfaceset-03.txt>
.. [Presentation] <https://docs.google.com/presentation/d/1ekQygUAG5yvQ3wWUyrw4Wcag0LgmbW1kV02IWcU4iUg/edit#slide=id.g378f0e1b5e_1_44>