Commit | Line | Data |
---|---|---|
cf9b9f77 OD |
1 | OSPF Segment Routing |
2 | ==================== | |
3 | ||
6f751f14 | 4 | This is EXPERIMENTAL support for `RFC 8665`. |
cf9b9f77 OD |
5 | DON'T use it in a production network. |
6 | ||
e3e3afff QY |
7 | Supported Features |
8 | ------------------ | |
9 | ||
10 | * Automatic computation of Primary and Backup Adjacency SID with | |
11 | Cisco experimental remote IP address | |
6f751f14 | 12 | * SRGB & SRLB configuration |
e3e3afff QY |
13 | * Prefix configuration for Node SID with optional NO-PHP flag (Linux |
14 | kernel supports both modes) | |
15 | * Node MSD configuration (with Linux Kernel >= 4.10 a maximum of 32 labels | |
16 | can be stacked) | |
17 | * Automatic provisioning of MPLS table | |
6f751f14 | 18 | * Equal Cost Multi-Path (ECMP) |
e3e3afff | 19 | * Static route configuration with label stack up to 32 labels |
3e94c9a4 | 20 | * TI-LFA (for P2P interfaces only) |
e3e3afff QY |
21 | |
22 | Interoperability | |
23 | ---------------- | |
24 | ||
25 | * Tested on various topologies including point-to-point and LAN interfaces | |
8678d638 | 26 | in a mix of FRRouting instances and Cisco IOS-XR 6.0.x |
e3e3afff QY |
27 | * Checked OSPF LSA conformity with the Wireshark release 2.5.0-rc |
28 | ||
cf9b9f77 OD |
29 | Implementation details |
30 | ---------------------- | |
7743f2f8 OD |
31 | |
32 | Concepts | |
75ca3b11 | 33 | ^^^^^^^^ |
cf9b9f77 | 34 | |
56f0bea7 | 35 | Segment Routing uses 3 different Opaque LSAs in OSPF to carry the various |
fd3b19f2 OD |
36 | pieces of information: |
37 | ||
7743f2f8 OD |
38 | * **Router Information:** floods the Segment Routing capabilities of the node. |
39 | This includes the supported algorithms, the Segment Routing Global Block | |
40 | (SRGB) and the Maximum Stack Depth (MSD). | |
41 | * **Extended Link:** floods the Adjacency and LAN Adjacency Segment Identifiers | |
42 | * **Extended Prefix:** floods the Prefix Segment Identifier | |
cf9b9f77 | 43 | |
56f0bea7 | 44 | The implementation follows the earlier TE and Router Information code. It uses the |
7743f2f8 | 45 | OPAQUE LSA functions defined in ospf_opaque.[c,h] as well as the OSPF API. The |
fd3b19f2 OD |
46 | latter is mandatory for the implementation as it provides the callback to |
47 | Segment Routing functions (see below) when an Extended Link / Prefix or Router | |
7743f2f8 | 48 | Information LSA is received. |
cf9b9f77 | 49 | |
7743f2f8 | 50 | Overview |
75ca3b11 | 51 | ^^^^^^^^ |
7726c479 | 52 | |
cf9b9f77 | 53 | The following files were modified or added: |
7743f2f8 OD |
54 | |
55 | * ospf_ri.[c,h] have been modified to add the new TLVs for Segment Routing. | |
56 | * ospf_ext.[c,h] implement RFC7684 as base support of Extended Link and Prefix | |
57 | Opaque LSA. | |
58 | * ospf_sr.[c,h] implement the heart of Segment Routing. It adds a new Segment | |
59 | Routing database to manage Segment Identifiers per Link and Prefix and | |
60 | Segment Routing enabled nodes, callback functions to process incoming LSAs and | |
61 | install MPLS FIB entries through Zebra. | |
fd3b19f2 OD |
62 | |
63 | The figure below shows the relation between the various files: | |
64 | ||
7743f2f8 OD |
65 | * ospf_sr.c centralizes all the Segment Routing processing. It receives Opaque |
66 | LSA Router Information (4.0.0.0) from ospf_ri.c and Extended Prefix | |
67 | (7.0.0.X) and Link (8.0.0.X) from ospf_ext.c. Once received, it parses TLVs and | |
68 | SubTLVs and stores the information in the SRDB (which is defined in ospf_sr.h). For | |
69 | each received LSA, the NHLFE is computed and sent to Zebra to add/remove | |
70 | MPLS label entries and FECs. New CLI configurations are also centralized in | |
71 | ospf_sr.c. This CLI triggers the flooding of new LSA Router Information | |
72 | (4.0.0.0), Extended Prefix (7.0.0.X) and Link (8.0.0.X) by ospf_ri.c and | |
73 | ospf_ext.c respectively. | |
74 | * ospf_ri.c sends received Router Information LSAs back to ospf_sr.c and updates the | |
56f0bea7 | 75 | Self Router Information LSA with parameters provided by ospf_sr.c i.e. SRGB |
7743f2f8 OD |
76 | and MSD. It uses ospf_opaque.c functions to send/receive these Opaque LSAs. |
77 | * ospf_ext.c sends received Extended Prefix and Link Opaque LSAs back to | |
78 | ospf_sr.c and sends self Extended Prefix and Link Opaque LSAs through ospf_opaque.c | |
79 | functions. | |
fd3b19f2 OD |
80 | |
81 | :: | |
cf9b9f77 OD |
82 | |
83 |                     +-----------+     +-------+ | |
84 |                     |           |     |       | | |
85 |                     | ospf_sr.c +-----+  SRDB | | |
86 |         +-----------+           +--+  |       | | |
87 |         |           +-^-------^-+  |  +-------+ | |
88 |         |             |   |   |    | | |
89 |         |             |   |   |    | | |
90 |         |             |   |   |    +--------+ | |
91 |         |             |   |   |             | | |
92 |     +---v----------+  |   |   |       +-----v-------+ | |
93 |     |              |  |   |   |       |             | | |
94 |     | ospf_ri.c    +--+   |   +-------+ ospf_ext.c  | | |
95 |     | LSA 4.0.0.0  |      |           | LSA 7.0.0.X | | |
96 |     |              |      |           | LSA 8.0.0.X | | |
97 |     +---^----------+      |           |             | | |
98 |         |                 |           +-----^-------+ | |
99 |         |                 |                 | | |
100 |        |                 |                 | | |
101 |        |        +--------v------------+    | | |
102 |        |        |                     |    | | |
103 |        |        | ZEBRA: Labels + FEC |    | | |
104 |        |        |                     |    | | |
105 |        |        +---------------------+    | | |
106 |        |                                   | | |
107 |        |                                   | | |
108 |        |         +---------------+         | | |
109 |        |         |               |         | | |
110 |        +---------> ospf_opaque.c <---------+ | |
111 |                  |               | | |
112 |                  +---------------+ | |
113 | ||
7743f2f8 OD |
114 | Figure 1: Overview of Segment Routing interaction |
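The LSA IDs shown in the figure (4.0.0.0, 7.0.0.X, 8.0.0.X) follow the Opaque LSA convention in which the first octet is the Opaque Type and the remaining 24 bits are the Opaque ID. The following Python sketch only illustrates how such an ID maps to the file that processes it; the function names `split_opaque_lsa_id` and `handler_for` are hypothetical and do not exist in the FRRouting sources.

```python
import ipaddress

# Opaque Type -> handling file, as described in the overview above.
HANDLERS = {
    4: "ospf_ri.c",   # Router Information
    7: "ospf_ext.c",  # Extended Prefix
    8: "ospf_ext.c",  # Extended Link
}


def split_opaque_lsa_id(lsa_id: str) -> tuple:
    """Split a dotted Opaque LSA ID into (opaque_type, opaque_id)."""
    value = int(ipaddress.IPv4Address(lsa_id))
    # High 8 bits: Opaque Type, low 24 bits: Opaque ID.
    return value >> 24, value & 0xFFFFFF


def handler_for(lsa_id: str) -> str:
    """Return the file that would process this Opaque LSA (illustrative)."""
    opaque_type, _ = split_opaque_lsa_id(lsa_id)
    return HANDLERS.get(opaque_type, "unhandled")
```

For instance, `handler_for("7.0.0.3")` selects `ospf_ext.c`, while an ID with any other Opaque Type is reported as unhandled by this sketch.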
115 | ||
116 | Module interactions | |
75ca3b11 | 117 | ^^^^^^^^^^^^^^^^^^^ |
7743f2f8 OD |
118 | |
119 | To process incoming LSAs, the code relies on the ability to call `hook()` | |
120 | functions when LSAs are inserted into or deleted from the LSDB and on the | |
121 | possibility to register a particular treatment for Opaque LSAs. The first point | |
122 | is provided by the OSPF API feature and the second by the Opaque implementation | |
123 | itself. Indeed, it is possible to register callback functions for a given Opaque | |
124 | LSA ID (see the `ospf_register_opaque_functab()` function defined in | |
125 | `ospf_opaque.c`). Each time a new LSA is added to the LSDB, the | |
126 | `new_lsa_hook()` function previously registered for this LSA type is called. For | |
127 | Opaque LSAs it is `ospf_opaque_lsa_install_hook()`. For deletion, it is | |
128 | `ospf_opaque_lsa_delete_hook()`. | |
129 | ||
130 | Note that an incoming LSA which is already present in the LSDB is inserted | |
131 | after the old instance of this LSA has been removed from the LSDB. Thus, after the first | |
132 | time, each incoming LSA triggers a `delete` followed by an `install`. This | |
56f0bea7 RK |
133 | is not very helpful for handling real LSA deletion. In fact, LSA deletion is done |
134 | by flushing the LSA i.e. flooding the LSA after setting its age to MAX_AGE. Then, a garbage | |
7743f2f8 OD |
135 | collection function removes all LSAs with `age == MAX_AGE` from the LSDB. So, |
136 | to handle an LSA flush, the best option is to look at the LSA age to determine whether it is | |
137 | an installation or a future deletion i.e. the flushed LSA is first stored in the | |
138 | LSDB with MAX_AGE, waiting for the garbage collector function. | |
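The age-based decision described above can be sketched as follows. This is a minimal Python illustration, not the C code of ospfd: the names `classify_lsa_event` and `on_lsa_installed` and the dictionary standing in for the SRDB are assumptions made for the example. The only external fact used is that the OSPF MaxAge constant is 3600 seconds.

```python
OSPF_MAX_AGE = 3600  # OSPF MaxAge, in seconds


def classify_lsa_event(lsa_age: int) -> str:
    """Tell a flush from a regular (re)installation by looking at the age."""
    return "delete" if lsa_age >= OSPF_MAX_AGE else "update"


def on_lsa_installed(srdb: dict, lsa_id: str, lsa_age: int, payload: str) -> None:
    """Hypothetical install hook: update or remove the entry for this LSA."""
    if classify_lsa_event(lsa_age) == "delete":
        # Flushed LSA: it sits in the LSDB with MAX_AGE until garbage
        # collection, so the Segment Routing state is removed right away.
        srdb.pop(lsa_id, None)
    else:
        srdb[lsa_id] = payload
```

With this approach, the `delete` followed by `install` sequence that accompanies every refresh is harmless: only an LSA whose age has reached MAX_AGE removes state.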
139 | ||
140 | Router Information LSAs | |
141 | ^^^^^^^^^^^^^^^^^^^^^^^ | |
142 | ||
143 | To activate Segment Routing, the new CLI command `segment-routing on` has been | |
144 | introduced. When this command is activated, the function | |
145 | `ospf_router_info_update_sr()` is called to indicate to the Router Information | |
146 | process that Segment Routing TLVs must be flooded. The same function is called to | |
147 | modify the Segment Routing Global Block (SRGB) and Maximum Stack Depth (MSD) | |
56f0bea7 | 148 | TLV. Only the Shortest Path First (SPF) algorithm is supported, so the code |
7743f2f8 OD |
149 | offers no way to modify this TLV. |
150 | ||
39e97e87 | 151 | When Opaque LSA Type 4 i.e. Router Information is stored in the LSDB, the function |
7743f2f8 OD |
152 | `ospf_opaque_lsa_install_hook()` calls the previously registered function |
153 | `ospf_router_info_lsa_update()`. In turn, this function simply triggers | |
154 | `ospf_sr_ri_lsa_update()` or `ospf_sr_ri_lsa_delete()` depending on the LSA | |
155 | age. Beforehand, it verifies that the LSA Opaque Type is 4 (Router Information). | |
156 | Self Opaque LSAs are not sent back to the Segment Routing functions as | |
157 | the information is already stored. | |
158 | ||
159 | Extended Link Prefix LSAs | |
160 | ^^^^^^^^^^^^^^^^^^^^^^^^^ | |
161 | ||
162 | As for Router Information, Segment Routing is activated at the Extended | |
56f0bea7 RK |
163 | Link/Prefix level with the new `segment-routing on` command. This automatically triggers |
164 | the flooding of Extended Link LSAs for all OSPF interfaces where the | |
7743f2f8 OD |
165 | adjacency is full. For Extended Prefix LSAs, the new CLI command |
166 | `segment-routing prefix ...` triggers the flooding of Prefix SID | |
167 | TLV/SubTLVs. | |
168 | ||
169 | When Opaque LSA Type 7 i.e. Extended Prefix and Type 8 i.e. Extended Link are | |
170 | stored in the LSDB, `ospf_ext_pref_update_lsa()` and | |
171 | `ospf_ext_link_update_lsa()` respectively are called, as for Router Information LSAs. In | |
172 | turn, they respectively trigger `ospf_sr_ext_prefix_lsa_update()` / | |
173 | `ospf_sr_ext_link_lsa_update()` or `ospf_sr_ext_prefix_lsa_delete()` / | |
174 | `ospf_sr_ext_link_lsa_delete()` if the LSA age is equal to MAX_AGE. | |
175 | ||
176 | Zebra | |
177 | ^^^^^ | |
178 | ||
179 | When a new MPLS entry or new Forwarding Equivalent Class (FEC) must be added or | |
180 | deleted in the data plane, `add_sid_nhlfe()` or `del_sid_nhlfe()` respectively is | |
181 | called. Once the validity of the labels has been checked, they are sent to the ZEBRA layer through | |
182 | the `ZEBRA_MPLS_LABELS_ADD` command, or the `ZEBRA_MPLS_LABELS_DELETE` | |
183 | command for deletion. This is completed by a new labelled route through the | |
184 | `ZEBRA_ROUTE_ADD` command, or the `ZEBRA_ROUTE_DELETE` command respectively. | |
fd3b19f2 | 185 | |
3e94c9a4 G |
186 | TI-LFA |
187 | ^^^^^^ | |
188 | ||
189 | Experimental support for Topology Independent LFA (Loop-Free Alternate), see | |
190 | for example 'draft-bashandy-rtgwg-segment-routing-ti-lfa-05'. The related | |
191 | files are `ospf_ti_lfa.c/h`. | |
192 | ||
193 | The current implementation is rather naive and does not support the advanced | |
194 | optimizations suggested in e.g. RFC7490 or RFC8102. It focuses on providing | |
195 | the essential infrastructure which can also later be used to enhance the | |
196 | algorithmic aspects. | |
197 | ||
198 | Supported features: | |
199 | ||
200 | * Link and node protection | |
201 | * Intra-area support | |
202 | * Proper use of Prefix- and Adjacency-SIDs in label stacks | |
203 | * Asymmetric weights (using reverse SPF) | |
204 | * Non-adjacent P/Q spaces | |
205 | * Protection of Prefix-SIDs | |
206 | ||
207 | If configured, the routing table is enriched with additional backup paths for | |
208 | every prefix on every SPF run. The corresponding Prefix-SIDs are updated with | |
209 | backup paths too within the OSPF SR update task. | |
210 | ||
211 | Informal High-Level Algorithm Description: | |
212 | ||
213 | :: | |
214 | ||
215 |    p_spaces = empty_list() | |
216 | ||
217 |    for every protected_resource (link or node): | |
218 |        p_space = generate_p_space(protected_resource) | |
219 |        p_space.q_spaces = empty_list() | |
220 | ||
221 |        for every destination that is affected by the protected_resource: | |
222 |            q_space = generate_q_space(destination) | |
223 | ||
224 |            # The label stack is stored in q_space | |
225 |            generate_label_stack(p_space, q_space) | |
226 | ||
227 |            # The p_space collects all its q_spaces | |
228 |            p_space.q_spaces.add(q_space) | |
229 | ||
230 |        p_spaces.add(p_space) | |
231 | ||
232 |    adjust_routing_table(p_spaces) | |
233 | ||
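To make the P space and Q space notions more concrete, here is a small Python sketch for the link protection case only. It uses the classic distance-based definitions (in the style of RFC7490) on an undirected graph with positive costs. It is not a transcription of `ospf_ti_lfa.c`, which computes Q spaces per destination and stores label stacks; the function names and the topology below are made up for illustration.

```python
import heapq


def dijkstra(graph: dict, source: str) -> dict:
    """Shortest distances from source; graph maps node -> {neighbor: cost}."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neigh, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neigh, float("inf")):
                dist[neigh] = nd
                heapq.heappush(queue, (nd, neigh))
    return dist


def p_space(graph: dict, s: str, e: str) -> set:
    """Nodes S reaches without any shortest path crossing link S-E."""
    from_s, from_e = dijkstra(graph, s), dijkstra(graph, e)
    link = graph[s][e]
    return {x for x in graph if from_s[x] < link + from_e[x]}


def q_space(graph: dict, s: str, e: str) -> set:
    """Nodes that reach E without any shortest path crossing link S-E."""
    from_s, from_e = dijkstra(graph, s), dijkstra(graph, e)
    link = graph[s][e]
    return {x for x in graph if from_e[x] < from_s[x] + link}


# Hypothetical five-node ring, all costs 1; the protected link is S-E.
RING = {
    "S": {"E": 1, "A": 1},
    "A": {"S": 1, "B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "E": 1},
    "E": {"C": 1, "S": 1},
}
```

In this ring, `p_space(RING, "S", "E")` and `q_space(RING, "S", "E")` overlap in node `B` only, so a repair path for the failure of link S-E could be steered through `B`.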
234 | Possible Performance Improvements: | |
235 | ||
236 | * Improve overall data structures, get away from linked lists for vertices | |
237 | * Don't calculate a Q space for every destination, but for a minimum set of | |
238 | backup paths that cover all destinations in the post-convergence SPF. The | |
239 | thinking here is that once a backup path is known, it is also a backup | |
240 | path for all nodes on the path itself. This can be done by using the | |
241 | leaves of a trimmed minimum spanning tree generated out of the | |
242 | post-convergence SPF tree for that particular P space. | |
243 | * For an alternative (maybe better) optimization look at | |
244 | https://tools.ietf.org/html/rfc7490#section-5.2.1.3 which describes using | |
245 | the Q space of the node which is affected by e.g. a link failure. Note that | |
246 | this optimization is topology dependent. | |
247 | ||
248 | It is highly recommended to read e.g. `Segment Routing I/II` by Filsfils to | |
249 | understand the basics of TI-LFA. | |
250 | ||
7743f2f8 OD |
251 | Configuration |
252 | ------------- | |
7726c479 | 253 | |
7743f2f8 | 254 | Linux Kernel |
75ca3b11 | 255 | ^^^^^^^^^^^^ |
7726c479 | 256 | |
7743f2f8 OD |
257 | In order to use OSPF Segment Routing, you must set up the MPLS data plane. Up to |
258 | now, only Linux Kernel version >= 4.5 is supported. | |
7726c479 | 259 | |
7743f2f8 OD |
260 | First, the MPLS modules aren't loaded by default, so you'll need to load them |
261 | yourself: | |
7726c479 | 262 | |
7743f2f8 | 263 | :: |
7726c479 | 264 | |
e3e3afff QY |
265 | modprobe mpls_router |
266 | modprobe mpls_gso | |
267 | modprobe mpls_iptunnel | |
7726c479 | 268 | |
7743f2f8 | 269 | Then, you must activate MPLS on the interfaces you want to use: |
7726c479 | 270 | |
7743f2f8 | 271 | :: |
7726c479 | 272 | |
e3e3afff QY |
273 | sysctl -w net.mpls.conf.enp0s9.input=1 |
274 | sysctl -w net.mpls.conf.lo.input=1 | |
275 | sysctl -w net.mpls.platform_labels=1048575 | |
7726c479 | 276 | |
7743f2f8 | 277 | The last line sets the maximum MPLS label value. |
7726c479 | 278 | |
7743f2f8 OD |
279 | Once OSPFd has started with Segment Routing, you can check that MPLS routes are |
280 | enabled with: | |
fd3b19f2 | 281 | |
7743f2f8 OD |
282 | :: |
283 | ||
e3e3afff QY |
284 | ip -M route |
285 | ip route | |
7743f2f8 OD |
286 | |
287 | The first command shows the MPLS LFIB table while the second shows the FIB | |
288 | table, which contains routes with MPLS label encapsulation. | |
289 | ||
290 | If you disable Penultimate Hop Popping with the `no-php-flag` (see below), you | |
291 | MUST check that RP filtering is not enabled for the interfaces you intend to use, | |
292 | especially the `lo` one. For that purpose, disable RP filtering with: | |
293 | ||
294 | :: | |
295 | ||
e3e3afff QY |
296 | sysctl -w net.ipv4.conf.all.rp_filter=0 |
297 | sysctl -w net.ipv4.conf.lo.rp_filter=0 | |
7743f2f8 OD |
298 | |
299 | OSPFd | |
75ca3b11 | 300 | ^^^^^ |
fd3b19f2 OD |
301 | |
302 | Here is a simple configuration example that enables Segment Routing. Note | |
7743f2f8 OD |
303 | that `opaque capability` and `router information` must be set to activate |
304 | Opaque LSA prior to Segment | |
fd3b19f2 OD |
305 | Routing. |
306 | ||
307 | :: | |
308 | ||
e3e3afff QY |
309 | router ospf |
310 | ospf router-id 192.168.1.11 | |
311 | capability opaque | |
e3e3afff | 312 | segment-routing on |
d2e02cbf | 313 | segment-routing global-block 10000 19999 local-block 5000 5999 |
e3e3afff QY |
314 | segment-routing node-msd 8 |
315 | segment-routing prefix 192.168.1.11/32 index 1100 | |
fd3b19f2 | 316 | |
6f751f14 OD |
317 | The first segment-routing statement enables it. The second one sets both the |
318 | SRGB and the SRLB, the third sets the MSD and the last one sets the | |
319 | Prefix SID index for a given prefix. | |
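A Prefix SID index is an offset into the SRGB: the MPLS label used for the prefix is the lower bound of the SRGB plus the index. The Python sketch below works this out for the values of the example configuration; the helper name `prefix_sid_label` is hypothetical and only serves the illustration.

```python
def prefix_sid_label(srgb_lower: int, srgb_upper: int, index: int) -> int:
    """Map a Prefix SID index to an MPLS label inside the given SRGB."""
    size = srgb_upper - srgb_lower + 1
    if not 0 <= index < size:
        raise ValueError(f"index {index} does not fit in an SRGB of size {size}")
    return srgb_lower + index


# Values taken from the example configuration above:
# global-block 10000 19999 and prefix index 1100.
label = prefix_sid_label(10000, 19999, 1100)
```

Here `label` is 11100. An index of 10000 or more would fall outside this SRGB and is rejected by the sketch.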
320 | ||
fd3b19f2 | 321 | Note that only prefixes of Loopback interfaces can be configured with a Prefix |
7743f2f8 | 322 | SID. It is possible to add `no-php-flag` at the end of the prefix command to |
56f0bea7 | 323 | disable Penultimate Hop Popping. This advertises to peers that they MUST NOT pop |
7743f2f8 | 324 | the MPLS label prior to sending the packet. |
cf9b9f77 OD |
325 | |
326 | Known limitations | |
327 | ----------------- | |
328 | ||
7743f2f8 OD |
329 | * Runs only within default VRF |
330 | * Only single Area is supported. ABR is not yet supported | |
331 | * Only SPF algorithm is supported | |
332 | * Extended Prefix Range is not supported | |
7743f2f8 OD |
333 | * With NO Penultimate Hop Popping, it is not possible to express a Segment |
334 | Path with an Adjacency SID due to the impossibility for the Linux Kernel to | |
335 | perform double POP instruction. | |
cf9b9f77 | 336 | |
fd3b19f2 OD |
337 | Credits |
338 | ------- | |
7743f2f8 OD |
339 | |
340 | * Author: Anselme Sawadogo <anselmesawadogo@gmail.com> | |
341 | * Author: Olivier Dugeon <olivier.dugeon@orange.com> | |
342 | * Copyright (C) 2016 - 2018 Orange Labs http://www.orange.com | |
fd3b19f2 OD |
343 | |
344 | This work has been performed in the framework of the H2020-ICT-2014 | |
345 | project 5GEx (Grant Agreement no. 671636), which is partially funded | |
346 | by the European Commission. | |
347 |