]>
Commit | Line | Data |
---|---|---|
9f95a23c TL |
1 | .. SPDX-License-Identifier: BSD-3-Clause |
2 | Copyright(c) 2010-2014 Intel Corporation. | |
7c673cae FG |
3 | |
4 | .. _LPM_Library: | |
5 | ||
6 | LPM Library | |
7 | =========== | |
8 | ||
9 | The DPDK LPM library component implements the Longest Prefix Match (LPM) table search method for 32-bit keys | |
10 | that is typically used to find the best route match in IP forwarding applications. | |
11 | ||
12 | LPM API Overview | |
13 | ---------------- | |
14 | ||
15 | The main configuration parameter for LPM component instances is the maximum number of rules to support. | |
16 | An LPM prefix is represented by a pair of parameters (32- bit key, depth), with depth in the range of 1 to 32. | |
17 | An LPM rule is represented by an LPM prefix and some user data associated with the prefix. | |
18 | The prefix serves as the unique identifier of the LPM rule. | |
19 | In this implementation, the user data is 1-byte long and is called next hop, | |
20 | in correlation with its main use of storing the ID of the next hop in a routing table entry. | |
21 | ||
22 | The main methods exported by the LPM component are: | |
23 | ||
24 | * Add LPM rule: The LPM rule is provided as input. | |
25 | If there is no rule with the same prefix present in the table, then the new rule is added to the LPM table. | |
26 | If a rule with the same prefix is already present in the table, the next hop of the rule is updated. | |
27 | An error is returned when there is no available rule space left. | |
28 | ||
29 | * Delete LPM rule: The prefix of the LPM rule is provided as input. | |
30 | If a rule with the specified prefix is present in the LPM table, then it is removed. | |
31 | ||
32 | * Lookup LPM key: The 32-bit key is provided as input. | |
33 | The algorithm selects the rule that represents the best match for the given key and returns the next hop of that rule. | |
34 | In the case that there are multiple rules present in the LPM table that have the same 32-bit key, | |
35 | the algorithm picks the rule with the highest depth as the best match rule, | |
36 | which means that the rule has the highest number of most significant bits matching between the input key and the rule key. | |
37 | ||
38 | .. _lpm4_details: | |
39 | ||
40 | Implementation Details | |
41 | ---------------------- | |
42 | ||
43 | The current implementation uses a variation of the DIR-24-8 algorithm that trades memory usage for improved LPM lookup speed. | |
44 | The algorithm allows the lookup operation to be performed with typically a single memory read access. | |
45 | In the statistically rare case when the best match rule is having a depth bigger than 24, | |
46 | the lookup operation requires two memory read accesses. | |
47 | Therefore, the performance of the LPM lookup operation is greatly influenced by | |
48 | whether the specific memory location is present in the processor cache or not. | |
49 | ||
50 | The main data structure is built using the following elements: | |
51 | ||
52 | * A table with 2^24 entries. | |
53 | ||
54 | * A number of tables (RTE_LPM_TBL8_NUM_GROUPS) with 2^8 entries. | |
55 | ||
56 | The first table, called tbl24, is indexed using the first 24 bits of the IP address to be looked up, | |
57 | while the second table(s), called tbl8, is indexed using the last 8 bits of the IP address. | |
58 | This means that depending on the outcome of trying to match the IP address of an incoming packet to the rule stored in the tbl24 | |
59 | we might need to continue the lookup process in the second level. | |
60 | ||
61 | Since every entry of the tbl24 can potentially point to a tbl8, ideally, we would have 2^24 tbl8s, | |
62 | which would be the same as having a single table with 2^32 entries. | |
63 | This is not feasible due to resource restrictions. | |
64 | Instead, this approach takes advantage of the fact that rules longer than 24 bits are very rare. | |
65 | By splitting the process in two different tables/levels and limiting the number of tbl8s, | |
66 | we can greatly reduce memory consumption while maintaining a very good lookup speed (one memory access, most of the times). | |
67 | ||
68 | ||
69 | .. figure:: img/tbl24_tbl8.* | |
70 | ||
71 | Table split into different levels | |
72 | ||
73 | ||
74 | An entry in tbl24 contains the following fields: | |
75 | ||
76 | * next hop / index to the tbl8 | |
77 | ||
78 | * valid flag | |
79 | ||
80 | * external entry flag | |
81 | ||
82 | * depth of the rule (length) | |
83 | ||
84 | The first field can either contain a number indicating the tbl8 in which the lookup process should continue | |
85 | or the next hop itself if the longest prefix match has already been found. | |
86 | The two flags are used to determine whether the entry is valid or not and | |
87 | whether the search process have finished or not respectively. | |
88 | The depth or length of the rule is the number of bits of the rule that is stored in a specific entry. | |
89 | ||
90 | An entry in a tbl8 contains the following fields: | |
91 | ||
92 | * next hop | |
93 | ||
94 | * valid | |
95 | ||
96 | * valid group | |
97 | ||
98 | * depth | |
99 | ||
100 | Next hop and depth contain the same information as in the tbl24. | |
101 | The two flags show whether the entry and the table are valid respectively. | |
102 | ||
103 | The other main data structure is a table containing the main information about the rules (IP and next hop). | |
104 | This is a higher level table, used for different things: | |
105 | ||
106 | * Check whether a rule already exists or not, prior to addition or deletion, | |
107 | without having to actually perform a lookup. | |
108 | ||
109 | * When deleting, to check whether there is a rule containing the one that is to be deleted. | |
110 | This is important, since the main data structure will have to be updated accordingly. | |
111 | ||
112 | Addition | |
113 | ~~~~~~~~ | |
114 | ||
115 | When adding a rule, there are different possibilities. | |
116 | If the rule's depth is exactly 24 bits, then: | |
117 | ||
118 | * Use the rule (IP address) as an index to the tbl24. | |
119 | ||
120 | * If the entry is invalid (i.e. it doesn't already contain a rule) then set its next hop to its value, | |
121 | the valid flag to 1 (meaning this entry is in use), | |
122 | and the external entry flag to 0 | |
123 | (meaning the lookup process ends at this point, since this is the longest prefix that matches). | |
124 | ||
125 | If the rule's depth is exactly 32 bits, then: | |
126 | ||
127 | * Use the first 24 bits of the rule as an index to the tbl24. | |
128 | ||
129 | * If the entry is invalid (i.e. it doesn't already contain a rule) then look for a free tbl8, | |
130 | set the index to the tbl8 to this value, | |
131 | the valid flag to 1 (meaning this entry is in use), and the external entry flag to 1 | |
132 | (meaning the lookup process must continue since the rule hasn't been explored completely). | |
133 | ||
134 | If the rule's depth is any other value, prefix expansion must be performed. | |
135 | This means the rule is copied to all the entries (as long as they are not in use) which would also cause a match. | |
136 | ||
137 | As a simple example, let's assume the depth is 20 bits. | |
138 | This means that there are 2^(24 - 20) = 16 different combinations of the first 24 bits of an IP address that | |
139 | would cause a match. | |
140 | Hence, in this case, we copy the exact same entry to every position indexed by one of these combinations. | |
141 | ||
142 | By doing this we ensure that during the lookup process, if a rule matching the IP address exists, | |
143 | it is found in either one or two memory accesses, | |
144 | depending on whether we need to move to the next table or not. | |
145 | Prefix expansion is one of the keys of this algorithm, | |
146 | since it improves the speed dramatically by adding redundancy. | |
147 | ||
148 | Lookup | |
149 | ~~~~~~ | |
150 | ||
151 | The lookup process is much simpler and quicker. In this case: | |
152 | ||
153 | * Use the first 24 bits of the IP address as an index to the tbl24. | |
154 | If the entry is not in use, then it means we don't have a rule matching this IP. | |
155 | If it is valid and the external entry flag is set to 0, then the next hop is returned. | |
156 | ||
157 | * If it is valid and the external entry flag is set to 1, | |
158 | then we use the tbl8 index to find out the tbl8 to be checked, | |
159 | and the last 8 bits of the IP address as an index to this table. | |
160 | Similarly, if the entry is not in use, then we don't have a rule matching this IP address. | |
161 | If it is valid then the next hop is returned. | |
162 | ||
163 | Limitations in the Number of Rules | |
164 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
165 | ||
166 | There are different things that limit the number of rules that can be added. | |
167 | The first one is the maximum number of rules, which is a parameter passed through the API. | |
168 | Once this number is reached, | |
169 | it is not possible to add any more rules to the routing table unless one or more are removed. | |
170 | ||
171 | The second reason is an intrinsic limitation of the algorithm. | |
172 | As explained before, to avoid high memory consumption, the number of tbl8s is limited in compilation time | |
173 | (this value is by default 256). | |
174 | If we exhaust tbl8s, we won't be able to add any more rules. | |
175 | How many of them are necessary for a specific routing table is hard to determine in advance. | |
176 | ||
177 | A tbl8 is consumed whenever we have a new rule with depth bigger than 24, | |
178 | and the first 24 bits of this rule are not the same as the first 24 bits of a rule previously added. | |
179 | If they are, then the new rule will share the same tbl8 than the previous one, | |
180 | since the only difference between the two rules is within the last byte. | |
181 | ||
182 | With the default value of 256, we can have up to 256 rules longer than 24 bits that differ on their first three bytes. | |
183 | Since routes longer than 24 bits are unlikely, this shouldn't be a problem in most setups. | |
184 | Even if it is, however, the number of tbl8s can be modified. | |
185 | ||
186 | Use Case: IPv4 Forwarding | |
187 | ~~~~~~~~~~~~~~~~~~~~~~~~~ | |
188 | ||
189 | The LPM algorithm is used to implement Classless Inter-Domain Routing (CIDR) strategy used by routers implementing IPv4 forwarding. | |
190 | ||
191 | References | |
192 | ~~~~~~~~~~ | |
193 | ||
194 | * RFC1519 Classless Inter-Domain Routing (CIDR): an Address Assignment and Aggregation Strategy, | |
195 | `http://www.ietf.org/rfc/rfc1519 <http://www.ietf.org/rfc/rfc1519>`_ | |
196 | ||
197 | * Pankaj Gupta, Algorithms for Routing Lookups and Packet Classification, PhD Thesis, Stanford University, | |
9f95a23c | 198 | 2000 (`http://klamath.stanford.edu/~pankaj/thesis/thesis_1sided.pdf <http://klamath.stanford.edu/~pankaj/thesis/thesis_1sided.pdf>`_ ) |