Commit | Line | Data |
---|---|---|
e5e68c89 SF |
1 | .. |
2 | Licensed under the Apache License, Version 2.0 (the "License"); you may | |
3 | not use this file except in compliance with the License. You may obtain | |
4 | a copy of the License at | |
5 | ||
6 | http://www.apache.org/licenses/LICENSE-2.0 | |
7 | ||
8 | Unless required by applicable law or agreed to in writing, software | |
9 | distributed under the License is distributed on an "AS IS" BASIS, WITHOUT | |
10 | WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the | |
11 | License for the specific language governing permissions and limitations | |
12 | under the License. | |
13 | ||
14 | Convention for heading levels in Open vSwitch documentation: | |
15 | ||
16 | ======= Heading 0 (reserved for the title in a document) | |
17 | ------- Heading 1 | |
18 | ~~~~~~~ Heading 2 | |
19 | +++++++ Heading 3 | |
20 | ''''''' Heading 4 | |
21 | ||
22 | Avoid deeper levels because they do not render well. | |
23 | ||
24 | ============== | |
25 | OVN To-do List | |
26 | ============== | |
27 | ||
28 | * Work out database for clustering or HA properly. | |
29 | ||
30 | * Compromised chassis mitigation. | |
31 | ||
32 | Possibly depends on database solution. | |
33 | ||
34 | Latest discussion: | |
35 | ||
36 | http://openvswitch.org/pipermail/dev/2016-August/078106.html | |
37 | ||
38 | * Get incremental updates in ovn-controller and ovn-northd in some | |
39 | sensible way. | |
40 | ||
41 | * Testing improvements, possibly heavily based on ovn-trace. | |
42 | ||
43 | Justin Pettit: "I'm planning to write some ovn-trace tests for IPv6. | |
44 | Hopefully we can get those into 2.6." | |
45 | ||
46 | * Self-managing HA for ovn-northd (avoiding the need to set up | |
47 | independent tooling for fail-over). | |
48 | ||
49 | Russell Bryant: "For bonus points, increasing N would scale out ovn-northd if | |
50 | it was under too much load, but that's a secondary concern." | |
51 | ||
52 | * Live migration. | |
53 | ||
54 | Russell Bryant: "When you're ready to have the destination take over, you | |
55 | have to remove the iface-id from the source and add it at the destination and | |
56 | I think it'd typically be configured on both ends, since it's a clone of the | |
57 | source VM (and its config)." | |
58 | ||
59 | * VLAN trunk ports. | |
60 | ||
61 | Russell Bryant: "Today that would require creating 4096 ports for the VM and | |
62 | attaching them to 4096 OVN networks, so doable, but not quite ideal." | |
63 | ||
64 | * Native DNS support | |
65 | ||
66 | Russell Bryant: "This is an OpenStack requirement to fully eliminate the DHCP | |
67 | agent." | |
68 | ||
69 | * Service function chaining. | |
70 | ||
71 | * MAC learning. | |
72 | ||
73 | Han Zhou: "To support VMs that host workloads with their own MACs, e.g. | |
74 | containers, if not using OVN native container support." | |
75 | ||
76 | * Finish up ARP/ND support: re-checking bindings, expiring bindings. | |
77 | ||
78 | * Hitless upgrade, especially for data plane. | |
79 | ||
80 | * Use OpenFlow "bundles" for transactional data plane updates. | |
81 | ||
82 | * L3 support | |
83 | ||
84 | * Logical routers should send RST replies to TCP packets. | |
85 | ||
86 | * IPv6 router ports should periodically send ND Router Advertisements. | |
87 | ||
88 | * Dynamic IP to MAC binding enhancements. | |
89 | ||
90 | OVN has basic support for establishing IP to MAC bindings dynamically, using | |
91 | ARP. | |
92 | ||
93 | * Rate limiting. | |
94 | ||
95 | From casual observation, Linux appears to generate at most one ARP per | |
96 | second per destination. | |
97 | ||
98 | This might be supported by adding a new OVN logical action for | |
99 | rate-limiting. | |
100 | ||
101 | * Tracking queries | |
102 | ||
103 | It's probably best to only record in the database responses to queries | |
104 | actually issued by an L3 logical router, so somehow they have to be | |
105 | tracked, probably by putting a tentative binding without a MAC address | |
106 | into the database. | |
107 | ||
108 | * Renewal and expiration. | |
109 | ||
110 | Something needs to make sure that bindings remain valid and expire those | |
111 | that become stale. | |
112 | ||
113 | One way to do this might be to add some support for time to the database | |
114 | server itself. | |
115 | ||
116 | * Table size limiting. | |
117 | ||
118 | The table of MAC bindings must not be allowed to grow unreasonably large. | |
119 | ||
120 | * MTU handling (fragmentation on output) | |
121 | ||
122 | * Security | |
123 | ||
124 | * Limiting the impact of a compromised chassis. | |
125 | ||
126 | Every instance of ovn-controller has the same full access to the central | |
127 | OVN_Southbound database. This means that a compromised chassis can | |
128 | interfere with the normal operation of the rest of the deployment. Some | |
129 | specific examples include writing to the logical flow table to alter | |
130 | traffic handling or updating the port binding table to claim ports that are | |
131 | actually present on a different chassis. In practice, the compromised host | |
132 | would be fighting against ovn-northd and other instances of ovn-controller | |
133 | that would be trying to restore the correct state. The impact could | |
134 | include at least temporarily redirecting traffic (so the compromised host | |
135 | could receive traffic that it shouldn't) and potentially a more general | |
136 | denial of service. | |
137 | ||
138 | There are different potential improvements to this area. The first would | |
139 | be to add some sort of ACL scheme to ovsdb-server. A proposal for this | |
140 | should first include an ACL scheme for ovn-controller. An example policy | |
141 | would be to make Logical_Flow read-only. Table-level control is needed, | |
142 | but is not enough. For example, ovn-controller must be able to update the | |
143 | Chassis and Encap tables, but should only be able to modify the rows | |
144 | associated with that chassis and no others. | |
145 | ||
146 | A more complex example is the Port_Binding table. Currently, | |
147 | ovn-controller is the source of truth of where a port is located. There | |
148 | seems to be no policy that can prevent malicious behavior of a compromised | |
149 | host with this table. | |
150 | ||
151 | An alternative scheme for port bindings would be to provide an optional | |
152 | mode where an external entity controls port bindings and make them | |
153 | read-only to ovn-controller. This is actually how OpenStack works today, | |
154 | for example. The part of OpenStack that manages VMs (Nova) tells the | |
155 | networking component (Neutron) where a port will be located, as opposed to | |
156 | the networking component discovering it. | |
157 | ||
158 | * ovsdb-server | |
159 | ||
160 | ovsdb-server should have adequate features for OVN but it probably needs work | |
161 | for scale and possibly for availability as deployments grow. Here are some | |
162 | thoughts. | |
163 | ||
164 | * Multithreading. | |
165 | ||
166 | If it turns out that other changes don't let ovsdb-server scale | |
167 | adequately, we can multithread ovsdb-server. Initially one might | |
168 | only break protocol handling into separate threads, leaving the | |
169 | actual database work serialized through a lock. | |
170 | ||
171 | * Increasing availability. | |
172 | ||
173 | Database availability might become an issue. The OVN system shouldn't | |
174 | grind to a halt if the database becomes unavailable, but it would become | |
175 | impossible to bring VIFs up or down, etc. | |
176 | ||
177 | My current thought on how to increase availability is to add clustering to | |
178 | ovsdb-server, probably via the Raft consensus algorithm. As an experiment, | |
179 | I wrote an implementation of Raft for Open vSwitch that you can clone from: | |
180 | ||
181 | https://github.com/blp/ovs-reviews.git raft | |
182 | ||
183 | * Reducing startup time. | |
184 | ||
185 | As-is, if ovsdb-server restarts, every client will fetch a fresh copy of | |
186 | the part of the database that it cares about. With hundreds of clients, | |
187 | this could cause heavy CPU load on ovsdb-server and use excessive network | |
188 | bandwidth. It would be better to allow incremental updates even across | |
189 | connection loss. One way might be to use "Difference Digests" as described | |
190 | in Epstein et al., "What's the Difference? Efficient Set Reconciliation | |
191 | Without Prior Context". (I'm not yet aware of previous non-academic use of | |
192 | this technique.) | |
193 | ||
194 | * Support multiple tunnel encapsulations in Chassis. | |
195 | ||
196 | So far, both ovn-controller and ovn-controller-vtep allow a chassis to | |
197 | have only one tunnel encapsulation entry. We should extend the implementation | |
198 | to support multiple tunnel encapsulations. | |
199 | ||
200 | * Update learned MAC addresses from VTEP to OVN | |
201 | ||
202 | The VTEP gateway stores all MAC addresses learned from its physical | |
203 | interfaces in the 'Ucast_Macs_Local' and the 'Mcast_Macs_Local' tables. | |
204 | ovn-controller-vtep should be able to write that information back to the | |
205 | ovn-sb database, so that other chassis know where to send packets destined | |
206 | to the extended external network instead of broadcasting. | |
207 | ||
208 | * Translate ovn-sb Multicast_Group table into VTEP config | |
209 | ||
210 | The ovn-controller-vtep daemon should be able to translate the | |
211 | Multicast_Group table entries in the ovn-sb database into Mcast_Macs_Remote | |
212 | table configuration in the VTEP database. | |
213 | ||
214 | * Consider the use of BFD as a tunnel monitor. | |
215 | ||
216 | The use of BFD for hypervisor-to-hypervisor tunnels is probably not worth it, | |
217 | since there's no alternative to switch to if a tunnel goes down. It could | |
218 | make sense at a slow rate if someone does OVN monitoring system integration, | |
219 | but not otherwise. | |
220 | ||
221 | When OVN gets to supporting HA for gateways (see ovn/OVN-GW-HA.rst), BFD is | |
222 | likely needed as a part of that solution. | |
223 | ||
224 | There's more commentary in this ML post: | |
225 | http://openvswitch.org/pipermail/dev/2015-November/062385.html | |
226 | ||
227 | * ACL | |
228 | ||
229 | * Support FTP ALGs. | |
230 | ||
231 | * Support reject action. | |
232 | ||
233 | * Support log option. |
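
The rate-limiting item under the dynamic IP to MAC binding enhancements notes that Linux appears to generate at most one ARP per second per destination. The following is a minimal Python sketch of that policy, for illustration only. It is not OVN code: the class name `ArpRateLimiter`, its `interval` and `clock` parameters, and the `allow` method are all hypothetical, and the list itself suggests that OVN might instead implement this as a new logical action.

```python
import time


class ArpRateLimiter:
    """Hypothetical helper: permit at most one ARP request per
    destination IP per `interval` seconds."""

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval      # minimum spacing between requests, seconds
        self.clock = clock            # injectable clock, which makes testing easy
        self.last_sent = {}           # destination IP -> time of last permitted request

    def allow(self, dst_ip):
        """Return True if an ARP for dst_ip may be sent now."""
        now = self.clock()
        last = self.last_sent.get(dst_ip)
        if last is not None and now - last < self.interval:
            return False              # too soon since the last request for this IP
        self.last_sent[dst_ip] = now  # record only requests that were permitted
        return True


if __name__ == "__main__":
    limiter = ArpRateLimiter()
    print(limiter.allow("192.0.2.1"))  # first request for this destination
    print(limiter.allow("192.0.2.1"))  # immediate repeat for the same destination
```

Note that the `last_sent` map in this sketch grows without bound. A real design would also need the expiration and table size limiting described in the same part of the list.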