=================================
 Network Configuration Reference
=================================

Network configuration is critical for building a high-performance :term:`Ceph
Storage Cluster`. The Ceph Storage Cluster does not perform request routing or
dispatching on behalf of the :term:`Ceph Client`. Instead, Ceph Clients make
requests directly to Ceph OSD Daemons. Ceph OSD Daemons perform data replication
on behalf of Ceph Clients, which means replication and other factors impose
additional loads on Ceph Storage Cluster networks.

Our Quick Start configurations provide a trivial Ceph configuration file that
sets monitor IP addresses and daemon host names only. Unless you specify a
cluster network, Ceph assumes a single "public" network. Ceph functions just
fine with a public network only, but you may see significant performance
improvement with a second "cluster" network in a large cluster.

It is possible to run a Ceph Storage Cluster with two networks: a public
(client, front-side) network and a cluster (private, replication, back-side)
network. However, this approach complicates network configuration (both
hardware and software) and does not usually have a significant impact on
overall performance. For this reason, we recommend that, for resilience and
capacity, dual-NIC systems either bond these interfaces in active/active mode
or implement a layer 3 multipath strategy with, e.g., FRR.

If, despite the complexity, one still wishes to use two networks, each
:term:`Ceph Node` will need to have more than one network interface or VLAN.
See `Hardware Recommendations - Networks`_ for additional details.

.. ditaa::
                               +-------------+
                               | Ceph Client |
                               +----*--*-----+
                                    |  ^
                            Request |  : Response
                                    v  |
 /----------------------------------*--*-------------------------------------\
 |                              Public Network                               |
 \---*--*------------*--*-------------*--*------------*--*------------*--*---/
     ^  ^            ^  ^             ^  ^            ^  ^            ^  ^
     |  |            |  |             |  |            |  |            |  |
     |  :            |  :             |  :            |  :            |  :
     v  v            v  v             v  v            v  v            v  v
 +---*--*---+    +---*--*---+     +---*--*---+    +---*--*---+    +---*--*---+
 | Ceph MON |    | Ceph MDS |     | Ceph OSD |    | Ceph OSD |    | Ceph OSD |
 +----------+    +----------+     +---*--*---+    +---*--*---+    +---*--*---+
                                      ^  ^            ^  ^            ^  ^
     The cluster network relieves     |  |            |  |            |  |
     OSD replication and heartbeat    |  :            |  :            |  :
     traffic from the public network. v  v            v  v            v  v
 /------------------------------------*--*------------*--*------------*--*---\
 |   cCCC                      Cluster Network                               |
 \---------------------------------------------------------------------------/


IP Tables
=========

By default, daemons `bind`_ to ports within the ``6800:7300`` range. You may
configure this range at your discretion. Before configuring your IP tables,
check the default ``iptables`` configuration.

.. prompt:: bash $

   sudo iptables -L

Some Linux distributions include rules that reject all inbound requests
except SSH from all network interfaces. For example::

 REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited

You will need to delete these rules on both your public and cluster networks
initially, and replace them with appropriate rules when you are ready to
harden the ports on your Ceph Nodes.


Monitor IP Tables
-----------------

Ceph Monitors listen on ports ``3300`` and ``6789`` by default. Additionally,
Ceph Monitors always operate on the public network. When you add the rule
using the example below, make sure you replace ``{iface}`` with the public
network interface (e.g., ``eth0``, ``eth1``, etc.), ``{ip-address}`` with the
IP address of the public network and ``{netmask}`` with the netmask for the
public network. The example opens port ``6789``; repeat it with
``--dport 3300`` to open the other monitor port. For example:

.. prompt:: bash $

   sudo iptables -A INPUT -i {iface} -p tcp -s {ip-address}/{netmask} --dport 6789 -j ACCEPT


MDS and Manager IP Tables
-------------------------

A :term:`Ceph Metadata Server` or :term:`Ceph Manager` listens on the first
available port on the public network beginning at port 6800. Note that this
behavior is not deterministic, so if you are running more than one OSD or MDS
on the same host, or if you restart the daemons within a short window of time,
the daemons will bind to higher ports. You should open the entire 6800-7300
range by default. When you add the rule using the example below, make sure
you replace ``{iface}`` with the public network interface (e.g., ``eth0``,
``eth1``, etc.), ``{ip-address}`` with the IP address of the public network
and ``{netmask}`` with the netmask of the public network.

For example:

.. prompt:: bash $

   sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 6800:7300 -j ACCEPT


OSD IP Tables
-------------

By default, Ceph OSD Daemons `bind`_ to the first available ports on a Ceph Node
beginning at port 6800. Note that this behavior is not deterministic, so if you
are running more than one OSD or MDS on the same host, or if you restart the
daemons within a short window of time, the daemons will bind to higher ports.
Each Ceph OSD Daemon on a Ceph Node may use up to four ports:

#. One for talking to clients and monitors.
#. One for sending data to other OSDs.
#. Two for heartbeating on each interface.

.. ditaa::
              /---------------\
              |      OSD      |
              |           +---+----------------+-----------+
              |           | Clients & Monitors | Heartbeat |
              |           +---+----------------+-----------+
              |               |
              |           +---+----------------+-----------+
              |           | Data Replication   | Heartbeat |
              |           +---+----------------+-----------+
              | cCCC          |
              \---------------/

When a daemon fails and restarts without letting go of the port, the restarted
daemon will bind to a new port. You should open the entire 6800-7300 port range
to handle this possibility.

If you set up separate public and cluster networks, you must add rules for both
the public network and the cluster network, because clients will connect using
the public network and other Ceph OSD Daemons will connect using the cluster
network. When you add the rule using the example below, make sure you replace
``{iface}`` with the network interface (e.g., ``eth0``, ``eth1``, etc.),
``{ip-address}`` with the IP address and ``{netmask}`` with the netmask of the
public or cluster network. For example:

.. prompt:: bash $

   sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 6800:7300 -j ACCEPT

.. tip:: If you run Ceph Metadata Servers on the same Ceph Node as the
   Ceph OSD Daemons, you can consolidate the public network configuration step.


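The monitor, MDS/Manager, and OSD rules in this section share one template.
The following Python sketch formats that template for a few sample values so
that the resulting rules can be reviewed before anyone applies them. It is an
illustration only: the helper name ``build_rule``, the interfaces ``eth0`` and
``eth1``, and the subnets ``10.0.0.0/24`` and ``10.0.1.0/24`` are assumptions,
not values from a real deployment. It covers IPv4 only and never runs
``iptables`` itself.

.. code-block:: python

    import ipaddress


    def build_rule(iface, source_cidr, ports):
        """Format an iptables ACCEPT rule for TCP traffic from one IPv4 subnet.

        build_rule is a hypothetical helper; it only returns a string.
        """
        # strict=True rejects values with host bits set, such as 10.0.0.5/24.
        network = ipaddress.ip_network(source_cidr, strict=True)
        if network.version != 4:
            raise ValueError("this sketch covers iptables (IPv4) only")
        return (
            f"sudo iptables -A INPUT -i {iface} -m multiport -p tcp "
            f"-s {network} --dports {ports} -j ACCEPT"
        )


    # Monitors listen on ports 3300 and 6789 on the public network.
    mon_rule = build_rule("eth0", "10.0.0.0/24", "3300,6789")
    # OSD, MDS and Manager daemons bind within 6800:7300 by default.
    public_rule = build_rule("eth0", "10.0.0.0/24", "6800:7300")
    # With a separate cluster network, OSDs also need a rule on that interface.
    cluster_rule = build_rule("eth1", "10.0.1.0/24", "6800:7300")

    for rule in (mon_rule, public_rule, cluster_rule):
        print(rule)

Validating the subnet before formatting the rule catches a common mistake,
namely passing a host address with a netmask where a network address is
expected.
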
Ceph Networks
=============

To configure Ceph networks, you must add a network configuration to the
``[global]`` section of the configuration file. Our 5-minute Quick Start
provides a trivial Ceph configuration file that assumes one public network
with client and server on the same network and subnet. Ceph functions just fine
with a public network only. However, Ceph allows you to establish much more
specific criteria, including multiple IP networks and subnet masks for your
public network. You can also establish a separate cluster network to handle OSD
heartbeat, object replication and recovery traffic. Don't confuse the IP
addresses you set in your configuration with the public-facing IP addresses
network clients may use to access your service. Internal IP networks are
typically ``192.168.0.0`` or ``10.0.0.0``.

.. tip:: If you specify more than one IP address and subnet mask for
   either the public or the cluster network, the subnets within the network
   must be capable of routing to each other. Additionally, make sure you
   include each IP address/subnet in your IP tables and open ports for them
   as necessary.

.. note:: Ceph uses `CIDR`_ notation for subnets (e.g., ``10.0.0.0/24``).

When you have configured your networks, you may restart your cluster or restart
each daemon. Ceph daemons bind dynamically, so you do not have to restart the
entire cluster at once if you change your network configuration.


Public Network
--------------

To configure a public network, add the following option to the ``[global]``
section of your Ceph configuration file.

.. code-block:: ini

    [global]
        # ... elided configuration
        public_network = {public-network/netmask}

.. _cluster-network:

Cluster Network
---------------

If you declare a cluster network, OSDs will route heartbeat, object replication
and recovery traffic over the cluster network. This may improve performance
compared to using a single network. To configure a cluster network, add the
following option to the ``[global]`` section of your Ceph configuration file.

.. code-block:: ini

    [global]
        # ... elided configuration
        cluster_network = {cluster-network/netmask}

We prefer that the cluster network is **NOT** reachable from the public network
or the Internet for added security.

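Before restarting daemons, it can help to confirm that each host address
actually falls inside the subnets you declared. The Python sketch below does
this with the standard ``ipaddress`` module. The helper names and the sample
values (``10.0.0.0/24`` and ``192.168.10.0/24`` as public subnets,
``10.0.1.0/24`` as the cluster subnet) are assumptions chosen for illustration;
the sketch does not read your Ceph configuration.

.. code-block:: python

    import ipaddress


    def parse_networks(value):
        """Split a comma-separated list of CIDR subnets into network objects."""
        return [
            ipaddress.ip_network(item.strip())
            for item in value.split(",")
            if item.strip()
        ]


    def in_networks(address, value):
        """Return True if address belongs to any subnet listed in value."""
        addr = ipaddress.ip_address(address)
        return any(addr in net for net in parse_networks(value))


    # Hypothetical values for public_network and cluster_network.
    public_network = "10.0.0.0/24, 192.168.10.0/24"
    cluster_network = "10.0.1.0/24"

    print(in_networks("10.0.0.17", public_network))   # an address on the public side
    print(in_networks("10.0.1.17", cluster_network))  # an address on the cluster side
    print(in_networks("10.0.1.17", public_network))   # cluster address checked against public
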
IPv4/IPv6 Dual Stack Mode
-------------------------

If you want to run in IPv4/IPv6 dual stack mode and want to define your public
and/or cluster networks, then you need to specify both your IPv4 and IPv6
networks for each:

.. code-block:: ini

    [global]
        # ... elided configuration
        public_network = {IPv4 public-network/netmask}, {IPv6 public-network/netmask}

This is so that Ceph can find a valid IP address for both address families.

If you want just an IPv4 or an IPv6 stack environment, then make sure you set
the ``ms_bind_ipv4`` and ``ms_bind_ipv6`` options correctly.

.. note::
   Binding to IPv4 is enabled by default, so if you just add the option to bind
   to IPv6 you'll actually put yourself into dual stack mode. If you want just
   IPv6, then disable IPv4 and enable IPv6. See `Bind`_ below.

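A quick sanity check for a dual stack value is to confirm that it names at
least one subnet per address family. The sketch below is a minimal Python
illustration of that check. The helper names are hypothetical, and the sample
subnets are placeholders (``2001:db8::/64`` comes from the IPv6 documentation
prefix), not recommended values.

.. code-block:: python

    import ipaddress


    def address_families(value):
        """Return the set of IP versions (4 and/or 6) named in a subnet list."""
        return {
            ipaddress.ip_network(item.strip()).version
            for item in value.split(",")
            if item.strip()
        }


    def is_dual_stack(value):
        """True when the list names both an IPv4 and an IPv6 subnet."""
        return address_families(value) == {4, 6}


    # Hypothetical public_network values.
    print(is_dual_stack("10.0.0.0/24, 2001:db8::/64"))
    print(is_dual_stack("10.0.0.0/24"))
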
Ceph Daemons
============

Monitor daemons are each configured to bind to a specific IP address. These
addresses are normally configured by your deployment tool. Other components
in the Ceph cluster discover the monitors via the ``mon_host`` configuration
option, normally specified in the ``[global]`` section of the ``ceph.conf`` file.

.. code-block:: ini

    [global]
        mon_host = 10.0.0.2, 10.0.0.3, 10.0.0.4

The ``mon_host`` value can be a list of IP addresses or a name that is
looked up via DNS. In the case of a DNS name with multiple A or AAAA
records, all records are probed in order to discover a monitor. Once
one monitor is reached, all other current monitors are discovered, so
the ``mon_host`` configuration option only needs to be sufficiently up
to date such that a client can reach one monitor that is currently online.

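To see which monitor addresses a configuration file advertises, the
``mon_host`` list can be read with Python's standard ``configparser`` module.
The sketch below is an approximation rather than Ceph's own parser: the helper
name ``monitor_addresses`` is hypothetical, the option name is normalized so
that ``mon host`` and ``mon_host`` are treated alike, and DNS names are
returned as written without being resolved.

.. code-block:: python

    import configparser

    # The same fragment as in the example above.
    CONF = """
    [global]
    mon_host = 10.0.0.2, 10.0.0.3, 10.0.0.4
    """


    def monitor_addresses(conf_text):
        """Return the mon_host entries from the [global] section as a list."""
        parser = configparser.ConfigParser(interpolation=None)
        # Treat "mon host" and "mon_host" as the same option name.
        parser.optionxform = lambda name: name.strip().lower().replace(" ", "_")
        # Drop leading indentation so the inline sample parses cleanly.
        lines = [line.strip() for line in conf_text.splitlines()]
        parser.read_string("\n".join(lines))
        raw = parser.get("global", "mon_host")
        return [item.strip() for item in raw.split(",") if item.strip()]


    mons = monitor_addresses(CONF)
    print(mons)
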
The MGR, OSD, and MDS daemons will bind to any available address and
do not require any special configuration. However, it is possible to
specify a specific IP address for them to bind to with the ``public_addr``
(and/or, in the case of OSD daemons, the ``cluster_addr``)
configuration option. For example:

.. code-block:: ini

    [osd.0]
        public_addr = {host-public-ip-address}
        cluster_addr = {host-cluster-ip-address}

.. topic:: One NIC OSD in a Two Network Cluster

   Generally, we do not recommend deploying an OSD host with a single network
   interface in a cluster with two networks. However, you may accomplish this
   by forcing the OSD host to operate on the public network by adding a
   ``public_addr`` entry to the ``[osd.n]`` section of the Ceph configuration
   file, where ``n`` refers to the ID of the OSD with one network interface.
   Additionally, the public network and cluster network must be able to route
   traffic to each other, which we don't recommend for security reasons.


Network Config Settings
=======================

Network configuration settings are not required. Ceph assumes a public network
with all hosts operating on it unless you specifically configure a cluster
network.


Public Network
--------------

The public network configuration allows you to specifically define IP addresses
and subnets for the public network. You may specifically assign static IP
addresses or override ``public_network`` settings using the ``public_addr``
setting for a specific daemon.

.. confval:: public_network
.. confval:: public_addr

Cluster Network
---------------

The cluster network configuration allows you to declare a cluster network, and
specifically define IP addresses and subnets for the cluster network. You may
specifically assign static IP addresses or override ``cluster_network``
settings using the ``cluster_addr`` setting for specific OSD daemons.


.. confval:: cluster_network
.. confval:: cluster_addr

Bind
----

Bind settings set the default port ranges Ceph OSD and MDS daemons use. The
default range is ``6800:7300``. Ensure that your `IP Tables`_ configuration
allows you to use the configured port range.

You may also enable Ceph daemons to bind to IPv6 addresses instead of IPv4
addresses.

.. confval:: ms_bind_port_min
.. confval:: ms_bind_port_max
.. confval:: ms_bind_ipv4
.. confval:: ms_bind_ipv6
.. confval:: public_bind_addr

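When narrowing the bind range, it is worth checking that the range still
leaves enough ports for the daemons on a host. The Python sketch below is
back-of-the-envelope arithmetic only. It assumes that both ends of the range
are inclusive and uses the figure of up to four ports per OSD given under
`OSD IP Tables`_; the helper names are hypothetical. MDS and Manager daemons
draw from the same range, so the practical limit is lower than the number it
prints.

.. code-block:: python

    PORTS_PER_OSD = 4  # up to four ports per OSD, as described under OSD IP Tables


    def ports_in_range(port_min, port_max):
        """Count the ports in a bind range, assuming both ends are inclusive."""
        if port_min > port_max:
            raise ValueError("port_min must not exceed port_max")
        return port_max - port_min + 1


    def max_osds(port_min, port_max):
        """Upper bound on OSDs per host if only OSDs used the range."""
        return ports_in_range(port_min, port_max) // PORTS_PER_OSD


    # The default range, 6800:7300.
    print(ports_in_range(6800, 7300), max_osds(6800, 7300))
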
TCP
---

Ceph disables TCP buffering by default.

.. confval:: ms_tcp_nodelay
.. confval:: ms_tcp_rcvbuf

General Settings
----------------

.. confval:: ms_type
.. confval:: ms_async_op_threads
.. confval:: ms_initial_backoff
.. confval:: ms_max_backoff
.. confval:: ms_die_on_bad_msg
.. confval:: ms_dispatch_throttle_bytes
.. confval:: ms_inject_socket_failures


.. _Scalability and High Availability: ../../../architecture#scalability-and-high-availability
.. _Hardware Recommendations - Networks: ../../../start/hardware-recommendations#networks
.. _hardware recommendations: ../../../start/hardware-recommendations
.. _Monitor / OSD Interaction: ../mon-osd-interaction
.. _Message Signatures: ../auth-config-ref#signatures
.. _CIDR: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing
.. _Nagle's Algorithm: https://en.wikipedia.org/wiki/Nagle's_algorithm