=================================
 Network Configuration Reference
=================================

Network configuration is critical for building a high performance :term:`Ceph
Storage Cluster`. The Ceph Storage Cluster does not perform request routing or
dispatching on behalf of the :term:`Ceph Client`. Instead, Ceph Clients make
requests directly to Ceph OSD Daemons. Ceph OSD Daemons perform data replication
on behalf of Ceph Clients, which means replication and other factors impose
additional loads on Ceph Storage Cluster networks.

Our Quick Start configurations provide a trivial Ceph configuration file that
sets monitor IP addresses and daemon host names only. Unless you specify a
cluster network, Ceph assumes a single "public" network. Ceph functions just
fine with a public network only, but you may see significant performance
improvement with a second "cluster" network in a large cluster.

It is possible to run a Ceph Storage Cluster with two networks: a public
(client, front-side) network and a cluster (private, replication, back-side)
network. However, this approach complicates network configuration (both
hardware and software) and does not usually have a significant impact on
overall performance. For this reason, we recommend that, for resilience and
capacity, dual-NIC systems either bond these interfaces active/active or
implement a layer 3 multipath strategy with, e.g., FRR.

If, despite the complexity, one still wishes to use two networks, each
:term:`Ceph Node` will need to have more than one network interface or VLAN.
See `Hardware Recommendations - Networks`_ for additional details.

.. ditaa::
                                  +-------------+
                                  | Ceph Client |
                                  +----*--*-----+
                                       |  ^
                               Request |  : Response
                                       v  |
    /----------------------------------*--*-------------------------------------\
    |                              Public Network                               |
    \---*--*------------*--*-------------*--*------------*--*------------*--*---/
        ^  ^            ^  ^             ^  ^            ^  ^            ^  ^
        |  |            |  |             |  |            |  |            |  |
        |  :            |  :             |  :            |  :            |  :
        v  v            v  v             v  v            v  v            v  v
    +---*--*---+    +---*--*---+     +---*--*---+    +---*--*---+    +---*--*---+
    | Ceph MON |    | Ceph MDS |     | Ceph OSD |    | Ceph OSD |    | Ceph OSD |
    +----------+    +----------+     +---*--*---+    +---*--*---+    +---*--*---+
                                         ^  ^            ^  ^            ^  ^
        The cluster network relieves     |  |            |  |            |  |
        OSD replication and heartbeat    |  :            |  :            |  :
        traffic from the public network. v  v            v  v            v  v
    /------------------------------------*--*------------*--*------------*--*---\
    |   cCCC                      Cluster Network                               |
    \---------------------------------------------------------------------------/


IP Tables
=========

By default, daemons `bind`_ to ports within the ``6800:7300`` range. You may
configure this range at your discretion. Before configuring your IP tables,
check the default ``iptables`` configuration::

    sudo iptables -L

Some Linux distributions include rules that reject all inbound requests
except SSH from all network interfaces. For example::

    REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited

You will need to delete these rules on both your public and cluster networks
initially, and replace them with appropriate rules when you are ready to
harden the ports on your Ceph Nodes.


Monitor IP Tables
-----------------

Ceph Monitors listen on ports ``3300`` and ``6789`` by
default. Additionally, Ceph Monitors always operate on the public
network. When you add the rule using the example below, make sure you
replace ``{iface}`` with the public network interface (e.g., ``eth0``,
``eth1``, etc.), ``{ip-address}`` with the IP address of the public
network and ``{netmask}`` with the netmask for the public network. ::

    sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 3300,6789 -j ACCEPT


MDS and Manager IP Tables
-------------------------

A :term:`Ceph Metadata Server` or :term:`Ceph Manager` listens on the first
available port on the public network beginning at port 6800. Note that this
behavior is not deterministic, so if you are running more than one OSD or MDS
on the same host, or if you restart the daemons within a short window of time,
the daemons will bind to higher ports. You should open the entire 6800-7300
range by default. When you add the rule using the example below, make sure
you replace ``{iface}`` with the public network interface (e.g., ``eth0``,
``eth1``, etc.), ``{ip-address}`` with the IP address of the public network
and ``{netmask}`` with the netmask of the public network.

For example::

    sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 6800:7300 -j ACCEPT


OSD IP Tables
-------------

By default, Ceph OSD Daemons `bind`_ to the first available ports on a Ceph Node
beginning at port 6800. Note that this behavior is not deterministic, so if you
are running more than one OSD or MDS on the same host, or if you restart the
daemons within a short window of time, the daemons will bind to higher ports.
Each Ceph OSD Daemon on a Ceph Node may use up to four ports:

#. One for talking to clients and monitors.
#. One for sending data to other OSDs.
#. Two for heartbeating on each interface.

.. ditaa::
    /---------------\
    |      OSD      |
    |           +---+----------------+-----------+
    |           | Clients & Monitors | Heartbeat |
    |           +---+----------------+-----------+
    |               |
    |           +---+----------------+-----------+
    |           | Data Replication   | Heartbeat |
    |           +---+----------------+-----------+
    | cCCC          |
    \---------------/

When a daemon fails and restarts without letting go of the port, the restarted
daemon will bind to a new port. You should open the entire 6800-7300 port range
to handle this possibility.

If you set up separate public and cluster networks, you must add rules for both
the public network and the cluster network, because clients will connect using
the public network and other Ceph OSD Daemons will connect using the cluster
network. When you add the rule using the example below, make sure you replace
``{iface}`` with the network interface (e.g., ``eth0``, ``eth1``, etc.),
``{ip-address}`` with the IP address and ``{netmask}`` with the netmask of the
public or cluster network. For example::

    sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 6800:7300 -j ACCEPT

.. tip:: If you run Ceph Metadata Servers on the same Ceph Node as the
   Ceph OSD Daemons, you can consolidate the public network configuration step.


Ceph Networks
=============

To configure Ceph networks, you must add a network configuration to the
``[global]`` section of the configuration file. Our 5-minute Quick Start
provides a trivial Ceph configuration file that assumes one public network
with client and server on the same network and subnet. Ceph functions just fine
with a public network only. However, Ceph allows you to establish much more
specific criteria, including multiple IP networks and subnet masks for your
public network. You can also establish a separate cluster network to handle OSD
heartbeat, object replication and recovery traffic. Don't confuse the IP
addresses you set in your configuration with the public-facing IP addresses
network clients may use to access your service. Internal IP networks are
often ``192.168.0.0`` or ``10.0.0.0``.

.. tip:: If you specify more than one IP address and subnet mask for
   either the public or the cluster network, the subnets within the network
   must be capable of routing to each other. Additionally, make sure you
   include each IP address/subnet in your IP tables and open ports for them
   as necessary.

.. note:: Ceph uses `CIDR`_ notation for subnets (e.g., ``10.0.0.0/24``).

When you have configured your networks, you may restart your cluster or restart
each daemon. Ceph daemons bind dynamically, so you do not have to restart the
entire cluster at once if you change your network configuration.


Public Network
--------------

To configure a public network, add the following option to the ``[global]``
section of your Ceph configuration file.

.. code-block:: ini

    [global]
        # ... elided configuration
        public_network = {public-network/netmask}

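
As an illustration only, the placeholder above might be filled in with two
comma-separated subnets. The ``192.168.0.0/24`` and ``192.168.1.0/24`` values
are assumed example addresses, not defaults; substitute the subnets your hosts
actually use.

.. code-block:: ini

    [global]
        # Assumed example subnets; replace with your own.
        public_network = 192.168.0.0/24, 192.168.1.0/24

When more than one subnet is listed, the subnets must be able to route to each
other, and each one needs matching firewall rules.
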
.. _cluster-network:

Cluster Network
---------------

If you declare a cluster network, OSDs will route heartbeat, object replication
and recovery traffic over the cluster network. This may improve performance
compared to using a single network. To configure a cluster network, add the
following option to the ``[global]`` section of your Ceph configuration file.

.. code-block:: ini

    [global]
        # ... elided configuration
        cluster_network = {cluster-network/netmask}

We prefer that the cluster network is **NOT** reachable from the public network
or the Internet for added security.

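
A minimal sketch of a two-network setup, assuming a ``192.168.0.0/24`` public
subnet and a ``10.0.0.0/24`` cluster subnet (both values are placeholders
chosen for illustration), could look like this:

.. code-block:: ini

    [global]
        # Front-side traffic: clients, monitors, MDS, managers.
        public_network = 192.168.0.0/24
        # Back-side traffic: OSD heartbeat, replication and recovery.
        cluster_network = 10.0.0.0/24

Each OSD host then needs an address in both subnets, for example through a
second interface or a VLAN.
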

IPv4/IPv6 Dual Stack Mode
-------------------------

If you want to run in an IPv4/IPv6 dual stack mode and want to define your
public and/or cluster networks, then you need to specify both your IPv4 and
IPv6 networks for each:

.. code-block:: ini

    [global]
        # ... elided configuration
        public_network = {IPv4 public-network/netmask}, {IPv6 public-network/netmask}

This is so that Ceph can find a valid IP address for both address families.

If you want just an IPv4 or an IPv6 stack environment, then make sure you set
the ``ms_bind_ipv4`` and ``ms_bind_ipv6`` options correctly.

.. note::
   Binding to IPv4 is enabled by default, so if you just add the option to bind
   to IPv6 you'll actually put yourself into dual stack mode. If you want just
   IPv6, then disable IPv4 and enable IPv6. See `Bind`_ below.

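
For the IPv6-only case, a hedged sketch of the relevant options might look like
the following. The subnet is taken from the ``2001:db8::/32`` documentation
prefix and is an assumption for illustration, not a value to deploy.

.. code-block:: ini

    [global]
        # Disable the default IPv4 binding and enable IPv6 instead.
        ms_bind_ipv4 = false
        ms_bind_ipv6 = true
        # Assumed example subnet from the IPv6 documentation prefix.
        public_network = 2001:db8:0:1::/64

Leaving ``ms_bind_ipv4`` at its default of ``true`` while enabling
``ms_bind_ipv6`` would instead select dual stack mode.
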
Ceph Daemons
============

Monitor daemons are each configured to bind to a specific IP address. These
addresses are normally configured by your deployment tool. Other components
in the Ceph cluster discover the monitors via the ``mon_host`` configuration
option, normally specified in the ``[global]`` section of the ``ceph.conf`` file.

.. code-block:: ini

    [global]
        mon_host = 10.0.0.2, 10.0.0.3, 10.0.0.4

The ``mon_host`` value can be a list of IP addresses or a name that is
looked up via DNS. In the case of a DNS name with multiple A or AAAA
records, all records are probed in order to discover a monitor. Once
one monitor is reached, all other current monitors are discovered, so
the ``mon_host`` configuration option only needs to be sufficiently up
to date such that a client can reach one monitor that is currently online.

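
The DNS variant can be sketched as follows. The host name
``ceph-mon.example.com`` is hypothetical; it stands for a name that you have
set up to resolve to one A or AAAA record per monitor.

.. code-block:: ini

    [global]
        # Hypothetical DNS name with one A/AAAA record per monitor.
        mon_host = ceph-mon.example.com

With this form, adding or replacing a monitor is a DNS change rather than an
edit to every client's configuration file.
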
The MGR, OSD, and MDS daemons will bind to any available address and
do not require any special configuration. However, it is possible to
specify a specific IP address for them to bind to with the ``public_addr``
(and/or, in the case of OSD daemons, the ``cluster_addr``)
configuration option. For example:

.. code-block:: ini

    [osd.0]
        public_addr = {host-public-ip-address}
        cluster_addr = {host-cluster-ip-address}

.. topic:: One NIC OSD in a Two Network Cluster

   Generally, we do not recommend deploying an OSD host with a single network
   interface in a cluster with two networks. However, you may accomplish this
   by forcing the OSD host to operate on the public network by adding a
   ``public_addr`` entry to the ``[osd.n]`` section of the Ceph configuration
   file, where ``n`` refers to the ID of the OSD with one network interface.
   Additionally, the public network and cluster network must be able to route
   traffic to each other, which we don't recommend for security reasons.

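
As a sketch of the single-NIC workaround described in the topic above, assume
the OSD with ID ``3`` is the one with a single interface and that
``192.168.0.15`` is its address on the public network. Both the ID and the
address are hypothetical.

.. code-block:: ini

    [osd.3]
        # Hypothetical OSD ID and public-network address.
        public_addr = 192.168.0.15
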

Network Config Settings
=======================

Network configuration settings are not required. Ceph assumes a public network
with all hosts operating on it unless you specifically configure a cluster
network.


Public Network
--------------

The public network configuration allows you to specifically define IP addresses
and subnets for the public network. You may specifically assign static IP
addresses or override ``public_network`` settings using the ``public_addr``
setting for a specific daemon.

``public_network``

:Description: The IP address and netmask of the public (front-side) network
              (e.g., ``192.168.0.0/24``). Set in ``[global]``. You may specify
              comma-separated subnets.

:Type: ``{ip-address}/{netmask} [, {ip-address}/{netmask}]``
:Required: No
:Default: N/A


``public_addr``

:Description: The IP address for the public (front-side) network.
              Set for each daemon.

:Type: IP Address
:Required: No
:Default: N/A



Cluster Network
---------------

The cluster network configuration allows you to declare a cluster network, and
specifically define IP addresses and subnets for the cluster network. You may
specifically assign static IP addresses or override ``cluster_network``
settings using the ``cluster_addr`` setting for specific OSD daemons.


``cluster_network``

:Description: The IP address and netmask of the cluster (back-side) network
              (e.g., ``10.0.0.0/24``). Set in ``[global]``. You may specify
              comma-separated subnets.

:Type: ``{ip-address}/{netmask} [, {ip-address}/{netmask}]``
:Required: No
:Default: N/A


``cluster_addr``

:Description: The IP address for the cluster (back-side) network.
              Set for each daemon.

:Type: Address
:Required: No
:Default: N/A


Bind
----

Bind settings set the default port ranges Ceph OSD and MDS daemons use. The
default range is ``6800:7300``. Ensure that your `IP Tables`_ configuration
allows you to use the configured port range.

You may also enable Ceph daemons to bind to IPv6 addresses instead of, or in
addition to, IPv4 addresses.


``ms_bind_port_min``

:Description: The minimum port number to which an OSD or MDS daemon will bind.
:Type: 32-bit Integer
:Default: ``6800``
:Required: No


``ms_bind_port_max``

:Description: The maximum port number to which an OSD or MDS daemon will bind.
:Type: 32-bit Integer
:Default: ``7300``
:Required: No

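
A hedged example of narrowing the port range is shown below. The upper bound
of ``7100`` is an arbitrary value chosen for illustration; whatever range you
pick must leave enough ports for every daemon on the host (each OSD may use up
to four) and must match the range opened in your firewall rules.

.. code-block:: ini

    [global]
        ms_bind_port_min = 6800
        # Arbitrary illustrative upper bound; the default is 7300.
        ms_bind_port_max = 7100
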
``ms_bind_ipv4``

:Description: Enables Ceph daemons to bind to IPv4 addresses.
:Type: Boolean
:Default: ``true``
:Required: No

``ms_bind_ipv6``

:Description: Enables Ceph daemons to bind to IPv6 addresses.
:Type: Boolean
:Default: ``false``
:Required: No

``public_bind_addr``

:Description: In some dynamic deployments the Ceph MON daemon might bind
              to an IP address locally that is different from the ``public_addr``
              advertised to other peers in the network. The environment must ensure
              that routing rules are set correctly. If ``public_bind_addr`` is set
              the Ceph Monitor daemon will bind to it locally and use ``public_addr``
              in the monmaps to advertise its address to peers. This behavior is limited
              to the Monitor daemon.

:Type: IP Address
:Required: No
:Default: N/A


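
To make the distinction between the two monitor addresses concrete, the sketch
below assumes a monitor with the hypothetical ID ``a`` that binds locally to
``10.0.0.2`` while advertising ``203.0.113.10`` to its peers. Both addresses
are placeholders, and the routing between them is assumed to be provided by
the environment.

.. code-block:: ini

    [mon.a]
        # Address advertised to peers in the monmap (placeholder).
        public_addr = 203.0.113.10
        # Address the monitor binds to locally (placeholder).
        public_bind_addr = 10.0.0.2
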

TCP
---

Ceph disables TCP buffering by default.


``ms_tcp_nodelay``

:Description: Ceph enables ``ms_tcp_nodelay`` so that each request is sent
              immediately (no buffering). Disabling `Nagle's algorithm`_
              increases network traffic, which can introduce latency. If you
              experience large numbers of small packets, you may try
              disabling ``ms_tcp_nodelay``.

:Type: Boolean
:Required: No
:Default: ``true``


``ms_tcp_rcvbuf``

:Description: The size of the socket buffer on the receiving end of a network
              connection. Disabled by default.

:Type: 32-bit Integer
:Required: No
:Default: ``0``


``ms_tcp_read_timeout``

:Description: If a client or daemon makes a request to another Ceph daemon and
              does not drop an unused connection, the ``ms_tcp_read_timeout``
              defines the connection as idle after the specified number
              of seconds.

:Type: Unsigned 64-bit Integer
:Required: No
:Default: ``900`` (15 minutes)


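
The TCP settings above could be adjusted along the following lines. The values
are illustrative assumptions rather than recommendations, and any change
should be measured against your own workload.

.. code-block:: ini

    [global]
        # Re-enable Nagle's algorithm if small packets dominate.
        ms_tcp_nodelay = false
        # Treat connections as idle after 10 minutes instead of 15.
        ms_tcp_read_timeout = 600
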

.. _Scalability and High Availability: ../../../architecture#scalability-and-high-availability
.. _Hardware Recommendations - Networks: ../../../start/hardware-recommendations#networks
.. _hardware recommendations: ../../../start/hardware-recommendations
.. _Monitor / OSD Interaction: ../mon-osd-interaction
.. _Message Signatures: ../auth-config-ref#signatures
.. _CIDR: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing
.. _Nagle's Algorithm: https://en.wikipedia.org/wiki/Nagle's_algorithm