=================================
Network Configuration Reference
=================================

Network configuration is critical for building a high performance :term:`Ceph
Storage Cluster`. The Ceph Storage Cluster does not perform request routing or
dispatching on behalf of the :term:`Ceph Client`. Instead, Ceph Clients make
requests directly to Ceph OSD Daemons. Ceph OSD Daemons perform data replication
on behalf of Ceph Clients, which means replication and other factors impose
additional loads on Ceph Storage Cluster networks.

Our Quick Start configurations provide a trivial Ceph configuration file that
sets monitor IP addresses and daemon host names only. Unless you specify a
cluster network, Ceph assumes a single "public" network. Ceph functions just
fine with only a public network, although a second "cluster" network may
improve performance in a large cluster.

It is possible to run a Ceph Storage Cluster with two networks: a public
(client, front-side) network and a cluster (private, replication, back-side)
network. However, this approach complicates network configuration (both
hardware and software) and does not usually have a significant impact on
overall performance. For this reason, we recommend that, for resilience and
capacity, dual-NIC systems either bond these interfaces in active/active mode
or implement a layer 3 multipath strategy with, for example, FRR.

If, despite the complexity, you still wish to use two networks, each
:term:`Ceph Node` will need more than one network interface or VLAN. See
`Hardware Recommendations - Networks`_ for additional details.

.. ditaa::
                                 +-------------+
                                 | Ceph Client |
                                 +----*--*-----+
                                      |  ^
                              Request |  : Response
                                      v  |
   /----------------------------------*--*-------------------------------------\
   |                              Public Network                               |
   \---*--*------------*--*-------------*--*------------*--*------------*--*---/
       ^  ^            ^  ^             ^  ^            ^  ^            ^  ^
       |  |            |  |             |  |            |  |            |  |
       |  :            |  :             |  :            |  :            |  :
       v  v            v  v             v  v            v  v            v  v
   +---*--*---+    +---*--*---+     +---*--*---+    +---*--*---+    +---*--*---+
   | Ceph MON |    | Ceph MDS |     | Ceph OSD |    | Ceph OSD |    | Ceph OSD |
   +----------+    +----------+     +---*--*---+    +---*--*---+    +---*--*---+
                                        ^  ^            ^  ^            ^  ^
       The cluster network relieves     |  |            |  |            |  |
       OSD replication and heartbeat    |  :            |  :            |  :
       traffic from the public network. v  v            v  v            v  v
   /------------------------------------*--*------------*--*------------*--*---\
   |   cCCC                      Cluster Network                               |
   \---------------------------------------------------------------------------/


IP Tables
=========

By default, daemons `bind`_ to ports within the ``6800:7300`` range. You may
configure this range at your discretion. Before configuring your IP tables,
check the default ``iptables`` configuration::

    sudo iptables -L

Some Linux distributions include rules that reject all inbound requests
except SSH from all network interfaces. For example::

    REJECT all -- anywhere anywhere reject-with icmp-host-prohibited

You will need to delete these rules on both your public and cluster networks
initially, and replace them with appropriate rules when you are ready to
harden the ports on your Ceph Nodes.


Monitor IP Tables
-----------------

Ceph Monitors listen on ports ``3300`` and ``6789`` by
default. Additionally, Ceph Monitors always operate on the public
network. When you add the rule using the example below, make sure you
replace ``{iface}`` with the public network interface (e.g., ``eth0``,
``eth1``, etc.), ``{ip-address}`` with the IP address of the public
network and ``{netmask}`` with the netmask for the public network. ::

    sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 3300,6789 -j ACCEPT


MDS and Manager IP Tables
-------------------------

A :term:`Ceph Metadata Server` or :term:`Ceph Manager` listens on the first
available port on the public network beginning at port 6800. Note that this
behavior is not deterministic, so if you are running more than one OSD or MDS
on the same host, or if you restart the daemons within a short window of time,
the daemons will bind to higher ports. You should open the entire 6800-7300
range by default. When you add the rule using the example below, make sure
you replace ``{iface}`` with the public network interface (e.g., ``eth0``,
``eth1``, etc.), ``{ip-address}`` with the IP address of the public network
and ``{netmask}`` with the netmask of the public network.

For example::

    sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 6800:7300 -j ACCEPT


OSD IP Tables
-------------

By default, Ceph OSD Daemons `bind`_ to the first available ports on a Ceph Node
beginning at port 6800. Note that this behavior is not deterministic, so if you
are running more than one OSD or MDS on the same host, or if you restart the
daemons within a short window of time, the daemons will bind to higher ports.
Each Ceph OSD Daemon on a Ceph Node may use up to four ports:

#. One for talking to clients and monitors.
#. One for sending data to other OSDs.
#. Two for heartbeating on each interface.

.. ditaa::
   /---------------\
   |      OSD      |
   |           +---+----------------+-----------+
   |           | Clients & Monitors | Heartbeat |
   |           +---+----------------+-----------+
   |               |
   |           +---+----------------+-----------+
   |           | Data Replication   | Heartbeat |
   |           +---+----------------+-----------+
   | cCCC          |
   \---------------/

When a daemon fails and restarts without letting go of the port, the restarted
daemon will bind to a new port. You should open the entire 6800-7300 port range
to handle this possibility.
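
As a rough sanity check on these figures, the following Python sketch compares
the worst-case number of ports that the OSDs on one host might claim (four
each, per the list above) with the size of the default ``6800:7300`` range.
The helper ``ports_fit`` and its parameters are hypothetical, and the range is
assumed to be inclusive at both ends. Other daemons on the same host, such as
MDS or Manager daemons, draw from the same range and are not counted here.

.. code-block:: python

   PORTS_PER_OSD = 4  # upper bound taken from the list above


   def ports_fit(num_osds, port_min=6800, port_max=7300):
       """Return (needed, available) port counts for a single host."""
       needed = num_osds * PORTS_PER_OSD
       available = port_max - port_min + 1  # assumption: inclusive range
       return needed, available


   # Hypothetical host carrying 24 OSDs.
   needed, available = ports_fit(24)
   print(needed, available, needed <= available)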

If you set up separate public and cluster networks, you must add rules for both
the public network and the cluster network, because clients will connect using
the public network and other Ceph OSD Daemons will connect using the cluster
network. When you add the rule using the example below, make sure you replace
``{iface}`` with the network interface (e.g., ``eth0``, ``eth1``, etc.),
``{ip-address}`` with the IP address and ``{netmask}`` with the netmask of the
public or cluster network. For example::

    sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 6800:7300 -j ACCEPT

.. tip:: If you run Ceph Metadata Servers on the same Ceph Node as the
   Ceph OSD Daemons, you can consolidate the public network configuration step.
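
Because the public and cluster rules differ only in interface and subnet, they
can be generated rather than typed by hand. The Python sketch below only
builds and prints rule strings in the same form as the example above; it does
not run ``iptables``. The helper ``accept_rule`` and the interface names and
subnets are hypothetical placeholders.

.. code-block:: python

   def accept_rule(iface, source_cidr, dports="6800:7300"):
       """Build one ACCEPT rule string in the form used in this section."""
       return (
           f"iptables -A INPUT -i {iface} -m multiport -p tcp "
           f"-s {source_cidr} --dports {dports} -j ACCEPT"
       )


   # Hypothetical public and cluster networks; substitute your own values.
   networks = {"eth0": "192.168.0.0/24", "eth1": "10.0.0.0/24"}
   for iface, cidr in networks.items():
       print(accept_rule(iface, cidr))

Review the printed rules before running them with ``sudo``.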


Ceph Networks
=============

To configure Ceph networks, you must add a network configuration to the
``[global]`` section of the configuration file. Our 5-minute Quick Start
provides a trivial Ceph configuration file that assumes one public network
with client and server on the same network and subnet. Ceph functions just fine
with a public network only. However, Ceph allows you to establish much more
specific criteria, including multiple IP networks and subnet masks for your
public network. You can also establish a separate cluster network to handle OSD
heartbeat, object replication and recovery traffic. Don't confuse the IP
addresses you set in your configuration with the public-facing IP addresses
network clients may use to access your service. Internal IP networks are
typically ``192.168.0.0`` or ``10.0.0.0``.

.. tip:: If you specify more than one IP address and subnet mask for
   either the public or the cluster network, the subnets within the network
   must be capable of routing to each other. Additionally, make sure you
   include each IP address/subnet in your IP tables and open ports for them
   as necessary.

.. note:: Ceph uses `CIDR`_ notation for subnets (e.g., ``10.0.0.0/24``).
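
To make the CIDR notation concrete, the following Python sketch uses the
standard ``ipaddress`` module to expand the ``10.0.0.0/24`` example into its
netmask and size, and to test whether an address falls inside the subnet. The
two host addresses are made-up examples.

.. code-block:: python

   import ipaddress

   network = ipaddress.ip_network("10.0.0.0/24")

   # The /24 prefix expressed as a dotted-quad netmask.
   print(network.netmask)
   # Total number of addresses in the block.
   print(network.num_addresses)

   # Membership tests for two hypothetical host addresses.
   inside = ipaddress.ip_address("10.0.0.17") in network
   outside = ipaddress.ip_address("10.0.1.17") in network
   print(inside, outside)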

When you have configured your networks, you may restart your cluster or restart
each daemon. Ceph daemons bind dynamically, so you do not have to restart the
entire cluster at once if you change your network configuration.


Public Network
--------------

To configure a public network, add the following option to the ``[global]``
section of your Ceph configuration file.

.. code-block:: ini

    [global]
        # ... elided configuration
        public_network = {public-network/netmask}

.. _cluster-network:

Cluster Network
---------------

If you declare a cluster network, OSDs will route heartbeat, object replication
and recovery traffic over the cluster network. This may improve performance
compared to using a single network. To configure a cluster network, add the
following option to the ``[global]`` section of your Ceph configuration file.

.. code-block:: ini

    [global]
        # ... elided configuration
        cluster_network = {cluster-network/netmask}

We prefer that the cluster network is **NOT** reachable from the public network
or the Internet for added security.
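
One cheap check that follows from this preference is that the two subnets
should at least not overlap. The Python sketch below uses the standard
``ipaddress`` module; the helper ``networks_overlap`` and the subnets are
hypothetical. Passing this check is necessary but not sufficient, because
reachability also depends on routing and firewall rules.

.. code-block:: python

   import ipaddress


   def networks_overlap(public_network, cluster_network):
       """Return True if the two CIDR subnets share any addresses."""
       public = ipaddress.ip_network(public_network)
       cluster = ipaddress.ip_network(cluster_network)
       return public.overlaps(cluster)


   # Hypothetical subnets; substitute the values from your configuration.
   print(networks_overlap("192.168.0.0/24", "10.0.0.0/24"))
   print(networks_overlap("10.0.0.0/16", "10.0.1.0/24"))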

IPv4/IPv6 Dual Stack Mode
-------------------------

If you want to run in an IPv4/IPv6 dual stack mode and want to define your public and/or
cluster networks, then you need to specify both your IPv4 and IPv6 networks for each:

.. code-block:: ini

    [global]
        # ... elided configuration
        public_network = {IPv4 public-network/netmask}, {IPv6 public-network/netmask}

This is so that Ceph can find a valid IP address for both address families.
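
A comma-separated value of this kind can be checked for both address families
before it is deployed. The Python sketch below is an illustration only, not
how Ceph parses the option; the helper ``families`` and the subnets are
hypothetical, with the IPv6 subnet taken from the documentation prefix.

.. code-block:: python

   import ipaddress


   def families(public_network):
       """Return the set of IP versions found in a comma-separated subnet list."""
       subnets = [s.strip() for s in public_network.split(",") if s.strip()]
       return {ipaddress.ip_network(s).version for s in subnets}


   # Hypothetical dual stack value.
   value = "192.168.0.0/24, 2001:db8::/64"
   print(sorted(families(value)))
   print("dual stack ready:", families(value) == {4, 6})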

If you want just an IPv4 or an IPv6 stack environment, then make sure you set the
``ms_bind_ipv4`` and ``ms_bind_ipv6`` options correctly.

.. note::
   Binding to IPv4 is enabled by default, so if you just add the option to bind to IPv6
   you'll actually put yourself into dual stack mode. If you want just IPv6, then disable IPv4 and
   enable IPv6. See `Bind`_ below.
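
The combinations described in the note can be written out as a small truth
table. The Python sketch below is illustrative and is not Ceph code;
``bind_mode`` is a hypothetical helper whose parameters are named after the
``ms_bind_ipv4`` and ``ms_bind_ipv6`` options, and its defaults assume IPv4
enabled and IPv6 disabled, as the note implies.

.. code-block:: python

   def bind_mode(ms_bind_ipv4=True, ms_bind_ipv6=False):
       """Describe which address families the two bind options select."""
       if ms_bind_ipv4 and ms_bind_ipv6:
           return "dual stack"
       if ms_bind_ipv6:
           return "IPv6 only"
       if ms_bind_ipv4:
           return "IPv4 only"
       return "no address family enabled"


   print(bind_mode())                                       # assumed defaults
   print(bind_mode(ms_bind_ipv6=True))                      # only IPv6 added
   print(bind_mode(ms_bind_ipv4=False, ms_bind_ipv6=True))  # IPv4 disabled too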

Ceph Daemons
============

Monitor daemons are each configured to bind to a specific IP address. These
addresses are normally configured by your deployment tool. Other components
in the Ceph cluster discover the monitors via the ``mon_host`` configuration
option, normally specified in the ``[global]`` section of the ``ceph.conf`` file.

.. code-block:: ini

    [global]
        mon_host = 10.0.0.2, 10.0.0.3, 10.0.0.4

The ``mon_host`` value can be a list of IP addresses or a name that is
looked up via DNS. In the case of a DNS name with multiple A or AAAA
records, all records are probed in order to discover a monitor. Once
one monitor is reached, all other current monitors are discovered, so
the ``mon_host`` configuration option only needs to be sufficiently up
to date such that a client can reach one monitor that is currently online.
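
The two kinds of entry can be told apart mechanically. The Python sketch below
splits a ``mon_host`` value into literal IP addresses and names that would
need a DNS lookup. It is a simplification: ``parse_mon_host`` is a
hypothetical helper, it performs no DNS queries, and it assumes bare entries
without ports or protocol prefixes. The name ``mon.example.com`` is a made-up
placeholder.

.. code-block:: python

   import ipaddress


   def parse_mon_host(value):
       """Split a comma-separated mon_host value into IP addresses and names."""
       addresses, names = [], []
       for item in (part.strip() for part in value.split(",")):
           if not item:
               continue
           try:
               addresses.append(ipaddress.ip_address(item))
           except ValueError:
               names.append(item)  # not a bare IP; would need a DNS lookup
       return addresses, names


   addresses, names = parse_mon_host("10.0.0.2, 10.0.0.3, 10.0.0.4")
   print([str(a) for a in addresses], names)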

The MGR, OSD, and MDS daemons will bind to any available address and
do not require any special configuration. However, it is possible to
specify a specific IP address for them to bind to with the
``public_addr`` (and/or, in the case of OSD daemons, the ``cluster_addr``)
configuration option. For example:

.. code-block:: ini

    [osd.0]
        public_addr = {host-public-ip-address}
        cluster_addr = {host-cluster-ip-address}

.. topic:: One NIC OSD in a Two Network Cluster

   Generally, we do not recommend deploying an OSD host with a single network interface in a
   cluster with two networks. However, you may accomplish this by forcing the
   OSD host to operate on the public network by adding a ``public_addr`` entry
   to the ``[osd.n]`` section of the Ceph configuration file, where ``n``
   refers to the ID of the OSD with one network interface. Additionally, the public
   network and cluster network must be able to route traffic to each other,
   which we don't recommend for security reasons.


Network Config Settings
=======================

Network configuration settings are not required. Ceph assumes a public network
with all hosts operating on it unless you specifically configure a cluster
network.


Public Network
--------------

The public network configuration allows you to specifically define IP addresses
and subnets for the public network. You may specifically assign static IP
addresses or override ``public_network`` settings using the ``public_addr``
setting for a specific daemon.

.. confval:: public_network
.. confval:: public_addr

Cluster Network
---------------

The cluster network configuration allows you to declare a cluster network, and
specifically define IP addresses and subnets for the cluster network. You may
specifically assign static IP addresses or override ``cluster_network``
settings using the ``cluster_addr`` setting for specific OSD daemons.


.. confval:: cluster_network
.. confval:: cluster_addr

Bind
----

Bind settings set the default port ranges Ceph OSD and MDS daemons use. The
default range is ``6800:7300``. Ensure that your `IP Tables`_ configuration
allows you to use the configured port range.

You may also enable Ceph daemons to bind to IPv6 addresses instead of IPv4
addresses.

.. confval:: ms_bind_port_min
.. confval:: ms_bind_port_max
.. confval:: ms_bind_ipv4
.. confval:: ms_bind_ipv6
.. confval:: public_bind_addr

TCP
---

Ceph disables TCP buffering by default: ``ms_tcp_nodelay`` is enabled, which
turns off `Nagle's Algorithm`_ so that messages are sent without waiting to be
coalesced.

.. confval:: ms_tcp_nodelay
.. confval:: ms_tcp_rcvbuf
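
At the socket level, disabling TCP buffering is done with the standard
``TCP_NODELAY`` socket option. The Python sketch below shows that general
mechanism on an unconnected socket; it illustrates the operating system
interface and is not taken from Ceph's own messenger code.

.. code-block:: python

   import socket

   sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   try:
       # Disable Nagle's algorithm so small writes are not held back.
       sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
       # Read the option back to confirm that it took effect.
       nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
       print("TCP_NODELAY enabled:", nodelay != 0)
   finally:
       sock.close()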

General Settings
----------------

.. confval:: ms_type
.. confval:: ms_async_op_threads
.. confval:: ms_initial_backoff
.. confval:: ms_max_backoff
.. confval:: ms_die_on_bad_msg
.. confval:: ms_dispatch_throttle_bytes
.. confval:: ms_inject_socket_failures


.. _Scalability and High Availability: ../../../architecture#scalability-and-high-availability
.. _Hardware Recommendations - Networks: ../../../start/hardware-recommendations#networks
.. _hardware recommendations: ../../../start/hardware-recommendations
.. _Monitor / OSD Interaction: ../mon-osd-interaction
.. _Message Signatures: ../auth-config-ref#signatures
.. _CIDR: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing
.. _Nagle's Algorithm: https://en.wikipedia.org/wiki/Nagle's_algorithm