=================================
 Network Configuration Reference
=================================

Network configuration is critical for building a high performance :term:`Ceph
Storage Cluster`. The Ceph Storage Cluster does not perform request routing or
dispatching on behalf of the :term:`Ceph Client`. Instead, Ceph Clients make
requests directly to Ceph OSD Daemons. Ceph OSD Daemons perform data replication
on behalf of Ceph Clients, which means replication and other factors impose
additional loads on Ceph Storage Cluster networks.

Our Quick Start configurations provide a trivial Ceph configuration file that
sets monitor IP addresses and daemon host names only. Unless you specify a
cluster network, Ceph assumes a single "public" network. Ceph functions just
fine with a public network only, but you may see significant performance
improvement with a second "cluster" network in a large cluster.

It is possible to run a Ceph Storage Cluster with two networks: a public
(client, front-side) network and a cluster (private, replication, back-side)
network. However, this approach complicates network configuration (both
hardware and software) and does not usually have a significant impact on
overall performance. For this reason, we recommend that for resilience and
capacity dual-NIC systems either active/active bond these interfaces or
implement a layer 3 multipath strategy with, e.g., FRR.

If, despite the complexity, one still wishes to use two networks, each
:term:`Ceph Node` will need to have more than one network interface or VLAN. See
`Hardware Recommendations - Networks`_ for additional details.

.. ditaa::

                                 +-------------+
                                 | Ceph Client |
                                 +----*--*-----+
                                      |  ^
                              Request |  : Response
                                      v  |
   /----------------------------------*--*-------------------------------------\
   |                              Public Network                               |
   \---*--*------------*--*-------------*--*------------*--*------------*--*---/
       ^  ^            ^  ^             ^  ^            ^  ^            ^  ^
       |  |            |  |             |  |            |  |            |  |
       |  :            |  :             |  :            |  :            |  :
       v  v            v  v             v  v            v  v            v  v
   +---*--*---+    +---*--*---+     +---*--*---+    +---*--*---+    +---*--*---+
   | Ceph MON |    | Ceph MDS |     | Ceph OSD |    | Ceph OSD |    | Ceph OSD |
   +----------+    +----------+     +---*--*---+    +---*--*---+    +---*--*---+
                                        ^  ^            ^  ^            ^  ^
    The cluster network relieves        |  |            |  |            |  |
    OSD replication and heartbeat       |  :            |  :            |  :
    traffic from the public network.    v  v            v  v            v  v
   /------------------------------------*--*------------*--*------------*--*---\
   |                           cCCC Cluster Network                            |
   \---------------------------------------------------------------------------/



IP Tables
=========

By default, daemons `bind`_ to ports within the ``6800:7300`` range. You may
configure this range at your discretion. Before configuring your IP tables,
check the default ``iptables`` configuration.

.. prompt:: bash $

   sudo iptables -L

Some Linux distributions include rules that reject all inbound requests
except SSH from all network interfaces. For example::

    REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited

You will need to delete these rules on both your public and cluster networks
initially, and replace them with appropriate rules when you are ready to
harden the ports on your Ceph Nodes.


Monitor IP Tables
-----------------

Ceph Monitors listen on ports ``3300`` and ``6789`` by default. Additionally,
Ceph Monitors always operate on the public network. When you add the rule
using the example below, make sure you replace ``{iface}`` with the public
network interface (e.g., ``eth0``, ``eth1``, etc.), ``{ip-address}`` with the
IP address of the public network, and ``{netmask}`` with the netmask for the
public network:


.. prompt:: bash $

   sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 3300,6789 -j ACCEPT



MDS and Manager IP Tables
-------------------------

A :term:`Ceph Metadata Server` or :term:`Ceph Manager` listens on the first
available port on the public network beginning at port 6800. Note that this
behavior is not deterministic, so if you are running more than one OSD or MDS
on the same host, or if you restart the daemons within a short window of time,
the daemons will bind to higher ports. You should open the entire ``6800:7300``
range by default. When you add the rule using the example below, make sure
you replace ``{iface}`` with the public network interface (e.g., ``eth0``,
``eth1``, etc.), ``{ip-address}`` with the IP address of the public network,
and ``{netmask}`` with the netmask of the public network.

For example:

.. prompt:: bash $

   sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 6800:7300 -j ACCEPT



OSD IP Tables
-------------

By default, Ceph OSD Daemons `bind`_ to the first available ports on a Ceph Node
beginning at port 6800. Note that this behavior is not deterministic, so if you
are running more than one OSD or MDS on the same host, or if you restart the
daemons within a short window of time, the daemons will bind to higher ports.
Each Ceph OSD Daemon on a Ceph Node may use up to four ports:

#. One for talking to clients and monitors.
#. One for sending data to other OSDs.
#. Two for heartbeating on each interface.

.. ditaa::

              /---------------\
              |      OSD      |
              |           +---+----------------+-----------+
              |           | Clients & Monitors | Heartbeat |
              |           +---+----------------+-----------+
              |               |
              |           +---+----------------+-----------+
              |           | Data Replication   | Heartbeat |
              |           +---+----------------+-----------+
              | cCCC          |
              \---------------/

When a daemon fails and restarts without letting go of the port, the restarted
daemon will bind to a new port. You should open the entire ``6800:7300`` port
range to handle this possibility.

If you set up separate public and cluster networks, you must add rules for both
the public network and the cluster network, because clients will connect using
the public network and other Ceph OSD Daemons will connect using the cluster
network. When you add the rule using the example below, make sure you replace
``{iface}`` with the network interface (e.g., ``eth0``, ``eth1``, etc.),
``{ip-address}`` with the IP address, and ``{netmask}`` with the netmask of the
public or cluster network. For example:

.. prompt:: bash $

   sudo iptables -A INPUT -i {iface} -m multiport -p tcp -s {ip-address}/{netmask} --dports 6800:7300 -j ACCEPT

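
With two networks, the rule above is needed once per network: once on the
public interface and once on the cluster interface. A sketch of the pair of
rules, where ``{public-iface}``, ``{cluster-iface}``, and the network values
are placeholders for your own configuration (not defaults):

.. prompt:: bash $

   sudo iptables -A INPUT -i {public-iface} -m multiport -p tcp -s {public-network}/{netmask} --dports 6800:7300 -j ACCEPT
   sudo iptables -A INPUT -i {cluster-iface} -m multiport -p tcp -s {cluster-network}/{netmask} --dports 6800:7300 -j ACCEPT
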

.. tip:: If you run Ceph Metadata Servers on the same Ceph Node as the
   Ceph OSD Daemons, you can consolidate the public network configuration step.


Ceph Networks
=============

To configure Ceph networks, you must add a network configuration to the
``[global]`` section of the configuration file. Our 5-minute Quick Start
provides a trivial Ceph configuration file that assumes one public network
with client and server on the same network and subnet. Ceph functions just fine
with a public network only. However, Ceph allows you to establish much more
specific criteria, including multiple IP networks and subnet masks for your
public network. You can also establish a separate cluster network to handle OSD
heartbeat, object replication and recovery traffic. Don't confuse the IP
addresses you set in your configuration with the public-facing IP addresses
network clients may use to access your service. Typical internal IP networks are
often ``192.168.0.0`` or ``10.0.0.0``.

.. tip:: If you specify more than one IP address and subnet mask for
   either the public or the cluster network, the subnets within the network
   must be capable of routing to each other. Additionally, make sure you
   include each IP address/subnet in your IP tables and open ports for them
   as necessary.

.. note:: Ceph uses `CIDR`_ notation for subnets (e.g., ``10.0.0.0/24``).

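
As an illustration, a ``[global]`` section combining two public subnets with a
separate cluster network might look like the following (the RFC 1918 subnets
shown are hypothetical examples, not defaults):

.. code-block:: ini

   [global]
   # two mutually routable public subnets, plus a separate cluster network
   public_network = 10.0.0.0/24, 192.168.0.0/24
   cluster_network = 10.0.1.0/24
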

When you have configured your networks, you may restart your cluster or restart
each daemon. Ceph daemons bind dynamically, so you do not have to restart the
entire cluster at once if you change your network configuration.


Public Network
--------------

To configure a public network, add the following option to the ``[global]``
section of your Ceph configuration file.

.. code-block:: ini

   [global]
   # ... elided configuration
   public_network = {public-network/netmask}

.. _cluster-network:

Cluster Network
---------------

If you declare a cluster network, OSDs will route heartbeat, object replication
and recovery traffic over the cluster network. This may improve performance
compared to using a single network. To configure a cluster network, add the
following option to the ``[global]`` section of your Ceph configuration file.

.. code-block:: ini

   [global]
   # ... elided configuration
   cluster_network = {cluster-network/netmask}

For added security, we prefer that the cluster network is **NOT** reachable
from the public network or the Internet.


IPv4/IPv6 Dual Stack Mode
-------------------------

If you want to run in an IPv4/IPv6 dual stack mode and want to define your public and/or
cluster networks, then you need to specify both your IPv4 and IPv6 networks for each:

.. code-block:: ini

   [global]
   # ... elided configuration
   public_network = {IPv4 public-network/netmask}, {IPv6 public-network/netmask}

This is so that Ceph can find a valid IP address for both address families.

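
The same applies to the cluster network if you define one. A hypothetical
dual-stack sketch (the IPv4 subnet and IPv6 ULA prefix are example values
only):

.. code-block:: ini

   [global]
   # example values only; substitute your own IPv4 and IPv6 subnets
   cluster_network = 10.0.1.0/24, fd00:1::/64
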

If you want just an IPv4 or an IPv6 stack environment, then make sure you set the ``ms bind``
options correctly.

.. note::
   Binding to IPv4 is enabled by default, so if you just add the option to bind to IPv6
   you'll actually put yourself into dual stack mode. If you want just IPv6, then disable IPv4 and
   enable IPv6. See `Bind`_ below.

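
For example, an IPv6-only cluster would disable IPv4 binding and enable IPv6,
using the ``ms_bind_*`` options described under `Bind`_ below (a minimal
sketch):

.. code-block:: ini

   [global]
   # IPv6-only: turn off the default IPv4 binding, turn on IPv6
   ms_bind_ipv4 = false
   ms_bind_ipv6 = true
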

Ceph Daemons
============

Monitor daemons are each configured to bind to a specific IP address. These
addresses are normally configured by your deployment tool. Other components
in the Ceph cluster discover the monitors via the ``mon_host`` configuration
option, normally specified in the ``[global]`` section of the ``ceph.conf`` file.

.. code-block:: ini

   [global]
   mon_host = 10.0.0.2, 10.0.0.3, 10.0.0.4

The ``mon_host`` value can be a list of IP addresses or a name that is
looked up via DNS. In the case of a DNS name with multiple A or AAAA
records, all records are probed in order to discover a monitor. Once
one monitor is reached, all other current monitors are discovered, so
the ``mon_host`` configuration option only needs to be sufficiently up
to date such that a client can reach one monitor that is currently online.

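
For example, instead of a list of addresses, ``mon_host`` may name a DNS
record that resolves to one A/AAAA record per monitor (``mon.example.com`` is
a hypothetical name, not a default):

.. code-block:: ini

   [global]
   # a DNS name whose A/AAAA records point at the monitors
   mon_host = mon.example.com
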

The MGR, OSD, and MDS daemons will bind to any available address and
do not require any special configuration. However, it is possible to
specify a specific IP address for them to bind to with the ``public_addr``
(and/or, in the case of OSD daemons, the ``cluster_addr``) configuration
option. For example:

.. code-block:: ini

   [osd.0]
   public_addr = {host-public-ip-address}
   cluster_addr = {host-cluster-ip-address}

.. topic:: One NIC OSD in a Two Network Cluster

   Generally, we do not recommend deploying an OSD host with a single network interface in a
   cluster with two networks. However, you may accomplish this by forcing the
   OSD host to operate on the public network by adding a ``public_addr`` entry
   to the ``[osd.n]`` section of the Ceph configuration file, where ``n``
   refers to the ID of the OSD with one network interface. Additionally, the public
   network and cluster network must be able to route traffic to each other,
   which we don't recommend for security reasons.


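
For example, a single-NIC OSD with the hypothetical ID ``1`` could be pinned
to the public network like so (the ID and placeholder address are
illustrative):

.. code-block:: ini

   [osd.1]
   # force this one-NIC OSD onto the public network
   public_addr = {host-public-ip-address}
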

Network Config Settings
=======================

Network configuration settings are not required. Ceph assumes a public network
with all hosts operating on it unless you specifically configure a cluster
network.


Public Network
--------------

The public network configuration allows you to specifically define IP addresses
and subnets for the public network. You may specifically assign static IP
addresses or override ``public_network`` settings using the ``public_addr``
setting for a specific daemon.

.. confval:: public_network
.. confval:: public_addr

Cluster Network
---------------

The cluster network configuration allows you to declare a cluster network, and
specifically define IP addresses and subnets for the cluster network. You may
specifically assign static IP addresses or override ``cluster_network``
settings using the ``cluster_addr`` setting for specific OSD daemons.


.. confval:: cluster_network
.. confval:: cluster_addr

Bind
----

Bind settings set the default port ranges Ceph OSD and MDS daemons use. The
default range is ``6800:7300``. Ensure that your `IP Tables`_ configuration
allows you to use the configured port range.

You may also enable Ceph daemons to bind to IPv6 addresses instead of IPv4
addresses.

.. confval:: ms_bind_port_min
.. confval:: ms_bind_port_max
.. confval:: ms_bind_ipv4
.. confval:: ms_bind_ipv6
.. confval:: public_bind_addr

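
For example, a sketch of widening the daemon port range beyond the default
(the upper bound shown is a hypothetical value, not a recommendation):

.. code-block:: ini

   [global]
   # keep the default lower bound, raise the upper bound for dense hosts
   ms_bind_port_min = 6800
   ms_bind_port_max = 7568

Remember to open the same range in your `IP Tables`_ rules.
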

TCP
---

Ceph disables TCP buffering by default.

.. confval:: ms_tcp_nodelay
.. confval:: ms_tcp_rcvbuf


General Settings
----------------

.. confval:: ms_type
.. confval:: ms_async_op_threads
.. confval:: ms_initial_backoff
.. confval:: ms_max_backoff
.. confval:: ms_die_on_bad_msg
.. confval:: ms_dispatch_throttle_bytes
.. confval:: ms_inject_socket_failures


.. _Scalability and High Availability: ../../../architecture#scalability-and-high-availability
.. _Hardware Recommendations - Networks: ../../../start/hardware-recommendations#networks
.. _hardware recommendations: ../../../start/hardware-recommendations
.. _Monitor / OSD Interaction: ../mon-osd-interaction
.. _Message Signatures: ../auth-config-ref#signatures
.. _CIDR: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing
.. _Nagle's Algorithm: https://en.wikipedia.org/wiki/Nagle's_algorithm