.. _msgr2:

Messenger v2
============

What is it
----------

The messenger v2 protocol, or msgr2, is the second major revision of
Ceph's on-wire protocol. It brings with it several key features:

* A *secure* mode that encrypts all data passing over the network
* Improved encapsulation of authentication payloads, enabling future
  integration of new authentication modes like Kerberos
* Earlier and improved feature advertisement and negotiation, enabling
  future protocol revisions

Ceph daemons can now bind to multiple ports, allowing both legacy Ceph
clients and new v2-capable clients to connect to the same cluster.

By default, monitors now bind to the new IANA-assigned port ``3300``
(ce4h or 0xce4) for the new v2 protocol, while also binding to the
old default port ``6789`` for the legacy v1 protocol.

.. _address_formats:

Address formats
---------------

Prior to Nautilus, all network addresses were rendered like
``1.2.3.4:567/89012`` where there was an IP address, a port, and a
nonce to uniquely identify a client or daemon on the network.
Starting with Nautilus, we now have three different address types:

* **v2**: ``v2:1.2.3.4:578/89012`` identifies a daemon binding to a
  port speaking the new v2 protocol
* **v1**: ``v1:1.2.3.4:578/89012`` identifies a daemon binding to a
  port speaking the legacy v1 protocol. Any daemon address that was
  previously shown without a prefix is now shown as a ``v1:`` address.
* **TYPE_ANY**: ``any:1.2.3.4:578/89012`` identifies a client that can
  speak either version of the protocol. Prior to Nautilus, clients would
  appear as ``1.2.3.4:0/123456``, where the port of 0 indicates they are
  clients and do not accept incoming connections. Starting with Nautilus,
  these clients are now internally represented by a **TYPE_ANY**
  address, and still shown with no prefix, because they may
  connect to daemons using the v2 or v1 protocol, depending on what
  protocol(s) the daemons are using.
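
The address formats above can be illustrated with a small parser. This is
only a sketch for reading such strings in scripts, not Ceph's own parsing
code: the ``EntityAddr`` type and ``parse_addr`` function are hypothetical
names, only IPv4 literals are handled, and a string with no prefix is
assumed to be a **TYPE_ANY** client address.

.. code-block:: python

   from typing import NamedTuple


   class EntityAddr(NamedTuple):
       """Hypothetical container for one rendered address."""
       addr_type: str  # "v1", "v2", or "any"
       host: str
       port: int
       nonce: int


   def parse_addr(text: str) -> EntityAddr:
       """Parse 'v2:1.2.3.4:578/89012' or a prefix-less '1.2.3.4:0/123456'."""
       addr_type = "any"  # assumption: no prefix means TYPE_ANY
       for prefix in ("v1", "v2", "any"):
           if text.startswith(prefix + ":"):
               addr_type = prefix
               text = text[len(prefix) + 1:]
               break
       # Split off the nonce, then split host and port at the last colon.
       hostport, _, nonce = text.partition("/")
       host, _, port = hostport.rpartition(":")
       return EntityAddr(addr_type, host, int(port), int(nonce or 0))

A port of ``0`` in the parsed result marks a client that does not accept
incoming connections, as described above.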

Because daemons now bind to multiple ports, they are now described by
a vector of addresses instead of a single address. For example,
dumping the monitor map on a Nautilus cluster now includes lines
like::

  epoch 1
  fsid 50fcf227-be32-4bcb-8b41-34ca8370bd16
  last_changed 2019-02-25 11:10:46.700821
  created 2019-02-25 11:10:46.700821
  min_mon_release 14 (nautilus)
  0: [v2:10.0.0.10:3300/0,v1:10.0.0.10:6789/0] mon.foo
  1: [v2:10.0.0.11:3300/0,v1:10.0.0.11:6789/0] mon.bar
  2: [v2:10.0.0.12:3300/0,v1:10.0.0.12:6789/0] mon.baz

The bracketed list or vector of addresses means that the same daemon can be
reached on multiple ports (and protocols). Any client or other daemon
connecting to that daemon will use the v2 protocol (listed first) if
possible; otherwise it will fall back to the legacy v1 protocol. Legacy
clients will only see the v1 addresses and will continue to connect as
they did before, with the v1 protocol.
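
The "v2 first, otherwise v1" preference can be modelled in a few lines. The
sketch below is a simplified illustration rather than Ceph's connection
logic: ``parse_addrvec`` and ``pick_addr`` are hypothetical helpers, and the
set of protocols a peer supports is passed in explicitly.

.. code-block:: python

   from typing import List, Optional, Set


   def parse_addrvec(text: str) -> List[str]:
       """Split '[v2:a:3300/0,v1:a:6789/0]' into its member address strings."""
       inner = text.strip().strip("[]")
       return [part.strip() for part in inner.split(",") if part.strip()]


   def pick_addr(addrvec: List[str], supported: Set[str]) -> Optional[str]:
       """Return the first listed address whose protocol the caller supports."""
       for addr in addrvec:  # order matters: v2 is listed first
           proto = addr.split(":", 1)[0]
           if proto in supported:
               return addr
       return None


   vec = parse_addrvec("[v2:10.0.0.10:3300/0,v1:10.0.0.10:6789/0]")
   modern_choice = pick_addr(vec, {"v1", "v2"})  # picks the v2 address
   legacy_choice = pick_addr(vec, {"v1"})        # falls back to v1

Keeping the vector ordered by preference means the selection needs no extra
priority table; the first match wins.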

Starting in Nautilus, the ``mon_host`` configuration option and the ``-m
<mon-host>`` command line option support the same bracketed address
vector syntax.


Bind configuration options
^^^^^^^^^^^^^^^^^^^^^^^^^^

Two new configuration options control whether the v1 and/or v2
protocol is used:

* :confval:`ms_bind_msgr1` [default: true] controls whether a daemon binds
  to a port speaking the v1 protocol
* :confval:`ms_bind_msgr2` [default: true] controls whether a daemon binds
  to a port speaking the v2 protocol

Similarly, two options control whether IPv4 and IPv6 addresses are used:

* :confval:`ms_bind_ipv4` [default: true] controls whether a daemon binds
  to an IPv4 address
* :confval:`ms_bind_ipv6` [default: false] controls whether a daemon binds
  to an IPv6 address

.. note:: The ability to bind to multiple ports has paved the way for
   dual-stack IPv4 and IPv6 support. That said, dual-stack operation is
   not yet supported as of Quincy v17.2.0.

Connection modes
----------------

The v2 protocol supports two connection modes:

* *crc* mode provides:

  - a strong initial authentication when the connection is established
    (with cephx, mutual authentication of both parties with protection
    from a man-in-the-middle or eavesdropper), and
  - a crc32c integrity check to protect against bit flips due to flaky
    hardware or cosmic rays

  *crc* mode does *not* provide:

  - secrecy (an eavesdropper on the network can see all
    post-authentication traffic as it goes by) or
  - protection from a malicious man-in-the-middle (who can deliberately
    modify traffic as it goes by, as long as they are careful to
    adjust the crc32c values to match)

* *secure* mode provides:

  - a strong initial authentication when the connection is established
    (with cephx, mutual authentication of both parties with protection
    from a man-in-the-middle or eavesdropper), and
  - full encryption of all post-authentication traffic, including a
    cryptographic integrity check.

  In Nautilus, secure mode uses the `AES-GCM
  <https://en.wikipedia.org/wiki/Galois/Counter_Mode>`_ stream cipher,
  which is generally very fast on modern processors (e.g., faster than
  a SHA-256 cryptographic hash).
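
The difference between an unkeyed checksum and a cryptographic integrity
check can be shown with a toy example. This is a sketch under stated
assumptions, not the msgr2 frame format: the ``frame`` and ``verify``
helpers and the "payload plus 4-byte checksum" layout are hypothetical, and
``zlib.crc32`` (plain CRC-32) stands in for crc32c, which the Python
standard library does not provide.

.. code-block:: python

   import zlib


   def frame(payload: bytes) -> bytes:
       """Append a 4-byte checksum to the payload (toy stand-in for crc mode)."""
       return payload + zlib.crc32(payload).to_bytes(4, "little")


   def verify(data: bytes) -> bool:
       """Check that the trailing 4 bytes match the checksum of the rest."""
       payload, crc = data[:-4], data[-4:]
       return zlib.crc32(payload).to_bytes(4, "little") == crc


   original = frame(b"write 10 objects")

   # An accidental single-bit flip is detected by the checksum.
   flipped = bytes([original[0] ^ 0x01]) + original[1:]

   # A deliberate attacker changes the payload and recomputes the checksum,
   # which needs no secret, so the forged frame still verifies.
   forged = frame(b"write 99 objects")

   results = (verify(original), verify(flipped), verify(forged))

The forged frame passes because anyone can compute the checksum. Secure
mode closes this gap by tying the integrity check to a key that only the
two authenticated endpoints hold.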

Connection mode configuration options
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For most connections, there are options that control which modes are used:

.. confval:: ms_cluster_mode
.. confval:: ms_service_mode
.. confval:: ms_client_mode

There is a parallel set of options that apply specifically to
monitors, allowing administrators to set different (usually more
secure) requirements on communication with the monitors.

.. confval:: ms_mon_cluster_mode
.. confval:: ms_mon_service_mode
.. confval:: ms_mon_client_mode


Compression modes
-----------------

The v2 protocol supports two compression modes:

* *force* mode compresses messages on the wire. This helps in the
  following situations:

  - In deployments that span multiple availability zones, compressing
    replication messages between OSDs reduces latency.
  - In the public cloud, inter-AZ communication is expensive, so
    minimizing message size reduces the network costs charged by the
    cloud provider.
  - When using instance storage on AWS (and probably on other public
    clouds as well), instances with NVMe devices provide low network
    bandwidth relative to the device bandwidth. In this case, network
    compression can improve overall performance, because the network is
    clearly the bottleneck.

* In *none* mode, messages are transmitted without compression.


Compression mode configuration options
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For all connections, there is an option that controls compression usage in
secure mode:

.. confval:: ms_compress_secure

There is also a set of options that apply specifically to OSDs,
allowing administrators to set different requirements on communication
between OSDs.

.. confval:: ms_osd_compress_mode
.. confval:: ms_osd_compress_min_size
.. confval:: ms_osd_compression_algorithm
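
How a compression mode and a minimum size might interact can be sketched as
follows. This is an illustration only and not Ceph's implementation: the
``maybe_compress`` function is hypothetical, it assumes that messages
smaller than the minimum size are sent uncompressed, and ``zlib`` stands in
for whichever algorithm is configured.

.. code-block:: python

   import zlib
   from typing import Tuple


   def maybe_compress(payload: bytes, mode: str, min_size: int) -> Tuple[bool, bytes]:
       """Return (was_compressed, data) for one outgoing message."""
       # Assumption: only "force" mode compresses, and only messages that
       # are at least min_size bytes long are worth compressing.
       if mode == "force" and len(payload) >= min_size:
           return True, zlib.compress(payload)
       return False, payload


   large = b"x" * 4096
   small = b"x" * 100

   compressed_flag, compressed_data = maybe_compress(large, "force", 1024)
   small_flag, small_data = maybe_compress(small, "force", 1024)
   none_flag, none_data = maybe_compress(large, "none", 1024)

A size threshold of this kind avoids spending CPU time on small messages,
where compression overhead can outweigh the bytes saved.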

Transitioning from v1-only to v2-plus-v1
----------------------------------------

By default, ``ms_bind_msgr2`` is true starting with Nautilus 14.2.z.
However, until the monitors start using v2, only limited services will
start advertising v2 addresses.

For most users, the monitors are binding to the default legacy port ``6789``
for the v1 protocol. When this is the case, enabling v2 is as simple as:

.. prompt:: bash $

   ceph mon enable-msgr2

If the monitors are bound to non-standard ports, you will need to
specify an additional port for v2 explicitly. For example, if your
monitor ``mon.a`` binds to ``1.2.3.4:1111``, and you want to add v2 on
port ``1112``:

.. prompt:: bash $

   ceph mon set-addrs a [v2:1.2.3.4:1112,v1:1.2.3.4:1111]

Once the monitors bind to v2, each daemon will start advertising a v2
address when it is next restarted.


.. _msgr2_ceph_conf:

Updating ceph.conf and mon_host
-------------------------------

A CLI user or daemon normally discovers the monitors via the
``mon_host`` option in ``/etc/ceph/ceph.conf``. The syntax for this
option has expanded starting with Nautilus to support the new bracketed
list format. For example, an old line like::

  mon_host = 10.0.0.1:6789,10.0.0.2:6789,10.0.0.3:6789

can be changed to::

  mon_host = [v2:10.0.0.1:3300/0,v1:10.0.0.1:6789/0],[v2:10.0.0.2:3300/0,v1:10.0.0.2:6789/0],[v2:10.0.0.3:3300/0,v1:10.0.0.3:6789/0]

However, when default ports are used (``3300`` and ``6789``), they can
be omitted::

  mon_host = 10.0.0.1,10.0.0.2,10.0.0.3
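
The relationship between the short form and the fully bracketed form can be
made explicit with a small helper. This is a sketch for illustration, not
something Ceph ships: ``expand_mon_host`` is a hypothetical function that
assumes every entry is a bare IPv4 address with no port and fills in the
default ports.

.. code-block:: python

   def expand_mon_host(mon_host: str, v2_port: int = 3300, v1_port: int = 6789) -> str:
       """Expand bare monitor IPs into explicit [v2:...,v1:...] address vectors."""
       vectors = []
       for host in mon_host.split(","):
           host = host.strip()
           vectors.append(f"[v2:{host}:{v2_port}/0,v1:{host}:{v1_port}/0]")
       return ",".join(vectors)


   expanded = expand_mon_host("10.0.0.1,10.0.0.2,10.0.0.3")

For the three monitors above, ``expanded`` matches the bracketed ``mon_host``
line shown earlier in this section.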

Once v2 has been enabled on the monitors, ``ceph.conf`` may need to be
updated to either specify no ports (this is usually simplest), or
explicitly specify both the v2 and v1 addresses. Note, however, that
the new bracketed syntax is only understood by Nautilus and later, so
do not make that change on hosts that have not yet had their Ceph
packages upgraded.

When you are updating ``ceph.conf``, two new commands may be helpful:
``ceph config generate-minimal-conf`` (which generates a barebones config
file with just enough information to reach the monitors) and
``ceph config assimilate-conf`` (which moves config file options into
the monitors' configuration database). For example::

  # ceph config assimilate-conf < /etc/ceph/ceph.conf
  # ceph config generate-minimal-conf > /etc/ceph/ceph.conf.new
  # cat /etc/ceph/ceph.conf.new
  # minimal ceph.conf for 0e5a806b-0ce5-4bc6-b949-aa6f68f5c2a3
  [global]
          fsid = 0e5a806b-0ce5-4bc6-b949-aa6f68f5c2a3
          mon_host = [v2:10.0.0.1:3300/0,v1:10.0.0.1:6789/0]
  # mv /etc/ceph/ceph.conf.new /etc/ceph/ceph.conf

Protocol
--------

For a detailed description of the v2 wire protocol, see :ref:`msgr2-protocol`.