.. _telemetry:

Telemetry Module
================

The telemetry module sends anonymous data about the cluster back to the Ceph
developers to help understand how Ceph is used and what problems users may
be experiencing.

This data is visualized on `public dashboards <https://telemetry-public.ceph.com/>`_
that allow the community to quickly see summary statistics on how many clusters
are reporting, their total capacity and OSD count, and version distribution
trends.

Channels
--------

The telemetry report is broken down into several "channels", each with
a different type of information. Assuming telemetry has been enabled,
individual channels can be turned on and off. (If telemetry is off,
the per-channel setting has no effect.)

* **basic** (default: on): Basic information about the cluster

  - capacity of the cluster
  - number of monitors, managers, OSDs, MDSs, object gateways, or other daemons
  - software version currently being used
  - number and types of RADOS pools and CephFS file systems
  - names of configuration options that have been changed from their
    default (but *not* their values)

* **crash** (default: on): Information about daemon crashes, including

  - type of daemon
  - version of the daemon
  - operating system (OS distribution, kernel version)
  - stack trace identifying where in the Ceph code the crash occurred

* **device** (default: on): Information about device metrics, including

  - anonymized SMART metrics

* **ident** (default: off): User-provided identifying information about
  the cluster

  - cluster description
  - contact email address

* **perf** (default: off): Various performance metrics of a cluster, which can be used to

  - reveal overall cluster health
  - identify workload patterns
  - troubleshoot issues with latency, throttling, memory management, etc.
  - monitor cluster performance by daemon

The data being reported does *not* contain any sensitive
data like pool names, object names, object contents, hostnames, or device
serial numbers.

It contains counters and statistics on how the cluster has been
deployed, the version of Ceph, the distribution of the hosts and other
parameters which help the project to gain a better understanding of
the way Ceph is used.

Data is sent securely to *https://telemetry.ceph.com*.

Individual channels can be enabled or disabled with::

  ceph telemetry enable channel basic
  ceph telemetry enable channel crash
  ceph telemetry enable channel device
  ceph telemetry enable channel ident
  ceph telemetry enable channel perf

  ceph telemetry disable channel basic
  ceph telemetry disable channel crash
  ceph telemetry disable channel device
  ceph telemetry disable channel ident
  ceph telemetry disable channel perf

Multiple channels can be enabled or disabled with::

  ceph telemetry enable channel basic crash device ident perf
  ceph telemetry disable channel basic crash device ident perf

Channels can be enabled or disabled all at once with::

  ceph telemetry enable channel all
  ceph telemetry disable channel all

Please note that telemetry must be on for these commands to take effect.

List all channels with::

  ceph telemetry channel ls

  NAME      ENABLED    DEFAULT    DESC
  basic     ON         ON         Share basic cluster information (size, version)
  crash     ON         ON         Share metadata about Ceph daemon crashes (version, stack straces, etc)
  device    ON         ON         Share device health metrics (e.g., SMART data, minus potentially identifying info like serial numbers)
  ident     OFF        OFF        Share a user-provided description and/or contact email for the cluster
  perf      ON         OFF        Share various performance metrics of a cluster

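The channel listing can also be post-processed by a script, for example to
find channels whose current state differs from their default. The following
is a minimal sketch, not part of Ceph: the function name 'parse_channel_ls'
is hypothetical, and it assumes the output is a header row followed by
whitespace-separated NAME, ENABLED, DEFAULT and DESC columns, as in the
listing above.

```python
def parse_channel_ls(text):
    """Parse 'ceph telemetry channel ls' style output into a dict keyed by channel."""
    channels = {}
    lines = [line for line in text.splitlines() if line.strip()]
    for line in lines[1:]:  # skip the NAME/ENABLED/DEFAULT/DESC header row
        # The description contains spaces, so split at most three times.
        name, enabled, default, desc = line.split(None, 3)
        channels[name] = {
            "enabled": enabled == "ON",
            "default": default == "ON",
            "desc": desc,
        }
    return channels


# Sample text copied from the listing above (assumed layout, shortened).
SAMPLE = """\
NAME      ENABLED    DEFAULT    DESC
basic     ON         ON         Share basic cluster information (size, version)
ident     OFF        OFF        Share a user-provided description and/or contact email for the cluster
perf      ON         OFF        Share various performance metrics of a cluster
"""

channels = parse_channel_ls(SAMPLE)
# Channels whose current state differs from their default.
changed = sorted(name for name, c in channels.items() if c["enabled"] != c["default"])
print(changed)
```

In practice the text would come from running the command itself; the sample
string is used here only so that the sketch runs without a cluster.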
Enabling Telemetry
------------------

To allow the *telemetry* module to start sharing data::

  ceph telemetry on

Please note: Telemetry data is licensed under the Community Data License
Agreement - Sharing - Version 1.0 (https://cdla.io/sharing-1-0/). Hence, the
telemetry module can be enabled only after you add '--license sharing-1-0' to
the 'ceph telemetry on' command.

Once telemetry is on, please consider enabling channels which are off by
default, such as the 'perf' channel. The output of 'ceph telemetry on' lists
the exact command to enable these channels.

Telemetry can be disabled at any time with::

  ceph telemetry off

Sample report
-------------

You can look at what data is reported at any time with the command::

  ceph telemetry show

If telemetry is off, you can preview a sample report with::

  ceph telemetry preview

Generating a sample report might take a few moments in big clusters (clusters
with hundreds of OSDs or more).

To protect your privacy, device reports are generated separately, and data such
as hostname and device serial number are anonymized. The device telemetry is
sent to a different endpoint and does not associate the device data with a
particular cluster. To see a preview of the device report use the command::

  ceph telemetry show-device

If telemetry is off, you can preview a sample device report with::

  ceph telemetry preview-device

Please note: In order to generate the device report we use Smartmontools
version 7.0 and up, which supports JSON output.

If you have any concerns about privacy with regard to the information included in
this report, please contact the Ceph developers.

To view a single output of both reports when telemetry is on, use::

  ceph telemetry show-all

To view a single output of both reports when telemetry is off, use::

  ceph telemetry preview-all

**Sample report by channel**

When telemetry is on, you can see what data is reported by channel with::

  ceph telemetry show <channel_name>

Please note: If telemetry is on and <channel_name> is disabled, the command
above will output a sample report for that channel, according to the collections
the user is enrolled in. However, this data is not reported, since the channel
is disabled.

If telemetry is off, you can preview a sample report by channel with::

  ceph telemetry preview <channel_name>

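The choice between the 'show' and 'preview' families of commands in this
section depends only on whether telemetry is on. As a rough summary, the
mapping can be sketched as a small helper. The function 'report_command' and
its parameters are hypothetical and exist only for illustration; the command
strings it returns are the ones listed above.

```python
def report_command(telemetry_on, scope="cluster", channel=None):
    """Return the command from this section that displays the requested report.

    scope is one of "cluster", "device" or "all"; channel optionally names
    a single channel for the per-channel form.
    """
    # 'show' reports what is actually sent; 'preview' is used when telemetry is off.
    verb = "show" if telemetry_on else "preview"
    suffixes = {"cluster": "", "device": "-device", "all": "-all"}
    if scope not in suffixes:
        raise ValueError(f"unknown scope: {scope!r}")
    if channel is not None:
        if scope != "cluster":
            raise ValueError("a channel name only applies to the plain show/preview form")
        return f"ceph telemetry {verb} {channel}"
    return f"ceph telemetry {verb}{suffixes[scope]}"


print(report_command(True))
print(report_command(False, scope="device"))
print(report_command(False, channel="perf"))
```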
Collections
-----------

Collections represent different aspects of data that we collect within a channel.

List all collections with::

  ceph telemetry collection ls

  NAME                          STATUS                                             DESC
  basic_base                    NOT REPORTING: NOT OPTED-IN                        Basic information about the cluster (capacity, number and type of daemons, version, etc.)
  basic_mds_metadata            NOT REPORTING: NOT OPTED-IN                        MDS metadata
  basic_pool_options_bluestore  NOT REPORTING: NOT OPTED-IN                        Per-pool bluestore config options
  basic_pool_usage              NOT REPORTING: NOT OPTED-IN                        Default pool application and usage statistics
  basic_rook_v01                NOT REPORTING: NOT OPTED-IN                        Basic Rook deployment data
  basic_usage_by_class          NOT REPORTING: NOT OPTED-IN                        Default device class usage statistics
  crash_base                    NOT REPORTING: NOT OPTED-IN                        Information about daemon crashes (daemon type and version, backtrace, etc.)
  device_base                   NOT REPORTING: NOT OPTED-IN                        Information about device health metrics
  ident_base                    NOT REPORTING: NOT OPTED-IN, CHANNEL ident IS OFF  User-provided identifying information about the cluster
  perf_memory_metrics           NOT REPORTING: NOT OPTED-IN, CHANNEL perf IS OFF   Heap stats and mempools for mon and mds
  perf_perf                     NOT REPORTING: NOT OPTED-IN, CHANNEL perf IS OFF   Information about performance counters of the cluster

Where:

**NAME**: Collection name; prefix indicates the channel the collection belongs to.

**STATUS**: Indicates whether the collection metrics are reported; this is
determined by the status (enabled / disabled) of the channel the collection
belongs to, along with the enrollment status of the collection (whether the user
is opted-in to this collection).

**DESC**: General description of the collection.

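The NAME and STATUS rules above can be illustrated with a short sketch. It
is not the module's implementation: the helper names 'channel_of' and
'is_reporting' are hypothetical, and it assumes that the channel is the part
of the collection name before the first underscore, which holds for every
collection in the listing above.

```python
def channel_of(collection):
    """Return the channel a collection belongs to, taken from its name prefix."""
    # Assumption: channel names contain no underscore, so the prefix
    # before the first underscore is the channel.
    return collection.split("_", 1)[0]


def is_reporting(collection, enabled_channels, opted_in):
    """A collection is reported only if its channel is enabled AND the user opted in."""
    return channel_of(collection) in enabled_channels and collection in opted_in


enabled_channels = {"basic", "crash", "device"}      # 'ident' and 'perf' are off
opted_in = {"basic_base", "crash_base", "perf_perf"}  # collections the user enrolled in

for name in ("basic_base", "basic_pool_usage", "perf_perf"):
    print(name, channel_of(name), is_reporting(name, enabled_channels, opted_in))
```

Here 'basic_pool_usage' is not reported because the user has not opted in to
it, and 'perf_perf' is not reported because its channel is off, which matches
the two reasons that can appear in the STATUS column.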
See the diff between the collections you are enrolled in and the new,
available collections with::

  ceph telemetry diff

Enroll in the most recent collections with::

  ceph telemetry on

Then enable new channels that are off with::

  ceph telemetry enable channel <channel_name>

Interval
--------

The module compiles and sends a new report every 24 hours by default.
You can adjust this interval with::

  ceph config set mgr mgr/telemetry/interval 72    # report every three days

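The interval value is expressed in hours, as the default of 24 and the
three-day example of 72 above indicate. A minimal sketch of the conversion,
using a hypothetical helper 'interval_command' that only builds the command
string and does not run it:

```python
def interval_command(days):
    """Build the 'ceph config set' command for a reporting interval given in whole days."""
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    hours = days * 24  # the option takes hours: 3 days -> 72
    return f"ceph config set mgr mgr/telemetry/interval {hours}"


print(interval_command(3))
```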
Status
------

To see the current configuration::

  ceph telemetry status

Manually sending telemetry
--------------------------

To send telemetry data ad hoc::

  ceph telemetry send

In case telemetry is not enabled (with 'ceph telemetry on'), you need to add
'--license sharing-1-0' to the 'ceph telemetry send' command.

Sending telemetry through a proxy
---------------------------------

If the cluster cannot directly connect to the configured telemetry
endpoint (default *telemetry.ceph.com*), you can configure an HTTP/HTTPS
proxy server with::

  ceph config set mgr mgr/telemetry/proxy https://10.0.0.1:8080

You can also include a *user:pass* if needed::

  ceph config set mgr mgr/telemetry/proxy https://ceph:telemetry@10.0.0.1:8080

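When the proxy credentials contain characters such as '@' or ':', the URL
becomes ambiguous unless those characters are percent-encoded. The sketch
below builds such a URL with Python's standard library. The helper
'proxy_url' is hypothetical, and whether the telemetry module accepts
percent-encoded credentials is an assumption that should be verified against
your Ceph release.

```python
from urllib.parse import quote


def proxy_url(host, port, user=None, password=None, scheme="https"):
    """Build a proxy URL, percent-encoding the optional user and password."""
    auth = ""
    if user is not None:
        # safe="" makes quote() encode every reserved character, including '@' and ':'.
        auth = quote(user, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    return f"{scheme}://{auth}{host}:{port}"


# Reproduces the second example above.
print(proxy_url("10.0.0.1", 8080, "ceph", "telemetry"))
# A password containing reserved characters is encoded.
print(proxy_url("10.0.0.1", 8080, "ceph", "p@ss:word"))
```

The resulting string is what would be passed as the value of
'mgr/telemetry/proxy' in the commands above.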
Contact and Description
-----------------------

A contact and description can be added to the report. This is
completely optional, and disabled by default::

  ceph config set mgr mgr/telemetry/contact 'John Doe <john.doe@example.com>'
  ceph config set mgr mgr/telemetry/description 'My first Ceph cluster'
  ceph config set mgr mgr/telemetry/channel_ident true