==================
Cephadm Operations
==================

Watching cephadm log messages
=============================

Cephadm logs to the ``cephadm`` cluster log channel, meaning you can
monitor progress in real time with::

  # ceph -W cephadm

By default it will show info-level events and above. To also see
debug-level messages::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level debug
  # ceph -W cephadm --watch-debug

Be careful: the debug messages are very verbose!
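
Once you are done debugging, you can return the cluster-log level to the
quieter ``info`` level using the same option (a small usage example)::

  # ceph config set mgr mgr/cephadm/log_to_cluster_level info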

You can see recent events with::

  # ceph log last cephadm

These events are also logged to the ``ceph.cephadm.log`` file on
monitor hosts and to the monitor daemons' stderr.


Ceph daemon logs
================

Logging to stdout
-----------------

Traditionally, Ceph daemons have logged to ``/var/log/ceph``. By
default, cephadm daemons log to stderr and the logs are
captured by the container runtime environment. For most systems, by
default, these logs are sent to journald and accessible via
``journalctl``.

For example, to view the logs for the daemon ``mon.foo`` for a cluster
with ID ``5c5a50ae-272a-455d-99e9-32c6a013e694``, the command would be
something like::

  journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo

This works well for normal operations when logging levels are low.
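
If you do not know your cluster's ID, you can look it up with ``ceph fsid``
and then follow a daemon's log live with journalctl's ``-f`` flag (a minimal
sketch; substitute your own fsid and daemon name)::

  # ceph fsid
  # journalctl -u ceph-<cluster-fsid>@mon.foo -f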

To disable logging to stderr::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

Logging to files
----------------

You can also configure Ceph daemons to log to files instead of stderr,
just as they have in the past. When logging to files, Ceph logs appear
in ``/var/log/ceph/<cluster-fsid>``.

To enable logging to files::

  ceph config set global log_to_file true
  ceph config set global mon_cluster_log_to_file true

We recommend disabling logging to stderr (see above), or else everything
will be logged twice::

  ceph config set global log_to_stderr false
  ceph config set global mon_cluster_log_to_stderr false

By default, cephadm sets up log rotation on each host to rotate these
files. You can configure the logging retention schedule by modifying
``/etc/logrotate.d/ceph.<cluster-fsid>``.
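
After editing the schedule, you can preview what a rotation run would do
without actually rotating anything by using logrotate's debug mode (a usage
sketch; substitute your cluster fsid)::

  logrotate --debug /etc/logrotate.d/ceph.<cluster-fsid>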


Data location
=============

Cephadm stores daemon data and logs in slightly different locations than
older versions of Ceph:

* ``/var/log/ceph/<cluster-fsid>`` contains all cluster logs. Note
  that by default cephadm logs via stderr and the container runtime,
  so these logs are normally not present.
* ``/var/lib/ceph/<cluster-fsid>`` contains all cluster daemon data
  (besides logs).
* ``/var/lib/ceph/<cluster-fsid>/<daemon-name>`` contains all data for
  an individual daemon.
* ``/var/lib/ceph/<cluster-fsid>/crash`` contains crash reports for
  the cluster.
* ``/var/lib/ceph/<cluster-fsid>/removed`` contains old daemon
  data directories for stateful daemons (e.g., monitor, prometheus)
  that have been removed by cephadm.
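
For example, to see which daemons have data directories on a given host (a
trivial illustration; the path depends on your cluster fsid)::

  ls /var/lib/ceph/<cluster-fsid>/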

Disk usage
----------

Because a few Ceph daemons may store a significant amount of data in
``/var/lib/ceph`` (notably, the monitors and prometheus), we recommend
moving this directory to its own disk, partition, or logical volume so
that it does not fill up the root file system.
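
To see how much space each daemon's data directory is consuming, standard
tools suffice (a quick sketch)::

  du -sh /var/lib/ceph/<cluster-fsid>/*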


SSH Configuration
=================

Cephadm uses SSH to connect to remote hosts. SSH uses a key to authenticate
with those hosts in a secure way.


Default behavior
----------------

Cephadm stores an SSH key in the monitor that is used to
connect to remote hosts. When the cluster is bootstrapped, this SSH
key is generated automatically and no additional configuration
is necessary.

A *new* SSH key can be generated with::

  ceph cephadm generate-key

The public portion of the SSH key can be retrieved with::

  ceph cephadm get-pub-key
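
For example, to install the cluster's public key on a new host so that
cephadm can log in as root there (a sketch using standard OpenSSH tooling)::

  ceph cephadm get-pub-key > ~/ceph.pub
  ssh-copy-id -f -i ~/ceph.pub root@<new-host>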

The currently stored SSH key can be deleted with::

  ceph cephadm clear-key

You can make use of an existing key by directly importing it with::

  ceph config-key set mgr/cephadm/ssh_identity_key -i <key>
  ceph config-key set mgr/cephadm/ssh_identity_pub -i <pub>

You will then need to restart the mgr daemon to reload the configuration with::

  ceph mgr fail
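
Putting those steps together, importing a key pair that you generate
yourself might look like this (a hedged sketch; the ``ed25519`` key type and
the ``ceph_ssh_key`` file name are arbitrary choices, not defaults)::

  ssh-keygen -t ed25519 -N "" -f ceph_ssh_key
  ceph config-key set mgr/cephadm/ssh_identity_key -i ceph_ssh_key
  ceph config-key set mgr/cephadm/ssh_identity_pub -i ceph_ssh_key.pub
  ceph mgr fail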


Customizing the SSH configuration
---------------------------------

Cephadm generates an appropriate ``ssh_config`` file that is
used for connecting to remote hosts. This configuration looks
something like this::

  Host *
  User root
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null

There are two ways to customize this configuration for your environment:

#. Import a customized configuration file that will be stored
   by the monitor with::

     ceph cephadm set-ssh-config -i <ssh_config_file>

   To remove a customized SSH config and revert back to the default behavior::

     ceph cephadm clear-ssh-config

   A sample customized file is shown after this list.

#. You can configure a file location for the SSH configuration file with::

     ceph config set mgr mgr/cephadm/ssh_config_file <path>

   We do *not recommend* this approach. The path name must be
   visible to *any* mgr daemon, and cephadm runs all daemons as
   containers. That means that the file either needs to be placed
   inside a customized container image for your deployment, or
   manually distributed to the mgr data directory
   (``/var/lib/ceph/<cluster-fsid>/mgr.<id>`` on the host, visible at
   ``/var/lib/ceph/mgr/ceph-<id>`` from inside the container).
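
As an example of the first approach, a customized file might enable strict
host key checking against a ``known_hosts`` file that you distribute
yourself (a hypothetical example; the ``known_hosts`` path is an assumption,
not a cephadm default, and must be visible from inside the mgr container)::

  Host *
  User root
  StrictHostKeyChecking yes
  UserKnownHostsFile /var/lib/ceph/known_hosts

Import it with ``ceph cephadm set-ssh-config -i <ssh_config_file>`` as
described above.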


Health checks
=============

CEPHADM_PAUSED
--------------

Cephadm background work has been paused with ``ceph orch pause``. Cephadm
continues to perform passive monitoring activities (like checking
host and daemon status), but it will not make any changes (like deploying
or removing daemons).

Resume cephadm work with::

  ceph orch resume

CEPHADM_STRAY_HOST
------------------

One or more hosts have running Ceph daemons but are not registered as
hosts managed by *cephadm*. This means that those services cannot
currently be managed by cephadm (e.g., restarted, upgraded, or included
in `ceph orch ps`).

You can manage the host(s) with::

  ceph orch host add *<hostname>*

Note that you may need to configure SSH access to the remote host
before this will work.
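
One way to do that is to install the cluster's public SSH key in the root
user's ``authorized_keys`` file on that host (a sketch; see the SSH
configuration section above)::

  ceph cephadm get-pub-key | ssh root@<hostname> 'cat >> ~/.ssh/authorized_keys'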

Alternatively, you can manually connect to the host and ensure that
services on that host are removed or migrated to a host that is
managed by *cephadm*.

You can also disable this warning entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_hosts false

CEPHADM_STRAY_DAEMON
--------------------

One or more Ceph daemons are running but are not managed by
*cephadm*. This may be because they were deployed using a different
tool, or because they were started manually. Those
services cannot currently be managed by cephadm (e.g., restarted,
upgraded, or included in `ceph orch ps`).

If the daemon is a stateful one (monitor or OSD), it should be adopted
by cephadm; see :ref:`cephadm-adoption`. For stateless daemons, it is
usually easiest to provision a new daemon with the ``ceph orch apply``
command and then stop the unmanaged daemon.
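
For example, replacing an unmanaged MDS might look like this (a hedged
illustration; the file system name and the legacy systemd unit name are
placeholders that will differ on your system)::

  ceph orch apply mds <fs-name>
  # then, on the host running the unmanaged daemon:
  systemctl stop ceph-mds@<id>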

This warning can be disabled entirely with::

  ceph config set mgr mgr/cephadm/warn_on_stray_daemons false

CEPHADM_HOST_CHECK_FAILED
-------------------------

One or more hosts have failed the basic cephadm host check, which verifies
that (1) the host is reachable and cephadm can be executed there, and (2)
that the host satisfies basic prerequisites, like a working container
runtime (podman or docker) and working time synchronization.
If this test fails, cephadm will not be able to manage services on that host.

You can manually run this check with::

  ceph cephadm check-host *<hostname>*

You can remove a broken host from management with::

  ceph orch host rm *<hostname>*

You can disable this health warning with::

  ceph config set mgr mgr/cephadm/warn_on_failed_host_check false