Development
===========


There are multiple ways to set up a development environment for the SSH orchestrator.
In the following I'll use the `vstart` method.

1) Make sure remoto is installed (0.35 or newer)

2) Use vstart to spin up a cluster

::

   # ../src/vstart.sh -n --cephadm


*Note that when you specify `--cephadm` you need passwordless ssh access to localhost.*
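
If passwordless ssh to localhost isn't set up yet, something along these lines usually does it
(a minimal sketch; it assumes a default RSA key and a running local sshd):

::

   # ssh-keygen -t rsa            # only if ~/.ssh/id_rsa does not exist yet
   # ssh-copy-id localhost
   # ssh localhost true           # should succeed without a password prompt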

It will add your ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub to `mgr/ssh/ssh_identity_{key, pub}`
and add your $HOSTNAME to the list of known hosts.

This will also enable the cephadm mgr module and set it as the orchestrator backend.


*Optional:*

While the above is sufficient for most operations, you may want to add a second host to the mix.
There is a `Vagrantfile` for creating a minimal cluster in `src/pybind/mgr/cephadm/`.

If you wish to extend the one-node localhost cluster, e.g. to test more sophisticated OSD
deployments, you can follow the next steps.

The following steps are run from within the `src/pybind/mgr/cephadm` directory.


1) Spawn VMs

::

   # vagrant up


This will spawn three machines by default: mon0, mgr0 and osd0, with 2 additional disks.

You can change that by passing the `MONS` (default: 1), `MGRS` (default: 1), `OSDS` (default: 1) and
`DISKS` (default: 2) environment variables to override the defaults. To avoid having to set the
environment variables every time, you can also put the configuration in a JSON file; see
`./vagrant.config.example.json` for details.

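
For example, to spin up two OSD VMs with four disks each, something like this should do
(a sketch using the environment variables listed above):

::

   # OSDS=2 DISKS=4 vagrant up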

It will also come with the necessary packages preinstalled, as well as your ~/.ssh/id_rsa.pub key
injected (for the users root and vagrant; the cephadm orchestrator currently connects as root).


2) Update the ssh-config

The cephadm orchestrator needs to understand how to connect to the new node. Most likely the VM
isn't reachable with the default settings used:

::

   Host *
   User root
   StrictHostKeyChecking no


You want to adjust this by retrieving an adapted ssh_config from Vagrant.

::

   # vagrant ssh-config > ssh-config

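
The generated file contains one ``Host`` stanza per VM, roughly along these lines (host name,
address, port and key path here are purely illustrative and depend on your Vagrant provider):

::

   Host osd0
     HostName 192.168.121.10
     User vagrant
     Port 22
     IdentityFile /path/to/.vagrant/machines/osd0/libvirt/private_key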

Now set the newly created config for Ceph.

::

   # ceph cephadm set-ssh-config -i <path_to_ssh_conf>


3) Add the new host

Add the newly created host(s) to the inventory.

::

   # ceph orch host add <host>

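
For the Vagrant setup above that could look like this (using the default VM names):

::

   # ceph orch host add osd0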

4) Verify the inventory

You should see the hostname in the list.

::

   # ceph orch host ls


5) Verify the devices

To verify that all disks are set up and in good shape, check whether all devices have been spawned
and can be found:

::

   # ceph orch device ls


6) Make a snapshot of all your VMs!

To avoid having to go through the whole setup again next time, snapshot your VMs so you can revert
them once they get dirty.

In `this repository <https://github.com/Devp00l/vagrant-helper-scripts>`_ you can find two
scripts that help you snapshot and revert, without having to manually snapshot and revert each
VM individually.
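
Alternatively, Vagrant's built-in ``vagrant snapshot`` subcommand can be used for the same purpose
(provider support permitting); the snapshot name below is just an example:

::

   # vagrant snapshot save fresh-cluster
   # vagrant snapshot restore fresh-cluster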


Understanding ``AsyncCompletion``
=================================

How can I store temporary variables?
------------------------------------

Let's imagine you want to write code similar to

.. code:: python

    hosts = self.get_hosts()
    inventory = self.get_inventory(hosts)
    return self._create_osd(hosts, drive_group, inventory)

That won't work, as ``get_hosts`` and ``get_inventory`` return objects
of type ``AsyncCompletion``.

Now let's imagine a Python 3 world, where we can use ``async`` and
``await``. Then we could actually write this like so:

.. code:: python

    hosts = await self.get_hosts()
    inventory = await self.get_inventory(hosts)
    return self._create_osd(hosts, drive_group, inventory)

Let's use a simple example to make this clear:

.. code:: python

    val = await func_1()
    return func_2(val)

As we're not yet in Python 3, we need to write ``await`` manually by
calling ``orchestrator.Completion.then()``:

.. code:: python

    func_1().then(lambda val: func_2(val))

    # or
    func_1().then(func_2)

Now let's desugar the original example:

.. code:: python

    hosts = await self.get_hosts()
    inventory = await self.get_inventory(hosts)
    return self._create_osd(hosts, drive_group, inventory)

Now let's replace one ``await`` at a time:

.. code:: python

    hosts = await self.get_hosts()
    return self.get_inventory(hosts).then(lambda inventory:
        self._create_osd(hosts, drive_group, inventory))

Then finally:

.. code:: python

    self.get_hosts().then(lambda hosts:
        self.get_inventory(hosts).then(lambda inventory:
            self._create_osd(hosts, drive_group, inventory)))

This also works without lambdas:

.. code:: python

    def call_inventory(hosts):
        def call_create(inventory):
            return self._create_osd(hosts, drive_group, inventory)

        return self.get_inventory(hosts).then(call_create)

    self.get_hosts().then(call_inventory)

We should add support for ``await`` as soon as we're on Python 3.

I want to call my function for every host!
------------------------------------------

Imagine you have a function that looks like so:

.. code:: python

    @async_completion
    def deploy_stuff(name, node):
        ...

And you want to call ``deploy_stuff`` like so:

.. code:: python

    return [deploy_stuff(name, node) for node in nodes]

This won't work as expected: it would create one ``AsyncCompletion`` per node, while the number
of ``AsyncCompletion`` objects created should be ``O(1)``. But there is a solution:
``@async_map_completion``

.. code:: python

    @async_map_completion
    def deploy_stuff(name, node):
        ...

    return deploy_stuff([(name, node) for node in nodes])

This way, we're only creating one ``AsyncCompletion`` object. Note that
you should not create a new ``AsyncCompletion`` within ``deploy_stuff``, as
we would then no longer have ``O(1)`` completions:

.. code:: python

    @async_completion
    def other_async_function():
        ...

    @async_map_completion
    def deploy_stuff(name, node):
        return other_async_function()  # wrong!

Why do we need this?
--------------------

I've looked into making Completions composable, i.e. being able to
call one completion from another completion, making them re-usable
like Promises. E.g.:

.. code:: python

    >>> return self.get_hosts().then(self._create_osd)

where ``get_hosts`` returns a Completion of a list of hosts and
``_create_osd`` takes a list of hosts.

The concept behind this is to store the computation steps explicitly and
to evaluate the chain only when it is finalized:

.. code:: python

    p = Completion(on_complete=lambda x: x*2).then(on_complete=lambda x: str(x))
    p.finalize(2)
    assert p.result == "4"

or graphically:

::

   +---------------+          +-----------------+
   |               |   then   |                 |
   | lambda x: x*2 |   +-->   | lambda x: str(x)|
   |               |          |                 |
   +---------------+          +-----------------+
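
To make that evaluation model concrete, here is a minimal, self-contained toy sketch of the idea.
It is not the actual ``orchestrator.Completion`` implementation; the class name and details are
simplified for illustration:

.. code:: python

    class ToyCompletion:
        """Stores a chain of callbacks and only evaluates it on finalize()."""

        def __init__(self, on_complete):
            self._steps = [on_complete]
            self.result = None

        def then(self, on_complete):
            # Remember the next step instead of running it immediately.
            self._steps.append(on_complete)
            return self

        def finalize(self, value):
            # Explicitly evaluate the stored chain, feeding each result forward.
            for step in self._steps:
                value = step(value)
            self.result = value


    p = ToyCompletion(on_complete=lambda x: x * 2).then(lambda x: str(x))
    p.finalize(2)
    assert p.result == "4"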