============================================
Contributing to Ceph: A Guide for Developers
============================================
4
5:Author: Loic Dachary
6:Author: Nathan Cutler
7:License: Creative Commons Attribution-ShareAlike (CC BY-SA)
8
9.. note:: The old (pre-2016) developer documentation has been moved to :doc:`/dev/index-old`.
10
11.. contents::
12 :depth: 3
13
14Introduction
15============
16
17This guide has two aims. First, it should lower the barrier to entry for
18software developers who wish to get involved in the Ceph project. Second,
19it should serve as a reference for Ceph developers.
20
21We assume that readers are already familiar with Ceph (the distributed
22object store and file system designed to provide excellent performance,
23reliability and scalability). If not, please refer to the `project website`_
24and especially the `publications list`_.
25
26.. _`project website`: http://ceph.com
27.. _`publications list`: https://ceph.com/resources/publications/
28
29Since this document is to be consumed by developers, who are assumed to
30have Internet access, topics covered elsewhere, either within the Ceph
31documentation or elsewhere on the web, are treated by linking. If you
32notice that a link is broken or if you know of a better link, please
33`report it as a bug`_.
34
35.. _`report it as a bug`: http://tracker.ceph.com/projects/ceph/issues/new
36
37Essentials (tl;dr)
38==================
39
40This chapter presents essential information that every Ceph developer needs
41to know.
42
43Leads
44-----
45
46The Ceph project is led by Sage Weil. In addition, each major project
47component has its own lead. The following table shows all the leads and
48their nicks on `GitHub`_:
49
50.. _github: https://github.com/
51
52========= ================ =============
53Scope Lead GitHub nick
54========= ================ =============
55Ceph Sage Weil liewegas
56RADOS Samuel Just athanatos
57RGW Yehuda Sadeh yehudasa
58RBD Jason Dillaman dillaman
59CephFS Patrick Donnelly batrick
60Build/Ops Ken Dreyer ktdreyer
61========= ================ =============
62
63The Ceph-specific acronyms in the table are explained in
64:doc:`/architecture`.
65
66History
67-------
68
69See the `History chapter of the Wikipedia article`_.
70
71.. _`History chapter of the Wikipedia article`: https://en.wikipedia.org/wiki/Ceph_%28software%29#History
72
73Licensing
74---------
75
76Ceph is free software.
77
78Unless stated otherwise, the Ceph source code is distributed under the terms of
79the LGPL2.1. For full details, see `the file COPYING in the top-level
80directory of the source-code tree`_.
81
82.. _`the file COPYING in the top-level directory of the source-code tree`:
83 https://github.com/ceph/ceph/blob/master/COPYING
84
85Source code repositories
86------------------------
87
88The source code of Ceph lives on `GitHub`_ in a number of repositories below
89the `Ceph "organization"`_.
90
91.. _`Ceph "organization"`: https://github.com/ceph
92
93To make a meaningful contribution to the project as a developer, a working
94knowledge of git_ is essential.
95
96.. _git: https://git-scm.com/documentation
97
98Although the `Ceph "organization"`_ includes several software repositories,
99this document covers only one: https://github.com/ceph/ceph.
100
101Redmine issue tracker
102---------------------
103
104Although `GitHub`_ is used for code, Ceph-related issues (Bugs, Features,
105Backports, Documentation, etc.) are tracked at http://tracker.ceph.com,
106which is powered by `Redmine`_.
107
108.. _Redmine: http://www.redmine.org
109
110The tracker has a Ceph project with a number of subprojects loosely
111corresponding to the various architectural components (see
112:doc:`/architecture`).
113
114Mere `registration`_ in the tracker automatically grants permissions
115sufficient to open new issues and comment on existing ones.
116
117.. _registration: http://tracker.ceph.com/account/register
118
119To report a bug or propose a new feature, `jump to the Ceph project`_ and
120click on `New issue`_.
121
122.. _`jump to the Ceph project`: http://tracker.ceph.com/projects/ceph
123.. _`New issue`: http://tracker.ceph.com/projects/ceph/issues/new
124
125Mailing list
126------------
127
128Ceph development email discussions take place on the mailing list
129``ceph-devel@vger.kernel.org``. The list is open to all. Subscribe by
130sending a message to ``majordomo@vger.kernel.org`` with the line: ::
131
132 subscribe ceph-devel
133
134in the body of the message.
135
136There are also `other Ceph-related mailing lists`_.
137
.. _`other Ceph-related mailing lists`: https://ceph.com/irc/
139
140IRC
141---
142
143In addition to mailing lists, the Ceph community also communicates in real
144time using `Internet Relay Chat`_.
145
146.. _`Internet Relay Chat`: http://www.irchelp.org/
147
See https://ceph.com/irc/ for how to set up your IRC
client and a list of channels.
150
151Submitting patches
152------------------
153
The canonical instructions for submitting patches are contained in
`the file CONTRIBUTING.rst in the top-level directory of the source-code
tree`_. There may be some overlap between this guide and that file.
157
158.. _`the file CONTRIBUTING.rst in the top-level directory of the source-code tree`:
159 https://github.com/ceph/ceph/blob/master/CONTRIBUTING.rst
160
161All newcomers are encouraged to read that file carefully.
162
163Building from source
164--------------------
165
166See instructions at :doc:`/install/build-ceph`.
167
168Using ccache to speed up local builds
169-------------------------------------
170
Rebuilds of the ceph source tree can benefit significantly from use of `ccache`_.
Often, when switching between branches, one sees build failures on older
branches, mostly caused by stale build artifacts; rebuilding after cleaning
those artifacts is much faster with ccache. To get a fully clean source tree,
one can do::
176
177 $ make clean
178
 # note the following will nuke everything in the source tree that
 # isn't tracked by git, so make sure to back up any log or config files you care about
181
182 $ git clean -fdx; git submodule foreach git clean -fdx
183
ccache is available as a package in most distros. To build Ceph with ccache,
one can::
186
187 $ cmake -DWITH_CCACHE=ON ..
188
ccache can also be used to speed up all builds in the system. For more
details, refer to the `run modes`_ section of the ccache manual. The default
settings of ``ccache`` can be displayed with ``ccache -s``.
192
.. note:: It is recommended to override ``max_size`` (the size of the cache,
   which defaults to 10G) with a larger value, such as 25G. Refer to the
   `configuration`_ section of the ccache manual.
196
197.. _`ccache`: https://ccache.samba.org/
198.. _`run modes`: https://ccache.samba.org/manual.html#_run_modes
199.. _`configuration`: https://ccache.samba.org/manual.html#_configuration
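
For example, a minimal sketch of raising the cache size limit and checking the
result (assuming a recent ccache with the ``-M`` and ``-s`` options)::

    $ ccache -M 25G    # raise max_size to 25G
    $ ccache -s        # display statistics, including the configured cache size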
200
201Development-mode cluster
202------------------------
203
204See :doc:`/dev/quick_guide`.
205
206Backporting
207-----------
208
209All bugfixes should be merged to the ``master`` branch before being backported.
210To flag a bugfix for backporting, make sure it has a `tracker issue`_
211associated with it and set the ``Backport`` field to a comma-separated list of
212previous releases (e.g. "hammer,jewel") that you think need the backport.
213The rest (including the actual backporting) will be taken care of by the
214`Stable Releases and Backports`_ team.
215
216.. _`tracker issue`: http://tracker.ceph.com/
217.. _`Stable Releases and Backports`: http://tracker.ceph.com/projects/ceph-releases/wiki
218
219Guidance for use of cluster log
220-------------------------------
221
222If your patches emit messages to the Ceph cluster log, please consult
223this guidance: :doc:`/dev/logging`.
224
225
226What is merged where and when ?
227===============================
228
229Commits are merged into branches according to criteria that change
230during the lifecycle of a Ceph release. This chapter is the inventory
231of what can be merged in which branch at a given point in time.
232
233Development releases (i.e. x.0.z)
234---------------------------------
235
236What ?
237^^^^^^
238
239* features
240* bug fixes
241
242Where ?
243^^^^^^^
244
245Features are merged to the master branch. Bug fixes should be merged
246to the corresponding named branch (e.g. "jewel" for 10.0.z, "kraken"
247for 11.0.z, etc.). However, this is not mandatory - bug fixes can be
248merged to the master branch as well, since the master branch is
249periodically merged to the named branch during the development
250releases phase. In either case, if the bugfix is important it can also
251be flagged for backport to one or more previous stable releases.
252
253When ?
254^^^^^^
255
After the stable release candidates of the previous release enter
257phase 2 (see below). For example: the "jewel" named branch was
258created when the infernalis release candidates entered phase 2. From
259this point on, master was no longer associated with infernalis. As
260soon as the named branch of the next stable release is created, master
261starts getting periodically merged into it.
262
263Branch merges
264^^^^^^^^^^^^^
265
266* The branch of the stable release is merged periodically into master.
267* The master branch is merged periodically into the branch of the
268 stable release.
269* The master is merged into the branch of the stable release
270 immediately after each development x.0.z release.
271
272Stable release candidates (i.e. x.1.z) phase 1
273----------------------------------------------
274
275What ?
276^^^^^^
277
278* bug fixes only
279
280Where ?
281^^^^^^^
282
283The branch of the stable release (e.g. "jewel" for 10.0.z, "kraken"
284for 11.0.z, etc.) or master. Bug fixes should be merged to the named
285branch corresponding to the stable release candidate (e.g. "jewel" for
28610.1.z) or to master. During this phase, all commits to master will be
287merged to the named branch, and vice versa. In other words, it makes
288no difference whether a commit is merged to the named branch or to
289master - it will make it into the next release candidate either way.
290
291When ?
292^^^^^^
293
294After the first stable release candidate is published, i.e. after the
295x.1.0 tag is set in the release branch.
296
297Branch merges
298^^^^^^^^^^^^^
299
300* The branch of the stable release is merged periodically into master.
301* The master branch is merged periodically into the branch of the
302 stable release.
303* The master is merged into the branch of the stable release
304 immediately after each x.1.z release candidate.
305
306Stable release candidates (i.e. x.1.z) phase 2
307----------------------------------------------
308
309What ?
310^^^^^^
311
312* bug fixes only
313
314Where ?
315^^^^^^^
316
317The branch of the stable release (e.g. "jewel" for 10.0.z, "kraken"
318for 11.0.z, etc.). During this phase, all commits to the named branch
319will be merged into master. Cherry-picking to the named branch during
320release candidate phase 2 is done manually since the official
321backporting process only begins when the release is pronounced
322"stable".
323
324When ?
325^^^^^^
326
327After Sage Weil decides it is time for phase 2 to happen.
328
329Branch merges
330^^^^^^^^^^^^^
331
332* The branch of the stable release is merged periodically into master.
333
334Stable releases (i.e. x.2.z)
335----------------------------
336
337What ?
338^^^^^^
339
340* bug fixes
* features are sometimes accepted
342* commits should be cherry-picked from master when possible
343* commits that are not cherry-picked from master must be about a bug unique to the stable release
344* see also `the backport HOWTO`_
345
346.. _`the backport HOWTO`:
347 http://tracker.ceph.com/projects/ceph-releases/wiki/HOWTO#HOWTO
348
349Where ?
350^^^^^^^
351
352The branch of the stable release (hammer for 0.94.x, infernalis for 9.2.x, etc.)
353
354When ?
355^^^^^^
356
357After the stable release is published, i.e. after the "vx.2.0" tag is
358set in the release branch.
359
360Branch merges
361^^^^^^^^^^^^^
362
363Never
364
365Issue tracker
366=============
367
368See `Redmine issue tracker`_ for a brief introduction to the Ceph Issue Tracker.
369
370Ceph developers use the issue tracker to
371
1. keep track of issues - bugs, fix requests, feature requests, backport
   requests, etc.

2. communicate with other developers and keep them informed as work
   on the issues progresses.
377
378Issue tracker conventions
379-------------------------
380
381When you start working on an existing issue, it's nice to let the other
382developers know this - to avoid duplication of labor. Typically, this is
383done by changing the :code:`Assignee` field (to yourself) and changing the
384:code:`Status` to *In progress*. Newcomers to the Ceph community typically do not
385have sufficient privileges to update these fields, however: they can
386simply update the issue with a brief note.
387
388.. table:: Meanings of some commonly used statuses
389
390 ================ ===========================================
391 Status Meaning
392 ================ ===========================================
393 New Initial status
394 In Progress Somebody is working on it
395 Need Review Pull request is open with a fix
396 Pending Backport Fix has been merged, backport(s) pending
397 Resolved Fix and backports (if any) have been merged
398 ================ ===========================================
399
400Basic workflow
401==============
402
403The following chart illustrates basic development workflow:
404
405.. ditaa::
406
407 Upstream Code Your Local Environment
408
409 /----------\ git clone /-------------\
410 | Ceph | -------------------------> | ceph/master |
411 \----------/ \-------------/
412 ^ |
413 | | git branch fix_1
414 | git merge |
415 | v
416 /----------------\ git commit --amend /-------------\
417 | make check |---------------------> | ceph/fix_1 |
418 | ceph--qa--suite| \-------------/
419 \----------------/ |
420 ^ | fix changes
421 | | test changes
422 | review | git commit
423 | |
424 | v
425 /--------------\ /-------------\
426 | github |<---------------------- | ceph/fix_1 |
427 | pull request | git push \-------------/
428 \--------------/
429
Below we present an explanation of this chart. The explanation is written
with the assumption that you, the reader, are a beginning developer who
has an idea for a bugfix but does not know exactly how to proceed.
433
434Update the tracker
435------------------
436
437Before you start, you should know the `Issue tracker`_ number of the bug
438you intend to fix. If there is no tracker issue, now is the time to create
439one.
440
441The tracker is there to explain the issue (bug) to your fellow Ceph
442developers and keep them informed as you make progress toward resolution.
443To this end, then, provide a descriptive title as well as sufficient
444information and details in the description.
445
446If you have sufficient tracker permissions, assign the bug to yourself by
447changing the ``Assignee`` field. If your tracker permissions have not yet
448been elevated, simply add a comment to the issue with a short message like
449"I am working on this issue".
450
451Upstream code
452-------------
453
454This section, and the ones that follow, correspond to the nodes in the
455above chart.
456
457The upstream code lives in https://github.com/ceph/ceph.git, which is
458sometimes referred to as the "upstream repo", or simply "upstream". As the
459chart illustrates, we will make a local copy of this code, modify it, test
460our modifications, and submit the modifications back to the upstream repo
461for review.
462
463A local copy of the upstream code is made by
464
4651. forking the upstream repo on GitHub, and
4662. cloning your fork to make a local working copy
467
See `the GitHub documentation
469<https://help.github.com/articles/fork-a-repo/#platform-linux>`_ for
470detailed instructions on forking. In short, if your GitHub username is
471"mygithubaccount", your fork of the upstream repo will show up at
472https://github.com/mygithubaccount/ceph. Once you have created your fork,
473you clone it by doing:
474
475.. code::
476
477 $ git clone https://github.com/mygithubaccount/ceph
478
479While it is possible to clone the upstream repo directly, in this case you
480must fork it first. Forking is what enables us to open a `GitHub pull
481request`_.
482
483For more information on using GitHub, refer to `GitHub Help
484<https://help.github.com/>`_.
485
486Local environment
487-----------------
488
489In the local environment created in the previous step, you now have a
490copy of the ``master`` branch in ``remotes/origin/master``. Since the fork
491(https://github.com/mygithubaccount/ceph.git) is frozen in time and the
492upstream repo (https://github.com/ceph/ceph.git, typically abbreviated to
493``ceph/ceph.git``) is updated frequently by other developers, you will need
494to sync your fork periodically. To do this, first add the upstream repo as
495a "remote" and fetch it::
496
497 $ git remote add ceph https://github.com/ceph/ceph.git
498 $ git fetch ceph
499
500Fetching downloads all objects (commits, branches) that were added since
501the last sync. After running these commands, all the branches from
502``ceph/ceph.git`` are downloaded to the local git repo as
503``remotes/ceph/$BRANCH_NAME`` and can be referenced as
504``ceph/$BRANCH_NAME`` in certain git commands.
505
506For example, your local ``master`` branch can be reset to the upstream Ceph
507``master`` branch by doing::
508
509 $ git fetch ceph
510 $ git checkout master
511 $ git reset --hard ceph/master
512
513Finally, the ``master`` branch of your fork can then be synced to upstream
514master by::
515
516 $ git push -u origin master
517
518Bugfix branch
519-------------
520
521Next, create a branch for the bugfix:
522
523.. code::
524
525 $ git checkout master
526 $ git checkout -b fix_1
527 $ git push -u origin fix_1
528
529This creates a ``fix_1`` branch locally and in our GitHub fork. At this
530point, the ``fix_1`` branch is identical to the ``master`` branch, but not
531for long! You are now ready to modify the code.
532
533Fix bug locally
534---------------
535
536At this point, change the status of the tracker issue to "In progress" to
537communicate to the other Ceph developers that you have begun working on a
538fix. If you don't have permission to change that field, your comment that
539you are working on the issue is sufficient.
540
541Possibly, your fix is very simple and requires only minimal testing.
542More likely, it will be an iterative process involving trial and error, not
543to mention skill. An explanation of how to fix bugs is beyond the
544scope of this document. Instead, we focus on the mechanics of the process
545in the context of the Ceph project.
546
For a detailed discussion of the tools available for validating your bugfixes,
see the `Testing`_ chapter.
549
550For now, let us just assume that you have finished work on the bugfix and
551that you have tested it and believe it works. Commit the changes to your local
552branch using the ``--signoff`` option::
553
554 $ git commit -as
555
556and push the changes to your fork::
557
558 $ git push origin fix_1
559
560GitHub pull request
561-------------------
562
563The next step is to open a GitHub pull request. The purpose of this step is
564to make your bugfix available to the community of Ceph developers. They
565will review it and may do additional testing on it.
566
567In short, this is the point where you "go public" with your modifications.
568Psychologically, you should be prepared to receive suggestions and
569constructive criticism. Don't worry! In our experience, the Ceph project is
570a friendly place!
571
572If you are uncertain how to use pull requests, you may read
573`this GitHub pull request tutorial`_.
574
575.. _`this GitHub pull request tutorial`:
576 https://help.github.com/articles/using-pull-requests/
577
578For some ideas on what constitutes a "good" pull request, see
579the `Git Commit Good Practice`_ article at the `OpenStack Project Wiki`_.
580
581.. _`Git Commit Good Practice`: https://wiki.openstack.org/wiki/GitCommitMessages
582.. _`OpenStack Project Wiki`: https://wiki.openstack.org/wiki/Main_Page
583
584Once your pull request (PR) is opened, update the `Issue tracker`_ by
585adding a comment to the bug pointing the other developers to your PR. The
586update can be as simple as::
587
588 *PR*: https://github.com/ceph/ceph/pull/$NUMBER_OF_YOUR_PULL_REQUEST
589
590Automated PR validation
591-----------------------
592
593When your PR hits GitHub, the Ceph project's `Continuous Integration (CI)
594<https://en.wikipedia.org/wiki/Continuous_integration>`_
595infrastructure will test it automatically. At the time of this writing
596(March 2016), the automated CI testing included a test to check that the
597commits in the PR are properly signed (see `Submitting patches`_) and a
`make check`_ test.

The latter, `make check`_, builds the PR and runs it through a battery of
601tests. These tests run on machines operated by the Ceph Continuous
602Integration (CI) team. When the tests complete, the result will be shown
603on GitHub in the pull request itself.
604
605You can (and should) also test your modifications before you open a PR.
606Refer to the `Testing`_ chapter for details.
607
608Notes on PR make check test
609^^^^^^^^^^^^^^^^^^^^^^^^^^^
610
611The GitHub `make check`_ test is driven by a Jenkins instance.
612
613Jenkins merges the PR branch into the latest version of the base branch before
614starting the build, so you don't have to rebase the PR to pick up any fixes.
615
616You can trigger the PR tests at any time by adding a comment to the PR - the
617comment should contain the string "test this please". Since a human subscribed
618to the PR might interpret that as a request for him or her to test the PR, it's
619good to write the request as "Jenkins, test this please".
620
621The `make check`_ log is the place to go if there is a failure and you're not
622sure what caused it. To reach it, first click on "details" (next to the `make
623check`_ test in the PR) to get into the Jenkins web GUI, and then click on
624"Console Output" (on the left).
625
626Jenkins is set up to grep the log for strings known to have been associated
627with `make check`_ failures in the past. However, there is no guarantee that
628the strings are associated with any given `make check`_ failure. You have to
629dig into the log to be sure.
630
631Integration tests AKA ceph-qa-suite
632-----------------------------------
633
634Since Ceph is a complex beast, it may also be necessary to test your fix to
635see how it behaves on real clusters running either on real or virtual
636hardware. Tests designed for this purpose live in the `ceph/qa
637sub-directory`_ and are run via the `teuthology framework`_.
638
639.. _`ceph/qa sub-directory`: https://github.com/ceph/ceph/tree/master/qa/
640.. _`teuthology repository`: https://github.com/ceph/teuthology
641.. _`teuthology framework`: https://github.com/ceph/teuthology
642
643If you have access to an OpenStack tenant, you are encouraged to run the
644integration tests yourself using `ceph-workbench ceph-qa-suite`_,
645and to post the test results to the PR.
646
647.. _`ceph-workbench ceph-qa-suite`: http://ceph-workbench.readthedocs.org/
648
649The Ceph community has access to the `Sepia lab
650<http://ceph.github.io/sepia/>`_ where integration tests can be run on
651real hardware. Other developers may add tags like "needs-qa" to your PR.
652This allows PRs that need testing to be merged into a single branch and
653tested all at the same time. Since teuthology suites can take hours
654(even days in some cases) to run, this can save a lot of time.
655
656Integration testing is discussed in more detail in the `Testing`_ chapter.
657
658Code review
659-----------
660
661Once your bugfix has been thoroughly tested, or even during this process,
662it will be subjected to code review by other developers. This typically
663takes the form of correspondence in the PR itself, but can be supplemented
664by discussions on `IRC`_ and the `Mailing list`_.
665
666Amending your PR
667----------------
668
669While your PR is going through `Testing`_ and `Code review`_, you can
670modify it at any time by editing files in your local branch.
671
672After the changes are committed locally (to the ``fix_1`` branch in our
673example), they need to be pushed to GitHub so they appear in the PR.
674
675Modifying the PR is done by adding commits to the ``fix_1`` branch upon
676which it is based, often followed by rebasing to modify the branch's git
677history. See `this tutorial
678<https://www.atlassian.com/git/tutorials/rewriting-history>`_ for a good
679introduction to rebasing. When you are done with your modifications, you
680will need to force push your branch with:
681
682.. code::
683
684 $ git push --force origin fix_1
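
For instance, a minimal sketch of squashing review fixups into earlier commits
before the force push, assuming the ``ceph`` remote was added as described in
`Local environment`_ and the PR is based on ``master``:

.. code::

    $ git fetch ceph
    $ git rebase -i ceph/master   # mark fixup commits as "squash" or "fixup"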
685
686Merge
687-----
688
689The bugfixing process culminates when one of the project leads decides to
690merge your PR.
691
692When this happens, it is a signal for you (or the lead who merged the PR)
693to change the `Issue tracker`_ status to "Resolved". Some issues may be
694flagged for backporting, in which case the status should be changed to
695"Pending Backport" (see the `Backporting`_ chapter for details).
696
697
698Testing
699=======
700
Ceph has two types of tests: `make check`_ tests and integration tests.
The former are run via `GNU Make <https://www.gnu.org/software/make/>`_,
and the latter are run via the `teuthology framework`_. The following two
chapters examine the `make check`_ and integration tests in detail.

.. _`make check`:
707
708Testing - make check
709====================
710
After compiling Ceph, the `make check`_ command can be used to run the
code through a battery of tests covering various aspects of Ceph. For
inclusion in `make check`_, a test must:
714
715* bind ports that do not conflict with other tests
716* not require root access
717* not require more than one machine to run
718* complete within a few minutes
719
While it is possible to run `make check`_ directly, it can be tricky to
correctly set up your environment. Fortunately, a script is provided to
make it easier to run `make check`_ on your code. It can be run from the
top-level directory of the Ceph source tree by doing::
724
725 $ ./run-make-check.sh
726
727You will need a minimum of 8GB of RAM and 32GB of free disk space for this
728command to complete successfully on x86_64 (other architectures may have
729different constraints). Depending on your hardware, it can take from 20
730minutes to three hours to complete, but it's worth the wait.
731
Caveats
-------

1. Unlike the various Ceph daemons and ``ceph-fuse``, the `make check`_ tests
   are linked against the default memory allocator (glibc) unless explicitly
   linked against something else. This enables tools like valgrind to be used
   in the tests.
739
740Testing - integration tests
741===========================
742
743When a test requires multiple machines, root access or lasts for a
744longer time (for example, to simulate a realistic Ceph deployment), it
745is deemed to be an integration test. Integration tests are organized into
746"suites", which are defined in the `ceph/qa sub-directory`_ and run with
747the ``teuthology-suite`` command.
748
749The ``teuthology-suite`` command is part of the `teuthology framework`_.
750In the sections that follow we attempt to provide a detailed introduction
751to that framework from the perspective of a beginning Ceph developer.
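
As a taste of what is to come, here is a minimal sketch of scheduling a suite
with ``teuthology-suite`` (the suite name, branch, machine type and email
address are placeholders; where this command can be run, and what the options
mean, is explained in the sections that follow)::

    $ teuthology-suite --suite smoke --ceph master \
          --machine-type openstack --email you@example.com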
752
753Teuthology consumes packages
754----------------------------
755
756It may take some time to understand the significance of this fact, but it
757is `very` significant. It means that automated tests can be conducted on
758multiple platforms using the same packages (RPM, DEB) that can be
759installed on any machine running those platforms.
760
761Teuthology has a `list of platforms that it supports
762<https://github.com/ceph/ceph/tree/master/qa/distros/supported>`_ (as
763of March 2016 the list consisted of "CentOS 7.2" and "Ubuntu 14.04"). It
764expects to be provided pre-built Ceph packages for these platforms.
765Teuthology deploys these platforms on machines (bare-metal or
766cloud-provisioned), installs the packages on them, and deploys Ceph
767clusters on them - all as called for by the test.
768
769The nightlies
770-------------
771
772A number of integration tests are run on a regular basis in the `Sepia
773lab`_ against the official Ceph repositories (on the ``master`` development
774branch and the stable branches). Traditionally, these tests are called "the
775nightlies" because the Ceph core developers used to live and work in
776the same time zone and from their perspective the tests were run overnight.
777
778The results of the nightlies are published at http://pulpito.ceph.com/ and
779http://pulpito.ovh.sepia.ceph.com:8081/. The developer nick shows in the
780test results URL and in the first column of the Pulpito dashboard. The
781results are also reported on the `ceph-qa mailing list
<https://ceph.com/irc/>`_ for analysis.
783
784Suites inventory
785----------------
786
787The ``suites`` directory of the `ceph/qa sub-directory`_ contains
788all the integration tests, for all the Ceph components.
789
790`ceph-deploy <https://github.com/ceph/ceph/tree/master/qa/suites/ceph-deploy>`_
791 install a Ceph cluster with ``ceph-deploy`` (`ceph-deploy man page`_)
792
793`ceph-disk <https://github.com/ceph/ceph/tree/master/qa/suites/ceph-disk>`_
794 verify init scripts (upstart etc.) and udev integration with
795 ``ceph-disk`` (`ceph-disk man page`_), with and without `dmcrypt
796 <https://gitlab.com/cryptsetup/cryptsetup/wikis/DMCrypt>`_ support.
797
798`dummy <https://github.com/ceph/ceph/tree/master/qa/suites/dummy>`_
799 get a machine, do nothing and return success (commonly used to
800 verify the integration testing infrastructure works as expected)
801
802`fs <https://github.com/ceph/ceph/tree/master/qa/suites/fs>`_
803 test CephFS
804
805`kcephfs <https://github.com/ceph/ceph/tree/master/qa/suites/kcephfs>`_
806 test the CephFS kernel module
807
808`krbd <https://github.com/ceph/ceph/tree/master/qa/suites/krbd>`_
809 test the RBD kernel module
810
811`powercycle <https://github.com/ceph/ceph/tree/master/qa/suites/powercycle>`_
812 verify the Ceph cluster behaves when machines are powered off
813 and on again
814
815`rados <https://github.com/ceph/ceph/tree/master/qa/suites/rados>`_
816 run Ceph clusters including OSDs and MONs, under various conditions of
817 stress
818
819`rbd <https://github.com/ceph/ceph/tree/master/qa/suites/rbd>`_
820 run RBD tests using actual Ceph clusters, with and without qemu
821
822`rgw <https://github.com/ceph/ceph/tree/master/qa/suites/rgw>`_
823 run RGW tests using actual Ceph clusters
824
825`smoke <https://github.com/ceph/ceph/tree/master/qa/suites/smoke>`_
826 run tests that exercise the Ceph API with an actual Ceph cluster
827
828`teuthology <https://github.com/ceph/ceph/tree/master/qa/suites/teuthology>`_
829 verify that teuthology can run integration tests, with and without OpenStack
830
831`upgrade <https://github.com/ceph/ceph/tree/master/qa/suites/upgrade>`_
832 for various versions of Ceph, verify that upgrades can happen
833 without disrupting an ongoing workload
834
835.. _`ceph-deploy man page`: ../../man/8/ceph-deploy
836.. _`ceph-disk man page`: ../../man/8/ceph-disk
837
838teuthology-describe-tests
839-------------------------
840
841In February 2016, a new feature called ``teuthology-describe-tests`` was
842added to the `teuthology framework`_ to facilitate documentation and better
843understanding of integration tests (`feature announcement
844<http://article.gmane.org/gmane.comp.file-systems.ceph.devel/29287>`_).
845
846The upshot is that tests can be documented by embedding ``meta:``
847annotations in the yaml files used to define the tests. The results can be
848seen in the `ceph-qa-suite wiki
849<http://tracker.ceph.com/projects/ceph-qa-suite/wiki/>`_.
850
851Since this is a new feature, many yaml files have yet to be annotated.
852Developers are encouraged to improve the documentation, in terms of both
853coverage and quality.
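
For instance, a facet might be annotated like this (a hypothetical sketch; the
description text is up to the test author)::

    meta:
    - desc: |
       verify that the daemons respond to admin socket commands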
854
855How integration tests are run
856-----------------------------
857
858Given that - as a new Ceph developer - you will typically not have access
859to the `Sepia lab`_, you may rightly ask how you can run the integration
860tests in your own environment.
861
862One option is to set up a teuthology cluster on bare metal. Though this is
863a non-trivial task, it `is` possible. Here are `some notes
864<http://docs.ceph.com/teuthology/docs/LAB_SETUP.html>`_ to get you started
865if you decide to go this route.
866
867If you have access to an OpenStack tenant, you have another option: the
868`teuthology framework`_ has an OpenStack backend, which is documented `here
869<https://github.com/dachary/teuthology/tree/openstack#openstack-backend>`__.
870This OpenStack backend can build packages from a given git commit or
871branch, provision VMs, install the packages and run integration tests
872on those VMs. This process is controlled using a tool called
873`ceph-workbench ceph-qa-suite`_. This tool also automates publishing of
874test results at http://teuthology-logs.public.ceph.com.
875
876Running integration tests on your code contributions and publishing the
877results allows reviewers to verify that changes to the code base do not
878cause regressions, or to analyze test failures when they do occur.
879
880Every teuthology cluster, whether bare-metal or cloud-provisioned, has a
so-called "teuthology machine" from which test suites are triggered using the
882``teuthology-suite`` command.
883
884A detailed and up-to-date description of each `teuthology-suite`_ option is
885available by running the following command on the teuthology machine::
886
887 $ teuthology-suite --help
888
889.. _teuthology-suite: http://docs.ceph.com/teuthology/docs/teuthology.suite.html
890
891How integration tests are defined
892---------------------------------
893
894Integration tests are defined by yaml files found in the ``suites``
895subdirectory of the `ceph/qa sub-directory`_ and implemented by python
896code found in the ``tasks`` subdirectory. Some tests ("standalone tests")
897are defined in a single yaml file, while other tests are defined by a
898directory tree containing yaml files that are combined, at runtime, into a
899larger yaml file.
900
901Reading a standalone test
902-------------------------
903
904Let us first examine a standalone test, or "singleton".
905
906Here is a commented example using the integration test
907`rados/singleton/all/admin-socket.yaml
908<https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/admin-socket.yaml>`_
909::
910
911 roles:
912 - - mon.a
913 - osd.0
914 - osd.1
915 tasks:
916 - install:
917 - ceph:
918 - admin_socket:
919 osd.0:
920 version:
921 git_version:
922 help:
923 config show:
924 config set filestore_dump_file /tmp/foo:
925 perf dump:
926 perf schema:
927
928The ``roles`` array determines the composition of the cluster (how
929many MONs, OSDs, etc.) on which this test is designed to run, as well
930as how these roles will be distributed over the machines in the
931testing cluster. In this case, there is only one element in the
932top-level array: therefore, only one machine is allocated to the
933test. The nested array declares that this machine shall run a MON with
934id ``a`` (that is the ``mon.a`` in the list of roles) and two OSDs
935(``osd.0`` and ``osd.1``).
936
937The body of the test is in the ``tasks`` array: each element is
938evaluated in order, causing the corresponding python file found in the
939``tasks`` subdirectory of the `teuthology repository`_ or
940`ceph/qa sub-directory`_ to be run. "Running" in this case means calling
941the ``task()`` function defined in that file.
942
943In this case, the `install
944<https://github.com/ceph/teuthology/blob/master/teuthology/task/install/__init__.py>`_
945task comes first. It installs the Ceph packages on each machine (as
946defined by the ``roles`` array). A full description of the ``install``
947task is `found in the python file
948<https://github.com/ceph/teuthology/blob/master/teuthology/task/install/__init__.py>`_
949(search for "def task").
950
951The ``ceph`` task, which is documented `here
952<https://github.com/ceph/ceph/blob/master/qa/tasks/ceph.py>`__ (again,
953search for "def task"), starts OSDs and MONs (and possibly MDSs as well)
954as required by the ``roles`` array. In this example, it will start one MON
955(``mon.a``) and two OSDs (``osd.0`` and ``osd.1``), all on the same
956machine. Control moves to the next task when the Ceph cluster reaches
957``HEALTH_OK`` state.
958
959The next task is ``admin_socket`` (`source code
960<https://github.com/ceph/ceph/blob/master/qa/tasks/admin_socket.py>`_).
961The parameter of the ``admin_socket`` task (and any other task) is a
962structure which is interpreted as documented in the task. In this example
963the parameter is a set of commands to be sent to the admin socket of
``osd.0``. The task verifies that each of them returns success (i.e.
exit code zero).
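
Outside of teuthology, the same kind of commands can be issued by hand against
a running daemon's admin socket. For example, on the node hosting ``osd.0``
(a sketch; it assumes the default admin socket location)::

    $ ceph daemon osd.0 version
    $ ceph daemon osd.0 perf dump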
966
967This test can be run with::
968
969 $ teuthology-suite --suite rados/singleton/all/admin-socket.yaml fs/ext4.yaml
970
971Test descriptions
972-----------------
973
974Each test has a "test description", which is similar to a directory path,
975but not the same. In the case of a standalone test, like the one in
976`Reading a standalone test`_, the test description is identical to the
977relative path (starting from the ``suites/`` directory of the
978`ceph/qa sub-directory`_) of the yaml file defining the test.
979
980Much more commonly, tests are defined not by a single yaml file, but by a
981`directory tree of yaml files`. At runtime, the tree is walked and all yaml
982files (facets) are combined into larger yaml "programs" that define the
983tests. A full listing of the yaml defining the test is included at the
984beginning of every test log.
985
986In these cases, the description of each test consists of the
987subdirectory under `suites/
988<https://github.com/ceph/ceph/tree/master/qa/suites>`_ containing the
989yaml facets, followed by an expression in curly braces (``{}``) consisting of
990a list of yaml facets in order of concatenation. For instance the
991test description::
992
993 ceph-disk/basic/{distros/centos_7.0.yaml tasks/ceph-disk.yaml}
994
995signifies the concatenation of two files:
996
997* ceph-disk/basic/distros/centos_7.0.yaml
998* ceph-disk/basic/tasks/ceph-disk.yaml
999
1000How are tests built from directories?
1001-------------------------------------
1002
1003As noted in the previous section, most tests are not defined in a single
1004yaml file, but rather as a `combination` of files collected from a
1005directory tree within the ``suites/`` subdirectory of the `ceph/qa sub-directory`_.
1006
1007The set of all tests defined by a given subdirectory of ``suites/`` is
1008called an "integration test suite", or a "teuthology suite".
1009
1010Combination of yaml facets is controlled by special files (``%`` and
1011``+``) that are placed within the directory tree and can be thought of as
1012operators. The ``%`` file is the "convolution" operator and ``+``
1013signifies concatenation.
1014
1015Convolution operator
1016--------------------
1017
1018The convolution operator, implemented as an empty file called ``%``, tells
1019teuthology to construct a test matrix from yaml facets found in
1020subdirectories below the directory containing the operator.
1021
1022For example, the `ceph-disk suite
1023<https://github.com/ceph/ceph/tree/jewel/qa/suites/ceph-disk/>`_ is
1024defined by the ``suites/ceph-disk/`` tree, which consists of the files and
1025subdirectories in the following structure::
1026
1027 directory: ceph-disk/basic
1028 file: %
1029 directory: distros
1030 file: centos_7.0.yaml
1031 file: ubuntu_14.04.yaml
1032 directory: tasks
1033 file: ceph-disk.yaml
1034
1035This is interpreted as a 2x1 matrix consisting of two tests:
1036
10371. ceph-disk/basic/{distros/centos_7.0.yaml tasks/ceph-disk.yaml}
10382. ceph-disk/basic/{distros/ubuntu_14.04.yaml tasks/ceph-disk.yaml}
1039
1040i.e. the concatenation of centos_7.0.yaml and ceph-disk.yaml and
1041the concatenation of ubuntu_14.04.yaml and ceph-disk.yaml, respectively.
1042In human terms, this means that the task found in ``ceph-disk.yaml`` is
1043intended to run on both CentOS 7.0 and Ubuntu 14.04.
1044
1045Without the file percent, the ``ceph-disk`` tree would be interpreted as
1046three standalone tests:
1047
1048* ceph-disk/basic/distros/centos_7.0.yaml
1049* ceph-disk/basic/distros/ubuntu_14.04.yaml
1050* ceph-disk/basic/tasks/ceph-disk.yaml
1051
1052(which would of course be wrong in this case).
1053
1054Referring to the `ceph/qa sub-directory`_, you will notice that the
1055``centos_7.0.yaml`` and ``ubuntu_14.04.yaml`` files in the
1056``suites/ceph-disk/basic/distros/`` directory are implemented as symlinks.
1057By using symlinks instead of copying, a single file can appear in multiple
1058suites. This eases the maintenance of the test framework as a whole.
1059
1060All the tests generated from the ``suites/ceph-disk/`` directory tree
1061(also known as the "ceph-disk suite") can be run with::
1062
1063 $ teuthology-suite --suite ceph-disk
1064
1065An individual test from the `ceph-disk suite`_ can be run by adding the
1066``--filter`` option::
1067
1068 $ teuthology-suite \
1069 --suite ceph-disk/basic \
1070 --filter 'ceph-disk/basic/{distros/ubuntu_14.04.yaml tasks/ceph-disk.yaml}'
1071
.. note:: To run a standalone test like the one in `Reading a standalone
   test`_, ``--suite`` alone is sufficient. If you want to run a single
   test from a suite that is defined as a directory tree, ``--suite`` must
   be combined with ``--filter``. This is because the ``--suite`` option
   understands POSIX relative paths only.
1077
1078Concatenation operator
1079----------------------
1080
1081For even greater flexibility in sharing yaml files between suites, the
1082special file plus (``+``) can be used to concatenate files within a
1083directory. For instance, consider the `suites/rbd/thrash
1084<https://github.com/ceph/ceph/tree/master/qa/suites/rbd/thrash>`_
1085tree::
1086
1087 directory: rbd/thrash
1088 file: %
1089 directory: clusters
1090 file: +
1091 file: fixed-2.yaml
1092 file: openstack.yaml
1093 directory: workloads
1094 file: rbd_api_tests_copy_on_read.yaml
1095 file: rbd_api_tests.yaml
1096
1097This creates two tests:
1098
1099* rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}
1100* rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests.yaml}
1101
1102Because the ``clusters/`` subdirectory contains the special file plus
1103(``+``), all the other files in that subdirectory (``fixed-2.yaml`` and
1104``openstack.yaml`` in this case) are concatenated together
1105and treated as a single file. Without the special file plus, they would
1106have been convolved with the files from the workloads directory to create
1107a 2x2 matrix:
1108
1109* rbd/thrash/{clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}
1110* rbd/thrash/{clusters/openstack.yaml workloads/rbd_api_tests.yaml}
1111* rbd/thrash/{clusters/fixed-2.yaml workloads/rbd_api_tests_copy_on_read.yaml}
1112* rbd/thrash/{clusters/fixed-2.yaml workloads/rbd_api_tests.yaml}
1113
1114The ``clusters/fixed-2.yaml`` file is shared among many suites to
1115define the following ``roles``::
1116
1117 roles:
1118 - [mon.a, mon.c, osd.0, osd.1, osd.2, client.0]
1119 - [mon.b, osd.3, osd.4, osd.5, client.1]
1120
1121The ``rbd/thrash`` suite as defined above, consisting of two tests,
1122can be run with::
1123
1124 $ teuthology-suite --suite rbd/thrash
1125
1126A single test from the rbd/thrash suite can be run by adding the
1127``--filter`` option::
1128
1129 $ teuthology-suite \
1130 --suite rbd/thrash \
1131 --filter 'rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}'
1132
1133Filtering tests by their description
1134------------------------------------
1135
1136When a few jobs fail and need to be run again, the ``--filter`` option
1137can be used to select tests with a matching description. For instance, if the
1138``rados`` suite fails the `all/peer.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/peer.yaml>`_ test, the following will only run the tests that contain this file::
1139
1140 teuthology-suite --suite rados --filter all/peer.yaml
1141
1142The ``--filter-out`` option does the opposite (it matches tests that do
1143`not` contain a given string), and can be combined with the ``--filter``
1144option.
1145
1146Both ``--filter`` and ``--filter-out`` take a comma-separated list of strings (which
1147means the comma character is implicitly forbidden in filenames found in the
1148`ceph/qa sub-directory`_). For instance::
1149
1150 teuthology-suite --suite rados --filter all/peer.yaml,all/rest-api.yaml
1151
1152will run tests that contain either
1153`all/peer.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/peer.yaml>`_
1154or
1155`all/rest-api.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/rest-api.yaml>`_
1156
1157Each string is looked up anywhere in the test description and has to
1158be an exact match: they are not regular expressions.
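
For instance, a hypothetical invocation that re-runs the ``all/peer.yaml``
tests while skipping any test whose description contains the string
``centos`` might look like this::

    teuthology-suite --suite rados --filter all/peer.yaml --filter-out centos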
1159
1160Reducing the number of tests
1161----------------------------
1162
1163The ``rados`` suite generates thousands of tests out of a few hundred
1164files. This happens because teuthology constructs test matrices from
1165subdirectories wherever it encounters a file named ``%``. For instance,
1166all tests in the `rados/basic suite
1167<https://github.com/ceph/ceph/tree/master/qa/suites/rados/basic>`_
1168run with different messenger types: ``simple``, ``async`` and
1169``random``, because they are combined (via the special file ``%``) with
1170the `msgr directory
<https://github.com/ceph/ceph/tree/master/qa/suites/rados/basic/msgr>`_.
1172
1173All integration tests are required to be run before a Ceph release is published.
1174When merely verifying whether a contribution can be merged without
1175risking a trivial regression, it is enough to run a subset. The ``--subset`` option can be used to
1176reduce the number of tests that are triggered. For instance::
1177
1178 teuthology-suite --suite rados --subset 0/4000
1179
1180will run as few tests as possible. The tradeoff in this case is that
not all combinations of test variations will be tested together,
1182but no matter how small a ratio is provided in the ``--subset``,
1183teuthology will still ensure that all files in the suite are in at
1184least one test. Understanding the actual logic that drives this
1185requires reading the teuthology source code.
1186
1187The ``--limit`` option only runs the first ``N`` tests in the suite:
1188this is rarely useful, however, because there is no way to control which
1189test will be first.
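
For completeness, a sketch of the ``--limit`` option (the number is
arbitrary)::

    teuthology-suite --suite rados --limit 5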
1190
1191Testing in the cloud
1192====================
1193
In this chapter, we will explain in detail how to use an OpenStack
1195tenant as an environment for Ceph integration testing.
1196
1197Assumptions and caveat
1198----------------------
1199
1200We assume that:
1201
12021. you are the only person using the tenant
12032. you have the credentials
12043. the tenant supports the ``nova`` and ``cinder`` APIs
1205
1206Caveat: be aware that, as of this writing (July 2016), testing in
1207OpenStack clouds is a new feature. Things may not work as advertised.
1208If you run into trouble, ask for help on `IRC`_ or the `Mailing list`_, or
1209open a bug report at the `ceph-workbench bug tracker`_.
1210
1211.. _`ceph-workbench bug tracker`: http://ceph-workbench.dachary.org/root/ceph-workbench/issues
1212
1213Prepare tenant
1214--------------
1215
1216If you have not tried to use ``ceph-workbench`` with this tenant before,
1217proceed to the next step.
1218
1219To start with a clean slate, login to your tenant via the Horizon dashboard and:
1220
1221* terminate the ``teuthology`` and ``packages-repository`` instances, if any
1222* delete the ``teuthology`` and ``teuthology-worker`` security groups, if any
1223* delete the ``teuthology`` and ``teuthology-myself`` key pairs, if any
1224
1225Also do the above if you ever get key-related errors ("invalid key", etc.) when
1226trying to schedule suites.
1227
1228Getting ceph-workbench
1229----------------------
1230
1231Since testing in the cloud is done using the `ceph-workbench
1232ceph-qa-suite`_ tool, you will need to install that first. It is designed
1233to be installed via Docker, so if you don't have Docker running on your
development machine, take care of that first. You can follow `the official
tutorial <https://docs.docker.com/engine/installation/>`_ to install Docker
if you have not done so already.
1237
1238Once Docker is up and running, install ``ceph-workbench`` by following the
1239`Installation instructions in the ceph-workbench documentation
1240<http://ceph-workbench.readthedocs.org/en/latest/#installation>`_.
1241
1242Linking ceph-workbench with your OpenStack tenant
1243-------------------------------------------------
1244
1245Before you can trigger your first teuthology suite, you will need to link
1246``ceph-workbench`` with your OpenStack account.
1247
1248First, download a ``openrc.sh`` file by clicking on the "Download OpenStack
1249RC File" button, which can be found in the "API Access" tab of the "Access
1250& Security" dialog of the OpenStack Horizon dashboard.
1251
1252Second, create a ``~/.ceph-workbench`` directory, set its permissions to
1253700, and move the ``openrc.sh`` file into it. Make sure that the filename
1254is exactly ``~/.ceph-workbench/openrc.sh``.
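
For example (a sketch, assuming the file was downloaded to
``~/Downloads/openrc.sh``)::

    $ mkdir -m 700 ~/.ceph-workbench
    $ mv ~/Downloads/openrc.sh ~/.ceph-workbench/openrc.sh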
1255
1256Third, edit the file so it does not ask for your OpenStack password
1257interactively. Comment out the relevant lines and replace them with
1258something like::
1259
1260 export OS_PASSWORD="aiVeth0aejee3eep8rogho3eep7Pha6ek"
1261
1262When `ceph-workbench ceph-qa-suite`_ connects to your OpenStack tenant for
1263the first time, it will generate two keypairs: ``teuthology-myself`` and
1264``teuthology``.
1265
1266.. If this is not the first time you have tried to use
1267.. `ceph-workbench ceph-qa-suite`_ with this tenant, make sure to delete any
1268.. stale keypairs with these names!
1269
1270Run the dummy suite
1271-------------------
1272
1273You are now ready to take your OpenStack teuthology setup for a test
1274drive::
1275
1276 $ ceph-workbench ceph-qa-suite --suite dummy
1277
1278Be forewarned that the first run of `ceph-workbench ceph-qa-suite`_ on a
1279pristine tenant will take a long time to complete because it downloads a VM
1280image and during this time the command may not produce any output.
1281
1282The images are cached in OpenStack, so they are only downloaded once.
1283Subsequent runs of the same command will complete faster.
1284
Although the ``dummy`` suite does not run any tests, in all other respects it
1286behaves just like a teuthology suite and produces some of the same
1287artifacts.
1288
1289The last bit of output should look something like this::
1290
1291 pulpito web interface: http://149.202.168.201:8081/
1292 ssh access : ssh -i /home/smithfarm/.ceph-workbench/teuthology-myself.pem ubuntu@149.202.168.201 # logs in /usr/share/nginx/html
1293
1294What this means is that `ceph-workbench ceph-qa-suite`_ triggered the test
1295suite run. It does not mean that the suite run has completed. To monitor
1296progress of the run, check the Pulpito web interface URL periodically, or
1297if you are impatient, ssh to the teuthology machine using the ssh command
1298shown and do::
1299
1300 $ tail -f /var/log/teuthology.*
1301
1302The `/usr/share/nginx/html` directory contains the complete logs of the
1303test suite. If we had provided the ``--upload`` option to the
1304`ceph-workbench ceph-qa-suite`_ command, these logs would have been
1305uploaded to http://teuthology-logs.public.ceph.com.
1306
1307Run a standalone test
1308---------------------
1309
1310The standalone test explained in `Reading a standalone test`_ can be run
1311with the following command::
1312
1313 $ ceph-workbench ceph-qa-suite --suite rados/singleton/all/admin-socket.yaml
1314
1315This will run the suite shown on the current ``master`` branch of
1316``ceph/ceph.git``. You can specify a different branch with the ``--ceph``
1317option, and even a different git repo with the ``--ceph-git-url`` option. (Run
1318``ceph-workbench ceph-qa-suite --help`` for an up-to-date list of available
1319options.)
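
For example, a hypothetical run of the same suite against the ``jewel`` branch
might look like this::

    $ ceph-workbench ceph-qa-suite --suite rados/singleton/all/admin-socket.yaml --ceph jewel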
1320
The first run of a suite will also take a long time, because Ceph packages
have to be built first. Again, the packages so built are cached and
1323`ceph-workbench ceph-qa-suite`_ will not build identical packages a second
1324time.
1325
1326Interrupt a running suite
1327-------------------------
1328
1329Teuthology suites take time to run. From time to time one may wish to
1330interrupt a running suite. One obvious way to do this is::
1331
1332 ceph-workbench ceph-qa-suite --teardown
1333
1334This destroys all VMs created by `ceph-workbench ceph-qa-suite`_ and
1335returns the OpenStack tenant to a "clean slate".
1336
1337Sometimes you may wish to interrupt the running suite, but keep the logs,
1338the teuthology VM, the packages-repository VM, etc. To do this, you can
1339``ssh`` to the teuthology VM (using the ``ssh access`` command reported
1340when you triggered the suite -- see `Run the dummy suite`_) and, once
1341there::
1342
1343 sudo /etc/init.d/teuthology restart
1344
1345This will keep the teuthology machine, the logs and the packages-repository
1346instance but nuke everything else.
1347
1348Upload logs to archive server
1349-----------------------------
1350
1351Since the teuthology instance in OpenStack is only semi-permanent, with limited
1352space for storing logs, ``teuthology-openstack`` provides an ``--upload``
1353option which, if included in the ``ceph-workbench ceph-qa-suite`` command,
1354will cause logs from all failed jobs to be uploaded to the log archive server
1355maintained by the Ceph project. The logs will appear at the URL::
1356
1357 http://teuthology-logs.public.ceph.com/$RUN
1358
1359where ``$RUN`` is the name of the run. It will be a string like this::
1360
1361 ubuntu-2016-07-23_16:08:12-rados-hammer-backports---basic-openstack
1362
Even if you don't provide the ``--upload`` option, however, all the logs can
1364still be found on the teuthology machine in the directory
1365``/usr/share/nginx/html``.
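
For example, a sketch of scheduling the ``dummy`` suite with log uploading
enabled::

    $ ceph-workbench ceph-qa-suite --suite dummy --upload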
1366
1367Provision VMs ad hoc
1368--------------------
1369
1370From the teuthology VM, it is possible to provision machines on an "ad hoc"
1371basis, to use however you like. The magic incantation is::
1372
1373 teuthology-lock --lock-many $NUMBER_OF_MACHINES \
1374 --os-type $OPERATING_SYSTEM \
1375 --os-version $OS_VERSION \
1376 --machine-type openstack \
1377 --owner $EMAIL_ADDRESS
1378
1379The command must be issued from the ``~/teuthology`` directory. The possible
values for ``OPERATING_SYSTEM`` and ``OS_VERSION`` can be found by examining
1381the contents of the directory ``teuthology/openstack/``. For example::
1382
1383 teuthology-lock --lock-many 1 --os-type ubuntu --os-version 16.04 \
1384 --machine-type openstack --owner foo@example.com
1385
1386When you are finished with the machine, find it in the list of machines::
1387
1388 openstack server list
1389
1390to determine the name or ID, and then terminate it with::
1391
1392 openstack server delete $NAME_OR_ID
1393
1394Deploy a cluster for manual testing
1395-----------------------------------
1396
1397The `teuthology framework`_ and `ceph-workbench ceph-qa-suite`_ are
1398versatile tools that automatically provision Ceph clusters in the cloud and
1399run various tests on them in an automated fashion. This enables a single
1400engineer, in a matter of hours, to perform thousands of tests that would
1401keep dozens of human testers occupied for days or weeks if conducted
1402manually.
1403
1404However, there are times when the automated tests do not cover a particular
1405scenario and manual testing is desired. It turns out that it is simple to
1406adapt a test to stop and wait after the Ceph installation phase, and the
1407engineer can then ssh into the running cluster. Simply add the following
1408snippet in the desired place within the test YAML and schedule a run with the
1409test::
1410
1411 tasks:
1412 - exec:
1413 client.0:
1414 - sleep 1000000000 # forever
1415
1416(Make sure you have a ``client.0`` defined in your ``roles`` stanza or adapt
1417accordingly.)
1418
1419The same effect can be achieved using the ``interactive`` task::
1420
1421 tasks:
1422 - interactive
1423
1424By following the test log, you can determine when the test cluster has entered
1425the "sleep forever" condition. At that point, you can ssh to the teuthology
1426machine and from there to one of the target VMs (OpenStack) or teuthology
worker machines (Sepia) where the test cluster is running.
1428
1429The VMs (or "instances" in OpenStack terminology) created by
1430`ceph-workbench ceph-qa-suite`_ are named as follows:
1431
1432``teuthology`` - the teuthology machine
1433
1434``packages-repository`` - VM where packages are stored
1435
1436``ceph-*`` - VM where packages are built
1437
1438``target*`` - machines where tests are run
1439
1440The VMs named ``target*`` are used by tests. If you are monitoring the
1441teuthology log for a given test, the hostnames of these target machines can
1442be found out by searching for the string ``Locked targets``::
1443
1444 2016-03-20T11:39:06.166 INFO:teuthology.task.internal:Locked targets:
1445 target149202171058.teuthology: null
1446 target149202171059.teuthology: null
1447
1448The IP addresses of the target machines can be found by running ``openstack
1449server list`` on the teuthology machine, but the target VM hostnames (e.g.
1450``target149202171058.teuthology``) are resolvable within the teuthology
1451cluster.
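
For example, using the hostname from the log excerpt above, a session might
look like this (a sketch; the login user and key setup can vary between labs)::

    $ ssh ubuntu@target149202171058.teuthology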
1452
1453
1454Testing - how to run s3-tests locally
1455=====================================
1456
1457RGW code can be tested by building Ceph locally from source, starting a vstart
1458cluster, and running the "s3-tests" suite against it.
1459
1460The following instructions should work on jewel and above.
1461
1462Step 1 - build Ceph
1463-------------------
1464
Refer to :doc:`/install/build-ceph`.
1466
1467You can do step 2 separately while it is building.
1468
Step 2 - vstart
---------------
1471
1472When the build completes, and still in the top-level directory of the git
clone where you built Ceph, do the following, for cmake builds::

    cd build/
    RGW=1 ../vstart.sh -n
1477
1478This will produce a lot of output as the vstart cluster is started up. At the
1479end you should see a message like::
1480
1481 started. stop.sh to stop. see out/* (e.g. 'tail -f out/????') for debug output.
1482
1483This means the cluster is running.
1484
Step 3 - run s3-tests
---------------------
1488
To run the s3-tests suite, do the following::

    $ ../qa/workunits/rgw/run-s3tests.sh
1492
1493.. WIP
1494.. ===
1495..
1496.. Building RPM packages
1497.. ---------------------
1498..
1499.. Ceph is regularly built and packaged for a number of major Linux
1500.. distributions. At the time of this writing, these included CentOS, Debian,
1501.. Fedora, openSUSE, and Ubuntu.
1502..
1503.. Architecture
1504.. ============
1505..
1506.. Ceph is a collection of components built on top of RADOS and provide
1507.. services (RBD, RGW, CephFS) and APIs (S3, Swift, POSIX) for the user to
1508.. store and retrieve data.
1509..
1510.. See :doc:`/architecture` for an overview of Ceph architecture. The
1511.. following sections treat each of the major architectural components
1512.. in more detail, with links to code and tests.
1513..
1514.. FIXME The following are just stubs. These need to be developed into
1515.. detailed descriptions of the various high-level components (RADOS, RGW,
1516.. etc.) with breakdowns of their respective subcomponents.
1517..
1518.. FIXME Later, in the Testing chapter I would like to take another look
1519.. at these components/subcomponents with a focus on how they are tested.
1520..
1521.. RADOS
1522.. -----
1523..
1524.. RADOS stands for "Reliable, Autonomic Distributed Object Store". In a Ceph
1525.. cluster, all data are stored in objects, and RADOS is the component responsible
1526.. for that.
1527..
1528.. RADOS itself can be further broken down into Monitors, Object Storage Daemons
1529.. (OSDs), and client APIs (librados). Monitors and OSDs are introduced at
1530.. :doc:`/start/intro`. The client library is explained at
1531.. :doc:`/rados/api/index`.
1532..
1533.. RGW
1534.. ---
1535..
1536.. RGW stands for RADOS Gateway. Using the embedded HTTP server civetweb_ or
1537.. Apache FastCGI, RGW provides a REST interface to RADOS objects.
1538..
1539.. .. _civetweb: https://github.com/civetweb/civetweb
1540..
1541.. A more thorough introduction to RGW can be found at :doc:`/radosgw/index`.
1542..
1543.. RBD
1544.. ---
1545..
1546.. RBD stands for RADOS Block Device. It enables a Ceph cluster to store disk
1547.. images, and includes in-kernel code enabling RBD images to be mounted.
1548..
1549.. To delve further into RBD, see :doc:`/rbd/rbd`.
1550..
1551.. CephFS
1552.. ------
1553..
1554.. CephFS is a distributed file system that enables a Ceph cluster to be used as a NAS.
1555..
1556.. File system metadata is managed by Meta Data Server (MDS) daemons. The Ceph
1557.. file system is explained in more detail at :doc:`/cephfs/index`.
1558..