1 ============================================
2 Contributing to Ceph: A Guide for Developers
3 ============================================
4
5 :Author: Loic Dachary
6 :Author: Nathan Cutler
7 :License: Creative Commons Attribution-ShareAlike (CC BY-SA)
8
9 .. note:: The old (pre-2016) developer documentation has been moved to :doc:`/dev/index-old`.
10
11 .. contents::
12 :depth: 3
13
14 Introduction
15 ============
16
17 This guide has two aims. First, it should lower the barrier to entry for
18 software developers who wish to get involved in the Ceph project. Second,
19 it should serve as a reference for Ceph developers.
20
21 We assume that readers are already familiar with Ceph (the distributed
22 object store and file system designed to provide excellent performance,
23 reliability and scalability). If not, please refer to the `project website`_
24 and especially the `publications list`_.
25
26 .. _`project website`: http://ceph.com
27 .. _`publications list`: https://ceph.com/resources/publications/
28
Since this document is meant to be consumed by developers, who are assumed
to have Internet access, topics covered elsewhere - either within the Ceph
documentation or elsewhere on the web - are treated by linking. If you
32 notice that a link is broken or if you know of a better link, please
33 `report it as a bug`_.
34
35 .. _`report it as a bug`: http://tracker.ceph.com/projects/ceph/issues/new
36
37 Essentials (tl;dr)
38 ==================
39
40 This chapter presents essential information that every Ceph developer needs
41 to know.
42
43 Leads
44 -----
45
46 The Ceph project is led by Sage Weil. In addition, each major project
47 component has its own lead. The following table shows all the leads and
48 their nicks on `GitHub`_:
49
50 .. _github: https://github.com/
51
52 ========= =============== =============
53 Scope Lead GitHub nick
54 ========= =============== =============
55 Ceph Sage Weil liewegas
56 RADOS Samuel Just athanatos
57 RGW Yehuda Sadeh yehudasa
58 RBD Jason Dillaman dillaman
59 CephFS John Spray jcsp
60 Build/Ops Ken Dreyer ktdreyer
61 ========= =============== =============
62
63 The Ceph-specific acronyms in the table are explained in
64 :doc:`/architecture`.
65
66 History
67 -------
68
69 See the `History chapter of the Wikipedia article`_.
70
71 .. _`History chapter of the Wikipedia article`: https://en.wikipedia.org/wiki/Ceph_%28software%29#History
72
73 Licensing
74 ---------
75
76 Ceph is free software.
77
78 Unless stated otherwise, the Ceph source code is distributed under the terms of
79 the LGPL2.1. For full details, see `the file COPYING in the top-level
80 directory of the source-code tree`_.
81
82 .. _`the file COPYING in the top-level directory of the source-code tree`:
83 https://github.com/ceph/ceph/blob/master/COPYING
84
85 Source code repositories
86 ------------------------
87
88 The source code of Ceph lives on `GitHub`_ in a number of repositories below
89 the `Ceph "organization"`_.
90
91 .. _`Ceph "organization"`: https://github.com/ceph
92
93 To make a meaningful contribution to the project as a developer, a working
94 knowledge of git_ is essential.
95
96 .. _git: https://git-scm.com/documentation
97
98 Although the `Ceph "organization"`_ includes several software repositories,
99 this document covers only one: https://github.com/ceph/ceph.
100
101 Redmine issue tracker
102 ---------------------
103
104 Although `GitHub`_ is used for code, Ceph-related issues (Bugs, Features,
105 Backports, Documentation, etc.) are tracked at http://tracker.ceph.com,
106 which is powered by `Redmine`_.
107
108 .. _Redmine: http://www.redmine.org
109
110 The tracker has a Ceph project with a number of subprojects loosely
111 corresponding to the various architectural components (see
112 :doc:`/architecture`).
113
114 Mere `registration`_ in the tracker automatically grants permissions
115 sufficient to open new issues and comment on existing ones.
116
117 .. _registration: http://tracker.ceph.com/account/register
118
119 To report a bug or propose a new feature, `jump to the Ceph project`_ and
120 click on `New issue`_.
121
122 .. _`jump to the Ceph project`: http://tracker.ceph.com/projects/ceph
123 .. _`New issue`: http://tracker.ceph.com/projects/ceph/issues/new
124
125 Mailing list
126 ------------
127
128 Ceph development email discussions take place on the mailing list
129 ``ceph-devel@vger.kernel.org``. The list is open to all. Subscribe by
130 sending a message to ``majordomo@vger.kernel.org`` with the line: ::
131
132 subscribe ceph-devel
133
134 in the body of the message.
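
For example, with a working command-line mail setup, the subscription message
could be sent like this (a sketch; any mail client works just as well)::

    $ echo "subscribe ceph-devel" | mail -s subscribe majordomo@vger.kernel.org
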
135
136 There are also `other Ceph-related mailing lists`_.
137
138 .. _`other Ceph-related mailing lists`: https://ceph.com/irc/
139
140 IRC
141 ---
142
143 In addition to mailing lists, the Ceph community also communicates in real
144 time using `Internet Relay Chat`_.
145
146 .. _`Internet Relay Chat`: http://www.irchelp.org/
147
148 See https://ceph.com/irc/ for how to set up your IRC
149 client and a list of channels.
150
151 Submitting patches
152 ------------------
153
The canonical instructions for submitting patches are contained in
155 `the file CONTRIBUTING.rst in the top-level directory of the source-code
156 tree`_. There may be some overlap between this guide and that file.
157
158 .. _`the file CONTRIBUTING.rst in the top-level directory of the source-code tree`:
159 https://github.com/ceph/ceph/blob/master/CONTRIBUTING.rst
160
161 All newcomers are encouraged to read that file carefully.
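
For example, every commit must be signed off (this is one of the things the
automated checks described under `Automated PR validation`_ verify); with git
this is just the ``-s`` flag (the commit message below is a hypothetical
placeholder)::

    $ git commit -s -m 'doc: fix a typo in the developer guide'
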
162
163 Building from source
164 --------------------
165
166 See instructions at :doc:`/install/build-ceph`.
167
168 Using ccache to speed up local builds
169 -------------------------------------
170
Rebuilds of the ceph source tree can benefit significantly from the use of
`ccache`_. When switching between branches, build failures sometimes occur on
older branches because of stale build artifacts; such rebuilds are exactly
where ccache helps most. For a fully clean source tree, one could do::
176
177 $ make clean
178
        # note: the following will nuke everything in the source tree that
        # isn't tracked by git, so make sure to back up any log files or
        # conf options you want to keep
181
182 $ git clean -fdx; git submodule foreach git clean -fdx
183
184 ccache is available as a package in most distros. To build ceph with ccache one
185 can::
186
187 $ cmake -DWITH_CCACHE=ON ..
188
ccache can also be used to speed up all builds on the system. For more
details, refer to the `run modes`_ section of the ccache manual. The default
settings of ``ccache`` can be displayed with ``ccache -s``.
192
.. note:: It is recommended to raise ``max_size`` (the size of the cache,
   which defaults to 10G) to a larger value, such as 25G. Refer to the
   `configuration`_ section of the ccache manual.
196
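For example, the cache size could be increased like this (a sketch; ``-M`` is
the short form of ccache's ``--max-size`` option)::

    $ ccache -M 25G
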
197 .. _`ccache`: https://ccache.samba.org/
198 .. _`run modes`: https://ccache.samba.org/manual.html#_run_modes
199 .. _`configuration`: https://ccache.samba.org/manual.html#_configuration
200
201 Development-mode cluster
202 ------------------------
203
204 See :doc:`/dev/quick_guide`.
205
206 Backporting
207 -----------
208
209 All bugfixes should be merged to the ``master`` branch before being backported.
210 To flag a bugfix for backporting, make sure it has a `tracker issue`_
211 associated with it and set the ``Backport`` field to a comma-separated list of
212 previous releases (e.g. "hammer,jewel") that you think need the backport.
213 The rest (including the actual backporting) will be taken care of by the
214 `Stable Releases and Backports`_ team.
215
216 .. _`tracker issue`: http://tracker.ceph.com/
217 .. _`Stable Releases and Backports`: http://tracker.ceph.com/projects/ceph-releases/wiki
218
219
What is merged where and when?
221 ===============================
222
223 Commits are merged into branches according to criteria that change
224 during the lifecycle of a Ceph release. This chapter is the inventory
225 of what can be merged in which branch at a given point in time.
226
227 Development releases (i.e. x.0.z)
228 ---------------------------------
229
What?
231 ^^^^^^
232
233 * features
234 * bug fixes
235
Where?
237 ^^^^^^^
238
239 Features are merged to the master branch. Bug fixes should be merged
240 to the corresponding named branch (e.g. "jewel" for 10.0.z, "kraken"
241 for 11.0.z, etc.). However, this is not mandatory - bug fixes can be
242 merged to the master branch as well, since the master branch is
243 periodically merged to the named branch during the development
244 releases phase. In either case, if the bugfix is important it can also
245 be flagged for backport to one or more previous stable releases.
246
When?
248 ^^^^^^
249
After the stable release candidates of the previous release enter
251 phase 2 (see below). For example: the "jewel" named branch was
252 created when the infernalis release candidates entered phase 2. From
253 this point on, master was no longer associated with infernalis. As
254 soon as the named branch of the next stable release is created, master
255 starts getting periodically merged into it.
256
257 Branch merges
258 ^^^^^^^^^^^^^
259
260 * The branch of the stable release is merged periodically into master.
261 * The master branch is merged periodically into the branch of the
262 stable release.
263 * The master is merged into the branch of the stable release
264 immediately after each development x.0.z release.
265
266 Stable release candidates (i.e. x.1.z) phase 1
267 ----------------------------------------------
268
What?
270 ^^^^^^
271
272 * bug fixes only
273
Where?
275 ^^^^^^^
276
Bug fixes should be merged to the named branch corresponding to the
stable release candidate (e.g. "jewel" for 10.1.z, "kraken" for 11.1.z)
or to master. During this phase, all commits to master will be
merged to the named branch, and vice versa. In other words, it makes
no difference whether a commit is merged to the named branch or to
master - it will make it into the next release candidate either way.
284
When?
286 ^^^^^^
287
288 After the first stable release candidate is published, i.e. after the
289 x.1.0 tag is set in the release branch.
290
291 Branch merges
292 ^^^^^^^^^^^^^
293
294 * The branch of the stable release is merged periodically into master.
295 * The master branch is merged periodically into the branch of the
296 stable release.
297 * The master is merged into the branch of the stable release
298 immediately after each x.1.z release candidate.
299
300 Stable release candidates (i.e. x.1.z) phase 2
301 ----------------------------------------------
302
What?
304 ^^^^^^
305
306 * bug fixes only
307
Where?
309 ^^^^^^^
310
The branch of the stable release (e.g. "jewel" for 10.1.z, "kraken"
for 11.1.z, etc.). During this phase, all commits to the named branch
313 will be merged into master. Cherry-picking to the named branch during
314 release candidate phase 2 is done manually since the official
315 backporting process only begins when the release is pronounced
316 "stable".
317
When?
319 ^^^^^^
320
321 After Sage Weil decides it is time for phase 2 to happen.
322
323 Branch merges
324 ^^^^^^^^^^^^^
325
326 * The branch of the stable release is merged periodically into master.
327
328 Stable releases (i.e. x.2.z)
329 ----------------------------
330
331 What ?
332 ^^^^^^
333
334 * bug fixes
* features are sometimes accepted
336 * commits should be cherry-picked from master when possible
337 * commits that are not cherry-picked from master must be about a bug unique to the stable release
338 * see also `the backport HOWTO`_
339
340 .. _`the backport HOWTO`:
341 http://tracker.ceph.com/projects/ceph-releases/wiki/HOWTO#HOWTO
342
Where?
344 ^^^^^^^
345
346 The branch of the stable release (hammer for 0.94.x, infernalis for 9.2.x, etc.)
347
When?
349 ^^^^^^
350
351 After the stable release is published, i.e. after the "vx.2.0" tag is
352 set in the release branch.
353
354 Branch merges
355 ^^^^^^^^^^^^^
356
357 Never
358
359 Issue tracker
360 =============
361
362 See `Redmine issue tracker`_ for a brief introduction to the Ceph Issue Tracker.
363
364 Ceph developers use the issue tracker to
365
366 1. keep track of issues - bugs, fix requests, feature requests, backport
367 requests, etc.
368
369 2. communicate with other developers and keep them informed as work
370 on the issues progresses.
371
372 Issue tracker conventions
373 -------------------------
374
375 When you start working on an existing issue, it's nice to let the other
376 developers know this - to avoid duplication of labor. Typically, this is
377 done by changing the :code:`Assignee` field (to yourself) and changing the
:code:`Status` to *In progress*. Newcomers to the Ceph community typically do
not have sufficient privileges to update these fields; in that case, simply
update the issue with a brief note.
381
382 .. table:: Meanings of some commonly used statuses
383
384 ================ ===========================================
385 Status Meaning
386 ================ ===========================================
387 New Initial status
388 In Progress Somebody is working on it
389 Need Review Pull request is open with a fix
390 Pending Backport Fix has been merged, backport(s) pending
391 Resolved Fix and backports (if any) have been merged
392 ================ ===========================================
393
394 Basic workflow
395 ==============
396
The following chart illustrates the basic development workflow:
398
399 .. ditaa::
400
401 Upstream Code Your Local Environment
402
403 /----------\ git clone /-------------\
404 | Ceph | -------------------------> | ceph/master |
405 \----------/ \-------------/
406 ^ |
407 | | git branch fix_1
408 | git merge |
409 | v
410 /----------------\ git commit --amend /-------------\
411 | make check |---------------------> | ceph/fix_1 |
412 | ceph--qa--suite| \-------------/
413 \----------------/ |
414 ^ | fix changes
415 | | test changes
416 | review | git commit
417 | |
418 | v
419 /--------------\ /-------------\
420 | github |<---------------------- | ceph/fix_1 |
421 | pull request | git push \-------------/
422 \--------------/
423
Below we present an explanation of this chart. The explanation is written
with the assumption that you, the reader, are a beginning developer who
has an idea for a bugfix but does not know exactly how to proceed.
427
428 Update the tracker
429 ------------------
430
431 Before you start, you should know the `Issue tracker`_ number of the bug
432 you intend to fix. If there is no tracker issue, now is the time to create
433 one.
434
435 The tracker is there to explain the issue (bug) to your fellow Ceph
436 developers and keep them informed as you make progress toward resolution.
437 To this end, then, provide a descriptive title as well as sufficient
438 information and details in the description.
439
440 If you have sufficient tracker permissions, assign the bug to yourself by
441 changing the ``Assignee`` field. If your tracker permissions have not yet
442 been elevated, simply add a comment to the issue with a short message like
443 "I am working on this issue".
444
445 Upstream code
446 -------------
447
448 This section, and the ones that follow, correspond to the nodes in the
449 above chart.
450
451 The upstream code lives in https://github.com/ceph/ceph.git, which is
452 sometimes referred to as the "upstream repo", or simply "upstream". As the
453 chart illustrates, we will make a local copy of this code, modify it, test
454 our modifications, and submit the modifications back to the upstream repo
455 for review.
456
457 A local copy of the upstream code is made by
458
459 1. forking the upstream repo on GitHub, and
460 2. cloning your fork to make a local working copy
461
See `the GitHub documentation
463 <https://help.github.com/articles/fork-a-repo/#platform-linux>`_ for
464 detailed instructions on forking. In short, if your GitHub username is
465 "mygithubaccount", your fork of the upstream repo will show up at
466 https://github.com/mygithubaccount/ceph. Once you have created your fork,
467 you clone it by doing:
468
469 .. code::
470
471 $ git clone https://github.com/mygithubaccount/ceph
472
473 While it is possible to clone the upstream repo directly, in this case you
474 must fork it first. Forking is what enables us to open a `GitHub pull
475 request`_.
476
477 For more information on using GitHub, refer to `GitHub Help
478 <https://help.github.com/>`_.
479
480 Local environment
481 -----------------
482
483 In the local environment created in the previous step, you now have a
484 copy of the ``master`` branch in ``remotes/origin/master``. Since the fork
485 (https://github.com/mygithubaccount/ceph.git) is frozen in time and the
486 upstream repo (https://github.com/ceph/ceph.git, typically abbreviated to
487 ``ceph/ceph.git``) is updated frequently by other developers, you will need
488 to sync your fork periodically. To do this, first add the upstream repo as
489 a "remote" and fetch it::
490
491 $ git remote add ceph https://github.com/ceph/ceph.git
492 $ git fetch ceph
493
494 Fetching downloads all objects (commits, branches) that were added since
495 the last sync. After running these commands, all the branches from
496 ``ceph/ceph.git`` are downloaded to the local git repo as
497 ``remotes/ceph/$BRANCH_NAME`` and can be referenced as
498 ``ceph/$BRANCH_NAME`` in certain git commands.
499
500 For example, your local ``master`` branch can be reset to the upstream Ceph
501 ``master`` branch by doing::
502
503 $ git fetch ceph
504 $ git checkout master
505 $ git reset --hard ceph/master
506
507 Finally, the ``master`` branch of your fork can then be synced to upstream
508 master by::
509
510 $ git push -u origin master
511
512 Bugfix branch
513 -------------
514
515 Next, create a branch for the bugfix:
516
517 .. code::
518
519 $ git checkout master
520 $ git checkout -b fix_1
521 $ git push -u origin fix_1
522
523 This creates a ``fix_1`` branch locally and in our GitHub fork. At this
524 point, the ``fix_1`` branch is identical to the ``master`` branch, but not
525 for long! You are now ready to modify the code.
526
527 Fix bug locally
528 ---------------
529
530 At this point, change the status of the tracker issue to "In progress" to
531 communicate to the other Ceph developers that you have begun working on a
532 fix. If you don't have permission to change that field, your comment that
533 you are working on the issue is sufficient.
534
535 Possibly, your fix is very simple and requires only minimal testing.
536 More likely, it will be an iterative process involving trial and error, not
537 to mention skill. An explanation of how to fix bugs is beyond the
538 scope of this document. Instead, we focus on the mechanics of the process
539 in the context of the Ceph project.
540
For a detailed discussion of the tools available for validating your
bugfixes, see the `Testing`_ chapter.
543
544 For now, let us just assume that you have finished work on the bugfix and
545 that you have tested it and believe it works. Commit the changes to your local
546 branch using the ``--signoff`` option::
547
548 $ git commit -as
549
550 and push the changes to your fork::
551
552 $ git push origin fix_1
553
554 GitHub pull request
555 -------------------
556
557 The next step is to open a GitHub pull request. The purpose of this step is
558 to make your bugfix available to the community of Ceph developers. They
559 will review it and may do additional testing on it.
560
561 In short, this is the point where you "go public" with your modifications.
562 Psychologically, you should be prepared to receive suggestions and
563 constructive criticism. Don't worry! In our experience, the Ceph project is
564 a friendly place!
565
566 If you are uncertain how to use pull requests, you may read
567 `this GitHub pull request tutorial`_.
568
569 .. _`this GitHub pull request tutorial`:
570 https://help.github.com/articles/using-pull-requests/
571
572 For some ideas on what constitutes a "good" pull request, see
573 the `Git Commit Good Practice`_ article at the `OpenStack Project Wiki`_.
574
575 .. _`Git Commit Good Practice`: https://wiki.openstack.org/wiki/GitCommitMessages
576 .. _`OpenStack Project Wiki`: https://wiki.openstack.org/wiki/Main_Page
577
578 Once your pull request (PR) is opened, update the `Issue tracker`_ by
579 adding a comment to the bug pointing the other developers to your PR. The
580 update can be as simple as::
581
582 *PR*: https://github.com/ceph/ceph/pull/$NUMBER_OF_YOUR_PULL_REQUEST
583
584 Automated PR validation
585 -----------------------
586
587 When your PR hits GitHub, the Ceph project's `Continuous Integration (CI)
588 <https://en.wikipedia.org/wiki/Continuous_integration>`_
589 infrastructure will test it automatically. At the time of this writing
590 (March 2016), the automated CI testing included a test to check that the
591 commits in the PR are properly signed (see `Submitting patches`_) and a
592 `make check`_ test.
593
594 The latter, `make check`_, builds the PR and runs it through a battery of
595 tests. These tests run on machines operated by the Ceph Continuous
596 Integration (CI) team. When the tests complete, the result will be shown
597 on GitHub in the pull request itself.
598
599 You can (and should) also test your modifications before you open a PR.
600 Refer to the `Testing`_ chapter for details.
601
602 Notes on PR make check test
603 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
604
605 The GitHub `make check`_ test is driven by a Jenkins instance.
606
607 Jenkins merges the PR branch into the latest version of the base branch before
608 starting the build, so you don't have to rebase the PR to pick up any fixes.
609
610 You can trigger the PR tests at any time by adding a comment to the PR - the
611 comment should contain the string "test this please". Since a human subscribed
612 to the PR might interpret that as a request for him or her to test the PR, it's
613 good to write the request as "Jenkins, test this please".
614
615 The `make check`_ log is the place to go if there is a failure and you're not
616 sure what caused it. To reach it, first click on "details" (next to the `make
617 check`_ test in the PR) to get into the Jenkins web GUI, and then click on
618 "Console Output" (on the left).
619
620 Jenkins is set up to grep the log for strings known to have been associated
621 with `make check`_ failures in the past. However, there is no guarantee that
622 the strings are associated with any given `make check`_ failure. You have to
623 dig into the log to be sure.
624
625 Integration tests AKA ceph-qa-suite
626 -----------------------------------
627
628 Since Ceph is a complex beast, it may also be necessary to test your fix to
629 see how it behaves on real clusters running either on real or virtual
630 hardware. Tests designed for this purpose live in the `ceph/qa
631 sub-directory`_ and are run via the `teuthology framework`_.
632
633 .. _`ceph/qa sub-directory`: https://github.com/ceph/ceph/tree/master/qa/
634 .. _`teuthology repository`: https://github.com/ceph/teuthology
635 .. _`teuthology framework`: https://github.com/ceph/teuthology
636
637 If you have access to an OpenStack tenant, you are encouraged to run the
638 integration tests yourself using `ceph-workbench ceph-qa-suite`_,
639 and to post the test results to the PR.
640
641 .. _`ceph-workbench ceph-qa-suite`: http://ceph-workbench.readthedocs.org/
642
643 The Ceph community has access to the `Sepia lab
644 <http://ceph.github.io/sepia/>`_ where integration tests can be run on
645 real hardware. Other developers may add tags like "needs-qa" to your PR.
646 This allows PRs that need testing to be merged into a single branch and
647 tested all at the same time. Since teuthology suites can take hours
648 (even days in some cases) to run, this can save a lot of time.
649
650 Integration testing is discussed in more detail in the `Testing`_ chapter.
651
652 Code review
653 -----------
654
655 Once your bugfix has been thoroughly tested, or even during this process,
656 it will be subjected to code review by other developers. This typically
657 takes the form of correspondence in the PR itself, but can be supplemented
658 by discussions on `IRC`_ and the `Mailing list`_.
659
660 Amending your PR
661 ----------------
662
663 While your PR is going through `Testing`_ and `Code review`_, you can
664 modify it at any time by editing files in your local branch.
665
666 After the changes are committed locally (to the ``fix_1`` branch in our
667 example), they need to be pushed to GitHub so they appear in the PR.
668
669 Modifying the PR is done by adding commits to the ``fix_1`` branch upon
670 which it is based, often followed by rebasing to modify the branch's git
671 history. See `this tutorial
672 <https://www.atlassian.com/git/tutorials/rewriting-history>`_ for a good
673 introduction to rebasing. When you are done with your modifications, you
674 will need to force push your branch with:
675
676 .. code::
677
678 $ git push --force origin fix_1
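
For example, folding a small review fix into the existing commit might look
like this (a sketch; the file path is hypothetical)::

    $ git add src/common/buffer.cc    # hypothetical file touched by the fix
    $ git commit --amend
    $ git push --force origin fix_1
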
679
680 Merge
681 -----
682
683 The bugfixing process culminates when one of the project leads decides to
684 merge your PR.
685
686 When this happens, it is a signal for you (or the lead who merged the PR)
687 to change the `Issue tracker`_ status to "Resolved". Some issues may be
688 flagged for backporting, in which case the status should be changed to
689 "Pending Backport" (see the `Backporting`_ chapter for details).
690
691
692 Testing
693 =======
694
695 Ceph has two types of tests: `make check`_ tests and integration tests.
The former are run via `GNU Make <https://www.gnu.org/software/make/>`_,
697 and the latter are run via the `teuthology framework`_. The following two
698 chapters examine the `make check`_ and integration tests in detail.
699
700 .. _`make check`:
701
702 Testing - make check
703 ====================
704
705 After compiling Ceph, the `make check`_ command can be used to run the
706 code through a battery of tests covering various aspects of Ceph. For
707 inclusion in `make check`_, a test must:
708
709 * bind ports that do not conflict with other tests
710 * not require root access
711 * not require more than one machine to run
712 * complete within a few minutes
713
714 While it is possible to run `make check`_ directly, it can be tricky to
715 correctly set up your environment. Fortunately, a script is provided to
make it easier to run `make check`_ on your code. It can be run from the
717 top-level directory of the Ceph source tree by doing::
718
719 $ ./run-make-check.sh
720
721 You will need a minimum of 8GB of RAM and 32GB of free disk space for this
722 command to complete successfully on x86_64 (other architectures may have
723 different constraints). Depending on your hardware, it can take from 20
724 minutes to three hours to complete, but it's worth the wait.
725
726 Caveats
727 -------
728
729 1. Unlike the various Ceph daemons and ``ceph-fuse``, the `make check`_ tests
730 are linked against the default memory allocator (glibc) unless explicitly
731 linked against something else. This enables tools like valgrind to be used
732 in the tests.
733
734 Testing - integration tests
735 ===========================
736
737 When a test requires multiple machines, root access or lasts for a
738 longer time (for example, to simulate a realistic Ceph deployment), it
739 is deemed to be an integration test. Integration tests are organized into
740 "suites", which are defined in the `ceph/qa sub-directory`_ and run with
741 the ``teuthology-suite`` command.
742
743 The ``teuthology-suite`` command is part of the `teuthology framework`_.
744 In the sections that follow we attempt to provide a detailed introduction
745 to that framework from the perspective of a beginning Ceph developer.
746
747 Teuthology consumes packages
748 ----------------------------
749
750 It may take some time to understand the significance of this fact, but it
751 is `very` significant. It means that automated tests can be conducted on
752 multiple platforms using the same packages (RPM, DEB) that can be
753 installed on any machine running those platforms.
754
755 Teuthology has a `list of platforms that it supports
756 <https://github.com/ceph/ceph/tree/master/qa/distros/supported>`_ (as
757 of March 2016 the list consisted of "CentOS 7.2" and "Ubuntu 14.04"). It
758 expects to be provided pre-built Ceph packages for these platforms.
759 Teuthology deploys these platforms on machines (bare-metal or
760 cloud-provisioned), installs the packages on them, and deploys Ceph
761 clusters on them - all as called for by the test.
762
763 The nightlies
764 -------------
765
766 A number of integration tests are run on a regular basis in the `Sepia
767 lab`_ against the official Ceph repositories (on the ``master`` development
768 branch and the stable branches). Traditionally, these tests are called "the
769 nightlies" because the Ceph core developers used to live and work in
770 the same time zone and from their perspective the tests were run overnight.
771
772 The results of the nightlies are published at http://pulpito.ceph.com/ and
http://pulpito.ovh.sepia.ceph.com:8081/. The developer's nick shows up in the
test results URL and in the first column of the Pulpito dashboard. The
775 results are also reported on the `ceph-qa mailing list
776 <https://ceph.com/irc/>`_ for analysis.
777
778 Suites inventory
779 ----------------
780
781 The ``suites`` directory of the `ceph/qa sub-directory`_ contains
782 all the integration tests, for all the Ceph components.
783
784 `ceph-deploy <https://github.com/ceph/ceph/tree/master/qa/suites/ceph-deploy>`_
785 install a Ceph cluster with ``ceph-deploy`` (`ceph-deploy man page`_)
786
787 `ceph-disk <https://github.com/ceph/ceph/tree/master/qa/suites/ceph-disk>`_
788 verify init scripts (upstart etc.) and udev integration with
789 ``ceph-disk`` (`ceph-disk man page`_), with and without `dmcrypt
790 <https://gitlab.com/cryptsetup/cryptsetup/wikis/DMCrypt>`_ support.
791
792 `dummy <https://github.com/ceph/ceph/tree/master/qa/suites/dummy>`_
793 get a machine, do nothing and return success (commonly used to
794 verify the integration testing infrastructure works as expected)
795
796 `fs <https://github.com/ceph/ceph/tree/master/qa/suites/fs>`_
797 test CephFS
798
799 `kcephfs <https://github.com/ceph/ceph/tree/master/qa/suites/kcephfs>`_
800 test the CephFS kernel module
801
802 `krbd <https://github.com/ceph/ceph/tree/master/qa/suites/krbd>`_
803 test the RBD kernel module
804
805 `powercycle <https://github.com/ceph/ceph/tree/master/qa/suites/powercycle>`_
806 verify the Ceph cluster behaves when machines are powered off
807 and on again
808
809 `rados <https://github.com/ceph/ceph/tree/master/qa/suites/rados>`_
810 run Ceph clusters including OSDs and MONs, under various conditions of
811 stress
812
813 `rbd <https://github.com/ceph/ceph/tree/master/qa/suites/rbd>`_
814 run RBD tests using actual Ceph clusters, with and without qemu
815
816 `rgw <https://github.com/ceph/ceph/tree/master/qa/suites/rgw>`_
817 run RGW tests using actual Ceph clusters
818
819 `smoke <https://github.com/ceph/ceph/tree/master/qa/suites/smoke>`_
820 run tests that exercise the Ceph API with an actual Ceph cluster
821
822 `teuthology <https://github.com/ceph/ceph/tree/master/qa/suites/teuthology>`_
823 verify that teuthology can run integration tests, with and without OpenStack
824
825 `upgrade <https://github.com/ceph/ceph/tree/master/qa/suites/upgrade>`_
826 for various versions of Ceph, verify that upgrades can happen
827 without disrupting an ongoing workload
828
829 .. _`ceph-deploy man page`: ../../man/8/ceph-deploy
830 .. _`ceph-disk man page`: ../../man/8/ceph-disk
831
832 teuthology-describe-tests
833 -------------------------
834
835 In February 2016, a new feature called ``teuthology-describe-tests`` was
836 added to the `teuthology framework`_ to facilitate documentation and better
837 understanding of integration tests (`feature announcement
838 <http://article.gmane.org/gmane.comp.file-systems.ceph.devel/29287>`_).
839
840 The upshot is that tests can be documented by embedding ``meta:``
841 annotations in the yaml files used to define the tests. The results can be
842 seen in the `ceph-qa-suite wiki
843 <http://tracker.ceph.com/projects/ceph-qa-suite/wiki/>`_.
844
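For example, an annotated test fragment might look something like this (the
description text here is illustrative, not taken from an actual suite)::

    meta:
    - desc: |
        Install a Ceph cluster and exercise the OSD admin socket commands.
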
845 Since this is a new feature, many yaml files have yet to be annotated.
846 Developers are encouraged to improve the documentation, in terms of both
847 coverage and quality.
848
849 How integration tests are run
850 -----------------------------
851
852 Given that - as a new Ceph developer - you will typically not have access
853 to the `Sepia lab`_, you may rightly ask how you can run the integration
854 tests in your own environment.
855
856 One option is to set up a teuthology cluster on bare metal. Though this is
857 a non-trivial task, it `is` possible. Here are `some notes
858 <http://docs.ceph.com/teuthology/docs/LAB_SETUP.html>`_ to get you started
859 if you decide to go this route.
860
861 If you have access to an OpenStack tenant, you have another option: the
862 `teuthology framework`_ has an OpenStack backend, which is documented `here
863 <https://github.com/dachary/teuthology/tree/openstack#openstack-backend>`__.
864 This OpenStack backend can build packages from a given git commit or
865 branch, provision VMs, install the packages and run integration tests
866 on those VMs. This process is controlled using a tool called
867 `ceph-workbench ceph-qa-suite`_. This tool also automates publishing of
868 test results at http://teuthology-logs.public.ceph.com.
869
870 Running integration tests on your code contributions and publishing the
871 results allows reviewers to verify that changes to the code base do not
872 cause regressions, or to analyze test failures when they do occur.
873
874 Every teuthology cluster, whether bare-metal or cloud-provisioned, has a
so-called "teuthology machine" from which test suites are triggered using the
876 ``teuthology-suite`` command.
877
878 A detailed and up-to-date description of each `teuthology-suite`_ option is
879 available by running the following command on the teuthology machine::
880
881 $ teuthology-suite --help
882
883 .. _teuthology-suite: http://docs.ceph.com/teuthology/docs/teuthology.suite.html
884
885 How integration tests are defined
886 ---------------------------------
887
888 Integration tests are defined by yaml files found in the ``suites``
889 subdirectory of the `ceph/qa sub-directory`_ and implemented by python
890 code found in the ``tasks`` subdirectory. Some tests ("standalone tests")
891 are defined in a single yaml file, while other tests are defined by a
892 directory tree containing yaml files that are combined, at runtime, into a
893 larger yaml file.
894
895 Reading a standalone test
896 -------------------------
897
898 Let us first examine a standalone test, or "singleton".
899
900 Here is a commented example using the integration test
901 `rados/singleton/all/admin-socket.yaml
902 <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/admin-socket.yaml>`_
903 ::
904
905 roles:
906 - - mon.a
907 - osd.0
908 - osd.1
909 tasks:
910 - install:
911 - ceph:
912 - admin_socket:
913 osd.0:
914 version:
915 git_version:
916 help:
917 config show:
918 config set filestore_dump_file /tmp/foo:
919 perf dump:
920 perf schema:
921
922 The ``roles`` array determines the composition of the cluster (how
923 many MONs, OSDs, etc.) on which this test is designed to run, as well
924 as how these roles will be distributed over the machines in the
925 testing cluster. In this case, there is only one element in the
926 top-level array: therefore, only one machine is allocated to the
927 test. The nested array declares that this machine shall run a MON with
928 id ``a`` (that is the ``mon.a`` in the list of roles) and two OSDs
929 (``osd.0`` and ``osd.1``).
930
931 The body of the test is in the ``tasks`` array: each element is
932 evaluated in order, causing the corresponding python file found in the
933 ``tasks`` subdirectory of the `teuthology repository`_ or
934 `ceph/qa sub-directory`_ to be run. "Running" in this case means calling
935 the ``task()`` function defined in that file.
936
937 In this case, the `install
938 <https://github.com/ceph/teuthology/blob/master/teuthology/task/install/__init__.py>`_
939 task comes first. It installs the Ceph packages on each machine (as
940 defined by the ``roles`` array). A full description of the ``install``
941 task is `found in the python file
942 <https://github.com/ceph/teuthology/blob/master/teuthology/task/install/__init__.py>`_
943 (search for "def task").
944
945 The ``ceph`` task, which is documented `here
946 <https://github.com/ceph/ceph/blob/master/qa/tasks/ceph.py>`__ (again,
947 search for "def task"), starts OSDs and MONs (and possibly MDSs as well)
948 as required by the ``roles`` array. In this example, it will start one MON
949 (``mon.a``) and two OSDs (``osd.0`` and ``osd.1``), all on the same
950 machine. Control moves to the next task when the Ceph cluster reaches
951 ``HEALTH_OK`` state.
952
953 The next task is ``admin_socket`` (`source code
954 <https://github.com/ceph/ceph/blob/master/qa/tasks/admin_socket.py>`_).
955 The parameter of the ``admin_socket`` task (and any other task) is a
956 structure which is interpreted as documented in the task. In this example
957 the parameter is a set of commands to be sent to the admin socket of
``osd.0``. The task verifies that each of them returns success (i.e.
959 exit code zero).
960
961 This test can be run with::
962
963 $ teuthology-suite --suite rados/singleton/all/admin-socket.yaml fs/ext4.yaml
964
965 Test descriptions
966 -----------------
967
968 Each test has a "test description", which is similar to a directory path,
969 but not the same. In the case of a standalone test, like the one in
970 `Reading a standalone test`_, the test description is identical to the
971 relative path (starting from the ``suites/`` directory of the
972 `ceph/qa sub-directory`_) of the yaml file defining the test.
973
974 Much more commonly, tests are defined not by a single yaml file, but by a
975 `directory tree of yaml files`. At runtime, the tree is walked and all yaml
976 files (facets) are combined into larger yaml "programs" that define the
977 tests. A full listing of the yaml defining the test is included at the
978 beginning of every test log.
979
980 In these cases, the description of each test consists of the
981 subdirectory under `suites/
982 <https://github.com/ceph/ceph/tree/master/qa/suites>`_ containing the
983 yaml facets, followed by an expression in curly braces (``{}``) consisting of
984 a list of yaml facets in order of concatenation. For instance the
985 test description::
986
987 ceph-disk/basic/{distros/centos_7.0.yaml tasks/ceph-disk.yaml}
988
989 signifies the concatenation of two files:
990
991 * ceph-disk/basic/distros/centos_7.0.yaml
992 * ceph-disk/basic/tasks/ceph-disk.yaml
993
994 How are tests built from directories?
995 -------------------------------------
996
997 As noted in the previous section, most tests are not defined in a single
998 yaml file, but rather as a `combination` of files collected from a
999 directory tree within the ``suites/`` subdirectory of the `ceph/qa sub-directory`_.
1000
1001 The set of all tests defined by a given subdirectory of ``suites/`` is
1002 called an "integration test suite", or a "teuthology suite".
1003
1004 Combination of yaml facets is controlled by special files (``%`` and
1005 ``+``) that are placed within the directory tree and can be thought of as
1006 operators. The ``%`` file is the "convolution" operator and ``+``
1007 signifies concatenation.
1008
1009 Convolution operator
1010 --------------------
1011
1012 The convolution operator, implemented as an empty file called ``%``, tells
1013 teuthology to construct a test matrix from yaml facets found in
1014 subdirectories below the directory containing the operator.
1015
1016 For example, the `ceph-disk suite
1017 <https://github.com/ceph/ceph/tree/jewel/qa/suites/ceph-disk/>`_ is
1018 defined by the ``suites/ceph-disk/`` tree, which consists of the files and
1019 subdirectories in the following structure::
1020
1021 directory: ceph-disk/basic
1022 file: %
1023 directory: distros
1024 file: centos_7.0.yaml
1025 file: ubuntu_14.04.yaml
1026 directory: tasks
1027 file: ceph-disk.yaml
1028
1029 This is interpreted as a 2x1 matrix consisting of two tests:
1030
1031 1. ceph-disk/basic/{distros/centos_7.0.yaml tasks/ceph-disk.yaml}
1032 2. ceph-disk/basic/{distros/ubuntu_14.04.yaml tasks/ceph-disk.yaml}
1033
1034 i.e. the concatenation of centos_7.0.yaml and ceph-disk.yaml and
1035 the concatenation of ubuntu_14.04.yaml and ceph-disk.yaml, respectively.
1036 In human terms, this means that the task found in ``ceph-disk.yaml`` is
1037 intended to run on both CentOS 7.0 and Ubuntu 14.04.
1038
Without the ``%`` file, the ``ceph-disk`` tree would be interpreted as
1040 three standalone tests:
1041
1042 * ceph-disk/basic/distros/centos_7.0.yaml
1043 * ceph-disk/basic/distros/ubuntu_14.04.yaml
1044 * ceph-disk/basic/tasks/ceph-disk.yaml
1045
1046 (which would of course be wrong in this case).
1047
1048 Referring to the `ceph/qa sub-directory`_, you will notice that the
1049 ``centos_7.0.yaml`` and ``ubuntu_14.04.yaml`` files in the
1050 ``suites/ceph-disk/basic/distros/`` directory are implemented as symlinks.
1051 By using symlinks instead of copying, a single file can appear in multiple
1052 suites. This eases the maintenance of the test framework as a whole.
1053
1054 All the tests generated from the ``suites/ceph-disk/`` directory tree
1055 (also known as the "ceph-disk suite") can be run with::
1056
1057 $ teuthology-suite --suite ceph-disk
1058
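To preview the test descriptions a suite would generate without scheduling
anything, a dry run can be requested (assuming your teuthology version
supports the ``--dry-run`` option)::

    $ teuthology-suite --dry-run --suite ceph-disk
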
1059 An individual test from the `ceph-disk suite`_ can be run by adding the
1060 ``--filter`` option::
1061
1062 $ teuthology-suite \
1063 --suite ceph-disk/basic \
1064 --filter 'ceph-disk/basic/{distros/ubuntu_14.04.yaml tasks/ceph-disk.yaml}'
1065
.. note:: To run a standalone test like the one in `Reading a standalone
   test`_, ``--suite`` alone is sufficient. If you want to run a single
   test from a suite that is defined as a directory tree, ``--suite`` must
   be combined with ``--filter``. This is because the ``--suite`` option
   understands POSIX relative paths only.
1071
1072 Concatenation operator
1073 ----------------------
1074
1075 For even greater flexibility in sharing yaml files between suites, the
1076 special file plus (``+``) can be used to concatenate files within a
1077 directory. For instance, consider the `suites/rbd/thrash
1078 <https://github.com/ceph/ceph/tree/master/qa/suites/rbd/thrash>`_
1079 tree::
1080
1081 directory: rbd/thrash
1082 file: %
1083 directory: clusters
1084 file: +
1085 file: fixed-2.yaml
1086 file: openstack.yaml
1087 directory: workloads
1088 file: rbd_api_tests_copy_on_read.yaml
1089 file: rbd_api_tests.yaml
1090
1091 This creates two tests:
1092
1093 * rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}
1094 * rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests.yaml}
1095
1096 Because the ``clusters/`` subdirectory contains the special file plus
1097 (``+``), all the other files in that subdirectory (``fixed-2.yaml`` and
1098 ``openstack.yaml`` in this case) are concatenated together
1099 and treated as a single file. Without the special file plus, they would
1100 have been convolved with the files from the workloads directory to create
1101 a 2x2 matrix:
1102
1103 * rbd/thrash/{clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}
1104 * rbd/thrash/{clusters/openstack.yaml workloads/rbd_api_tests.yaml}
1105 * rbd/thrash/{clusters/fixed-2.yaml workloads/rbd_api_tests_copy_on_read.yaml}
1106 * rbd/thrash/{clusters/fixed-2.yaml workloads/rbd_api_tests.yaml}
1107
1108 The ``clusters/fixed-2.yaml`` file is shared among many suites to
1109 define the following ``roles``::
1110
1111 roles:
1112 - [mon.a, mon.c, osd.0, osd.1, osd.2, client.0]
1113 - [mon.b, osd.3, osd.4, osd.5, client.1]
1114
The ``rbd/thrash`` suite as defined above (consisting of two tests)
1116 can be run with::
1117
1118 $ teuthology-suite --suite rbd/thrash
1119
1120 A single test from the rbd/thrash suite can be run by adding the
1121 ``--filter`` option::
1122
1123 $ teuthology-suite \
1124 --suite rbd/thrash \
1125 --filter 'rbd/thrash/{clusters/fixed-2.yaml clusters/openstack.yaml workloads/rbd_api_tests_copy_on_read.yaml}'
1126
1127 Filtering tests by their description
1128 ------------------------------------
1129
1130 When a few jobs fail and need to be run again, the ``--filter`` option
1131 can be used to select tests with a matching description. For instance, if the
1132 ``rados`` suite fails the `all/peer.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/peer.yaml>`_ test, the following will only run the tests that contain this file::
1133
1134 teuthology-suite --suite rados --filter all/peer.yaml
1135
1136 The ``--filter-out`` option does the opposite (it matches tests that do
1137 `not` contain a given string), and can be combined with the ``--filter``
1138 option.
1139
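For instance, something like the following would run the ``rados`` suite
while skipping every test whose description contains ``all/rest-api.yaml``::

    teuthology-suite --suite rados --filter-out all/rest-api.yaml
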
1140 Both ``--filter`` and ``--filter-out`` take a comma-separated list of strings (which
1141 means the comma character is implicitly forbidden in filenames found in the
1142 `ceph/qa sub-directory`_). For instance::
1143
1144 teuthology-suite --suite rados --filter all/peer.yaml,all/rest-api.yaml
1145
1146 will run tests that contain either
1147 `all/peer.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/peer.yaml>`_
1148 or
1149 `all/rest-api.yaml <https://github.com/ceph/ceph/blob/master/qa/suites/rados/singleton/all/rest-api.yaml>`_
1150
1151 Each string is looked up anywhere in the test description and has to
1152 be an exact match: they are not regular expressions.
1153
1154 Reducing the number of tests
1155 ----------------------------
1156
1157 The ``rados`` suite generates thousands of tests out of a few hundred
1158 files. This happens because teuthology constructs test matrices from
1159 subdirectories wherever it encounters a file named ``%``. For instance,
1160 all tests in the `rados/basic suite
1161 <https://github.com/ceph/ceph/tree/master/qa/suites/rados/basic>`_
1162 run with different messenger types: ``simple``, ``async`` and
1163 ``random``, because they are combined (via the special file ``%``) with
1164 the `msgr directory
<https://github.com/ceph/ceph/tree/master/qa/suites/rados/basic/msgr>`_.
1166
1167 All integration tests are required to be run before a Ceph release is published.
1168 When merely verifying whether a contribution can be merged without
1169 risking a trivial regression, it is enough to run a subset. The ``--subset`` option can be used to
1170 reduce the number of tests that are triggered. For instance::
1171
1172 teuthology-suite --suite rados --subset 0/4000
1173
1174 will run as few tests as possible. The tradeoff in this case is that
not all combinations of test variations will be run together,
1176 but no matter how small a ratio is provided in the ``--subset``,
1177 teuthology will still ensure that all files in the suite are in at
1178 least one test. Understanding the actual logic that drives this
1179 requires reading the teuthology source code.
1180
1181 The ``--limit`` option only runs the first ``N`` tests in the suite:
1182 this is rarely useful, however, because there is no way to control which
1183 test will be first.
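
If you do want to try it, the invocation mirrors the other options shown
above (a sketch)::

    teuthology-suite --suite rados --limit 5
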
1184
1185 Testing in the cloud
1186 ====================
1187
In this chapter, we will explain in detail how to use an OpenStack
1189 tenant as an environment for Ceph integration testing.
1190
1191 Assumptions and caveat
1192 ----------------------
1193
1194 We assume that:
1195
1196 1. you are the only person using the tenant
1197 2. you have the credentials
1198 3. the tenant supports the ``nova`` and ``cinder`` APIs
1199
1200 Caveat: be aware that, as of this writing (July 2016), testing in
1201 OpenStack clouds is a new feature. Things may not work as advertised.
1202 If you run into trouble, ask for help on `IRC`_ or the `Mailing list`_, or
1203 open a bug report at the `ceph-workbench bug tracker`_.
1204
1205 .. _`ceph-workbench bug tracker`: http://ceph-workbench.dachary.org/root/ceph-workbench/issues
1206
1207 Prepare tenant
1208 --------------
1209
1210 If you have not tried to use ``ceph-workbench`` with this tenant before,
1211 proceed to the next step.
1212
1213 To start with a clean slate, login to your tenant via the Horizon dashboard and:
1214
1215 * terminate the ``teuthology`` and ``packages-repository`` instances, if any
1216 * delete the ``teuthology`` and ``teuthology-worker`` security groups, if any
1217 * delete the ``teuthology`` and ``teuthology-myself`` key pairs, if any
1218
1219 Also do the above if you ever get key-related errors ("invalid key", etc.) when
1220 trying to schedule suites.
1221
1222 Getting ceph-workbench
1223 ----------------------
1224
1225 Since testing in the cloud is done using the `ceph-workbench
1226 ceph-qa-suite`_ tool, you will need to install that first. It is designed
1227 to be installed via Docker, so if you don't have Docker running on your
development machine, take care of that first. You can follow `the official
tutorial <https://docs.docker.com/engine/installation/>`_ to install it if
you have not done so already.
1231
1232 Once Docker is up and running, install ``ceph-workbench`` by following the
1233 `Installation instructions in the ceph-workbench documentation
1234 <http://ceph-workbench.readthedocs.org/en/latest/#installation>`_.
1235
1236 Linking ceph-workbench with your OpenStack tenant
1237 -------------------------------------------------
1238
1239 Before you can trigger your first teuthology suite, you will need to link
1240 ``ceph-workbench`` with your OpenStack account.
1241
First, download an ``openrc.sh`` file by clicking on the "Download OpenStack
1243 RC File" button, which can be found in the "API Access" tab of the "Access
1244 & Security" dialog of the OpenStack Horizon dashboard.
1245
1246 Second, create a ``~/.ceph-workbench`` directory, set its permissions to
1247 700, and move the ``openrc.sh`` file into it. Make sure that the filename
1248 is exactly ``~/.ceph-workbench/openrc.sh``.
1249
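For example, assuming the file was downloaded to ``~/Downloads/openrc.sh``
(adjust the path to wherever your browser saved it)::

    $ mkdir -p ~/.ceph-workbench
    $ chmod 700 ~/.ceph-workbench
    $ mv ~/Downloads/openrc.sh ~/.ceph-workbench/openrc.sh
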
1250 Third, edit the file so it does not ask for your OpenStack password
1251 interactively. Comment out the relevant lines and replace them with
1252 something like::
1253
1254 export OS_PASSWORD="aiVeth0aejee3eep8rogho3eep7Pha6ek"
1255
1256 When `ceph-workbench ceph-qa-suite`_ connects to your OpenStack tenant for
1257 the first time, it will generate two keypairs: ``teuthology-myself`` and
1258 ``teuthology``.
1259
1260 .. If this is not the first time you have tried to use
1261 .. `ceph-workbench ceph-qa-suite`_ with this tenant, make sure to delete any
1262 .. stale keypairs with these names!
1263
1264 Run the dummy suite
1265 -------------------
1266
1267 You are now ready to take your OpenStack teuthology setup for a test
1268 drive::
1269
1270 $ ceph-workbench ceph-qa-suite --suite dummy
1271
1272 Be forewarned that the first run of `ceph-workbench ceph-qa-suite`_ on a
1273 pristine tenant will take a long time to complete because it downloads a VM
1274 image and during this time the command may not produce any output.
1275
1276 The images are cached in OpenStack, so they are only downloaded once.
1277 Subsequent runs of the same command will complete faster.
1278
Although the ``dummy`` suite does not run any tests, in all other respects it
1280 behaves just like a teuthology suite and produces some of the same
1281 artifacts.
1282
1283 The last bit of output should look something like this::
1284
1285 pulpito web interface: http://149.202.168.201:8081/
1286 ssh access : ssh -i /home/smithfarm/.ceph-workbench/teuthology-myself.pem ubuntu@149.202.168.201 # logs in /usr/share/nginx/html
1287
1288 What this means is that `ceph-workbench ceph-qa-suite`_ triggered the test
1289 suite run. It does not mean that the suite run has completed. To monitor
1290 progress of the run, check the Pulpito web interface URL periodically, or
1291 if you are impatient, ssh to the teuthology machine using the ssh command
1292 shown and do::
1293
1294 $ tail -f /var/log/teuthology.*
1295
The ``/usr/share/nginx/html`` directory contains the complete logs of the
1297 test suite. If we had provided the ``--upload`` option to the
1298 `ceph-workbench ceph-qa-suite`_ command, these logs would have been
1299 uploaded to http://teuthology-logs.public.ceph.com.
1300
1301 Run a standalone test
1302 ---------------------
1303
1304 The standalone test explained in `Reading a standalone test`_ can be run
1305 with the following command::
1306
1307 $ ceph-workbench ceph-qa-suite --suite rados/singleton/all/admin-socket.yaml
1308
1309 This will run the suite shown on the current ``master`` branch of
1310 ``ceph/ceph.git``. You can specify a different branch with the ``--ceph``
1311 option, and even a different git repo with the ``--ceph-git-url`` option. (Run
1312 ``ceph-workbench ceph-qa-suite --help`` for an up-to-date list of available
1313 options.)
1314
1315 The first run of a suite will also take a long time, because ceph packages
have to be built first. Again, the packages so built are cached and
1317 `ceph-workbench ceph-qa-suite`_ will not build identical packages a second
1318 time.
1319
1320 Interrupt a running suite
1321 -------------------------
1322
1323 Teuthology suites take time to run. From time to time one may wish to
1324 interrupt a running suite. One obvious way to do this is::
1325
1326 ceph-workbench ceph-qa-suite --teardown
1327
1328 This destroys all VMs created by `ceph-workbench ceph-qa-suite`_ and
1329 returns the OpenStack tenant to a "clean slate".
1330
1331 Sometimes you may wish to interrupt the running suite, but keep the logs,
1332 the teuthology VM, the packages-repository VM, etc. To do this, you can
1333 ``ssh`` to the teuthology VM (using the ``ssh access`` command reported
1334 when you triggered the suite -- see `Run the dummy suite`_) and, once
1335 there::
1336
1337 sudo /etc/init.d/teuthology restart
1338
1339 This will keep the teuthology machine, the logs and the packages-repository
1340 instance but nuke everything else.
1341
1342 Upload logs to archive server
1343 -----------------------------
1344
1345 Since the teuthology instance in OpenStack is only semi-permanent, with limited
1346 space for storing logs, ``teuthology-openstack`` provides an ``--upload``
1347 option which, if included in the ``ceph-workbench ceph-qa-suite`` command,
1348 will cause logs from all failed jobs to be uploaded to the log archive server
1349 maintained by the Ceph project. The logs will appear at the URL::
1350
1351 http://teuthology-logs.public.ceph.com/$RUN
1352
1353 where ``$RUN`` is the name of the run. It will be a string like this::
1354
1355 ubuntu-2016-07-23_16:08:12-rados-hammer-backports---basic-openstack
1356
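For example, scheduling the dummy suite with log uploading enabled could look
like this::

    $ ceph-workbench ceph-qa-suite --suite dummy --upload
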
Even if you don't provide the ``--upload`` option, however, all the logs can
1358 still be found on the teuthology machine in the directory
1359 ``/usr/share/nginx/html``.
1360
1361 Provision VMs ad hoc
1362 --------------------
1363
1364 From the teuthology VM, it is possible to provision machines on an "ad hoc"
1365 basis, to use however you like. The magic incantation is::
1366
1367 teuthology-lock --lock-many $NUMBER_OF_MACHINES \
1368 --os-type $OPERATING_SYSTEM \
1369 --os-version $OS_VERSION \
1370 --machine-type openstack \
1371 --owner $EMAIL_ADDRESS
1372
1373 The command must be issued from the ``~/teuthology`` directory. The possible
values for ``OPERATING_SYSTEM`` and ``OS_VERSION`` can be found by examining
1375 the contents of the directory ``teuthology/openstack/``. For example::
1376
1377 teuthology-lock --lock-many 1 --os-type ubuntu --os-version 16.04 \
1378 --machine-type openstack --owner foo@example.com
1379
1380 When you are finished with the machine, find it in the list of machines::
1381
1382 openstack server list
1383
1384 to determine the name or ID, and then terminate it with::
1385
1386 openstack server delete $NAME_OR_ID
1387
1388 Deploy a cluster for manual testing
1389 -----------------------------------
1390
1391 The `teuthology framework`_ and `ceph-workbench ceph-qa-suite`_ are
1392 versatile tools that automatically provision Ceph clusters in the cloud and
1393 run various tests on them in an automated fashion. This enables a single
1394 engineer, in a matter of hours, to perform thousands of tests that would
1395 keep dozens of human testers occupied for days or weeks if conducted
1396 manually.
1397
1398 However, there are times when the automated tests do not cover a particular
1399 scenario and manual testing is desired. It turns out that it is simple to
1400 adapt a test to stop and wait after the Ceph installation phase, and the
1401 engineer can then ssh into the running cluster. Simply add the following
1402 snippet in the desired place within the test YAML and schedule a run with the
1403 test::
1404
1405 tasks:
1406 - exec:
1407 client.0:
1408 - sleep 1000000000 # forever
1409
1410 (Make sure you have a ``client.0`` defined in your ``roles`` stanza or adapt
1411 accordingly.)
1412
1413 The same effect can be achieved using the ``interactive`` task::
1414
1415 tasks:
1416 - interactive
1417
1418 By following the test log, you can determine when the test cluster has entered
1419 the "sleep forever" condition. At that point, you can ssh to the teuthology
1420 machine and from there to one of the target VMs (OpenStack) or teuthology
worker machines (Sepia) where the test cluster is running.
1422
1423 The VMs (or "instances" in OpenStack terminology) created by
1424 `ceph-workbench ceph-qa-suite`_ are named as follows:
1425
1426 ``teuthology`` - the teuthology machine
1427
1428 ``packages-repository`` - VM where packages are stored
1429
1430 ``ceph-*`` - VM where packages are built
1431
1432 ``target*`` - machines where tests are run
1433
1434 The VMs named ``target*`` are used by tests. If you are monitoring the
1435 teuthology log for a given test, the hostnames of these target machines can
1436 be found out by searching for the string ``Locked targets``::
1437
1438 2016-03-20T11:39:06.166 INFO:teuthology.task.internal:Locked targets:
1439 target149202171058.teuthology: null
1440 target149202171059.teuthology: null
1441
1442 The IP addresses of the target machines can be found by running ``openstack
1443 server list`` on the teuthology machine, but the target VM hostnames (e.g.
1444 ``target149202171058.teuthology``) are resolvable within the teuthology
1445 cluster.
1446
1447
1448 Testing - how to run s3-tests locally
1449 =====================================
1450
1451 RGW code can be tested by building Ceph locally from source, starting a vstart
1452 cluster, and running the "s3-tests" suite against it.
1453
1454 The following instructions should work on jewel and above.
1455
1456 Step 1 - build Ceph
1457 -------------------
1458
1459 Refer to :doc:`/install/build-ceph`.
1460
1461 You can do step 2 separately while it is building.
1462
1463 Step 2 - vstart
1464 ---------------
1465
When the build completes, and while still in the top-level directory of the
git clone where you built Ceph, do the following (for cmake builds)::
1468
1469 cd build/
1470 RGW=1 ../vstart.sh -n
1471
1472 This will produce a lot of output as the vstart cluster is started up. At the
1473 end you should see a message like::
1474
1475 started. stop.sh to stop. see out/* (e.g. 'tail -f out/????') for debug output.
1476
1477 This means the cluster is running.
1478
1479
1480 Step 3 - run s3-tests
1481 ---------------------
1482
To run the s3-tests suite, do the following::
1484
1485 $ ../qa/workunits/rgw/run-s3tests.sh
1486
1487 .. WIP
1488 .. ===
1489 ..
1490 .. Building RPM packages
1491 .. ---------------------
1492 ..
1493 .. Ceph is regularly built and packaged for a number of major Linux
1494 .. distributions. At the time of this writing, these included CentOS, Debian,
1495 .. Fedora, openSUSE, and Ubuntu.
1496 ..
1497 .. Architecture
1498 .. ============
1499 ..
1500 .. Ceph is a collection of components built on top of RADOS and provide
1501 .. services (RBD, RGW, CephFS) and APIs (S3, Swift, POSIX) for the user to
1502 .. store and retrieve data.
1503 ..
1504 .. See :doc:`/architecture` for an overview of Ceph architecture. The
1505 .. following sections treat each of the major architectural components
1506 .. in more detail, with links to code and tests.
1507 ..
1508 .. FIXME The following are just stubs. These need to be developed into
1509 .. detailed descriptions of the various high-level components (RADOS, RGW,
1510 .. etc.) with breakdowns of their respective subcomponents.
1511 ..
1512 .. FIXME Later, in the Testing chapter I would like to take another look
1513 .. at these components/subcomponents with a focus on how they are tested.
1514 ..
1515 .. RADOS
1516 .. -----
1517 ..
1518 .. RADOS stands for "Reliable, Autonomic Distributed Object Store". In a Ceph
1519 .. cluster, all data are stored in objects, and RADOS is the component responsible
1520 .. for that.
1521 ..
1522 .. RADOS itself can be further broken down into Monitors, Object Storage Daemons
1523 .. (OSDs), and client APIs (librados). Monitors and OSDs are introduced at
1524 .. :doc:`/start/intro`. The client library is explained at
1525 .. :doc:`/rados/api/index`.
1526 ..
1527 .. RGW
1528 .. ---
1529 ..
1530 .. RGW stands for RADOS Gateway. Using the embedded HTTP server civetweb_ or
1531 .. Apache FastCGI, RGW provides a REST interface to RADOS objects.
1532 ..
1533 .. .. _civetweb: https://github.com/civetweb/civetweb
1534 ..
1535 .. A more thorough introduction to RGW can be found at :doc:`/radosgw/index`.
1536 ..
1537 .. RBD
1538 .. ---
1539 ..
1540 .. RBD stands for RADOS Block Device. It enables a Ceph cluster to store disk
1541 .. images, and includes in-kernel code enabling RBD images to be mounted.
1542 ..
1543 .. To delve further into RBD, see :doc:`/rbd/rbd`.
1544 ..
1545 .. CephFS
1546 .. ------
1547 ..
1548 .. CephFS is a distributed file system that enables a Ceph cluster to be used as a NAS.
1549 ..
1550 .. File system metadata is managed by Meta Data Server (MDS) daemons. The Ceph
1551 .. file system is explained in more detail at :doc:`/cephfs/index`.
1552 ..