===
NFS
===

CephFS namespaces can be exported over the NFS protocol using the
`NFS-Ganesha NFS server <https://github.com/nfs-ganesha/nfs-ganesha/wiki>`_.

Requirements
============

- Ceph file system (preferably the latest stable Luminous release or higher)
- On the NFS server host machine: the 'libcephfs2' (preferably the latest
  stable Luminous release or higher), 'nfs-ganesha' and 'nfs-ganesha-ceph'
  packages (latest Ganesha v2.5 stable or higher versions); an example
  install command follows this list
- NFS-Ganesha server host connected to the Ceph public network
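
An install sketch, assuming a yum-based distribution with the NFS-Ganesha
repositories already enabled (package names and repositories differ per
distribution)::

    yum install nfs-ganesha nfs-ganesha-ceph libcephfs2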

Configuring NFS-Ganesha to export CephFS
========================================

NFS-Ganesha provides a File System Abstraction Layer (FSAL) to plug in different
storage backends. `FSAL_CEPH <https://github.com/nfs-ganesha/nfs-ganesha/tree/next/src/FSAL/FSAL_CEPH>`_
is the plugin FSAL for CephFS. For each NFS-Ganesha export, FSAL_CEPH uses a
libcephfs client, the user-space CephFS client, to mount the CephFS path that
NFS-Ganesha exports.

Setting up NFS-Ganesha with CephFS involves setting up NFS-Ganesha's
configuration file, and also setting up a Ceph configuration file and cephx
access credentials for the Ceph clients created by NFS-Ganesha to access
CephFS.

NFS-Ganesha configuration
-------------------------

A sample ganesha.conf configured with FSAL_CEPH can be found here,
`<https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/config_samples/ceph.conf>`_.
It is suitable for a standalone NFS-Ganesha server, or an active/passive
configuration of NFS-Ganesha servers managed by some sort of clustering
software (e.g., Pacemaker). Important details about the options are
added as comments in the sample conf. There are options to do the following
(a minimal export sketch is shown after this list):

- minimize Ganesha caching wherever possible since the libcephfs clients
  (of FSAL_CEPH) also cache aggressively

- read from Ganesha config files stored in RADOS objects

- store client recovery data in the RADOS OMAP key-value interface

- mandate NFSv4.1+ access

- enable read delegations (need at least v13.0.1 'libcephfs2' package
  and v2.6.0 stable 'nfs-ganesha' and 'nfs-ganesha-ceph' packages)
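
For orientation only, a pared-down export block might look like the following
sketch; the cephx user name "ganesha" and the pseudo path "/cephfs" are
placeholders, and the linked sample conf remains the authoritative reference
for option names and defaults::

    NFSv4
    {
        # NFSv4.1+ is preferred; see the sample conf for the remaining knobs.
        Minor_Versions = 1, 2;
    }

    EXPORT
    {
        Export_ID = 100;
        Path = "/";             # CephFS path to export
        Pseudo = "/cephfs";     # NFSv4 pseudo path seen by clients
        Protocols = 4;
        Transports = TCP;
        Access_Type = RW;
        Squash = No_Root_Squash;

        FSAL
        {
            Name = CEPH;
            User_Id = "ganesha";    # cephx user the libcephfs client runs as
        }
    }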

Configuration for libcephfs clients
-----------------------------------

Required ceph.conf for libcephfs clients includes:

* a [client] section with ``mon_host`` option set to let the clients connect
  to the Ceph cluster's monitors, usually generated via ``ceph config generate-minimal-conf``, e.g., ::

        [global]
                mon host = [v2:192.168.1.7:3300,v1:192.168.1.7:6789], [v2:192.168.1.8:3300,v1:192.168.1.8:6789], [v2:192.168.1.9:3300,v1:192.168.1.9:6789]

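The cephx credentials for these clients can be created in the usual way; for
example (a sketch only, with "cephfs" as a placeholder file system name and
"ganesha" as a placeholder client name)::

    ceph fs authorize cephfs client.ganesha / rw
    ceph auth get client.ganesha -o /etc/ceph/ceph.client.ganesha.keyring

The resulting keyring must be readable on the NFS-Ganesha server host.
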
Mount using NFSv4 clients
=========================

It is preferred to mount the NFS-Ganesha exports using NFSv4.1+ protocols
to get the benefit of sessions.

Conventions for mounting NFS resources are platform-specific. The
following conventions work on Linux and some Unix platforms:

From the command line::

    mount -t nfs -o nfsvers=4.1,proto=tcp <ganesha-host-name>:<ganesha-pseudo-path> <mount-point>
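
For a persistent mount, an equivalent ``/etc/fstab`` entry can be used; the
host name and pseudo path below are placeholders::

    ganesha.example.com:/cephfs  /mnt/cephfs  nfs  nfsvers=4.1,proto=tcp  0  0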

Current limitations
===================

- Per running ganesha daemon, FSAL_CEPH can only export one Ceph file system,
  although multiple directories within that file system may be exported.

Exporting over NFS clusters deployed using rook
===============================================

This tutorial assumes you have a kubernetes cluster deployed. If not, `minikube
<https://kubernetes.io/docs/setup/learning-environment/minikube/>`_ can be used
to set up a single-node cluster. In this tutorial, minikube is used.
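
A minimal way to bring such a single-node cluster up (the resource sizes are
only a suggestion)::

    minikube start --memory=4g --cpus=2 --disk-size=40g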

.. note:: The configuration in this tutorial should not be used in a real
          production cluster. For the sake of simplicity, the security
          aspects of Ceph are overlooked in this setup.

`Rook <https://rook.io/docs/rook/master/ceph-quickstart.html>`_ Setup And Cluster Deployment
----------------------------------------------------------------------------------------------

Clone the rook repository::

    git clone https://github.com/rook/rook.git

Deploy the rook operator::

    cd cluster/examples/kubernetes/ceph
    kubectl create -f common.yaml
    kubectl create -f operator.yaml

.. note:: The Nautilus release or the latest Ceph image should be used.

Before proceeding, check if the pods are running::

    kubectl -n rook-ceph get pod


.. note::
   For troubleshooting any pod, use::

     kubectl describe -n rook-ceph pod <pod-name>

If using a minikube cluster, change the **dataDirHostPath** to **/data/rook**
in the cluster-test.yaml file. This makes sure data persists across reboots.
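
In cluster-test.yaml this is a one-line change (shown here as a sketch of the
relevant field only)::

    spec:
      dataDirHostPath: /data/rook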

Deploy the ceph cluster::

    kubectl create -f cluster-test.yaml

To interact with the Ceph daemons, let's deploy the toolbox::

    kubectl create -f ./toolbox.yaml

Exec into the rook-ceph-tools pod::

    kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash

Check if you have one Ceph monitor, manager, and OSD running, and that the
cluster is healthy::

    [root@minikube /]# ceph -s
      cluster:
        id:     3a30f44c-a9ce-4c26-9f25-cc6fd23128d0
        health: HEALTH_OK

      services:
        mon: 1 daemons, quorum a (age 14m)
        mgr: a(active, since 13m)
        osd: 1 osds: 1 up (since 13m), 1 in (since 13m)

      data:
        pools:   0 pools, 0 pgs
        objects: 0 objects, 0 B
        usage:   5.0 GiB used, 11 GiB / 16 GiB avail
        pgs:

.. note:: A single monitor should never be used in a real production
          deployment, as it is a single point of failure.

Create a Ceph File System
-------------------------
Using the ceph-mgr volumes module, we will create a Ceph file system::

    [root@minikube /]# ceph fs volume create myfs

By default, the replicated size of a pool is 3. Since we are using only one
OSD, this can cause errors. Let's fix this by setting the replicated size to 1::

    [root@minikube /]# ceph osd pool set cephfs.myfs.meta size 1
    [root@minikube /]# ceph osd pool set cephfs.myfs.data size 1

.. note:: The replicated size should never be less than 3 in a real production
          deployment.

Check the cluster status again::

    [root@minikube /]# ceph -s
      cluster:
        id:     3a30f44c-a9ce-4c26-9f25-cc6fd23128d0
        health: HEALTH_OK

      services:
        mon: 1 daemons, quorum a (age 27m)
        mgr: a(active, since 27m)
        mds: myfs:1 {0=myfs-a=up:active} 1 up:standby-replay
        osd: 1 osds: 1 up (since 56m), 1 in (since 56m)

      data:
        pools:   2 pools, 24 pgs
        objects: 22 objects, 2.2 KiB
        usage:   5.1 GiB used, 11 GiB / 16 GiB avail
        pgs:     24 active+clean

      io:
        client:   639 B/s rd, 1 op/s rd, 0 op/s wr

Create an NFS-Ganesha Server Cluster
------------------------------------
Add storage for the NFS-Ganesha servers to prevent recovery conflicts::

    [root@minikube /]# ceph osd pool create nfs-ganesha 64
    pool 'nfs-ganesha' created
    [root@minikube /]# ceph osd pool set nfs-ganesha size 1
    [root@minikube /]# ceph orch nfs add mynfs nfs-ganesha ganesha

Here we have created an NFS-Ganesha cluster called "mynfs" in the "ganesha"
namespace, backed by the "nfs-ganesha" OSD pool.
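
To verify that the ganesha daemons came up, you can list the daemons known to
the orchestrator and look for the nfs entries::

    [root@minikube /]# ceph orch ps | grep nfs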

Scale out the NFS-Ganesha cluster::

    [root@minikube /]# ceph orch nfs update mynfs 2

Configure NFS-Ganesha Exports
-----------------------------
Initially, rook creates a ClusterIP service for the dashboard. With this
service type, only pods in the same kubernetes cluster can access it.

Expose the Ceph Dashboard port::

    kubectl patch service -n rook-ceph -p '{"spec":{"type": "NodePort"}}' rook-ceph-mgr-dashboard
    kubectl get service -n rook-ceph rook-ceph-mgr-dashboard
    NAME                      TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
    rook-ceph-mgr-dashboard   NodePort   10.108.183.148   <none>        8443:31727/TCP   117m

This changes the service type to NodePort and makes the dashboard reachable
from outside the kubernetes cluster.

Create a JSON file for the dashboard::

    $ cat ~/export.json
    {
      "cluster_id": "mynfs",
      "path": "/",
      "fsal": {"name": "CEPH", "user_id": "admin", "fs_name": "myfs", "sec_label_xattr": null},
      "pseudo": "/cephfs",
      "tag": null,
      "access_type": "RW",
      "squash": "no_root_squash",
      "protocols": [4],
      "transports": ["TCP"],
      "security_label": true,
      "daemons": ["mynfs.a", "mynfs.b"],
      "clients": []
    }

.. note:: Don't use this JSON file in a real production deployment, as here
          the ganesha servers are given client.admin access rights.

We need to download and run this `script
<https://raw.githubusercontent.com/ceph/ceph/master/src/pybind/mgr/dashboard/run-backend-rook-api-request.sh>`_
to pass the JSON file contents. The dashboard creates the NFS-Ganesha export
file based on this JSON file::

    ./run-backend-rook-api-request.sh POST /api/nfs-ganesha/export "$(cat <json-file-path>)"

Expose the NFS Servers::

    kubectl patch service -n rook-ceph -p '{"spec":{"type": "NodePort"}}' rook-ceph-nfs-mynfs-a
    kubectl patch service -n rook-ceph -p '{"spec":{"type": "NodePort"}}' rook-ceph-nfs-mynfs-b
    kubectl get services -n rook-ceph rook-ceph-nfs-mynfs-a rook-ceph-nfs-mynfs-b
    NAME                    TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
    rook-ceph-nfs-mynfs-a   NodePort   10.101.186.111   <none>        2049:31013/TCP   72m
    rook-ceph-nfs-mynfs-b   NodePort   10.99.216.92     <none>        2049:31587/TCP   63m

.. note:: Ports are chosen at random by Kubernetes from a certain range.
          A specific port number can be set in the nodePort field of the
          spec, as sketched below.
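
For example, to pin the NFS port of one of the services to a fixed node port
(31013 here is only an illustration and must lie in the cluster's NodePort
range)::

    kubectl patch service -n rook-ceph rook-ceph-nfs-mynfs-a \
        -p '{"spec":{"ports":[{"port":2049,"nodePort":31013}]}}'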

Testing access to NFS Servers
-----------------------------
Open a root shell on the host and mount one of the NFS servers::

    mkdir -p /mnt/rook
    mount -t nfs -o port=31013 $(minikube ip):/cephfs /mnt/rook

Normal file operations can be performed on /mnt/rook if the mount is successful.
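
For a quick sanity check (the file name is hypothetical)::

    df -h /mnt/rook
    echo "hello from nfs" > /mnt/rook/hello.txt
    cat /mnt/rook/hello.txt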

.. note:: If minikube is used, then the VM host is the only client for the
          servers. In a real kubernetes cluster, multiple hosts can be used
          as clients, provided the kubernetes cluster node IP addresses are
          accessible to them.