===
NFS
===

CephFS namespaces can be exported over the NFS protocol using the
`NFS-Ganesha NFS server <https://github.com/nfs-ganesha/nfs-ganesha/wiki>`_.

Requirements
============

- Ceph file system (preferably the latest stable Luminous release or higher)
- On the NFS server host machine: the 'libcephfs2' (preferably the latest
  stable Luminous release or higher), 'nfs-ganesha' and 'nfs-ganesha-ceph'
  packages (latest Ganesha v2.5 stable or higher versions); an example
  install command follows this list
- NFS-Ganesha server host connected to the Ceph public network
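
An install sketch, assuming a yum-based distribution with the NFS-Ganesha
repositories already enabled (package names and repositories differ per
distribution)::

    yum install nfs-ganesha nfs-ganesha-ceph libcephfs2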

Configuring NFS-Ganesha to export CephFS
========================================

NFS-Ganesha provides a File System Abstraction Layer (FSAL) to plug in different
storage backends. `FSAL_CEPH <https://github.com/nfs-ganesha/nfs-ganesha/tree/next/src/FSAL/FSAL_CEPH>`_
is the plugin FSAL for CephFS. For each NFS-Ganesha export, FSAL_CEPH uses a
libcephfs client, the user-space CephFS client, to mount the CephFS path that
NFS-Ganesha exports.

Setting up NFS-Ganesha with CephFS involves setting up NFS-Ganesha's
configuration file, and also setting up a Ceph configuration file and cephx
access credentials for the Ceph clients created by NFS-Ganesha to access
CephFS.

NFS-Ganesha configuration
-------------------------

A sample ganesha.conf configured with FSAL_CEPH can be found here,
`<https://github.com/nfs-ganesha/nfs-ganesha/blob/next/src/config_samples/ceph.conf>`_.
It is suitable for a standalone NFS-Ganesha server, or an active/passive
configuration of NFS-Ganesha servers managed by some sort of clustering
software (e.g., Pacemaker). Important details about the options are
added as comments in the sample conf. There are options to do the following
(a minimal export sketch is shown after this list):

- minimize Ganesha caching wherever possible since the libcephfs clients
  (of FSAL_CEPH) also cache aggressively

- read from Ganesha config files stored in RADOS objects

- store client recovery data in the RADOS OMAP key-value interface

- mandate NFSv4.1+ access

- enable read delegations (need at least v13.0.1 'libcephfs2' package
  and v2.6.0 stable 'nfs-ganesha' and 'nfs-ganesha-ceph' packages)
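
For orientation only, a pared-down export block might look like the following
sketch; the cephx user name "ganesha" and the pseudo path "/cephfs" are
placeholders, and the linked sample conf remains the authoritative reference
for option names and defaults::

    NFSv4
    {
        # NFSv4.1+ is preferred; see the sample conf for the remaining knobs.
        Minor_Versions = 1, 2;
    }

    EXPORT
    {
        Export_ID = 100;
        Path = "/";             # CephFS path to export
        Pseudo = "/cephfs";     # NFSv4 pseudo path seen by clients
        Protocols = 4;
        Transports = TCP;
        Access_Type = RW;
        Squash = No_Root_Squash;

        FSAL
        {
            Name = CEPH;
            User_Id = "ganesha";    # cephx user the libcephfs client runs as
        }
    }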

Configuration for libcephfs clients
-----------------------------------

Required ceph.conf for libcephfs clients includes:

* a [client] section with ``mon_host`` option set to let the clients connect
  to the Ceph cluster's monitors, usually generated via ``ceph config generate-minimal-conf``, e.g., ::

        [global]
                mon host = [v2:192.168.1.7:3300,v1:192.168.1.7:6789], [v2:192.168.1.8:3300,v1:192.168.1.8:6789], [v2:192.168.1.9:3300,v1:192.168.1.9:6789]

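The cephx credentials for these clients can be created in the usual way; for
example (a sketch only, with "cephfs" as a placeholder file system name and
"ganesha" as a placeholder client name)::

    ceph fs authorize cephfs client.ganesha / rw
    ceph auth get client.ganesha -o /etc/ceph/ceph.client.ganesha.keyring

The resulting keyring must be readable on the NFS-Ganesha server host.
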
Mount using NFSv4 clients
=========================

It is preferred to mount the NFS-Ganesha exports using NFSv4.1+ protocols
to get the benefit of sessions.

Conventions for mounting NFS resources are platform-specific. The
following conventions work on Linux and some Unix platforms:

From the command line::

    mount -t nfs -o nfsvers=4.1,proto=tcp <ganesha-host-name>:<ganesha-pseudo-path> <mount-point>
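
For a persistent mount, an equivalent ``/etc/fstab`` entry can be used; the
host name and pseudo path below are placeholders::

    ganesha.example.com:/cephfs  /mnt/cephfs  nfs  nfsvers=4.1,proto=tcp  0  0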

Current limitations
===================

- Per running ganesha daemon, FSAL_CEPH can only export one Ceph file system,
  although multiple directories within that file system may be exported.

Exporting over NFS clusters deployed using rook
===============================================

This tutorial assumes you have a kubernetes cluster deployed. If not, `minikube
<https://kubernetes.io/docs/setup/learning-environment/minikube/>`_ can be used
to set up a single-node cluster. In this tutorial, minikube is used.
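
A minimal way to bring such a single-node cluster up (the resource sizes are
only a suggestion)::

    minikube start --memory=4g --cpus=2 --disk-size=40g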

.. note:: The configuration in this tutorial should not be used in a real
          production cluster. For the sake of simplicity, the security
          aspects of Ceph are overlooked in this setup.

`Rook <https://rook.io/docs/rook/master/ceph-quickstart.html>`_ Setup And Cluster Deployment
----------------------------------------------------------------------------------------------

Clone the rook repository::

    git clone https://github.com/rook/rook.git

Deploy the rook operator::

    cd cluster/examples/kubernetes/ceph
    kubectl create -f common.yaml
    kubectl create -f operator.yaml

.. note:: The Nautilus release or the latest Ceph image should be used.

Before proceeding, check if the pods are running::

    kubectl -n rook-ceph get pod


.. note::
   For troubleshooting any pod, use::

     kubectl describe -n rook-ceph pod <pod-name>

If using a minikube cluster, change the **dataDirHostPath** to **/data/rook**
in the cluster-test.yaml file. This makes sure data persists across reboots.
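
In cluster-test.yaml this is a one-line change (shown here as a sketch of the
relevant field only)::

    spec:
      dataDirHostPath: /data/rook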

Deploy the ceph cluster::

    kubectl create -f cluster-test.yaml

To interact with the Ceph daemons, let's deploy the toolbox::

    kubectl create -f ./toolbox.yaml

Exec into the rook-ceph-tools pod::

    kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash

Check if you have one Ceph monitor, manager, and OSD running, and that the
cluster is healthy::

    [root@minikube /]# ceph -s
      cluster:
        id:     3a30f44c-a9ce-4c26-9f25-cc6fd23128d0
        health: HEALTH_OK

      services:
        mon: 1 daemons, quorum a (age 14m)
        mgr: a(active, since 13m)
        osd: 1 osds: 1 up (since 13m), 1 in (since 13m)

      data:
        pools:   0 pools, 0 pgs
        objects: 0 objects, 0 B
        usage:   5.0 GiB used, 11 GiB / 16 GiB avail
        pgs:

.. note:: A single monitor should never be used in a real production
          deployment, as it is a single point of failure.

Create a Ceph File System
-------------------------
Using the ceph-mgr volumes module, we will create a Ceph file system::

    [root@minikube /]# ceph fs volume create myfs

By default, the replicated size of a pool is 3. Since we are using only one
OSD, this can cause errors. Let's fix this by setting the replicated size to 1::

    [root@minikube /]# ceph osd pool set cephfs.myfs.meta size 1
    [root@minikube /]# ceph osd pool set cephfs.myfs.data size 1

.. note:: The replicated size should never be less than 3 in a real production
          deployment.

Check the cluster status again::

    [root@minikube /]# ceph -s
      cluster:
        id:     3a30f44c-a9ce-4c26-9f25-cc6fd23128d0
        health: HEALTH_OK

      services:
        mon: 1 daemons, quorum a (age 27m)
        mgr: a(active, since 27m)
        mds: myfs:1 {0=myfs-a=up:active} 1 up:standby-replay
        osd: 1 osds: 1 up (since 56m), 1 in (since 56m)

      data:
        pools:   2 pools, 24 pgs
        objects: 22 objects, 2.2 KiB
        usage:   5.1 GiB used, 11 GiB / 16 GiB avail
        pgs:     24 active+clean

      io:
        client:   639 B/s rd, 1 op/s rd, 0 op/s wr

Create an NFS-Ganesha Server Cluster
------------------------------------
Add storage for the NFS-Ganesha servers to prevent recovery conflicts::

    [root@minikube /]# ceph osd pool create nfs-ganesha 64
    pool 'nfs-ganesha' created
    [root@minikube /]# ceph osd pool set nfs-ganesha size 1
    [root@minikube /]# ceph orch nfs add mynfs nfs-ganesha ganesha

Here we have created an NFS-Ganesha cluster called "mynfs" in the "ganesha"
namespace, backed by the "nfs-ganesha" OSD pool.
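
To verify that the ganesha daemons came up, you can list the daemons known to
the orchestrator and look for the nfs entries::

    [root@minikube /]# ceph orch ps | grep nfs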

Scale out the NFS-Ganesha cluster::

    [root@minikube /]# ceph orch nfs update mynfs 2

Configure NFS-Ganesha Exports
-----------------------------
Initially, rook creates a ClusterIP service for the dashboard. With this
service type, only pods in the same kubernetes cluster can access it.

Expose the Ceph Dashboard port::

    kubectl patch service -n rook-ceph -p '{"spec":{"type": "NodePort"}}' rook-ceph-mgr-dashboard
    kubectl get service -n rook-ceph rook-ceph-mgr-dashboard
    NAME                      TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
    rook-ceph-mgr-dashboard   NodePort   10.108.183.148   <none>        8443:31727/TCP   117m

This changes the service type to NodePort and makes the dashboard reachable
from outside the kubernetes cluster.

Create a JSON file for the dashboard::

    $ cat ~/export.json
    {
      "cluster_id": "mynfs",
      "path": "/",
      "fsal": {"name": "CEPH", "user_id": "admin", "fs_name": "myfs", "sec_label_xattr": null},
      "pseudo": "/cephfs",
      "tag": null,
      "access_type": "RW",
      "squash": "no_root_squash",
      "protocols": [4],
      "transports": ["TCP"],
      "security_label": true,
      "daemons": ["mynfs.a", "mynfs.b"],
      "clients": []
    }

.. note:: Don't use this JSON file in a real production deployment, as here
          the ganesha servers are given client.admin access rights.

We need to download and run this `script
<https://raw.githubusercontent.com/ceph/ceph/master/src/pybind/mgr/dashboard/run-backend-rook-api-request.sh>`_
to pass the JSON file contents. The dashboard creates the NFS-Ganesha export
file based on this JSON file::

    ./run-backend-rook-api-request.sh POST /api/nfs-ganesha/export "$(cat <json-file-path>)"

Expose the NFS Servers::

    kubectl patch service -n rook-ceph -p '{"spec":{"type": "NodePort"}}' rook-ceph-nfs-mynfs-a
    kubectl patch service -n rook-ceph -p '{"spec":{"type": "NodePort"}}' rook-ceph-nfs-mynfs-b
    kubectl get services -n rook-ceph rook-ceph-nfs-mynfs-a rook-ceph-nfs-mynfs-b
    NAME                    TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
    rook-ceph-nfs-mynfs-a   NodePort   10.101.186.111   <none>        2049:31013/TCP   72m
    rook-ceph-nfs-mynfs-b   NodePort   10.99.216.92     <none>        2049:31587/TCP   63m

.. note:: Ports are chosen at random by Kubernetes from a certain range.
          A specific port number can be set in the nodePort field of the
          spec, as sketched below.
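
For example, to pin the NFS port of one of the services to a fixed node port
(31013 here is only an illustration and must lie in the cluster's NodePort
range)::

    kubectl patch service -n rook-ceph rook-ceph-nfs-mynfs-a \
        -p '{"spec":{"ports":[{"port":2049,"nodePort":31013}]}}'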

Testing access to NFS Servers
-----------------------------
Open a root shell on the host and mount one of the NFS servers::

    mkdir -p /mnt/rook
    mount -t nfs -o port=31013 $(minikube ip):/cephfs /mnt/rook

Normal file operations can be performed on /mnt/rook if the mount is successful.
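
For a quick sanity check (the file name is hypothetical)::

    df -h /mnt/rook
    echo "hello from nfs" > /mnt/rook/hello.txt
    cat /mnt/rook/hello.txt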

.. note:: If minikube is used, then the VM host is the only client for the
          servers. In a real kubernetes cluster, multiple hosts can be used
          as clients, provided the kubernetes cluster node IP addresses are
          accessible to them.