# iSCSI Target {#iscsi}

# iSCSI Target Getting Started Guide {#iscsi_getting_started}

The Storage Performance Development Kit iSCSI target application is named `iscsi_tgt`.
The following section describes how to run the iSCSI target from your cloned SPDK package.

## Prerequisites {#iscsi_prereqs}

This guide starts by assuming that you can already build the standard SPDK distribution on your
platform.

Once built, the binary will be in `app/iscsi_tgt`.

If you want to kill the application with a signal, use SIGTERM so that it releases all of its shared
memory resources before exiting. With SIGKILL, the application gets no chance to release its shared
memory resources, and you may need to release them manually.

## Introduction

The following diagram shows the relationships between the different parts of the iSCSI structure
described in this document.

![iSCSI structure](iscsi.svg)

## Configuring iSCSI Target via config file {#iscsi_config}

An `iscsi_tgt`-specific configuration file is used to configure the iSCSI target. A fully documented
example configuration file is located at `etc/spdk/iscsi.conf.in`.

The configuration file is used to configure the SPDK iSCSI target. This file defines the following:

- TCP ports to use as iSCSI portals
- general iSCSI parameters
- initiator names and addresses to allow access to iSCSI target nodes
- number and types of storage backends to export over iSCSI LUNs
- iSCSI target node mappings between portal groups, initiator groups, and LUNs

You should make a copy of the example configuration file, modify it to suit your environment, and
then run the iscsi_tgt application and pass it the configuration file using the -c option. Right now,
the target requires elevated privileges (root) to run.

~~~
app/iscsi_tgt/iscsi_tgt -c /path/to/iscsi.conf
~~~

### Assigning CPU Cores to the iSCSI Target {#iscsi_config_lcore}

SPDK uses the [DPDK Environment Abstraction Layer](http://dpdk.org/doc/guides/prog_guide/env_abstraction_layer.html)
to gain access to hardware resources such as huge memory pages and CPU core(s). DPDK EAL provides
functions to assign threads to specific cores.
To ensure the SPDK iSCSI target has the best performance, place the NICs and the NVMe devices on the
same NUMA node and configure the target to run on CPU cores associated with that node. The following
command line option is used to configure the SPDK iSCSI target:

~~~
-m 0xF000000
~~~

This is a hexadecimal bit mask of the CPU cores where the iSCSI target will start polling threads.
In this example, CPU cores 24, 25, 26 and 27 would be used.
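
For example, to combine this with the configuration file from the previous section and run the target
on cores 24-27:

~~~
app/iscsi_tgt/iscsi_tgt -c /path/to/iscsi.conf -m 0xF000000
~~~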

### Configuring a LUN in the iSCSI Target {#iscsi_lun}

Each LUN in an iSCSI target node is associated with an SPDK block device. See @ref bdev
for details on configuring SPDK block devices. The block device to LUN mappings are specified in the
configuration file as:

~~~~
[TargetNodeX]
  LUN0 Malloc0
  LUN1 Nvme0n1
~~~~

This exports a malloc'd LUN. The disk is a RAM disk: a chunk of memory allocated by the iSCSI target in
user space. If the system has enough DMA channels, an offload engine is used for the copy instead of
memcpy.
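
For a fuller picture, a target node section built around these LUN entries might look like the sketch
below. The key names shown here (TargetName, TargetAlias, Mapping, QueueDepth) are assumptions drawn
from the documented example; verify them against `etc/spdk/iscsi.conf.in` for your SPDK version.

~~~~
[TargetNodeX]
  # Sketch only -- verify the exact key names against etc/spdk/iscsi.conf.in
  TargetName disk1
  TargetAlias "Data Disk1"
  Mapping PortalGroup1 InitiatorGroup1
  QueueDepth 64
  LUN0 Malloc0
  LUN1 Nvme0n1
~~~~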

## Configuring iSCSI Target via RPC method {#iscsi_rpc}

In addition to the configuration file, the iSCSI target may also be configured via JSON-RPC calls. See
@ref jsonrpc for details.

### Portal groups

- add_portal_group -- Add a portal group.
- delete_portal_group -- Delete an existing portal group.
- add_pg_ig_maps -- Add initiator group to portal group mappings to an existing iSCSI target node.
- delete_pg_ig_maps -- Delete initiator group to portal group mappings from an existing iSCSI target node.
- get_portal_groups -- Show information about all available portal groups.

~~~
/path/to/spdk/scripts/rpc.py add_portal_group 1 10.0.0.1:3260
~~~
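
To verify the result, list the portal groups with the `get_portal_groups` method from the list above:

~~~
/path/to/spdk/scripts/rpc.py get_portal_groups
~~~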

### Initiator groups

- add_initiator_group -- Add an initiator group.
- delete_initiator_group -- Delete an existing initiator group.
- add_initiators_to_initiator_group -- Add initiators to an existing initiator group.
- get_initiator_groups -- Show information about all available initiator groups.

~~~
/path/to/spdk/scripts/rpc.py add_initiator_group 2 ANY 10.0.0.2/32
~~~
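
Similarly, `get_initiator_groups` shows the initiator groups that are currently defined:

~~~
/path/to/spdk/scripts/rpc.py get_initiator_groups
~~~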

### Target nodes

- construct_target_node -- Add an iSCSI target node.
- delete_target_node -- Delete an iSCSI target node.
- target_node_add_lun -- Add a LUN to an existing iSCSI target node.
- get_target_nodes -- Show information about all available iSCSI target nodes.

~~~
/path/to/spdk/scripts/rpc.py construct_target_node Target3 Target3_alias MyBdev:0 1:2 64 -d
~~~
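
For reference, the same command is shown below with a rough breakdown of its positional arguments;
treat the annotations as a guide and confirm the exact usage with `./scripts/rpc.py construct_target_node -h`.

~~~
# Rough breakdown of the arguments (confirm with ./scripts/rpc.py construct_target_node -h):
#   Target3        target node name
#   Target3_alias  target node alias
#   MyBdev:0       bdev name:LUN ID pair
#   1:2            portal group tag:initiator group tag mapping
#   64             queue depth
#   -d             disable CHAP authentication
/path/to/spdk/scripts/rpc.py construct_target_node Target3 Target3_alias MyBdev:0 1:2 64 -d
~~~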

## Configuring iSCSI Initiator {#iscsi_initiator}

The Linux initiator is open-iscsi.

Installing the open-iscsi package:

Fedora:
~~~
yum install -y iscsi-initiator-utils
~~~

Ubuntu:
~~~
apt-get install -y open-iscsi
~~~

### Setup

Edit /etc/iscsi/iscsid.conf:
~~~
node.session.cmds_max = 4096
node.session.queue_depth = 128
~~~

iscsid must be restarted or receive SIGHUP for changes to take effect. To send SIGHUP, run:
~~~
killall -HUP iscsid
~~~

Recommended changes to /etc/sysctl.conf:
~~~
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 0

net.ipv4.tcp_rmem = 10000000 10000000 10000000
net.ipv4.tcp_wmem = 10000000 10000000 10000000
net.ipv4.tcp_mem = 10000000 10000000 10000000
net.core.rmem_default = 524287
net.core.wmem_default = 524287
net.core.rmem_max = 524287
net.core.wmem_max = 524287
net.core.optmem_max = 524287
net.core.netdev_max_backlog = 300000
~~~
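
After editing /etc/sysctl.conf, the new values can be applied without a reboot, for example:

~~~
sysctl -p
~~~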

### Discovery

Assume the target is at 10.0.0.1:
~~~
iscsiadm -m discovery -t sendtargets -p 10.0.0.1
~~~

### Connect to target

~~~
iscsiadm -m node --login
~~~

At this point the iSCSI target should show up as SCSI disks. Check dmesg to see what
they came up as.
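
For example, to pull just the attach messages out of the kernel log:

~~~
dmesg | grep "Attached SCSI disk"
~~~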

### Disconnect from target

~~~
iscsiadm -m node --logout
~~~

### Deleting target node cache

~~~
iscsiadm -m node -o delete
~~~

This will cause the initiator to forget all previously discovered iSCSI target nodes.

### Finding /dev/sdX nodes for iSCSI LUNs

~~~
iscsiadm -m session -P 3 | grep "Attached scsi disk" | awk '{print $4}'
~~~

This will show the /dev node name for each SCSI LUN in all logged-in iSCSI sessions.

### Tuning

After the targets are connected, they can be tuned. For example, if /dev/sdc is
an iSCSI disk, then the following can be done.

Set the noop I/O scheduler:

~~~
echo noop > /sys/block/sdc/queue/scheduler
~~~

Disable merging/coalescing (can be useful for precise workload measurements):

~~~
echo "2" > /sys/block/sdc/queue/nomerges
~~~

Increase the number of requests for the block queue:

~~~
echo "1024" > /sys/block/sdc/queue/nr_requests
~~~
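
These settings are per device, so with many LUNs it can be convenient to apply them in a loop over the
disk names reported by iscsiadm. A minimal sketch, reusing the pipeline from the previous section
(run as root):

~~~
for dev in $(iscsiadm -m session -P 3 | grep "Attached scsi disk" | awk '{print $4}'); do
    echo noop > /sys/block/$dev/queue/scheduler
    echo "2" > /sys/block/$dev/queue/nomerges
    echo "1024" > /sys/block/$dev/queue/nr_requests
done
~~~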

### Example: Configure simple iSCSI Target with one portal and two LUNs

Assume we have one iSCSI target server with a portal at 10.0.0.1:3260, two LUNs (Malloc0 and Malloc1),
and accepting initiators on 10.0.0.2/32, as in the diagram below:

![Sample iSCSI configuration](iscsi_example.svg)

#### Configure iSCSI Target

Start the iscsi_tgt application:
```
$ ./app/iscsi_tgt/iscsi_tgt
```

Construct two 64MB Malloc block devices with 512B sector size, "Malloc0" and "Malloc1":

```
$ ./scripts/rpc.py construct_malloc_bdev -b Malloc0 64 512
$ ./scripts/rpc.py construct_malloc_bdev -b Malloc1 64 512
```

Create a new portal group with id 1 and address 10.0.0.1:3260:

```
$ ./scripts/rpc.py add_portal_group 1 10.0.0.1:3260
```

Create one initiator group with id 2 to accept any connection from 10.0.0.2/32:

```
$ ./scripts/rpc.py add_initiator_group 2 ANY 10.0.0.2/32
```

Finally, construct one target node named "disk1" with alias "Data Disk1", using the previously created
bdevs as LUN0 (Malloc0) and LUN1 (Malloc1), portal group 1, and initiator group 2:

```
$ ./scripts/rpc.py construct_target_node disk1 "Data Disk1" "Malloc0:0 Malloc1:1" 1:2 64 -d
```
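
The resulting configuration can be checked with the `get_target_nodes` method described earlier:

```
$ ./scripts/rpc.py get_target_nodes
```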

#### Configure initiator

Discover the target:

~~~
$ iscsiadm -m discovery -t sendtargets -p 10.0.0.1
10.0.0.1:3260,1 iqn.2016-06.io.spdk:disk1
~~~

Connect to the target:

~~~
$ iscsiadm -m node --login
~~~

At this point the iSCSI target should show up as SCSI disks.

Check dmesg to see what they came up as. In this example it might look like the following:

~~~
...
[630111.860078] scsi host68: iSCSI Initiator over TCP/IP
[630112.124743] scsi 68:0:0:0: Direct-Access INTEL Malloc disk 0001 PQ: 0 ANSI: 5
[630112.125445] sd 68:0:0:0: [sdd] 131072 512-byte logical blocks: (67.1 MB/64.0 MiB)
[630112.125468] sd 68:0:0:0: Attached scsi generic sg3 type 0
[630112.125926] sd 68:0:0:0: [sdd] Write Protect is off
[630112.125934] sd 68:0:0:0: [sdd] Mode Sense: 83 00 00 08
[630112.126049] sd 68:0:0:0: [sdd] Write cache: enabled, read cache: disabled, doesn't support DPO or FUA
[630112.126483] scsi 68:0:0:1: Direct-Access INTEL Malloc disk 0001 PQ: 0 ANSI: 5
[630112.127096] sd 68:0:0:1: Attached scsi generic sg4 type 0
[630112.127143] sd 68:0:0:1: [sde] 131072 512-byte logical blocks: (67.1 MB/64.0 MiB)
[630112.127566] sd 68:0:0:1: [sde] Write Protect is off
[630112.127573] sd 68:0:0:1: [sde] Mode Sense: 83 00 00 08
[630112.127728] sd 68:0:0:1: [sde] Write cache: enabled, read cache: disabled, doesn't support DPO or FUA
[630112.128246] sd 68:0:0:0: [sdd] Attached SCSI disk
[630112.129789] sd 68:0:0:1: [sde] Attached SCSI disk
...
~~~

You may also use a simple bash command to find the /dev/sdX nodes for each iSCSI LUN
in all logged-in iSCSI sessions:

~~~
$ iscsiadm -m session -P 3 | grep "Attached scsi disk" | awk '{print $4}'
sdd
sde
~~~

# iSCSI Hotplug {#iscsi_hotplug}

At the iSCSI level, we provide the following support for hotplug:

1. bdev/nvme:
  At the bdev/nvme level, we start a hotplug monitor which calls
  spdk_nvme_probe() periodically to pick up hotplug events. We provide private
  attach_cb and remove_cb callbacks for spdk_nvme_probe(). In attach_cb,
  we create a block device based on the NVMe device that was attached, and in
  remove_cb, we unregister the block device, which also notifies the
  upper level stack (for the iSCSI target, the upper level stack is scsi/lun) to
  handle the hot-remove event.

2. scsi/lun:
  When the LUN receives the hot-remove notification from the block device layer,
  the LUN is marked as removed and all I/Os submitted after this point return
  with CHECK CONDITION status. The LUN then starts a poller which waits for all
  commands that were already submitted to the block device to complete;
  after all of those commands complete, the LUN is deleted.

## Known bugs and limitations {#iscsi_hotplug_bugs}

For write commands, testing hotplug with a write that triggers R2T (for example, a 1 MB I/O)
will crash iscsi_tgt.
For read commands, testing hotplug with a large read (for example, a 1 MB I/O)
will probably crash iscsi_tgt.

@sa spdk_nvme_probe