Commit | Line | Data |
---|---|---|
214c8adb BH |
1 | =============================================================================== |
2 | WHAT IS EXOFS? | |
3 | =============================================================================== | |
4 | ||
5 | exofs is a file system that uses an OSD and exports the API of a normal Linux | |
6 | file system. Users access exofs like any other local file system, and exofs | |
7 | will in turn issue commands to the local OSD initiator. | |
8 | ||
9 | OSD is a new T10 command set that views storage devices not as a large/flat | |
10 | array of sectors but as a container of objects, each having a length, quota, | |
11 | time attributes and more. Each object is addressed by a 64-bit ID, and is | |
12 | contained in a partition identified by a 64-bit ID. Each object has attributes | |
13 | attached to it, which are an integral part of the object and provide metadata about | |
14 | the object. The standard defines some common obligatory attributes, but user | |
15 | attributes can be added as needed. | |
16 | ||
17 | =============================================================================== | |
18 | ENVIRONMENT | |
19 | =============================================================================== | |
20 | ||
21 | To use this file system, you need to have an object store to run it on. You | |
22 | may download a target from: | |
23 | http://open-osd.org | |
24 | ||
25 | See Documentation/scsi/osd.txt for how to set up a working OSD environment. | |
26 | ||
27 | =============================================================================== | |
28 | USAGE | |
29 | =============================================================================== | |
30 | ||
31 | 1. Download and compile exofs and open-osd initiator: | |
32 | You need an external kernel source tree or kernel headers from your | |
33 | distribution (anything based on 2.6.26 or later). | |
34 | ||
35 | a. Download open-osd, including the exofs source, using: | |
36 | [parent-directory]$ git clone git://git.open-osd.org/open-osd.git | |
37 | ||
38 | b. Build the library module like this: | |
39 | [parent-directory]$ make -C KSRC=$(KER_DIR) open-osd | |
40 | ||
41 | This builds both the open-osd initiator and the exofs kernel module. | |
42 | Use the same parameters you compiled your kernel with, and point | |
43 | $(KER_DIR) above at the kernel you compile against. See the file | |
44 | open-osd/top-level-Makefile for an example. | |
45 | ||
46 | 2. Get the OSD initiator and target set up properly, and log in to the target. | |
47 | See Documentation/scsi/osd.txt for further instructions. Also see ./do-osd | |
48 | for an example script that does all these steps. | |
49 | ||
50 | 3. Insmod the exofs.ko module: | |
51 | [exofs]$ insmod exofs.ko | |
52 | ||
53 | 4. Make sure the directory where you want to mount exists. If not, create it. | |
54 | (For example, mkdir /mnt/exofs) | |
55 | ||
56 | 5. On first use, run the mkfs.exofs application to create the file system. | |
57 | ||
58 | As an example, this will create the file system on: | |
59 | /dev/osd0 partition ID 65536 | |
60 | ||
61 | mkfs.exofs --pid=65536 --format /dev/osd0 | |
62 | ||
9f249162 TLSC |
63 | The --format option is optional. If not specified, no OSD_FORMAT will be |
64 | performed and a clean file system will be created in the specified pid, | |
214c8adb BH |
65 | in the available space of the target. (Use --format=size_in_meg to limit |
66 | the total LUN space available) | |
67 | ||
9f249162 TLSC |
68 | If pid already exists, it will be deleted and a new one will be created in |
69 | its place. Be careful. | |
214c8adb BH |
70 | |
71 | An exofs lives inside a single OSD partition. You can create multiple exofs | |
72 | filesystems on the same device using multiple pids. | |
73 | ||
74 | (run mkfs.exofs without any parameters for usage help message) | |
75 | ||
76 | 6. Mount the file system. | |
77 | ||
78 | For example, to mount /dev/osd0, partition ID 0x10000 (65536 decimal), on /mnt/exofs: | |
79 | ||
80 | mount -t exofs -o pid=65536 /dev/osd0 /mnt/exofs/ | |
81 | ||
82 | 7. For reference (see the do-exofs example script): | |
83 | do-exofs start - an example of how to perform the above steps. | |
9f249162 | 84 | do-exofs stop - an example of how to unmount the file system. |
214c8adb BH |
85 | do-exofs format - an example of how to format and mkfs a new exofs. |
86 | ||
87 | 8. Extra compilation flags (uncomment in fs/exofs/Kbuild): | |
88 | CONFIG_EXOFS_DEBUG - for debug messages and extra checks. | |
89 | ||
90 | =============================================================================== | |
91 | EXOFS MOUNT OPTIONS | |
92 | =============================================================================== | |
93 | Similar to any mount command: | |
94 | mount -t exofs -o exofs_options /dev/osdX mount_exofs_directory | |
95 | ||
96 | Where: | |
97 | -t exofs: specifies the exofs file system | |
98 | ||
99 | /dev/osdX: X is a decimal number. /dev/osdX is created after a successful | |
100 | login to an OSD target. | |
101 | ||
102 | mount_exofs_directory: The directory to mount the file system on | |
103 | ||
104 | exofs specific options: Options are separated by commas (,) | |
105 | pid=<integer> - The partition number to mount/create as | |
106 | container of the filesystem. | |
9ed96484 BH |
107 | This option is mandatory. The integer may be |
108 | given in hex by prepending 0x to the number. | |
109 | osdname=<id> - Mount by a device's osdname. | |
110 | osdname is usually a 36-character UUID of the | |
111 | form "d2683732-c906-4ee1-9dbd-c10c27bb40df". | |
112 | It is one of the device UUIDs specified in the | |
113 | mkfs.exofs format command. | |
114 | If this option is specified then the /dev/osdX | |
115 | above can be empty and is ignored. | |
9f249162 | 116 | to=<integer> - Timeout in ticks for a single command. |
214c8adb BH |
117 | The default is (60 * HZ) [for debugging only] |
118 | ||
119 | =============================================================================== | |
120 | DESIGN | |
121 | =============================================================================== | |
122 | ||
123 | * The file system control block (AKA on-disk superblock) resides in an object | |
124 | with a special ID (defined in common.h). | |
125 | Information included in the file system control block is used to fill the | |
126 | in-memory superblock structure at mount time. This object is created by | |
9f249162 | 127 | mkexofs.c before the file system is used. It contains information such as: |
214c8adb BH |
128 | - The file system's magic number |
129 | - The next inode number to be allocated | |
130 | ||
131 | * Each file resides in its own object, which contains the file's data. | |
132 | (Extending a file over multiple objects is planned but has not been | |
133 | implemented yet.) | |
134 | ||
135 | * A directory is treated as a file, and essentially contains a list of <file | |
136 | name, inode #> pairs for files that are found in that directory. The object | |
137 | IDs correspond to the files' inode numbers and are intended to be allocated | |
138 | from a bitmap (stored in a separate object). Currently they are allocated | |
139 | using a counter. | |
140 | ||
141 | * Each file's control block (AKA on-disk inode) is stored in its object's | |
142 | attributes. This applies to both regular files and other types (directories, | |
143 | device files, symlinks, etc.). | |
144 | ||
9f249162 TLSC |
145 | * Credentials are generated per object (inode and superblock) when they are |
146 | created in memory (read from disk or created). The credential works for all | |
214c8adb BH |
147 | operations and is used as long as the object remains in memory. |
148 | ||
149 | * Async OSD operations are used whenever possible, but the target may execute | |
150 | them out of order. The operations that concern us are create, delete, | |
151 | readpage, writepage, update_inode, and truncate. The following pairs of | |
152 | operations should execute in the order written, and we need to prevent them | |
153 | from executing in reverse order: | |
154 | - The following are handled with the OBJ_CREATED and OBJ_2BCREATED | |
155 | flags. OBJ_CREATED is set when we know the object exists on the OSD - | |
9f249162 TLSC |
156 | in create's callback function, and when we successfully do a |
157 | read_inode. | |
214c8adb BH |
158 | OBJ_2BCREATED is set in the beginning of the create function, so we |
159 | know that we should wait. | |
160 | - create/delete: delete should wait until the object is created | |
161 | on the OSD. | |
162 | - create/readpage: readpage should be able to return a page | |
163 | full of zeroes in this case. If there was a write already | |
164 | en-route (i.e. create, writepage, readpage) then the page | |
165 | would be locked, and so it would really be the same as | |
166 | create/writepage. | |
167 | - create/writepage: if writepage is called for a sync write, it | |
168 | should wait until the object is created on the OSD. | |
169 | Otherwise, it should just return. | |
170 | - create/truncate: truncate should wait until the object is | |
171 | created on the OSD. | |
172 | - create/update_inode: update_inode should wait until the | |
173 | object is created on the OSD. | |
174 | - Handled by VFS locks: | |
175 | - readpage/delete: shouldn't happen because of page lock. | |
176 | - writepage/delete: shouldn't happen because of page lock. | |
177 | - readpage/writepage: shouldn't happen because of page lock. | |
178 | ||
179 | =============================================================================== | |
180 | LICENSE/COPYRIGHT | |
181 | =============================================================================== | |
182 | The exofs file system is based on ext2 v0.5b (distributed with the Linux kernel | |
183 | version 2.6.10). All files include the original copyrights, and the license | |
184 | is GPL version 2 (only version 2, as is true for the Linux kernel). The | |
185 | Linux kernel can be downloaded from www.kernel.org. |
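
The create-ordering rules listed under DESIGN can be illustrated with a minimal sketch. It is a user-space Python model, not kernel code and not the exofs implementation. The flag names OBJ_2BCREATED and OBJ_CREATED come from the text above; the InodeState class, its methods, the PAGE_SIZE value, and the log list are hypothetical names introduced only for illustration.

```python
import threading

PAGE_SIZE = 4096      # assumed page size, for illustration only
OBJ_2BCREATED = 0x1   # create has been issued; others should know to wait
OBJ_CREATED = 0x2     # the object is known to exist on the OSD


class InodeState:
    """Hypothetical per-object state; not the real exofs structure."""

    def __init__(self):
        self.flags = 0
        self._cond = threading.Condition()

    def begin_create(self):
        # Set at the beginning of create, before the async command completes.
        with self._cond:
            self.flags |= OBJ_2BCREATED

    def create_done(self):
        # Models create's callback (or a successful read_inode).
        with self._cond:
            self.flags |= OBJ_CREATED
            self._cond.notify_all()

    def is_created(self):
        with self._cond:
            return bool(self.flags & OBJ_CREATED)

    def wait_created(self, timeout=None):
        # Block until the object is known to exist on the OSD.
        with self._cond:
            return bool(self._cond.wait_for(
                lambda: self.flags & OBJ_CREATED, timeout))


def delete(state, log):
    # create/delete: delete waits until the object is created.
    state.wait_created()
    log.append("delete")


def writepage(state, log, sync):
    # create/writepage: a sync write waits; otherwise it just returns.
    if not state.is_created():
        if not sync:
            return False
        state.wait_created()
    log.append("writepage")
    return True


def readpage(state, stored):
    # create/readpage: return a page of zeroes if the object is not there yet.
    if not state.is_created():
        return bytes(PAGE_SIZE)
    return stored
```

In this model, a delete issued right after begin_create() blocks until create_done() runs, so the two can never take effect in reverse order. truncate and update_inode would wait in the same way as delete.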