# BlobFS (Blobstore Filesystem) {#blobfs}

# BlobFS Getting Started Guide {#blobfs_getting_started}

# RocksDB Integration {#blobfs_rocksdb}

Clone and build the SPDK repository as per https://github.com/spdk/spdk

~~~{.sh}
git clone https://github.com/spdk/spdk.git
cd spdk
./configure
make
~~~

Clone the RocksDB repository from the SPDK GitHub fork into a separate directory.
Make sure you check out the `spdk-v5.6.1` branch.

~~~{.sh}
cd ..
git clone -b spdk-v5.6.1 https://github.com/spdk/rocksdb.git
~~~

Build RocksDB. Only the `db_bench` benchmarking tool is integrated with BlobFS.
(Note: add `DEBUG_LEVEL=0` to the `make` command for a release build.)

~~~{.sh}
cd rocksdb
make db_bench SPDK_DIR=path/to/spdk
~~~

Create an NVMe section in the configuration file using SPDK's `gen_nvme.sh` script, run from the SPDK repository root.

~~~{.sh}
scripts/gen_nvme.sh > /usr/local/etc/spdk/rocksdb.conf
~~~

Verify that the configuration file specifies the correct NVMe SSD.
If there are any NVMe SSDs you do not wish to use for RocksDB/SPDK testing, remove them from the configuration file.
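
One way to review which devices the generated file refers to is to list the PCIe addresses it contains. The sketch below assumes the file uses the INI-style format in which each controller line carries a `traddr:<PCIe address>` field; the helper name `list_nvme_traddrs` is hypothetical and is not part of SPDK.

```sh
#!/bin/sh
# Print the PCIe addresses (traddr values) found in an SPDK configuration file.
# Assumes entries of the form: TransportID "trtype:PCIe traddr:0000:04:00.0" Nvme0
#   $1 - path to the configuration file
list_nvme_traddrs() {
    grep -o 'traddr:[0-9A-Fa-f:.]*' "$1" | sed 's/^traddr://'
}
```

Running `list_nvme_traddrs /usr/local/etc/spdk/rocksdb.conf` and comparing the result with the NVMe controllers shown by `lspci -D` should make it easier to spot devices that ought to be removed from the file.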

Make sure you have at least 5GB of memory allocated for huge pages.
By default, the SPDK `setup.sh` script only allocates 2GB.
The following will allocate 5GB of huge page memory (in addition to binding the NVMe devices to uio/vfio).

~~~{.sh}
HUGEMEM=5120 scripts/setup.sh
~~~
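
To confirm that the allocation took effect, the huge page totals reported by the kernel can be checked. The following is a minimal sketch that reads the `HugePages_Total` and `Hugepagesize` fields of `/proc/meminfo`; it only accounts for the default huge page size, and the function names are hypothetical helpers rather than part of SPDK.

```sh
#!/bin/sh
# Print the total huge page memory in MB, computed from a meminfo-style file.
# Defaults to /proc/meminfo; the optional argument exists so the function can
# be exercised against a saved copy.
hugepage_mb() {
    awk '
        /^HugePages_Total:/ { pages = $2 }
        /^Hugepagesize:/    { size_kb = $2 }
        END { printf "%d\n", pages * size_kb / 1024 }
    ' "${1:-/proc/meminfo}"
}

# Return non-zero if less than the 5GB (5120 MB) recommended above is allocated.
check_hugepages() {
    total=$(hugepage_mb "$@")
    if [ "$total" -ge 5120 ]; then
        echo "OK: ${total} MB of huge pages"
    else
        echo "WARNING: only ${total} MB of huge pages; rerun HUGEMEM=5120 scripts/setup.sh"
        return 1
    fi
}
```

Calling `check_hugepages` with no arguments after running `setup.sh` reports whether the 5GB target has been reached.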

Create an empty SPDK BlobFS for testing.

~~~{.sh}
test/blobfs/mkfs/mkfs /usr/local/etc/spdk/rocksdb.conf Nvme0n1
~~~

At this point, RocksDB is ready for testing with SPDK. Three `db_bench` parameters are used to configure SPDK:

1. `spdk` - Defines the name of the SPDK configuration file. If omitted, RocksDB will use the default PosixEnv implementation
   instead of SpdkEnv. (Required for SPDK testing)
2. `spdk_bdev` - Defines the name of the SPDK block device which contains the BlobFS to be used for testing. (Required)
3. `spdk_cache_size` - Defines the amount of userspace cache memory used by SPDK. Specified in terms of megabytes (MB).
   Default is 4096 (4GB). (Optional)
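
As an illustration of how the three parameters fit together, the sketch below assembles them into `db_bench` command-line flags. The helper name `spdk_db_bench_flags` is hypothetical, the `--name=value` flag syntax and the `fillseq` workload in the commented invocation are assumptions based on standard `db_bench` usage, and the configuration file and bdev name are the ones used earlier in this guide.

```sh
#!/bin/sh
# Echo the three SPDK-related db_bench flags described above.
#   $1 - SPDK configuration file   (spdk, required)
#   $2 - bdev holding the BlobFS   (spdk_bdev, required)
#   $3 - cache size in MB          (spdk_cache_size, optional; defaults to 4096)
spdk_db_bench_flags() {
    echo "--spdk=$1 --spdk_bdev=$2 --spdk_cache_size=${3:-4096}"
}

# Example invocation from the RocksDB directory (not executed here):
#   ./db_bench --benchmarks=fillseq \
#       $(spdk_db_bench_flags /usr/local/etc/spdk/rocksdb.conf Nvme0n1)
```

Leaving out the third argument keeps the documented 4096 MB default for the cache size.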

SPDK has a set of scripts which will run `db_bench` against a variety of workloads and capture performance and profiling
data. The primary script is `test/blobfs/rocksdb/run_tests.sh`.

# FUSE

BlobFS provides a FUSE plug-in to mount an SPDK BlobFS as a kernel filesystem for inspection or debug purposes.
The FUSE plug-in requires fuse3 and will be built automatically when fuse3 is detected on the system.

~~~{.sh}
test/blobfs/fuse/fuse /usr/local/etc/spdk/rocksdb.conf Nvme0n1 /mnt/fuse
~~~
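
Once the plug-in is running, the mount can be checked from a second terminal. This is a rough sketch rather than documented SPDK behavior: the helper `is_mounted` is hypothetical, and it assumes that the mount point directory already exists and that the mount appears in `/proc/mounts` like any other FUSE filesystem.

```sh
#!/bin/sh
# Succeed if the given path is listed as a mount point in a mounts table.
# Defaults to /proc/mounts; the optional second argument is only for testing.
#   $1 - mount point to look for, e.g. /mnt/fuse
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example usage (not executed here):
#   mkdir -p /mnt/fuse                        # before starting the plug-in
#   is_mounted /mnt/fuse && ls -l /mnt/fuse   # list the files stored in BlobFS
```

Because BlobFS has a flat namespace, the listing is expected to show files only, with no subdirectories.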

Note that the FUSE plug-in has some limitations - see the list below.

# Limitations

* BlobFS has primarily been tested with RocksDB so far, so any use cases different from how RocksDB uses a filesystem
  may run into issues. BlobFS will be tested in a broader range of use cases after this initial release.
* Only a synchronous API is currently supported. An asynchronous API has been developed but has not yet been thoroughly
  tested, so it is not yet part of the public interface. It will be added in a future release.
* File renames are not atomic. This will be fixed in a future release.
* BlobFS currently supports only a flat namespace for files with no directory support. Filenames are currently stored
  as xattrs in each blob. This means that filename lookup is an O(n) operation in the number of files. An SPDK btree
  implementation is underway which will be the underpinning for BlobFS directory support in a future release.
* Writes to a file must always append to the end of the file. Support for writes to any location within the file
  will be added in a future release.