git.proxmox.com Git - ceph.git/blame - ceph/src/rocksdb/WINDOWS_PORT.md
update sources to ceph Nautilus 14.2.1
# Microsoft Contribution Notes

## Contributors
* Alexander Zinoviev https://github.com/zinoale
* Dmitri Smirnov https://github.com/yuslepukhin
* Praveen Rao https://github.com/PraveenSinghRao
* Sherlock Huang https://github.com/SherlockNoMad

## Introduction
RocksDB is a well-proven open source persistent key-value store, optimized for fast storage. It scales with the number of CPUs and with storage IOPS to support IO-bound, in-memory and write-once workloads and, most importantly, it is flexible enough to allow for innovation.

As the Microsoft Bing team, we have been continuously pushing hard to improve the scalability and efficiency of our platform and, eventually, Bing end-user satisfaction. We would like to embrace open source, RocksDB in this case: to use, enhance and customize it for our needs, and also to contribute back to the RocksDB community. Herein, we are pleased to offer this RocksDB port for the Windows platform.

These notes describe some of the decisions and changes we had to make while porting RocksDB to Windows. We hope this will help both reviewers and users of the Windows port.
We are open to comments and improvements.

## OS specifics
All of the porting, testing and benchmarking was done on Windows Server 2012 R2 Datacenter 64-bit. To the best of our knowledge, the port does not use any API that is unsupported on other Windows versions after Vista.

## Porting goals
We strive to achieve the following goals:
* make use of the existing porting interface of RocksDB
* make minimal modifications within platform-independent code
* make all unit tests pass in both debug and release builds
    * Note: the recent introduction of SyncPoint seems to prevent db_test from running in Release.
* achieve performance on par with published benchmarks, accounting for hardware differences
* keep the port code in line with the master branch, with no forking

## Build system
We have chosen CMake, a widely accepted build system, to build the Windows port. It is very fast and convenient.

It also generates Visual Studio projects that are usable both from the command line and from the IDE.

The top-level CMakeLists.txt file describes all targets and build rules. It also provides brief instructions on how to build the software for Windows. One more build-related file, thirdparty.inc, also resides at the top level. This file must be edited to point to the actual location of the third-party libraries.
We think that it would be beneficial to merge the existing make-based build system and the new CMake-based build system into a single one for use on all platforms.

All building and testing was done for 64-bit. We have not conducted any testing for 32-bit, and early reports indicate that the port will not run on 32-bit.

## C++ and STL notes
We had to make some minimal changes within the portable files to account either for OS differences or for shortcomings of C++11 support in the current version of the MS compiler. Most or all of these shortcomings are expected to be fixed in upcoming compiler releases.

We plan to use this port for our business purposes here at Bing, which provided the business justification for the port. It also means that, at present, we are not free to choose the compiler version at will.

* Certain headers that are neither present nor necessary on Windows (`unistd.h`) were simply guarded with `#ifndef OS_WIN` in a few places
* All POSIX-specific headers were replaced with port/port.h, which worked well
* Replaced `dirent.h` with `port/dirent.h` (very few places), with the relevant interfaces implemented within the `rocksdb::port` namespace
* Replaced `sys/time.h` with `port/sys_time.h` (few places), with equivalents implemented within `rocksdb::port`
* The `printf` `%z` specification is not supported on Windows. To imitate existing standards we came up with a string macro `ROCKSDB_PRIszt` which expands to `zu` on POSIX systems and to `Iu` on Windows.
* In-class member initializations were moved to constructors in some cases
* `constexpr` is not supported. We had to replace `std::numeric_limits<>::max/min()` with the corresponding C macros for constants. Sometimes we had to make class members `static const` and place a definition within a .cc file.
* A `constexpr` function was replaced with a template specialization (1 place)
* Union members that have non-trivial constructors were replaced with `char[]` in one place, along with bug fixes (spatial experimental feature)
* Zero-sized arrays are deemed a non-standard extension; we converted them to arrays of size 1, which should work well for the purposes of these classes.
* `std::chrono` lacks nanoseconds support (fixed in the upcoming release of the STL), so we had to use `QueryPerformanceCounter()` within env_win.cc
* Initialization of function-local statics is still not thread-safe. We used `std::call_once` within WinEnv to mitigate this.

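The `ROCKSDB_PRIszt` item in the list above can be illustrated with a minimal sketch. It assumes that `OS_WIN` is defined by the build on Windows, as the notes suggest; the exact guard used in the port may differ, and `FormatSize` is a hypothetical helper that is not part of RocksDB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Conversion specification for size_t, as described in the notes:
// "Iu" on Windows, "zu" on POSIX systems.
// Assumption: OS_WIN is defined by the build system on Windows.
#ifdef OS_WIN
#define ROCKSDB_PRIszt "Iu"
#else
#define ROCKSDB_PRIszt "zu"
#endif

// Hypothetical helper: formats a size_t portably. The macro is spliced into
// the format string by string-literal concatenation, the same way the PRI*
// macros from <cinttypes> are used.
std::string FormatSize(size_t n) {
  char buf[32];  // large enough for any 64-bit value plus the terminator
  int written = std::snprintf(buf, sizeof(buf), "%" ROCKSDB_PRIszt, n);
  assert(written > 0 && written < static_cast<int>(sizeof(buf)));
  (void)written;  // silence unused-variable warnings when NDEBUG is set
  return std::string(buf);
}
```

A call such as `FormatSize(12345)` yields the string `12345` on either platform, while the format string itself stays free of platform-specific `#ifdef` blocks at each call site.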
## Windows Environments notes
We endeavored to make the Windows environment functionally on par with posix_env. This means we replicated the functionality of the thread pool and other facilities as precisely as possible, including:
* Replicate POSIX logic using `std::thread` primitives.
* Implement all posix_env disk access functionality.
* Set `use_os_buffer=false` to disable OS disk buffering for WinWritableFile and WinRandomAccessFile.
* Replace `pread/pwrite` with `ReadFile/WriteFile` called with an `OVERLAPPED` structure.
* Use `SetFileInformationByHandle` to compensate for the absence of `fallocate`.

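As an illustration of the `pread` replacement mentioned in the list above, here is a minimal sketch of a positional read that uses `ReadFile` with an `OVERLAPPED` structure on Windows and `pread` elsewhere. `ReadAt` is a hypothetical helper written for this example; it is not the code from env_win.cc, and it opens the file with default buffering for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#else
#include <fcntl.h>
#include <unistd.h>
#endif

// Hypothetical helper: reads up to `n` bytes starting at byte `offset` of
// `path`, without depending on the file's current position.
// Returns the bytes read, or an empty string on error.
std::string ReadAt(const std::string& path, uint64_t offset, size_t n) {
  std::string buf(n, '\0');
#ifdef _WIN32
  HANDLE h = CreateFileA(path.c_str(), GENERIC_READ, FILE_SHARE_READ, nullptr,
                         OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
  if (h == INVALID_HANDLE_VALUE) return "";
  // The 64-bit offset is split across the two 32-bit OVERLAPPED fields.
  OVERLAPPED ov = {};
  ov.Offset = static_cast<DWORD>(offset & 0xFFFFFFFFu);
  ov.OffsetHigh = static_cast<DWORD>(offset >> 32);
  DWORD got = 0;
  // Sketch only: `n` is assumed to fit into a DWORD.
  BOOL ok = ReadFile(h, &buf[0], static_cast<DWORD>(n), &got, &ov);
  CloseHandle(h);
  if (!ok) return "";
#else
  int fd = open(path.c_str(), O_RDONLY);
  if (fd < 0) return "";
  ssize_t got = pread(fd, &buf[0], n, static_cast<off_t>(offset));
  close(fd);
  if (got < 0) return "";
#endif
  assert(static_cast<size_t>(got) <= n);
  buf.resize(static_cast<size_t>(got));
  return buf;
}
```

For a file containing `hello world`, `ReadAt(path, 6, 5)` returns `world`. Because the offset travels with each call, concurrent readers do not need to coordinate over a shared file pointer, which is the property `pread` provides on POSIX systems.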
### In detail
Even though Windows provides its own efficient thread-pool implementation, we chose to replicate the POSIX logic using `std::thread` primitives. This allows anyone to quickly detect changes within the POSIX source code and replicate them within the Windows env. This has proven to work very well. At the same time, anyone who wishes to replace the built-in thread pool can do so using RocksDB stackable environments.

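The general shape of a thread pool built only from `std::thread` primitives can be sketched as follows. This is a minimal illustration, not the implementation in the Windows env: `SimpleThreadPool` and its methods are hypothetical names, and features such as priorities or resizing the pool are left out. A caller constructs the pool with a thread count, hands jobs to `Schedule`, and lets the destructor drain the queue and join the workers.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool; names are illustrative only.
class SimpleThreadPool {
 public:
  explicit SimpleThreadPool(size_t num_threads) : stopping_(false) {
    assert(num_threads > 0);
    for (size_t i = 0; i < num_threads; ++i) {
      workers_.emplace_back([this] { WorkerLoop(); });
    }
  }

  // Lets the workers drain the remaining queue, then joins them.
  ~SimpleThreadPool() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stopping_ = true;
    }
    cv_.notify_all();
    for (std::thread& t : workers_) t.join();
  }

  SimpleThreadPool(const SimpleThreadPool&) = delete;
  SimpleThreadPool& operator=(const SimpleThreadPool&) = delete;

  // Queues a job to be run by any idle worker.
  void Schedule(std::function<void()> job) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      queue_.push_back(std::move(job));
    }
    cv_.notify_one();
  }

 private:
  void WorkerLoop() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
        if (queue_.empty()) return;  // only reachable once stopping_ is set
        job = std::move(queue_.front());
        queue_.pop_front();
      }
      job();  // run the job outside the lock
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> queue_;
  bool stopping_;
  std::vector<std::thread> workers_;  // declared last: started after the rest
};
```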
For disk access we implemented all of the functionality present within posix_env, which includes memory-mapped files, random access, rate-limiter support, etc.