src/doc/book/rust-inside-other-languages.md

% Rust Inside Other Languages

For our third project, we’re going to choose something that shows off one of
Rust’s greatest strengths: a lack of a substantial runtime.

As organizations grow, they increasingly rely on a multitude of programming
languages. Different programming languages have different strengths and
weaknesses, and a polyglot stack lets you use a particular language where
its strengths make sense and a different one where it’s weak.

A very common area where many programming languages are weak is in runtime
performance of programs. Often, using a language that is slower, but offers
greater programmer productivity, is a worthwhile trade-off. To help mitigate
the performance cost, these languages provide a way to write some of your
system in C and then call that C code as though it were written in the
higher-level language. This is called a ‘foreign function interface’, often
shortened to ‘FFI’.

Rust has support for FFI in both directions: it can call into C code easily,
but crucially, it can also be called _into_ as easily as C. Combined with
Rust’s lack of a garbage collector and low runtime requirements, this makes
Rust a great candidate to embed inside of other languages when you need
that extra oomph.

There is a whole [chapter devoted to FFI][ffi] and its specifics elsewhere in
the book, but in this chapter, we’ll examine this particular use-case of FFI,
with examples in Ruby, Python, and JavaScript.

[ffi]: ffi.html

# The problem

There are many different projects we could choose here, but we’re going to
pick an example where Rust has a clear advantage over many other languages:
numeric computing and threading.

First, many languages, for the sake of consistency, place numbers on the heap,
rather than on the stack. Especially in languages that focus on object-oriented
programming and use garbage collection, heap allocation is the default. Sometimes
optimizations can stack allocate particular numbers, but rather than relying
on an optimizer to do its job, we may want to ensure that we’re always using
primitive number types rather than some sort of object type.

Second, many languages have a ‘global interpreter lock’ (GIL), which limits
concurrency in many situations. This is done in the name of safety, which is
a positive effect, but it limits the amount of work that can be done at the
same time, which is a big negative.

To emphasize these two aspects, we’re going to create a little project that
uses both of them heavily. Since the focus of the example is to embed
Rust into other languages, rather than the problem itself, we’ll just use a
toy example:

> Start ten threads. Inside each thread, count from one to five million. After
> all ten threads are finished, print out ‘done!’.

I chose five million based on my particular computer. Here’s an example of this
code in Ruby:

```ruby
threads = []

10.times do
  threads << Thread.new do
    count = 0

    5_000_000.times do
      count += 1
    end

    count
  end
end

threads.each do |t|
  puts "Thread finished with count=#{t.value}"
end
puts "done!"
```

Try running this example, and choose a number that runs for a few seconds.
Depending on your computer’s hardware, you may have to increase or decrease the
number.

On my system, running this program takes `2.156` seconds. And, if I use some
sort of process monitoring tool, like `top`, I can see that it only uses one
core on my machine. That’s the GIL kicking in.

While it’s true that this is a synthetic program, one can imagine many problems
that are similar to this in the real world. For our purposes, spinning up a few
busy threads represents some sort of parallel, expensive computation.

# A Rust library

Let’s rewrite this problem in Rust. First, let’s make a new project with
Cargo:

```bash
$ cargo new embed
$ cd embed
```

This program is fairly easy to write in Rust:

```rust
use std::thread;

fn process() {
    let handles: Vec<_> = (0..10).map(|_| {
        thread::spawn(|| {
            let mut x = 0;
            for _ in 0..5_000_000 {
                x += 1
            }
            x
        })
    }).collect();

    for h in handles {
        println!("Thread finished with count={}",
            h.join().map_err(|_| "Could not join a thread!").unwrap());
    }
}
```

Some of this should look familiar from previous examples. We spin up ten
threads, collecting them into a `handles` vector. Inside of each thread, we
loop five million times, and add one to `x` each time. Finally, we join on
each thread.
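To check the thread logic separately from the printing, the same pattern can be written so that it returns the counts. This is only a sketch: the helper name `count_in_threads` and its two parameters are made up for illustration and are not part of the chapter's library.

```rust
use std::thread;

/// Spawns `threads` threads, counts up to `limit` inside each one, and
/// returns every thread's final count. (Hypothetical helper.)
fn count_in_threads(threads: usize, limit: u64) -> Vec<u64> {
    // Start all of the threads first so that they run concurrently.
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // `move` copies `limit` into the closure for each thread.
            thread::spawn(move || {
                let mut x = 0u64;
                for _ in 0..limit {
                    x += 1;
                }
                x
            })
        })
        .collect();

    // Then wait for each thread and collect the value it returned.
    handles
        .into_iter()
        .map(|h| h.join().expect("Could not join a thread!"))
        .collect()
}
```

Calling `count_in_threads(10, 5_000_000)` does the same work as `process()`, but hands back a vector of ten counts instead of printing them.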

Right now, however, this is a Rust library, and it doesn’t expose anything
that’s callable from C. If we tried to hook this up to another language right
now, it wouldn’t work. We only need to make two small changes to fix this,
though. The first is to modify the beginning of our code:

```rust,ignore
#[no_mangle]
pub extern fn process() {
```

We have to add a new attribute, `no_mangle`. When you create a Rust library, the
compiler changes the name of the function in the compiled output, a step known
as ‘name mangling’. The reasons for this are outside the scope of this tutorial,
but in order for other languages to know how to call the function, its name has
to stay exactly as we wrote it. This attribute turns the mangling off.

The other change is the `pub extern`. The `pub` means that this function should
be callable from outside of this module, and the `extern` says that it should
be callable from C. That’s it! Not a whole lot of change.
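Putting both pieces together, the whole library file might look like the sketch below. It assumes the code lives in `src/lib.rs` (recent versions of Cargo need `cargo new --lib` to create a library project) and that the body of `process()` is unchanged from the earlier listing. It also spells the ABI out as `extern "C"`, which is what a bare `extern` defaults to.

```rust
use std::thread;

// `#[no_mangle]` keeps the symbol name `process` in the compiled library.
// (In the Rust 2024 edition this attribute is written `#[unsafe(no_mangle)]`.)
#[no_mangle]
pub extern "C" fn process() {
    // Spawn ten threads that each count to five million.
    let handles: Vec<_> = (0..10).map(|_| {
        thread::spawn(|| {
            let mut x = 0;
            for _ in 0..5_000_000 {
                x += 1
            }
            x
        })
    }).collect();

    // Wait for every thread and print the count it returned.
    for h in handles {
        println!("Thread finished with count={}",
            h.join().map_err(|_| "Could not join a thread!").unwrap());
    }
}
```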

The second thing we need to do is to change a setting in our `Cargo.toml`. Add
this at the bottom:

```toml
[lib]
name = "embed"
crate-type = ["dylib"]
```

This tells Rust that we want to compile our library into a standard dynamic
library. By default, Rust compiles an ‘rlib’, a Rust-specific format.

Let’s build the project now:

```bash
$ cargo build --release
   Compiling embed v0.1.0 (file:///home/steve/src/embed)
```

We’ve chosen `cargo build --release`, which builds with optimizations on. We
want this to be as fast as possible! You can find the output of the library in
`target/release`:

```bash
$ ls target/release/
build  deps  examples  libembed.so  native
```

That `libembed.so` is our ‘shared object’ library. We can use this file
just like any shared object library written in C! As an aside, this may be
`embed.dll` (Microsoft Windows) or `libembed.dylib` (Mac OS X), depending on
your operating system.
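The platform-specific file name can also be computed rather than hard-coded. The sketch below uses the standard library's `std::env::consts::DLL_PREFIX` and `DLL_SUFFIX` constants. The helper name `release_library_path` is made up for illustration, and it assumes Cargo's default `target/release` output directory.

```rust
use std::env::consts::{DLL_PREFIX, DLL_SUFFIX};
use std::path::PathBuf;

/// Builds the expected path of the release build of the dynamic library
/// `name` on the current platform. (Hypothetical helper.)
fn release_library_path(name: &str) -> PathBuf {
    // DLL_PREFIX is "lib" on Linux and macOS and "" on Windows;
    // DLL_SUFFIX is ".so", ".dylib", or ".dll" respectively.
    let file_name = format!("{}{}{}", DLL_PREFIX, name, DLL_SUFFIX);
    PathBuf::from("target").join("release").join(file_name)
}
```

On Linux, `release_library_path("embed")` yields `target/release/libembed.so`, the same file listed above.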

Now that we’ve got our Rust library built, let’s use it from Ruby.

# Ruby

Open up an `embed.rb` file inside of our project, and do this:

```ruby
require 'ffi'

module Hello
  extend FFI::Library
  ffi_lib 'target/release/libembed.so'
  attach_function :process, [], :void
end

Hello.process

puts 'done!'
```

Before we can run this, we need to install the `ffi` gem:

```bash
$ gem install ffi # this may need sudo
Fetching: ffi-1.9.8.gem (100%)
Building native extensions.  This could take a while...
Successfully installed ffi-1.9.8
Parsing documentation for ffi-1.9.8
Installing ri documentation for ffi-1.9.8
Done installing documentation for ffi after 0 seconds
1 gem installed
```

And finally, we can try running it:

```bash
$ ruby embed.rb
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
done!
$
```

Whoa, that was fast! On my system, this took `0.086` seconds, rather than
the two seconds the pure Ruby version took. Let’s break down this Ruby
code:

```ruby
require 'ffi'
```

We first need to require the `ffi` gem. This lets us interface with our
Rust library like a C library.

```ruby
module Hello
  extend FFI::Library
  ffi_lib 'target/release/libembed.so'
```

The `Hello` module is used to attach the native functions from the shared
library. Inside, we `extend` the necessary `FFI::Library` module and then call
`ffi_lib` to load up our shared object library. We just pass it the path
where our library is stored, which, as we saw before, is
`target/release/libembed.so`.

```ruby
attach_function :process, [], :void
```

The `attach_function` method is provided by the FFI gem. It’s what
connects our `process()` function in Rust to a Ruby function of the
same name. Since `process()` takes no arguments, the second parameter
is an empty array, and since it returns nothing, we pass `:void` as
the final argument.

```ruby
Hello.process
```

This is the actual call into Rust. The combination of our `module`
and the call to `attach_function` sets this all up. It looks like
a Ruby function but is actually Rust!

```ruby
puts 'done!'
```

Finally, as per our project’s requirements, we print out `done!`.

That’s it! As we’ve seen, bridging between the two languages is really easy,
and buys us a lot of performance.

Next, let’s try Python!

# Python

Create an `embed.py` file in this directory, and put this in it:

```python
from ctypes import cdll

lib = cdll.LoadLibrary("target/release/libembed.so")

lib.process()

print("done!")
```

Even easier! We use `cdll` from the `ctypes` module. A quick call
to `LoadLibrary` later, and we can call `process()`.

On my system, this takes `0.017` seconds. Speedy!

# Node.js

Node isn’t a language, but it’s currently the dominant implementation of
server-side JavaScript.

In order to do FFI with Node, we first need to install the `ffi` library:

```bash
$ npm install ffi
```

After that installs, we can use it:

```javascript
var ffi = require('ffi');

var lib = ffi.Library('target/release/libembed', {
  'process': ['void', []]
});

lib.process();

console.log("done!");
```

It looks more like the Ruby example than the Python example. We use
the `ffi` module to get access to `ffi.Library()`, which loads up
our shared object. We need to annotate the return type and argument
types of the function, which are `void` for return and an empty
array to signify no arguments. From there, we just call it and
print `done!`.

On my system, this takes a quick `0.092` seconds.

# Conclusion

As you can see, the basics of doing this are _very_ easy. Of course,
there’s a lot more that we could do here. Check out the [FFI][ffi]
chapter for more details.
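One such next step is passing data across the boundary. As a sketch only, the hypothetical `count_to` function below (not part of this chapter's library) takes the limit as an argument and returns the final count, using a fixed-size integer type so that the C side sees a plain 64-bit unsigned value.

```rust
/// Counts from zero up to `limit` and returns the final count.
/// `u64` has the same size and layout as C's `uint64_t`.
/// (Hypothetical function, shown for illustration.)
#[no_mangle]
pub extern "C" fn count_to(limit: u64) -> u64 {
    let mut x = 0u64;
    for _ in 0..limit {
        x += 1;
    }
    x
}
```

A caller would then declare one 64-bit unsigned argument and a 64-bit unsigned return value, in place of the empty argument list and `void` used for `process()`.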