Imported Upstream version 1.6.0+dfsg1
% Rust Inside Other Languages

For our third project, we’re going to choose something that shows off one of
Rust’s greatest strengths: a lack of a substantial runtime.

As organizations grow, they increasingly rely on a multitude of programming
languages. Different programming languages have different strengths and
weaknesses, and a polyglot stack lets you use a particular language where
its strengths make sense and a different one where it’s weak.

A very common area where many programming languages are weak is in runtime
performance of programs. Often, using a language that is slower, but offers
greater programmer productivity, is a worthwhile trade-off. To help mitigate
the cost, these languages provide a way to write some of your system in C and
then call that C code as though it were written in the higher-level language.
This is called a ‘foreign function interface’, often shortened to ‘FFI’.

Rust has support for FFI in both directions: it can call into C code easily,
but crucially, it can also be called _into_ as easily as C. Combined with
Rust’s lack of a garbage collector and low runtime requirements, this makes
Rust a great candidate to embed inside of other languages when you need
that extra oomph.

There is a whole [chapter devoted to FFI][ffi] and its specifics elsewhere in
the book, but in this chapter, we’ll examine this particular use-case of FFI,
with examples in Ruby, Python, and JavaScript.

[ffi]: ffi.html

# The problem

There are many different projects we could choose here, but we’re going to
pick an example where Rust has a clear advantage over many other languages:
numeric computing and threading.

First, many languages, for the sake of consistency, place numbers on the heap,
rather than on the stack. Especially in languages that focus on object-oriented
programming and use garbage collection, heap allocation is the default. Sometimes
optimizations can stack allocate particular numbers, but rather than relying
on an optimizer to do its job, we may want to ensure that we’re always using
primitive number types rather than some sort of object type.

Second, many languages have a ‘global interpreter lock’ (GIL), which limits
concurrency in many situations. This is done in the name of safety, which is
a positive effect, but it limits the amount of work that can be done at the
same time, which is a big negative.

To emphasize these two aspects, we’re going to create a little project that
leans on both of them heavily. Since the focus of the example is to embed
Rust into other languages, rather than the problem itself, we’ll just use a
toy example:

> Start ten threads. Inside each thread, count from one to five million. After
> all ten threads are finished, print out ‘done!’.

I chose five million based on my particular computer. Here’s an example of this
code in Ruby:

```ruby
threads = []

10.times do
  threads << Thread.new do
    count = 0

    5_000_000.times do
      count += 1
    end

    count
  end
end

threads.each do |t|
  puts "Thread finished with count=#{t.value}"
end
puts "done!"
```

Try running this example, and choose a number that runs for a few seconds.
Depending on your computer’s hardware, you may have to increase or decrease the
number.

On my system, running this program takes `2.156` seconds. And, if I use some
sort of process monitoring tool, like `top`, I can see that it only uses one
core on my machine. That’s the GIL kicking in.

While it’s true that this is a synthetic program, one can imagine many problems
that are similar to this in the real world. For our purposes, spinning up a few
busy threads represents some sort of parallel, expensive computation.

# A Rust library

Let’s rewrite this problem in Rust. First, let’s make a new project with
Cargo:

```bash
$ cargo new embed
$ cd embed
```

This program is fairly easy to write in Rust:

```rust
use std::thread;

fn process() {
    let handles: Vec<_> = (0..10).map(|_| {
        thread::spawn(|| {
            let mut x = 0;
            for _ in 0..5_000_000 {
                x += 1
            }
            x
        })
    }).collect();

    for h in handles {
        println!("Thread finished with count={}",
        h.join().map_err(|_| "Could not join a thread!").unwrap());
    }
}
```

Some of this should look familiar from previous examples. We spin up ten
threads, collecting them into a `handles` vector. Inside of each thread, we
loop five million times, and add one to `x` each time. Finally, we join on
each thread.

Right now, however, this is a Rust library, and it doesn’t expose anything
that’s callable from C. If we tried to hook this up to another language right
now, it wouldn’t work. We only need to make two small changes to fix this,
though. The first is to modify the beginning of our code:

```rust,ignore
#[no_mangle]
pub extern fn process() {
```

We have to add a new attribute, `no_mangle`. When you create a Rust library, the
compiler changes the name of the function in the compiled output. The reasons
for this are outside the scope of this tutorial, but in order for other
languages to know how to call the function, the name has to stay exactly as
written. This attribute turns the renaming off.

This first change also adds `pub extern`. The `pub` means that this function
should be callable from outside of this module, and the `extern` says that it
should be able to be called from C. That’s it! Not a whole lot of change.

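Putting both pieces together, the whole `src/lib.rs` might look like the following sketch. It assumes a recent toolchain, where the ABI is written out as `extern "C"` and the attribute is spelled `#[unsafe(no_mangle)]`; on the compiler this chapter was written for, the `#[no_mangle]` and bare `extern` shown above do the same job.

```rust
use std::thread;

// Exported under the unmangled name `process`, using the C calling convention.
// Assumption: a recent compiler; older ones spell this `#[no_mangle]`.
#[unsafe(no_mangle)]
pub extern "C" fn process() {
    // Spawn ten threads, each counting to five million.
    let handles: Vec<_> = (0..10)
        .map(|_| {
            thread::spawn(|| {
                let mut x = 0;
                for _ in 0..5_000_000 {
                    x += 1;
                }
                x
            })
        })
        .collect();

    // Wait for every thread and report its final count.
    for h in handles {
        println!(
            "Thread finished with count={}",
            h.join().map_err(|_| "Could not join a thread!").unwrap()
        );
    }
}
```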
The second thing we need to do is to change a setting in our `Cargo.toml`. Add
this at the bottom:

```toml
[lib]
name = "embed"
crate-type = ["dylib"]
```

This tells Rust that we want to compile our library into a standard dynamic
library. By default, Rust compiles an ‘rlib’, a Rust-specific format.

Let’s build the project now:

```bash
$ cargo build --release
   Compiling embed v0.1.0 (file:///home/steve/src/embed)
```

We’ve chosen `cargo build --release`, which builds with optimizations on. We
want this to be as fast as possible! You can find the output of the library in
`target/release`:

```bash
$ ls target/release/
build  deps  examples  libembed.so  native
```

That `libembed.so` is our ‘shared object’ library. We can use this file
just like any shared object library written in C! As an aside, this may be
`embed.dll` (Microsoft Windows) or `libembed.dylib` (Mac OS X), depending on
your operating system.

Now that we’ve got our Rust library built, let’s use it from Ruby.

# Ruby

Open up an `embed.rb` file inside of our project, and do this:

```ruby
require 'ffi'

module Hello
  extend FFI::Library
  ffi_lib 'target/release/libembed.so'
  attach_function :process, [], :void
end

Hello.process

puts 'done!'
```

Before we can run this, we need to install the `ffi` gem:

```bash
$ gem install ffi # this may need sudo
Fetching: ffi-1.9.8.gem (100%)
Building native extensions. This could take a while...
Successfully installed ffi-1.9.8
Parsing documentation for ffi-1.9.8
Installing ri documentation for ffi-1.9.8
Done installing documentation for ffi after 0 seconds
1 gem installed
```

And finally, we can try running it:

```bash
$ ruby embed.rb
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
Thread finished with count=5000000
done!
$
```

Whoa, that was fast! On my system, this took `0.086` seconds, rather than
the two seconds the pure Ruby version took. Let’s break down this Ruby
code:

```ruby
require 'ffi'
```

We first need to require the `ffi` gem. This lets us interface with our
Rust library like a C library.

```ruby
module Hello
  extend FFI::Library
  ffi_lib 'target/release/libembed.so'
```

The `Hello` module is used to attach the native functions from the shared
library. Inside, we `extend` the necessary `FFI::Library` module and then call
`ffi_lib` to load up our shared object library. We just pass it the path where
our library is stored, which, as we saw before, is
`target/release/libembed.so`.

```ruby
  attach_function :process, [], :void
```

The `attach_function` method is provided by the FFI gem. It’s what
connects our `process()` function in Rust to a Ruby function of the
same name. Since `process()` takes no arguments, the second parameter
is an empty array, and since it returns nothing, we pass `:void` as
the final argument.

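The empty array and `:void` simply mirror the Rust signature. If the exported function took arguments and returned a value, the Rust side could look something like the sketch below. Here `count_total` and its parameters are hypothetical names, not part of the library built above, and the attribute spelling assumes a recent toolchain (older ones use plain `#[no_mangle]`). The second and third arguments to `attach_function` would then list the matching FFI types in place of `[]` and `:void`.

```rust
use std::thread;

// Hypothetical export, for illustration only. Fixed-width integers have a
// well-defined C representation, so the caller can describe them exactly.
#[unsafe(no_mangle)]
pub extern "C" fn count_total(threads: u32, iterations: u64) -> u64 {
    // Spawn the requested number of threads; each counts up to `iterations`.
    let handles: Vec<thread::JoinHandle<u64>> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                let mut x = 0u64;
                for _ in 0..iterations {
                    x += 1;
                }
                x
            })
        })
        .collect();

    // Sum the per-thread counts. A thread that failed to join is skipped
    // rather than unwrapped, so a failed join does not panic here.
    handles.into_iter().filter_map(|h| h.join().ok()).sum()
}
```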
```ruby
Hello.process
```

This is the actual call into Rust. The combination of our `module`
and the call to `attach_function` sets this all up. It looks like
a Ruby function but is actually Rust!

```ruby
puts 'done!'
```

Finally, as per our project’s requirements, we print out `done!`.

That’s it! As we’ve seen, bridging between the two languages is really easy,
and buys us a lot of performance.

Next, let’s try Python!

# Python

Create an `embed.py` file in this directory, and put this in it:

```python
from ctypes import cdll

lib = cdll.LoadLibrary("target/release/libembed.so")

lib.process()

print("done!")
```

Even easier! We use `cdll` from the `ctypes` module. A quick call
to `LoadLibrary` later, and we can call `process()`.

On my system, this takes `0.017` seconds. Speedy!

# Node.js

Node isn’t a language, but it’s currently the dominant implementation of
server-side JavaScript.

In order to do FFI with Node, we first need to install the library:

```bash
$ npm install ffi
```

After that installs, we can use it:

```javascript
var ffi = require('ffi');

var lib = ffi.Library('target/release/libembed', {
  'process': ['void', []]
});

lib.process();

console.log("done!");
```

It looks more like the Ruby example than the Python example. We use
the `ffi` module to get access to `ffi.Library()`, which loads up
our shared object. We need to annotate the return type and argument
types of the function, which are `void` for return and an empty
array to signify no arguments. From there, we just call it and
print `done!`.

On my system, this takes a quick `0.092` seconds.

# Conclusion

As you can see, the basics of doing this are _very_ easy. Of course,
there’s a lot more that we could do here. Check out the [FFI][ffi]
chapter for more details.