% Rust Inside Other Languages

For our third project, we’re going to choose something that shows off one of
Rust’s greatest strengths: a lack of a substantial runtime.

As organizations grow, they increasingly rely on a multitude of programming
languages. Different programming languages have different strengths and
weaknesses, and a polyglot stack lets you use a particular language where
its strengths make sense and a different one where it’s weak.

A very common area where many programming languages are weak is the runtime
performance of programs. Often, using a language that is slower but offers
greater programmer productivity is a worthwhile trade-off. To help mitigate
the performance cost, such languages provide a way to write some of your system
in C, and then call the C code as though it were written in the higher-level
language. This is called a ‘foreign function interface’, often shortened to
‘FFI’.

Rust has support for FFI in both directions: it can call into C code easily,
but crucially, it can also be called _into_ as easily as C. Combined with
Rust’s lack of a garbage collector and low runtime requirements, this makes
Rust a great candidate to embed inside of other languages when you need
some extra oomph.

There is a whole [chapter devoted to FFI][ffi] and its specifics elsewhere in
the book, but in this chapter, we’ll examine this particular use-case of FFI,
with three examples, in Ruby, Python, and JavaScript.

[ffi]: ffi.html

# The problem

There are many different projects we could choose here, but we’re going to
pick an example where Rust has a clear advantage over many other languages:
numeric computing and threading.

First, many languages, for the sake of consistency, place numbers on the heap
rather than on the stack. Especially in languages that focus on object-oriented
programming and use garbage collection, heap allocation is the default. Sometimes
optimizations can stack allocate particular numbers, but rather than relying
on an optimizer to do its job, we may want to ensure that we’re always using
primitive number types rather than some sort of object type.

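To make the distinction concrete, here is a minimal Rust sketch contrasting
primitive numbers stored inline with numbers that each get their own heap
allocation. The function names `sum_inline` and `sum_boxed` are hypothetical and
exist only for illustration; the boxed version roughly mimics an
object-per-number approach and is not a model of any particular language.

```rust
/// Sums 1..=n using plain `u64` values. Each number lives inline
/// (in a register or on the stack); nothing is allocated per number.
pub fn sum_inline(n: u64) -> u64 {
    let mut total: u64 = 0;
    for i in 1..=n {
        total += i;
    }
    total
}

/// Sums 1..=n, but wraps every number in a `Box` first, so each
/// number costs one heap allocation and one deallocation.
pub fn sum_boxed(n: u64) -> u64 {
    let mut total: Box<u64> = Box::new(0);
    for i in 1..=n {
        let boxed: Box<u64> = Box::new(i); // heap allocation per number
        *total += *boxed;
        // `boxed` is dropped (freed) at the end of each iteration
    }
    *total
}
```

Both functions compute the same sum; only where the numbers are stored differs.
In Rust, the inline form is what you get unless you explicitly ask for a `Box`.
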
Second, many languages have a ‘global interpreter lock’ (GIL), which limits
concurrency in many situations. This is done in the name of safety, which is
a positive effect, but it limits the amount of work that can be done at the
same time, which is a big negative.

To emphasize these two aspects, we’re going to create a little project that
uses both of them heavily. Since the focus of the example is the embedding
of Rust into the languages, rather than the problem itself, we’ll just use a
toy example:

> Start ten threads. Inside each thread, count from one to five million. After
> all ten threads are finished, print out ‘done!’.

I chose five million based on my particular computer. Here’s an example of this
code in Ruby:

```ruby
threads = []

10.times do
  threads << Thread.new do
    count = 0

    5_000_000.times do
      count += 1
    end
  end
end

threads.each {|t| t.join }
puts "done!"
```

Try running this example, and choose a number that runs for a few seconds.
Depending on your computer’s hardware, you may have to increase or decrease the
number.

On my system, running this program takes `2.156` seconds. And, if I use some
sort of process monitoring tool, like `top`, I can see that it only uses one
core on my machine. That’s the GIL kicking in.

While it’s true that this is a synthetic program, one can imagine many problems
that are similar to this in the real world. For our purposes, spinning up some
busy threads represents some sort of parallel, expensive computation.

# A Rust library

Let’s re-write this problem in Rust. First, let’s make a new project with
Cargo:

```bash
$ cargo new embed
$ cd embed
```

This program is fairly easy to write in Rust:

```rust
use std::thread;

fn process() {
    let handles: Vec<_> = (0..10).map(|_| {
        thread::spawn(|| {
            let mut _x = 0;
            for _ in (0..5_000_001) {
                _x += 1
            }
        })
    }).collect();

    for h in handles {
        h.join().expect("Could not join a thread!");
    }
}
```

Some of this should look familiar from previous examples. We spin up ten
threads, collecting them into a `handles` vector. Inside of each thread, we
loop five million times, and add one to `_x` each time. Why the underscore?
Well, if we remove it and compile:

```bash
$ cargo build
   Compiling embed v0.1.0 (file:///home/steve/src/embed)
src/lib.rs:3:1: 16:2 warning: function is never used: `process`, #[warn(dead_code)] on by default
src/lib.rs:3 fn process() {
src/lib.rs:4     let handles: Vec<_> = (0..10).map(|_| {
src/lib.rs:5         thread::spawn(|| {
src/lib.rs:6             let mut x = 0;
src/lib.rs:7             for _ in (0..5_000_001) {
src/lib.rs:8                 x += 1
             ...
src/lib.rs:6:17: 6:22 warning: variable `x` is assigned to, but never used, #[warn(unused_variables)] on by default
src/lib.rs:6             let mut x = 0;
                             ^~~~~
```

That first warning is because we are building a library. If we had a test
for this function, the warning would go away. But for now, it’s never
called.

The second is related to `x` versus `_x`. Because we never actually _do_
anything with `x`, we get a warning about it. In our case, that’s perfectly
okay, as we’re just trying to waste CPU cycles. Prefixing `x` with the
underscore removes the warning.

Finally, we join on each thread.

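If you’d rather not rely on the underscore, another option is to actually use
the value. The following is a sketch under that assumption, not part of the
project we’re building: the hypothetical `count_in_threads` function has each
thread return its counter, and the caller adds the results together, so nothing
is left unused.

```rust
use std::thread;

/// Hypothetical variant of `process`: every thread returns its counter,
/// and the results are summed once all threads have been joined.
pub fn count_in_threads(threads: usize, per_thread: u64) -> u64 {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // `move` copies `per_thread` into the closure.
            thread::spawn(move || {
                let mut x: u64 = 0;
                for _ in 0..per_thread {
                    x += 1;
                }
                x // returned to whoever joins this thread
            })
        })
        .collect();

    // `join` hands back each thread's return value.
    handles
        .into_iter()
        .map(|h| h.join().expect("Could not join a thread!"))
        .sum()
}
```

Calling `count_in_threads(10, 5_000_000)` would mirror the toy problem. Because
the result is used, an optimizing build also has less freedom to discard the
loop entirely.
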
Right now, however, this is a Rust library, and it doesn’t expose anything
that’s callable from C. If we tried to hook this up to another language right
now, it wouldn’t work. We only need to make two small changes to fix this,
though. The first is to modify the beginning of our code:

```rust,ignore
#[no_mangle]
pub extern fn process() {
```

We have to add a new attribute, `no_mangle`. When you create a Rust library, the
compiler changes the name of the function in the compiled output. The reasons
for this are outside the scope of this tutorial, but in order for other
languages to know how to call the function, we need to not do that. This
attribute turns that behavior off.

The other change is the `pub extern`. The `pub` means that this function should
be callable from outside of this module, and the `extern` says that it should
be able to be called from C. That’s it! Not a whole lot of change.

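Putting both pieces together, `src/lib.rs` might look like the sketch below.
It spells the ABI out as `extern "C"`, which is what a bare `extern` defaults
to; the body is the same counting loop as before. Treat it as an illustration
rather than the definitive listing.

```rust
use std::thread;

// `#[no_mangle]` keeps the exported symbol name as plain `process`.
// `pub extern "C"` makes the function public and gives it the C
// calling convention, so C-compatible FFI layers can call it.
#[no_mangle]
pub extern "C" fn process() {
    // Spawn ten threads, each doing the same busy work.
    let handles: Vec<_> = (0..10)
        .map(|_| {
            thread::spawn(|| {
                let mut _x = 0;
                // Same range as the original version.
                for _ in 0..5_000_001 {
                    _x += 1;
                }
            })
        })
        .collect();

    // Wait for every thread to finish before returning to the caller.
    for h in handles {
        h.join().expect("Could not join a thread!");
    }
}
```

The function takes no arguments and returns nothing, which is why the foreign
side only has to declare an empty argument list and a `void` return type.
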
The second thing we need to do is to change a setting in our `Cargo.toml`. Add
this at the bottom:

```toml
[lib]
name = "embed"
crate-type = ["dylib"]
```

This tells Rust that we want to compile our library into a standard dynamic
library. By default, Rust compiles into an ‘rlib’, a Rust-specific format.

Let’s build the project now:

```bash
$ cargo build --release
   Compiling embed v0.1.0 (file:///home/steve/src/embed)
```

We’ve chosen `cargo build --release`, which builds with optimizations on. We
want this to be as fast as possible! You can find the output of the library in
`target/release`:

```bash
$ ls target/release/
build deps examples libembed.so native
```

That `libembed.so` is our ‘shared object’ library. We can use this file
just like any shared object library written in C! As an aside, this may be
`embed.dll` or `libembed.dylib`, depending on the platform.

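If you want to work out the file name for the current platform
programmatically, the standard library exposes the pieces. This is a small
sketch; the helper name `library_file_name` is made up for illustration, while
`DLL_PREFIX` and `DLL_SUFFIX` are constants in `std::env::consts`.

```rust
use std::env::consts::{DLL_PREFIX, DLL_SUFFIX};

/// Builds the platform's dynamic library file name for a crate,
/// for example `libembed.so` on Linux.
pub fn library_file_name(crate_name: &str) -> String {
    // DLL_PREFIX is "lib" on Unix-like systems and empty on Windows;
    // DLL_SUFFIX is ".so", ".dylib", or ".dll" depending on the target.
    format!("{}{}{}", DLL_PREFIX, crate_name, DLL_SUFFIX)
}
```

A build script or test harness could join this name onto `target/release` to
locate the library without hard-coding the extension.
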
Now that we’ve got our Rust library built, let’s use it from Ruby.

# Ruby

Open up an `embed.rb` file inside of our project, and do this:

```ruby
require 'ffi'

module Hello
  extend FFI::Library
  ffi_lib 'target/release/libembed.so'
  attach_function :process, [], :void
end

Hello.process

puts "done!"
```

Before we can run this, we need to install the `ffi` gem:

```bash
$ gem install ffi # this may need sudo
Fetching: ffi-1.9.8.gem (100%)
Building native extensions. This could take a while...
Successfully installed ffi-1.9.8
Parsing documentation for ffi-1.9.8
Installing ri documentation for ffi-1.9.8
Done installing documentation for ffi after 0 seconds
1 gem installed
```

And finally, we can try running it:

```bash
$ ruby embed.rb
done!
$
```

Whoa, that was fast! On my system, this took `0.086` seconds, rather than
the two seconds the pure Ruby version took. Let’s break down this Ruby
code:

```ruby
require 'ffi'
```

We first need to require the `ffi` gem. This lets us interface with our
Rust library like a C library.

```ruby
module Hello
  extend FFI::Library
  ffi_lib 'target/release/libembed.so'
```

The `ffi` gem’s authors recommend using a module to scope the functions
we’ll import from the shared library. Inside, we `extend` the necessary
`FFI::Library` module, and then call `ffi_lib` to load up our shared
object library. We just pass it the path where our library is stored,
which, as we saw before, is `target/release/libembed.so`.

```ruby
  attach_function :process, [], :void
```

The `attach_function` method is provided by the FFI gem. It’s what
connects our `process()` function in Rust to a Ruby function of the
same name. Since `process()` takes no arguments, the second parameter
is an empty array, and since it returns nothing, we pass `:void` as
the final argument.

```ruby
Hello.process
```

This is the actual call into Rust. The combination of our `module`
and the call to `attach_function` sets this all up. It looks like
a Ruby function, but is actually Rust!

```ruby
puts "done!"
```

Finally, as per our project’s requirements, we print out `done!`.

That’s it! As we’ve seen, bridging between the two languages is really easy,
and buys us a lot of performance.

Next, let’s try Python!

# Python

Create an `embed.py` file in this directory, and put this in it:

```python
from ctypes import cdll

lib = cdll.LoadLibrary("target/release/libembed.so")

lib.process()

print("done!")
```

Even easier! We use `cdll` from the `ctypes` module. A quick call
to `LoadLibrary` later, and we can call `process()`.

On my system, this takes `0.017` seconds. Speedy!

# Node.js

Node isn’t a language, but it’s currently the dominant implementation of
server-side JavaScript.

In order to do FFI with Node, we first need to install the library:

```bash
$ npm install ffi
```

After that installs, we can use it:

```javascript
var ffi = require('ffi');

var lib = ffi.Library('target/release/libembed', {
  'process': [ 'void', [] ]
});

lib.process();

console.log("done!");
```

It looks more like the Ruby example than the Python example. We use
the `ffi` module to get access to `ffi.Library()`, which loads up
our shared object. We need to annotate the return type and argument
types of the function, which are `'void'` for return, and an empty
array to signify no arguments. From there, we just call it and
print `done!`.

On my system, this takes a quick `0.092` seconds.

# Conclusion

As you can see, the basics of doing this are _very_ easy. Of course,
there’s a lot more that we could do here. Check out the [FFI][ffi]
chapter for more details.