Commit | Line | Data |
---|---|---|
7cac9316 XL |
1 | # Foreign Function Interface |
2 | ||
136023e0 | 3 | ## Introduction |
7cac9316 XL |
4 | |
5 | This guide will use the [snappy](https://github.com/google/snappy) | |
6 | compression/decompression library as an introduction to writing bindings for | |
7 | foreign code. Rust is currently unable to call directly into a C++ library, but | |
8 | snappy includes a C interface (documented in | |
9 | [`snappy-c.h`](https://github.com/google/snappy/blob/master/snappy-c.h)). | |
10 | ||
11 | ## A note about libc | |
12 | ||
13 | Many of these examples use [the `libc` crate][libc], which provides various | |
14 | type definitions for C types, among other things. If you’re trying these | |
15 | examples yourself, you’ll need to add `libc` to your `Cargo.toml`: | |
16 | ||
17 | ```toml | |
18 | [dependencies] | |
19 | libc = "0.2.0" | |
20 | ``` | |
21 | ||
22 | [libc]: https://crates.io/crates/libc | |
23 | ||
24 | and add `extern crate libc;` to your crate root. | |
25 | ||
26 | ## Calling foreign functions | |
27 | ||
28 | The following is a minimal example of calling a foreign function which will | |
29 | compile if snappy is installed: | |
30 | ||
136023e0 | 31 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
32 | ```rust,ignore |
33 | extern crate libc; | |
34 | use libc::size_t; | |
35 | ||
36 | #[link(name = "snappy")] | |
37 | extern { | |
38 | fn snappy_max_compressed_length(source_length: size_t) -> size_t; | |
39 | } | |
40 | ||
41 | fn main() { | |
42 | let x = unsafe { snappy_max_compressed_length(100) }; | |
43 | println!("max compressed length of a 100 byte buffer: {}", x); | |
44 | } | |
45 | ``` | |
46 | ||
47 | The `extern` block is a list of function signatures in a foreign library, in | |
48 | this case with the platform's C ABI. The `#[link(...)]` attribute is used to | |
49 | instruct the linker to link against the snappy library so the symbols are | |
50 | resolved. | |
51 | ||
52 | Foreign functions are assumed to be unsafe so calls to them need to be wrapped | |
53 | with `unsafe {}` as a promise to the compiler that everything contained within | |
54 | truly is safe. C libraries often expose interfaces that aren't thread-safe, and | |
55 | almost any function that takes a pointer argument isn't valid for all possible | |
56 | inputs since the pointer could be dangling, and raw pointers fall outside of | |
57 | Rust's safe memory model. | |
58 | ||
59 | When declaring the argument types to a foreign function, the Rust compiler | |
60 | cannot check if the declaration is correct, so specifying it correctly is part | |
61 | of keeping the binding correct at runtime. | |
62 | ||
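As a small sketch of that responsibility, the following declares `strlen` from the C standard library by hand and wraps it in a safe function. It assumes the platform's C library is linked by default and that C's `size_t` corresponds to `usize` on the target; the wrapper name `c_string_length` is made up for this illustration.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

extern "C" {
    // C prototype: size_t strlen(const char *s);
    // The compiler trusts this signature as written, so it has to match the C header.
    fn strlen(s: *const c_char) -> usize;
}

/// Hypothetical safe wrapper: returns `None` if `text` contains an interior NUL byte.
pub fn c_string_length(text: &str) -> Option<usize> {
    let c_text = CString::new(text).ok()?;
    // SAFETY: `c_text` is a valid NUL-terminated string that outlives the call.
    Some(unsafe { strlen(c_text.as_ptr()) })
}
```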
63 | The `extern` block can be extended to cover the entire snappy API: | |
64 | ||
136023e0 | 65 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
66 | ```rust,ignore |
67 | extern crate libc; | |
68 | use libc::{c_int, size_t}; | |
69 | ||
70 | #[link(name = "snappy")] | |
71 | extern { | |
72 | fn snappy_compress(input: *const u8, | |
73 | input_length: size_t, | |
74 | compressed: *mut u8, | |
75 | compressed_length: *mut size_t) -> c_int; | |
76 | fn snappy_uncompress(compressed: *const u8, | |
77 | compressed_length: size_t, | |
78 | uncompressed: *mut u8, | |
79 | uncompressed_length: *mut size_t) -> c_int; | |
80 | fn snappy_max_compressed_length(source_length: size_t) -> size_t; | |
81 | fn snappy_uncompressed_length(compressed: *const u8, | |
82 | compressed_length: size_t, | |
83 | result: *mut size_t) -> c_int; | |
84 | fn snappy_validate_compressed_buffer(compressed: *const u8, | |
85 | compressed_length: size_t) -> c_int; | |
86 | } | |
87 | # fn main() {} | |
88 | ``` | |
89 | ||
136023e0 | 90 | ## Creating a safe interface |
7cac9316 XL |
91 | |
92 | The raw C API needs to be wrapped to provide memory safety and make use of higher-level concepts | |
93 | like vectors. A library can choose to expose only the safe, high-level interface and hide the unsafe | |
94 | internal details. | |
95 | ||
96 | Wrapping the functions which expect buffers involves obtaining raw pointers from Rust slices and vectors | |
97 | (with `as_ptr` and `as_mut_ptr`) to pass to C. Rust's vectors are guaranteed to be a contiguous block of memory. The | |
98 | length is the number of elements currently contained, and the capacity is the total size in elements of | |
99 | the allocated memory. The length is less than or equal to the capacity. | |
100 | ||
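The relationship between capacity and length can be sketched without any C library. In the example below, `fill_pattern` is a hypothetical stand-in for a foreign function that writes into a caller-provided buffer and reports the number of bytes written through an output parameter; `pattern` is an illustrative wrapper, not part of any real API.

```rust
/// Hypothetical stand-in for a C function: writes `capacity / 2` bytes into `dst`
/// and stores the number of bytes written in `written`.
unsafe extern "C" fn fill_pattern(dst: *mut u8, capacity: usize, written: *mut usize) {
    let count = capacity / 2;
    for i in 0..count {
        // SAFETY: the caller promises `dst` is valid for `capacity` bytes and `i < capacity`.
        unsafe { dst.add(i).write(i as u8) };
    }
    // SAFETY: the caller promises `written` points to a valid `usize`.
    unsafe { written.write(count) };
}

/// Safe wrapper: allocates the capacity up front, lets the callee fill the
/// buffer, then sets the length to the number of initialized bytes.
pub fn pattern(capacity: usize) -> Vec<u8> {
    let mut buffer: Vec<u8> = Vec::with_capacity(capacity);
    let mut written: usize = 0;
    unsafe {
        fill_pattern(buffer.as_mut_ptr(), capacity, &mut written);
        // The length must never exceed the capacity.
        assert!(written <= buffer.capacity());
        buffer.set_len(written);
    }
    buffer
}
```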
136023e0 | 101 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
102 | ```rust,ignore |
103 | # extern crate libc; | |
104 | # use libc::{c_int, size_t}; | |
105 | # unsafe fn snappy_validate_compressed_buffer(_: *const u8, _: size_t) -> c_int { 0 } | |
106 | # fn main() {} | |
107 | pub fn validate_compressed_buffer(src: &[u8]) -> bool { | |
108 | unsafe { | |
109 | snappy_validate_compressed_buffer(src.as_ptr(), src.len() as size_t) == 0 | |
110 | } | |
111 | } | |
112 | ``` | |
113 | ||
114 | The `validate_compressed_buffer` wrapper above makes use of an `unsafe` block, but it makes the | |
115 | guarantee that calling it is safe for all inputs by leaving off `unsafe` from the function | |
116 | signature. | |
117 | ||
118 | The `snappy_compress` and `snappy_uncompress` functions are more complex, since a buffer has to be | |
119 | allocated to hold the output too. | |
120 | ||
121 | The `snappy_max_compressed_length` function can be used to allocate a vector with the maximum | |
122 | required capacity to hold the compressed output. The vector can then be passed to the | |
123 | `snappy_compress` function as an output parameter. An output parameter is also passed to retrieve | |
124 | the true length after compression for setting the length. | |
125 | ||
136023e0 | 126 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
127 | ```rust,ignore |
128 | # extern crate libc; | |
129 | # use libc::{size_t, c_int}; | |
130 | # unsafe fn snappy_compress(a: *const u8, b: size_t, c: *mut u8, | |
131 | # d: *mut size_t) -> c_int { 0 } | |
132 | # unsafe fn snappy_max_compressed_length(a: size_t) -> size_t { a } | |
133 | # fn main() {} | |
134 | pub fn compress(src: &[u8]) -> Vec<u8> { | |
135 | unsafe { | |
136 | let srclen = src.len() as size_t; | |
137 | let psrc = src.as_ptr(); | |
138 | ||
139 | let mut dstlen = snappy_max_compressed_length(srclen); | |
140 | let mut dst = Vec::with_capacity(dstlen as usize); | |
141 | let pdst = dst.as_mut_ptr(); | |
142 | ||
143 | snappy_compress(psrc, srclen, pdst, &mut dstlen); | |
144 | dst.set_len(dstlen as usize); | |
145 | dst | |
146 | } | |
147 | } | |
148 | ``` | |
149 | ||
150 | Decompression is similar, because snappy stores the uncompressed size as part of the compression | |
151 | format and `snappy_uncompressed_length` will retrieve the exact buffer size required. | |
152 | ||
136023e0 | 153 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
154 | ```rust,ignore |
155 | # extern crate libc; | |
156 | # use libc::{size_t, c_int}; | |
157 | # unsafe fn snappy_uncompress(compressed: *const u8, | |
158 | # compressed_length: size_t, | |
159 | # uncompressed: *mut u8, | |
160 | # uncompressed_length: *mut size_t) -> c_int { 0 } | |
161 | # unsafe fn snappy_uncompressed_length(compressed: *const u8, | |
162 | # compressed_length: size_t, | |
163 | # result: *mut size_t) -> c_int { 0 } | |
164 | # fn main() {} | |
165 | pub fn uncompress(src: &[u8]) -> Option<Vec<u8>> { | |
166 | unsafe { | |
167 | let srclen = src.len() as size_t; | |
168 | let psrc = src.as_ptr(); | |
169 | ||
170 | let mut dstlen: size_t = 0; | |
171 | snappy_uncompressed_length(psrc, srclen, &mut dstlen); | |
172 | ||
173 | let mut dst = Vec::with_capacity(dstlen as usize); | |
174 | let pdst = dst.as_mut_ptr(); | |
175 | ||
176 | if snappy_uncompress(psrc, srclen, pdst, &mut dstlen) == 0 { | |
177 | dst.set_len(dstlen as usize); | |
178 | Some(dst) | |
179 | } else { | |
180 | None // SNAPPY_INVALID_INPUT | |
181 | } | |
182 | } | |
183 | } | |
184 | ``` | |
185 | ||
186 | Then, we can add some tests to show how to use them. | |
187 | ||
136023e0 | 188 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
189 | ```rust,ignore |
190 | # extern crate libc; | |
191 | # use libc::{c_int, size_t}; | |
192 | # unsafe fn snappy_compress(input: *const u8, | |
193 | # input_length: size_t, | |
194 | # compressed: *mut u8, | |
195 | # compressed_length: *mut size_t) | |
196 | # -> c_int { 0 } | |
197 | # unsafe fn snappy_uncompress(compressed: *const u8, | |
198 | # compressed_length: size_t, | |
199 | # uncompressed: *mut u8, | |
200 | # uncompressed_length: *mut size_t) | |
201 | # -> c_int { 0 } | |
202 | # unsafe fn snappy_max_compressed_length(source_length: size_t) -> size_t { 0 } | |
203 | # unsafe fn snappy_uncompressed_length(compressed: *const u8, | |
204 | # compressed_length: size_t, | |
205 | # result: *mut size_t) | |
206 | # -> c_int { 0 } | |
207 | # unsafe fn snappy_validate_compressed_buffer(compressed: *const u8, | |
208 | # compressed_length: size_t) | |
209 | # -> c_int { 0 } | |
210 | # fn main() { } | |
211 | ||
212 | #[cfg(test)] | |
213 | mod tests { | |
214 | use super::*; | |
215 | ||
216 | #[test] | |
217 | fn valid() { | |
218 | let d = vec![0xde, 0xad, 0xd0, 0x0d]; | |
219 | let c: &[u8] = &compress(&d); | |
220 | assert!(validate_compressed_buffer(c)); | |
221 | assert!(uncompress(c) == Some(d)); | |
222 | } | |
223 | ||
224 | #[test] | |
225 | fn invalid() { | |
226 | let d = vec![0, 0, 0, 0]; | |
227 | assert!(!validate_compressed_buffer(&d)); | |
228 | assert!(uncompress(&d).is_none()); | |
229 | } | |
230 | ||
231 | #[test] | |
232 | fn empty() { | |
233 | let d = vec![]; | |
234 | assert!(!validate_compressed_buffer(&d)); | |
235 | assert!(uncompress(&d).is_none()); | |
236 | let c = compress(&d); | |
237 | assert!(validate_compressed_buffer(&c)); | |
238 | assert!(uncompress(&c) == Some(d)); | |
239 | } | |
240 | } | |
241 | ``` | |
242 | ||
136023e0 | 243 | ## Destructors |
7cac9316 XL |
244 | |
245 | Foreign libraries often hand off ownership of resources to the calling code. | |
246 | When this occurs, we must use Rust's destructors to provide safety and guarantee | |
247 | the release of these resources (especially in the case of panic). | |
248 | ||
249 | For more about destructors, see the [Drop trait](../std/ops/trait.Drop.html). | |
250 | ||
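A minimal sketch of this pattern follows. The `resource_create` and `resource_destroy` functions are hypothetical stand-ins for a C library's allocate/release pair, written in Rust here so the example is self-contained; the `LIVE` counter exists only to make the effect of `Drop` observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of resources currently alive (for illustration only).
static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a C function that hands ownership to the caller.
extern "C" fn resource_create() -> *mut i32 {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(0))
}

/// Hypothetical stand-in for the matching C release function.
extern "C" fn resource_destroy(ptr: *mut i32) {
    if !ptr.is_null() {
        // SAFETY: `ptr` came from `Box::into_raw` in `resource_create`.
        unsafe { drop(Box::from_raw(ptr)) };
        LIVE.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Owning wrapper: the resource is released when the wrapper goes out of
/// scope, including during unwinding from a panic.
pub struct Resource {
    raw: *mut i32,
}

impl Resource {
    pub fn new() -> Resource {
        Resource { raw: resource_create() }
    }
}

impl Drop for Resource {
    fn drop(&mut self) {
        resource_destroy(self.raw);
    }
}

pub fn live_resources() -> usize {
    LIVE.load(Ordering::SeqCst)
}
```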
136023e0 | 251 | ## Callbacks from C code to Rust functions |
7cac9316 XL |
252 | |
253 | Some external libraries require the usage of callbacks to report back their | |
254 | current state or intermediate data to the caller. | |
255 | It is possible to pass functions defined in Rust to an external library. | |
256 | The requirement for this is that the callback function is marked as `extern` | |
257 | with the correct calling convention to make it callable from C code. | |
258 | ||
259 | The callback function can then be sent through a registration call | |
260 | to the C library and afterwards be invoked from there. | |
261 | ||
262 | A basic example is: | |
263 | ||
264 | Rust code: | |
265 | ||
266 | ```rust,no_run | |
267 | extern fn callback(a: i32) { | |
268 | println!("I'm called from C with value {0}", a); | |
269 | } | |
270 | ||
271 | #[link(name = "extlib")] | |
272 | extern { | |
273 | fn register_callback(cb: extern fn(i32)) -> i32; | |
274 | fn trigger_callback(); | |
275 | } | |
276 | ||
277 | fn main() { | |
278 | unsafe { | |
279 | register_callback(callback); | |
280 | trigger_callback(); // Triggers the callback. | |
281 | } | |
282 | } | |
283 | ``` | |
284 | ||
285 | C code: | |
286 | ||
287 | ```c | |
288 | typedef void (*rust_callback)(int32_t); | |
289 | rust_callback cb; | |
290 | ||
291 | int32_t register_callback(rust_callback callback) { | |
292 | cb = callback; | |
293 | return 1; | |
294 | } | |
295 | ||
296 | void trigger_callback() { | |
297 | cb(7); // Will call callback(7) in Rust. | |
298 | } | |
299 | ``` | |
300 | ||
301 | In this example Rust's `main()` will call `trigger_callback()` in C, | |
302 | which would, in turn, call back to `callback()` in Rust. | |
303 | ||
7cac9316 XL |
304 | ## Targeting callbacks to Rust objects |
305 | ||
306 | The previous example showed how a global function can be called from C code. | |
307 | However, it is often desirable for the callback to target a specific | |
308 | Rust object, such as the object that wraps the | |
309 | respective C object. | |
310 | ||
311 | This can be achieved by passing a raw pointer to the object down to the | |
312 | C library. The C library can then include the pointer to the Rust object in | |
313 | the notification. This will allow the callback to unsafely access the | |
314 | referenced Rust object. | |
315 | ||
316 | Rust code: | |
317 | ||
318 | ```rust,no_run | |
7cac9316 XL |
319 | struct RustObject { |
320 | a: i32, | |
321 | // Other members... | |
322 | } | |
323 | ||
324 | extern "C" fn callback(target: *mut RustObject, a: i32) { | |
325 | println!("I'm called from C with value {0}", a); | |
326 | unsafe { | |
327 | // Update the value in RustObject with the value received from the callback: | |
328 | (*target).a = a; | |
329 | } | |
330 | } | |
331 | ||
332 | #[link(name = "extlib")] | |
333 | extern { | |
334 | fn register_callback(target: *mut RustObject, | |
335 | cb: extern fn(*mut RustObject, i32)) -> i32; | |
336 | fn trigger_callback(); | |
337 | } | |
338 | ||
339 | fn main() { | |
340 | // Create the object that will be referenced in the callback: | |
341 | let mut rust_object = Box::new(RustObject { a: 5 }); | |
342 | ||
343 | unsafe { | |
344 | register_callback(&mut *rust_object, callback); | |
345 | trigger_callback(); | |
346 | } | |
347 | } | |
348 | ``` | |
349 | ||
350 | C code: | |
351 | ||
352 | ```c | |
353 | typedef void (*rust_callback)(void*, int32_t); | |
354 | void* cb_target; | |
355 | rust_callback cb; | |
356 | ||
357 | int32_t register_callback(void* callback_target, rust_callback callback) { | |
358 | cb_target = callback_target; | |
359 | cb = callback; | |
360 | return 1; | |
361 | } | |
362 | ||
363 | void trigger_callback() { | |
364 | cb(cb_target, 7); // Will call callback(&rustObject, 7) in Rust. | |
365 | } | |
366 | ``` | |
367 | ||
368 | ## Asynchronous callbacks | |
369 | ||
370 | In the previously given examples the callbacks are invoked as a direct reaction | |
371 | to a function call to the external C library. | |
372 | The control over the current thread is switched from Rust to C to Rust for the | |
373 | execution of the callback, but in the end the callback is executed on the | |
374 | same thread that called the function which triggered the callback. | |
375 | ||
376 | Things get more complicated when the external library spawns its own threads | |
377 | and invokes callbacks from there. | |
378 | In these cases access to Rust data structures inside the callbacks is | |
379 | especially unsafe and proper synchronization mechanisms must be used. | |
380 | Besides classical synchronization mechanisms like mutexes, one possibility in | |
381 | Rust is to use channels (in `std::sync::mpsc`) to forward data from the C | |
382 | thread that invoked the callback into a Rust thread. | |
383 | ||
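The channel approach might look like the sketch below. The "foreign" thread is simulated with `std::thread::spawn`, and `on_event` and `collect_events` are illustrative names rather than part of any real library. The sender is boxed so that its address stays stable while the callback is registered.

```rust
use std::os::raw::c_void;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Callback with a C-compatible signature. `target` is assumed to point at a
/// live `Sender<i32>`.
extern "C" fn on_event(target: *mut c_void, value: i32) {
    // SAFETY: `collect_events` keeps the pointee alive until the thread has finished.
    let sender = unsafe { &*(target as *const Sender<i32>) };
    // A send error only means the receiver is gone, so it is ignored here.
    let _ = sender.send(value);
}

pub fn collect_events(count: i32) -> Vec<i32> {
    let (sender, receiver) = channel::<i32>();
    let target = Box::into_raw(Box::new(sender));
    // Raw pointers are not `Send`, so the address crosses the thread boundary as an integer.
    let address = target as usize;

    // Stand-in for a thread owned by the foreign library.
    let foreign_thread = thread::spawn(move || {
        for value in 0..count {
            on_event(address as *mut c_void, value);
        }
    });
    foreign_thread.join().expect("foreign thread panicked");

    // SAFETY: the thread has finished, so no callback can use the pointer any more.
    drop(unsafe { Box::from_raw(target) });
    // Every sender has been dropped, so the iterator ends after the queued values.
    receiver.iter().collect()
}
```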
384 | If an asynchronous callback targets a special object in the Rust address space | |
385 | it is also absolutely necessary that no more callbacks are performed by the | |
386 | C library after the respective Rust object gets destroyed. | |
387 | This can be achieved by unregistering the callback in the object's | |
388 | destructor and designing the library in a way that guarantees that no | |
389 | callback will be performed after deregistration. | |
390 | ||
136023e0 | 391 | ## Linking |
7cac9316 XL |
392 | |
393 | The `link` attribute on `extern` blocks provides the basic building block for | |
394 | instructing rustc how it will link to native libraries. There are two accepted | |
395 | forms of the link attribute today: | |
396 | ||
397 | * `#[link(name = "foo")]` | |
398 | * `#[link(name = "foo", kind = "bar")]` | |
399 | ||
400 | In both of these cases, `foo` is the name of the native library that we're | |
401 | linking to, and in the second case `bar` is the type of native library that the | |
402 | compiler is linking to. There are currently three known types of native | |
403 | libraries: | |
404 | ||
405 | * Dynamic - `#[link(name = "readline")]` | |
406 | * Static - `#[link(name = "my_build_dependency", kind = "static")]` | |
407 | * Frameworks - `#[link(name = "CoreFoundation", kind = "framework")]` | |
408 | ||
409 | Note that frameworks are only available on macOS targets. | |
410 | ||
411 | The different `kind` values are meant to differentiate how the native library | |
412 | participates in linkage. From a linkage perspective, the Rust compiler creates | |
413 | two flavors of artifacts: partial (rlib/staticlib) and final (dylib/binary). | |
414 | Native dynamic library and framework dependencies are propagated to the final | |
415 | artifact boundary, while static library dependencies are not propagated at | |
416 | all, because the static libraries are integrated directly into the subsequent | |
417 | artifact. | |
418 | ||
419 | A few examples of how this model can be used are: | |
420 | ||
421 | * A native build dependency. Sometimes some C/C++ glue is needed when writing | |
422 | some Rust code, but distribution of the C/C++ code in a library format is | |
423 | a burden. In this case, the code will be archived into `libfoo.a` and then the | |
424 | Rust crate would declare a dependency via `#[link(name = "foo", kind = | |
425 | "static")]`. | |
426 | ||
427 | Regardless of the flavor of output for the crate, the native static library | |
428 | will be included in the output, meaning that distribution of the native static | |
429 | library is not necessary. | |
430 | ||
431 | * A normal dynamic dependency. Common system libraries (like `readline`) are | |
432 | available on a large number of systems, and often a static copy of these | |
433 | libraries cannot be found. When this dependency is included in a Rust crate, | |
434 | partial targets (like rlibs) will not link to the library, but when the rlib | |
435 | is included in a final target (like a binary), the native library will be | |
436 | linked in. | |
437 | ||
438 | On macOS, frameworks behave with the same semantics as a dynamic library. | |
439 | ||
136023e0 | 440 | ## Unsafe blocks |
7cac9316 XL |
441 | |
442 | Some operations, like dereferencing raw pointers or calling functions that have been marked | |
443 | unsafe, are only allowed inside unsafe blocks. Unsafe blocks isolate unsafety and are a promise to | |
444 | the compiler that the unsafety does not leak out of the block. | |
445 | ||
446 | Unsafe functions, on the other hand, advertise it to the world. An unsafe function is written like | |
447 | this: | |
448 | ||
449 | ```rust | |
450 | unsafe fn kaboom(ptr: *const i32) -> i32 { *ptr } | |
451 | ``` | |
452 | ||
453 | This function can only be called from an `unsafe` block or another `unsafe` function. | |
454 | ||
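As an illustration of how the two fit together, an `unsafe` function can be wrapped in a safe one whenever the wrapper can uphold the function's requirements for every input. The wrapper name `read_or_default` below is made up for this sketch.

```rust
/// Unsafe: the caller must guarantee that `ptr` is valid for reads.
unsafe fn kaboom(ptr: *const i32) -> i32 {
    unsafe { *ptr }
}

/// Safe wrapper: a reference is always valid for reads, so the requirement of
/// `kaboom` holds for every possible argument.
pub fn read_or_default(value: Option<&i32>, default: i32) -> i32 {
    match value {
        // SAFETY: `r` is a reference, so the pointer derived from it is valid.
        Some(r) => unsafe { kaboom(r as *const i32) },
        None => default,
    }
}
```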
136023e0 | 455 | ## Accessing foreign globals |
7cac9316 XL |
456 | |
457 | Foreign APIs often export a global variable which could do something like track | |
458 | global state. In order to access these variables, you declare them in `extern` | |
459 | blocks with the `static` keyword: | |
460 | ||
136023e0 | 461 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
462 | ```rust,ignore |
463 | extern crate libc; | |
464 | ||
465 | #[link(name = "readline")] | |
466 | extern { | |
467 | static rl_readline_version: libc::c_int; | |
468 | } | |
469 | ||
470 | fn main() { | |
471 | println!("You have readline version {} installed.", | |
472 | unsafe { rl_readline_version as i32 }); | |
473 | } | |
474 | ``` | |
475 | ||
476 | Alternatively, you may need to alter global state provided by a foreign | |
477 | interface. To do this, statics can be declared with `mut` so we can mutate | |
478 | them. | |
479 | ||
136023e0 | 480 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
481 | ```rust,ignore |
482 | extern crate libc; | |
483 | ||
484 | use std::ffi::CString; | |
485 | use std::ptr; | |
486 | ||
487 | #[link(name = "readline")] | |
488 | extern { | |
489 | static mut rl_prompt: *const libc::c_char; | |
490 | } | |
491 | ||
492 | fn main() { | |
493 | let prompt = CString::new("[my-awesome-shell] $").unwrap(); | |
494 | unsafe { | |
495 | rl_prompt = prompt.as_ptr(); | |
496 | ||
497 | println!("{:?}", rl_prompt); | |
498 | ||
499 | rl_prompt = ptr::null(); | |
500 | } | |
501 | } | |
502 | ``` | |
503 | ||
504 | Note that all interaction with a `static mut` is unsafe, both reading and | |
505 | writing. Dealing with global mutable state requires a great deal of care. | |
506 | ||
136023e0 | 507 | ## Foreign calling conventions |
7cac9316 XL |
508 | |
509 | Most foreign code exposes a C ABI, and Rust uses the platform's C calling convention by default when | |
510 | calling foreign functions. Some foreign functions, most notably the Windows API, use other calling | |
511 | conventions. Rust provides a way to tell the compiler which convention to use: | |
512 | ||
136023e0 | 513 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
514 | ```rust,ignore |
515 | extern crate libc; | |
516 | ||
517 | #[cfg(all(target_os = "windows", target_arch = "x86"))] | |
518 | #[link(name = "kernel32")] | |
519 | #[allow(non_snake_case)] | |
520 | extern "stdcall" { | |
521 | fn SetEnvironmentVariableA(n: *const u8, v: *const u8) -> libc::c_int; | |
522 | } | |
523 | # fn main() { } | |
524 | ``` | |
525 | ||
526 | This applies to the entire `extern` block. The list of supported ABI constraints | |
527 | is: | |
528 | ||
529 | * `stdcall` | |
530 | * `aapcs` | |
531 | * `cdecl` | |
532 | * `fastcall` | |
533 | * `vectorcall` | |
534 | This is currently hidden behind the `abi_vectorcall` gate and is subject to change. | |
535 | * `Rust` | |
536 | * `rust-intrinsic` | |
537 | * `system` | |
538 | * `C` | |
539 | * `win64` | |
540 | * `sysv64` | |
541 | ||
542 | Most of the ABIs in this list are self-explanatory, but the `system` ABI may | |
543 | seem a little odd. This constraint selects whatever the appropriate ABI is for | |
544 | interoperating with the target's libraries. For example, on win32 with an x86 | |
545 | architecture, this means that the ABI used would be `stdcall`. On x86_64, | |
546 | however, Windows uses the `C` calling convention, so `C` would be used. This | |
547 | means that in our previous example, we could have used `extern "system" { ... }` | |
548 | to define a block for all Windows systems, not only x86 ones. | |
549 | ||
136023e0 | 550 | ## Interoperability with foreign code |
7cac9316 XL |
551 | |
552 | Rust guarantees that the layout of a `struct` is compatible with the platform's | |
553 | representation in C only if the `#[repr(C)]` attribute is applied to it. | |
554 | `#[repr(C, packed)]` can be used to lay out struct members without padding. | |
555 | `#[repr(C)]` can also be applied to an enum. | |
556 | ||
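The effect of these attributes can be checked with `size_of`. The struct names below are invented for illustration, and the sizes assume a target where `u32` is 4-byte aligned, which is the case on common platforms.

```rust
use std::mem::size_of;

/// C-compatible layout: 1 byte for `tag`, 3 bytes of padding, 4 bytes for `length`.
#[repr(C)]
pub struct Header {
    pub tag: u8,
    pub length: u32,
}

/// Same fields with no padding between them.
#[repr(C, packed)]
pub struct PackedHeader {
    pub tag: u8,
    pub length: u32,
}

/// Returns the sizes of the padded and the packed layout, in that order.
pub fn header_sizes() -> (usize, usize) {
    (size_of::<Header>(), size_of::<PackedHeader>())
}
```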
557 | Rust's owned boxes (`Box<T>`) use non-nullable pointers as handles which point | |
558 | to the contained object. However, they should not be manually created because | |
559 | they are managed by internal allocators. References can safely be assumed to be | |
560 | non-nullable pointers directly to the type. However, breaking the borrow | |
561 | checking or mutability rules is not guaranteed to be safe, so prefer using raw | |
562 | pointers (`*`) if that's needed because the compiler can't make as many | |
563 | assumptions about them. | |
564 | ||
565 | Vectors and strings share the same basic memory layout, and utilities are | |
566 | available in the `vec` and `str` modules for working with C APIs. However, | |
567 | strings are not terminated with `\0`. If you need a NUL-terminated string for | |
568 | interoperability with C, you should use the `CString` type in the `std::ffi` | |
569 | module. | |
570 | ||
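A brief sketch of the difference, using only `std::ffi`: `CString::new` appends the terminating NUL and rejects input that already contains one, and `CStr::from_ptr` reads a NUL-terminated string back from a raw pointer. The helper names `to_c_bytes` and `round_trip` are illustrative.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Returns the bytes C would see, including the trailing NUL, or `None` if
/// `text` contains an interior NUL byte.
pub fn to_c_bytes(text: &str) -> Option<Vec<u8>> {
    CString::new(text).ok().map(|c| c.into_bytes_with_nul())
}

/// Converts to a C string and back through a raw `*const c_char`.
pub fn round_trip(text: &str) -> Option<String> {
    let owned = CString::new(text).ok()?;
    let ptr: *const c_char = owned.as_ptr();
    // SAFETY: `ptr` points into `owned`, which is NUL-terminated and still alive.
    let borrowed = unsafe { CStr::from_ptr(ptr) };
    borrowed.to_str().ok().map(String::from)
}
```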
571 | The [`libc` crate on crates.io][libc] includes type aliases and function | |
572 | definitions for the C standard library in the `libc` module, and Rust links | |
573 | against `libc` and `libm` by default. | |
574 | ||
136023e0 | 575 | ## Variadic functions |
7cac9316 XL |
576 | |
577 | In C, functions can be 'variadic', meaning they accept a variable number of arguments. This can | |
578 | be achieved in Rust by specifying `...` within the argument list of a foreign function declaration: | |
579 | ||
580 | ```no_run | |
581 | extern { | |
582 | fn foo(x: i32, ...); | |
583 | } | |
584 | ||
585 | fn main() { | |
586 | unsafe { | |
587 | foo(10, 20, 30, 40, 50); | |
588 | } | |
589 | } | |
590 | ``` | |
591 | ||
592 | Normal Rust functions can *not* be variadic: | |
593 | ||
136023e0 | 594 | ```rust,compile_fail |
7cac9316 XL |
595 | // This will not compile |
596 | ||
136023e0 | 597 | fn foo(x: i32, ...) {} |
7cac9316 XL |
598 | ``` |
599 | ||
136023e0 | 600 | ## The "nullable pointer optimization" |
7cac9316 XL |
601 | |
602 | Certain Rust types are defined to never be `null`. This includes references (`&T`, | |
603 | `&mut T`), boxes (`Box<T>`), and function pointers (`extern "abi" fn()`). When | |
604 | interfacing with C, pointers that might be `null` are often used, which would seem to | |
605 | require some messy `transmute`s and/or unsafe code to handle conversions to/from Rust types. | |
606 | However, the language provides a workaround. | |
607 | ||
608 | As a special case, an `enum` is eligible for the "nullable pointer optimization" if it contains | |
609 | exactly two variants, one of which contains no data and the other contains a field of one of the | |
610 | non-nullable types listed above. This means no extra space is required for a discriminant; rather, | |
611 | the empty variant is represented by putting a `null` value into the non-nullable field. This is | |
612 | called an "optimization", but unlike other optimizations it is guaranteed to apply to eligible | |
613 | types. | |
614 | ||
615 | The most common type that takes advantage of the nullable pointer optimization is `Option<T>`, | |
616 | where `None` corresponds to `null`. So `Option<extern "C" fn(c_int) -> c_int>` is a correct way | |
617 | to represent a nullable function pointer using the C ABI (corresponding to the C type | |
618 | `int (*)(int)`). | |
619 | ||
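The size guarantee can be observed directly. The following sketch compares the sizes of a few `Option`-wrapped non-nullable types with their plain pointer counterparts; the alias `Callback` and the helper functions are names chosen for this example.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

/// Corresponds to the C type `int (*)(int)`.
pub type Callback = extern "C" fn(c_int) -> c_int;

/// True when wrapping these non-nullable types in `Option` adds no space.
pub fn option_is_pointer_sized() -> bool {
    size_of::<Option<Callback>>() == size_of::<Callback>()
        && size_of::<Option<&u8>>() == size_of::<*const u8>()
        && size_of::<Option<Box<u8>>>() == size_of::<*const u8>()
}

pub extern "C" fn double(x: c_int) -> c_int {
    x * 2
}

/// Treats `None` the way C code would treat a null function pointer.
pub fn call_or_zero(callback: Option<Callback>, x: c_int) -> c_int {
    match callback {
        Some(f) => f(x),
        None => 0,
    }
}
```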
620 | Here is a contrived example. Let's say some C library has a facility for registering a | |
621 | callback, which gets called in certain situations. The callback is passed a function pointer | |
622 | and an integer and it is supposed to run the function with the integer as a parameter. So | |
623 | we have function pointers flying across the FFI boundary in both directions. | |
624 | ||
136023e0 | 625 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
626 | ```rust,ignore |
627 | extern crate libc; | |
628 | use libc::c_int; | |
629 | ||
630 | # #[cfg(hidden)] | |
631 | extern "C" { | |
632 | /// Registers the callback. | |
633 | fn register(cb: Option<extern "C" fn(Option<extern "C" fn(c_int) -> c_int>, c_int) -> c_int>); | |
634 | } | |
635 | # unsafe fn register(_: Option<extern "C" fn(Option<extern "C" fn(c_int) -> c_int>, | |
636 | # c_int) -> c_int>) | |
637 | # {} | |
638 | ||
639 | /// This fairly useless function receives a function pointer and an integer | |
640 | /// from C, and returns the result of calling the function with the integer. | |
641 | /// In case no function is provided, it squares the integer by default. | |
642 | extern "C" fn apply(process: Option<extern "C" fn(c_int) -> c_int>, int: c_int) -> c_int { | |
643 | match process { | |
644 | Some(f) => f(int), | |
645 | None => int * int | |
646 | } | |
647 | } | |
648 | ||
649 | fn main() { | |
650 | unsafe { | |
651 | register(Some(apply)); | |
652 | } | |
653 | } | |
654 | ``` | |
655 | ||
656 | And the code on the C side looks like this: | |
657 | ||
658 | ```c | |
f9f354fc | 659 | void register(int (*f)(int (*)(int), int)) { |
7cac9316 XL |
660 | ... |
661 | } | |
662 | ``` | |
663 | ||
664 | No `transmute` required! | |
665 | ||
136023e0 | 666 | ## Calling Rust code from C |
7cac9316 XL |
667 | |
668 | You may wish to compile Rust code in a way so that it can be called from C. This is | |
669 | fairly easy, but requires a few things: | |
670 | ||
671 | ```rust | |
672 | #[no_mangle] | |
74b04a01 | 673 | pub extern "C" fn hello_rust() -> *const u8 { |
7cac9316 XL |
674 | "Hello, world!\0".as_ptr() |
675 | } | |
676 | # fn main() {} | |
677 | ``` | |
678 | ||
74b04a01 | 679 | The `extern "C"` makes this function adhere to the C calling convention, as |
7cac9316 XL |
680 | discussed above in "[Foreign Calling |
681 | Conventions](ffi.html#foreign-calling-conventions)". The `no_mangle` | |
682 | attribute turns off Rust's name mangling, so that it is easier to link to. | |
683 | ||
136023e0 | 684 | ## FFI and panics |
7cac9316 XL |
685 | |
686 | It’s important to be mindful of `panic!`s when working with FFI. A `panic!` | |
687 | across an FFI boundary is undefined behavior. If you’re writing code that may | |
688 | panic, you should run it in a closure with [`catch_unwind`]: | |
689 | ||
690 | ```rust | |
691 | use std::panic::catch_unwind; | |
692 | ||
693 | #[no_mangle] | |
694 | pub extern fn oh_no() -> i32 { | |
695 | let result = catch_unwind(|| { | |
696 | panic!("Oops!"); | |
697 | }); | |
698 | match result { | |
699 | Ok(_) => 0, | |
700 | Err(_) => 1, | |
701 | } | |
702 | } | |
703 | ||
704 | fn main() {} | |
705 | ``` | |
706 | ||
707 | Please note that [`catch_unwind`] will only catch unwinding panics, not | |
708 | those that abort the process. See the documentation of [`catch_unwind`] | |
709 | for more information. | |
710 | ||
711 | [`catch_unwind`]: ../std/panic/fn.catch_unwind.html | |
712 | ||
136023e0 | 713 | ## Representing opaque structs |
7cac9316 | 714 | |
136023e0 XL |
715 | Sometimes, a C library wants to provide a pointer to something, but not let you know the internal details of the thing it wants. |
716 | A stable and simple way is to use a `void *` argument: | |
7cac9316 XL |
717 | |
718 | ```c | |
719 | void foo(void *arg); | |
720 | void bar(void *arg); | |
721 | ``` | |
722 | ||
723 | We can represent this in Rust with the `c_void` type: | |
724 | ||
136023e0 | 725 | <!-- ignore: requires libc crate --> |
7cac9316 XL |
726 | ```rust,ignore |
727 | extern crate libc; | |
728 | ||
729 | extern "C" { | |
730 | pub fn foo(arg: *mut libc::c_void); | |
731 | pub fn bar(arg: *mut libc::c_void); | |
732 | } | |
733 | # fn main() {} | |
734 | ``` | |
735 | ||
736 | This is a perfectly valid way of handling the situation. However, we can do a bit | |
737 | better. To solve this, some C libraries will instead create a `struct`, where | |
738 | the details and memory layout of the struct are private. This gives some amount | |
739 | of type safety. These structures are called ‘opaque’. Here’s an example, in C: | |
740 | ||
741 | ```c | |
742 | struct Foo; /* Foo is a structure, but its contents are not part of the public interface */ | |
743 | struct Bar; | |
744 | void foo(struct Foo *arg); | |
745 | void bar(struct Bar *arg); | |
746 | ``` | |
747 | ||
b7449926 | 748 | To do this in Rust, let’s create our own opaque types: |
7cac9316 XL |
749 | |
750 | ```rust | |
6a06907d XL |
751 | #[repr(C)] |
752 | pub struct Foo { | |
753 | _data: [u8; 0], | |
754 | _marker: | |
755 | core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>, | |
756 | } | |
757 | #[repr(C)] | |
758 | pub struct Bar { | |
759 | _data: [u8; 0], | |
760 | _marker: | |
761 | core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>, | |
762 | } | |
7cac9316 XL |
763 | |
764 | extern "C" { | |
765 | pub fn foo(arg: *mut Foo); | |
766 | pub fn bar(arg: *mut Bar); | |
767 | } | |
768 | # fn main() {} | |
769 | ``` | |
770 | ||
6a06907d | 771 | By including at least one private field and no constructor, |
b7449926 XL |
772 | we create an opaque type that we can't instantiate outside of this module. |
773 | (A struct with no field could be instantiated by anyone.) | |
774 | We also want to use this type in FFI, so we have to add `#[repr(C)]`. | |
6a06907d XL |
775 | The marker ensures the compiler does not mark the struct as `Send`, `Sync`, or `Unpin` |
776 | (`*mut u8` is not `Send` or `Sync`, and `PhantomPinned` is not `Unpin`). | |
b7449926 XL |
777 | |
778 | But because our `Foo` and `Bar` types are | |
7cac9316 XL |
779 | different, we’ll get type safety between the two of them, so we cannot |
780 | accidentally pass a pointer to `Foo` to `bar()`. | |
b7449926 XL |
781 | |
782 | Note that it is a really bad idea to use an empty enum as an FFI type. | |
783 | The compiler relies on empty enums being uninhabited, so handling values of type | |
784 | `&Empty` is a huge footgun and can lead to buggy program behavior (by triggering | |
785 | undefined behavior). | |
136023e0 XL |
786 | |
787 | > **NOTE:** The simplest way would be to use "extern types". | |
788 | However, the feature is currently (as of June 2021) unstable and has some unresolved questions; see the [RFC page][extern-type-rfc] and the [tracking issue][extern-type-issue] for more details. | |
789 | ||
790 | [extern-type-issue]: https://github.com/rust-lang/rust/issues/43467 | |
791 | [extern-type-rfc]: https://rust-lang.github.io/rfcs/1861-extern-types.html |