Commit | Line | Data |
---|---|---|
1 | # Foreign Function Interface | |
2 | ||
3 | ## Introduction | |
4 | ||
5 | This guide will use the [snappy](https://github.com/google/snappy) | |
6 | compression/decompression library as an introduction to writing bindings for | |
7 | foreign code. Rust is currently unable to call directly into a C++ library, but | |
8 | snappy includes a C interface (documented in | |
9 | [`snappy-c.h`](https://github.com/google/snappy/blob/master/snappy-c.h)). | |
10 | ||
11 | ## A note about libc | |
12 | ||
13 | Many of these examples use [the `libc` crate][libc], which provides various | |
14 | type definitions for C types, among other things. If you’re trying these | |
15 | examples yourself, you’ll need to add `libc` to your `Cargo.toml`: | |
16 | ||
17 | ```toml | |
18 | [dependencies] | |
19 | libc = "0.2.0" | |
20 | ``` | |
21 | ||
22 | [libc]: https://crates.io/crates/libc | |
23 | ||
24 | ## Calling foreign functions | |
25 | ||
26 | The following is a minimal example of calling a foreign function which will | |
27 | compile if snappy is installed: | |
28 | ||
29 | <!-- ignore: requires libc crate --> | |
30 | ```rust,ignore | |
31 | use libc::size_t; | |
32 | ||
33 | #[link(name = "snappy")] | |
34 | extern { | |
35 | fn snappy_max_compressed_length(source_length: size_t) -> size_t; | |
36 | } | |
37 | ||
38 | fn main() { | |
39 | let x = unsafe { snappy_max_compressed_length(100) }; | |
40 | println!("max compressed length of a 100 byte buffer: {}", x); | |
41 | } | |
42 | ``` | |
43 | ||
44 | The `extern` block is a list of function signatures in a foreign library, in | |
45 | this case with the platform's C ABI. The `#[link(...)]` attribute is used to | |
46 | instruct the linker to link against the snappy library so the symbols are | |
47 | resolved. | |
48 | ||
49 | Foreign functions are assumed to be unsafe so calls to them need to be wrapped | |
50 | with `unsafe {}` as a promise to the compiler that everything contained within | |
51 | truly is safe. C libraries often expose interfaces that aren't thread-safe, and | |
52 | almost any function that takes a pointer argument isn't valid for all possible | |
53 | inputs since the pointer could be dangling, and raw pointers fall outside of | |
54 | Rust's safe memory model. | |
55 | ||
56 | When declaring the argument types to a foreign function, the Rust compiler | |
57 | cannot check if the declaration is correct, so specifying it correctly is part | |
58 | of keeping the binding correct at runtime. | |
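
As a minimal sketch that needs neither snappy nor the `libc` crate, the same declaration pattern can be tried against `abs` from the C standard library, which Rust links by default. The wrapper name `c_abs` is hypothetical and only serves the illustration. It also shows that even a function without pointer arguments can have preconditions: in C, `abs(INT_MIN)` is undefined behavior, so the safe wrapper rejects that input.

```rust
use std::os::raw::c_int;

// `abs` lives in the C standard library, so no `#[link]` attribute is needed.
// `unsafe extern` is accepted by recent compilers; older ones use plain `extern "C"`.
unsafe extern "C" {
    fn abs(input: c_int) -> c_int;
}

/// Hypothetical safe wrapper around the foreign `abs`.
/// Returns `None` for `c_int::MIN`, for which the C function is undefined.
pub fn c_abs(x: c_int) -> Option<c_int> {
    if x == c_int::MIN {
        None
    } else {
        // The declaration matches the C prototype `int abs(int)`, and the
        // only problematic input has been excluded above.
        Some(unsafe { abs(x) })
    }
}
```

Callers of `c_abs` never write `unsafe` themselves; the responsibility for the declaration being correct stays inside the wrapper.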
59 | ||
60 | The `extern` block can be extended to cover the entire snappy API: | |
61 | ||
62 | <!-- ignore: requires libc crate --> | |
63 | ```rust,ignore | |
64 | use libc::{c_int, size_t}; | |
65 | ||
66 | #[link(name = "snappy")] | |
67 | extern { | |
68 | fn snappy_compress(input: *const u8, | |
69 | input_length: size_t, | |
70 | compressed: *mut u8, | |
71 | compressed_length: *mut size_t) -> c_int; | |
72 | fn snappy_uncompress(compressed: *const u8, | |
73 | compressed_length: size_t, | |
74 | uncompressed: *mut u8, | |
75 | uncompressed_length: *mut size_t) -> c_int; | |
76 | fn snappy_max_compressed_length(source_length: size_t) -> size_t; | |
77 | fn snappy_uncompressed_length(compressed: *const u8, | |
78 | compressed_length: size_t, | |
79 | result: *mut size_t) -> c_int; | |
80 | fn snappy_validate_compressed_buffer(compressed: *const u8, | |
81 | compressed_length: size_t) -> c_int; | |
82 | } | |
83 | # fn main() {} | |
84 | ``` | |
85 | ||
86 | ## Creating a safe interface | |
87 | ||
88 | The raw C API needs to be wrapped to provide memory safety and make use of higher-level concepts | |
89 | like vectors. A library can choose to expose only the safe, high-level interface and hide the unsafe | |
90 | internal details. | |
91 | ||
92 | Wrapping the functions which expect buffers involves using the `as_ptr` and `as_mut_ptr` methods to pass Rust | |
93 | vectors and slices as pointers to memory. Rust's vectors are guaranteed to be a contiguous block of memory. The | |
94 | length is the number of elements currently contained, and the capacity is the total size in elements of | |
95 | the allocated memory. The length is less than or equal to the capacity. | |
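
The difference between length and capacity matters as soon as foreign code writes into a vector. The sketch below uses `std::ptr::copy_nonoverlapping` as a stand-in for a C function that fills an output buffer; the function name `copy_through_raw_pointer` is hypothetical.

```rust
/// Copies `src` into a new vector by writing through a raw pointer,
/// the way a C function would fill an output buffer.
pub fn copy_through_raw_pointer(src: &[u8]) -> Vec<u8> {
    // Reserve space: the capacity is at least `src.len()`, the length is 0.
    let mut dst: Vec<u8> = Vec::with_capacity(src.len());
    debug_assert!(dst.is_empty() && dst.capacity() >= src.len());

    unsafe {
        // Stand-in for the foreign call that writes `src.len()` bytes.
        std::ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len());
        // Only now tell the vector how many elements were initialized.
        // The new length must never exceed the capacity.
        dst.set_len(src.len());
    }
    dst
}
```

Calling `set_len` before the bytes are written, or with a value larger than the capacity, would expose uninitialized or out-of-bounds memory to safe code.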
96 | ||
97 | <!-- ignore: requires libc crate --> | |
98 | ```rust,ignore | |
99 | # use libc::{c_int, size_t}; | |
100 | # unsafe fn snappy_validate_compressed_buffer(_: *const u8, _: size_t) -> c_int { 0 } | |
101 | # fn main() {} | |
102 | pub fn validate_compressed_buffer(src: &[u8]) -> bool { | |
103 | unsafe { | |
104 | snappy_validate_compressed_buffer(src.as_ptr(), src.len() as size_t) == 0 | |
105 | } | |
106 | } | |
107 | ``` | |
108 | ||
109 | The `validate_compressed_buffer` wrapper above makes use of an `unsafe` block, but it makes the | |
110 | guarantee that calling it is safe for all inputs by leaving off `unsafe` from the function | |
111 | signature. | |
112 | ||
113 | The `snappy_compress` and `snappy_uncompress` functions are more complex, since a buffer has to be | |
114 | allocated to hold the output too. | |
115 | ||
116 | The `snappy_max_compressed_length` function can be used to allocate a vector with the maximum | |
117 | required capacity to hold the compressed output. The vector can then be passed to the | |
118 | `snappy_compress` function as an output parameter. An output parameter is also passed to retrieve | |
119 | the true length after compression for setting the length. | |
120 | ||
121 | <!-- ignore: requires libc crate --> | |
122 | ```rust,ignore | |
123 | # use libc::{size_t, c_int}; | |
124 | # unsafe fn snappy_compress(a: *const u8, b: size_t, c: *mut u8, | |
125 | # d: *mut size_t) -> c_int { 0 } | |
126 | # unsafe fn snappy_max_compressed_length(a: size_t) -> size_t { a } | |
127 | # fn main() {} | |
128 | pub fn compress(src: &[u8]) -> Vec<u8> { | |
129 | unsafe { | |
130 | let srclen = src.len() as size_t; | |
131 | let psrc = src.as_ptr(); | |
132 | ||
133 | let mut dstlen = snappy_max_compressed_length(srclen); | |
134 | let mut dst = Vec::with_capacity(dstlen as usize); | |
135 | let pdst = dst.as_mut_ptr(); | |
136 | ||
137 | snappy_compress(psrc, srclen, pdst, &mut dstlen); | |
138 | dst.set_len(dstlen as usize); | |
139 | dst | |
140 | } | |
141 | } | |
142 | ``` | |
143 | ||
144 | Decompression is similar, because snappy stores the uncompressed size as part of the compression | |
145 | format and `snappy_uncompressed_length` will retrieve the exact buffer size required. | |
146 | ||
147 | <!-- ignore: requires libc crate --> | |
148 | ```rust,ignore | |
149 | # use libc::{size_t, c_int}; | |
150 | # unsafe fn snappy_uncompress(compressed: *const u8, | |
151 | # compressed_length: size_t, | |
152 | # uncompressed: *mut u8, | |
153 | # uncompressed_length: *mut size_t) -> c_int { 0 } | |
154 | # unsafe fn snappy_uncompressed_length(compressed: *const u8, | |
155 | # compressed_length: size_t, | |
156 | # result: *mut size_t) -> c_int { 0 } | |
157 | # fn main() {} | |
158 | pub fn uncompress(src: &[u8]) -> Option<Vec<u8>> { | |
159 | unsafe { | |
160 | let srclen = src.len() as size_t; | |
161 | let psrc = src.as_ptr(); | |
162 | ||
163 | let mut dstlen: size_t = 0; | |
164 | snappy_uncompressed_length(psrc, srclen, &mut dstlen); | |
165 | ||
166 | let mut dst = Vec::with_capacity(dstlen as usize); | |
167 | let pdst = dst.as_mut_ptr(); | |
168 | ||
169 | if snappy_uncompress(psrc, srclen, pdst, &mut dstlen) == 0 { | |
170 | dst.set_len(dstlen as usize); | |
171 | Some(dst) | |
172 | } else { | |
173 | None // SNAPPY_INVALID_INPUT | |
174 | } | |
175 | } | |
176 | } | |
177 | ``` | |
178 | ||
179 | Then, we can add some tests to show how to use them. | |
180 | ||
181 | <!-- ignore: requires libc crate --> | |
182 | ```rust,ignore | |
183 | # use libc::{c_int, size_t}; | |
184 | # unsafe fn snappy_compress(input: *const u8, | |
185 | # input_length: size_t, | |
186 | # compressed: *mut u8, | |
187 | # compressed_length: *mut size_t) | |
188 | # -> c_int { 0 } | |
189 | # unsafe fn snappy_uncompress(compressed: *const u8, | |
190 | # compressed_length: size_t, | |
191 | # uncompressed: *mut u8, | |
192 | # uncompressed_length: *mut size_t) | |
193 | # -> c_int { 0 } | |
194 | # unsafe fn snappy_max_compressed_length(source_length: size_t) -> size_t { 0 } | |
195 | # unsafe fn snappy_uncompressed_length(compressed: *const u8, | |
196 | # compressed_length: size_t, | |
197 | # result: *mut size_t) | |
198 | # -> c_int { 0 } | |
199 | # unsafe fn snappy_validate_compressed_buffer(compressed: *const u8, | |
200 | # compressed_length: size_t) | |
201 | # -> c_int { 0 } | |
202 | # fn main() { } | |
203 | # | |
204 | #[cfg(test)] | |
205 | mod tests { | |
206 | use super::*; | |
207 | ||
208 | #[test] | |
209 | fn valid() { | |
210 | let d = vec![0xde, 0xad, 0xd0, 0x0d]; | |
211 | let c: &[u8] = &compress(&d); | |
212 | assert!(validate_compressed_buffer(c)); | |
213 | assert!(uncompress(c) == Some(d)); | |
214 | } | |
215 | ||
216 | #[test] | |
217 | fn invalid() { | |
218 | let d = vec![0, 0, 0, 0]; | |
219 | assert!(!validate_compressed_buffer(&d)); | |
220 | assert!(uncompress(&d).is_none()); | |
221 | } | |
222 | ||
223 | #[test] | |
224 | fn empty() { | |
225 | let d = vec![]; | |
226 | assert!(!validate_compressed_buffer(&d)); | |
227 | assert!(uncompress(&d).is_none()); | |
228 | let c = compress(&d); | |
229 | assert!(validate_compressed_buffer(&c)); | |
230 | assert!(uncompress(&c) == Some(d)); | |
231 | } | |
232 | } | |
233 | ``` | |
234 | ||
235 | ## Destructors | |
236 | ||
237 | Foreign libraries often hand off ownership of resources to the calling code. | |
238 | When this occurs, we must use Rust's destructors to provide safety and guarantee | |
239 | the release of these resources (especially in the case of panic). | |
240 | ||
241 | For more about destructors, see the [Drop trait](../std/ops/trait.Drop.html). | |
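
As an illustration, here is a minimal sketch of a wrapper that owns memory obtained from C's `malloc` and releases it with `free` in its destructor. The type name `CBuffer` and its methods are hypothetical; `malloc` and `free` are the C standard library functions, declared here by hand rather than through the `libc` crate.

```rust
use std::ffi::c_void;

// `unsafe extern` is accepted by recent compilers; older ones use plain `extern "C"`.
unsafe extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
}

/// Hypothetical owner of a zero-initialized buffer allocated by C.
pub struct CBuffer {
    ptr: *mut u8,
    len: usize,
}

impl CBuffer {
    /// Allocates `len` zeroed bytes, or returns `None` if `malloc` fails.
    pub fn zeroed(len: usize) -> Option<CBuffer> {
        // Request at least one byte so a null result always means failure.
        let ptr = unsafe { malloc(len.max(1)) } as *mut u8;
        if ptr.is_null() {
            return None;
        }
        unsafe { std::ptr::write_bytes(ptr, 0, len) };
        Some(CBuffer { ptr, len })
    }

    pub fn as_slice(&self) -> &[u8] {
        // `ptr` is non-null and points to `len` initialized bytes.
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}

impl Drop for CBuffer {
    fn drop(&mut self) {
        // Runs on normal scope exit and during unwinding from a panic.
        unsafe { free(self.ptr as *mut c_void) }
    }
}
```

Because the pointer is private and only released in `drop`, safe code cannot free it twice or use it after it has been freed.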
242 | ||
243 | ## Callbacks from C code to Rust functions | |
244 | ||
245 | Some external libraries require the usage of callbacks to report back their | |
246 | current state or intermediate data to the caller. | |
247 | It is possible to pass functions defined in Rust to an external library. | |
248 | The requirement for this is that the callback function is marked as `extern` | |
249 | with the correct calling convention to make it callable from C code. | |
250 | ||
251 | The callback function can then be sent through a registration call | |
252 | to the C library and afterwards be invoked from there. | |
253 | ||
254 | A basic example is: | |
255 | ||
256 | Rust code: | |
257 | ||
258 | ```rust,no_run | |
259 | extern fn callback(a: i32) { | |
260 | println!("I'm called from C with value {0}", a); | |
261 | } | |
262 | ||
263 | #[link(name = "extlib")] | |
264 | extern { | |
265 | fn register_callback(cb: extern fn(i32)) -> i32; | |
266 | fn trigger_callback(); | |
267 | } | |
268 | ||
269 | fn main() { | |
270 | unsafe { | |
271 | register_callback(callback); | |
272 | trigger_callback(); // Triggers the callback. | |
273 | } | |
274 | } | |
275 | ``` | |
276 | ||
277 | C code: | |
278 | ||
279 | ```c | |
280 | typedef void (*rust_callback)(int32_t); | |
281 | rust_callback cb; | |
282 | ||
283 | int32_t register_callback(rust_callback callback) { | |
284 | cb = callback; | |
285 | return 1; | |
286 | } | |
287 | ||
288 | void trigger_callback() { | |
289 | cb(7); // Will call callback(7) in Rust. | |
290 | } | |
291 | ``` | |
292 | ||
293 | In this example Rust's `main()` will call `trigger_callback()` in C, | |
294 | which would, in turn, call back to `callback()` in Rust. | |
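
The example above needs a C library named `extlib` to run. A self-contained variant of the same idea, sketched here, passes a Rust comparison function to `qsort` from the C standard library. The names `compare_i32` and `sort_with_qsort` are hypothetical.

```rust
use std::ffi::c_void;
use std::os::raw::c_int;

// `unsafe extern` is accepted by recent compilers; older ones use plain `extern "C"`.
unsafe extern "C" {
    // C prototype: void qsort(void *base, size_t nmemb, size_t size,
    //                         int (*compar)(const void *, const void *));
    fn qsort(
        base: *mut c_void,
        nmemb: usize,
        size: usize,
        compar: extern "C" fn(*const c_void, *const c_void) -> c_int,
    );
}

/// Callback invoked by C. It must use the C calling convention.
extern "C" fn compare_i32(a: *const c_void, b: *const c_void) -> c_int {
    // `qsort` passes pointers to two elements of the array being sorted.
    let (a, b) = unsafe { (*(a as *const i32), *(b as *const i32)) };
    // `Ordering` converts to -1, 0, or 1, which is what `qsort` expects.
    a.cmp(&b) as c_int
}

/// Sorts a slice in place by handing it to the C library.
pub fn sort_with_qsort(values: &mut [i32]) {
    unsafe {
        qsort(
            values.as_mut_ptr() as *mut c_void,
            values.len(),
            std::mem::size_of::<i32>(),
            compare_i32,
        );
    }
}
```

The control flow is the same as in the `extlib` example: Rust calls into C, and C calls back into the Rust function on the same thread.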
295 | ||
296 | ## Targeting callbacks to Rust objects | |
297 | ||
298 | The previous example showed how a global function can be called from C code. | |
299 | However, it is often desirable to target the callback at a specific | |
300 | Rust object. This could be the object that represents the wrapper for the | |
301 | respective C object. | |
302 | ||
303 | This can be achieved by passing a raw pointer to the object down to the | |
304 | C library. The C library can then include the pointer to the Rust object in | |
305 | the notification. This will allow the callback to unsafely access the | |
306 | referenced Rust object. | |
307 | ||
308 | Rust code: | |
309 | ||
310 | ```rust,no_run | |
311 | struct RustObject { | |
312 | a: i32, | |
313 | // Other members... | |
314 | } | |
315 | ||
316 | extern "C" fn callback(target: *mut RustObject, a: i32) { | |
317 | println!("I'm called from C with value {0}", a); | |
318 | unsafe { | |
319 | // Update the value in RustObject with the value received from the callback: | |
320 | (*target).a = a; | |
321 | } | |
322 | } | |
323 | ||
324 | #[link(name = "extlib")] | |
325 | extern { | |
326 | fn register_callback(target: *mut RustObject, | |
327 | cb: extern fn(*mut RustObject, i32)) -> i32; | |
328 | fn trigger_callback(); | |
329 | } | |
330 | ||
331 | fn main() { | |
332 | // Create the object that will be referenced in the callback: | |
333 | let mut rust_object = Box::new(RustObject { a: 5 }); | |
334 | ||
335 | unsafe { | |
336 | register_callback(&mut *rust_object, callback); | |
337 | trigger_callback(); | |
338 | } | |
339 | } | |
340 | ``` | |
341 | ||
342 | C code: | |
343 | ||
344 | ```c | |
345 | typedef void (*rust_callback)(void*, int32_t); | |
346 | void* cb_target; | |
347 | rust_callback cb; | |
348 | ||
349 | int32_t register_callback(void* callback_target, rust_callback callback) { | |
350 | cb_target = callback_target; | |
351 | cb = callback; | |
352 | return 1; | |
353 | } | |
354 | ||
355 | void trigger_callback() { | |
356 | cb(cb_target, 7); // Will call callback(&rustObject, 7) in Rust. | |
357 | } | |
358 | ``` | |
359 | ||
360 | ## Asynchronous callbacks | |
361 | ||
362 | In the previously given examples the callbacks are invoked as a direct reaction | |
363 | to a function call to the external C library. | |
364 | The control over the current thread is switched from Rust to C to Rust for the | |
365 | execution of the callback, but in the end the callback is executed on the | |
366 | same thread that called the function which triggered the callback. | |
367 | ||
368 | Things get more complicated when the external library spawns its own threads | |
369 | and invokes callbacks from there. | |
370 | In these cases access to Rust data structures inside the callbacks is | |
371 | especially unsafe and proper synchronization mechanisms must be used. | |
372 | Besides classical synchronization mechanisms like mutexes, one possibility in | |
373 | Rust is to use channels (in `std::sync::mpsc`) to forward data from the C | |
374 | thread that invoked the callback into a Rust thread. | |
375 | ||
376 | If an asynchronous callback targets a specific object in the Rust address space, | |
377 | it is also absolutely necessary that no more callbacks are performed by the | |
378 | C library after the respective Rust object gets destroyed. | |
379 | This can be achieved by unregistering the callback in the object's | |
380 | destructor and designing the library in a way that guarantees that no | |
381 | callback will be performed after deregistration. | |
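
The channel approach can be sketched without a real C library. In the code below, `hypothetical_foreign_worker` is a Rust stand-in for a foreign library that invokes a callback from its own thread; all names are assumptions made for the illustration. The callback receives a user-data pointer to a boxed `Sender` and forwards each value into the channel.

```rust
use std::ffi::c_void;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Callback run on the foreign thread: forwards the value into the channel.
extern "C" fn on_event(user_data: *mut c_void, value: i32) {
    let sender = unsafe { &*(user_data as *const Sender<i32>) };
    let _ = sender.send(value);
}

/// Wrapper that lets the raw user-data pointer move to another thread.
struct UserData(*mut c_void);
unsafe impl Send for UserData {}
impl UserData {
    fn as_ptr(&self) -> *mut c_void {
        self.0
    }
}

/// Stand-in for a C library that calls back from a thread it spawned.
fn hypothetical_foreign_worker(
    cb: extern "C" fn(*mut c_void, i32),
    user_data: UserData,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for value in 0..3 {
            cb(user_data.as_ptr(), value);
        }
    })
}

pub fn collect_events() -> Vec<i32> {
    let (tx, rx) = channel::<i32>();
    // Hand ownership of the sender to the "library" as an opaque pointer.
    let raw = Box::into_raw(Box::new(tx));
    let worker = hypothetical_foreign_worker(on_event, UserData(raw as *mut c_void));
    // Joining plays the role of deregistration: no callback runs afterwards.
    worker.join().unwrap();
    // Only now is it sound to reclaim and drop the sender.
    drop(unsafe { Box::from_raw(raw) });
    rx.iter().collect()
}
```

The ordering in `collect_events` mirrors the rule stated above: the object behind the pointer is destroyed only after the library can no longer invoke the callback.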
382 | ||
383 | ## Linking | |
384 | ||
385 | The `link` attribute on `extern` blocks provides the basic building block for | |
386 | instructing rustc how it will link to native libraries. There are two accepted | |
387 | forms of the link attribute today: | |
388 | ||
389 | * `#[link(name = "foo")]` | |
390 | * `#[link(name = "foo", kind = "bar")]` | |
391 | ||
392 | In both of these cases, `foo` is the name of the native library that we're | |
393 | linking to, and in the second case `bar` is the type of native library that the | |
394 | compiler is linking to. There are currently three known types of native | |
395 | libraries: | |
396 | ||
397 | * Dynamic - `#[link(name = "readline")]` | |
398 | * Static - `#[link(name = "my_build_dependency", kind = "static")]` | |
399 | * Frameworks - `#[link(name = "CoreFoundation", kind = "framework")]` | |
400 | ||
401 | Note that frameworks are only available on macOS targets. | |
402 | ||
403 | The different `kind` values are meant to differentiate how the native library | |
404 | participates in linkage. From a linkage perspective, the Rust compiler creates | |
405 | two flavors of artifacts: partial (rlib/staticlib) and final (dylib/binary). | |
406 | Native dynamic library and framework dependencies are propagated to the final | |
407 | artifact boundary, while static library dependencies are not propagated at | |
408 | all, because the static libraries are integrated directly into the subsequent | |
409 | artifact. | |
410 | ||
411 | A few examples of how this model can be used are: | |
412 | ||
413 | * A native build dependency. Sometimes some C/C++ glue is needed when writing | |
414 | some Rust code, but distribution of the C/C++ code in a library format is | |
415 | a burden. In this case, the code will be archived into `libfoo.a` and then the | |
416 | Rust crate would declare a dependency via `#[link(name = "foo", kind = | |
417 | "static")]`. | |
418 | ||
419 | Regardless of the flavor of output for the crate, the native static library | |
420 | will be included in the output, meaning that distribution of the native static | |
421 | library is not necessary. | |
422 | ||
423 | * A normal dynamic dependency. Common system libraries (like `readline`) are | |
424 | available on a large number of systems, and often a static copy of these | |
425 | libraries cannot be found. When this dependency is included in a Rust crate, | |
426 | partial targets (like rlibs) will not link to the library, but when the rlib | |
427 | is included in a final target (like a binary), the native library will be | |
428 | linked in. | |
429 | ||
430 | On macOS, frameworks behave with the same semantics as a dynamic library. | |
431 | ||
432 | ## Unsafe blocks | |
433 | ||
434 | Some operations, like dereferencing raw pointers or calling functions that have been marked | |
435 | unsafe, are only allowed inside unsafe blocks. Unsafe blocks isolate unsafety and are a promise to | |
436 | the compiler that the unsafety does not leak out of the block. | |
437 | ||
438 | Unsafe functions, on the other hand, advertise it to the world. An unsafe function is written like | |
439 | this: | |
440 | ||
441 | ```rust | |
442 | unsafe fn kaboom(ptr: *const i32) -> i32 { *ptr } | |
443 | ``` | |
444 | ||
445 | This function can only be called from an `unsafe` block or another `unsafe` function. | |
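
A short sketch of how the two fit together: the hypothetical safe function `read_or_zero` calls `kaboom` inside an `unsafe` block and takes responsibility for the pointer being valid, so its own callers need no `unsafe`.

```rust
unsafe fn kaboom(ptr: *const i32) -> i32 {
    // The caller promises that `ptr` is valid for reads.
    unsafe { *ptr }
}

/// Safe wrapper: a reference is always valid, so the promise can be kept.
pub fn read_or_zero(value: Option<&i32>) -> i32 {
    match value {
        // `&i32` coerces to `*const i32` at the call site.
        Some(reference) => unsafe { kaboom(reference) },
        None => 0,
    }
}
```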
446 | ||
447 | ## Accessing foreign globals | |
448 | ||
449 | Foreign APIs often export a global variable which could do something like track | |
450 | global state. In order to access these variables, you declare them in `extern` | |
451 | blocks with the `static` keyword: | |
452 | ||
453 | <!-- ignore: requires libc crate --> | |
454 | ```rust,ignore | |
455 | #[link(name = "readline")] | |
456 | extern { | |
457 | static rl_readline_version: libc::c_int; | |
458 | } | |
459 | ||
460 | fn main() { | |
461 | println!("You have readline version {} installed.", | |
462 | unsafe { rl_readline_version as i32 }); | |
463 | } | |
464 | ``` | |
465 | ||
466 | Alternatively, you may need to alter global state provided by a foreign | |
467 | interface. To do this, statics can be declared with `mut` so we can mutate | |
468 | them. | |
469 | ||
470 | <!-- ignore: requires libc crate --> | |
471 | ```rust,ignore | |
472 | use std::ffi::CString; | |
473 | use std::ptr; | |
474 | ||
475 | #[link(name = "readline")] | |
476 | extern { | |
477 | static mut rl_prompt: *const libc::c_char; | |
478 | } | |
479 | ||
480 | fn main() { | |
481 | let prompt = CString::new("[my-awesome-shell] $").unwrap(); | |
482 | unsafe { | |
483 | rl_prompt = prompt.as_ptr(); | |
484 | ||
485 | println!("{:?}", rl_prompt); | |
486 | ||
487 | rl_prompt = ptr::null(); | |
488 | } | |
489 | } | |
490 | ``` | |
491 | ||
492 | Note that all interaction with a `static mut` is unsafe, both reading and | |
493 | writing. Dealing with global mutable state requires a great deal of care. | |
494 | ||
495 | ## Foreign calling conventions | |
496 | ||
497 | Most foreign code exposes a C ABI, and Rust uses the platform's C calling convention by default when | |
498 | calling foreign functions. Some foreign functions, most notably the Windows API, use other calling | |
499 | conventions. Rust provides a way to tell the compiler which convention to use: | |
500 | ||
501 | <!-- ignore: requires libc crate --> | |
502 | ```rust,ignore | |
503 | #[cfg(all(target_os = "windows", target_arch = "x86"))] | |
504 | #[link(name = "kernel32")] | |
505 | #[allow(non_snake_case)] | |
506 | extern "stdcall" { | |
507 | fn SetEnvironmentVariableA(n: *const u8, v: *const u8) -> libc::c_int; | |
508 | } | |
509 | # fn main() { } | |
510 | ``` | |
511 | ||
512 | This applies to the entire `extern` block. The list of supported ABI constraints | |
513 | is: | |
514 | ||
515 | * `stdcall` | |
516 | * `aapcs` | |
517 | * `cdecl` | |
518 | * `fastcall` | |
519 | * `vectorcall` | |
520 | This is currently hidden behind the `abi_vectorcall` gate and is subject to change. | |
521 | * `Rust` | |
522 | * `rust-intrinsic` | |
523 | * `system` | |
524 | * `C` | |
525 | * `win64` | |
526 | * `sysv64` | |
527 | ||
528 | Most of the ABIs in this list are self-explanatory, but the `system` ABI may | |
529 | seem a little odd. This constraint selects whatever the appropriate ABI is for | |
530 | interoperating with the target's libraries. For example, on 32-bit Windows with an x86 | |
531 | architecture, this means that the ABI used would be `stdcall`. On x86_64, | |
532 | however, Windows uses the `C` calling convention, so `C` would be used. This | |
533 | means that in our previous example, we could have used `extern "system" { ... }` | |
534 | to define a block for all Windows systems, not only x86 ones. | |
535 | ||
536 | ## Interoperability with foreign code | |
537 | ||
538 | Rust guarantees that the layout of a `struct` is compatible with the platform's | |
539 | representation in C only if the `#[repr(C)]` attribute is applied to it. | |
540 | `#[repr(C, packed)]` can be used to lay out struct members without padding. | |
541 | `#[repr(C)]` can also be applied to an enum. | |
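
The effect of these attributes can be observed with `std::mem::size_of`. The sketch below uses two hypothetical structs with the same fields. The packed size is always 5 bytes, while the padded size shown in the comment assumes a target where `u32` is 4-byte aligned, which holds on common platforms.

```rust
use std::mem::size_of;

/// C-compatible layout: padding is inserted after `tag` so `value` is aligned.
#[repr(C)]
pub struct Padded {
    pub tag: u8,
    pub value: u32,
}

/// C-compatible layout without padding between members.
#[repr(C, packed)]
pub struct Packed {
    pub tag: u8,
    pub value: u32,
}

/// Returns `(size of Padded, size of Packed)`.
pub fn layout_sizes() -> (usize, usize) {
    // (8, 5) where `u32` has 4-byte alignment.
    (size_of::<Padded>(), size_of::<Packed>())
}
```

Fields of a packed struct may be misaligned, so taking references to them is restricted; copy the field out by value instead.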
542 | ||
543 | Rust's owned boxes (`Box<T>`) use non-nullable pointers as handles which point | |
544 | to the contained object. However, they should not be manually created because | |
545 | they are managed by internal allocators. References can safely be assumed to be | |
546 | non-nullable pointers directly to the type. However, breaking the borrow | |
547 | checking or mutability rules is not guaranteed to be safe, so prefer using raw | |
548 | pointers (`*`) if that's needed because the compiler can't make as many | |
549 | assumptions about them. | |
550 | ||
551 | Vectors and strings share the same basic memory layout, and utilities are | |
552 | available in the `vec` and `str` modules for working with C APIs. However, | |
553 | strings are not terminated with `\0`. If you need a NUL-terminated string for | |
554 | interoperability with C, you should use the `CString` type in the `std::ffi` | |
555 | module. | |
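
A minimal sketch of the conversion, using a hypothetical helper named `to_c_string`: `CString::new` appends the terminating NUL and refuses input that already contains one, since C would treat an interior NUL as the end of the string.

```rust
use std::ffi::CString;

/// Converts a Rust string into an owned, NUL-terminated C string.
/// Returns `None` if `s` contains an interior NUL byte.
pub fn to_c_string(s: &str) -> Option<CString> {
    CString::new(s).ok()
}
```

The pointer returned by `CString::as_ptr` is only valid while the `CString` is alive, so keep the value in a variable for the duration of the foreign call.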
556 | ||
557 | The [`libc` crate on crates.io][libc] includes type aliases and function | |
558 | definitions for the C standard library in the `libc` module, and Rust links | |
559 | against `libc` and `libm` by default. | |
560 | ||
561 | ## Variadic functions | |
562 | ||
563 | In C, functions can be 'variadic', meaning they accept a variable number of arguments. This can | |
564 | be achieved in Rust by specifying `...` within the argument list of a foreign function declaration: | |
565 | ||
566 | ```no_run | |
567 | extern { | |
568 | fn foo(x: i32, ...); | |
569 | } | |
570 | ||
571 | fn main() { | |
572 | unsafe { | |
573 | foo(10, 20, 30, 40, 50); | |
574 | } | |
575 | } | |
576 | ``` | |
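
Since `foo` above is not a real function, here is a sketch against an actual variadic function, `snprintf` from the C standard library. It assumes a platform whose C library exports `snprintf` as a linkable symbol, as is usual on Linux and macOS. The helper `format_sum` is hypothetical.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};

// `unsafe extern` is accepted by recent compilers; older ones use plain `extern "C"`.
unsafe extern "C" {
    // C prototype: int snprintf(char *str, size_t size, const char *format, ...);
    fn snprintf(buf: *mut c_char, size: usize, format: *const c_char, ...) -> c_int;
}

/// Formats `a + b = sum` using C's `snprintf`.
pub fn format_sum(a: c_int, b: c_int) -> String {
    let mut buf = [0 as c_char; 64];
    let format = CString::new("%d + %d = %d").unwrap();
    unsafe {
        // Each `%d` must be matched by a `c_int` argument; the compiler
        // cannot check the variadic arguments against the format string.
        snprintf(
            buf.as_mut_ptr(),
            buf.len(),
            format.as_ptr(),
            a,
            b,
            a.wrapping_add(b),
        );
        // `snprintf` always NUL-terminates when the size is non-zero.
        CStr::from_ptr(buf.as_ptr()).to_string_lossy().into_owned()
    }
}
```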
577 | ||
578 | Normal Rust functions can *not* be variadic: | |
579 | ||
580 | ```rust,compile_fail | |
581 | // This will not compile | |
582 | ||
583 | fn foo(x: i32, ...) {} | |
584 | ``` | |
585 | ||
586 | ## The "nullable pointer optimization" | |
587 | ||
588 | Certain Rust types are defined to never be `null`. This includes references (`&T`, | |
589 | `&mut T`), boxes (`Box<T>`), and function pointers (`extern "abi" fn()`). When | |
590 | interfacing with C, pointers that might be `null` are often used, which would seem to | |
591 | require some messy `transmute`s and/or unsafe code to handle conversions to/from Rust types. | |
592 | However, the language provides a workaround. | |
593 | ||
594 | As a special case, an `enum` is eligible for the "nullable pointer optimization" if it contains | |
595 | exactly two variants, one of which contains no data and the other contains a field of one of the | |
596 | non-nullable types listed above. This means no extra space is required for a discriminant; rather, | |
597 | the empty variant is represented by putting a `null` value into the non-nullable field. This is | |
598 | called an "optimization", but unlike other optimizations it is guaranteed to apply to eligible | |
599 | types. | |
600 | ||
601 | The most common type that takes advantage of the nullable pointer optimization is `Option<T>`, | |
602 | where `None` corresponds to `null`. So `Option<extern "C" fn(c_int) -> c_int>` is a correct way | |
603 | to represent a nullable function pointer using the C ABI (corresponding to the C type | |
604 | `int (*)(int)`). | |
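
The guarantee can be checked directly. In this sketch, the type alias `Callback` and the functions `double`, `call_or_default`, and `option_is_pointer_sized` are hypothetical names used only for the illustration.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

type Callback = extern "C" fn(c_int) -> c_int;

extern "C" fn double(x: c_int) -> c_int {
    x.wrapping_mul(2)
}

/// Calls the function pointer if one was supplied, otherwise returns `x`.
pub fn call_or_default(cb: Option<Callback>, x: c_int) -> c_int {
    match cb {
        Some(f) => f(x),
        None => x, // corresponds to a null function pointer on the C side
    }
}

/// `Option` adds no space around non-nullable pointer types.
pub fn option_is_pointer_sized() -> bool {
    size_of::<Option<Callback>>() == size_of::<Callback>()
        && size_of::<Option<&u8>>() == size_of::<*const u8>()
}
```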
605 | ||
606 | Here is a contrived example. Let's say some C library has a facility for registering a | |
607 | callback, which gets called in certain situations. The callback is passed a function pointer | |
608 | and an integer and it is supposed to run the function with the integer as a parameter. So | |
609 | we have function pointers flying across the FFI boundary in both directions. | |
610 | ||
611 | <!-- ignore: requires libc crate --> | |
612 | ```rust,ignore | |
613 | use libc::c_int; | |
614 | ||
615 | # #[cfg(hidden)] | |
616 | extern "C" { | |
617 | /// Registers the callback. | |
618 | fn register(cb: Option<extern "C" fn(Option<extern "C" fn(c_int) -> c_int>, c_int) -> c_int>); | |
619 | } | |
620 | # unsafe fn register(_: Option<extern "C" fn(Option<extern "C" fn(c_int) -> c_int>, | |
621 | # c_int) -> c_int>) | |
622 | # {} | |
623 | ||
624 | /// This fairly useless function receives a function pointer and an integer | |
625 | /// from C, and returns the result of calling the function with the integer. | |
626 | /// In case no function is provided, it squares the integer by default. | |
627 | extern "C" fn apply(process: Option<extern "C" fn(c_int) -> c_int>, int: c_int) -> c_int { | |
628 | match process { | |
629 | Some(f) => f(int), | |
630 | None => int * int | |
631 | } | |
632 | } | |
633 | ||
634 | fn main() { | |
635 | unsafe { | |
636 | register(Some(apply)); | |
637 | } | |
638 | } | |
639 | ``` | |
640 | ||
641 | And the code on the C side looks like this: | |
642 | ||
643 | ```c | |
644 | void register(int (*f)(int (*)(int), int)) { | |
645 | ... | |
646 | } | |
647 | ``` | |
648 | ||
649 | No `transmute` required! | |
650 | ||
651 | ## Calling Rust code from C | |
652 | ||
653 | You may wish to compile Rust code in such a way that it can be called from C. This is | |
654 | fairly easy, but requires a few things: | |
655 | ||
656 | ```rust | |
657 | #[no_mangle] | |
658 | pub extern "C" fn hello_rust() -> *const u8 { | |
659 | "Hello, world!\0".as_ptr() | |
660 | } | |
661 | # fn main() {} | |
662 | ``` | |
663 | ||
664 | The `extern "C"` makes this function adhere to the C calling convention, as | |
665 | discussed above in "[Foreign Calling | |
666 | Conventions](ffi.html#foreign-calling-conventions)". The `no_mangle` | |
667 | attribute turns off Rust's name mangling, so that it is easier to link to. | |
668 | ||
669 | ## FFI and panics | |
670 | ||
671 | It’s important to be mindful of `panic!`s when working with FFI. A `panic!` | |
672 | across an FFI boundary is undefined behavior. If you’re writing code that may | |
673 | panic, you should run it in a closure with [`catch_unwind`]: | |
674 | ||
675 | ```rust | |
676 | use std::panic::catch_unwind; | |
677 | ||
678 | #[no_mangle] | |
679 | pub extern fn oh_no() -> i32 { | |
680 | let result = catch_unwind(|| { | |
681 | panic!("Oops!"); | |
682 | }); | |
683 | match result { | |
684 | Ok(_) => 0, | |
685 | Err(_) => 1, | |
686 | } | |
687 | } | |
688 | ||
689 | fn main() {} | |
690 | ``` | |
691 | ||
692 | Please note that [`catch_unwind`] will only catch unwinding panics, not | |
693 | those that abort the process. See the documentation of [`catch_unwind`] | |
694 | for more information. | |
695 | ||
696 | [`catch_unwind`]: ../std/panic/fn.catch_unwind.html | |
697 | ||
698 | ## Representing opaque structs | |
699 | ||
700 | Sometimes, a C library wants to provide a pointer to something, but not let you know the internal details of the thing it wants. | |
701 | A stable and simple way is to use a `void *` argument: | |
702 | ||
703 | ```c | |
704 | void foo(void *arg); | |
705 | void bar(void *arg); | |
706 | ``` | |
707 | ||
708 | We can represent this in Rust with the `c_void` type: | |
709 | ||
710 | <!-- ignore: requires libc crate --> | |
711 | ```rust,ignore | |
712 | extern "C" { | |
713 | pub fn foo(arg: *mut libc::c_void); | |
714 | pub fn bar(arg: *mut libc::c_void); | |
715 | } | |
716 | # fn main() {} | |
717 | ``` | |
718 | ||
719 | This is a perfectly valid way of handling the situation. However, `void *` lets a caller pass | |
720 | any pointer to either function, and we can do a bit better. To solve this, some C libraries will instead create a `struct`, where | |
721 | the details and memory layout of the struct are private. This gives some amount | |
722 | of type safety. These structures are called ‘opaque’. Here’s an example, in C: | |
723 | ||
724 | ```c | |
725 | struct Foo; /* Foo is a structure, but its contents are not part of the public interface */ | |
726 | struct Bar; | |
727 | void foo(struct Foo *arg); | |
728 | void bar(struct Bar *arg); | |
729 | ``` | |
730 | ||
731 | To do this in Rust, let’s create our own opaque types: | |
732 | ||
733 | ```rust | |
734 | #[repr(C)] | |
735 | pub struct Foo { | |
736 | _data: [u8; 0], | |
737 | _marker: | |
738 | core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>, | |
739 | } | |
740 | #[repr(C)] | |
741 | pub struct Bar { | |
742 | _data: [u8; 0], | |
743 | _marker: | |
744 | core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>, | |
745 | } | |
746 | ||
747 | extern "C" { | |
748 | pub fn foo(arg: *mut Foo); | |
749 | pub fn bar(arg: *mut Bar); | |
750 | } | |
751 | # fn main() {} | |
752 | ``` | |
753 | ||
754 | By including at least one private field and no constructor, | |
755 | we create an opaque type that we can't instantiate outside of this module. | |
756 | (A struct with no field could be instantiated by anyone.) | |
757 | We also want to use this type in FFI, so we have to add `#[repr(C)]`. | |
758 | The marker ensures the compiler does not mark the struct as `Send`, `Sync`, or `Unpin`. | |
759 | (`*mut u8` is not `Send` or `Sync`, and `PhantomPinned` is not `Unpin`.) | |
760 | ||
761 | Because our `Foo` and `Bar` types are | |
762 | different, we’ll get type safety between the two of them, so we cannot | |
763 | accidentally pass a pointer to `Foo` to `bar()`. | |
764 | ||
765 | Notice that it is a really bad idea to use an empty enum as an FFI type. | |
766 | The compiler relies on empty enums being uninhabited, so handling values of type | |
767 | `&Empty` is a huge footgun and can lead to buggy program behavior (by triggering | |
768 | undefined behavior). | |
769 | ||
770 | > **NOTE:** The simplest way would use "extern types". | |
771 | But it's currently (as of June 2021) unstable and has some unresolved questions, see the [RFC page][extern-type-rfc] and the [tracking issue][extern-type-issue] for more details. | |
772 | ||
773 | [extern-type-issue]: https://github.com/rust-lang/rust/issues/43467 | |
774 | [extern-type-rfc]: https://rust-lang.github.io/rfcs/1861-extern-types.html |