//
// Copyright (c) 2009-2011 Artyom Beilis (Tonkikh)
//
// Distributed under the Boost Software License, Version 1.0. (See
// accompanying file LICENSE_1_0.txt or copy at
// http://www.boost.org/LICENSE_1_0.txt)
//

// vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4 filetype=cpp.doxygen
/*!
\page messages_formatting Messages Formatting (Translation)

- \ref messages_formatting_into
- \ref msg_loading_dictionaries
- \ref message_translation
- \ref indirect_message_translation
- \ref plural_forms
- \ref multiple_gettext_domain
- \ref direct_message_translation
- \ref extracting_messages_from_code
- \ref custom_file_system_support
- \ref msg_non_ascii_keys
- \ref msg_qna

\section messages_formatting_into Introduction

Message formatting is probably the most important part of
localization: making your application speak the user's language.

Boost.Locale uses the <a href="http://www.gnu.org/software/gettext/">GNU Gettext</a> localization model.
We recommend that you read the general <a href="http://www.gnu.org/software/gettext/manual/gettext.html">documentation</a>
of GNU Gettext, as it is outside the scope of this document.

The model is as follows:

- First, our application \c foo is prepared for localization by calling the \ref boost::locale::translate() "translate" function
  for each message used in the user interface.
  \n
  For example:
  \code
    cout << "Hello World" << endl;
  \endcode
  is changed to
  \n
  \code
    cout << translate("Hello World") << endl;
  \endcode
- Then all messages are extracted from the source code and a special \c foo.po file is generated that contains all of the
  original English strings.
  \n
  \verbatim
    ...
    msgid "Hello World"
    msgstr ""
    ...
  \endverbatim
- The \c foo.po file is translated for the supported locales. For example, \c de.po, \c ar.po, \c en_CA.po, and \c he.po.
  \n
  \verbatim
    ...
    msgid "Hello World"
    msgstr "שלום עולם"
  \endverbatim
  The translations are then compiled to the binary \c mo format and stored in the following file structure:
  \n
  \verbatim
    de
    de/LC_MESSAGES
    de/LC_MESSAGES/foo.mo
    en_CA/
    en_CA/LC_MESSAGES
    en_CA/LC_MESSAGES/foo.mo
    ...
  \endverbatim
  \n
  When the application starts, it loads the required dictionaries. Then, when the \c translate function is called and the message is written
  to an output stream, a dictionary lookup is performed and the localized message is written out instead.
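
As a rough illustration of the layout above, the location of a catalog can be composed from the root path, the locale name and the domain. The helper \c catalog_path below is hypothetical and not part of Boost.Locale; it is a minimal sketch that ignores any fallback to less specific locale names that a real loader may perform.

```cpp
#include <string>

// Hypothetical helper (not a Boost.Locale function): composes the
// conventional gettext catalog location
//   <root>/<locale>/LC_MESSAGES/<domain>.mo
std::string catalog_path(std::string const &root,
                         std::string const &locale_name,
                         std::string const &domain)
{
    return root + "/" + locale_name + "/LC_MESSAGES/" + domain + ".mo";
}
```

With the root <tt>/usr/share/locale</tt>, the locale \c ar and the domain \c foo, this yields <tt>/usr/share/locale/ar/LC_MESSAGES/foo.mo</tt>.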

\section msg_loading_dictionaries Loading dictionaries

All the dictionaries are loaded by the \ref boost::locale::generator "generator" class.
Using localized strings in the application requires specifying
the following parameters:

-# The search path of the dictionaries
-# The application domain (or name)

This is done by calling the following member functions of the \ref boost::locale::generator "generator" class:

- \ref boost::locale::generator::add_messages_path() "add_messages_path" - add the root path to the dictionaries.
  \n
  For example: if the dictionary is located at \c /usr/share/locale/ar/LC_MESSAGES/foo.mo, then the path should be \c /usr/share/locale.
  \n
- \ref boost::locale::generator::add_messages_domain() "add_messages_domain" - add the domain (name) of the application. In the above case it would be "foo".

\note At least one domain and one path should be specified in order to load dictionaries.

This is an example of our first fully localized program:

\code
#include <boost/locale.hpp>
#include <iostream>

using namespace std;
using namespace boost::locale;

int main()
{
    generator gen;

    // Specify location of dictionaries
    gen.add_messages_path(".");
    gen.add_messages_domain("hello");

    // Generate locales and imbue them to iostream
    locale::global(gen(""));
    cout.imbue(locale());

    // Display a message using current system locale
    cout << translate("Hello World") << endl;
}
\endcode


\section message_translation Message Translation

There are two ways to translate messages:

- Using the \ref boost_locale_translate_family "boost::locale::translate()" family of functions:
  \n
  These functions create a special proxy object, \ref boost::locale::basic_message "basic_message",
  that can be converted to a string according to a given locale or written to a \c std::ostream,
  formatting the message in the \c std::ostream's locale.
  \n
  This is very convenient for working with \c std::ostream objects and for postponing message
  translation.
- Using the \ref boost_locale_gettext_family "boost::locale::gettext()" family of functions:
  \n
  These functions are used for direct message translation: they receive as a parameter
  an original message or a key and convert it to a \c std::basic_string in the given locale.
  \n
  These functions have names similar to those used in the GNU Gettext library.

\subsection indirect_message_translation Indirect Message Translation

The basic way to translate a message is the \ref boost_locale_translate_family "boost::locale::translate()" family of functions.

These functions use a character type \c CharType as a template parameter and receive either <tt>CharType const *</tt> or <tt>std::basic_string<CharType></tt> as input.

These functions receive an original message and return a special proxy
object - \ref boost::locale::basic_message "basic_message<CharType>".
This object holds all the required information for the message formatting.

When this object is written to an output stream, it performs a dictionary lookup of the message according to the locale
imbued in that stream.

If the message is found in the dictionary, its translation is written to the output stream;
otherwise the original string is written to the stream.

For example:

\code
// Translate a simple message "Hello World!"
std::cout << boost::locale::translate("Hello World!") << std::endl;
\endcode
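
The fallback rule described above can be modelled with an ordinary map. This is only a sketch of the observable behaviour, assuming a toy in-memory dictionary; the names \c catalog and \c lookup are made up for illustration and say nothing about how Boost.Locale stores its catalogs.

```cpp
#include <map>
#include <string>

// Toy dictionary: original string -> translated string.
// An assumption for illustration, not the Boost.Locale data structure.
typedef std::map<std::string, std::string> catalog;

// Returns the translation when the key is present,
// otherwise returns the original string unchanged.
std::string lookup(catalog const &dict, std::string const &original)
{
    catalog::const_iterator it = dict.find(original);
    return it == dict.end() ? original : it->second;
}
```

An untranslated message therefore still appears in the output, only in its original language.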

Using a proxy object allows the program to postpone translation of the message until the translation is actually needed,
and even to translate the same message for several target locales.

\code
// Several output streams that we write a message to:
// English, Japanese, Hebrew etc.
// Each of them has a std::locale object installed that represents
// its specific locale
std::ofstream en,ja,he,de,ar;

// Send a single message to multiple streams
void send_to_all(message const &msg)
{
    // In each of the cases below
    // the message is translated to a different
    // language
    en << msg;
    ja << msg;
    he << msg;
    de << msg;
    ar << msg;
}

int main()
{
    ...
    send_to_all(translate("Hello World"));
}
\endcode

\note

- \ref boost::locale::basic_message "basic_message" can be implicitly converted
  to an appropriate std::basic_string using
  the global locale:
  \n
  \code
    std::wstring msg = translate(L"Do you want to open the file?");
  \endcode
- \ref boost::locale::basic_message "basic_message" can be explicitly converted
  to a string using the \ref boost::locale::basic_message::str() "str()" member function for a specific locale.
  \n
  \code
    std::locale ru_RU = ... ;
    std::string msg = translate("Do you want to open the file?").str(ru_RU);
  \endcode


\subsection plural_forms Plural Forms

GNU Gettext catalogs have simple, robust and yet powerful plural forms support. We recommend reading the
original GNU documentation <a href="http://www.gnu.org/software/gettext/manual/gettext.html#Plural-forms">here</a>.

Let's try to solve a simple problem, displaying a message to the user:

\code
if(files == 1)
    cout << translate("You have 1 file in the directory") << endl;
else
    cout << format(translate("You have {1} files in the directory")) % files << endl;
\endcode

This very simple task becomes quite complicated when we deal with languages other than English. Many languages have more
than two plural forms. For example, in Hebrew there are special forms for single, double, plural, and plural above 10.
They can't be distinguished by the simple rule "is n 1 or not".

The correct solution is to give the translator the ability to choose the plural form. Thus the translate
function can receive two additional parameters, the English plural form and a number: <tt>translate(single,plural,count)</tt>

For example:

\code
cout << format(translate( "You have {1} file in the directory",
                          "You have {1} files in the directory",
                          files)) % files << endl;
\endcode

A special entry in the dictionary specifies the rule to choose the correct plural form in the target language.
For example, the Slavic language family has 3 plural forms, which can be chosen using the following expression:

\code
plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
\endcode

Such an expression is stored in the message catalog itself and is evaluated during translation to supply the correct form.

So the <tt>translate</tt> call above would display 3 different forms in a Russian locale for values of 1, 3 and 5:

\verbatim
У вас есть 1 файл в каталоге
У вас есть 3 файла в каталоге
У вас есть 5 файлов в каталоге
\endverbatim

For Japanese, which does not have plural forms at all, it would display the same message
for any numeric value.

For more detailed information please refer to GNU Gettext: <a href="http://www.gnu.org/software/gettext/manual/gettext.html#Plural-forms">11.2.6 Additional functions for plural forms</a>.
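
To make the rule concrete, the Slavic plural expression quoted above can be written as a plain C++ function. The function name \c slavic_plural_index is hypothetical; Boost.Locale evaluates the expression stored in the catalog and does not contain such a function.

```cpp
// Index of the plural form selected by the Slavic rule:
//   0 - numbers ending in 1, except those ending in 11
//   1 - numbers ending in 2, 3 or 4, except those ending in 12, 13 or 14
//   2 - everything else
int slavic_plural_index(unsigned long n)
{
    if (n % 10 == 1 && n % 100 != 11)
        return 0;
    if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
        return 1;
    return 2;
}
```

For 1, 3 and 5 it returns 0, 1 and 2, matching the three Russian forms shown above; 11 falls into the last group, while 21 uses the first form again.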


\subsection adding_context_information Adding Context Information

In many cases it is not sufficient to provide only the original English string to get the correct translation.
You sometimes need to provide some context information. In German, for example, a button labeled "open" is translated to
"öffnen" in the context of "opening a file", or to "aufbauen" in the context of opening an internet connection.

In these cases you must add some context information to the original string by passing a context parameter.

\code
button->setLabel(translate("File","open"));
\endcode

The context information is provided as the first parameter to the \ref boost::locale::translate() "translate"
function in both singular and plural forms. The translator would see this context information and would be able to translate the
"open" string correctly.

For example, this is how the \c po file would look:

\code
msgctxt "File"
msgid "open"
msgstr "öffnen"

msgctxt "Internet Connection"
msgid "open"
msgstr "aufbauen"
\endcode

\note Context information requires more recent versions of the gettext tools (>=0.15) for extracting strings and
formatting message catalogs.
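
One way to picture the effect of the context parameter is a dictionary keyed by the pair of context and original string, so that the same original string can map to different translations. The sketch below is a simplified model under that assumption; \c context_catalog and \c lookup_in_context are invented names and not part of Boost.Locale.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy dictionary keyed by (context, original string).
// An illustrative model only, not the Boost.Locale implementation.
typedef std::map<std::pair<std::string, std::string>, std::string> context_catalog;

// Returns the translation registered for this context,
// or the original string when no such entry exists.
std::string lookup_in_context(context_catalog const &dict,
                              std::string const &context,
                              std::string const &original)
{
    context_catalog::const_iterator it =
        dict.find(std::make_pair(context, original));
    return it == dict.end() ? original : it->second;
}
```

With the two entries from the \c po file above, looking up "open" in the "File" context and in the "Internet Connection" context gives two different German words.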


\subsection multiple_gettext_domain Working with multiple message domains

In some cases it is useful to work with multiple message domains.

For example, if an application consists of several independent modules, it may
have several domains - a separate domain for each module.

For example, developing a FooBar office suite we might have:

- a FooBar Word Processor, using the "foobarwriter" domain
- a FooBar Spreadsheet, using the "foobarspreadsheet" domain
- a FooBar Spell Checker, using the "foobarspell" domain
- a FooBar File handler, using the "foobarodt" domain

There are three ways to use non-default domains:

- When working with \c iostream, you can use the parameterized manipulator
  \ref boost::locale::as::domain "as::domain(std::string const &)", which allows switching domains in a stream:
  \n
  \code
    cout << as::domain("foo") << translate("Hello") << as::domain("bar") << translate("Hello");
    // First translation is taken from dictionary foo and the other from dictionary bar
  \endcode
- You can specify the domain explicitly when converting a \c message object to a string:
  \code
    std::wstring foo_msg = translate(L"Hello World").str("foo");
    std::wstring bar_msg = translate(L"Hello World").str("bar");
  \endcode
- You can specify the domain directly using a \ref direct_message_translation "convenience" interface:
  \code
    MessageBox(dgettext("gui","Error Occurred"));
  \endcode

\subsection direct_message_translation Direct translation (Convenience Interface)

Many applications do not write messages directly to an output stream or use only one locale in the process, so
calling <tt>translate("Hello World").str()</tt> for a single message would be annoying. Thus Boost.Locale provides
GNU Gettext-like localization functions for direct translation of the messages. However, unlike the GNU Gettext functions,
the Boost.Locale translation functions provide an additional optional parameter (locale), and support wide, u16 and u32 strings.

The prototypes of the GNU Gettext-like functions can be found \ref boost_locale_gettext_family "in this section".


All of these functions can have different prefixes for different forms:

- \c d - translation in a specific domain
- \c n - plural form translation
- \c p - translation in a specific context

\code
MessageBoxW(0,pgettext(L"File Dialog",L"Open?").c_str(),gettext(L"Question").c_str(),MB_YESNO);
\endcode


\section extracting_messages_from_code Extracting messages from the source code

There are many tools to extract messages from the source code into the \c .po file format. The most
popular and "native" tool is \c xgettext, which is installed by default on most Unix systems and freely downloadable
for Windows (see \ref gettext_for_windows).

For example, we have a source file called \c dir.cpp that prints:

\code
cout << format(translate("Listing of catalog {1}:")) % file_name << endl;
cout << format(translate("Catalog {1} contains 1 file","Catalog {1} contains {2,num} files",files_no))
        % file_name % files_no << endl;
\endcode

Now we run:

\verbatim
xgettext --keyword=translate:1,1t --keyword=translate:1,2,3t dir.cpp
\endverbatim

And a file called \c messages.po is created that looks like this (approximately):

\code
#: dir.cpp:1
msgid "Listing of catalog {1}:"
msgstr ""

#: dir.cpp:2
msgid "Catalog {1} contains 1 file"
msgid_plural "Catalog {1} contains {2,num} files"
msgstr[0] ""
msgstr[1] ""
\endcode

This file can be given to translators to adapt it to specific languages.

We used the \c --keyword parameter of \c xgettext to make it suitable for extracting messages from
source code localized with Boost.Locale, searching for <tt>translate()</tt> function calls instead of the default <tt>gettext()</tt>
and <tt>ngettext()</tt> ones.
The first parameter <tt>--keyword=translate:1,1t</tt> provides the template for basic messages: a \c translate function that is
called with 1 argument (1t), whose first argument is taken as the key. The second one, <tt>--keyword=translate:1,2,3t</tt>, is used
for plural forms.
It tells \c xgettext to use a <tt>translate()</tt> function call with 3 parameters (3t) and take the 1st and 2nd parameters as keys. An
additional marker \c Nc can be used to mark context information.

The full set of xgettext parameters suitable for Boost.Locale is:

\code
xgettext --keyword=translate:1,1t --keyword=translate:1c,2,2t \
         --keyword=translate:1,2,3t --keyword=translate:1c,2,3,4t \
         --keyword=gettext:1 --keyword=pgettext:1c,2 \
         --keyword=ngettext:1,2 --keyword=npgettext:1c,2,3 \
         source_file_1.cpp ... source_file_N.cpp
\endcode

Of course, if you do not use the gettext-like translation functions, you
may ignore some of these parameters.

\subsection custom_file_system_support Custom Filesystem Support

When access to the actual file system is limited, as in ActiveX controls, or
when the developer wants to ship an all-in-one executable file,
it is useful to be able to load \c gettext catalogs from a custom location -
a custom file system.

Boost.Locale provides an option to install the boost::locale::message_format facet
with customized options provided in the boost::locale::gnu_gettext::messages_info structure.

This structure contains a \c boost::function based
\ref boost::locale::gnu_gettext::messages_info::callback_type "callback"
that allows the user to provide custom functionality to load message catalog files.

For example:

\code
// Configure all options for message catalog
namespace blg = boost::locale::gnu_gettext;
blg::messages_info info;
info.language = "he";
info.country = "IL";
info.encoding="UTF-8";
info.paths.push_back(""); // At least one path is required, even an empty one
info.domains.push_back(blg::messages_info::domain("my_app"));
info.callback = some_file_loader; // Provide a callback

// Create a basic locale without messages support
boost::locale::generator gen;
std::locale base_locale = gen("he_IL.UTF-8");

// Install messages catalogs for "char" support to the final locale
// we are going to use
std::locale real_locale(base_locale,blg::create_messages_facet<char>(info));
\endcode

To set up \ref boost::locale::gnu_gettext::messages_info::language "language", \ref boost::locale::gnu_gettext::messages_info::country "country" and other members, you may use the \ref boost::locale::info facet for convenience:

\code
// Configure all options for message catalog
namespace blg = boost::locale::gnu_gettext;
blg::messages_info info;

info.paths.push_back(""); // At least one path is required, even an empty one
info.domains.push_back(blg::messages_info::domain("my_app"));
info.callback = some_file_loader; // Provide a callback

// Create an object with default locale
boost::locale::generator gen;
std::locale base_locale = gen("");

// Use boost::locale::info to configure all parameters

boost::locale::info const &properties = std::use_facet<boost::locale::info>(base_locale);
info.language = properties.language();
info.country = properties.country();
info.encoding = properties.encoding();
info.variant = properties.variant();

// Install messages catalogs to the final locale
std::locale real_locale(base_locale,blg::create_messages_facet<char>(info));
\endcode
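
A loader such as \c some_file_loader might serve catalogs from memory instead of the disk. The sketch below assumes that the callback receives a file name and an encoding name and returns the file contents as a vector of \c char, with an empty vector meaning that the file was not found; please check this against the \c callback_type declaration of your Boost version. The \c embedded_catalogs registry is hypothetical, and the exact file names requested depend on the configured paths, locale and domain.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory "file system": catalog file name -> raw bytes
// of a compiled .mo file (for example, data linked into the executable).
std::map<std::string, std::vector<char> > &embedded_catalogs()
{
    static std::map<std::string, std::vector<char> > files;
    return files;
}

// Loader with the shape assumed for messages_info::callback.
// The second parameter (file name encoding) is ignored here because
// the in-memory registry is encoding agnostic.
std::vector<char> some_file_loader(std::string const &file_name,
                                   std::string const &)
{
    std::map<std::string, std::vector<char> >::const_iterator it =
        embedded_catalogs().find(file_name);
    if (it == embedded_catalogs().end())
        return std::vector<char>(); // unknown file: report "not found"
    return it->second;
}
```

The registry would be filled once at start-up, before the locale with the messages facet is created.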

\section msg_non_ascii_keys Non US-ASCII Keys

Boost.Locale assumes that you use English for original text messages, and the best
practice is to use US-ASCII characters for original keys.

However, in some cases it is useful to insert some Unicode characters in the text, such as
the copyright character "©".

As long as your narrow character string encoding is UTF-8, nothing further needs to be done.

Boost.Locale assumes that your sources are encoded in UTF-8 and that the input narrow
strings use UTF-8, which is the default for most compilers (with the notable
exception of Microsoft Visual C++).

However, if the encoding of the narrow strings in the source file is not UTF-8 but some other
encoding like windows-1252, the string would be misinterpreted.

You can specify the character set of the original strings when you specify the
domain name for the application.

\code
#include <boost/locale.hpp>
#include <iostream>

using namespace std;
using namespace boost::locale;

int main()
{
    generator gen;

    // Specify location of dictionaries
    gen.add_messages_path(".");
    // Specify the encoding of the source string
    gen.add_messages_domain("copyrighted/windows-1255");

    // Generate locales and imbue them to iostream
    locale::global(gen(""));
    cout.imbue(locale());

    // In Windows 1255 (C) symbol is encoded as 0xA9
    cout << translate("© 2001 All Rights Reserved") << endl;
}
\endcode

Thus if the program runs in a UTF-8 locale, the copyright symbol would
be automatically converted to an appropriate UTF-8 sequence if the
key is missing in the dictionary.
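
The <tt>name/encoding</tt> convention used in the example above can be pictured as a simple split at the first slash. The sketch below is illustrative: \c domain_spec and \c parse_domain are hypothetical names, and the default of UTF-8 follows the assumption stated earlier in this section.

```cpp
#include <string>

// Hypothetical illustration of the "name/encoding" convention.
struct domain_spec {
    std::string name;     // for example "copyrighted"
    std::string encoding; // for example "windows-1255"
};

// Splits "name/encoding" at the first slash; when no encoding is
// given, the keys are assumed to be UTF-8.
domain_spec parse_domain(std::string const &spec)
{
    domain_spec result;
    std::string::size_type slash = spec.find('/');
    if (slash == std::string::npos) {
        result.name = spec;
        result.encoding = "UTF-8";
    } else {
        result.name = spec.substr(0, slash);
        result.encoding = spec.substr(slash + 1);
    }
    return result;
}
```

For <tt>"copyrighted/windows-1255"</tt> this gives the domain \c copyrighted with keys in windows-1255, while a plain <tt>"hello"</tt> keeps the UTF-8 default.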


\subsection msg_qna Questions and Answers

- Do I need GNU Gettext to use Boost.Locale?
  \n
  Boost.Locale provides a run-time environment to load and use GNU Gettext message catalogs, but it does
  not provide tools for generation, translation, compilation and management of these catalogs.
  Boost.Locale only reimplements the GNU Gettext libintl.
  \n
  You would probably need:
  \n
  -# Boost.Locale itself -- for runtime.
  -# A tool for extracting strings from source code, and managing them: GNU Gettext provides good tools, but other
     implementations are available as well.
  -# A good translation program like <a href="http://userbase.kde.org/Lokalize">Lokalize</a>, <a href="http://www.poedit.net/">Poedit</a> or <a href="http://projects.gnome.org/gtranslator/">GTranslator</a>.

- Why doesn't Boost.Locale provide tools for extracting and managing message catalogs? Why should
  I use GPL-ed software? Are my programs or message catalogs affected by its license?
  \n
  -# Boost.Locale does not link to or use any of the GNU Gettext code, so you need not worry about your code as
     the runtime library is fully reimplemented.
  -# You may freely use GPL-ed software for extracting and managing catalogs, the same way as you are free to use
     a GPL-ed editor. It does not affect your message catalogs or your code.
  -# I see no reason to reimplement well debugged, working tools like \c xgettext, \c msgfmt, \c msgmerge that
     do a very fine job, especially as they are freely available for download and support almost any platform.
     All Linux distributions, BSD flavors, Mac OS X and other Unix-like operating systems provide GNU Gettext tools
     as a standard package.\n
     Windows users can get GNU Gettext utilities via the MinGW project. See \ref gettext_for_windows.


- Is there any reason to prefer the Boost.Locale implementation to the original GNU Gettext runtime library?
  In either case I would probably need some of the GNU tools.
  \n
  There are three important differences between the GNU Gettext runtime library and the Boost.Locale implementation:
  \n
  -# The GNU Gettext runtime supports only one locale per process. It is not thread-safe to use multiple locales
     and encodings in the same process. This is perfectly fine for applications that interact directly with
     a single user like most GUI applications, but is problematic for services and servers.
  -# The GNU Gettext API supports only 8-bit encodings, making it irrelevant in environments that natively use
     wide strings.
  -# The GNU Gettext runtime library is distributed under the LGPL license, which may not be convenient for some users.

*/