// Copyright (c) 2009-2011 Artyom Beilis (Tonkikh)
// Distributed under the Boost Software License, Version 1.0. (See
// accompanying file LICENSE_1_0.txt or copy at
// http://www.boost.org/LICENSE_1_0.txt)
// vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4 filetype=cpp.doxygen
\page messages_formatting Messages Formatting (Translation)

- \ref messages_formatting_into
- \ref msg_loading_dictionaries
- \ref message_translation
- \ref indirect_message_translation
- \ref multiple_gettext_domain
- \ref direct_message_translation
- \ref extracting_messages_from_code
- \ref custom_file_system_support
- \ref msg_non_ascii_keys

\section messages_formatting_into Introduction

Message formatting is probably the most important part of
localization: making your application speak the user's language.

Boost.Locale uses the <a href="http://www.gnu.org/software/gettext/">GNU Gettext</a> localization model.
We recommend that you read the general <a href="http://www.gnu.org/software/gettext/manual/gettext.html">documentation</a>
of GNU Gettext, as it is outside the scope of this document.

The model is as follows:

- First, our application \c foo is prepared for localization by calling the \ref boost::locale::translate() "translate" function
  for each message used in the user interface.
  \code
  // Before: a plain string literal
  cout << "Hello World" << endl;
  // After: the literal is marked for translation
  cout << translate("Hello World") << endl;
  \endcode
- Then all messages are extracted from the source code and a special \c foo.po file is generated that contains all of the
  original English strings.
- The \c foo.po file is translated for the supported locales, for example \c de.po, \c ar.po, \c en_CA.po, and \c he.po.
  Each translation is then compiled to the binary \c mo format and stored in a per-locale file structure such as:
  \code
  en_CA/LC_MESSAGES/foo.mo
  \endcode

When the application starts, it loads the required dictionaries. Then, when the \c translate function is called and the message is written
to an output stream, a dictionary lookup is performed and the localized message is written out instead.

\section msg_loading_dictionaries Loading dictionaries

All the dictionaries are loaded by the \ref boost::locale::generator "generator" class.
Using localized strings in the application requires specifying
the following parameters:

-# The search path of the dictionaries
-# The application domain (or name)

This is done by calling the following member functions of the \ref boost::locale::generator "generator" class:

- \ref boost::locale::generator::add_messages_path() "add_messages_path" - adds the root path to the dictionaries.

  For example, if the dictionary is located at \c /usr/share/locale/ar/LC_MESSAGES/foo.mo, then the path should be \c /usr/share/locale.

- \ref boost::locale::generator::add_messages_domain() "add_messages_domain" - adds the domain (name) of the application. In the above case it would be "foo".

\note At least one domain and one path should be specified in order to load dictionaries.

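The relationship between the root path, the locale name and the domain can be sketched with a few lines of standard C++. The helper \c catalog_path below is hypothetical and is not part of Boost.Locale; it only mirrors the layout of the example path given above and says nothing about how the library searches for fallback locales.

```cpp
#include <string>

// Hypothetical helper (not a Boost.Locale function): compose the location
// of a compiled catalog from the root path, the locale name and the domain,
// following the <root>/<locale>/LC_MESSAGES/<domain>.mo layout.
std::string catalog_path(const std::string& root,
                         const std::string& locale_name,
                         const std::string& domain)
{
    return root + "/" + locale_name + "/LC_MESSAGES/" + domain + ".mo";
}
```

With the values from the example above, <tt>catalog_path("/usr/share/locale", "ar", "foo")</tt> gives back the dictionary location, which is why only \c /usr/share/locale is passed as the messages path.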
This is an example of our first fully localized program:

\code
#include <boost/locale.hpp>
#include <iostream>

using namespace std;
using namespace boost::locale;

int main()
{
    generator gen;

    // Specify location of dictionaries
    gen.add_messages_path(".");
    gen.add_messages_domain("hello");

    // Generate locales and imbue them to iostream
    locale::global(gen(""));
    cout.imbue(locale());

    // Display a message using current system locale
    cout << translate("Hello World") << endl;
}
\endcode

\section message_translation Message Translation

There are two ways to translate messages:

- Using the \ref boost_locale_translate_family "boost::locale::translate()" family of functions:

  These functions create a special proxy object \ref boost::locale::basic_message "basic_message"
  that can be converted to a string according to a given locale or written to a \c std::ostream,
  formatting the message in the \c std::ostream's locale.

  This is very convenient for working with \c std::ostream objects and for postponing message
  translation.
- Using the \ref boost_locale_gettext_family "boost::locale::gettext()" family of functions:

  These functions are used for direct message translation: they receive as a parameter
  an original message or a key and convert it to a \c std::basic_string in the given locale.

  These functions have names similar to those used in the GNU Gettext library.

\subsection indirect_message_translation Indirect Message Translation

The basic way to translate a message is the \ref boost_locale_translate_family "boost::locale::translate()" family of functions.

These functions use a character type \c CharType as a template parameter and receive either <tt>CharType const *</tt> or <tt>std::basic_string<CharType></tt> as input.

These functions receive an original message and return a special proxy
object - \ref boost::locale::basic_message "basic_message<CharType>".
This object holds all the information required for formatting the message.

When this object is written to an output \c ostream, it performs a dictionary lookup of the message according to the locale
imbued in the stream.

If the message is found in the dictionary, the translation is written to the output stream;
otherwise the original string is written to the stream.

\code
// Translate a simple message "Hello World!"
std::cout << boost::locale::translate("Hello World!") << std::endl;
\endcode

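To make the idea of a proxy object more concrete, here is a minimal sketch of the same mechanism written with the standard library only. It is not how Boost.Locale is implemented: \c dictionary_facet, \c deferred_message and \c render are hypothetical names introduced for illustration. The proxy stores only the key, and the lookup happens when the proxy is written to a stream, using whatever locale that stream is imbued with.

```cpp
#include <locale>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical facet that carries one dictionary (original -> translation).
class dictionary_facet : public std::locale::facet {
public:
    static std::locale::id id;

    explicit dictionary_facet(std::map<std::string, std::string> entries)
        : std::locale::facet(0), entries_(std::move(entries)) {}

    // Return the translation, or the key itself when it is not found.
    std::string lookup(const std::string& key) const
    {
        std::map<std::string, std::string>::const_iterator it = entries_.find(key);
        return it == entries_.end() ? key : it->second;
    }

private:
    std::map<std::string, std::string> entries_;
};

std::locale::id dictionary_facet::id;

// Hypothetical proxy: it remembers the original message and nothing else.
struct deferred_message {
    std::string key;
};

// The lookup is postponed until the proxy reaches a stream.
std::ostream& operator<<(std::ostream& out, const deferred_message& msg)
{
    const std::locale loc = out.getloc();
    if (std::has_facet<dictionary_facet>(loc))
        return out << std::use_facet<dictionary_facet>(loc).lookup(msg.key);
    return out << msg.key; // no dictionary installed: fall back to the key
}

// Convenience helper: format the proxy for one specific locale.
std::string render(const deferred_message& msg, const std::locale& loc)
{
    std::ostringstream out;
    out.imbue(loc);
    out << msg;
    return out.str();
}
```

Because the proxy holds only the key, the same object can be written to several streams and each stream produces the text of its own locale; a stream without a dictionary simply prints the original string.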
This allows the program to postpone translation of the message until the translation is actually needed, even into different
locales:

\code
// Several output streams that we write a message to:
// English, Japanese, Hebrew etc.
// Each one of them has an installed std::locale object that represents
// its specific locale
std::ofstream en,ja,he,de,ar;

// Send a single message to multiple streams
void send_to_all(message const &msg)
{
    // In each of the cases below
    // the message is translated to a different language
    en << msg; ja << msg; he << msg;
    de << msg; ar << msg;
}

// Elsewhere in the program:
send_to_all(translate("Hello World"));
\endcode

- \ref boost::locale::basic_message "basic_message" can be implicitly converted
  to an appropriate \c std::basic_string using the global locale:
  \code
  std::wstring msg = translate(L"Do you want to open the file?");
  \endcode
- \ref boost::locale::basic_message "basic_message" can be explicitly converted
  to a string using the \ref boost::locale::basic_message::str() "str()" member function for a specific locale.
  \code
  std::locale ru_RU = ... ;
  std::string msg = translate("Do you want to open the file?").str(ru_RU);
  \endcode

\subsection plural_forms Plural Forms

GNU Gettext catalogs have simple, robust and yet powerful plural forms support. We recommend reading the
original GNU documentation <a href="http://www.gnu.org/software/gettext/manual/gettext.html#Plural-forms">here</a>.

Let's try to solve a simple problem, displaying a message to the user:

\code
if(files == 1)
    cout << translate("You have 1 file in the directory") << endl;
else
    cout << format(translate("You have {1} files in the directory")) % files << endl;
\endcode

This very simple task becomes quite complicated when we deal with languages other than English. Many languages have more
than two plural forms. For example, in Hebrew there are special forms for single, double, plural, and plural above 10.
They can't be distinguished by the simple rule "is n 1 or not".

The correct solution is to give the translator the ability to choose the plural form on their own. Thus the translate
function can receive two additional parameters, the English plural form and a number: <tt>translate(single,plural,count)</tt>

\code
cout << format(translate( "You have {1} file in the directory",
                          "You have {1} files in the directory",
                          files)) % files << endl;
\endcode

A special entry in the dictionary specifies the rule for choosing the correct plural form in the target language.
For example, the Slavic language family has 3 plural forms, which can be chosen using the following equation:

\code
plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
\endcode

Such an equation is stored in the message catalog itself and is evaluated during translation to supply the correct form.

So the code above would display 3 different forms in a Russian locale for values of 1, 3 and 5:

\verbatim
У вас есть 1 файл в каталоге
У вас есть 3 файла в каталоге
У вас есть 5 файлов в каталоге
\endverbatim

And for Japanese, which does not have plural forms at all, it would display the same message
for any numeric value.

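The Slavic rule quoted above is an ordinary C-style expression, so it can be transcribed into C++ and checked by hand. The sketch below is only an illustration: \c slavic_plural_index and \c russian_file_form are hypothetical names, and in a real program the rule is read from the catalog and evaluated by the library, not written in the application.

```cpp
#include <string>

// Direct transcription of the rule
//   plural=n%10==1 && n%100!=11 ? 0
//        : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
// Returns the index of the plural form to use for the count n.
unsigned slavic_plural_index(unsigned long n)
{
    if (n % 10 == 1 && n % 100 != 11)
        return 0;
    if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
        return 1;
    return 2;
}

// Hypothetical table holding the three forms of the word "file"
// used in the Russian messages shown above.
std::string russian_file_form(unsigned long n)
{
    static const char* const forms[] = { "файл", "файла", "файлов" };
    return forms[slavic_plural_index(n)];
}
```

Note that the rule depends on the last one or two digits rather than on the magnitude of the number: 21 takes the same form as 1, while 11 through 14 take the same form as 5.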
For more detailed information please refer to GNU Gettext: <a href="http://www.gnu.org/software/gettext/manual/gettext.html#Plural-forms">11.2.6 Additional functions for plural forms</a>

\subsection adding_context_information Adding Context Information

In many cases it is not sufficient to provide only the original English string to get the correct translation.
You sometimes need to provide some context information. In German, for example, a button labeled "open" is translated to
"öffnen" in the context of "opening a file", or to "aufbauen" in the context of opening an internet connection.

In these cases you must add some context information to the original string:

\code
button->setLabel(translate("File","open"));
\endcode

The context information is provided as the first parameter to the \ref boost::locale::translate() "translate"
function in both singular and plural forms. The translator sees this context information and is able to translate the
"open" string correctly.

For example, this is how the \c po file would look:

\code
msgctxt "Internet Connection"
\endcode

\note Context information requires more recent versions of the gettext tools (>=0.15) for extracting strings and
formatting message catalogs.

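It may help to see why two messages with the same text but different contexts do not collide. In compiled GNU Gettext \c mo catalogs, a message that has a context is stored under a key made of the context, an EOT byte (0x04) and the message id. The sketch below reproduces that convention in standard C++; \c make_catalog_key is a hypothetical helper, and nothing here claims that Boost.Locale represents keys this way internally.

```cpp
#include <string>

// Separator used between context and message id in GNU Gettext mo catalogs.
const char context_separator = '\x04';

// Hypothetical helper: build the lookup key for a message.
// Messages without a context are stored under the plain message id.
std::string make_catalog_key(const std::string& context, const std::string& id)
{
    if (context.empty())
        return id;
    return context + context_separator + id;
}
```

With this scheme, "open" in the "File" context and "open" in the "Internet Connection" context produce two distinct keys, so each can carry its own translation.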
\subsection multiple_gettext_domain Working with multiple messages domains

In some cases it is useful to work with multiple message domains.

For example, if an application consists of several independent modules, it may
have several domains - a separate domain for each module.

For example, developing a FooBar office suite we might have:

- a FooBar Word Processor, using the "foobarwriter" domain
- a FooBar Spreadsheet, using the "foobarspreadsheet" domain
- a FooBar Spell Checker, using the "foobarspell" domain
- a FooBar File handler, using the "foobarodt" domain

There are three ways to use non-default domains:

- When working with \c iostream, you can use the parameterized manipulator
  \ref boost::locale::as::domain "as::domain(std::string const &)", which allows switching domains in a stream:
  \code
  cout << as::domain("foo") << translate("Hello") << as::domain("bar") << translate("Hello");
  // First translation is taken from dictionary foo and the other from dictionary bar
  \endcode
- You can specify the domain explicitly when converting a \c message object to a string:
  \code
  std::wstring foo_msg = translate(L"Hello World").str("foo");
  std::wstring bar_msg = translate(L"Hello World").str("bar");
  \endcode
- You can specify the domain directly using a \ref direct_message_translation "convenience" interface:
  \code
  MessageBox(dgettext("gui","Error Occurred"));
  \endcode

\subsection direct_message_translation Direct translation (Convenience Interface)

Many applications do not write messages directly to an output stream or use only one locale in the process, so
calling <tt>translate("Hello World").str()</tt> for a single message would be annoying. Thus Boost.Locale provides
GNU Gettext-like localization functions for direct translation of the messages. However, unlike the GNU Gettext functions,
the Boost.Locale translation functions provide an additional optional parameter (locale), and support wide, u16 and u32 strings.

The prototypes of the GNU Gettext-like functions can be found \ref boost_locale_gettext_family "in this section".

All of these functions can have different prefixes for different forms:

- \c d - translation in a specific domain
- \c n - plural form translation
- \c p - translation in a specific context

\code
MessageBoxW(0,pgettext(L"File Dialog",L"Open?").c_str(),gettext(L"Question").c_str(),MB_YESNO);
\endcode

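The prefixes listed above combine, and they are always written in the order \c d, \c n, \c p. The following sketch only spells out that naming scheme; \c gettext_function_name is a hypothetical helper that builds the name of the function to look for in the reference section, it does not call any translation function.

```cpp
#include <string>

// Hypothetical helper: name of the gettext-like function that provides
// the requested combination of features (domain, plural form, context).
std::string gettext_function_name(bool with_domain, bool with_plural, bool with_context)
{
    std::string name;
    if (with_domain)  name += 'd'; // translation in a specific domain
    if (with_plural)  name += 'n'; // plural form translation
    if (with_context) name += 'p'; // translation in a specific context
    return name + "gettext";
}
```

For instance, a plural message with a context corresponds to \c npgettext, and adding an explicit domain on top of that gives \c dnpgettext.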
\section extracting_messages_from_code Extracting messages from the source code

There are many tools to extract messages from the source code into the \c .po file format. The most
popular and "native" tool is \c xgettext, which is installed by default on most Unix systems and freely downloadable
for Windows (see \ref gettext_for_windows).

For example, we have a source file called \c dir.cpp that prints:

\code
cout << format(translate("Listing of catalog {1}:")) % file_name << endl;
cout << format(translate("Catalog {1} contains 1 file","Catalog {1} contains {2,num} files",files_no))
        % file_name % files_no << endl;
\endcode

Now we run:

\code
xgettext --keyword=translate:1,1t --keyword=translate:1,2,3t dir.cpp
\endcode

This creates a file called \c messages.po that looks approximately like this:

\code
msgid "Listing of catalog {1}:"
msgid "Catalog {1} contains 1 file"
msgid_plural "Catalog {1} contains {2,num} files"
\endcode

This file can be given to translators to adapt it to specific languages.

We used the \c --keyword parameter of \c xgettext to make it suitable for extracting messages from
source code localized with Boost.Locale, searching for <tt>translate()</tt> function calls instead of the default <tt>gettext()</tt>
and <tt>ngettext()</tt> ones.
The first parameter <tt>--keyword=translate:1,1t</tt> provides the template for basic messages: a \c translate function that is
called with 1 argument (1t), whose first argument is taken as the key. The second one <tt>--keyword=translate:1,2,3t</tt> is used
for plural forms.
It tells \c xgettext to use a <tt>translate()</tt> function call with 3 parameters (3t) and take the 1st and 2nd parameters as keys. An
additional marker \c Nc can be used to mark context information.

The full set of xgettext parameters suitable for Boost.Locale is:

\code
xgettext --keyword=translate:1,1t --keyword=translate:1c,2,2t \
         --keyword=translate:1,2,3t --keyword=translate:1c,2,3,4t \
         --keyword=gettext:1 --keyword=pgettext:1c,2 \
         --keyword=ngettext:1,2 --keyword=npgettext:1c,2,3 \
         source_file_1.cpp ... source_file_N.cpp
\endcode

Of course, if you do not use "gettext"-like translation, you
may ignore some of these parameters.

\subsection custom_file_system_support Custom Filesystem Support

When access to the actual file system is limited, as in ActiveX controls, or
when the developer wants to ship an all-in-one executable file,
it is useful to be able to load \c gettext catalogs from a custom location -
a custom file system.

Boost.Locale provides an option to install the boost::locale::message_format facet
with customized options provided in the boost::locale::gnu_gettext::messages_info structure.

This structure contains a \c boost::function based
\ref boost::locale::gnu_gettext::messages_info::callback_type "callback"
that allows the user to provide custom functionality to load message catalog files.

\code
// Configure all options for message catalog
namespace blg = boost::locale::gnu_gettext;
blg::messages_info info;
info.language = "he";
info.encoding="UTF-8";
info.paths.push_back(""); // You need some even empty path
info.domains.push_back(blg::messages_info::domain("my_app"));
info.callback = some_file_loader; // Provide a callback

// Create a basic locale without messages support
boost::locale::generator gen;
std::locale base_locale = gen("he_IL.UTF-8");

// Install messages catalogs for "char" support to the final locale
// we are going to use
std::locale real_locale(base_locale,blg::create_messages_facet<char>(info));
\endcode

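The \c some_file_loader callback used above is left undefined. As a rough sketch, and assuming the callback receives the catalog file name and the requested encoding and returns the raw file content as a <tt>std::vector<char></tt> (with an empty vector meaning "not found"), a loader that serves catalogs embedded in the executable could look as follows. The \c embedded_catalogs table and the way it is filled are hypothetical; check the \c callback_type reference for the exact signature before relying on it.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory "file system": catalog file name -> raw mo bytes.
std::map<std::string, std::vector<char> >& embedded_catalogs()
{
    static std::map<std::string, std::vector<char> > files;
    return files;
}

// Assumed callback shape: takes the file name and the encoding and returns
// the whole file content, or an empty vector when the file is not available.
std::vector<char> some_file_loader(const std::string& file_name,
                                   const std::string& /*encoding*/)
{
    const std::map<std::string, std::vector<char> >& files = embedded_catalogs();
    std::map<std::string, std::vector<char> >::const_iterator it = files.find(file_name);
    if (it == files.end())
        return std::vector<char>(); // not found
    return it->second;
}
```

The table would typically be filled once at start-up, for example from resources linked into the binary, before any locale that uses the callback is created.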
In order to set up \ref boost::locale::gnu_gettext::messages_info::language "language", \ref boost::locale::gnu_gettext::messages_info::country "country" and other members, you may use the \ref boost::locale::info facet for convenience:

\code
// Configure all options for message catalog
namespace blg = boost::locale::gnu_gettext;
blg::messages_info info;

info.paths.push_back(""); // You need some even empty path
info.domains.push_back(blg::messages_info::domain("my_app"));
info.callback = some_file_loader; // Provide a callback

// Create an object with default locale
boost::locale::generator gen;
std::locale base_locale = gen("");

// Use boost::locale::info to configure all parameters
boost::locale::info const &properties = std::use_facet<boost::locale::info>(base_locale);
info.language = properties.language();
info.country = properties.country();
info.encoding = properties.encoding();
info.variant = properties.variant();

// Install messages catalogs to the final locale
std::locale real_locale(base_locale,blg::create_messages_facet<char>(info));
\endcode

\section msg_non_ascii_keys Non US-ASCII Keys

Boost.Locale assumes that you use English for original text messages, and the best
practice is to use US-ASCII characters for original keys.

However, in some cases it is useful to insert some Unicode characters in the text, like
the copyright "©" character, for example.

As long as your narrow character string encoding is UTF-8, nothing further needs to be done.

Boost.Locale assumes that your sources are encoded in UTF-8 and the input narrow
strings use UTF-8 - which is the default for most compilers (with the notable
exception of Microsoft Visual C++).

However, if the narrow string encoding in your source file is not UTF-8 but some other
encoding like windows-1252, the string would be misinterpreted.

You can specify the character set of the original strings when you specify the
domain name for the application.

\code
#include <boost/locale.hpp>
#include <iostream>

using namespace std;
using namespace boost::locale;

int main()
{
    generator gen;

    // Specify location of dictionaries
    gen.add_messages_path(".");
    // Specify the encoding of the source string
    gen.add_messages_domain("copyrighted/windows-1255");

    // Generate locales and imbue them to iostream
    locale::global(gen(""));
    cout.imbue(locale());

    // In Windows 1255 (C) symbol is encoded as 0xA9
    cout << translate("© 2001 All Rights Reserved") << endl;
}
\endcode

Thus if the program runs in a UTF-8 locale, the copyright symbol is
automatically converted to the appropriate UTF-8 sequence if the
key is missing in the dictionary.

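The domain string used in the example above carries two pieces of information separated by a slash: the domain name and the encoding of the keys. The sketch below shows one way to split such a string, assuming that a missing encoding means UTF-8 as described earlier in this section. \c domain_spec and \c parse_domain are hypothetical names used only for illustration.

```cpp
#include <string>

// Hypothetical holder for the two parts of a "name/encoding" domain string.
struct domain_spec {
    std::string name;
    std::string encoding;
};

// Split "copyrighted/windows-1255" into name and key encoding.
// When no encoding is given, UTF-8 is assumed.
domain_spec parse_domain(const std::string& id)
{
    domain_spec result;
    const std::string::size_type pos = id.find('/');
    if (pos == std::string::npos) {
        result.name = id;
        result.encoding = "UTF-8";
    } else {
        result.name = id.substr(0, pos);
        result.encoding = id.substr(pos + 1);
    }
    return result;
}
```

With this convention a plain domain such as "hello" keeps working unchanged, and only sources that are not stored in UTF-8 need the extra suffix.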
\subsection msg_qna Questions and Answers

- Do I need GNU Gettext to use Boost.Locale?

  Boost.Locale provides a run-time environment to load and use GNU Gettext message catalogs, but it does
  not provide tools for generation, translation, compilation and management of these catalogs.
  Boost.Locale only reimplements the GNU Gettext libintl.

  You would probably need:

  -# Boost.Locale itself -- for runtime.
  -# A tool for extracting strings from source code, and managing them: GNU Gettext provides good tools, but other
     implementations are available as well.
  -# A good translation program like <a href="http://userbase.kde.org/Lokalize">Lokalize</a>, <a href="http://www.poedit.net/">Poedit</a> or <a href="http://projects.gnome.org/gtranslator/">GTranslator</a>.

- Why doesn't Boost.Locale provide tools for extracting and managing message catalogs? Why should
  I use GPL-ed software? Are my programs or message catalogs affected by its license?

  -# Boost.Locale does not link to or use any of the GNU Gettext code, so you need not worry about your code, as
     the runtime library is fully reimplemented.
  -# You may freely use GPL-ed software for extracting and managing catalogs, the same way as you are free to use
     a GPL-ed editor. It does not affect your message catalogs or your code.
  -# I see no reason to reimplement well debugged, working tools like \c xgettext, \c msgfmt, \c msgmerge that
     do a very fine job, especially as they are freely available for download and support almost any platform.
     All Linux distributions, BSD flavors, Mac OS X and other Unix-like operating systems provide GNU Gettext tools
     as a standard package.\n
     Windows users can get GNU Gettext utilities via the MinGW project. See \ref gettext_for_windows.

- Is there any reason to prefer the Boost.Locale implementation to the original GNU Gettext runtime library?
  In either case I would probably need some of the GNU tools.

  There are three important differences between the GNU Gettext runtime library and the Boost.Locale implementation:

  -# The GNU Gettext runtime supports only one locale per process. It is not thread-safe to use multiple locales
     and encodings in the same process. This is perfectly fine for applications that interact directly with
     a single user, like most GUI applications, but is problematic for services and servers.
  -# The GNU Gettext API supports only 8-bit encodings, making it irrelevant in environments that natively use
     wide strings.
  -# The GNU Gettext runtime library is distributed under the LGPL license, which may not be convenient for some users.