//
// Copyright (c) 2009-2011 Artyom Beilis (Tonkikh)
//
// Distributed under the Boost Software License, Version 1.0. (See
// accompanying file LICENSE_1_0.txt or copy at
// http://www.boost.org/LICENSE_1_0.txt)
//

// vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4 filetype=cpp.doxygen
/*!
\page std_locales Introduction to C++ Standard Library localization support

\section std_locales_basics Getting familiar with standard C++ Locales

The C++ standard library offers a simple and powerful way to provide locale-specific information. It is done via the \c
std::locale class, the container that holds all the required information about a specific culture, such as number formatting
patterns, date and time formatting, currency, case conversion etc.

All this information is provided by facets, special classes derived from the \c std::locale::facet base class. Such facets are
packed into the \c std::locale class and allow you to provide arbitrary information about the locale. The \c std::locale class
keeps reference counters on installed facets and can be efficiently copied.

Each facet that was installed into the \c std::locale object can be fetched using the \c std::use_facet function. For example,
the \c std::ctype<Char> facet provides rules for case conversion, so you can convert a character to upper-case like this:

\code
std::ctype<char> const &ctype_facet = std::use_facet<std::ctype<char> >(some_locale);
char upper_a = ctype_facet.toupper('a');
\endcode
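
As a minimal sketch of the lookup above, the following wraps \c std::use_facet in a small helper. The helper name
\c to_upper_with is an assumption made for illustration, and \c std::locale::classic() is suggested as the argument because
the "C" locale exists on every platform:

```cpp
#include <locale>

// Hypothetical helper (not part of the standard library): upper-case one
// character using the std::ctype<char> facet installed in loc.
char to_upper_with(std::locale const &loc, char c)
{
    // std::has_facet reports whether the facet is installed; std::use_facet
    // would throw std::bad_cast if it were missing.
    if (!std::has_facet<std::ctype<char> >(loc))
        return c;
    std::ctype<char> const &ctype_facet = std::use_facet<std::ctype<char> >(loc);
    return ctype_facet.toupper(c);
}

// Example: to_upper_with(std::locale::classic(), 'a') yields 'A' in the "C" locale.
```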

A locale object can be imbued into an \c iostream so that the stream formats output according to that locale:

\code
std::cout.imbue(std::locale("en_US.UTF-8"));
std::cout << 1345.45 << std::endl;
std::cout.imbue(std::locale("ru_RU.UTF-8"));
std::cout << 1345.45 << std::endl;
\endcode

This would display:

\verbatim
1,345.45
1.345,45
\endverbatim
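
Named locales such as \c en_US.UTF-8 may not be installed on a given system, in which case the \c std::locale constructor
throws \c std::runtime_error. As a portable sketch of the same imbuing mechanism, the example below installs a hand-written
\c std::numpunct<char> facet instead of a named locale. The names \c dot_grouping, \c format_with and
\c make_dot_grouping_locale are assumptions made for illustration; with this facet, formatting 1345.45 should give
"1.345,45", while the classic locale gives "1345.45":

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for illustration: '.' groups thousands, ',' marks decimals.
class dot_grouping : public std::numpunct<char> {
protected:
    char do_decimal_point() const { return ','; }
    char do_thousands_sep() const { return '.'; }
    std::string do_grouping() const { return "\3"; } // groups of three digits
};

// Format value through a string stream imbued with loc.
std::string format_with(std::locale const &loc, double value)
{
    std::ostringstream out;
    out.imbue(loc);
    out << value;
    return out.str();
}

// Build a locale that is the classic "C" locale plus the custom numpunct facet.
// The locale takes ownership of the facet and deletes it when no longer used.
std::locale make_dot_grouping_locale()
{
    return std::locale(std::locale::classic(), new dot_grouping);
}
```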

You can also create your own facets and install them into existing locale objects. For example:

\code
class measure : public std::locale::facet {
public:
    typedef enum { inches, ... } measure_type;
    static std::locale::id id; // every facet needs its own static id
    measure(measure_type m, size_t refs = 0);
    double from_metric(double value) const;
    std::string name() const;
    ...
};
\endcode

And now you can simply provide this information to a locale:

\code
// Create the global default locale from the en_US locale and add the measure facet.
std::locale::global(std::locale(std::locale("en_US.UTF-8"), new measure(measure::inches)));
\endcode

Now you can print a distance according to the correct locale:

\code
void print_distance(std::ostream &out, double value)
{
    // Fetch locale information from stream
    measure const &m = std::use_facet<measure>(out.getloc());
    out << m.from_metric(value) << " " << m.name();
}
\endcode
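
The fragments above can be put together into a compilable sketch. Everything the original elides is an assumption made
for illustration: the second enumerator \c centimeters, the choice of centimeters as the metric input unit, the unit
names "in" and "cm", and the helper \c distance_as, which imbues a string stream instead of changing the global locale:

```cpp
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>

// Sketch of the measure facet; units and conversion factor are illustrative assumptions.
class measure : public std::locale::facet {
public:
    enum measure_type { inches, centimeters };
    static std::locale::id id; // required so that std::use_facet can find the facet

    explicit measure(measure_type m, std::size_t refs = 0)
        : std::locale::facet(refs), type_(m) {}

    // Convert a length given in centimeters to this facet's unit.
    double from_metric(double value) const
    {
        return type_ == inches ? value / 2.54 : value;
    }
    std::string name() const
    {
        return type_ == inches ? "in" : "cm";
    }
private:
    measure_type type_;
};

std::locale::id measure::id;

void print_distance(std::ostream &out, double value)
{
    // Fetch the facet from the locale imbued into the stream.
    measure const &m = std::use_facet<measure>(out.getloc());
    out << m.from_metric(value) << " " << m.name();
}

// Hypothetical helper: format value (in centimeters) using the unit m.
std::string distance_as(measure::measure_type m, double value)
{
    std::ostringstream out;
    out.imbue(std::locale(std::locale::classic(), new measure(m)));
    print_distance(out, value);
    return out.str();
}
```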

This technique was adopted by the Boost.Locale library in order to provide powerful and correct localization. Instead of using
the very limited C++ standard library facets, it uses ICU under the hood to create its own much more powerful ones.

\section std_locales_common Common Critical Problems with the Standard Library

Several serious issues in the standard library prevent you from using its localization support to its full power:

-   Setting the global locale has bad side effects.
    \n
    Consider the following code:
    \n
    \code
    #include <fstream>
    #include <locale>

    int main()
    {
        // Set the system's default locale as the global locale
        std::locale::global(std::locale(""));
        std::ofstream csv("test.csv");
        csv << 1.1 << "," << 1.3 << std::endl;
    }
    \endcode
    \n
    What would be the content of \c test.csv ? It may be "1.1,1.3", or it may be "1,1,1,3"
    rather than what you expected.
    \n
    Moreover, it affects even \c printf and libraries like \c boost::lexical_cast, giving
    incorrect or unexpected formatting. In fact, many third-party libraries are broken in such a
    situation.
    \n
    Unlike the standard localization library, Boost.Locale never changes the basic number formatting,
    even when it uses \c std based localization backends, so by default, numbers are always
    formatted using the C-style locale. Localized number formatting requires specific flags.
    \n
-   Number formatting is broken on some locales.
    \n
    Some locales use the no-break space character U+00A0 as the thousands separator, thus
    in the \c ru_RU.UTF-8 locale the number 1024 should be displayed as "1 024", where the space
    is the Unicode character with code point U+00A0. Unfortunately, many libraries don't handle
    this correctly. For example, GCC and SunStudio emit only "\xC2", the first byte of
    the UTF-8 sequence "\xC2\xA0" that represents this code point, and
    thus generate invalid UTF-8.
    \n
-   Locale names are not standardized. For example, under MSVC you need to provide the name
    \c en-US or \c English_USA.1252 , whereas on POSIX platforms it would be \c en_US.UTF-8
    or \c en_US.ISO-8859-1
    \n
    Moreover, MSVC does not support UTF-8 locales at all.
    \n
-   Many standard libraries provide only the C and POSIX locales. For example, GCC supports localization
    only under Linux; on all other platforms, attempting to create locales other than "C" or
    "POSIX" would fail.

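The global-locale pitfall from the first item can be reproduced portably, and one common standard-library workaround is
to imbue \c std::locale::classic() into any stream that writes machine-readable data. The sketch below is an illustration
only: the facet \c comma_decimal stands in for a system locale whose decimal point is a comma, and the names
\c csv_row and \c install_comma_locale are assumptions. With the stand-in locale installed globally, an unpinned row
for 1.1 and 1.3 should come out as "1,1,1,3", while the pinned row stays "1.1,1.3":

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet standing in for a system locale whose decimal point is ','.
class comma_decimal : public std::numpunct<char> {
protected:
    char do_decimal_point() const { return ','; }
};

// Write one CSV row; when pin_classic is true, the stream is first imbued
// with the "C" locale so that the global locale cannot change the output.
std::string csv_row(double a, double b, bool pin_classic)
{
    std::ostringstream csv; // a new stream copies the current global locale
    if (pin_classic)
        csv.imbue(std::locale::classic());
    csv << a << "," << b;
    return csv.str();
}

// Install the stand-in locale globally and return the previous global locale,
// so that the caller can restore it with std::locale::global later.
std::locale install_comma_locale()
{
    return std::locale::global(std::locale(std::locale::classic(), new comma_decimal));
}
```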
*/