..  BSD LICENSE
    Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in
      the documentation and/or other materials provided with the
      distribution.
    * Neither the name of Intel Corporation nor the names of its
      contributors may be used to endorse or promote products derived
      from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

.. _Mempool_Library:

Mempool Library
===============

A memory pool is an allocator of fixed-size objects.
In the DPDK, it is identified by name and uses a mempool handler to store free objects.
The default mempool handler is ring based.
It provides some other optional services such as a per-core object cache and
an alignment helper to ensure that objects are padded to spread them equally across all DRAM or DDR3 channels.

This library is used by the :ref:`Mbuf Library <Mbuf_Library>`.

Cookies
-------

In debug mode (CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG is enabled), cookies are added at the beginning and end of allocated blocks.
The allocated objects then contain overwrite protection fields to help debugging buffer overflows.

Stats
-----

In debug mode (CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG is enabled),
statistics about gets from and puts in the pool are stored in the mempool structure.
Statistics are per-lcore to avoid concurrent access to statistics counters.

Memory Alignment Constraints
----------------------------

Depending on hardware memory configuration, performance can be greatly improved by adding a specific padding between objects.
The objective is to ensure that the beginning of each object starts on a different channel and rank in memory so that all channels are equally loaded.

This is particularly true for packet buffers when doing L3 forwarding or flow classification.
Only the first 64 bytes are accessed, so performance can be increased by spreading the start addresses of objects among the different channels.

The number of ranks on any DIMM is the number of independent sets of DRAMs that can be accessed for the full data bit-width of the DIMM.
The ranks cannot be accessed simultaneously since they share the same data path.
The physical layout of the DRAM chips on the DIMM itself does not necessarily relate to the number of ranks.

When running an application, the EAL command line options provide the ability to specify the number of memory channels and ranks.

.. note::

    The command line must always have the number of memory channels specified for the processor.

Examples of alignment for different DIMM architectures are shown in
:numref:`figure_memory-management` and :numref:`figure_memory-management2`.

.. _figure_memory-management:

.. figure:: img/memory-management.*

   Two Channels and Quad-ranked DIMM Example


In this case, the assumption is that a packet is 16 blocks of 64 bytes, which does not hold in the general case.

The Intel® 5520 chipset has three channels, so in most cases,
no padding is required between objects (except for objects whose size is n x 3 x 64 bytes blocks).

.. _figure_memory-management2:

.. figure:: img/memory-management2.*

   Three Channels and Two Dual-ranked DIMM Example


When creating a new pool, the user can specify whether to use this feature or not.
.. _mempool_local_cache:

Local Cache
-----------

In terms of CPU usage, the cost of multiple cores accessing a memory pool's ring of free buffers may be high
since each access requires a compare-and-set (CAS) operation.
To avoid having too many access requests to the memory pool's ring,
the memory pool allocator can maintain a per-core cache and do bulk requests to the memory pool's ring,
via the cache, with many fewer locks on the actual memory pool structure.
In this way, each core has full access to its own cache of free objects (with no locking required), and
only when the cache fills does the core need to shuffle some of the free objects back to the pool's ring, or
obtain more objects when the cache is empty.

While this may mean a number of buffers may sit idle in some cores' caches,
the speed at which a core can access its own cache for a specific memory pool, without locks, provides performance gains.

The cache is composed of a small, per-core table of pointers and its length (used as a stack).
This internal cache can be enabled or disabled at creation of the pool.

The maximum size of the cache is static and is defined at compilation time (CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE).

:numref:`figure_mempool` shows a cache in operation.

.. _figure_mempool:

.. figure:: img/mempool.*

   A mempool in Memory with its Associated Ring

As an alternative to the internal default per-lcore local cache, an application can create and manage external caches through the ``rte_mempool_cache_create()``, ``rte_mempool_cache_free()`` and ``rte_mempool_cache_flush()`` calls.
These user-owned caches can be explicitly passed to ``rte_mempool_generic_put()`` and ``rte_mempool_generic_get()``.
The ``rte_mempool_default_cache()`` call returns the default internal cache, if any.
In contrast to the default caches, user-owned caches can be used by non-EAL threads too.

Mempool Handlers
----------------

Mempool handlers allow external memory subsystems, such as external hardware memory
management systems and software-based memory allocators, to be used with DPDK.

There are two aspects to a mempool handler:

* Adding the code for your new mempool operations (ops). This is achieved by
  adding a new mempool ops code, and using the ``MEMPOOL_REGISTER_OPS`` macro.

* Using the new API to call ``rte_mempool_create_empty()`` and
  ``rte_mempool_set_ops_byname()`` to create a new mempool and specify which
  ops to use.

Several different mempool handlers may be used in the same application. A new
mempool can be created by using the ``rte_mempool_create_empty()`` function,
then using ``rte_mempool_set_ops_byname()`` to point the mempool to the
relevant mempool handler callback (ops) structure.

Legacy applications may continue to use the old ``rte_mempool_create()`` API
call, which uses a ring-based mempool handler by default. These applications
will need to be modified if they are to use a new mempool handler.

For applications that use ``rte_pktmbuf_pool_create()``, there is a config setting
(``RTE_MBUF_DEFAULT_MEMPOOL_OPS``) that allows the application to make use of
an alternative mempool handler.


Use Cases
---------

All allocations that require a high level of performance should use a pool-based memory allocator.
Below are some examples:

* :ref:`Mbuf Library <Mbuf_Library>`

* :ref:`Environment Abstraction Layer <Environment_Abstraction_Layer>`, for logging service

* Any application that needs to allocate fixed-size objects in the data plane that will be continuously utilized by the system.