NOTES ON OPTIMIZING DICTIONARIES
================================


Principal Use Cases for Dictionaries
------------------------------------

Passing keyword arguments
    Typically, one read and one write for 1 to 3 elements.
    Occurs frequently in normal Python code.

Class method lookup
    Dictionaries vary in size with 8 to 16 elements being common.
    Usually written once with many lookups.
    When base classes are used, there are many failed lookups
    followed by a lookup in a base class.

Instance attribute lookup and Global variables
    Dictionaries vary in size.  4 to 10 elements are common.
    Both reads and writes are common.

Builtins
    Frequent reads.  Almost never written.
    Size: 126 interned strings (as of Py2.3b1).
    A few keys are accessed much more frequently than others.

Uniquification
    Dictionaries of any size.  Bulk of work is in creation.
    Repeated writes to a smaller set of keys.
    Single read of each key.
    Some use cases have two consecutive accesses to the same key.

    * Removing duplicates from a sequence.
        dict.fromkeys(seqn).keys()

    * Counting elements in a sequence.
        for e in seqn:
            d[e] = d.get(e, 0) + 1

    * Accumulating references in a dictionary of lists:
        for pagenumber, page in enumerate(pages):
            for word in page:
                d.setdefault(word, []).append(pagenumber)

    Note that the second example is a use case characterized by a get
    and set to the same key.  There are similar use cases with a
    __contains__ followed by a get, set, or del to the same key.  Part
    of the justification for d.setdefault is combining the two lookups
    into one.

Membership Testing
    Dictionaries of any size.  Created once and then rarely changed.
    Single write to each key.
    Many calls to __contains__() or has_key().
    Similar access patterns occur with replacement dictionaries
    such as with the % formatting operator.

Dynamic Mappings
    Characterized by deletions interspersed with adds and replacements.
    Performance benefits greatly from the re-use of dummy entries.


Data Layout (assuming a 32-bit box with 64 bytes per cache line)
----------------------------------------------------------------

Small dicts (8 entries) are attached to the dictobject structure
and the whole group nearly fills two consecutive cache lines.

Larger dicts use the first half of the dictobject structure (one cache
line) and a separate, contiguous block of entries (at 12 bytes each
for a total of 5.333 entries per cache line).


Tunable Dictionary Parameters
-----------------------------

* PyDict_MINSIZE.  Currently set to 8.
    Must be a power of two.  New dicts have to zero-out every cell.
    Each additional 8 entries consumes 1.5 cache lines.  Increasing it
    improves the sparseness of small dictionaries but costs time to
    read in the additional cache lines if they are not already in
    cache.  That case is common when keyword arguments are passed.

* Maximum dictionary load in PyDict_SetItem.  Currently set to 2/3.
    Increasing this ratio makes dictionaries more dense, resulting
    in more collisions.  Decreasing it improves sparseness at the
    expense of spreading entries over more cache lines and at the
    cost of total memory consumed.

    The load test occurs in highly time-sensitive code.  Efforts
    to make the test more complex (for example, varying the load
    for different sizes) have degraded performance.

* Growth rate upon hitting maximum load.  Currently set to *2.
    Raising this to *4 results in half the number of resizes, less
    effort to resize, and better sparseness for some (but not all)
    dict sizes, but potentially doubles memory consumption depending
    on the size of the dictionary.  Setting it to *4 eliminates every
    other resize step.  (A sketch of how these parameters interact
    follows this list.)

* Maximum sparseness (minimum dictionary load).  What percentage
    of entries can be unused before the dictionary shrinks to
    free up memory and speed up iteration?  (The current CPython
    code does not represent this parameter directly.)

* Shrinkage rate upon exceeding maximum sparseness.  The current
    CPython code never even checks sparseness when deleting a
    key.  When a new key is added, it resizes based on the number
    of active keys, so that the addition may trigger shrinkage
    rather than growth.
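A rough Python sketch of how PyDict_MINSIZE, the 2/3 maximum load,
and the *2 growth rate interact.  This is an illustration only, not
the CPython implementation (the real resize logic lives in
dictobject.c and differs in detail); the helper name and the exact
growth rule are assumptions made for the example.

    PYDICT_MINSIZE = 8
    MAX_LOAD_NUM, MAX_LOAD_DEN = 2, 3     # maximum load of 2/3
    GROWTH = 2                            # *2 growth on hitting max load

    def simulate_inserts(n):
        """Table size and resize count after n distinct insertions."""
        size, used, resizes = PYDICT_MINSIZE, 0, 0
        for _ in range(n):
            used += 1
            # Grow whenever the load factor would exceed 2/3.
            if used * MAX_LOAD_DEN > size * MAX_LOAD_NUM:
                size *= GROWTH
                resizes += 1
        return size, resizes

    # Under these assumptions, simulate_inserts(100) returns (256, 5):
    # five resizes to hold 100 keys at no more than 2/3 load.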
Tune-ups should be measured across a broad range of applications and
use cases.  A change to any parameter will help in some situations and
hurt in others.  The key is to find settings that help the most common
cases and do the least damage to the less common cases.  Results will
vary dramatically depending on the exact number of keys, whether the
keys are all strings, whether reads or writes dominate, and the exact
hash values of the keys (some sets of values have fewer collisions than
others).  Any one test or benchmark is likely to prove misleading.

While making a dictionary more sparse reduces collisions, it impairs
iteration and key listing.  Those methods loop over every potential
entry.  Doubling the size of a dictionary results in twice as many
non-overlapping memory accesses for keys(), items(), values(),
__iter__(), iterkeys(), iteritems(), itervalues(), and update().
Also, every dictionary is traversed at least twice: once by memset()
when it is created and once by dealloc().

Dictionary operations involving only a single key can be O(1) unless
resizing is possible.  By checking for a resize only when the
dictionary can grow (and may *require* resizing), other operations
remain O(1), and the odds of resize thrashing or memory fragmentation
are reduced.  In particular, an algorithm that empties a dictionary
by repeatedly invoking .pop will see no resizing, which would have
been wasted effort anyway because the dictionary is eventually
discarded entirely.


Results of Cache Locality Experiments
-------------------------------------

When an entry is retrieved from memory, 4.333 adjacent entries are also
retrieved into a cache line.  Since accessing items in cache is *much*
cheaper than a cache miss, an enticing idea is to probe the adjacent
entries as a first step in collision resolution.  Unfortunately, the
introduction of any regularity into collision searches results in more
collisions than the current random chaining approach.

Exploiting cache locality at the expense of additional collisions fails
to pay off when the entries are already loaded in cache (the expense
is paid with no compensating benefit).  This occurs in small dictionaries
where the whole dictionary fits into a pair of cache lines.  It also
occurs frequently in large dictionaries which have a common access pattern
where some keys are accessed much more frequently than others.  The
more popular entries *and* their collision chains tend to remain in cache.

To exploit cache locality, change the collision resolution section
in lookdict() and lookdict_string().  Set i ^= 1 at the top of the
loop and move the i = (i << 2) + i + perturb + 1 to an unrolled
version of the loop.
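For reference, here is the probe recurrence rendered as Python.  The
loop body mirrors the perturb-based recurrence used by lookdict(), but
the sketch itself (the helper name, and the adjacency variant noted in
the comment) is illustrative rather than the C implementation.

    PERTURB_SHIFT = 5

    def probe_sequence(h, mask, n):
        """First n table slots probed for a key with hash h."""
        i = h & mask
        perturb = h
        slots = [i]
        for _ in range(n - 1):
            # The recurrence scatters later probes across the table.
            i = (i << 2) + i + perturb + 1
            perturb >>= PERTURB_SHIFT
            slots.append(i & mask)
        return slots

    # probe_sequence(123456, 31, 4) lists the first four slots probed
    # in a 32-cell table.  The proposed locality variant would first
    # try the cache-line neighbor, slot i ^ 1, before taking the
    # scattering jump above.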
This optimization strategy can be leveraged in several ways:

* If the dictionary is kept sparse (through the tunable parameters),
then the occurrence of additional collisions is lessened.

* If lookdict() and lookdict_string() are specialized for small dicts
and for large dicts, then the versions for large dicts can be given
an alternate search strategy without increasing collisions in small dicts
which already have the maximum benefit of cache locality.

* If the use case for a dictionary is known to have a random key
access pattern (as opposed to a more common pattern with a Zipf's law
distribution), then there will be more benefit for large dictionaries
because any given key is no more likely than another to already be
in cache.

* In use cases with paired accesses to the same key, the second access
is always in cache and gets no benefit from efforts to further improve
cache locality.


Optimizing the Search of Small Dictionaries
-------------------------------------------

If lookdict() and lookdict_string() are specialized for smaller dictionaries,
then a custom search approach can be implemented that exploits the small
search space and cache locality.

* The simplest example is a linear search of contiguous entries.  This is
  simple to implement, guaranteed to terminate rapidly, never searches
  the same entry twice, and precludes the need to check for dummy entries.

* A more advanced example is a self-organizing search so that the most
  frequently accessed entries get probed first.  The organization
  adapts if the access pattern changes over time.  Treaps are ideally
  suited for self-organization, with the most common entries at the
  top of the heap and a rapid binary search pattern.  Most probes and
  results are located at the top of the tree, allowing them all to
  fit in one or two cache lines.

* Also, small dictionaries may be made more dense, perhaps filling all
  eight cells to take the maximum advantage of two cache lines.


Strategy Pattern
----------------

Consider allowing the user to set the tunable parameters or to select a
particular search method.  Since some dictionary use cases have known
sizes and access patterns, the user may be able to provide useful hints.

1) For example, if membership testing or lookups dominate runtime and memory
   is not at a premium, the user may benefit from setting the maximum load
   ratio at 5% or 10% instead of the usual 66.7%.  This will sharply
   curtail the number of collisions but will increase iteration time.
   The builtin namespace is a prime example of a dictionary that can
   benefit from being highly sparse.

2) Dictionary creation time can be shortened in cases where the ultimate
   size of the dictionary is known in advance.  The dictionary can be
   pre-sized so that no resize operations are required during creation.
   Not only does this save resizes, but key insertion will go more
   quickly because the first half of the keys will be inserted into
   a sparser environment than they otherwise would be.  The
   preconditions for this strategy arise whenever a dictionary is
   created from a key or item sequence and the number of *unique* keys
   is known.

3) If the key space is large and the access pattern is known to be random,
   then search strategies exploiting cache locality can be fruitful.
   The preconditions for this strategy arise in simulations and
   numerical analysis.

4) If the keys are fixed and the access pattern strongly favors some of
   the keys, then the entries can be stored contiguously and accessed
   with a linear search or treap (a move-to-front sketch follows this
   list).  This exploits knowledge of the data, cache locality, and a
   simplified search routine.  It also eliminates the need to test for
   dummy entries on each probe.  The preconditions for this strategy
   arise in symbol tables and in the builtin dictionary.
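A minimal Python sketch of the contiguous, self-organizing idea from
item 4, using move-to-front in place of a treap for brevity.  The
class and method names are hypothetical; a real version would live in
C alongside lookdict().

    class SmallMap(object):
        """Contiguous (key, value) pairs searched linearly."""

        def __init__(self, items):
            self._entries = list(items)       # e.g. [('self', 0), ...]

        def lookup(self, key):
            entries = self._entries
            for pos, (k, v) in enumerate(entries):
                if k == key:
                    if pos:
                        # Self-organize: hot keys migrate to the front,
                        # so popular probes stay in the first cache line.
                        entries.insert(0, entries.pop(pos))
                    return v
            raise KeyError(key)

    # m = SmallMap([('a', 1), ('b', 2), ('c', 3)])
    # m.lookup('c'); m.lookup('c')    # the second call hits position 0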
Readonly Dictionaries
---------------------
Some dictionary use cases pass through a build stage and then move to a
more heavily exercised lookup stage with no further changes to the
dictionary.

One idea that emerged on python-dev is to allow converting a dictionary
to a read-only state.  This can help prevent programming errors and also
provide knowledge that can be exploited for lookup optimization.

The dictionary can be immediately rebuilt (eliminating dummy entries),
resized (to an appropriate level of sparseness), and the keys can be
jostled (to minimize collisions).  The lookdict() routine can then
eliminate the test for dummy entries (saving about 1/4 of the time
spent in the collision resolution loop).

An additional possibility is to insert links into the empty spaces
so that dictionary iteration can proceed in len(d) steps instead of
(mp->mask + 1) steps.  Alternatively, a separate tuple of keys can be
kept just for iteration.


Caching Lookups
---------------
The idea is to exploit key access patterns by anticipating future lookups
based on previous lookups.

The simplest incarnation is to save the most recently accessed entry.
This gives optimal performance for use cases where every get is followed
by a set or del to the same key.
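A pure-Python sketch of that simplest incarnation.  At Python level the
wrapper costs more than it saves (the caching would have to live in the
C lookup routine to pay off); it is shown only to make the access
pattern concrete, and the class name is hypothetical.

    class CachingDict(dict):
        """dict that remembers the most recently accessed entry."""

        def __init__(self, *args, **kwargs):
            dict.__init__(self, *args, **kwargs)
            self._last = None                 # (key, value) or None

        def __getitem__(self, key):
            last = self._last
            if last is not None and last[0] == key:
                return last[1]                # reuse the previous lookup
            value = dict.__getitem__(self, key)
            self._last = (key, value)
            return value

        def __setitem__(self, key, value):
            dict.__setitem__(self, key, value)
            self._last = (key, value)         # a get-then-set hits here

        def __delitem__(self, key):
            dict.__delitem__(self, key)
            self._last = None                 # invalidate on deletion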