NOTES ON DICTIONARIES
================================

Principal Use Cases for Dictionaries
------------------------------------

Passing keyword arguments
    Typically, one read and one write for 1 to 3 elements.
    Occurs frequently in normal python code.

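The keyword-argument pattern above can be illustrated with a minimal sketch (the function and parameter names are hypothetical, for illustration only):

```python
# Hypothetical helper: CPython builds a fresh dict for **options
# on each call; the callee then does a handful of reads.
def connect(**options):
    # Typically 1 to 3 keys, each written once and read once.
    host = options.get("host", "localhost")
    port = options.get("port", 5432)
    return host, port

assert connect(host="db1", port=6000) == ("db1", 6000)
assert connect() == ("localhost", 5432)
```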
Class method lookup
    Dictionaries vary in size with 8 to 16 elements being common.
    Usually written once with many lookups.
    When base classes are used, there are many failed lookups
    followed by a lookup in a base class.

Instance attribute lookup and Global variables
    Dictionaries vary in size. 4 to 10 elements are common.
    Both reads and writes are common.

Builtins
    Frequent reads. Almost never written.
    About 150 interned strings (as of Py3.3).
    A few keys are accessed much more frequently than others.

Uniquification
    Dictionaries of any size. Bulk of work is in creation.
    Repeated writes to a smaller set of keys.
    Single read of each key.
    Some use cases have two consecutive accesses to the same key.

    * Removing duplicates from a sequence.
        dict.fromkeys(seqn).keys()

    * Counting elements in a sequence.
        for e in seqn:
            d[e] = d.get(e, 0) + 1

    * Accumulating references in a dictionary of lists:

        for pagenumber, page in enumerate(pages):
            for word in page:
                d.setdefault(word, []).append(pagenumber)

    Note, the second example is a use case characterized by a get and set
    to the same key. There are similar use cases with a __contains__
    followed by a get, set, or del to the same key. Part of the
    justification for d.setdefault is combining the two lookups into one.

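The accumulation pattern above can also be written with collections.defaultdict, which moves the insert-on-miss step into __missing__ and likewise avoids a second lookup; a minimal runnable sketch (the sample data is invented):

```python
from collections import defaultdict

pages = [["to", "be"], ["or", "not", "to", "be"]]

# d.setdefault(word, []) performs one lookup that either finds
# the existing list or inserts a fresh one.
d = {}
for pagenumber, page in enumerate(pages):
    for word in page:
        d.setdefault(word, []).append(pagenumber)

# defaultdict produces the same index via __missing__.
dd = defaultdict(list)
for pagenumber, page in enumerate(pages):
    for word in page:
        dd[word].append(pagenumber)

assert d == dict(dd)
assert d["to"] == [0, 1]
```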
Membership Testing
    Dictionaries of any size. Created once and then rarely changes.
    Single write to each key.
    Many calls to __contains__() or has_key().
    Similar access patterns occur with replacement dictionaries
    such as with the % formatting operator.

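A minimal sketch of the replacement-dictionary pattern with the % operator: one write per key, followed by repeated reads during formatting and membership tests (the sample keys are invented):

```python
# Replacement dictionary: written once, then read by each
# %-style format operation that names its keys.
replacements = {"user": "alice", "host": "example.org"}

msg = "%(user)s@%(host)s" % replacements
assert msg == "alice@example.org"

# Membership testing follows the same write-once, read-many pattern.
assert "user" in replacements
assert "password" not in replacements
```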
Dynamic Mappings
    Characterized by deletions interspersed with adds and replacements.
    Performance benefits greatly from the re-use of dummy entries.

Data Layout
-----------

Dictionaries are composed of 3 components:
    The dictobject struct itself
    A dict-keys object (keys & hashes)
    A values array


Tunable Dictionary Parameters
-----------------------------

See comments for PyDict_MINSIZE_SPLIT, PyDict_MINSIZE_COMBINED,
USABLE_FRACTION and GROWTH_RATE in dictobject.c

Tune-ups should be measured across a broad range of applications and
use cases. A change to any parameter will help in some situations and
hurt in others. The key is to find settings that help the most common
cases and do the least damage to the less common cases. Results will
vary dramatically depending on the exact number of keys, whether the
keys are all strings, whether reads or writes dominate, and the exact
hash values of the keys (some sets of values have fewer collisions than
others). Any one test or benchmark is likely to prove misleading.

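The combined effect of USABLE_FRACTION and GROWTH_RATE can be observed indirectly with sys.getsizeof, which reports the allocation jumping at each resize; a minimal sketch (the exact sizes and resize points vary by CPython version and platform, so only the shape of the curve is checked):

```python
import sys

d = {}
sizes = [sys.getsizeof(d)]
for i in range(64):
    d[i] = i
    sizes.append(sys.getsizeof(d))

# The reported size is flat between resizes and jumps when the
# usable fraction of slots is exhausted.
growths = [i for i in range(1, len(sizes)) if sizes[i] > sizes[i - 1]]
assert len(growths) >= 2        # several resizes over 64 inserts
assert sizes == sorted(sizes)   # never shrinks during insertion
```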
While making a dictionary more sparse reduces collisions, it impairs
iteration and key listing. Those methods loop over every potential
entry. Doubling the size of a dictionary results in twice as many
non-overlapping memory accesses for keys(), items(), values(),
__iter__(), iterkeys(), iteritems(), itervalues(), and update().
Also, every dictionary iterates at least twice, once for the memset()
when it is created and once by dealloc().

Dictionary operations involving only a single key can be O(1) unless
resizing is possible. By checking for a resize only when the
dictionary can grow (and may *require* resizing), other operations
remain O(1), and the odds of resize thrashing or memory fragmentation
are reduced. In particular, an algorithm that empties a dictionary
by repeatedly invoking .pop will see no resizing, which might
not be necessary at all because the dictionary is eventually
discarded entirely.

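The point about .pop can be demonstrated with sys.getsizeof: in CPython, deletions reuse dummy entries and a resize is checked only on growth, so emptying a dictionary one entry at a time leaves its allocation untouched; a minimal sketch of that behavior:

```python
import sys

d = {i: i for i in range(1000)}
before = sys.getsizeof(d)

# Each removal stays O(1): no reallocation is ever triggered
# because the table can only resize while growing.
while d:
    d.popitem()

after = sys.getsizeof(d)
assert len(d) == 0
assert after == before   # the table was never resized downward
```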
The key differences between this implementation and earlier versions are:
    1. The table can be split into two parts, the keys and the values.

    2. There is an additional key-value combination: (key, NULL).
       Unlike (<dummy>, NULL) which represents a deleted value, (key, NULL)
       represents a yet-to-be-inserted value. This combination can only occur
       when the table is split.

    3. No small table is embedded in the dict,
       as this would make sharing of key-tables impossible.


These changes have the following consequences.
    1. General dictionaries are slightly larger.

    2. All object dictionaries of a single class can share a single key-table,
       saving about 60% memory for such cases.

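The key-sharing consequence can be observed with sys.getsizeof: instances of one class initialized the same way share a single keys table, so each instance __dict__ carries only its values array; a minimal sketch (exact sizes vary by CPython version, so only relative sizes are compared):

```python
import sys

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

a = Point(1, 2)
b = Point(3, 4)

# Both split-table dicts reference the same shared keys object,
# so their reported sizes are identical.
assert sys.getsizeof(a.__dict__) == sys.getsizeof(b.__dict__)

# A plain (combined) dict with the same contents must carry
# its own keys table and hashes.
combined = dict(a.__dict__)
assert sys.getsizeof(a.__dict__) <= sys.getsizeof(combined)
```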
Results of Cache Locality Experiments
-------------------------------------

Experiments on an earlier design of dictionary, in which all tables were
combined, showed the following:

When an entry is retrieved from memory, several adjacent entries are also
retrieved into a cache line. Since accessing items in cache is *much*
cheaper than a cache miss, an enticing idea is to probe the adjacent
entries as a first step in collision resolution. Unfortunately, the
introduction of any regularity into collision searches results in more
collisions than the current random chaining approach.

Exploiting cache locality at the expense of additional collisions fails
to pay off when the entries are already loaded in cache (the expense
is paid with no compensating benefit). This occurs in small dictionaries
where the whole dictionary fits into a pair of cache lines. It also
occurs frequently in large dictionaries which have a common access pattern
where some keys are accessed much more frequently than others. The
more popular entries *and* their collision chains tend to remain in cache.

To exploit cache locality, change the collision resolution section
in lookdict() and lookdict_string(). Set i^=1 at the top of the
loop and move the i = (i << 2) + i + perturb + 1 to an unrolled
version of the loop.

For split tables, the above will apply to the keys, but the value will
always be in a different cache line from the key.