ceph/src/isa-l/Release_notes.txt

   1 v2.30 Intel Intelligent Storage Acceleration Library Release Notes
   2 ==================================================================
   3
   4 RELEASE NOTE CONTENTS
   5 1. KNOWN ISSUES
   6 2. FIXED ISSUES
   7 3. CHANGE LOG & FEATURES ADDED
   8
   9 1. KNOWN ISSUES
  10 ----------------
  11
  12 * Perf tests do not run in Windows environment.
  13
  14 * 32-bit lib is not supported in Windows.
  15
  16 2. FIXED ISSUES
  17 ---------------
  18 v2.30
  19
  20 * Intel CET support.
  21 * Windows nasm support fix.
  22
  23 v2.28
  24
  25 * Fix documentation on gf_vect_mad(). Minimum length was listed as 32 bytes
  26   instead of the required minimum of 64 bytes.
  27
  28 v2.27
  29
  30 * Fix lack of install for pkg-config files
  31
  32 v2.26
  33
  34 * Fixes for sanitizer warnings.
  35
  36 v2.25
  37
  38 * Fix for nasm on Mac OS X/darwin.
  39
  40 v2.24
  41
  42 * Fix for crc32_iscsi().  Potential read-over for small buffer.  For an input
  43   buffer length of less than 8 bytes and aligned to an 8 byte boundary, function
  44   could read past length.  Previously this could cause a seg fault only when
  45   length was 0 and an invalid buffer was passed.  Calculated CRC is unchanged.
  46
  47 * Fix for compression/decompression of > 4GB files.  For streaming compression
  48   of extremely large files, the total_out parameter would wrap and could
  49   potentially flag an otherwise valid lookback distance as being invalid.
  50   Total_out is still 32bit for zlib compatibility.  No inconsistent compressed
  51   buffers were generated by the issue.
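
The wrap described above can be illustrated with a minimal sketch. The helper
names below are hypothetical and are not taken from ISA-L; the check is an
assumed simplification of how a lookback distance might be validated against
the number of bytes produced so far.

```c
#include <stdint.h>

/* Hypothetical validity check, not ISA-L's actual code: a match may only
 * reach back over bytes that have already been produced. */
static int lookback_valid_32(uint32_t total_out, uint32_t distance)
{
    return distance <= total_out;
}

/* Same check against a 64-bit running total, which does not wrap at 4 GiB. */
static int lookback_valid_64(uint64_t total_out, uint32_t distance)
{
    return distance <= total_out;
}

/* Truncate a 64-bit byte count the way a 32-bit total_out field would. */
static uint32_t wrap_total_out(uint64_t bytes_written)
{
    return (uint32_t)bytes_written;
}
```

With 4 GiB + 100 bytes written, the truncated counter reads 100, so a
distance of 1000 fails the 32-bit check even though it is valid for the
stream; the 64-bit check accepts it.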
  52
  53 v2.23
  54
  55 * Fix for histogram generation base function.
  56 * Fix library build warnings on macOS.
  57 * Fix igzip to use bsf instruction when tzcnt is not available.
  58
  59 v2.22
  60
  61 * Fix ISA-L builds for other architectures.  Base function and examples
  62   sanitized for non-IA builds.
  63
  64 * Fix fuzz test script to work with LLVM 6.0 built-in libFuzzer.
  65
  66 v2.20
  67
  68 * Inflate total_out behavior corrected for in-progress decompression.
  69   Previously total_out represented the total bytes decompressed into the output
  70   buffer or temp internal buffer.  This is changed to be only the bytes put into
  71   the output buffer.
  72
  73 * Fixed issue with isal_create_hufftables_subset.  Affects semi-dynamic
  74   compression use case when explicitly creating hufftables from histogram.  The
  75   _hufftables_subset function could fail to generate length symbols for any
  76   lengths that were never seen.
  77
  78 v2.19
  79
  80 * Fix erasure code test that violates rs matrix bounds.
  81
  82 * Fix 0 length file and looping errors in igzip_inflate_test.
  83
  84 v2.18
  85
  86 * Mac OS X/darwin systems no longer require the --target=darwin config option.
  87   The autoconf canonical build should detect the system automatically.
  88
  89 v2.17
  90
  91 * Fix igzip using 32K window and a shared object
  92
  93 * Fix igzip undefined instruction error on Nehalem.
  94
  95 * Fixed issue in crc performance tests where OS optimizations turned cold cache
  96   tests into warm tests.
  97
  98 v2.15
  99
 100 * Fix for Windows register save in gf_6vect_mad_avx2.asm.  Only affects Windows
 101   versions of ec_encode_data_update() running with AVX2.  A GP register was not
 102   properly restored, resulting in corruption on return.
 103
 104 v2.14
 105
 106 * Building in unit directories is no longer supported, removing the issue of
 107   leftover object files causing the top-level make build to fail.
 108
 109 v2.10
 110
 111 * Fix for Windows register save overlap in gf_{3-6}vect_dot_prod_sse.asm. Only
 112   affects Windows versions of erasure code.  GP register saves/restores were
 113   pushed to the same stack area as XMM.
 114
 115 3. CHANGE LOG & FEATURES ADDED
 116 ------------------------------
 117 v2.30
 118
 119 * Igzip compression enhancements.
 120   - New functions for dictionary acceleration. Split dictionary processing and
 121     resetting can greatly accelerate the performance of compressing many small
 122     files with a dictionary.
 123   - New static level 0 header decode tables. Accelerates decompressing small
 124     files that are level 0 compressed by skipping the known header parsing.
 125   - New feature for igzip cli tool: support for concatenated .gz files. On
 126     decompression, igzip will process a series of independent, concatenated .gz
 127     files into one output stream.
 128
 129 * CRC Improvements
 130   - New vclmul version of crc32_iscsi().
 131   - Updates for aarch64.
 132
 133 v2.29
 134
 135 * CRC Improvements
 136   - New AVX512 vclmul versions of crc16_t10dif(), crc32_ieee(), crc32_gzip_refl().
 137
 138 * Erasure code improvements
 139   - Added AVX512 ec functions with 5 and 6 outputs. Can improve performance for
 140     codes with 5 or more parity by running in batches of up to 6 at a time.
 141
 142 v2.28
 143
 144 * New next-arch versions of 64-bit CRC. All normal and reflected 64-bit
 145   polynomials are expanded to utilize vpclmulqdq.
 146
 147 v2.27
 148
 149 * New multi-threaded compression option for igzip cli tool
 150
 151 v2.26
 152
 153 * Adler32 added to external API.
 154 * Multi-arch improvements.
 155 * Performance test improvements.
 156
 157 v2.25
 158
 159 * Igzip performance improvements and features.
 160   - Performance improvements for incompressible files. Random or incompressible
 161     files can be up to 3x faster in level 1 or 2 compression.
 162   - Additional small file performance improvements.
 163   - New options in igzip cli: use name from header or not, test compressed file.
 164
 165 * Multi-arch autoconf script.
 166   - Autoconf should detect architecture and run base functions at minimum.
 167
 168 v2.24
 169
 170 * Igzip small file performance improvements and new features.
 171   - Better performance on small files.
 172   - New gzip/zlib header and trailer handling.
 173   - New gzip/zlib header parsing helper functions.
 174   - New user-space compression/decompression tool igzip.
 175
 176 * New mem unit added with first function isal_zero_detect().
 177
 178 v2.23
 179
 180 * Igzip inflate (decompression) performance improvements.
 181   - Implemented multi-byte decode for inflate.  Decode can pack up to three
 182     symbols into the decode table making some compressed streams decompress much
 183     faster depending on the prevalence of short codes.
 184
 185 v2.22
 186
 187 * Igzip: AVX2 version of level 3 compression added.
 188
 189 * Erasure code examples
 190   - New examples for standard EC encode and decode.
 191   - Example of piggyback EC encode and decode.
 192
 193 v2.21
 194
 195 * Igzip improvements
 196   - New compression levels added.  ISA-L fast deflate now has more levels to
 197     balance speed vs. target compression level.  Levels 0 and 1 are as in previous
 198     generations.  New levels 2 & 3 target higher compression roughly comparable
 199     to zlib levels 2-3.  Level 3 is currently only optimized for processors with
 200     AVX512 instructions.
 201
 202 * New T10dif & copy function - crc16_t10dif_copy()
 203   - CRC and copy was added to emulate T10dif operations such as DIF insert and
 204     strip.  This function stitches together CRC and memcpy operations
 205     eliminating an extra data read.
 206
 207 * CRC32 iscsi performance improvements
 208   - Fixes issue under some distributions where warm cache performance was
 209     reduced.
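
The CRC-and-copy stitching described for crc16_t10dif_copy() above can be
sketched as follows. This is a plain bitwise illustration, not the optimized
library routine: the function name crc16_t10dif_copy_sketch is hypothetical,
and the polynomial 0x8BB7 is the standard T10-DIF value. Each source byte is
read once and used for both the copy and the CRC.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC16 T10-DIF (polynomial 0x8BB7, MSB first) fused with a copy.
 * crc16_t10dif_copy_sketch is a hypothetical name for illustration only. */
static uint16_t crc16_t10dif_copy_sketch(uint16_t crc, uint8_t *dst,
                                         const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t byte = src[i];          /* single read of the source byte */
        dst[i] = byte;                  /* copy it to the destination ... */
        crc ^= (uint16_t)(byte << 8);   /* ... and fold it into the CRC */
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8BB7)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}
```

A separate CRC pass followed by memcpy() would read the source twice; the
fused loop avoids the second read, which is the saving the note refers to.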
 210
 211 v2.20
 212
 213 * Igzip improvements
 214   - Optimized deflate_hash in compression functions.
 215     Improves performance of using preset dictionary.
 216   - Removed alignment restrictions on input structure.
 217
 218 v2.19
 219
 220 * Igzip improvements
 221
 222   - Add optimized Adler-32 checksum.
 223
 224   - Implement zlib compression format.
 225
 226   - Add stateful dictionary support.
 227
 228   - Add struct reset functions for both deflate and inflate.
 229
 230 * Reflected IEEE format CRC32 is now released. The function is named
 231   crc32_gzip_refl().
 232
 233 * The exact working limits of the erasure code Reed-Solomon matrix can be
 234   determined with the newly added program gen_rs_matrix_limits.
 235
 236 v2.18
 237
 238 * New 2-pass fully-dynamic deflate compression (level 1).  ISA-L fast deflate
 239   now has two levels.  Level 0 (default) is the same as previous generations.
 240   Setting to level 1 will switch to the fully-dynamic compression that will
 241   typically reach higher compression ratios.
 242
 243 * RAID AVX512 functions.
 244
 245 v2.17
 246
 247 * New fast decompression (inflate)
 248
 249 * Compression improvements (deflate)
 250   - Speed and compression ratio improvements.
 251   - Fast custom Huffman code generation.
 252   - New features:
 253     * Run-time option of gzip crc calculation and headers/trailer.
 254     * Choice of static header (BTYPE 01) blocks.
 255     * LARGE_WINDOW, 32K history, now default.
 256     * Stateless full flush mode.
 257
 258 * CRC64
 259   - Six new 64-bit polynomials supported. Normal and reflected versions of ECMA,
 260     ISO and Jones polynomials.
 261
 262 v2.16
 263
 264 * Units added: crc, raid, igzip (deflate compression).
 265
 266 v2.15
 267
 268 * Erasure code updates. New AVX512 versions.
 269
 270 * Nasm support.  ISA-L ported to build with nasm or yasm assembler.
 271
 272 * Windows DLL support.  Windows builds DLL by default.
 273
 274 v2.14
 275
 276 * Autoconf and autotools build allows easier porting to additional systems.
 277   Previous make system still available to embedded users with Makefile.unx.
 278
 279 * Includes update for building on Mac OS X/darwin systems. Add --target=darwin
 280   to ./configure step.
 281
 282 v2.13
 283
 284 * Erasure code improvements
 285   - 32-bit port of optimized gf_vect_dot_prod() functions.  This makes
 286     ec_encode_data() functions much faster on 32-bit processors.
 287   - Avoton performance improvements.  Performance on Avoton for
 288     gf_vect_dot_prod() and ec_encode_data() can improve by as much as 20%.
 289
 290 v2.11
 291
 292 * Incremental erasure code.  New functions added to erasure code to handle
 293   single source update of code blocks.  The function ec_encode_data_update()
 294   works with parameters similar to ec_encode_data() but is called incrementally
 295   with each source block.  These versions are useful when source blocks are not
 296   all available at once.
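
The idea behind the incremental update can be sketched in a few lines. This is
an assumed simplification, not the library implementation: gf256_mul and
parity_update are hypothetical helpers, and the reducing polynomial 0x11D is
an assumption made for the example. Each call folds one source block into the
parity, so sources can arrive one at a time.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply in GF(2^8); reducing polynomial 0x11D is assumed here. */
static uint8_t gf256_mul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    while (b) {
        if (b & 1)
            p ^= a;                     /* add a for this bit of b */
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1D : 0)); /* a *= x */
        b >>= 1;
    }
    return p;
}

/* Fold one source block into a parity block: parity ^= coeff * src.
 * Hypothetical helper; calling it once per source block, in any order,
 * gives the same parity as a single pass over all sources. */
static void parity_update(uint8_t coeff, const uint8_t *src,
                          uint8_t *parity, size_t len)
{
    for (size_t i = 0; i < len; i++)
        parity[i] ^= gf256_mul(coeff, src[i]);
}
```

The parity buffer must start zeroed. Because addition in GF(2^8) is XOR, the
accumulated result does not depend on the order in which source blocks are
folded in.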
 297
 298 v2.10
 299
 300 * Erasure code updates
 301   - New AVX and AVX2 support functions.
 302   - Changes min len requirement on gf_vect_dot_prod() to 32 from 16.
 303   - Tests include both source and parity recovery with ec_encode_data().
 304   - New encoding examples with Vandermonde or Cauchy matrix.
 305
 306 v2.8
 307
 308 * First open release of erasure code unit that is part of ISA-L.