Programs and scripts for automated testing of Zstandard
=======================================================

This directory contains the following programs and scripts:
- `datagen` : synthetic, parametrizable data generator, for tests
- `fullbench` : precisely measures the speed of each zstd inner function
- `fuzzer` : test tool, checking zstd integrity on the target platform
- `paramgrill` : parameter tester for zstd
- `test-zstd-speed.py` : script for testing zstd speed difference between commits
- `test-zstd-versions.py` : compatibility test between zstd versions stored on GitHub (v0.1+)
- `zbufftest` : test tool checking the integrity of ZBUFF (a buffered streaming API)
- `zstreamtest` : fuzzer test tool for the zstd streaming API
- `legacy` : test tool for decoding of legacy zstd frames
- `decodecorpus` : tool generating valid Zstandard frames, for verifying decoder implementations


#### `test-zstd-versions.py` - script for testing zstd interoperability between versions

This script creates a `versionsTest` directory into which the zstd repository is cloned.
All tagged (released) versions of zstd are then compiled.
Finally, interoperability between the compiled zstd versions is checked.

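The interoperability check amounts to a full encoder/decoder matrix: data compressed by each version must decompress correctly with every other version. A minimal sketch of that idea (the tag names below are hypothetical, not read from the repository):

```python
# Build every (encoder, decoder) pair across versions -- each pair is
# one interoperability test case (illustrative sketch only).
versions = ["v0.4.2", "v1.0.0", "v1.3.8"]  # hypothetical tag list
pairs = [(enc, dec) for enc in versions for dec in versions]
print(len(pairs))  # 3 versions yield 9 encoder/decoder combinations
```
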
#### `test-zstd-speed.py` - script for testing zstd speed difference between commits

This script creates a `speedTest` directory into which the zstd repository is cloned.
It then compiles all branches of zstd and performs a speed benchmark for a given list of files (the `testFileNames` parameter).
Every `sleepTime` seconds (an optional parameter, default 300) the script checks the repository for new commits.
If a new commit is found, it is compiled and a speed benchmark for this commit is performed.
The results of the speed benchmark are compared to the previous results.
If the compression or decompression speed for one of the zstd levels drops below `lowerLimit` (an optional parameter, default 0.98), the speed benchmark is restarted.
If the second results are also below `lowerLimit`, a warning e-mail is sent to the recipients from the list (the `emails` parameter).

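The restart-then-warn threshold logic can be sketched like this (function and variable names are illustrative, not the script's actual internals):

```python
# Sketch of the regression check described above (illustrative names).
def speeds_ok(current, previous, lower_limit=0.98):
    """True if every level's speed is at least lower_limit of the old one."""
    return all(cur >= lower_limit * prev
               for cur, prev in zip(current, previous))

previous = [100.0, 250.0]   # MB/s per compression level, last commit
current = [99.5, 240.0]     # 240.0 < 0.98 * 250.0 -> regression
print(speeds_ok(current, previous))  # False: benchmark would be restarted
```
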
Additional remarks:
- To make sure that speed results are accurate, the script should be run on a "stable" target system with no other jobs running in parallel
- Using the script with virtual machines can lead to large variations in speed results
- The speed benchmark is not performed until the computer's load average is lower than `maxLoadAvg` (an optional parameter, default 0.75)
- The script sends e-mails using `mutt`; if `mutt` is not available it sends e-mails without attachments using `mail`; if neither is available it only prints a warning


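The mail-tool fallback in the last remark can be sketched as follows (a simplified illustration, not the script's actual code):

```python
import shutil

# Pick the first available mail tool, as described above (sketch only).
def pick_mailer():
    if shutil.which("mutt"):
        return "mutt"   # full e-mails with attachments
    if shutil.which("mail"):
        return "mail"   # e-mails without attachments
    return None         # neither available: only print a warning

print(pick_mailer())
```
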
Example usage with two test files, one e-mail address, and an additional message:
```
./test-zstd-speed.py "silesia.tar calgary.tar" "email@gmail.com" --message "tested on my laptop" --sleepTime 60
```

To run the script in the background, use:
```
nohup ./test-zstd-speed.py testFileNames emails &
```

The full list of parameters:
```
positional arguments:
  testFileNames         file names list for speed benchmark
  emails                list of e-mail addresses to send warnings

optional arguments:
  -h, --help            show this help message and exit
  --message MESSAGE     attach an additional message to e-mail
  --lowerLimit LOWERLIMIT
                        send email if speed is lower than given limit
  --maxLoadAvg MAXLOADAVG
                        maximum load average to start testing
  --lastCLevel LASTCLEVEL
                        last compression level for testing
  --sleepTime SLEEPTIME
                        frequency of repository checking in seconds
```

#### `decodecorpus` - tool to generate Zstandard frames for decoder testing
Command line tool to generate test .zst files.

This tool generates .zst files with checksums,
and can optionally output the corresponding correct uncompressed data for
extra verification.

Example:
```
./decodecorpus -ptestfiles -otestfiles -n10000 -s5
```
will generate 10,000 sample .zst files using a seed of 5 in the `testfiles` directory,
with the zstd checksum field set,
as well as the 10,000 original files for more detailed comparison of decompression results.

```
./decodecorpus -t -T1mn
```
will choose a random seed and, for 1 minute,
generate random test frames and ensure that the
zstd library correctly decompresses them in both simple and streaming modes.
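
The round-trip property being verified can be illustrated with a generic compressor (zlib here, purely as a stand-in, since decodecorpus exercises the zstd library and frame format):

```python
import zlib

# Round-trip check: the decoder must reproduce the original data exactly
# (decodecorpus performs the analogous check with zstd frames).
original = b"sample payload " * 1000
frame = zlib.compress(original)
assert zlib.decompress(frame) == original
print(len(frame) < len(original))  # repetitive input compresses well
```
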

#### `paramgrill` - tool for generating compression table parameters and optimizing parameters for a given file under constraints

Full list of arguments:
```
 -T#          : set level 1 speed objective
 -B#          : cut input into blocks of size # (default : single block)
 -S           : benchmarks a single run (example command: -Sl3w10h12)
    w# - windowLog
    h# - hashLog
    c# - chainLog
    s# - searchLog
    l# - minMatch
    t# - targetLength
    S# - strategy
    L# - level
 --zstd=      : single run, parameter selection syntax same as zstdcli with more parameters
                (Added forceAttachDictionary / fadt)
                When invoked with --optimize, this represents the sample to exceed.
 --optimize=  : find parameters to maximize compression ratio given parameters
                Can use all --zstd= commands to constrain the type of solution found, in addition to the following constraints:
    cSpeed=   : minimum compression speed
    dSpeed=   : minimum decompression speed
    cMem=     : maximum compression memory
    lvl=      : searches for solutions which are strictly better than that compression lvl in ratio and cSpeed
    stc=      : when invoked with lvl=, represents the percentage slack in ratio/cSpeed allowed for a solution to be considered (default 100%)
              : in normal operation, represents the percentage slack allowed when choosing the starting strategy for the default parameters
                (a lower value will begin with stronger strategies) (default 90%)
    speedRatio= (accepts decimals)
              : determines the value of gains in speed vs gains in ratio
                when determining the overall winner (default 5 (1% ratio = 5% speed))
    tries=    : maximum number of random restarts on a single strategy before switching (default 5)
                Higher values will make the optimizer run longer, with more chances to find a better solution.
    memLog    : limits the log of the size of each memotable (1 per strategy); hash tables are used when the state space is larger than the max size.
                Setting memLog = 0 turns off memoization.
 --display=   : specify which parameters are included in the output
                Can use all --zstd= parameter names, and 'cParams' as a shorthand for all parameters used in ZSTD_compressionParameters
                (default: display all params available)
 -P#          : generated sample compressibility (when no file is provided)
 -t#          : caps runtime of operation in seconds (default: 99999 seconds, about 27 hours)
 -v           : prints benchmarking output
 -D           : next argument is a dictionary file
 -s           : benchmark all files separately
 -q           : quiet, repeat for more quiet
    -q    prints parameters + results whenever a new best is found
    -qq   only prints parameters whenever a new best is found; prints final parameters + results
    -qqq  only prints final parameters + results
    -qqqq only prints the final parameter set in the form --zstd=
 -v           : verbose, cancels quiet, repeat for more volume
    -v    prints all candidate parameters and results

```
Any inputs afterwards are treated as files to benchmark.