zstd: a fast, high-performance compression tool


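zstd (Zstandard) is a fast lossless compression algorithm. It targets real-time compression scenarios at zlib-level speed while achieving better compression ratios. It is backed by a very fast entropy stage, provided by the Huff0 and FSE libraries.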

Installation

  • Ubuntu
$ sudo apt install zstd
  • CentOS
$ sudo yum install zstd
  • Build from source
$ git clone https://github.com/facebook/zstd.git
$ cd zstd && make && sudo make install

zstd

help

$ zstd --help
*** zstd command line interface 64-bits v1.4.4, by Yann Collet ***
Usage :
      zstd [args] [FILE(s)] [-o file]

FILE    : a filename
          with no FILE, or when FILE is - , read standard input
Arguments :
 -#     : # compression level (1-19, default: 3)
 -d     : decompression
 -D file: use `file` as Dictionary
 -o file: result stored into `file` (only if 1 input file)
 -f     : overwrite output without prompting and (de)compress links
--rm    : remove source file(s) after successful de/compression
 -k     : preserve source file(s) (default)
 -h/-H  : display help/long help and exit

Advanced arguments :
 -V     : display Version number and exit
 -v     : verbose mode; specify multiple times to increase verbosity
 -q     : suppress warnings; specify twice to suppress errors too
 -c     : force write to standard output, even if it is the console
 -l     : print information about zstd compressed files
--exclude-compressed:  only compress files that are not previously compressed
--ultra : enable levels beyond 19, up to 22 (requires more memory)
--long[=#]: enable long distance matching with given window log (default: 27)
--fast[=#]: switch to very fast compression levels (default: 1)
--adapt : dynamically adapt compression level to I/O conditions
--stream-size=# : optimize compression parameters for streaming input of given number of bytes
--size-hint=# optimize compression parameters for streaming input of approximately this size
--target-compressed-block-size=# : make compressed block near targeted size
 -T#    : spawns # compression threads (default: 1, 0==# cores)
 -B#    : select size of each job (default: 0==automatic)
--rsyncable : compress using a rsync-friendly method (-B sets block size)
--no-dictID : don't write dictID into header (dictionary compression)
--[no-]check : integrity check (default: enabled)
--[no-]compress-literals : force (un)compressed literals
 -r     : operate recursively on directories
--output-dir-flat[=directory]: all resulting files stored into `directory`.
--format=zstd : compress files to the .zst format (default)
--format=gzip : compress files to the .gz format
--format=xz : compress files to the .xz format
--format=lzma : compress files to the .lzma format
--format=lz4 : compress files to the .lz4 format
--test  : test compressed file integrity
--[no-]sparse : sparse mode (default: enabled on file, disabled on stdout)
 -M#    : Set a memory usage limit for decompression
--no-progress : do not display the progress bar
--      : All arguments after "--" are treated as files

Dictionary builder :
--train ## : create a dictionary from a training set of files
--train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]] : use the cover algorithm with optional args
--train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#,shrink[=#]] : use the fast cover algorithm with optional args
--train-legacy[=s=#] : use the legacy algorithm with selectivity (default: 9)
 -o file : `file` is dictionary name (default: dictionary)
--maxdict=# : limit dictionary to specified size (default: 112640)
--dictID=# : force dictionary ID to specified value (default: random)

Benchmark arguments :
 -b#    : benchmark file(s), using # compression level (default: 3)
 -e#    : test all compression levels from -bX to # (default: 1)
 -i#    : minimum evaluation time in seconds (default: 3s)
 -B#    : cut file into independent blocks of size # (default: no block)
--priority=rt : set process priority to real-time

Basic usage

# Compress (the source file is kept)
$ zstd README.md
README.md            : 40.85%   ( 14163 =>   5785 bytes, README.md.zst)
$ ls
README.md  README.md.zst

# Compress, then delete the source file
$ zstd --rm README.md
README.md            : 40.85%   ( 14163 =>   5785 bytes, README.md.zst)
$ ls
README.md.zst

# Decompress
$ zstd -d README.md.zst
README.md.zst       : 14163 bytes
$ ls
README.md  README.md.zst

# Decompress with the unzstd shortcut
$ unzstd README.md.zst

# Decompress to standard output
$ zstd -dc README.md.zst

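# Inspect a compressed file
$ zstd -l README.md.zst
$ zstdcat README.md.zst  # same effect as decompressing to standard output

# Show detailed information
$ zstd -lv README.md.zst
$ zstd -v -d README.md.zst  # -d decompresses, -v makes it verbose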

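# Compress with an explicit level (1 to 19, default 3); replace <level> with a number
$ zstd -<level> README.md
$ zstd -6 README.md

# Levels above 19 (up to 22) require --ultra; they use more memory during both
# compression and decompression in exchange for a higher compression ratio
$ zstd --ultra -<level> file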

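# Multi-threaded compression (0 uses all CPU cores)
$ zstd -T0 file
$ zstd -T4 file

# Decompression runs single-threaded, so -T4 brings no speedup here
$ zstd -T4 -d README.md.zst

# Switch to the very fast compression levels (--fast defaults to 1)
$ zstd --fast=3 xxx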
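
The commands above can be combined into a small round-trip check. The following is a minimal sketch, not part of the original post: the `ratio` helper, the `sample.txt` input, and the `roundtrip_status` variable are hypothetical names introduced for illustration. It recomputes the percentage that zstd reports (compressed size divided by original size) from the README.md figures shown earlier, then compresses a generated file, tests its integrity, and compares the decompressed stream with the source. The zstd part is skipped when the tool is not installed.

```shell
# ratio ORIGINAL_BYTES COMPRESSED_BYTES -> percentage with two decimals
# (compressed size / original size, the figure zstd prints after compressing).
ratio() {
  awk -v o="$1" -v c="$2" 'BEGIN { printf "%.2f", c * 100 / o }'
}

# Figures taken from the README.md run above: 14163 => 5785 bytes.
echo "README.md ratio: $(ratio 14163 5785)%"

roundtrip_status="skipped"   # stays "skipped" when zstd is not installed
if command -v zstd >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  # Hypothetical sample input: 2000 similar lines.
  seq 1 2000 | sed 's/^/sample line /' > "$workdir/sample.txt"

  # Compress at level 6 (source is kept), then test the result without writing output.
  zstd -q -6 "$workdir/sample.txt" && zstd -q --test "$workdir/sample.txt.zst"

  orig=$(wc -c < "$workdir/sample.txt")
  comp=$(wc -c < "$workdir/sample.txt.zst")
  echo "sample.txt ratio: $(ratio "$orig" "$comp")%"

  # Decompress to stdout and compare with the original byte for byte.
  if zstd -dc "$workdir/sample.txt.zst" | cmp -s - "$workdir/sample.txt"; then
    roundtrip_status="ok"
  else
    roundtrip_status="mismatch"
  fi
  rm -rf "$workdir"
fi
echo "round trip: $roundtrip_status"
```

Comparing through `zstd -dc ... | cmp` avoids writing a second copy of the file, which is convenient when checking large archives.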

tar

# Create a compressed archive with tar
tar -I zstd -cvf test.tar.zst test

# Archive the contents of test/ without the leading test/ directory
tar -I zstd -cvf test.tar.zst -C test/ .

# Extract
tar -I zstd -xvf test.tar.zst
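
The `-I` option used above is a GNU tar feature. Where it is unavailable, the same result can usually be obtained by piping tar through zstd. The sketch below assumes a hypothetical `test/a.txt` sample file and a `tar_status` variable introduced only for this illustration, and it is skipped when tar or zstd is missing.

```shell
tar_status="skipped"   # stays "skipped" when tar or zstd is missing
if command -v tar >/dev/null 2>&1 && command -v zstd >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  mkdir -p "$workdir/test" "$workdir/out"
  echo "hello zstd" > "$workdir/test/a.txt"   # hypothetical sample file

  # Create: tar writes the archive to stdout, zstd compresses the stream.
  tar -C "$workdir" -cf - test | zstd -q -o "$workdir/test.tar.zst"

  # List the archive members without extracting them.
  zstd -dc "$workdir/test.tar.zst" | tar -tf -

  # Extract into a separate directory.
  zstd -dc "$workdir/test.tar.zst" | tar -C "$workdir/out" -xf -

  if cmp -s "$workdir/test/a.txt" "$workdir/out/test/a.txt"; then
    tar_status="ok"
  else
    tar_status="mismatch"
  fi
  rm -rf "$workdir"
fi
echo "tar round trip: $tar_status"
```

The pipe form also makes it easy to pass extra zstd options, such as a compression level or `-T0`, without quoting them inside `-I`.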

Training a custom dictionary

# Create the dictionary
zstd --train FullPathToTrainingSet/* -o dictionaryName

# Compress with dictionary
zstd -D dictionaryName FILE

# Decompress with dictionary
zstd -D dictionaryName --decompress FILE.zst
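
Dictionaries mainly help with many small, similar files. The following sketch walks through the three steps above on generated data. The sample records, the file names, and the `dict_status` variable are hypothetical, and `--maxdict=4096` is chosen here only because the training set is tiny. Training can fail on unsuitable sample sets, so the script records that case instead of assuming success.

```shell
dict_status="skipped"   # stays "skipped" when zstd is not installed
if command -v zstd >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  mkdir -p "$workdir/samples"

  # Generate 500 small, similar JSON records as a hypothetical training set.
  i=1
  while [ "$i" -le 500 ]; do
    printf '{"id": %d, "user": "user%d", "status": "active", "region": "eu-west-%d"}\n' \
      "$i" "$((i * 7))" "$((i % 3))" > "$workdir/samples/rec$i.json"
    i=$((i + 1))
  done

  # Step 1: train a small dictionary from the samples.
  if zstd -q --train "$workdir"/samples/* --maxdict=4096 -o "$workdir/dict" 2>/dev/null; then
    f="$workdir/samples/rec42.json"

    # Step 2: compress one record without and with the dictionary.
    zstd -q -c "$f" > "$workdir/plain.zst"
    zstd -q -c -D "$workdir/dict" "$f" > "$workdir/withdict.zst"
    echo "without dictionary: $(wc -c < "$workdir/plain.zst") bytes"
    echo "with dictionary:    $(wc -c < "$workdir/withdict.zst") bytes"

    # Step 3: decompress with the same dictionary and compare with the source.
    if zstd -q -dc -D "$workdir/dict" "$workdir/withdict.zst" | cmp -s - "$f"; then
      dict_status="ok"
    else
      dict_status="mismatch"
    fi
  else
    dict_status="train_failed"
  fi
  rm -rf "$workdir"
fi
echo "dictionary round trip: $dict_status"
```

The same dictionary file must be supplied for decompression; without it, zstd cannot decode data that was compressed with `-D`.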

Python implementation

pzstd

  • pzstd: parallel zstd (Zstandard) compression

help

$ pzstd --help
Usage:
  pzstd [args] [FILE(s)]
Parallel ZSTD options:
  -p, --processes   #    : number of threads to use for (de)compression (default:<numcpus>)
ZSTD options:
  -#                     : # compression level (1-19, default:<numcpus>)
  -d, --decompress       : decompression
  -o                file : result stored into `file` (only if 1 input file)
  -f, --force            : overwrite output without prompting, (de)compress links
      --rm               : remove source file(s) after successful (de)compression
  -k, --keep             : preserve source file(s) (default)
  -h, --help             : display help and exit
  -V, --version          : display version number and exit
  -v, --verbose          : verbose mode; specify multiple times to increase log level (default:2)
  -q, --quiet            : suppress warnings; specify twice to suppress errors too
  -c, --stdout           : force write to standard output, even if it is the console
  -r                     : operate recursively on directories
      --ultra            : enable levels beyond 19, up to 22 (requires more memory)
  -C, --check            : integrity check (default)
      --no-check         : no integrity check
  -t, --test             : test compressed file integrity
  --                     : all arguments after "--" are treated as files

Usage

pzstd -11 -p 1 X -o xxx
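
As a rough illustration of the command above, the sketch below compresses a generated file at level 11 with two threads and checks that plain `zstd` can decompress the result. The input file `X` is generated here and the `par_status` variable is hypothetical. Falling back to `zstd -T2` when pzstd is not installed is an assumption of this sketch, not something the original post prescribes.

```shell
par_status="skipped"   # stays "skipped" when zstd is not installed
if command -v zstd >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  seq 1 50000 > "$workdir/X"   # hypothetical input file

  if command -v pzstd >/dev/null 2>&1; then
    # pzstd: level 11, 2 threads, explicit output file.
    pzstd -q -11 -p 2 "$workdir/X" -o "$workdir/X.zst"
  else
    # Fallback (assumption): zstd's own multi-threading via -T.
    zstd -q -11 -T2 "$workdir/X" -o "$workdir/X.zst"
  fi

  # Either way, the output should be readable by plain zstd.
  if zstd -dc "$workdir/X.zst" | cmp -s - "$workdir/X"; then
    par_status="ok"
  else
    par_status="mismatch"
  fi
  rm -rf "$workdir"
fi
echo "parallel round trip: $par_status"
```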

References

  1. https://github.com/facebook/zstd
  2. http://www.zstd.net/
  3. https://manpages.org/zstd