zstd(Zstandard)
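Zstandard is a fast lossless compression algorithm that targets real-time compression scenarios at zlib-level and better compression ratios. It is backed by a very fast entropy stage, provided by the Huff0 and FSE libraries.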
Installation
# Debian/Ubuntu
$ apt install zstd
# CentOS/RHEL
$ yum install zstd
# Build from source
$ git clone https://github.com/facebook/zstd.git
$ cd zstd; make; sudo make install
zstd
help
$ zstd --help
*** zstd command line interface 64-bits v1.4.4, by Yann Collet ***
Usage :
zstd [args] [FILE(s)] [-o file]
FILE : a filename
with no FILE, or when FILE is - , read standard input
Arguments :
-# : # compression level (1-19, default: 3)
-d : decompression
-D file: use `file` as Dictionary
-o file: result stored into `file` (only if 1 input file)
-f : overwrite output without prompting and (de)compress links
--rm : remove source file(s) after successful de/compression
-k : preserve source file(s) (default)
-h/-H : display help/long help and exit
Advanced arguments :
-V : display Version number and exit
-v : verbose mode; specify multiple times to increase verbosity
-q : suppress warnings; specify twice to suppress errors too
-c : force write to standard output, even if it is the console
-l : print information about zstd compressed files
--exclude-compressed: only compress files that are not previously compressed
--ultra : enable levels beyond 19, up to 22 (requires more memory)
--long[=#]: enable long distance matching with given window log (default: 27)
--fast[=#]: switch to very fast compression levels (default: 1)
--adapt : dynamically adapt compression level to I/O conditions
--stream-size=# : optimize compression parameters for streaming input of given number of bytes
--size-hint=# optimize compression parameters for streaming input of approximately this size
--target-compressed-block-size=# : make compressed block near targeted size
-T# : spawns # compression threads (default: 1, 0==# cores)
-B# : select size of each job (default: 0==automatic)
--rsyncable : compress using a rsync-friendly method (-B sets block size)
--no-dictID : don't write dictID into header (dictionary compression)
--[no-]check : integrity check (default: enabled)
--[no-]compress-literals : force (un)compressed literals
-r : operate recursively on directories
--output-dir-flat[=directory]: all resulting files stored into `directory`.
--format=zstd : compress files to the .zst format (default)
--format=gzip : compress files to the .gz format
--format=xz : compress files to the .xz format
--format=lzma : compress files to the .lzma format
--format=lz4 : compress files to the .lz4 format
--test : test compressed file integrity
--[no-]sparse : sparse mode (default: enabled on file, disabled on stdout)
-M# : Set a memory usage limit for decompression
--no-progress : do not display the progress bar
-- : All arguments after "--" are treated as files
Dictionary builder :
--train ## : create a dictionary from a training set of files
--train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]] : use the cover algorithm with optional args
--train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#,shrink[=#]] : use the fast cover algorithm with optional args
--train-legacy[=s=#] : use the legacy algorithm with selectivity (default: 9)
-o file : `file` is dictionary name (default: dictionary)
--maxdict=# : limit dictionary to specified size (default: 112640)
--dictID=# : force dictionary ID to specified value (default: random)
Benchmark arguments :
-b# : benchmark file(s), using # compression level (default: 3)
-e# : test all compression levels from -bX to # (default: 1)
-i# : minimum evaluation time in seconds (default: 3s)
-B# : cut file into independent blocks of size # (default: no block)
--priority=rt : set process priority to real-time
Basic usage
# Compress
$ zstd README.md
README.md : 40.85% ( 14163 => 5785 bytes, README.md.zst)
$ ls
README.md README.md.zst
# Compress, then remove the source file
$ zstd --rm README.md
README.md : 40.85% ( 14163 => 5785 bytes, README.md.zst)
$ ls
README.md.zst
# Decompress
$ zstd -d README.md.zst
README.md.zst : 14163 bytes
$ ls
README.md README.md.zst
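# Decompress (unzstd is equivalent to zstd -d)
$ unzstd README.md.zst
# Decompress to standard output
$ zstd -dc README.md.zst
# List information about a compressed file
$ zstd -l README.md.zst
$ zstdcat README.md.zst # same effect as decompressing to standard output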
# Show detailed information
$ zstd -lv README.md.zst
$ zstd -v -d README.md.zst # -d decompresses; -v prints verbose output
# Compress with an explicit compression level, written as -# (lowest 1, highest 19, default 3)
$ zstd -# README.md
$ zstd -6 README.md
# Use more memory (for both compression and decompression) to reach a higher compression ratio; --ultra unlocks levels 20 to 22
$ zstd --ultra -# file
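# Multi-threaded compression (0 means use all CPU cores)
$ zstd -T0 file
$ zstd -T4 file
# Decompression is single-threaded, so -T4 has no effect here
$ zstd -T4 -d README.md.zst
# Switch to the very fast compression levels (--fast defaults to 1)
$ zstd --fast=3 xxx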
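The commands above can be combined into a small round-trip check. The following is a minimal sketch, not part of zstd itself: the function name `roundtrip` and its arguments are hypothetical, and it only uses options that appear in the help text above (`-q`, `-f`, `-o`, `--test`, `-d`, `-c`).

```shell
#!/bin/sh
# roundtrip FILE [LEVEL]  (hypothetical helper, for illustration only)
# Compress FILE, test the archive, then compare the decompressed bytes
# with the original. Returns 0 only if every step succeeds.
roundtrip() {
    src=$1
    level=${2:-3}    # 3 is the default level listed in the help text
    # -q: quiet, -f: overwrite an existing output file, -o: output name
    zstd -q -f "-$level" "$src" -o "$src.zst" || return 1
    # --test checks integrity without writing decompressed output
    zstd -q --test "$src.zst" || return 1
    # -dc decompresses to standard output; cmp -s compares silently
    zstd -dc "$src.zst" | cmp -s - "$src"
}
```

For example, `roundtrip README.md 6 && echo "round trip OK"` compresses README.md at level 6 and prints the message only if the decompressed data matches the source.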
tar
# Create a compressed archive with tar
tar -I zstd -cvf test.tar.zst test
# Archive the contents of test/ without the leading test/ path component
tar -I zstd -cvf test.tar.zst -C test/ .
# Extract
tar -I zstd -xvf test.tar.zst
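The `-I` option used above comes from GNU tar. Where it is unavailable, the same result can be sketched with a pipe, since zstd reads standard input when no FILE is given. The helper names `pack_dir` and `unpack_archive` are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# pack_dir DIR ARCHIVE [LEVEL]  (hypothetical helper)
# Stream the contents of DIR through zstd into ARCHIVE.
pack_dir() {
    # tar writes the archive to stdout (-f -); zstd reads stdin and
    # stores the result in the file named by -o
    tar -cf - -C "$1" . | zstd -q -f "-${3:-3}" -o "$2"
}

# unpack_archive ARCHIVE DESTDIR  (hypothetical helper)
# Decompress ARCHIVE to stdout and let tar extract it into DESTDIR.
unpack_archive() {
    mkdir -p "$2" || return 1
    zstd -dc "$1" | tar -xf - -C "$2"
}
```

For example, `pack_dir test test.tar.zst 6` followed by `unpack_archive test.tar.zst restored` should leave a copy of the contents of test/ in restored/. The pipe form also makes it easy to pass a compression level to zstd.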
Training a custom dictionary
# Create the dictionary
zstd --train FullPathToTrainingSet/* -o dictionaryName
# Compress with dictionary
zstd -D dictionaryName FILE
# Decompress with dictionary
zstd -D dictionaryName --decompress FILE.zst
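The three steps above can be wrapped into helpers for repeated use. This is a rough sketch under a few assumptions: the function names are hypothetical, the training set is a directory of many small, similar files, and the same dictionary file is available at both compression and decompression time. The 112640 default for `--maxdict` is taken from the help text above.

```shell
#!/bin/sh
# train_dict SAMPLES_DIR DICT [MAXDICT]  (hypothetical helper)
# Build a dictionary from every file in SAMPLES_DIR.
train_dict() {
    zstd -q --train "$1"/* --maxdict="${3:-112640}" -o "$2"
}

# compress_with_dict DICT FILE  (hypothetical helper)
# Compress FILE to FILE.zst using DICT.
compress_with_dict() {
    zstd -q -f -D "$1" "$2" -o "$2.zst"
}

# verify_with_dict DICT FILE  (hypothetical helper)
# Decompress FILE.zst with DICT and compare it with FILE.
verify_with_dict() {
    zstd -q -D "$1" -dc "$2.zst" | cmp -s - "$2"
}
```

For example, `train_dict samples my.dict && compress_with_dict my.dict record.json && verify_with_dict my.dict record.json`. Dictionaries tend to pay off on small inputs that share structure; decompressing without the matching dictionary fails.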
Python implementation
pzstd
- pzstd: parallel zstd (Zstandard) compression
help
$ pzstd --help
Usage:
pzstd [args] [FILE(s)]
Parallel ZSTD options:
-p, --processes # : number of threads to use for (de)compression (default:<numcpus>)
ZSTD options:
-# : # compression level (1-19, default:3)
-d, --decompress : decompression
-o file : result stored into `file` (only if 1 input file)
-f, --force : overwrite output without prompting, (de)compress links
--rm : remove source file(s) after successful (de)compression
-k, --keep : preserve source file(s) (default)
-h, --help : display help and exit
-V, --version : display version number and exit
-v, --verbose : verbose mode; specify multiple times to increase log level (default:2)
-q, --quiet : suppress warnings; specify twice to suppress errors too
-c, --stdout : force write to standard output, even if it is the console
-r : operate recursively on directories
--ultra : enable levels beyond 19, up to 22 (requires more memory)
-C, --check : integrity check (default)
--no-check : no integrity check
-t, --test : test compressed file integrity
-- : all arguments after "--" are treated as files
Usage
# Compress X at level 11 with a single thread, writing the result to xxx
pzstd -11 -p 1 X -o xxx
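As a sketch of how the options fit together, the helper below compresses a file with pzstd and then checks the result with plain zstd. It assumes that both tools are installed and that pzstd output can be read by the regular zstd tool, which is worth confirming against your installed version. The function name `pzstd_roundtrip` is hypothetical.

```shell
#!/bin/sh
# pzstd_roundtrip FILE [LEVEL] [THREADS]  (hypothetical helper)
# Compress FILE with pzstd, then decompress it with zstd and compare.
pzstd_roundtrip() {
    src=$1
    # -p sets the number of threads; -f overwrites an existing output
    pzstd -q -f "-${2:-3}" -p "${3:-1}" "$src" -o "$src.zst" || return 1
    # Assumption: zstd can decompress what pzstd produced
    zstd -dc "$src.zst" | cmp -s - "$src"
}
```

For example, `pzstd_roundtrip X 11 4` compresses X at level 11 with four threads and succeeds only if the decompressed data matches the original.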