Dpdata Toolkit
ai2-kit tool dpdata
This toolkit is a command-line wrapper around dpdata that lets users process DeepMD datasets from the command line.
Usage
ai2-kit tool dpdata # show all commands
ai2-kit tool dpdata to_ase -h # show the documentation of a specific command
This toolkit includes the following commands:
Command | Description | Example | Reference |
---|---|---|---|
read | Read a dataset into memory. This command does nothing useful on its own; chain other commands after it to process the loaded data. | | Supports wildcards; can be called multiple times |
write | Use MultiSystems to merge datasets and write them to a directory | | |
filter | Use a lambda expression to filter the dataset by system data. | See in | |
set_fparam | add | See in | |
slice | Use a slice expression to process systems | See in | |
sample | Sample data by different methods; currently supported methods are | See in | |
eval | use | See in | |
to_ase | Convert dpdata format to ase format and use ase tools to process the data | See in | |
These commands are chainable and can be used to process trajectories in a pipeline fashion (separated by `-`). For more information, please refer to the following examples.
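To make the chaining rule concrete, here is a minimal sketch of how a command line could be broken into stages at each standalone `-` token. The helper `split_pipeline` is hypothetical and written only for illustration; the real parsing is done by ai2-kit's own CLI layer and may differ in detail.

```python
from typing import List


def split_pipeline(argv: List[str], separator: str = "-") -> List[List[str]]:
    """Split a flat argument list into stages at each standalone separator.

    Hypothetical helper for illustration; not part of ai2-kit.
    """
    stages: List[List[str]] = [[]]
    for token in argv:
        if token == separator:
            # A bare "-" closes the current stage and starts the next one.
            stages.append([])
        else:
            # Options such as "--fmt" are not separators and stay in the stage.
            stages[-1].append(token)
    # Drop empty stages produced by leading, trailing, or doubled separators.
    return [stage for stage in stages if stage]


argv = [
    "read", "./path/to/dataset", "--fmt", "deepmd/npy",
    "-", "filter", "lambda x: x['forces'].max() < 10",
    "-", "write", "./path/to/filtered_dataset",
]

for stage in split_pipeline(argv):
    # The first token names the command; the rest are its arguments.
    print(stage[0], stage[1:])
```

Each stage starts with a command name (`read`, `filter`, `write`) followed by its own arguments, and each command operates on the data left in memory by the previous one.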
Example
# Read multiple datasets generated by the training workflow using a wildcard and merge them into a single dataset
# You can also call `read` multiple times to read multiple datasets from different directories
ai2-kit tool dpdata read ./workdir/iters-*/train-deepmd/new_dataset/* --fmt deepmd/npy - write ./merged_dataset --fmt deepmd/npy
# You can also save the data in hdf5 format
ai2-kit tool dpdata read ./workdir/iters-*/train-deepmd/new_dataset/* --fmt deepmd/npy - write ./merged.hdf5 --fmt deepmd/hdf5
# Use a lambda expression to filter out outlier data
ai2-kit tool dpdata read ./path/to/dataset --fmt deepmd/npy - filter "lambda x: x['forces'].max() < 10" - write ./path/to/filtered_dataset
# Set fparam when reading data
ai2-kit tool dpdata read ./path/to/dataset --fmt deepmd/npy --fparam [0,1] - write ./path/to/new_dataset
# (Re)label data
ai2-kit tool dpdata read dp-h2o --nolabel - eval dp-frozen.pb - write new-dp-h2o
# Drop the first 10 frames, then sample 10 frames using the random method, and save the result in xyz format
ai2-kit tool dpdata read dp-h2o - slice 10: - sample 10 --method random - to_ase - write h2o.xyz
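The filter, slice, and sample steps in the examples above can be pictured on plain Python lists. The sketch below is an assumption-laden illustration, not ai2-kit's implementation: `filter_systems`, `parse_slice`, and `sample_random` are hypothetical helpers, the system data uses flat lists instead of NumPy arrays (so the predicate calls the built-in `max` rather than `.max()`), and the random sampler keeps frames in their original order, which the real tool may not do.

```python
import random
from typing import Any, Dict, List, Optional


def filter_systems(systems: List[Dict[str, Any]], lambda_expr: str) -> List[Dict[str, Any]]:
    """Keep the systems whose data satisfies a lambda given as a string.

    Hypothetical helper. eval() runs arbitrary code, so only use trusted input.
    """
    predicate = eval(lambda_expr)
    return [data for data in systems if predicate(data)]


def parse_slice(expr: str) -> slice:
    """Turn a slice expression such as '10:' or '::2' into a slice object."""
    parts = expr.split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"not a slice expression: {expr!r}")
    # Empty fields mean "use the default", exactly as in Python's own syntax.
    return slice(*[int(p) if p.strip() else None for p in parts])


def sample_random(frames: List[Any], size: int, seed: Optional[int] = None) -> List[Any]:
    """Pick `size` frames at random without replacement, preserving order."""
    if size >= len(frames):
        return list(frames)
    rng = random.Random(seed)
    picked = sorted(rng.sample(range(len(frames)), size))
    return [frames[i] for i in picked]


# Stand-in system data: flat lists of force components, not real dpdata arrays.
systems = [
    {"name": "a", "forces": [0.5, 2.0, -1.5]},
    {"name": "b", "forces": [0.1, 42.0, 3.0]},  # outlier, dropped by the filter
]
kept_systems = filter_systems(systems, "lambda x: max(x['forces']) < 10")
print([s["name"] for s in kept_systems])

# Stand-in for a 25-frame system: drop the first 10 frames, then sample 10.
frames = list(range(25))
remaining = frames[parse_slice("10:")]
sampled = sample_random(remaining, 10, seed=0)
print(len(remaining), len(sampled))
```

Passing a fixed `seed` makes the sampled subset reproducible in this sketch; whether the command line tool exposes a comparable option is not stated here.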