ai2_kit.tool.frame module
- class ai2_kit.tool.frame.FrameTool
Bases: object
This tool is designed to sample frames from large trajectory files without parsing them. You can use it to merge and sample frames from multiple files and write them to a new file.
A frame file is a file that contains multiple frames, each occupying a fixed number of lines. Examples include JSONL data files and trajectory files in LAMMPS, xyz, and similar formats.
- read(*path_or_glob: str, frame_size: int = 0, rp=None, header_size: int = 0)
Load trajectory files from multiple paths. Glob patterns are supported.
- Parameters:
path_or_glob – path or glob pattern used to locate the data files
frame_size – number of lines in each frame
rp – repeated pattern, either a string or a regex, e.g. ‘ITEM: TIMESTEP’, ‘Lattice.+’
- sample(size: int, method: Literal['even', 'random', 'truncate'] = 'even', **kwargs)
Sample frames using the selected method.
- Parameters:
size – number of frames to sample; if size is larger than the data size, all data are returned
method – sampling method, one of ‘even’, ‘random’, ‘truncate’; the default is ‘even’
seed – seed for random sampling, only used when method is ‘random’
Note that by default the seed is the length of the input list, so repeated calls on the same data return the same sample. To get a different sample each time, set the random seed manually.
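The three sampling methods can be illustrated with a small, self-contained sketch. `sample_frames` below is a hypothetical helper written for this page, not the code used by `FrameTool.sample`; in particular, the exact index selection for ‘even’ and the assumption that ‘truncate’ keeps the first `size` items are guesses based on the parameter descriptions above.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_frames(frames: Sequence[T], size: int, method: str = "even",
                  seed: Optional[int] = None) -> List[T]:
    """Hypothetical illustration of 'even', 'random' and 'truncate' sampling."""
    n = len(frames)
    if size >= n:
        # Asking for more than is available returns everything.
        return list(frames)
    if size <= 0:
        return []
    if method == "even":
        # Pick indices spread uniformly across the whole sequence.
        step = n / size
        return [frames[int(i * step)] for i in range(size)]
    if method == "truncate":
        # Assumption: keep the first `size` frames.
        return list(frames[:size])
    if method == "random":
        # Mirror the documented default: seed with the input length,
        # so the same input yields the same sample unless a seed is given.
        rng = random.Random(n if seed is None else seed)
        return rng.sample(list(frames), size)
    raise ValueError(f"unknown method: {method}")
```

With ten frames and `size=5`, the ‘even’ branch of this sketch picks every second frame. Calling the ‘random’ branch twice without a seed returns the same selection, which is the behavior the note above warns about.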
- ai2_kit.tool.frame.detect_frame_size(l: list, rp: str)
Detect the frame size of a file from a repeating pattern.
- Parameters:
rp – repeated pattern, either a string or a regex, e.g. ‘TIMESTEP’ (for lammpstrj), ‘Lattice’ (for xyz)
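One way to read "detect frame size by repeating pattern" is to measure the distance between the first two lines that match the pattern. The sketch below follows that idea; `detect_frame_size_sketch` is a hypothetical name, and the use of `re.search` and the single-match fallback are assumptions rather than the library's documented behavior.

```python
import re
from typing import List


def detect_frame_size_sketch(lines: List[str], rp: str) -> int:
    """Hypothetical illustration: frame size = gap between consecutive matches."""
    pattern = re.compile(rp)
    hits = [i for i, line in enumerate(lines) if pattern.search(line)]
    if len(hits) >= 2:
        # The pattern occurs once per frame, so the gap is the frame length.
        return hits[1] - hits[0]
    if len(hits) == 1:
        # Assumption: a single match means the input holds exactly one frame.
        return len(lines)
    raise ValueError(f"pattern {rp!r} not found")
```

The pattern does not have to sit on the first line of a frame. In an extended xyz file the ‘Lattice’ comment is the second line of each frame, and the gap between two consecutive matches still equals the frame size.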
- ai2_kit.tool.frame.load_frames(*path_or_glob: str, frame_size: int = 0, rp=None, header_size: int = 0)
Load frames from multiple files.
- Parameters:
path_or_glob – path or glob pattern used to locate the data files
frame_size – number of lines in each frame
rp – repeated pattern, either a string or a regex, e.g. ‘ITEM: TIMESTEP’, ‘Lattice.+’
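As a rough picture of what loading frames from several paths involves, the sketch below expands each glob pattern with the standard library, reads every matching file, and cuts its lines into fixed-size chunks. `load_frames_sketch` is a hypothetical helper; the sorted file order and the handling of `header_size` as a per-file offset are assumptions, and pattern-based detection via `rp` is left out for brevity.

```python
import glob
from typing import List


def load_frames_sketch(*path_or_glob: str, frame_size: int,
                       header_size: int = 0) -> List[List[str]]:
    """Hypothetical illustration: collect fixed-size frames from many files."""
    if frame_size <= 0:
        raise ValueError("this sketch needs an explicit frame_size")
    frames: List[List[str]] = []
    for pattern in path_or_glob:
        # A plain path is a glob that matches at most one file.
        for path in sorted(glob.glob(pattern)):
            with open(path) as f:
                lines = f.readlines()
            body = lines[header_size:]  # assumption: header is skipped per file
            # Keep only complete frames; a trailing partial frame is dropped.
            for start in range(0, len(body) - frame_size + 1, frame_size):
                frames.append(body[start:start + frame_size])
    return frames
```

Frames are kept as raw lines and never parsed into atoms or coordinates.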
- ai2_kit.tool.frame.parse_frames(lines: List[str], frame_size: int = 0, rp=None, header_size: int = 0)
Parse frames from a list of lines.
- Parameters:
lines – lines of data
frame_size – number of lines in each frame
rp – repeated pattern, either a string or a regex, e.g. ‘ITEM: TIMESTEP’, ‘Lattice.+’
header_size – number of header lines
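The interplay of the four parameters can be sketched as follows. `parse_frames_sketch` is a hypothetical stand-in, not the library function: it assumes that header lines appear once at the top of `lines`, and that `rp` is only consulted when `frame_size` is left at 0, which is a guess based on the default values in the signature.

```python
import re
from typing import List, Optional


def parse_frames_sketch(lines: List[str], frame_size: int = 0,
                        rp: Optional[str] = None,
                        header_size: int = 0) -> List[List[str]]:
    """Hypothetical illustration of splitting lines into frames."""
    body = lines[header_size:]  # drop the leading header lines
    if frame_size <= 0:
        if rp is None:
            raise ValueError("either frame_size or rp is required")
        pattern = re.compile(rp)
        hits = [i for i, line in enumerate(body) if pattern.search(line)]
        if not hits:
            raise ValueError(f"pattern {rp!r} not found")
        # Gap between the first two matches, or the whole body for one frame.
        frame_size = hits[1] - hits[0] if len(hits) > 1 else len(body)
    # Slice the body into consecutive, complete frames.
    return [body[i:i + frame_size]
            for i in range(0, len(body) - frame_size + 1, frame_size)]
```

In this sketch, passing an explicit `frame_size` and passing a matching `rp` give the same result on well-formed input. The explicit size skips the pattern scan.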