ai2_kit.tool.frame module

class ai2_kit.tool.frame.FrameTool

Bases: object

This tool is designed to sample frames from large trajectory files without parsing them. You can use it to merge and sample frames from multiple files and write them to a new file.

A frame file is a file that contains multiple frames, where each frame spans a fixed number of lines. Examples include JSONL data files and trajectory files in LAMMPS, xyz, and similar formats.
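
To make the fixed-line-count idea concrete, the following is a minimal sketch (not the ai2_kit implementation) that groups the lines of a small xyz-style text into frames without interpreting their contents. The helper name `split_fixed_frames` is hypothetical, and dropping an incomplete trailing frame is an assumption made for this example.

```python
from typing import List


def split_fixed_frames(lines: List[str], frame_size: int) -> List[List[str]]:
    """Group lines into consecutive frames of `frame_size` lines each.

    Hypothetical helper for illustration only. Trailing lines that do
    not fill a whole frame are dropped (an assumption of this sketch).
    """
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    n_frames = len(lines) // frame_size
    return [lines[i * frame_size:(i + 1) * frame_size] for i in range(n_frames)]


# Two xyz frames of one atom each: atom count, comment line, coordinates.
xyz_text = "1\nframe 0\nH 0.0 0.0 0.0\n1\nframe 1\nH 0.0 0.0 0.1\n"
frames = split_fixed_frames(xyz_text.splitlines(keepends=True), frame_size=3)
```

Because the lines are only grouped and never parsed, the cost is independent of the trajectory format.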

read(*path_or_glob: str, frame_size: int = 0, rp=None, header_size: int = 0)

Load trajectory files from multiple paths; glob patterns are supported.

Parameters:
  • path_or_glob – path or glob pattern used to locate the data files

  • frame_size – number of lines in each frame

  • rp – repeated pattern, can be a string or regex, e.g. ‘ITEM: TIMESTEP’, ‘Lattice.+’

  • header_size – size of header lines

sample(size: int, method: Literal['even', 'random', 'truncate'] = 'even', **kwargs)

Sample frames using the selected method.

Parameters:
  • size – number of frames to sample; if it is larger than the number of loaded frames, all frames are returned

  • method – sampling method, one of ‘even’, ‘random’, ‘truncate’; the default is ‘even’

  • seed – seed for random sampling, only used when method is ‘random’

Note that by default the seed is the length of the input list, so repeated runs on the same input produce the same sample. To generate a different sample each time, set the random seed manually.
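
The three methods and the default-seed behavior can be sketched as follows. This is an illustrative stand-in, not the library's code: the function name `sample_frames` is hypothetical, and both the even-spacing formula and the reading of ‘truncate’ as "keep the first `size` frames" are assumptions.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_frames(frames: Sequence[T], size: int, method: str = "even",
                  seed: Optional[int] = None) -> List[T]:
    """Pick `size` frames from `frames` (hypothetical illustration)."""
    n = len(frames)
    if size >= n:
        # Asking for more than is available returns everything.
        return list(frames)
    if size <= 0:
        return []
    if method == "even":
        # Assumption: evenly spaced indices starting at 0.
        return [frames[i * n // size] for i in range(size)]
    if method == "random":
        # Per the note above, the default seed is the input length,
        # which makes the sample reproducible for a given input.
        rng = random.Random(n if seed is None else seed)
        return rng.sample(list(frames), size)
    if method == "truncate":
        # Assumption: keep the first `size` frames.
        return list(frames[:size])
    raise ValueError(f"unknown method: {method!r}")


data = list(range(10))
even = sample_frames(data, 5)                # evenly spaced picks
head = sample_frames(data, 3, "truncate")    # first three frames
rand_a = sample_frames(data, 4, "random")    # seeded with len(data)
rand_b = sample_frames(data, 4, "random")    # same seed, same sample
```

Passing an explicit `seed` is how this sketch varies the random sample between runs.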

size()

Number of loaded frames.

slice(expr: str)

Slice frames with a Python slice expression, for example 10:, :10, or ::2.

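Parameters:
  • expr – slice expression of the form start:stop:step, where start is the start index, stop is the stop index, and step is the step; any part may be omitted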
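
One way such an expression string could be turned into a Python `slice` object is sketched below. The helper `parse_slice` is hypothetical and is not taken from the library; it only accepts the colon-separated forms shown above.

```python
def parse_slice(expr: str) -> slice:
    """Convert a string such as '10:', ':10' or '::2' into a slice object.

    Hypothetical helper for illustration; empty fields become None.
    """
    parts = expr.strip().split(":")
    if len(parts) < 2 or len(parts) > 3:
        raise ValueError(f"not a slice expression: {expr!r}")
    fields = [int(p) if p.strip() else None for p in parts]
    return slice(*fields)


frames = list(range(20))
tail = frames[parse_slice("10:")]        # drop the first ten frames
head = frames[parse_slice(":10")]        # keep the first ten frames
every_other = frames[parse_slice("::2")]  # keep every second frame
```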

write(out_file: str, keep_header=False, **kwargs)

ai2_kit.tool.frame.detect_frame_size(l: list, rp: str)

Detect the frame size of a file from a repeated pattern.

Parameters:
  • rp – repeated pattern, can be a string or regex, e.g. ‘TIMESTEP’ (for lammpstrj), ‘Lattice’ (for xyz)
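
The idea of detecting a frame size from a repeated pattern can be sketched as measuring the line distance between the first two matches. The function below is a hypothetical illustration, not the library's implementation; treating `rp` as a regular expression in all cases and the handling of fewer than two matches are assumptions.

```python
import re
from typing import List


def detect_frame_size_sketch(lines: List[str], rp: str) -> int:
    """Return the number of lines between the first two lines matching `rp`.

    Hypothetical sketch: a plain string is simply used as a regex here.
    """
    pattern = re.compile(rp)
    hits: List[int] = []
    for i, line in enumerate(lines):
        if pattern.search(line):
            hits.append(i)
            if len(hits) == 2:
                # Stop early: only the first two matches are needed.
                return hits[1] - hits[0]
    if not hits:
        raise ValueError(f"pattern {rp!r} not found")
    # Assumption: with a single match, the frame runs to the end of the input.
    return len(lines) - hits[0]


lammps_frame = [
    "ITEM: TIMESTEP", "0",
    "ITEM: NUMBER OF ATOMS", "1",
    "ITEM: ATOMS id x y z", "1 0.0 0.0 0.0",
]
lammps_size = detect_frame_size_sketch(lammps_frame * 3, "ITEM: TIMESTEP")

xyz_frame = ["2", 'Lattice="1 0 0 0 1 0 0 0 1"', "H 0 0 0", "H 0 0 1"]
xyz_size = detect_frame_size_sketch(xyz_frame * 2, "Lattice.+")
```

Stopping at the second match keeps the scan short even for very large files.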

ai2_kit.tool.frame.load_frames(*path_or_glob: str, frame_size: int = 0, rp=None, header_size: int = 0)

Load frames from multiple files

Parameters:
  • path_or_glob – path or glob pattern used to locate the data files

  • frame_size – number of lines in each frame

  • rp – repeated pattern, can be a string or regex, e.g. ‘ITEM: TIMESTEP’, ‘Lattice.+’

  • header_size – size of header lines
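
Collecting lines from several paths or glob patterns might look like the sketch below, which uses only the standard library. The helper `read_lines_from_globs` is hypothetical, and sorting the matched files for a deterministic order is a choice made for this example rather than documented library behavior.

```python
import glob
import os
import tempfile
from typing import List


def read_lines_from_globs(*path_or_glob: str) -> List[str]:
    """Expand each path or glob pattern and concatenate the lines of all matches."""
    lines: List[str] = []
    for pattern in path_or_glob:
        # sorted() gives a deterministic order; glob.glob does not guarantee one.
        for path in sorted(glob.glob(pattern)):
            with open(path, "r") as fp:
                lines.extend(fp.readlines())
    return lines


# Usage: write two small JSONL files (one frame per line) and merge them.
with tempfile.TemporaryDirectory() as tmp:
    samples = [("a.jsonl", ['{"i": 0}\n', '{"i": 1}\n']),
               ("b.jsonl", ['{"i": 2}\n'])]
    for name, rows in samples:
        with open(os.path.join(tmp, name), "w") as fp:
            fp.writelines(rows)
    merged = read_lines_from_globs(os.path.join(tmp, "*.jsonl"))
```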

ai2_kit.tool.frame.parse_frames(lines: List[str], frame_size: int = 0, rp=None, header_size: int = 0)

Parse frames from lines.

Parameters:
  • lines – lines of data

  • frame_size – number of lines in each frame

  • rp – repeated pattern, can be a string or regex, e.g. ‘ITEM: TIMESTEP’, ‘Lattice.+’

  • header_size – size of header lines
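
How these four parameters might interact is sketched below. This is a hypothetical illustration rather than the library's code, and it rests on several assumptions: `header_size` counts lines at the start of the input that are skipped, an explicit `frame_size` takes precedence over `rp`, and each frame is returned as one joined string.

```python
import re
from typing import List, Optional


def parse_frames_sketch(lines: List[str], frame_size: int = 0,
                        rp: Optional[str] = None,
                        header_size: int = 0) -> List[str]:
    """Split `lines` into frames of equal length (hypothetical sketch)."""
    # Assumption: the header is the first `header_size` lines and is skipped.
    body = lines[header_size:]
    if frame_size <= 0:
        if rp is None:
            raise ValueError("either frame_size or rp is required")
        # Infer the frame size from the distance between pattern matches.
        pattern = re.compile(rp)
        starts = [i for i, line in enumerate(body) if pattern.search(line)]
        if not starts:
            raise ValueError(f"pattern {rp!r} not found")
        if len(starts) > 1:
            frame_size = starts[1] - starts[0]
        else:
            frame_size = len(body) - starts[0]
    # Join each group of lines into one string; incomplete tails are dropped.
    last_start = len(body) - frame_size
    return ["".join(body[i:i + frame_size])
            for i in range(0, last_start + 1, frame_size)]


lines = [
    "# generated by some tool\n",           # one header line
    "1\n", "Lattice a\n", "H 0 0 0\n",      # frame 0
    "1\n", "Lattice b\n", "H 0 0 1\n",      # frame 1
]
by_size = parse_frames_sketch(lines, frame_size=3, header_size=1)
by_pattern = parse_frames_sketch(lines, rp="Lattice.+", header_size=1)
```

Both calls yield the same two frames, which shows that `rp` is only a way to infer `frame_size` in this sketch.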