mdsuite.database.simulation_database module¶
MDSuite: A Zincwarecode package.
License¶
This program and the accompanying materials are made available under the terms of the Eclipse Public License v2.0 which accompanies this distribution, and is available at https://www.eclipse.org/legal/epl-v20.html
SPDX-License-Identifier: EPL-2.0
Copyright Contributors to the Zincwarecode Project.
Contact Information¶
email: zincwarecode@gmail.com
github: https://github.com/zincware
web: https://zincwarecode.com/
Citation¶
If you use this module please cite us with:
Summary¶
- class mdsuite.database.simulation_database.Database(path: Union[str, Path] = 'database')[source]¶
Bases:
object
Database class.
Databases make up a large part of the functionality of MDSuite and are kept fairly consistent in structure. Therefore, the database_path structure we are using has a separate class with commonly used methods which act as wrappers for the hdf5 database_path.
- add_data(chunk: TrajectoryChunkData, start_idx: int)[source]¶
Add new data to the dataset.
- Parameters:
chunk – a data chunk
start_idx – Configuration at which to start writing.
- add_dataset(structure: dict)[source]¶
Add a dataset of the necessary size to the database_path.
Just as a separate method exists for building the group structure of the HDF5 database_path, a separate method exists for adding a dataset. This allows datasets to be added not only upon the initial construction of the database_path, but also later, if new tensor_values are added that should be stored as well. This method assumes that a group has already been built. Although HDF5 does not require this, separating the two actions is good practice.
- Parameters:
structure (dict) – Structure of a single property to be added to the database_path, e.g. {'Na': {'Forces': (200, 5000, 3)}}
- Return type:
Updates the database_path directly.
- change_key_names(mapping: dict)[source]¶
Change the name of database_path keys.
- Parameters:
mapping (dict) – Mapping for the change of names
- Return type:
Updates the database_path
- get_data_size(data_path: str) → tuple[source]¶
Return the size of a dataset as a tuple (n_rows, n_columns, n_bytes).
- get_database_summary()[source]¶
Get a summary of the database properties.
- Returns:
summary – A list of properties that are in the database.
- Return type:
list
- get_load_time(database_path: Optional[str] = None)[source]¶
Calculate the open/close time of the database_path.
- get_memory_information() → dict[source]¶
Get memory information from the database_path.
- Returns:
memory_database – A dictionary of the memory information of the groups in the database_path
- Return type:
dict
- initialize_database(structure: dict)[source]¶
Build a database_path with a general structure.
Note, this method WILL overwrite a pre-existing database_path. This is because it is only to be called on the initial construction of an experiment class and the first addition of tensor_values to it.
- Parameters:
structure (dict) – General structure of the dictionary with relevant dataset sizes, e.g. {'Na': {'Forces': (200, 5000, 3)}, 'Pressure': (5000, 6), 'Temperature': (5000, 1)}. In this case, the last value in each tuple corresponds to the number of components that will be passed to the database_path.
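The nested structure dictionary described above can be read as a tree whose leaves are dataset shapes. The following is a minimal sketch, not MDSuite's implementation: flatten_structure is a hypothetical helper, and the 'group/dataset' path form is an assumption made for illustration only.

```python
from typing import Dict, Tuple


def flatten_structure(structure: dict, prefix: str = "") -> Dict[str, Tuple[int, ...]]:
    """Map each dataset path to its shape (hypothetical helper, not part of MDSuite)."""
    flat = {}
    for key, value in structure.items():
        # Assumed path convention: nested keys are joined with '/'.
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # A nested dict is treated as a group containing further entries.
            flat.update(flatten_structure(value, path))
        else:
            # A tuple is treated as the shape of a single dataset.
            flat[path] = tuple(value)
    return flat


structure = {
    "Na": {"Forces": (200, 5000, 3)},
    "Pressure": (5000, 6),
    "Temperature": (5000, 1),
}

datasets = flatten_structure(structure)
for path, shape in datasets.items():
    # The last entry of each shape is the number of components.
    print(path, shape, "components:", shape[-1])
```

In this sketch the last entry of every shape is the component count, matching the description of the structure parameter.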
- load_data(path_list: Optional[list] = None, select_slice: <numpy.lib.index_tricks.IndexExpression object> = slice(None, None, None), dictionary: bool = False, scaling: Optional[list] = None, d_size: Optional[int] = None)[source]¶
Load tensor_values from the database_path for some operation.
Should be called by the tensor_values fetch class as this will ensure correct loading and pre-loading.
- class mdsuite.database.simulation_database.MoleculeInfo(name: str, n_particles: int, properties: List[PropertyInfo], mass: Optional[float] = None, charge: float = 0, groups: Optional[dict] = None)[source]¶
Bases:
SpeciesInfo
Information about a Molecule.
All the information of a species + groups
- groups¶
A molecule-specific dictionary for mapping the molecule to the particles. The keys of this dict are index references to a specific molecule (e.g. molecule 1), and the values are dicts of atom species and their indices belonging to that specific molecule, e.g.
water = {"groups": {"0": {"H": [0, 1], "O": [0]}}}
This tells us that the 0th water molecule consists of the 0th and 1st hydrogen atoms in the database as well as the 0th oxygen atom.
- Type:
dict, optional
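As a minimal sketch of how the groups mapping described above can be read, the snippet below lists the atoms belonging to one molecule. atoms_in_molecule is a hypothetical helper introduced for illustration; it is not part of MDSuite.

```python
# Groups dictionary in the form described above: molecule key -> species -> atom indices.
water = {"groups": {"0": {"H": [0, 1], "O": [0]}}}


def atoms_in_molecule(groups: dict, molecule_key: str) -> list:
    """Return (species, atom_index) pairs for one molecule (hypothetical helper)."""
    return [
        (species, index)
        for species, indices in groups[molecule_key].items()
        for index in indices
    ]


members = atoms_in_molecule(water["groups"], "0")
print(members)
```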
- class mdsuite.database.simulation_database.PropertyInfo(name: str, n_dims: int)[source]¶
Bases:
object
Information of a trajectory property.
Example:
pos_info = PropertyInfo('Positions', 3)
vel_info = PropertyInfo('Velocities', 3)
- class mdsuite.database.simulation_database.SpeciesInfo(name: str, n_particles: int, properties: List[PropertyInfo], mass: Optional[float] = None, charge: float = 0)[source]¶
Bases:
object
Information of a species.
- properties¶
List of the properties that were recorded for the species. mass and charge are optional.
- Type:
list of PropertyInfo
- properties: List[PropertyInfo]¶
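To illustrate how PropertyInfo and SpeciesInfo fit together, here is a minimal sketch using stand-in dataclasses that mirror the signatures shown above. The stand-ins are assumptions for illustration and are not the library's own classes; the species name and particle count are made-up values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PropertyInfo:
    """Stand-in mirroring PropertyInfo(name, n_dims); not the MDSuite class."""
    name: str
    n_dims: int


@dataclass
class SpeciesInfo:
    """Stand-in mirroring the SpeciesInfo signature; not the MDSuite class."""
    name: str
    n_particles: int
    properties: List[PropertyInfo]
    mass: Optional[float] = None
    charge: float = 0


pos_info = PropertyInfo("Positions", 3)
vel_info = PropertyInfo("Velocities", 3)

# Hypothetical species: 42 sodium atoms with positions and velocities recorded.
sodium = SpeciesInfo(name="Na", n_particles=42, properties=[pos_info, vel_info])

# Per-configuration shape of each recorded property: (n_particles, n_dims).
per_config_shapes = {
    prop.name: (sodium.n_particles, prop.n_dims) for prop in sodium.properties
}
print(per_config_shapes)
```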
- class mdsuite.database.simulation_database.TrajectoryChunkData(species_list: List[SpeciesInfo], chunk_size: int)[source]¶
Bases:
object
Class to specify the data format for transfer from the file to the database.
- add_data(data: ndarray, config_idx, species_name, property_name)[source]¶
Add configuration data to the chunk.
- Parameters:
data – The data to be added, with shape (n_configs, n_particles, n_dims). n_particles and n_dims relate to the species and the property that is being added.
config_idx – Start index of the configs that are being added.
species_name – Name of the species to which the data belongs.
property_name – Name of the property being added.
Example: in a loop that reads 5 configs per loop, add_data(vel_array, 16*5, 'Na', 'Velocities'), where vel_array.shape == (5, 42, 3).
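The start index passed as config_idx follows from the loop iteration and the number of configurations read per loop. The sketch below shows only that arithmetic in plain Python; start_index and chunk_bounds are hypothetical helpers, and the trajectory length of 23 is an assumed value chosen to show a shorter final chunk.

```python
def start_index(iteration: int, chunk_size: int) -> int:
    """Start index of the configs read in a given loop iteration (hypothetical helper)."""
    return iteration * chunk_size


def chunk_bounds(n_configurations: int, chunk_size: int) -> list:
    """Return (start, stop) config indices for each chunk; the last chunk may be shorter."""
    return [
        (start, min(start + chunk_size, n_configurations))
        for start in range(0, n_configurations, chunk_size)
    ]


chunk_size = 5  # configurations read per loop, as in the example above

# Loop iteration 16 (zero-based) writes starting at configuration 16*5.
idx = start_index(16, chunk_size)
print(idx)

# Assumed trajectory of 23 configurations, split into chunks of 5.
bounds = chunk_bounds(23, chunk_size)
print(bounds)
```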
- class mdsuite.database.simulation_database.TrajectoryMetadata(n_configurations: int, species_list: List[SpeciesInfo], box_l: Optional[list] = None, sample_rate: int = 1, sample_step: Optional[float] = None, temperature: Optional[float] = None, simulation_data: dict = <factory>)[source]¶
Bases:
object
Trajectory Metadata container.
This metadata must be extracted from trajectory files to build the database into which the trajectory will be stored.
- species_list¶
The information about all species in the system.
- Type:
list of SpeciesInfo
- sample_rate¶
The number of timesteps between consecutive samples. To be removed in favour of sample_step.
- Type:
int, optional
- sample_step¶
The time between consecutive configurations. E.g. for a simulation with time step 0.1 where the trajectory is written every 5 steps: sample_step = 0.5. Does not have to be specified (e.g. configurations from Monte Carlo scheme), but is needed for all dynamic observables.
- Type:
float, optional
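The sample_step example above is the product of the simulation time step and the number of steps between written configurations. A minimal sketch of that arithmetic follows; the variable names are chosen for illustration and the time unit is whatever the simulation uses.

```python
import math

time_step = 0.1   # simulation time step
sample_rate = 5   # trajectory is written every 5 steps

# Time between consecutive stored configurations.
sample_step = time_step * sample_rate

# Times of the first few stored configurations, starting from zero.
times = [i * sample_step for i in range(4)]
print(sample_step, times)
```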
- temperature¶
The set temperature of the system. Optional because only applicable for MD simulations with thermostat. Needed for certain observables.
- Type:
float, optional
- simulation_data¶
All other simulation data that can be extracted from the trajectory metadata. E.g. software version, pressure in NPT simulations, time step, …
- Type:
dict
- species_list: List[SpeciesInfo]¶