Meta Functions Module Documentation

MDSuite: A Zincwarecode package.

License

This program and the accompanying materials are made available under the terms of the Eclipse Public License v2.0 which accompanies this distribution, and is available at https://www.eclipse.org/legal/epl-v20.html

SPDX-License-Identifier: EPL-2.0

Copyright Contributors to the Zincwarecode Project.

Contact Information

email: zincwarecode@gmail.com
github: https://github.com/zincware
web: https://zincwarecode.com/

Citation

If you use this module please cite us with:

Summary

mdsuite.utils.meta_functions.apply_savgol_filter(data: ndarray, order: int = 2, window_length: int = 17) ndarray[source]

Apply a Savitzky-Golay (savgol) filter for function smoothing.

This function simply calls the SciPy Savitzky-Golay implementation with preset defaults for the polynomial order and window length.

Parameters:
  • window_length (int) – Window length to use in the filtering.

  • data (np.ndarray) – Array of tensor_values to be filtered.

  • order (int) – Order of polynomial to use in the smoothing.

Returns:

filtered tensor_values – Returns the filtered tensor_values directly from the scipy SavGol filter.

Return type:

np.ndarray

Notes

There are no tests for this method, as a test would only exercise the SciPy implementation, which is already tested upstream.
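The wrapper described above can be sketched as follows. This is a minimal illustration, assuming the function forwards its arguments to scipy.signal.savgol_filter; the name `smooth` and the noisy sine data are hypothetical and not part of MDSuite.

```python
import numpy as np
from scipy.signal import savgol_filter


def smooth(data: np.ndarray, order: int = 2, window_length: int = 17) -> np.ndarray:
    """Hypothetical stand-in: forward the data to SciPy's Savitzky-Golay filter."""
    # SciPy expects the window length first, then the polynomial order.
    return savgol_filter(data, window_length, order)


# Illustrative data: a sine curve with reproducible Gaussian noise.
x = np.linspace(0.0, 2.0 * np.pi, 200)
rng = np.random.default_rng(0)
noisy = np.sin(x) + 0.1 * rng.normal(size=x.size)

smoothed = smooth(noisy)
```

The window length must be odd and larger than the polynomial order, which the defaults of 17 and 2 satisfy.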

mdsuite.utils.meta_functions.check_a_in_b(a, b)[source]

Check if any value of a is in b.

Parameters:
  • a (tf.Tensor) –

  • b (tf.Tensor) –

Return type:

bool
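As a rough illustration of the membership check, the sketch below uses plain Python sequences in place of tf.Tensor objects. The name `any_in` is hypothetical, and the real function may rely on TensorFlow operations instead.

```python
def any_in(a, b) -> bool:
    """Return True if at least one element of a also occurs in b."""
    # A set gives constant-time membership tests for hashable values.
    b_values = set(b)
    return any(item in b_values for item in a)


shared = any_in([1, 2, 3], [3, 4])    # one common element
disjoint = any_in([1, 2], [3, 4])     # no common element
```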

mdsuite.utils.meta_functions.closest_point(data: ndarray, value: float)[source]

Find the value in the array closest to the value provided.

Parameters:
  • data (np.ndarray) – Array to search.

  • value (float) – Value to look for.
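One common way to implement such a lookup is to minimise the absolute difference. The sketch below assumes the closest array element itself is returned, which the documentation above does not state; the name `closest_value` is hypothetical.

```python
import numpy as np


def closest_value(data: np.ndarray, value: float):
    """Return the element of data with the smallest absolute distance to value."""
    data = np.asarray(data)
    # argmin returns the index of the first minimum if there are ties.
    index = np.abs(data - value).argmin()
    return data[index]


nearest = closest_value(np.array([0.0, 1.5, 3.0]), 1.2)
```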

mdsuite.utils.meta_functions.find_item(obj, key)[source]

Function to recursively retrieve values given a key for nested dictionaries.

Parameters:
  • obj (dict) – nested dictionary with results

  • key (str, float or other) – to find in the dictionary

Returns:

item – returns the value for the given key. Return type may change depending on the requested key

Return type:

dict value.
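A recursive lookup of this kind can be sketched as below. This is an assumption about the general approach rather than the MDSuite source: the name `find_key`, the depth-first order, and the choice to return None for a missing key are all illustrative.

```python
def find_key(obj: dict, key):
    """Depth-first search for key in a nested dictionary."""
    if key in obj:
        return obj[key]
    for value in obj.values():
        # Only descend into nested dictionaries.
        if isinstance(value, dict):
            item = find_key(value, key)
            if item is not None:
                return item
    return None


# Hypothetical nested results dictionary.
results = {"rdf": {"Na": {"peak": 2.4}}, "units": "real"}
peak = find_key(results, "peak")
missing = find_key(results, "not_there")
```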

mdsuite.utils.meta_functions.get_dimensionality(box: list) int[source]

Calculate the dimensionality of the experiment box.

Parameters:

box (list) – box array of the experiment of the form [x, y, z]

Returns:

dimensions – dimension of the box, i.e., 1, 2, or 3 (higher dimensions probably don’t make sense just yet)

Return type:

int
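A minimal sketch of the idea, assuming that an unused dimension is marked by a box length of zero (the documentation above does not state the convention). The name `count_dimensions` is hypothetical.

```python
def count_dimensions(box: list) -> int:
    """Count the box sides with a non-zero length."""
    return sum(1 for length in box if length != 0)


three_d = count_dimensions([10.0, 10.0, 10.0])
two_d = count_dimensions([10.0, 10.0, 0.0])
one_d = count_dimensions([10.0, 0.0, 0.0])
```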

mdsuite.utils.meta_functions.get_machine_properties() dict[source]

Get the properties of the machine being used.

Returns:

machine_properties – A dictionary containing information about the hardware being used.

Return type:

dict

mdsuite.utils.meta_functions.get_nearest_divisor(a: int, b: int) int[source]

Function to get the nearest lower divisor.

If b%a is not 0, this method may be called to get the nearest number to a that makes b%a zero.

Parameters:
  • a (int) – divisor

  • b (int) – target number

Returns:

divisor – nearest number to a that divides into b evenly.

Return type:

int
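The search for the nearest lower divisor can be sketched as a simple countdown from a. This is an illustrative version under the assumption that a and b are positive integers; the name `nearest_lower_divisor` is hypothetical.

```python
def nearest_lower_divisor(a: int, b: int) -> int:
    """Decrease a until it divides b evenly; 1 always divides b."""
    divisor = a
    while divisor > 1 and b % divisor != 0:
        divisor -= 1
    return divisor


batch = nearest_lower_divisor(7, 100)       # 7 and 6 fail, 5 divides 100
unchanged = nearest_lower_divisor(10, 100)  # 10 already divides 100
```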

Perform a golden-section search for function minima.

The golden-section search is a robust, derivative-free minimum-finding algorithm and is used here to locate the minima of functions during analysis. For example, in the evaluation of coordination numbers, the minima of the radial distribution functions must be found in order to define the coordination. This implementation returns an interval in which the minimum should exist, and does so for all of the minima of the function.

Parameters:
  • data (np.array) – Data on which to find minimums.

  • a (float) – upper bound on the min finding range.

  • b (float) – lower bound on the min finding range.

Returns:

minimum range – Returns two radii values within which the minimum can be found.

Return type:

tuple
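For reference, the textbook golden-section search on a single unimodal interval looks roughly like the sketch below. It differs from the function documented above in that it takes a callable rather than a data array and handles one minimum only; the name `golden_section_interval`, the argument order, and the tolerance are assumptions.

```python
import math


def golden_section_interval(f, lower: float, upper: float, tol: float = 1e-5) -> tuple:
    """Shrink [lower, upper] around the minimum of a unimodal function f."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # 1 / golden ratio, about 0.618
    c = upper - inv_phi * (upper - lower)
    d = lower + inv_phi * (upper - lower)
    while abs(upper - lower) > tol:
        if f(c) < f(d):
            # The minimum lies in [lower, d].
            upper = d
        else:
            # The minimum lies in [c, upper].
            lower = c
        c = upper - inv_phi * (upper - lower)
        d = lower + inv_phi * (upper - lower)
    return lower, upper


# Illustrative function with a single minimum at r = 2.
r_low, r_high = golden_section_interval(lambda r: (r - 2.0) ** 2, 0.0, 5.0)
```

Each iteration shrinks the interval by the same factor of about 0.618, so the number of steps grows only logarithmically with the requested precision.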

mdsuite.utils.meta_functions.gpu_available() bool[source]

Check if TensorFlow has access to any GPU device.

mdsuite.utils.meta_functions.is_jsonable(x: dict) bool[source]

Check whether a dictionary is JSON serializable.

Parameters:

x (dict) – Dictionary to check for JSON serializability.

Returns:

Whether the dict was serializable or not.

Return type:

bool
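A check of this kind is usually written by attempting the serialization and catching the failure. The sketch below assumes the standard-library json module is used; the name `can_serialize` is hypothetical.

```python
import json


def can_serialize(x) -> bool:
    """Return True if json.dumps accepts x, False otherwise."""
    try:
        json.dumps(x)
        return True
    except (TypeError, ValueError, OverflowError):
        # TypeError: unsupported type; ValueError: e.g. circular reference.
        return False


plain = can_serialize({"temperature": 300.0, "species": ["Na", "Cl"]})
with_set = can_serialize({"species": {"Na", "Cl"}})  # sets are not JSON types
```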

mdsuite.utils.meta_functions.join_path(a, b)[source]

Join a and b and make sure to use forward slashes.

Parameters:
  • a – First path component.

  • b – Second path component.

Returns:

joined path with forced forward slashes

Return type:

str

Notes

h5py 3.1.0 on windows relies on forward slashes but os.path.join returns backward slashes. Here we replace them to enable MDSuite for Windows users. To be used ONLY for navigation within a database_path. For navigation through the file experiment in general one should use os.path.join.
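The replacement described in the note can be sketched as follows, assuming the function is a thin wrapper around os.path.join. The name `join_forward` and the group names in the example are hypothetical.

```python
import os


def join_forward(a: str, b: str) -> str:
    """Join two path components and force forward slashes."""
    # On Windows os.path.join inserts backslashes; HDF5 groups need "/".
    return os.path.join(a, b).replace("\\", "/")


group = join_forward("Na", "Positions")
```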

mdsuite.utils.meta_functions.line_counter(filename: str) int[source]

Count the number of lines in a file.

This function uses a memory-safe method to count the number of lines in the file. Together with the other tensor_values collected during the trajectory analysis, this is enough information to completely characterize the experiment.

Parameters:

filename (str) – Name of the file to be read in.

Returns:

lines – Number of lines in the file

Return type:

int
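One memory-safe way to count lines is to iterate over the file object, which reads one line at a time instead of loading the whole file. The sketch below illustrates that approach and is not necessarily the method MDSuite uses; the name `count_lines` and the temporary file are hypothetical.

```python
import os
import tempfile


def count_lines(filename: str) -> int:
    """Count lines lazily so the file is never fully held in memory."""
    with open(filename, "rb") as handle:
        return sum(1 for _ in handle)


# Write a small temporary file to demonstrate the counter.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

n_lines = count_lines(path)
os.remove(path)
```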

mdsuite.utils.meta_functions.linear_fitting_function(x: array, a: float, b: float) array[source]

Linear function for line fitting.

In many cases, namely those involving an Einstein relation, a straight line must be fitted to some tensor_values. This function is passed to the SciPy curve_fit function as the model to fit.

Parameters:
  • x (np.array) – x tensor_values for fitting

  • a (float) – Fitting parameter of the gradient

  • b (float) – Fitting parameter for the y intercept

Returns:

a*x + b – Returns the evaluation of a linear function.

Return type:

np.array
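The way such a model function is handed to scipy.optimize.curve_fit can be sketched as below. The name `linear_model` and the synthetic, noise-free data are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit


def linear_model(x, a, b):
    """Straight line with gradient a and y intercept b."""
    return a * x + b


# Synthetic data lying exactly on the line y = 3x + 1.
time = np.linspace(0.0, 10.0, 50)
signal = 3.0 * time + 1.0

# curve_fit returns the optimal parameters and their covariance matrix.
popt, pcov = curve_fit(linear_model, time, signal)
gradient, intercept = popt
```

In an Einstein-relation analysis the fitted gradient is the quantity of interest, since the transport coefficient is proportional to it.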

mdsuite.utils.meta_functions.optimize_batch_size(filepath: Union[str, Path], number_of_configurations: int, _file_size: Optional[int] = None, _memory: Optional[int] = None, test: bool = False) int[source]

Optimize the size of batches during initial processing.

During the database_path construction a batch size must be chosen in order to process the trajectories with the least RAM but reasonable performance.

Parameters:
  • filepath (str or Path) – Path to the file to be read in. The file is not opened during the process; the path is only needed to read the file size.

  • number_of_configurations (int) – Number of configurations in the trajectory.

  • _file_size (int) – Mock file size to use during tests.

  • _memory (int) – Mock memory to use during tests.

  • test (bool) – If true, mock variables are used.

Returns:

batch size – Number of configurations to load in each batch

Return type:

int

mdsuite.utils.meta_functions.simple_file_read(filename: str) list[source]

Trivially read a file and load it into an array.

There are many occasions when a file must simply be read and dumped into an array. In these cases, we call this method and dump tensor_values into an array. This is NOT memory safe, and should not be used for processing large trajectory files.

Parameters:

filename (str) – Name of the file to be read in.

Returns:

data_array – Data read in by the function.

Return type:

list

mdsuite.utils.meta_functions.split_array(data: array, condition: array) list[source]

Split an array by a condition.

Parameters:
  • data (np.array) – tensor_values to split.

  • condition (np.array) – Condition on which to split.

Returns:

split_array – A list of split up arrays.

Return type:

list
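A split by a boolean condition can be sketched with NumPy masks as below. The ordering of the parts and the dropping of empty parts are assumptions, since the documentation above does not specify them; the name `split_by_condition` is hypothetical.

```python
import numpy as np


def split_by_condition(data: np.ndarray, condition: np.ndarray) -> list:
    """Split data into elements that satisfy the mask and those that do not."""
    parts = [data[condition], data[~condition]]
    # Assumption: empty parts are left out of the result.
    return [part for part in parts if len(part) > 0]


values = np.array([1, 2, 3, 4, 5])
parts = split_by_condition(values, values > 3)
```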

mdsuite.utils.meta_functions.timeit(f: Callable) Callable[source]

Decorator to time the execution of a method.

Parameters:

f (Callable) – Function to be wrapped.

Returns:

wrap – Method wrapper for timing the method.

Return type:

Callable

Notes

There is currently no test for this wrapper as there is no simple way of checking timing on a remote server.
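A timing decorator of this shape can be sketched as follows. It is an illustrative version, not the MDSuite source: the name `timed`, the use of time.perf_counter, and the printed message format are assumptions.

```python
import functools
import time


def timed(f):
    """Wrap f so that each call prints its wall-clock duration."""

    @functools.wraps(f)  # keep the wrapped function's name and docstring
    def wrap(*args, **kwargs):
        start = time.perf_counter()
        result = f(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{f.__name__} took {elapsed:.6f} s")
        return result

    return wrap


@timed
def add(x, y):
    return x + y


total = add(1, 2)
```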