dl_data_pipeline.process_functions package

Submodules

dl_data_pipeline.process_functions.any_process module

dl_data_pipeline.process_functions.any_process.rescale(data: ndarray, min_value: float = 0.0, max_value: float = 1.0) → ndarray

Rescales the input data array to a specified range [min_value, max_value].

This function takes a NumPy array and rescales its values to fit within the specified minimum and maximum values. The rescaling is done by first shifting the data so that the minimum value becomes zero, then scaling it to the target range, and finally shifting it to start from the specified minimum value.

Parameters:
  • data (np.ndarray) – The input data array to be rescaled.

  • min_value (float, optional) – The minimum value of the target range. Defaults to 0.0.

  • max_value (float, optional) – The maximum value of the target range. Defaults to 1.0.

Returns:

The rescaled data array, with values in the range [min_value, max_value].

Return type:

np.ndarray

Raises:

ValueError – If data is constant (i.e., all elements are equal), which would lead to division by zero during rescaling.

Examples

>>> import numpy as np
>>> data = np.array([10, 20, 30, 40, 50])
>>> rescaled_data = rescale(data, min_value=-1, max_value=1)
>>> print(rescaled_data)
[-1.  -0.5  0.   0.5  1. ]
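The shift, scale, and shift steps described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the package's actual implementation: the name rescale_sketch and the exact constant-data check are hypothetical.

```python
import numpy as np

def rescale_sketch(data, min_value=0.0, max_value=1.0):
    """Hypothetical re-implementation of the documented rescaling steps."""
    data = np.asarray(data, dtype=float)
    data_min, data_max = data.min(), data.max()
    if data_min == data_max:
        # Constant input would divide by zero in the scaling step
        raise ValueError("data is constant and cannot be rescaled")
    # 1. Shift so the minimum becomes 0, and scale to [0, 1]
    unit = (data - data_min) / (data_max - data_min)
    # 2. Stretch to the target width, then shift to start at min_value
    return unit * (max_value - min_value) + min_value

rescaled = rescale_sketch(np.array([10, 20, 30, 40, 50]), min_value=-1, max_value=1)
print(rescaled)
```

With the input from the example above, the values come out as -1, -0.5, 0, 0.5 and 1.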

dl_data_pipeline.process_functions.process_1d module

dl_data_pipeline.process_functions.process_1d.center_pad_rcut(data: ndarray, desired_audio_length: int) → ndarray

Pad or cut the audio array so that the output has a length equal to desired_audio_length.

Parameters:
  • data (np.ndarray) – The input audio array.

  • desired_audio_length (int) – The target length for the audio.

Returns:

The correctly shaped audio array.

Return type:

np.ndarray

Examples

>>> data = np.array([[1, 2, 3], [4, 5, 6]])
>>> center_pad_rcut(data, 5)
array([[0., 1., 2., 3., 0.],
       [0., 4., 5., 6., 0.]])
>>> center_pad_rcut(data, 2)
array([[2, 3],
       [5, 6]])

dl_data_pipeline.process_functions.process_1d.lpad_lcut(data: ndarray, desired_audio_length: int) → ndarray

Pad or cut the audio array so that the output has a length equal to desired_audio_length.

Parameters:
  • data (np.ndarray) – The input audio array.

  • desired_audio_length (int) – The target length for the audio.

Returns:

The correctly shaped audio array.

Return type:

np.ndarray

Examples

>>> data = np.array([[1, 2, 3], [4, 5, 6]])
>>> lpad_lcut(data, 5)
array([[0., 0., 1., 2., 3.],
       [0., 0., 4., 5., 6.]])
>>> lpad_lcut(data, 2)
array([[2, 3],
       [5, 6]])

dl_data_pipeline.process_functions.process_1d.rpad_rcut(data: ndarray, desired_audio_length: int) → ndarray

Pad or cut the audio array so that the output has a length equal to desired_audio_length.

Parameters:
  • data (np.ndarray) – The input audio array.

  • desired_audio_length (int) – The target length for the audio.

Returns:

The correctly shaped audio array.

Return type:

np.ndarray

Examples

>>> data = np.array([[1, 2, 3], [4, 5, 6]])
>>> rpad_rcut(data, 5)
array([[1., 2., 3., 0., 0.],
       [4., 5., 6., 0., 0.]])
>>> rpad_rcut(data, 2)
array([[1, 2],
       [4, 5]])
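
The left and right pad-or-cut behaviour shown in the examples above can be approximated with NumPy alone. The sketch below is illustrative only: pad_or_cut_sketch and its side argument are hypothetical names, and it assumes the operation acts on the last axis and pads with zeros, as the examples suggest. It keeps the input dtype, which may differ from the package's output.

```python
import numpy as np

def pad_or_cut_sketch(data, desired_length, side="right"):
    """Hypothetical helper: zero-pad or cut the last axis on one side."""
    current = data.shape[-1]
    if current >= desired_length:
        if side == "right":
            # Cut on the right: keep the first samples
            return data[..., :desired_length]
        # Cut on the left: keep the last samples
        return data[..., current - desired_length:]
    missing = desired_length - current
    pad = (0, missing) if side == "right" else (missing, 0)
    # Pad only the last axis; leave every other axis untouched
    widths = [(0, 0)] * (data.ndim - 1) + [pad]
    return np.pad(data, widths, mode="constant", constant_values=0)

data = np.array([[1, 2, 3], [4, 5, 6]])
right_padded = pad_or_cut_sketch(data, 5, side="right")
left_padded = pad_or_cut_sketch(data, 5, side="left")
right_cut = pad_or_cut_sketch(data, 2, side="right")
left_cut = pad_or_cut_sketch(data, 2, side="left")
```

Under these assumptions, side="right" mirrors the rpad_rcut examples and side="left" mirrors the lpad_lcut examples.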

dl_data_pipeline.process_functions.process_2d module

dl_data_pipeline.process_functions.process_2d._reshape_array_for_pooling(data: ndarray, strides: int) → ndarray

Reshape the input data for pooling.

This function prepares a 2D array for pooling operations by reshaping the data into smaller blocks based on the given stride.

Parameters:
  • data (np.ndarray) – The input array representing the image or data to be pooled.

  • strides (int) – The stride size that determines the size of the blocks used for pooling.

Returns:

A reshaped array where the input data has been divided into blocks of shape (mh, strides, mw, strides, -1), where mh and mw are the dimensions after pooling.

Return type:

np.ndarray
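
To make the (mh, strides, mw, strides, -1) block layout concrete, here is a small NumPy sketch. It is an assumption-based illustration, not the package's code: reshape_for_pooling_sketch is a hypothetical name, and dropping rows or columns that do not fill a complete block is a guess at how incomplete blocks might be handled.

```python
import numpy as np

def reshape_for_pooling_sketch(data, strides):
    """Hypothetical helper: split an (H, W[, C]) array into strides x strides blocks."""
    height, width = data.shape[:2]
    mh, mw = height // strides, width // strides
    # Assumption: discard trailing rows/columns that do not fill a whole block
    cropped = data[: mh * strides, : mw * strides]
    # Axes 1 and 3 index positions inside each block; the last axis holds channels
    return cropped.reshape(mh, strides, mw, strides, -1)

image = np.arange(16).reshape(4, 4)
blocks = reshape_for_pooling_sketch(image, 2)
print(blocks.shape)  # (2, 2, 2, 2, 1)
top_left_block = blocks[0, :, 0, :, 0]
```

Here top_left_block holds the four values from the top-left 2x2 corner of the image, so reducing over axes 1 and 3 pools each block independently.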

dl_data_pipeline.process_functions.process_2d.any_pooling_2d(data: ndarray, strides: int = 2, *, pooling_function: Callable, axis_kw: str = 'axis') → ndarray

Apply a custom pooling operation to the input data.

This function allows for flexible pooling operations by accepting a custom pooling function. It reshapes the input data into blocks based on the specified stride, and then applies the given pooling function to the blocks.

Parameters:
  • data (np.ndarray) – The input array representing the image or data to be pooled.

  • strides (int, optional) – The stride size that determines the size of the blocks used for pooling. Defaults to 2.

  • pooling_function (Callable) – A function that takes the reshaped blocks of data and applies the desired pooling operation (e.g., max pooling, average pooling).

  • axis_kw (str, optional) – The keyword name for specifying the axis along which pooling should be applied in the custom pooling function. Defaults to “axis”.

Returns:

A 2D array where the custom pooling function has been applied, reducing the size of the input array based on the stride.

Return type:

np.ndarray
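
The role of pooling_function and axis_kw can be illustrated with a short sketch. It assumes the reduction runs over the two in-block axes of the reshaped array and that axis_kw is simply the keyword under which those axes are passed (useful, for instance, for a reduction that names its axis argument differently). The function pooling_2d_sketch is hypothetical and not the package's implementation.

```python
import numpy as np

def pooling_2d_sketch(data, strides=2, *, pooling_function, axis_kw="axis"):
    """Hypothetical block pooling: reshape into blocks, then reduce each block."""
    height, width = data.shape[:2]
    mh, mw = height // strides, width // strides
    blocks = data[: mh * strides, : mw * strides].reshape(mh, strides, mw, strides, -1)
    # Pass the in-block axes (1 and 3) under whatever keyword the function expects
    pooled = pooling_function(blocks, **{axis_kw: (1, 3)})
    # Assumption: a 2D input yields a 2D output; a 3D input keeps its channel axis
    return pooled.reshape(mh, mw) if data.ndim == 2 else pooled

image = np.arange(16, dtype=float).reshape(4, 4)
max_pooled = pooling_2d_sketch(image, pooling_function=np.max)
avg_pooled = pooling_2d_sketch(image, pooling_function=np.mean)
print(max_pooled)
print(avg_pooled)
```

In this sketch np.max and np.mean play the roles that max_pooling_2d and avg_pooling_2d presumably fill in the package; any reduction that accepts a tuple of axes could be substituted.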

dl_data_pipeline.process_functions.process_2d.avg_pooling_2d(data: ndarray, strides: int = 2) → ndarray

Apply 2D average pooling to the input data.

This function applies average pooling to the input 2D array, reducing its size by calculating the mean value from each block of data, based on the specified stride.

Parameters:
  • data (np.ndarray) – The input array representing the image or data to be pooled.

  • strides (int, optional) – The stride size that determines the size of the blocks used for pooling. Defaults to 2.

Returns:

A 2D array where average pooling has been applied, reducing the size of the input array based on the stride.

Return type:

np.ndarray

dl_data_pipeline.process_functions.process_2d.image_chw_to_hwc(data: ndarray) → ndarray

Converts an image from CHW (Channel-Height-Width) format to HWC (Height-Width-Channel) format.

Parameters:

data (np.ndarray) – The input image array in CHW format. The shape should be (channels, height, width).

Returns:

The image array in HWC format. The shape will be (height, width, channels).

Return type:

np.ndarray

dl_data_pipeline.process_functions.process_2d.image_hwc_to_chw(data: ndarray) → ndarray

Converts an image from HWC (Height-Width-Channel) format to CHW (Channel-Height-Width) format.

Parameters:

data (np.ndarray) – The input image array in HWC format. The shape should be (height, width, channels).

Returns:

The image array in CHW format. The shape will be (channels, height, width).

Return type:

np.ndarray
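
Both layout conversions amount to an axis permutation. The sketch below shows the equivalent np.transpose calls on a dummy array; it illustrates the documented shapes and is not necessarily how the package implements the two functions.

```python
import numpy as np

# Dummy image with 3 channels, height 4, width 5 in CHW layout
chw = np.arange(3 * 4 * 5).reshape(3, 4, 5)

# CHW -> HWC: move the channel axis to the end
hwc = np.transpose(chw, (1, 2, 0))
print(hwc.shape)  # (4, 5, 3)

# HWC -> CHW: move the channel axis back to the front
chw_again = np.transpose(hwc, (2, 0, 1))
print(chw_again.shape)  # (3, 4, 5)
```

Applying one permutation after the other returns the original array, so the two conversions are inverses of each other.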

dl_data_pipeline.process_functions.process_2d.image_to_channel_num(image: ndarray, channel_number_target: int = 3, fill_value: float | int = 0.0) → ndarray

Convert an image to the specified number of channels.

Parameters:
  • image (np.ndarray) – Input image array, which can be grayscale (2D), single-channel (3D), or multi-channel (3D).

  • channel_number_target (int, optional) – Target number of channels. Defaults to 3.

  • fill_value (float | int, optional) – Value used to fill new channels if needed. Defaults to 0.0.

Returns:

Image array with the specified number of channels.

Return type:

np.ndarray

Examples

>>> image = np.array([[1, 2], [3, 4]])
>>> image_to_channel_num(image, channel_number_target=3)
array([[[1, 1, 1],
        [2, 2, 2]],
       [[3, 3, 3],
        [4, 4, 4]]])

dl_data_pipeline.process_functions.process_2d.max_pooling_2d(data: ndarray, strides: int = 2) → ndarray

Apply 2D max pooling to the input data.

This function applies max pooling to the input 2D array, reducing its size by selecting the maximum value from each block of data, based on the specified stride.

Parameters:
  • data (np.ndarray) – The input array representing the image or data to be pooled.

  • strides (int, optional) – The stride size that determines the size of the blocks used for pooling. Defaults to 2.

Returns:

A 2D array where max pooling has been applied, reducing the size of the input array based on the stride.

Return type:

np.ndarray

dl_data_pipeline.process_functions.process_2d.open_rgb_image(path: str) → ndarray

Open an image using cv2 and convert it from OpenCV's default BGR channel order to RGB.

Parameters:

path (str) – Path to the image file.

Returns:

Array representing the image in RGB channel order.

Return type:

np.ndarray

Examples

>>> img = open_rgb_image('path/to/image.jpg')
>>> img.shape
(height, width, 3)

dl_data_pipeline.process_functions.process_2d.padding_2d(data: ndarray, target_shape: Tuple[int, int], fill_value: float = 1.0) → ndarray

Pads a 2D (or 3D) array to the target shape with the specified fill value.

Parameters:
  • data (np.ndarray) – The input 2D (or 3D) array representing an image.

  • target_shape (tuple) – The desired shape of the output array (height, width).

  • fill_value (float, optional) – The value used for padding. Defaults to 1.0.

Returns:

The padded array with the target shape.

Return type:

np.ndarray

Raises:

ValueError – If the input data is larger than the target shape, or if the input data is not a 2D or 3D array.

Examples

>>> data = np.array([[1, 2], [3, 4]])
>>> padding_2d(data, (4, 4), fill_value=0)
array([[0., 0., 0., 0.],
       [0., 1., 2., 0.],
       [0., 3., 4., 0.],
       [0., 0., 0., 0.]])
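
The centered padding shown in the example above can be sketched as follows. This is an illustration under assumptions, not the package's code: center_pad_sketch is a hypothetical name, the output is always a float array, and when the padding cannot be split evenly the extra row or column is placed at the bottom or right, which the documentation does not specify.

```python
import numpy as np

def center_pad_sketch(data, target_shape, fill_value=1.0):
    """Hypothetical helper: place data in the middle of a filled canvas."""
    if data.ndim not in (2, 3):
        raise ValueError("data must be a 2D or 3D array")
    height, width = data.shape[:2]
    target_h, target_w = target_shape
    if height > target_h or width > target_w:
        raise ValueError("data is larger than the target shape")
    # Assumption: any odd leftover goes to the bottom/right side
    top = (target_h - height) // 2
    left = (target_w - width) // 2
    # Keep a trailing channel axis if the input has one
    canvas = np.full((target_h, target_w) + data.shape[2:], fill_value, dtype=float)
    canvas[top : top + height, left : left + width] = data
    return canvas

padded = center_pad_sketch(np.array([[1, 2], [3, 4]]), (4, 4), fill_value=0)
print(padded)
```

For the 2x2 input above, the sketch yields the same 4x4 result as the documented example.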

dl_data_pipeline.process_functions.process_2d.resize_with_max_distortion(data: ndarray, target_shape: Tuple[int, int], max_ratio_distortion: float) → ndarray

Resizes the input 2D or 3D array (image) to the target shape with a constraint on maximum allowable distortion.

This function resizes an image (or any 2D/3D array) to a specified target shape while controlling the amount of distortion (change in aspect ratio) allowed during the resizing process. If the distortion exceeds the specified max_ratio_distortion, the function adjusts the stretch ratios accordingly to minimize distortion.

Parameters:
  • data (np.ndarray) – The input 2D or 3D array to be resized. Typically, this represents an image.

  • target_shape (Tuple[int, int]) – The desired target shape (height, width) for the output array.

  • max_ratio_distortion (float) – The maximum allowable difference between the horizontal and vertical stretch ratios. This controls how much the aspect ratio can change during resizing. A value of 0 preserves the aspect ratio.

Returns:

The resized array that fits within the specified target shape.

Return type:

np.ndarray

Raises:

ValueError – If the input data is not a 2D or 3D array.
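
One way to read the distortion constraint is as a clamp on the difference between the vertical and horizontal stretch ratios. The sketch below only computes such ratios; it does not resize anything. The clamping rule and the name constrained_ratios_sketch are assumptions made for illustration, and the package's actual adjustment may differ.

```python
def constrained_ratios_sketch(input_shape, target_shape, max_ratio_distortion):
    """Hypothetical helper: stretch ratios whose difference is capped."""
    height, width = input_shape
    target_h, target_w = target_shape
    ratio_h = target_h / height
    ratio_w = target_w / width
    if abs(ratio_h - ratio_w) > max_ratio_distortion:
        # Assumption: shrink the larger ratio so the result still fits the target
        smaller = min(ratio_h, ratio_w)
        ratio_h = min(ratio_h, smaller + max_ratio_distortion)
        ratio_w = min(ratio_w, smaller + max_ratio_distortion)
    return ratio_h, ratio_w

# A 100x200 input resized toward a 200x200 target
no_distortion = constrained_ratios_sketch((100, 200), (200, 200), 0.0)
some_distortion = constrained_ratios_sketch((100, 200), (200, 200), 0.5)
full_stretch = constrained_ratios_sketch((100, 200), (200, 200), 2.0)
```

With a limit of 0 both ratios are equal, so the aspect ratio is kept and the output would be smaller than the target along one axis; a generous limit lets the image stretch to fill the target completely.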

Module contents

This module provides a collection of basic preprocessing functions, organized into three submodules.

The functions in this module are commonly used in data processing workflows, particularly in the contexts of machine learning, data analysis, and image processing. These preprocessing steps are crucial for preparing data before it is fed into models or further analyzed.

Submodules:

any_process:

Contains general-purpose preprocessing functions that can be applied to a variety of data types.

process_1d:

Focuses on preprocessing functions specifically designed for one-dimensional data, such as time series or signal data. The expected shape for this data is (input_dim, channel_number).

process_2d:

Specializes in preprocessing functions for two-dimensional data, primarily images, including resizing, padding, and pooling operations. The expected shape for this data is (input_dim1, input_dim2, channel_number).

These submodules provide a comprehensive set of tools for handling different types of data, ensuring that they are in the optimal format and condition for downstream tasks.