Releases: karhunenloeve/NTOPL
NTOPLv.1.0
Estimate of the Neural Network Dimension Using Algebraic Topology and Lie Theory
- This is the link to the arXiv article.
- This is the link to the slides for the talk given at the IMTA-7.
- This is the link to the transcript of the talk given at the IMTA-7.
In this paper we present an approach to determine the smallest possible number of perceptrons in a neural net in such a way that the topology of the input space can be learned sufficiently well. We introduce a general procedure based on persistent homology to investigate topological invariants of the manifold on which we suspect the data set lies. We specify the required dimensions precisely, assuming that there is a smooth manifold on or near which the data are located. Furthermore, we require that this space is connected and has a commutative group structure in the mathematical sense. These assumptions allow us to derive a decomposition of the underlying space whose topology is well known. We use the representatives of the k-dimensional homology groups from the persistence landscape to determine an integer dimension for this decomposition. This number is the dimension of the embedding that is capable of capturing the topology of the data manifold. We derive the theory and validate it experimentally on toy data sets.
Keywords: Embedding Dimension, Parameterization, Persistent Homology, Neural Networks and Manifold Learning.
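The step of computing persistent homology and counting representatives per homology dimension can be sketched with GUDHI (an assumption; the repository ships its own routines in countHomgroups.py and persistenceLandscapes.py). The lifetime threshold below and the omitted final step of turning these counts into an embedding dimension are simplifications of the procedure described in the paper:

```python
import numpy as np
import gudhi

# Sample a toy data set from a circle, a stand-in for the data manifold.
theta = np.random.uniform(0.0, 2.0 * np.pi, 200)
points = np.column_stack([np.cos(theta), np.sin(theta)])

# Vietoris-Rips filtration and persistent homology up to dimension 2.
rips = gudhi.RipsComplex(points=points, max_edge_length=2.0)
simplex_tree = rips.create_simplex_tree(max_dimension=2)
diagram = simplex_tree.persistence()

# Count the classes per homology dimension whose lifetime exceeds a noise threshold;
# these counts play the role of the representatives of the k-dimensional homology groups.
threshold = 0.5
representatives = {}
for dim, (birth, death) in diagram:
    lifetime = float('inf') if death == float('inf') else death - birth
    if lifetime > threshold:
        representatives[dim] = representatives.get(dim, 0) + 1

print(representatives)  # e.g. {0: 1, 1: 1} for a circle
```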
Citation
@inproceedings{imta7/MelodiaL21,
author = {Luciano Melodia and
Richard Lenz},
editor = {Del Bimbo, A. and
Cucchiara, R. and
Sclaroff, S. and
Farinella, G. M. and
Mei, T. and
Bertini, M. and
Escalante, H. J. and
Vezzani, R.},
title = {Estimate of the Neural Network Dimension using Algebraic Topology and Lie Theory},
booktitle = {Pattern Recognition. ICPR International Workshops and Challenges, {IMTA VII}
2021, Milano, Italy, January 11, 2021, Proceedings},
series = {Lecture Notes in Computer Science},
volume = {12665},
pages = {15--29},
publisher = {Springer},
year = {2021},
url = {https://doi.org/10.1007/978-3-030-68821-9_2},
doi = {10.1007/978-3-030-68821-9_2},
}
Content
- Invertible autoencoders: `autoencoderInvertible.py`
  - Remove tensor elements
  - Get prime factors
  - Load example Keras datasets
  - Add Gaussian noise to data
  - Crop tensor elements
  - Create a group of convolutional layers
  - Loop over a group of convolutional layers
  - Invertible Keras neural network layer
  - Convert dimensions into 2D-convolution
  - Embedded invertible autoencoder model
- Count representatives from homology groups: `countHomgroups.py`
- Persistence landscapes: `persistenceLandscapes.py`
- Persistence statistics: `persistenceStatistics.py`
imageAutoencode
take_out_element
take_out_element(k: tuple, r) -> tuple
A function taking out specific values.
- param k: tuple object to be processed, type `tuple`.
- param r: value to be removed, type `int, float, string, None`.
- return k2: cropped tuple object, type `tuple`.
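A minimal sketch of such a helper, inferred from the signature above (the repository's own implementation may differ):

```python
def take_out_element(k: tuple, r) -> tuple:
    """A function taking out specific values: drop every occurrence of r from k."""
    k2 = tuple(element for element in k if element != r)
    return k2

# Example: remove the None placeholder from a shape tuple.
print(take_out_element((28, 28, None, 1), None))  # (28, 28, 1)
```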
primeFactors
primeFactors(n)
A function that returns the prime factors of an integer.
- param n: an integer, type `int`.
- return factors: a list of prime factors, type `list`.
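A sketch of a standard trial-division implementation matching this description (not necessarily the repository's exact code):

```python
def primeFactors(n: int) -> list:
    """Return the prime factors of n, with multiplicity, in ascending order."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:  # whatever remains is itself prime
        factors.append(n)
    return factors

# Example: 360 = 2 * 2 * 2 * 3 * 3 * 5.
print(primeFactors(360))  # [2, 2, 2, 3, 3, 5]
```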
load_data_keras
load_data_keras(dimensions: tuple, factor: float = 255.0, dataset: str = 'mnist') -> tuple
A utility function to load datasets.
This function helps to load particular datasets ready for processing with convolutional
or dense autoencoders. It depends on the specified shape (the input dimensions). This function
is for validation purposes and works for Keras datasets only.
Supported datasets are `mnist` (default), `cifar10`, `cifar100` and `boston_housing`.
The shapes: `mnist (28,28,1)`, `cifar10 (32,32,3)`, `cifar100 (32,32,3)`.
- param dimensions: dimension of the data, type `tuple`.
- param factor: division factor, default is `255`, type `float`.
- param dataset: Keras dataset, default is `mnist`, type `str`.
- return X_train, X_test, input_image: training data, test data and input image, type `tuple`.
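A hypothetical usage example; the import path `autoencoderInvertible` is taken from the file listing above and the unpacking follows the documented return value:

```python
# Assumed import path; adjust to the actual module location in the repository.
from autoencoderInvertible import load_data_keras

# Load MNIST, rescaled to [0, 1] by dividing through 255.
X_train, X_test, input_image = load_data_keras(
    dimensions=(28, 28, 1),
    factor=255.0,
    dataset='mnist',
)
print(X_train.shape, X_test.shape)
```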
add_gaussian_noise
add_gaussian_noise(data: numpy.ndarray, noise_factor: float = 0.5, mean: float = 0.0, std: float = 1.0) -> numpy.ndarray
A utility function to add Gaussian noise to data.
The purpose of this function is to validate certain models under Gaussian noise.
The noise can be adjusted by changing the mean, the standard deviation and the amount of
noise added.
- param noise_factor: amount of noise in percent, type `float`.
- param data: dataset, type `np.ndarray`.
- param mean: mean, type `float`.
- param std: standard deviation, type `float`.
- return x_train_noisy: noisy data, type `np.ndarray`.
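A minimal sketch consistent with the signature above, assuming the data are scaled to [0, 1] (as produced by `load_data_keras`), so the noisy result is clipped back into that range:

```python
import numpy as np

def add_gaussian_noise(data: np.ndarray, noise_factor: float = 0.5,
                       mean: float = 0.0, std: float = 1.0) -> np.ndarray:
    """Add scaled Gaussian noise to the data and clip to the valid [0, 1] range."""
    noise = np.random.normal(loc=mean, scale=std, size=data.shape)
    x_train_noisy = data + noise_factor * noise
    return np.clip(x_train_noisy, 0.0, 1.0)

# Example: corrupt a small random image batch with 30% noise.
clean = np.random.rand(4, 28, 28, 1)
noisy = add_gaussian_noise(clean, noise_factor=0.3)
```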
crop_tensor
crop_tensor(dimension: int, start: int, end: int) -> Callable
A utility function cropping a tensor along a given dimension.
The purpose of this function is to be used for multivariate cropping and to serve
as a procedure for the invertible autoencoders, which need cropping to make the
matrices trivially invertible, as can be seen in the Real NVP architecture.
This procedure works up to dimension `4`.
- param dimension: the dimension of cropping, type `int`.
- param start: starting index for cropping, type `int`.
- param end: ending index for cropping, type `int`.
- return Lambda(func): Lambda function on the tensor, type `Callable`.
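A minimal sketch, assuming the returned callable is a Keras `Lambda` layer that slices along the requested axis (the repository's implementation may differ in details):

```python
from tensorflow.keras.layers import Lambda

def crop_tensor(dimension: int, start: int, end: int):
    """Return a Lambda layer slicing a tensor along one dimension (up to 4)."""
    def func(x):
        if dimension == 0:
            return x[start:end]
        if dimension == 1:
            return x[:, start:end]
        if dimension == 2:
            return x[:, :, start:end]
        if dimension == 3:
            return x[:, :, :, start:end]
        if dimension == 4:
            return x[:, :, :, :, start:end]
        raise ValueError('crop_tensor only supports dimensions 0 to 4.')
    return Lambda(func)

# Example: split the channel axis of a (batch, 28, 28, 2) tensor into two halves,
# as needed for the trivially invertible coupling in Real NVP style models.
# first_half  = crop_tensor(3, 0, 1)(features)
# second_half = crop_tensor(3, 1, 2)(features)
```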
convolutional_group
convolutional_group(_input: numpy.ndarray, filterNumber: int, alpha: float = 5.5, kernelSize: tuple = (2, 2), kernelInitializer: str = 'uniform', padding: str = 'same', useBias: bool = True, biasInitializer: str = 'zeros')
This group can be extended for deep learning models and is a sequence of convolutional layers.
The convolution is a `2D` convolution and uses a `LeakyReLU` activation function. After the
activation function, batch normalization is performed by default to take care of the covariate
shift. By default the padding is set to `same` to avoid difficulties with the convolution.
- param _input: data from the previous convolutional layer, type `np.ndarray`.
- param filterNumber: multiple of the filters per layer, type `int`.
- param alpha: parameter for the `LeakyReLU` activation function, default `5.5`, type `float`.
- param kernelSize: size of the `2D` kernel, default `(2,2)`, type `tuple`.
- param kernelInitializer: Keras kernel initializer, default `uniform`, type `str`.
- param padding: padding for convolution, default `same`, type `str`.
- param useBias: whether or not to use the bias term throughout the network, type `bool`.
- param biasInitializer: initializing distribution of the bias values, type `str`.
- return data: data processed by the neural layers, type `np.ndarray`.
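A minimal sketch of one such block, assuming TensorFlow/Keras 2.x; only a single Conv2D, LeakyReLU and BatchNormalization step is shown, whereas the repository's group may chain several of them:

```python
from tensorflow.keras.layers import Conv2D, LeakyReLU, BatchNormalization

def convolutional_group(_input, filterNumber: int, alpha: float = 5.5,
                        kernelSize: tuple = (2, 2), kernelInitializer: str = 'uniform',
                        padding: str = 'same', useBias: bool = True,
                        biasInitializer: str = 'zeros'):
    """Convolution, LeakyReLU activation and batch normalization, in that order."""
    data = Conv2D(filterNumber, kernelSize, padding=padding,
                  kernel_initializer=kernelInitializer, use_bias=useBias,
                  bias_initializer=biasInitializer)(_input)
    data = LeakyReLU(alpha=alpha)(data)
    data = BatchNormalization()(data)
    return data
```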
loop_group
loop_group(group: Callable, groupLayers: int, element: numpy.ndarray, filterNumber: int, kernelSize: tuple, useBias: bool = True, kernelInitializer: str = 'uniform', biasInitializer: str = 'zeros') -> numpy.ndarray
This callable is a loop over a group specification.
The neural embedding always ends with dimension `1` in the color channel. For other
specifications use the parameter `colorChannel`. The function operates on every Keras
group of layers using the same parameter set as the `2D` convolution.
- param group: a callable that sets up the neural architecture, type `Callable`.
- param groupLayers: depth of the neural network, type `int`.
- param element: data, type `np.ndarray`.
- param filterNumber: number of filters as an exponential of `2`, type `int`.
- param kernelSize: size of the kernels, type `tuple`.
- param useBias: whether or not to use the bias term throughout the network, type `bool`.
- param biasInitializer: initializing distribution of the bias values, type `str`.
- return data: data processed by the neural network, type `np.ndarray`.
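A minimal sketch of the loop, assuming the filter count grows as a power of two with depth (the exact schedule in the repository may differ), shown here against the `convolutional_group` sketch above:

```python
def loop_group(group, groupLayers: int, element, filterNumber: int,
               kernelSize: tuple, useBias: bool = True,
               kernelInitializer: str = 'uniform',
               biasInitializer: str = 'zeros'):
    """Apply the given group of layers groupLayers times to the data."""
    data = element
    for i in range(groupLayers):
        # filterNumber is interpreted as an exponent of 2, increased per depth step.
        data = group(data, filterNumber=2 ** (filterNumber + i),
                     kernelSize=kernelSize, useBias=useBias,
                     kernelInitializer=kernelInitializer,
                     biasInitializer=biasInitializer)
    return data
```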
invertible_layer
invertible_layer(data: numpy.ndarray, alpha: float = 5.5, kernelSize: tuple = (2, 2), kernelInitializer: str = 'uniform', groupLayers: int = 6, filterNumber: int = 2, croppingFactor: int = 4, useBias: bool = True, biasInitializer: str = 'zeros') -> numpy.ndarray
Returns an invertible neural network layer.
This neural network layer learns invertible subspaces, parameterized by higher dimensional
functions with a trivial invertibility. The higher dimensional functions are also neural
subnetworks, trained during learn...