Releases: karhunenloeve/NTOPL
NTOPLv.1.0
Estimate of the Neural Network Dimension Using Algebraic Topology and Lie Theory
- This is the link to the arXiv article.
- This is the link to the slides for the talk given at the IMTA-7.
- This is the link to the transcript of the talk given at the IMTA-7.
In this paper we present an approach to determine the smallest possible number of perceptrons in a neural net in such a way that the topology of the input space can be learned sufficiently well. We introduce a general procedure based on persistent homology to investigate topological invariants of the manifold on which we suspect the data set lies. We specify the required dimensions precisely, assuming that there is a smooth manifold on or near which the data are located. Furthermore, we require that this space is connected and has a commutative group structure in the mathematical sense. These assumptions allow us to derive a decomposition of the underlying space whose topology is well known. We use the representatives of the k-dimensional homology groups from the persistence landscape to determine an integer dimension for this decomposition. This number is the dimension of the embedding that is capable of capturing the topology of the data manifold. We derive the theory and validate it experimentally on toy data sets.
Keywords: Embedding Dimension, Parameterization, Persistent Homology, Neural Networks and Manifold Learning.
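The step of computing persistent homology and counting representatives per homology dimension can be sketched with GUDHI (an assumption; the repository ships its own routines in countHomgroups.py and persistenceLandscapes.py). The lifetime threshold below and the omitted final step of turning these counts into an embedding dimension are simplifications of the procedure described in the paper:

```python
import numpy as np
import gudhi

# Sample a toy data set from a circle, a stand-in for the data manifold.
theta = np.random.uniform(0.0, 2.0 * np.pi, 200)
points = np.column_stack([np.cos(theta), np.sin(theta)])

# Vietoris-Rips filtration and persistent homology up to dimension 2.
rips = gudhi.RipsComplex(points=points, max_edge_length=2.0)
simplex_tree = rips.create_simplex_tree(max_dimension=2)
diagram = simplex_tree.persistence()

# Count the classes per homology dimension whose lifetime exceeds a noise threshold;
# these counts play the role of the representatives of the k-dimensional homology groups.
threshold = 0.5
representatives = {}
for dim, (birth, death) in diagram:
    lifetime = float('inf') if death == float('inf') else death - birth
    if lifetime > threshold:
        representatives[dim] = representatives.get(dim, 0) + 1

print(representatives)  # e.g. {0: 1, 1: 1} for a circle
```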
Citation
@inproceedings{imta7/MelodiaL21,
author = {Luciano Melodia and
Richard Lenz},
editor = {Del Bimbo, A. and
Cucchiara, R. and
Sclaroff, S. and
Farinella, G. M. and
Mei, T. and
Bertini, M. and
Escalante, H. J. and
Vezzani, R.},
title = {Estimate of the Neural Network Dimension using Algebraic Topology and Lie Theory},
booktitle = {Pattern Recognition. ICPR International Workshops and Challenges, {IMTA VII}
2021, Milano, Italy, January 11, 2021, Proceedings},
series = {Lecture Notes in Computer Science},
volume = {12665},
pages = {15--29},
publisher = {Springer},
year = {2021},
url = {https://doi.org/10.1007/978-3-030-68821-9_2},
doi = {10.1007/978-3-030-68821-9_2},
}
Content
- Invertible autoencoders: `autoencoderInvertible.py`
  - Remove tensor elements
  - Get prime factors
  - Load example Keras datasets
  - Add Gaussian noise to data
  - Crop tensor elements
  - Create a group of convolutional layers
  - Loop over a group of convolutional layers
  - Invertible Keras neural network layer
  - Convert dimensions into 2D-convolution
  - Embedded invertible autoencoder model
- Count representatives from homology groups: `countHomgroups.py`
- Persistence landscapes: `persistenceLandscapes.py`
- Persistence statistics: `persistenceStatistics.py`
imageAutoencode
take_out_element
take_out_element(k: tuple, r) -> tuple
A function taking out specific values.
- param k: tuple object to be processed, type `tuple`.
- param r: value to be removed, type `int, float, string, None`.
- return k2: cropped tuple object, type `tuple`.
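A minimal sketch of such a helper, inferred from the signature above (the repository's own implementation may differ):

```python
def take_out_element(k: tuple, r) -> tuple:
    """A function taking out specific values: drop every occurrence of r from k."""
    k2 = tuple(element for element in k if element != r)
    return k2

# Example: remove the None placeholder from a shape tuple.
print(take_out_element((28, 28, None, 1), None))  # (28, 28, 1)
```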
primeFactors
primeFactors(n)
A function that returns the prime factors of an integer.
- param n: an integer, type `int`.
- return factors: a list of prime factors, type `list`.
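A sketch of a standard trial-division implementation matching this description (not necessarily the repository's exact code):

```python
def primeFactors(n: int) -> list:
    """Return the prime factors of n, with multiplicity, in ascending order."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:  # whatever remains is itself prime
        factors.append(n)
    return factors

# Example: 360 = 2 * 2 * 2 * 3 * 3 * 5.
print(primeFactors(360))  # [2, 2, 2, 3, 3, 5]
```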
load_data_keras
load_data_keras(dimensions: tuple, factor: float = 255.0, dataset: str = 'mnist') -> tuple
A utility function to load datasets.
This function helps to load particular datasets ready for processing with convolutional
or dense autoencoders. It depends on the specified shape (the input dimensions). This function
is for validation purposes and works for Keras datasets only.
Supported datasets are `mnist` (default), `cifar10`, `cifar100` and `boston_housing`.
The shapes: `mnist (28,28,1)`, `cifar10 (32,32,3)`, `cifar100 (32,32,3)`.
- param dimensions: dimension of the data, type `tuple`.
- param factor: division factor, default is `255`, type `float`.
- param dataset: Keras dataset, default is `mnist`, type `str`.
- return X_train, X_test, input_image: training data, test data and input image, type `tuple`.
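A hypothetical usage example; the import path `autoencoderInvertible` is taken from the file listing above and the unpacking follows the documented return value:

```python
# Assumed import path; adjust to the actual module location in the repository.
from autoencoderInvertible import load_data_keras

# Load MNIST, rescaled to [0, 1] by dividing through 255.
X_train, X_test, input_image = load_data_keras(
    dimensions=(28, 28, 1),
    factor=255.0,
    dataset='mnist',
)
print(X_train.shape, X_test.shape)
```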
add_gaussian_noise
add_gaussian_noise(data: numpy.ndarray, noise_factor: float = 0.5, mean: float = 0.0, std: float = 1.0) -> numpy.ndarray
A utility function to add Gaussian noise to data.
The purpose of this function is to validate certain models under Gaussian noise.
The noise can be adjusted by changing the mean, the standard deviation and the amount of
noise added.
- param noise_factor: amount of noise in percent, type `float`.
- param data: dataset, type `np.ndarray`.
- param mean: mean, type `float`.
- param std: standard deviation, type `float`.
- return x_train_noisy: noisy data, type `np.ndarray`.
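A minimal sketch consistent with the signature above, assuming the data are scaled to [0, 1] (as produced by `load_data_keras`), so the noisy result is clipped back into that range:

```python
import numpy as np

def add_gaussian_noise(data: np.ndarray, noise_factor: float = 0.5,
                       mean: float = 0.0, std: float = 1.0) -> np.ndarray:
    """Add scaled Gaussian noise to the data and clip to the valid [0, 1] range."""
    noise = np.random.normal(loc=mean, scale=std, size=data.shape)
    x_train_noisy = data + noise_factor * noise
    return np.clip(x_train_noisy, 0.0, 1.0)

# Example: corrupt a small random image batch with 30% noise.
clean = np.random.rand(4, 28, 28, 1)
noisy = add_gaussian_noise(clean, noise_factor=0.3)
```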
crop_tensor
crop_tensor(dimension: int, start: int, end: int) -> Callable
A utility function cropping a tensor along a given dimension.
The purpose of this function is to be used for multivariate cropping and to serve
as a procedure for the invertible autoencoders, which need cropping to make the
matrices trivially invertible, as can be seen in the Real NVP architecture.
This procedure works up to dimension `4`.
- param dimension: the dimension of cropping, type `int`.
- param start: starting index for cropping, type `int`.
- param end: ending index for cropping, type `int`.
- return Lambda(func): Lambda function on the tensor, type `Callable`.
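A minimal sketch, assuming the returned callable is a Keras `Lambda` layer that slices along the requested axis (the repository's implementation may differ in details):

```python
from tensorflow.keras.layers import Lambda

def crop_tensor(dimension: int, start: int, end: int):
    """Return a Lambda layer slicing a tensor along one dimension (up to 4)."""
    def func(x):
        if dimension == 0:
            return x[start:end]
        if dimension == 1:
            return x[:, start:end]
        if dimension == 2:
            return x[:, :, start:end]
        if dimension == 3:
            return x[:, :, :, start:end]
        if dimension == 4:
            return x[:, :, :, :, start:end]
        raise ValueError('crop_tensor only supports dimensions 0 to 4.')
    return Lambda(func)

# Example: split the channel axis of a (batch, 28, 28, 2) tensor into two halves,
# as needed for the trivially invertible coupling in Real NVP style models.
# first_half  = crop_tensor(3, 0, 1)(features)
# second_half = crop_tensor(3, 1, 2)(features)
```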
convolutional_group
convolutional_group(_input: numpy.ndarray, filterNumber: int, alpha: float = 5.5, kernelSize: tuple = (2, 2), kernelInitializer: str = 'uniform', padding: str = 'same', useBias: bool = True, biasInitializer: str = 'zeros')
This group can be extended for deep learning models and is a sequence of convolutional layers.
The convolution is a `2D` convolution and uses a `LeakyReLU` activation function. After the
activation function, batch normalization is performed by default to take care of the covariate
shift. By default the padding is set to `same` to avoid difficulties with the convolution.
- param _input: data from the previous convolutional layer, type `np.ndarray`.
- param filterNumber: multiple of the filters per layer, type `int`.
- param alpha: parameter for the `LeakyReLU` activation function, default `5.5`, type `float`.
- param kernelSize: size of the `2D` kernel, default `(2,2)`, type `tuple`.
- param kernelInitializer: Keras kernel initializer, default `uniform`, type `str`.
- param padding: padding for convolution, default `same`, type `str`.
- param useBias: whether or not to use the bias term throughout the network, type `bool`.
- param biasInitializer: initializing distribution of the bias values, type `str`.
- return data: data processed by the neural layers, type `np.ndarray`.
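A minimal sketch of one such block, assuming TensorFlow/Keras 2.x; only a single Conv2D, LeakyReLU and BatchNormalization step is shown, whereas the repository's group may chain several of them:

```python
from tensorflow.keras.layers import Conv2D, LeakyReLU, BatchNormalization

def convolutional_group(_input, filterNumber: int, alpha: float = 5.5,
                        kernelSize: tuple = (2, 2), kernelInitializer: str = 'uniform',
                        padding: str = 'same', useBias: bool = True,
                        biasInitializer: str = 'zeros'):
    """Convolution, LeakyReLU activation and batch normalization, in that order."""
    data = Conv2D(filterNumber, kernelSize, padding=padding,
                  kernel_initializer=kernelInitializer, use_bias=useBias,
                  bias_initializer=biasInitializer)(_input)
    data = LeakyReLU(alpha=alpha)(data)
    data = BatchNormalization()(data)
    return data
```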
loop_group
loop_group(group: Callable, groupLayers: int, element: numpy.ndarray, filterNumber: int, kernelSize: tuple, useBias: bool = True, kernelInitializer: str = 'uniform', biasInitializer: str = 'zeros') -> numpy.ndarray
This callable is a loop over a group specification.
The neural embedding always ends with dimension `1` in the color channel. For other
specifications use the parameter `colorChannel`. The function operates on every Keras
group of layers using the same parameter set as the `2D` convolution.
- param group: a callable that sets up the neural architecture, type `Callable`.
- param groupLayers: depth of the neural network, type `int`.
- param element: data, type `np.ndarray`.
- param filterNumber: number of filters as an exponential of `2`, type `int`.
- param kernelSize: size of the kernels, type `tuple`.
- param useBias: whether or not to use the bias term throughout the network, type `bool`.
- param biasInitializer: initializing distribution of the bias values, type `str`.
- return data: data processed by the neural network, type `np.ndarray`.
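A minimal sketch of the loop, assuming the filter count grows as a power of two with depth (the exact schedule in the repository may differ), shown here against the `convolutional_group` sketch above:

```python
def loop_group(group, groupLayers: int, element, filterNumber: int,
               kernelSize: tuple, useBias: bool = True,
               kernelInitializer: str = 'uniform',
               biasInitializer: str = 'zeros'):
    """Apply the given group of layers groupLayers times to the data."""
    data = element
    for i in range(groupLayers):
        # filterNumber is interpreted as an exponent of 2, increased per depth step.
        data = group(data, filterNumber=2 ** (filterNumber + i),
                     kernelSize=kernelSize, useBias=useBias,
                     kernelInitializer=kernelInitializer,
                     biasInitializer=biasInitializer)
    return data
```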
invertible_layer
invertible_layer(data: numpy.ndarray, alpha: float = 5.5, kernelSize: tuple = (2, 2), kernelInitializer: str = 'uniform', groupLayers: int = 6, filterNumber: int = 2, croppingFactor: int = 4, useBias: bool = True, biasInitializer: str = 'zeros') -> numpy.ndarray
Returns an invertible neural network layer.
This neural network layer learns invertible subspaces, parameterized by higher dimensional
functions with a trivial invertibility. The higher dimensional functions are also neural
subnetworks, trained during learn...