Colloquium: Tuesday January 31, 2023. Speaker: Ido Nachum (EPFL). Title: “A Johnson-Lindenstrauss Framework for Randomly Initialized CNNs”.

The link for the meetings is:

https://us02web.zoom.us/j/83337601824

Speaker : Ido Nachum (EPFL)

Date : Tuesday, 31st of January, 2023.

Time : 14:00

Title: A Johnson–Lindenstrauss Framework for Randomly Initialized CNNs.

Abstract: Fix a dataset $\{ x_i \}_{i=1}^n \subset \mathbb{R}^d$. The celebrated Johnson–Lindenstrauss (JL) lemma shows that a random projection preserves geometry: with high probability, $\langle x_i,x_j \rangle \approx \langle W \cdot x_i , W \cdot x_j \rangle$.
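As a minimal numerical sketch of this statement (the dimensions and the $\mathcal{N}(0,1/k)$ scaling of $W$ are illustrative assumptions, not prescribed here), one can draw a random Gaussian matrix and compare inner products before and after projection:

```python
import numpy as np

# JL sketch: project n points from R^d to R^k with a random Gaussian matrix W
# and check that pairwise inner products are approximately preserved.
rng = np.random.default_rng(0)
n, d, k = 20, 2000, 400                       # illustrative sizes
X = rng.normal(size=(n, d))                   # the dataset {x_i}

W = rng.normal(scale=1.0 / np.sqrt(k), size=(k, d))
Y = X @ W.T                                   # row i is W @ x_i

orig = X @ X.T                                # <x_i, x_j>
proj = Y @ Y.T                                # <W x_i, W x_j>
norms = np.linalg.norm(X, axis=1)
deviation = np.abs(proj - orig) / np.outer(norms, norms)
print("max inner-product deviation (relative to |x_i||x_j|):", deviation.max())
```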

How does the JL lemma relate to neural networks?

A neural network is a sequential application of a projection that is followed by a non-linearity $\sigma: \mathbb{R} \rightarrow \mathbb{R}$ (applied coordinate-wise). For example, a fully connected network (FNN) with two hidden layers has the form $N(x)=W_3 \cdot \sigma( W_2 \cdot \sigma( W_1 \cdot x ) )$. By the JL lemma, any layer of a random linear FNN ($\sigma(x)=x$) essentially preserves the original geometry of the dataset. How does the quantity $\langle \sigma( W \cdot x_i ) , \sigma( W \cdot x_j ) \rangle$ change with other non-linearities, or with a convolution ($*$) instead of matrix multiplication ($\cdot$)?
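This quantity can be probed numerically. The sketch below compares the angle between two Gaussian inputs after a random linear layer, a random ReLU layer, and a single-channel random circular convolution with ReLU (the width, the single filter, and the Gaussian initialization are assumptions made only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 512, 8192                               # input dimension, layer width (illustrative)
x, y = rng.normal(size=d), rng.normal(size=d)

relu = lambda t: np.maximum(t, 0.0)
angle = lambda u, v: np.degrees(np.arccos(u @ v / (np.linalg.norm(u) * np.linalg.norm(v))))

W = rng.normal(scale=1.0 / np.sqrt(d), size=(k, d))   # fully connected random layer
w = rng.normal(scale=1.0 / np.sqrt(d), size=d)        # random convolution filter
conv = lambda u: np.real(np.fft.ifft(np.fft.fft(w) * np.fft.fft(u)))  # circular w * u

print("angle between inputs     :", angle(x, y))
print("after linear layer       :", angle(W @ x, W @ y))               # ~ preserved (JL)
print("after ReLU layer         :", angle(relu(W @ x), relu(W @ y)))   # contracts
print("after ReLU + convolution :", angle(relu(conv(x)), relu(conv(y))))
```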

For FNNs with the prevalent ReLU activation ($\mathrm{ReLU}(x):=\max\{ x , 0 \}$), the angle between two inputs contracts according to a known mapping. For non-linear convolutional neural networks (CNNs), the question becomes much more intricate. To answer it, we introduce a geometric framework. For linear CNNs, we show that the Johnson–Lindenstrauss lemma continues to hold, namely, that the angle between two inputs is preserved. For CNNs with ReLU activation, on the other hand, the behavior is richer: the angle between the outputs contracts, and the level of contraction depends on the nature of the inputs. In particular, after one layer the geometry of natural images is essentially preserved, whereas for Gaussian correlated inputs, CNNs exhibit the same contracting behavior as FNNs with ReLU activation.
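For context, the mapping alluded to for wide, randomly initialized ReLU layers is the standard arc-cosine kernel formula: if $\theta$ denotes the angle between $x_i$ and $x_j$, then, in expectation over the Gaussian weights and up to finite-width fluctuations,

$$\cos\hat{\theta} \;=\; \frac{\sin\theta + (\pi-\theta)\cos\theta}{\pi},$$

where $\hat{\theta}$ is the angle between $\mathrm{ReLU}(W \cdot x_i)$ and $\mathrm{ReLU}(W \cdot x_j)$. For instance, orthogonal inputs ($\theta=\pi/2$) are mapped to an angle of about $71.4^\circ$.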
