Abstract

Group convolutional neural networks (G-CNNs) have been shown to increase parameter efficiency and model accuracy by incorporating geometric inductive biases. In this work, we investigate the properties of representations learned by regular G-CNNs, and show considerable parameter redundancy in group convolution kernels. This finding motivates further weight-tying by sharing convolution kernels over subgroups. To this end, we introduce convolution kernels that are separable over the subgroup and channel dimensions. In order to obtain equivariance to arbitrary affine Lie groups we provide a continuous parameterisation of separable convolution kernels. We evaluate our approach across several vision datasets, and show that our weight sharing leads to improved performance and computational efficiency. In many settings, separable G-CNNs outperform their non-separable counterpart, while only using a fraction of their training time. In addition, thanks to the increase in computational efficiency, we are able to implement G-CNNs equivariant to the Sim(2) group; the group of dilations, rotations and translations. Sim(2)-equivariance further improves performance on all tasks considered.

Notes

Zotero Link

To this end, we introduce convolution kernels that are separable over the subgroup and channel dimensions.
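A small tensor sketch may help make this factorisation concrete. This is my own illustration in PyTorch, not the authors' code: shapes are hypothetical, and only the separation over the subgroup axis is shown; the additional channel-wise separation described in the paper is omitted for brevity.

```python
import torch

# Hypothetical shapes: C_out output channels, C_in input channels,
# |H| sampled subgroup elements, and a k x k spatial support.
C_out, C_in, H_size, k = 16, 8, 8, 5

# Non-separable regular group conv kernel: one full spatial kernel
# per (output channel, input channel, subgroup element).
k_full = torch.randn(C_out, C_in, H_size, k, k)   # C_out * C_in * |H| * k * k parameters

# Separable factorisation (sketch): share a single spatial kernel per
# channel pair over the subgroup axis, and modulate it with a
# per-output-channel profile over H.
k_spatial = torch.randn(C_out, C_in, k, k)        # spatial part, shared over H
k_subgroup = torch.randn(C_out, H_size)           # subgroup part, per output channel

# The outer product recovers a kernel with the same shape as k_full,
# but parameterised with far fewer weights.
k_sep = torch.einsum('oixy,oh->oihxy', k_spatial, k_subgroup)
print(k_sep.shape)  # torch.Size([16, 8, 8, 5, 5])
```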

continuous parameterisation of separable convolution kernels

In many settings, separable G-CNNs outperform their non-separable counterpart, while only using a fraction of their training time.

However, a practical challenge impeding application to larger groups is the computational complexity of regular group convolutions, which scales exponentially with the dimensionality of the group.
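A quick back-of-the-envelope count makes this scaling concrete (assumed kernel size and sampling resolution, not figures from the paper):

```python
# Per output channel and spatial position, a regular group convolution
# evaluates its kernel once for every combination of sampled elements
# along each non-spatial group dimension.
spatial_support = 5 * 5
for group_dims, samples_per_dim in [(0, 8), (1, 8), (2, 8)]:
    evals = spatial_support * samples_per_dim ** group_dims
    print(f"{group_dims} extra group dimension(s): {evals} kernel evaluations")

# 0 extra group dimension(s): 25    (plain CNN)
# 1 extra group dimension(s): 200   (e.g. rotations only)
# 2 extra group dimension(s): 1600  (e.g. rotations + dilations)
```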

Furthermore, Lengyel & van Gemert (2021) show that group convolution filters in the original formulation of the G-CNN by Cohen & Welling (2016a) exhibit considerable redundancies along the group axis for the p4m and Z2 groups.

We propose the use of a SIREN
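This ties back to the continuous kernel parameterisation noted above. Below is a minimal sketch of what a SIREN-parameterised kernel can look like in PyTorch; it is a simplified assumption-laden illustration (hypothetical layer widths, no special initialisation scheme, and no subgroup/channel factorisation), not the paper's implementation.

```python
import torch
import torch.nn as nn

class SIRENLayer(nn.Module):
    """Linear layer followed by a sine activation (Sitzmann et al., 2020)."""
    def __init__(self, in_dim, out_dim, omega_0=30.0):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.omega_0 = omega_0

    def forward(self, coords):
        return torch.sin(self.omega_0 * self.linear(coords))

class ContinuousKernel(nn.Module):
    """Maps continuous group coordinates to kernel values, so the kernel
    can be sampled at arbitrary group elements (e.g. arbitrary rotations)."""
    def __init__(self, coord_dim, out_channels, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            SIRENLayer(coord_dim, hidden),
            SIRENLayer(hidden, hidden),
            nn.Linear(hidden, out_channels),
        )

    def forward(self, coords):          # coords: (num_points, coord_dim)
        return self.net(coords)         # (num_points, out_channels)

# Sample the spatial part of a kernel on a 5x5 grid (coord_dim = 2).
xs = torch.linspace(-1, 1, 5)
grid = torch.stack(torch.meshgrid(xs, xs, indexing='ij'), dim=-1).reshape(-1, 2)
kernel = ContinuousKernel(coord_dim=2, out_channels=16)
weights = kernel(grid).T.reshape(16, 5, 5)   # one 5x5 kernel per output channel
```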

To achieve equivariance to continuous affine Lie groups, we propose a random sampling method over subgroups H for approximating the group convolution operation.
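As a toy illustration of the random-sampling idea (my own sketch, not the paper's implementation): a scalar integrand over H = SO(2) stands in for the kernel/feature product inside the group convolution, and the integral over H is approximated by averaging over uniformly sampled subgroup elements.

```python
import math
import torch

def mc_integral_over_H(integrand, num_samples=8):
    """Monte Carlo estimate of (1/|H|) * integral over H of integrand(h) dh, for H = SO(2)."""
    thetas = torch.rand(num_samples) * 2 * math.pi   # uniform samples over rotation angles
    values = torch.stack([integrand(t) for t in thetas])
    return values.mean(dim=0)

# Toy integrand standing in for the kernel/feature product in the group convolution.
integrand = lambda theta: torch.stack([torch.cos(theta), torch.sin(theta) ** 2])

estimate = mc_integral_over_H(integrand, num_samples=1024)
print(estimate)   # approximately tensor([0.0, 0.5]): analytic means of cos and sin^2 over [0, 2*pi)
```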

Extending this investigation of learned convolution filters to the original G-CNN (Cohen & Welling, 2016a), Lengyel & van Gemert (2021) remark on the high degree of correlation found among filters along the rotation axis, and propose to share the same spatial kernel for every rotation feature map.

This may imply that the reduction in kernel expressivity also has a regularising effect that benefits generalisation. In this experiment, separable group convolutions decisively outperform the non-separable variant.

The reduction in expressivity may serve as regularisation, preventing overfitting on the training set.

We showed that separable group convolutions not only drastically increase computational efficiency, but in many settings also outperform their non-separable counterpart.

Random sampling over the dilation group results in representations containing information at different spatial resolutions at every sampling step. In this setting, we conjecture that, due to the changing spatial resolution when traversing the group, random sampling may have too strong a regularising effect on the learned kernels, to the point that the network is unable to build sufficiently expressive representations.