
Blog: Review: IGCNet / IGCV1 — Interleaved Group Convolutions (Image Classification)


Outperforms NIN, Highway, FractalNet, ResNet, Pre-Activation ResNet, Stochastic Depth, WRN, RiR, Xception, DenseNet, ResNeXt

IGCNet / IGCV1

In this story, Interleaved Group Convolutional Neural Networks (IGCNet / IGCV1), by Microsoft Research and University of Central Florida, is reviewed. With the novel Interleaved Group Convolutions, IGCV1 outperforms state-of-the-art approaches such as ResNet with fewer parameters and fewer FLOPs. This is a 2017 ICCV paper with more than 50 citations. Later on, the authors also published IGCV2 and IGCV3. (Sik-Ho Tsang @ Medium)


Outline

  1. Interleaved Group Convolution (IGC) Block
  2. Connections to Other Convolutions
  3. Evaluations

1. Interleaved Group Convolution (IGC) Block

Interleaved Group Convolutions
  • As shown above, the block is split into primary group convolutions and secondary group convolutions.
  • There are permutations before and after the secondary group convolutions.

1.1. Primary Group Convolutions

  • Let L be the number of partitions. The input feature maps are divided into L groups as shown in the figure above.
  • Standard spatial convolutions, such as 3×3, are applied to each group independently.
  • Therefore, a group convolution can be viewed as a regular convolution with a sparse block-diagonal convolution kernel, where each block corresponds to a partition of channels and there are no connections across the partitions (see the sketch below).
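
As a quick illustration of this block-diagonal view, here is a minimal PyTorch sketch (not from the paper; the partition sizes L and M are chosen arbitrarily) that builds a grouped 3×3 convolution and the equivalent regular convolution whose kernel is zero outside the diagonal blocks:

```python
import torch
import torch.nn as nn

L, M = 4, 2                 # L partitions of M channels each (values chosen arbitrarily)
C = L * M                   # total number of channels

# Grouped 3x3 convolution: each of the L partitions is convolved independently.
group_conv = nn.Conv2d(C, C, kernel_size=3, padding=1, groups=L, bias=False)

# Equivalent regular convolution whose kernel is block-diagonal:
# zero everywhere except the L diagonal blocks copied from the grouped kernel.
dense = nn.Conv2d(C, C, kernel_size=3, padding=1, bias=False)
with torch.no_grad():
    dense.weight.zero_()
    for l in range(L):
        blk = slice(l * M, (l + 1) * M)          # channels of partition l
        dense.weight[blk, blk] = group_conv.weight[blk]

x = torch.randn(1, C, 8, 8)
print(torch.allclose(group_conv(x), dense(x), atol=1e-5))  # True
```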

1.2. Secondary Group Convolutions

  • Then a permutation, described by a permutation matrix P, is applied to the output of the primary group convolutions, so that the mth secondary partition is composed of the mth output channel from each primary partition.
  • Next, the secondary group convolutions are performed over the M secondary partitions: a point-wise 1×1 convolution is applied to each secondary partition.
  • After the secondary group convolutions, the channels are permuted back to the original ordering.
  • In summary, an interleaved group convolution block applies the block-diagonal primary kernel, the permutation P, the block-diagonal secondary kernel, and finally the inverse permutation Pᵀ.
  • It can therefore be treated as a single convolution whose kernel is the product of these sparse factors, i.e. an IGC block is actually equivalent to a regular convolution with the convolution kernel being the product of two sparse kernels (see the sketch after this list).
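
Putting the pieces together, below is a minimal PyTorch sketch of one interleaved group convolution block, assuming L·M input and output channels; the class name IGCBlock and the interleave helper are my own naming, not taken from the paper or an official implementation:

```python
import torch
import torch.nn as nn

class IGCBlock(nn.Module):
    def __init__(self, L: int, M: int):
        super().__init__()
        self.L, self.M = L, M
        C = L * M
        # Primary: 3x3 group convolution over L partitions of M channels each.
        self.primary = nn.Conv2d(C, C, kernel_size=3, padding=1, groups=L, bias=False)
        # Secondary: 1x1 group convolution over M partitions of L channels each.
        self.secondary = nn.Conv2d(C, C, kernel_size=1, groups=M, bias=False)

    @staticmethod
    def _interleave(x, groups, per_group):
        # Transpose the channel order: the m-th new partition collects
        # channel m from each of the old partitions.
        b, c, h, w = x.shape
        return x.view(b, groups, per_group, h, w).transpose(1, 2).reshape(b, c, h, w)

    def forward(self, x):
        y = self.primary(x)                      # spatial conv within each of the L partitions
        y = self._interleave(y, self.L, self.M)  # permutation P
        y = self.secondary(y)                    # 1x1 conv within each of the M partitions
        y = self._interleave(y, self.M, self.L)  # permute back (P transpose)
        return y

block = IGCBlock(L=4, M=2)
out = block(torch.randn(1, 8, 16, 16))           # 8 = L * M channels
print(out.shape)                                 # torch.Size([1, 8, 16, 16])
```

The permutation is the familiar (L, M) → (M, L) channel reshuffle, so permuting back is simply the (M, L) → (L, M) reshuffle.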

2. Connections to Other Convolutions

2.1. Connection to Regular Convolution

(a) Regular Convolution, (b) Four-branch Representation of the Regular Convolution
  • The authors suggest that the four-branch IGC above (right) can be made equivalent to a regular convolution (more details in the paper).

2.2. Connection to Summation Fusion like ResNeXt

  • A ResNeXt-like summation-fusion block can also be rewritten in the form of an IGC block (see the paper for the exact formulation).

2.3. Connection to Xception-like Network

  • Xception-like networks are the extreme case in which each primary partition contains a single channel and there is only one secondary partition, i.e. M = 1 (with L equal to the number of channels): the primary group convolution becomes a channel-wise 3×3 convolution and the secondary group convolution becomes an ordinary 1×1 convolution (a sketch of this special case follows).
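
Under that assumption, the block collapses to a depthwise separable convolution; a minimal PyTorch sketch (the channel count is chosen arbitrarily, not taken from the paper):

```python
import torch
import torch.nn as nn

C = 64  # hypothetical channel count

# Primary group convolution with one channel per partition -> depthwise 3x3;
# the permutations become identities, and the single secondary partition
# is an ordinary (dense) 1x1 convolution: together, a depthwise separable conv.
xception_like = nn.Sequential(
    nn.Conv2d(C, C, kernel_size=3, padding=1, groups=C, bias=False),  # depthwise 3x3
    nn.Conv2d(C, C, kernel_size=1, bias=False),                       # pointwise 1x1
)

print(xception_like(torch.randn(1, C, 16, 16)).shape)  # torch.Size([1, 64, 16, 16])
```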

3. Evaluations

3.1. Comparison with SumFusion and Regular Convolution

Different Architectures
  • SumFusion: ResNeXt-like network.
  • RegConv: Network using standard convolution.
  • IGC-L?M?: Proposed IGCNet / IGCV1 with different L and M.
Number of Parameters (Left) and Number of FLOPs (Right)
  • It is found that IGCNet / IGCV1 has a smaller number of parameters as well as fewer FLOPs.
Classification Accuracy on CIFAR-10 and CIFAR-100
  • IGC-L24M2, which contains far fewer parameters, performs better than both RegConv-W16 and RegConv-W18.
  • The IGC blocks increase the width, and the parameters are exploited more efficiently.

3.2. Effect of Partition Number

Accuracy Using Different Sets of (L, M) on CIFAR-100
  • The performance with M = 2 secondary partitions is better than the Xception-like network (M = 1).
  • IGC with L = 40 and M = 2 gets 63.89% accuracy, about 0.8% better than IGC with L = 64 and M = 1, which gets 63.07% accuracy (a rough per-block parameter count for these two settings is sketched below).
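
As a back-of-the-envelope check of why these two settings are comparable: from the block-diagonal structure, one IGC block has L·M²·S weights in the primary 3×3 convolution (S = 9) plus M·L² in the secondary 1×1 convolution. The sketch below (my own arithmetic, per block only, ignoring everything else in the network) shows that L = 64, M = 1 and L = 40, M = 2 use almost the same number of weights, while the latter is wider (80 vs 64 channels):

```python
# Rough weight count of one IGC block (weights only, per block):
#   primary 3x3 group conv:   L groups, each M -> M channels with S spatial taps
#   secondary 1x1 group conv: M groups, each L -> L channels
def igc_block_params(L: int, M: int, S: int = 9) -> int:
    return L * M * M * S + M * L * L

print(igc_block_params(64, 1), "weights, width", 64 * 1)   # 4672 weights, width 64
print(igc_block_params(40, 2), "weights, width", 40 * 2)   # 4640 weights, width 80
```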

3.3. Combination with Identity Mapping Structure

Classification Accuracy on CIFAR-10 and CIFAR-100
  • Identity Mapping Structure is used in Pre-Activation ResNet.
  • IGC-L24M2 can be combined with the identity mapping structure, and obtains the highest accuracy.

3.4. ImageNet Classification Compared with ResNet

  • IGC-L4M32+Ident. performs better than ResNet (C = 64), which contains slightly fewer parameters.
  • IGC-L16M16+Ident. performs better than ResNet (C = 69), which has approximately the same number of parameters and computational complexity.
  • The gains are not from regularization but from the richer representation.

3.5. Comparison with State-of-the-art Approaches

Classification Error on ImageNet

Hope I can cover IGCV2 and IGCV3 in the future.


Reference

[2017 ICCV] [IGCNet / IGCV1]
Interleaved Group Convolutions for Deep Neural Networks

My Previous Reviews

Image Classification
[LeNet] [AlexNet] [Maxout] [NIN] [ZFNet] [VGGNet] [Highway] [SPPNet] [PReLU-Net] [STN] [DeepImage] [SqueezeNet] [GoogLeNet / Inception-v1] [BN-Inception / Inception-v2] [Inception-v3] [Inception-v4] [Xception] [MobileNetV1] [ResNet] [Pre-Activation ResNet] [RiR] [RoR] [Stochastic Depth] [WRN] [Shake-Shake] [FractalNet] [Trimps-Soushen] [PolyNet] [ResNeXt] [DenseNet] [PyramidNet] [DRN] [DPN] [Residual Attention Network] [MSDNet] [ShuffleNet V1] [SENet]

Object Detection
[OverFeat] [R-CNN] [Fast R-CNN] [Faster R-CNN] [MR-CNN & S-CNN] [DeepID-Net] [CRAFT] [R-FCN] [ION] [MultiPathNet] [NoC] [Hikvision] [GBD-Net / GBD-v1 & GBD-v2] [G-RMI] [TDM] [SSD] [DSSD] [YOLOv1] [YOLOv2 / YOLO9000] [YOLOv3] [FPN] [RetinaNet] [DCN]

Semantic Segmentation
[FCN] [DeconvNet] [DeepLabv1 & DeepLabv2] [CRF-RNN] [SegNet] [ParseNet] [DilatedNet] [DRN] [RefineNet] [GCN] [PSPNet] [DeepLabv3]

Biomedical Image Segmentation
[CUMedVision1] [CUMedVision2 / DCAN] [U-Net] [CFS-FCN] [U-Net+ResNet] [MultiChannel] [V-Net] [3D U-Net] [M²FCN] [SA] [3D U-Net+ResNet]

Instance Segmentation
[SDS] [Hypercolumn] [DeepMask] [SharpMask] [MultiPathNet] [MNC] [InstanceFCN] [FCIS]

Super Resolution
[SRCNN] [FSRCNN] [VDSR] [ESPCN] [RED-Net] [DRCN] [DRRN] [LapSRN & MS-LapSRN] [SRDenseNet]

Human Pose Estimation
[DeepPose] [Tompson NIPS’14] [Tompson CVPR’15] [CPM]
