ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

Ensemble max-pooling: Is only the maximum activation useful when pooling?

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2017.10.001
  • Received Date: 22 May 2017
  • Rev Recd Date: 24 June 2017
  • Publish Date: 31 October 2017
  • The pooling layer in convolutional neural networks subsamples feature maps on the basis of the local correlation principle, reducing data size while retaining useful information, which improves generalization and effectively enlarges receptive fields. Classical max-pooling follows a winner-take-all strategy, which can sometimes hurt the generalization of the network. A simple and effective pooling method named ensemble max-pooling is introduced, which can replace the pooling layer in conventional convolutional neural networks. In each pooling region, ensemble max-pooling drops the neuron with the maximum activation with probability p and instead outputs the neuron with the second-largest activation. Ensemble max-pooling can be viewed either as an ensemble of many basic underlying networks or as classical max-pooling applied to a locally distorted input. The results achieved are better than those of classical pooling methods and other related pooling approaches. DFN-MR is derived from ResNet; compared with ResNet, it contains more basic underlying networks and avoids very deep networks. Keeping all other hyperparameters unchanged and replacing each convolutional layer in DFN-MR with a tandem form, i.e., an ensemble max-pooling layer followed by a convolutional layer with stride 1, is shown to deliver significant gains in performance.
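    The per-region rule described above (drop the maximum with probability p and fall back to the runner-up) can be summarized in a few lines. Below is a minimal NumPy sketch, assuming single-image feature maps with non-overlapping pooling windows; the function name ensemble_max_pool2d, its interface, and the choice to fall back to plain max-pooling at test time are illustrative assumptions, not the authors' implementation.

    ```python
    import numpy as np

    def ensemble_max_pool2d(x, pool=2, p=0.5, training=True, rng=None):
        """Ensemble max-pooling over non-overlapping pool x pool regions (sketch).

        In each region the maximum activation is dropped with probability p,
        in which case the second-largest activation is output instead.
        At test time (training=False) the plain maximum is returned; this
        test-time behaviour is an assumption made for this sketch.

        x: array of shape (C, H, W); H and W are assumed divisible by `pool`.
        """
        rng = np.random.default_rng() if rng is None else rng
        C, H, W = x.shape
        Ho, Wo = H // pool, W // pool
        # Rearrange so the last axis enumerates the elements of each pooling window.
        windows = x.reshape(C, Ho, pool, Wo, pool).transpose(0, 1, 3, 2, 4)
        windows = windows.reshape(C, Ho, Wo, pool * pool)
        # Sort each window to expose the largest and second-largest activations.
        ordered = np.sort(windows, axis=-1)
        largest, second = ordered[..., -1], ordered[..., -2]
        if not training:
            return largest
        # Bernoulli mask: True means the maximum is dropped and the runner-up wins.
        drop = rng.random(largest.shape) < p
        return np.where(drop, second, largest)

    # Example: a 3-channel 8x8 feature map pooled down to 3x4x4.
    x = np.random.randn(3, 8, 8).astype(np.float32)
    y = ensemble_max_pool2d(x, pool=2, p=0.5)
    ```

    The same operation, used with a stride equal to the pooling size and followed by a stride-1 convolution, is the tandem form the abstract describes as a replacement for each convolutional layer in DFN-MR.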