ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Information Science and Technology 08 June 2023

Robustness benchmark for unsupervised anomaly detection models

Cite this:
https://doi.org/10.52396/JUSTC-2022-0165
More Information
  • Author Bio:

    Pei Wang is currently working at Alibaba Cloud in Hangzhou. He received his master’s degree in Control Science and Engineering from the University of Science and Technology of China in 2023. His research interests focus on computer vision.

    Yang Cao is currently an Associate Professor in the Automation Department at the University of Science and Technology of China. He received his Ph.D. degree in Pattern Recognition and Intelligent System from Northeastern University in 2004. His research interests include computer vision and multimedia processing.

  • Corresponding author: E-mail: forrest@ustc.edu.cn
  • Received Date: 01 January 2023
  • Accepted Date: 22 March 2023
  • Available Online: 08 June 2023
  • Due to the complexity and diversity of production environments, it is essential to understand the robustness of unsupervised anomaly detection models to common corruptions. To explore this issue systematically, we propose a dataset named MVTec-C to evaluate the robustness of unsupervised anomaly detection models. Based on this dataset, we explore the robustness of approaches in five paradigms, namely, reconstruction-based, representation similarity-based, normalizing flow-based, self-supervised representation learning-based, and knowledge distillation-based paradigms. Furthermore, we explore the impact of different modules within the two best-performing methods on robustness and accuracy. These include the multi-scale features, the neighborhood size, and the sampling ratio in the PatchCore method, as well as the multi-scale features, the MFF module, the OCE module, and the multi-scale distillation in the Reverse Distillation method. Finally, we propose a feature alignment module (FAM) to reduce the feature drift caused by corruptions and combine PatchCore and the FAM to obtain a model with both high robustness and high accuracy. We hope this work will serve as an evaluation method and provide experience in building robust anomaly detection models in the future.
    Benchmarking robustness in unsupervised anomaly detection: dataset, metrics, comparative analysis of existing methods, and enhancement through feature alignment.
    • We construct a robustness benchmark for unsupervised anomaly detection methods, including a dataset with eight corruption types and five severity levels and metrics to assess robustness.
    • We evaluate the accuracy and robustness of mainstream unsupervised anomaly detection methods and find that representation similarity-based and knowledge distillation-based approaches are the best paradigms in terms of performance and robustness.
    • We conduct ablation studies on the components of the two best-performing methods to understand the impact of different factors on robustness.
    • We propose a feature alignment module to rectify the corrupted features. Combining the proposed module with PatchCore yields a model that is robust while maintaining high performance.

  • [1]
    Schlegl T, Seeböck P, Waldstein S M, et al. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In: Information Processing in Medical Imaging. Cham: Springer, 2017: 146–157.
    [2]
    Akcay S, Atapour-Abarghouei A, Breckon T P. GANomaly: Semi-supervised anomaly detection via adversarial training. In: Jawahar C, Li H, Mori G, et al. editors. Computer Vision–ACCV 2018. Cham: Springer, 2018: 622–637.
    [3]
    Bergmann P, Löwe S, Fauser M, et al. Improving unsupervised defect segmentation by applying structural similarity to autoencoders. arXiv: 1807.02011, 2018.
    [4]
    Cohen N, Hoshen Y. Sub-image anomaly detection with deep pyramid correspondences. arXiv: 2005.02357, 2020.
    [5]
    Defard T, Setkov A, Loesch A, et al. PaDiM: A patch distribution modeling framework for anomaly detection and localization. In: Pattern Recognition. ICPR International Workshops and Challenges. Cham: Springer, 2021: 475–489.
    [6]
    Roth K, Pemula L, Zepeda J, et al. Towards total recall in industrial anomaly detection. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022: 14298–14308.
    [7]
    Rudolph M, Wandt B, Rosenhahn B. Same same but DifferNet: Semi-supervised defect detection with normalizing flows. In: 2021 IEEE Winter Conference on Applications of Computer Vision (WACV). Waikoloa, USA: IEEE, 2021: 1906–1915.
    [8]
    Gudovskiy D, Ishizaka S, Kozuka K. CFLOW-AD: Real-time unsupervised anomaly detection with localization via conditional normalizing flows. In: 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Waikoloa, USA: IEEE, 2022: 1819–1828.
    [9]
    Li C L, Sohn K, Yoon J, et al. CutPaste: Self-supervised learning for anomaly detection and localization. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021: 9659–9669.
    [10]
    Bergmann P, Fauser M, Sattlegger D, et al. Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2020: 4182–4191.
    [11]
    Salehi M, Sadjadi N, Baselizadeh S, et al. Multiresolution knowledge distillation for anomaly detection. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021: 14897–14907.
    [12]
    Deng H, Li X. Anomaly detection via reverse distillation from one-class embedding. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022: 9727–9736.
    [13]
    Bergmann P, Fauser M, Sattlegger D, et al. MVTec AD—a comprehensive real-world dataset for unsupervised anomaly detection. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019: 9584–9592.
    [14]
    Hendrycks D, Dietterich T. Benchmarking neural network robustness to common corruptions and perturbations. arXiv: 1903.12261, 2019.
    [15]
    Michaelis C, Mitzkus B, Geirhos R, et al. Benchmarking robustness in object detection: Autonomous driving when winter is coming. arXiv: 1907.07484, 2019.
    [16]
    Kamann C, Rother C. Benchmarking the robustness of semantic segmentation models with respect to common corruptions. International Journal of Computer Vision, 2021, 129: 462–483. doi: 10.1007/s11263-020-01383-2
    [17]
    Wang J, Jin S, Liu W, et al. When human pose estimation meets robustness: Adversarial algorithms and benchmarks. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021: 11850–11859.
    [18]
    Schlegl T, Seeböck P, Waldstein S M, et al. f-AnoGAN: Fast unsupervised anomaly detection with generative adversarial networks. Medical Image Analysis, 2019, 54: 30–44. doi: 10.1016/j.media.2019.01.010
    [19]
    Zavrtanik V, Kristan M, Skočaj D. Reconstruction by inpainting for visual anomaly detection. Pattern Recognition, 2021, 112: 107706. doi: 10.1016/j.patcog.2020.107706
    [20]
    Rippel O, Mertens P, Merhof D. Modeling the distribution of normal data in pre-trained deep features for anomaly detection. In: 2020 25th International Conference on Pattern Recognition (ICPR). Milan, Italy: IEEE, 2021: 6726–6733.
    [21]
    Golan I, El-Yaniv R. Deep anomaly detection using geometric transformations. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. New York: ACM, 2018: 9781–9791.
    [22]
    Hendrycks D, Mazeika M, Kadavath S, et al. Using self-supervised learning can improve model robustness and uncertainty. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. New York: ACM, 2019: 15663–15674.
    [23]
    Bergman L, Hoshen Y. Classification-based anomaly detection for general data. arXiv: 2005.02359, 2020.
    [24]
    Sohn K, Li C L, Yoon J, et al. Learning and evaluating representations for deep one-class classification. arXiv: 2011.02578, 2020.
    [25]
    Wang G, Han S, Ding E, et al. Student-teacher feature pyramid matching for anomaly detection. arXiv: 2103.04257, 2021.
    [26]
    Goodge A, Hooi B, Ng S K, et al. Robustness of autoencoders for anomaly detection under adversarial impact. In: IJCAI'20: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. New York: ACM, 2021: 1244–1250.
    [27]
    Schneider M, Aspinall D, Bastian N D. Evaluating model robustness to adversarial samples in network intrusion detection. In: 2021 IEEE International Conference on Big Data (Big Data). Orlando, USA: IEEE, 2021: 3343–3352.
    [28]
    Han D, Wang Z, Zhong Y, et al. Evaluating and improving adversarial robustness of machine learning-based network intrusion detectors. IEEE Journal on Selected Areas in Communications, 2021, 39 (8): 2632–2647. doi: 10.1109/JSAC.2021.3087242
    [29]
    Perales Gómez Á L, Maimó L F, Clemente F J G, et al. A methodology for evaluating the robustness of anomaly detectors to adversarial attacks in industrial scenarios. IEEE Access, 2022, 10: 124582–124594. doi: 10.1109/ACCESS.2022.3224930
    [30]
    Kloft M, Laskov P. Online anomaly detection under adversarial impact. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS). Sardinia, Italy: PMLR, 2010: 405–412.
    [31]
    Madani P, Vlajic N. Robustness of deep autoencoder in intrusion detection under adversarial contamination. In: Proceedings of the 5th Annual Symposium and Bootcamp on Hot Topics in the Science of Security. New York: ACM, 2018: 1–8.
    [32]
    Bovenzi G, Foggia A, Santella S, et al. Data poisoning attacks against autoencoder-based anomaly detection models: A robustness analysis. In: ICC 2022—IEEE International Conference on Communications. Seoul, Korea: IEEE, 2022: 5427–5432.
    [33]
    Altindis S F, Dalva Y, Pehlivan H, et al. Benchmarking the robustness of instance segmentation models. arXiv: 2109.01123, 2021.
    [34]
    Dooley S, Goldstein T, Dickerson J P. Robustness disparities in commercial face detection. arXiv: 2108.12508, 2021.

Catalog

    Figure  1.  Samples with different corruption types. The first image is the original image in MVTec.

    Figure  2.  The pipeline of PatchCore[6]. Neighborhood aggregation enhances the patch representations extracted from a pre-trained model. Greedy sampling then constructs a compact yet maximally representative memory bank. Finally, each patch is scored by calculating the distance between the patch representation and its nearest neighbor in the memory bank.
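
The last two steps in the Figure 2 caption can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the PatchCore implementation: the function names `greedy_coreset` and `patch_scores` are hypothetical, plain farthest-point sampling stands in for the greedy sampling step, Euclidean distance is assumed, and feature extraction and neighborhood aggregation are omitted (random vectors stand in for patch features).

```python
import numpy as np

def greedy_coreset(features: np.ndarray, n_select: int, seed: int = 0) -> np.ndarray:
    """Farthest-point sampling: return the indices of n_select rows of `features`."""
    rng = np.random.default_rng(seed)
    first = int(rng.integers(len(features)))
    selected = [first]
    # Distance from every feature to its closest already-selected feature.
    min_dist = np.linalg.norm(features - features[first], axis=1)
    for _ in range(n_select - 1):
        idx = int(np.argmax(min_dist))  # the point least covered so far
        selected.append(idx)
        dist = np.linalg.norm(features - features[idx], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return np.array(selected)

def patch_scores(patches: np.ndarray, memory_bank: np.ndarray) -> np.ndarray:
    """Score each patch by the distance to its nearest memory-bank entry."""
    # Pairwise Euclidean distances, shape (n_patches, n_memory).
    diff = patches[:, None, :] - memory_bank[None, :, :]
    return np.linalg.norm(diff, axis=2).min(axis=1)

# Toy data (assumption): 200 "normal" patch features with 8 dimensions.
rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 8))
coreset_idx = greedy_coreset(normal, n_select=20)
bank = normal[coreset_idx]

# Three normal patches plus one patch pushed far away from the normal data.
test_patches = np.vstack([normal[:3], normal[:1] + 10.0])
scores = patch_scores(test_patches, bank)
```

In this toy setup the shifted patch receives a much larger score than the three normal patches, which is the behavior the caption describes for anomalous patches.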

    Figure  3.  The pipeline of Reverse Distillation[12]. First, the teacher model acts as a feature extractor and outputs multi-scale features. Second, the multi-scale features are merged and compressed into low-dimensional embeddings by the multi-scale feature fusion (MFF) module and the one-class embedding (OCE) module. Third, the student model tries to reconstruct the multi-scale features output by the teacher model. Finally, the cosine distance is used to compare the outputs of the student and teacher models, resulting in an anomaly map.
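
The final comparison step in the Figure 3 caption can be sketched as follows. This is an illustrative sketch, not the authors' code: the helper names are hypothetical, feature maps are assumed to have shape (C, H, W), nearest-neighbor upsampling stands in for whatever interpolation the real model uses, and summing the per-scale maps is an assumption about how scales are combined.

```python
import numpy as np

def cosine_distance_map(teacher: np.ndarray, student: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Per-location cosine distance between two (C, H, W) feature maps."""
    dot = (teacher * student).sum(axis=0)
    norm = np.linalg.norm(teacher, axis=0) * np.linalg.norm(student, axis=0)
    return 1.0 - dot / (norm + eps)

def upsample_nearest(m: np.ndarray, size: int) -> np.ndarray:
    """Nearest-neighbor resize of an (H, W) map to (size, size)."""
    h, w = m.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return m[np.ix_(rows, cols)]

def anomaly_map(teacher_feats, student_feats, size: int) -> np.ndarray:
    """Sum the upsampled cosine-distance maps over all scales (assumed fusion rule)."""
    total = np.zeros((size, size))
    for t, s in zip(teacher_feats, student_feats):
        total += upsample_nearest(cosine_distance_map(t, s), size)
    return total

# Toy data (assumption): two scales; the student matches the teacher everywhere
# except at one location of the coarse scale, where it points the opposite way.
rng = np.random.default_rng(0)
teacher_feats = [rng.normal(size=(4, 8, 8)), rng.normal(size=(6, 4, 4))]
student_feats = [f.copy() for f in teacher_feats]
student_feats[1][:, 0, 0] = -teacher_feats[1][:, 0, 0]

amap = anomaly_map(teacher_feats, student_feats, size=16)
```

Here the map is close to zero wherever the student reproduces the teacher and close to 2 (the maximum cosine distance) in the 4×4 output block covered by the mismatched coarse location.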

    Figure  4.  Corruptions cause a globally consistent drift, while defects cause a local drift from the normal distribution. The average shift of all image patch features tends to align more closely with the drift caused by corruption than that caused by defects. Consequently, minimizing the average shift can effectively reduce the drift induced by corruption by aligning the corrupted features with the normal ones. Although this approach can alter the defect features, it remains insufficient to compensate for the drift caused by defects.
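
The intuition in the Figure 4 caption can be illustrated with a toy mean-shift alignment. This is a simplified sketch of the idea only and should not be read as the FAM itself: the function names, the nearest-reference matching rule, and the 2-D lattice data are all assumptions made for illustration.

```python
import numpy as np

def nearest_distance(feats: np.ndarray, ref_feats: np.ndarray) -> np.ndarray:
    """Distance from each feature to its nearest reference feature."""
    d = np.linalg.norm(feats[:, None, :] - ref_feats[None, :, :], axis=2)
    return d.min(axis=1)

def align_by_mean_shift(test_feats: np.ndarray, ref_feats: np.ndarray, n_iter: int = 1) -> np.ndarray:
    """Subtract the average offset between test features and their nearest references."""
    aligned = test_feats.astype(float).copy()
    for _ in range(n_iter):
        d = np.linalg.norm(aligned[:, None, :] - ref_feats[None, :, :], axis=2)
        nearest = ref_feats[d.argmin(axis=1)]
        shift = (aligned - nearest).mean(axis=0)  # global drift estimate
        aligned -= shift
    return aligned

# Toy data (assumption): normal features on a 10x10 lattice in 2-D.
xs, ys = np.meshgrid(np.arange(10.0), np.arange(10.0))
ref = np.stack([xs.ravel(), ys.ravel()], axis=1)  # shape (100, 2)

# A corruption shifts every patch by the same offset; a defect moves one patch far away.
corrupted = ref + np.array([0.3, -0.2])
corrupted[0] += np.array([20.0, 20.0])

before = nearest_distance(corrupted, ref)
after = nearest_distance(align_by_mean_shift(corrupted, ref), ref)
```

In this example the alignment shrinks the distances of the 99 non-defective patches, while the defective patch (index 0) still has by far the largest distance, mirroring the caption's point that the average shift tracks the corruption rather than the defect.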

    Figure  5.  Analysis experiments. (a) The effect of multi-scale features in PatchCore on robustness. “1”, “2”, and “3” represent the features from layer1, layer2, and layer3 in ResNet, respectively. (b) The effect of the neighborhood size of the neighborhood aggregation module in PatchCore on robustness. (c) The effect of multi-scale features in Reverse Distillation on robustness. (d) The effect of multi-scale distillation in Reverse Distillation on robustness. (e) The effect of the number of FAM iterations. (f) The effect of the number of FAM reference features.

    Figure  6.  Impact of the OCE and MFF modules in Reverse Distillation on robustness. “Plain” represents the base model.

    Figure  7.  Qualitative comparisons between PatchCore and our method.

