ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Engineering & Materials / Info. & Intelligence 20 April 2022

The control of moldy risk during rice storage based on multivariate linear regression analysis and random forest algorithm

DOI: https://doi.org/10.52396/JUSTC-2021-0118
More Information
  • Author Bio:

    Yurui Deng PhD, research interest: Process safety

  • Corresponding author: E-mail: yongz@ustc.edu.cn
  • Received Date: 23 April 2021
  • Accepted Date: 05 September 2021
  • Available Online: 20 April 2022
  • Abstract: Clarifying the mechanism of fungal growth is important for maintaining quality during grain storage. Among the factors that affect the growth of fungal spores, the most important are temperature, moisture content, and storage time. In this study, a multivariate linear regression model relating spore number to ambient temperature, rice moisture content, and storage days was developed from experimental data. To build a more accurate model, we introduce a random forest algorithm for fungal spore prediction during grain storage. The established regression models can be used to predict the spore number at different ambient temperatures, rice moisture contents, and storage days. For 99% of the original data, the random forest model yields a predicted value of the same order of magnitude as the actual value, indicating high accuracy in predicting the spore number during storage. Furthermore, we plot prediction surface graphs to help practitioners keep the storage environment within the low-risk region.


    • A multivariate linear regression model relating spore number to ambient temperature, rice moisture content, and storage days was developed from experimental data.
    • A random forest algorithm was introduced for fungal spore prediction during grain storage.
    • The established regression models can be used to predict the spore number at different ambient temperatures, rice moisture contents, and storage days during storage.
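The regression idea summarized above can be illustrated with a minimal sketch. It is not the authors' code or data. It assumes a log-linear form, ln(spores) = b0 + b1*T + b2*M + b3*D, fitted by ordinary least squares, and it checks what fraction of predictions fall in the same order of magnitude as the observed values. The coefficients in `TRUE_COEF`, the synthetic grid, the noise level, and all function names are hypothetical and chosen only for illustration. Only the Python standard library is used.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_log_linear(rows, spores):
    """Least-squares fit of ln(spores) = b0 + b1*T + b2*M + b3*D (normal equations)."""
    design = [[1.0, t, mc, d] for t, mc, d in rows]
    y = [math.log(s) for s in spores]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_spores(coef, temp_c, moisture_pct, days):
    """Back-transform the log-scale prediction to a spore count."""
    return math.exp(coef[0] + coef[1] * temp_c + coef[2] * moisture_pct + coef[3] * days)


def same_order_fraction(actual, predicted):
    """Fraction of pairs whose values share the same power of ten."""
    hits = sum(
        math.floor(math.log10(a)) == math.floor(math.log10(p))
        for a, p in zip(actual, predicted)
    )
    return hits / len(actual)


# Synthetic demonstration data (hypothetical, NOT the experimental data).
random.seed(0)
TRUE_COEF = (2.0, 0.08, 0.30, 0.01)
rows = [
    (t, mc, d)
    for t in (10, 15, 20, 25, 30, 35)   # ambient temperature, deg C
    for mc in (12, 14, 16, 18)          # rice moisture content, %
    for d in range(0, 181, 30)          # storage days
]
spores = [
    math.exp(TRUE_COEF[0] + TRUE_COEF[1] * t + TRUE_COEF[2] * mc
             + TRUE_COEF[3] * d + random.gauss(0.0, 0.3))
    for t, mc, d in rows
]

coef = fit_log_linear(rows, spores)
predicted = [predict_spores(coef, *row) for row in rows]
print("fitted coefficients:", [round(c, 4) for c in coef])
print("same-order fraction:", round(same_order_fraction(spores, predicted), 3))
```

Fitting on the log scale is a common choice when the response spans several orders of magnitude, and the same-order check mirrors the accuracy criterion quoted in the abstract. A random forest regressor could be evaluated with the same `same_order_fraction` helper.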


Catalog

    Figure 1. Spore number versus ambient temperature, rice moisture content, and storage days.

    Figure 2. Scatter plot of the natural logarithm of the spore number.

    Figure 3. Residual case order plot.

    Figure 4. Distribution of the spore number.

    Figure 5. Comparison of training, out-of-bag, and independent test set error rates for the random forest as the number of trees increases.

    Figure 6. Prediction surface graphs at temperatures of 10 ℃ (a), 15 ℃ (b), 20 ℃ (c), 25 ℃ (d), 30 ℃ (e), and 35 ℃ (f).

    Figure 7. Scatter plot of predicted values versus true values.
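A prediction surface like the one described in the Figure 6 caption can be sketched as a grid evaluation of a fitted model at a fixed temperature. The sketch below is one possible way to do this and is not the authors' procedure. The coefficients in `COEF`, the cutoff `RISK_THRESHOLD`, and the function names are hypothetical placeholders, and a simple log-linear model stands in for whichever model produced the published surfaces.

```python
import math

# Hypothetical coefficients of ln(spores) = b0 + b1*T + b2*M + b3*D (illustration only).
COEF = (2.0, 0.08, 0.30, 0.01)
# Hypothetical spore count separating the low-risk and high-risk regions.
RISK_THRESHOLD = 1.0e5


def predicted_spores(temp_c, moisture_pct, days, coef=COEF):
    """Predicted spore count from the assumed log-linear model."""
    b0, b1, b2, b3 = coef
    return math.exp(b0 + b1 * temp_c + b2 * moisture_pct + b3 * days)


def risk_surface(temp_c, moistures, day_values, threshold=RISK_THRESHOLD):
    """One row per moisture level; True where the prediction stays below the threshold."""
    return [
        [predicted_spores(temp_c, m, d) < threshold for d in day_values]
        for m in moistures
    ]


def max_safe_days(temp_c, moisture_pct, threshold=RISK_THRESHOLD, coef=COEF):
    """Storage time at which the prediction reaches the threshold (0 if already above).

    Assumes a positive storage-time coefficient b3.
    """
    b0, b1, b2, b3 = coef
    days = (math.log(threshold) - b0 - b1 * temp_c - b2 * moisture_pct) / b3
    return max(0.0, days)


moistures = [12, 14, 16, 18]
day_values = list(range(0, 181, 30))
for temp in (10, 15, 20, 25, 30, 35):  # temperatures named in the Figure 6 caption
    print(f"T = {temp} deg C  (rows: moisture %, columns: days {day_values})")
    for m, row in zip(moistures, risk_surface(temp, moistures, day_values)):
        # '.' marks a low-risk cell, '#' marks a cell at or above the threshold
        print(f"  {m:>2}%  " + " ".join("." if low else "#" for low in row))
```

Reading the grid row by row gives, for each moisture content, how long rice could be stored at that temperature before the assumed model crosses the threshold. `max_safe_days` gives the same boundary in closed form for the log-linear case.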

