ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

A composite index strategy for big marine data based on adaptive method of data merging strategy

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2015.10.003
  • Received Date: 27 August 2015
  • Accepted Date: 29 September 2015
  • Revision Received Date: 29 September 2015
  • Publish Date: 30 October 2015
  • Marine data fall naturally into the category of Big Data. Fast retrieval is a basic requirement of marine monitoring applications, so a sound index structure is of great importance. A multi-layer index (ML-index) is proposed that combines a time-interval B+-tree with a hybrid space partition tree (HSP-tree). An adaptive data merging strategy is used to optimize the primary key index (the B+-tree). An adaptive space partition method, based on data characteristics and in particular on data unit capacity, is also proposed for building the secondary index (the HSP-tree). Experimental results show that ML-index saves about 2/3 of the time in comparison with two state-of-the-art index methods.
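The abstract gives no algorithmic detail, so the following is only a minimal Python sketch of the general two-layer idea: a time-keyed primary index whose sparse intervals are merged, with a capacity-driven spatial partition tree per interval as the secondary index. Everything specific here is an assumption made for illustration and is not the authors' ML-index or HSP-tree: the names `Cell`, `merge_intervals` and `MLIndexSketch`, the fixed bucket `width`, the `min_records` merge threshold, and the quadtree-style split. A sorted list with binary search stands in for the B+-tree.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Square spatial cell that splits into four children when over capacity
    (a quadtree-style stand-in; the real HSP-tree is not specified here)."""
    x0: float
    y0: float
    size: float
    capacity: int
    points: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def insert(self, x, y, payload):
        if self.children:
            self._child_for(x, y).insert(x, y, payload)
            return
        self.points.append((x, y, payload))
        # Split only while the cell is still large enough to subdivide.
        if len(self.points) > self.capacity and self.size > 1e-6:
            half = self.size / 2
            self.children = [
                Cell(self.x0 + dx * half, self.y0 + dy * half, half, self.capacity)
                for dy in (0, 1) for dx in (0, 1)
            ]
            pending, self.points = self.points, []
            for px, py, pl in pending:
                self._child_for(px, py).insert(px, py, pl)

    def _child_for(self, x, y):
        half = self.size / 2
        col = 1 if x >= self.x0 + half else 0
        row = 1 if y >= self.y0 + half else 0
        return self.children[row * 2 + col]

    def query(self, qx0, qy0, qx1, qy1):
        # Skip cells that do not intersect the query rectangle.
        if qx1 < self.x0 or qx0 > self.x0 + self.size:
            return []
        if qy1 < self.y0 or qy0 > self.y0 + self.size:
            return []
        hits = [p for (x, y, p) in self.points
                if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        for child in self.children:
            hits.extend(child.query(qx0, qy0, qx1, qy1))
        return hits


def merge_intervals(records, width, min_records):
    """Bucket (t, x, y, payload) records into fixed-width time intervals, then
    merge each bucket into its predecessor while that predecessor is sparse.
    Every merged bucket except possibly the last holds >= min_records."""
    buckets = {}
    for t, x, y, payload in records:
        buckets.setdefault(int(t // width), []).append((t, x, y, payload))
    merged = []  # list of (start_time, records), ordered by start_time
    for key in sorted(buckets):
        if merged and len(merged[-1][1]) < min_records:
            merged[-1][1].extend(buckets[key])
        else:
            merged.append((key * width, list(buckets[key])))
    return merged


class MLIndexSketch:
    """Primary layer: sorted interval starts. Secondary layer: one Cell tree each."""

    def __init__(self, records, width=10, min_records=2, capacity=2, extent=360.0):
        merged = merge_intervals(records, width, min_records)
        self.starts = [start for start, _ in merged]
        self.trees = []
        for _, recs in merged:
            tree = Cell(0.0, 0.0, extent, capacity)
            for t, x, y, payload in recs:
                tree.insert(x, y, (t, payload))
            self.trees.append(tree)

    def query(self, t0, t1, qx0, qy0, qx1, qy1):
        # Start at the last bucket whose start is <= t0 (or the first bucket).
        i = max(bisect_right(self.starts, t0) - 1, 0)
        out = []
        while i < len(self.starts) and self.starts[i] <= t1:
            for t, payload in self.trees[i].query(qx0, qy0, qx1, qy1):
                if t0 <= t <= t1:
                    out.append(payload)
            i += 1
        return sorted(out)


records = [(1, 10, 10, "a"), (3, 200, 50, "b"), (12, 12, 11, "c"),
           (25, 11, 12, "d"), (27, 300, 300, "e")]
idx = MLIndexSketch(records)
print(idx.query(0, 26, 0, 0, 20, 20))
```

In this toy run the interval starting at 10 holds a single record, so the following interval is merged into it, leaving two primary keys instead of three. A query first narrows by time through the sorted keys and only then descends the spatial trees of the matching intervals.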