Cluster Mapping Compression Storage of Monitoring Big Data in Distribution Network Based on Hive

Zhi-jian Qu, Ding-long Chen, Xiang Peng, Qun-feng Wang, Liang Zhao

Abstract


To address the storage problems of power monitoring data in distribution automation systems, a cluster mapping compression storage method is proposed. By avoiding the Reduce task, monitoring big data can be read in parallel by multiple Map tasks and then stored in HDFS after compression. A cluster mapping compression test is then conducted. The experimental results show that, for the Deflate, Gzip, Lzo and Snappy formats (but not for Bzip2), importing data with cluster mapping compression takes much less time than importing uncompressed data. The time saving grows with data volume, and this growth slows once the number of records exceeds 20 million. In particular, data compressed in the Deflate format achieves the best result. Therefore, the cluster mapping compression method can effectively solve the storage problem of monitoring big data.
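
The map-only idea described in the abstract can be illustrated with a small local sketch. It is not the authors' Hive or Hadoop implementation: it only imitates the data flow with the Python standard library, under several assumptions. The function names (split_records, map_task, map_only_store, read_back) and the record layout are hypothetical, in-memory byte strings stand in for HDFS part files, zlib stands in for the Deflate format, and Lzo and Snappy are left out because the standard library does not provide them. Each simulated Map task compresses its own input split and writes its own output, and no Reduce step shuffles or merges the results.

```python
import bz2
import gzip
import zlib
from concurrent.futures import ThreadPoolExecutor

# Codec table: name -> (compress, decompress).
# Assumption: zlib stands in for Deflate; Lzo and Snappy are omitted
# because they are not part of the Python standard library.
CODECS = {
    "deflate": (zlib.compress, zlib.decompress),
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
}


def split_records(records, num_splits):
    """Partition records into contiguous input splits, one per map task."""
    size = max(1, -(-len(records) // num_splits))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_task(split, codec):
    """One simulated map task: serialize its split and compress it."""
    compress, _ = CODECS[codec]
    payload = "\n".join(split).encode("utf-8")
    return compress(payload)


def map_only_store(records, codec="deflate", num_maps=4):
    """Run the map tasks in parallel and return one compressed part each.

    There is no reduce step: outputs are never shuffled or merged.
    Records are assumed not to contain newline characters.
    """
    splits = split_records(records, num_maps)
    with ThreadPoolExecutor(max_workers=num_maps) as pool:
        return list(pool.map(lambda s: map_task(s, codec), splits))


def read_back(parts, codec):
    """Decompress every part and return the records in original order."""
    _, decompress = CODECS[codec]
    records = []
    for part in parts:
        records.extend(decompress(part).decode("utf-8").split("\n"))
    return records
```

For example, calling map_only_store(records, codec="gzip", num_maps=4) on a list of text records returns up to four independently compressed parts, and read_back(parts, "gzip") restores the original list. Timing such calls locally says nothing about cluster import times, which depend on HDFS and the Hadoop codecs actually used.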


DOI
10.12783/dtetr/mimece2016/10029
