Cluster Mapping Compression Storage of Monitoring Big Data in Distribution Network Based on Hive
Abstract
To address the storage problems of power monitoring data in distribution automation systems, a cluster mapping compression storage method is proposed. By avoiding the Reduce task, monitoring big data can be read in parallel by multiple Map tasks and then stored in HDFS after compression. A cluster mapping compression test is conducted at the end of the paper. The experimental results show that data import time with cluster mapping compression in the Deflate, Gzip, Lzo and Snappy formats is much shorter than without compression; Bzip2 is the exception. The time saving increases with data volume, and this increase slows once the number of records exceeds 20 million. Of the formats tested, Deflate gives the best result. The cluster mapping compression method can therefore effectively solve the storage problem of monitoring big data.
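The map-only idea can be illustrated with a minimal local sketch. It is not the paper's Hive or Hadoop implementation: threads stand in for Map tasks, an in-memory list stands in for HDFS, and all function names (`split_records`, `map_task`, `map_only_compress`, `read_back`) are hypothetical. Only the formats available in Python's standard library (Deflate, Gzip, Bzip2) are covered; Lzo and Snappy are omitted.

```python
import bz2
import gzip
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical codec table: (compress, decompress) pairs for the formats
# that Python's standard library provides. Lzo and Snappy are not included.
CODECS = {
    "deflate": (zlib.compress, zlib.decompress),
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
}


def split_records(records, num_splits):
    """Divide records into roughly equal contiguous splits, one per map task."""
    size = max(1, -(-len(records) // num_splits))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_task(split, codec):
    """Serialize one split and compress it; no shuffle or reduce step follows."""
    compress, _ = CODECS[codec]
    payload = "\n".join(split).encode("utf-8")
    return compress(payload)


def map_only_compress(records, num_splits=4, codec="deflate"):
    """Run independent map tasks in parallel; each yields one compressed part."""
    splits = split_records(records, num_splits)
    with ThreadPoolExecutor(max_workers=num_splits) as pool:
        return list(pool.map(lambda s: map_task(s, codec), splits))


def read_back(parts, codec="deflate"):
    """Decompress every part and concatenate the records in order."""
    _, decompress = CODECS[codec]
    records = []
    for part in parts:
        records.extend(decompress(part).decode("utf-8").split("\n"))
    return records
```

Because each part is produced and compressed independently, no merge step is needed before storage, which is the property the abstract attributes to skipping the Reduce task. The sketch says nothing about the relative import times reported in the paper.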
DOI
10.12783/dtetr/mimece2016/10029