Data Finding, Sharing and Duplication Removal in the Cloud Using File Checksum Algorithm
Osuolale A. Festus
Citation: Osuolale A. Festus, "Data Finding, Sharing and Duplication Removal in the Cloud Using File Checksum Algorithm", International Journal of Research Studies in Computer Science and Engineering, 2019, 6(1): 23-44.
Cloud computing is a powerful technology that provides a way of storing voluminous data that can easily be accessed anywhere and at any time, eliminating the need to maintain expensive computing hardware, dedicated space, and software. Addressing growing storage needs is a challenging and time-demanding task that requires large computational infrastructure to ensure successful data processing and analysis. With the continuous and exponential growth in the number of users and the size of their data, data deduplication is increasingly a necessity for cloud storage providers. By storing a single copy of duplicate data, cloud providers greatly reduce their storage and data transfer costs. This project provides an overview of cloud computing, cloud file services, their accessibility, and storage. It also considers storage optimization through de-duplication, reviewing existing data de-duplication strategies, processes, and implementations for the benefit of cloud service providers and cloud users. The project further proposes an efficient method for detecting and removing duplicates using file checksum algorithms, computing the digest of each file, which takes less time than previously implemented methods.
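The checksum-based duplicate detection the abstract describes can be sketched as follows. This is a minimal illustration only, assuming SHA-256 as the digest function (the paper's specific checksum algorithm may differ), and is not the authors' implementation: each file's content is hashed, and files sharing the same digest are flagged as duplicates so that only one copy needs to be stored.

```python
import hashlib

def file_digest(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading in fixed-size chunks
    so that large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(paths):
    """Group files by content digest; return the groups that contain
    more than one path, i.e. the sets of duplicate files."""
    groups = {}
    for p in paths:
        groups.setdefault(file_digest(p), []).append(p)
    return [g for g in groups.values() if len(g) > 1]
```

In a deduplicating store, the digest also serves as the lookup key: before uploading, the client (or server) checks whether the digest already exists and, if so, records a reference to the stored copy instead of transferring the data again.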