Author Ren, Jin
Title Architecture and performance evaluation of data storage systems
Descript 186 p
Note Source: Dissertation Abstracts International, Volume: 72-06, Section: B, page: 3604
Adviser: Qing Yang
Thesis (Ph.D.)--University of Rhode Island, 2011
With the advancement of cloud computing technologies, both personal and business users tend to store more and more data in consolidated data centers, which can be accessed from anywhere using computers or smart devices. Multiple users may upload identical or similar content, which results in a large amount of duplicated data in the data center. Besides cloud services, emerging virtualization technologies allow hundreds of virtual machines to run on one physical machine, which then needs to store many copies of similar operating systems and applications. Traditional data storage systems are not able to fully exploit such data redundancy. This dissertation presents a new approach that identifies similar data blocks and stores them in compact formats to improve the performance of the storage system.
A histogram-based signature is proposed to capture the similarity between data blocks whose contents are similar or shifted. Similar data blocks are clustered into the same group based on their signatures. Furthermore, a heatmap algorithm is designed to find the most popular block among similar blocks, considering both the temporal and content locality of data blocks. Finally, a high-speed delta coding algorithm is developed to compress similar blocks into small deltas.
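The record does not give the signature's construction, so the following is only a minimal sketch of the general idea, under stated assumptions: the signature is taken to be the `k` most frequent byte values in a block, and blocks with equal signatures are placed in one group. The names `signature` and `group_by_signature` and the choice `k = 4` are hypothetical and are not taken from the dissertation.

```python
from collections import Counter, defaultdict


def signature(block: bytes, k: int = 4) -> tuple:
    """Return the k most frequent byte values in the block (assumed scheme).

    A byte histogram ignores byte positions, so rotating the block's
    contents leaves the signature unchanged.
    """
    counts = Counter(block)
    # Sort by descending count, then by byte value, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return tuple(value for value, _ in ranked[:k])


def group_by_signature(blocks):
    """Map each signature to the indices of the blocks that share it."""
    groups = defaultdict(list)
    for index, block in enumerate(blocks):
        groups[signature(block)].append(index)
    return dict(groups)


original = b"aaaaaaaabbbbbbccccdd" + b"xyz"
shifted = original[5:] + original[:5]   # same bytes, rotated by 5 positions
unrelated = b"zzzzzzzzyyyyyyxxxxww"
groups = group_by_signature([original, shifted, unrelated])
```

Here `original` and `shifted` fall into the same group because their histograms are identical, while `unrelated` forms its own group. A practical scheme would also have to tolerate small content changes, which this sketch handles only to the extent that they do not reorder the top `k` byte values.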
The proposed approach leverages a flash-memory-based solid-state disk (SSD) to store a single copy, the reference, for many redundant data blocks. Other similar blocks are stored as small deltas that refer to the reference block on the SSD. Compared to conventional magnetic hard disks, a flash-based SSD is orders of magnitude faster in terms of latency. Thus the reference block stored on the SSD can be retrieved quickly, and I/O requests to other similar blocks can be served by combining the corresponding deltas with the reference block, avoiding slow hard disk accesses.
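The reference-plus-delta read path can be illustrated with a deliberately simple stand-in for the delta coder. This sketch assumes equal-sized blocks and records a delta as a list of `(offset, byte)` pairs where a block differs from its reference; it is not the high-speed delta coding algorithm of the dissertation, and `make_delta` and `apply_delta` are hypothetical names.

```python
def make_delta(reference: bytes, block: bytes) -> list:
    """Record the positions where block differs from reference (assumed format)."""
    if len(reference) != len(block):
        raise ValueError("this sketch assumes equal-sized blocks")
    return [(i, b) for i, (r, b) in enumerate(zip(reference, block)) if r != b]


def apply_delta(reference: bytes, delta: list) -> bytes:
    """Rebuild a block by patching a copy of the reference with the delta."""
    out = bytearray(reference)
    for offset, value in delta:
        out[offset] = value
    return bytes(out)


reference = bytes(4096)              # stands in for the reference block on the SSD
similar = bytearray(reference)
similar[10] = 7
similar[200] = 9
similar = bytes(similar)

delta = make_delta(reference, similar)       # two entries instead of 4096 bytes
rebuilt = apply_delta(reference, delta)      # serves a read of the similar block
```

Serving a read then costs one fetch of the reference plus a small patch, which matches the intent described above of avoiding a separate hard disk access for each similar block.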
Two prototypes of the proposed data storage system have been implemented, one as part of the Linux kernel virtual machine monitor and the other as a Linux device driver. Numerical results on standard benchmarks show an order-of-magnitude improvement of the new storage system over existing disk I/O architectures such as RAID and the SSD/HDD storage hierarchy.
The last part of this dissertation presents a block-level versioning system that is able to recover data to any point in time in the past. The versioning system is independent of operating systems because it uses a network storage protocol. Version creation, log maintenance, and version recovery are done at the storage target to offload the versioning overhead from application servers. Experiments on Linux, Windows, and Solaris have demonstrated that the new versioning system allows users to recover selected files with much smaller metadata cost compared to existing file system versioning systems.
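Point-in-time recovery at the block level can be sketched as a per-block write log that is searched by timestamp. The following is an illustrative model only, assuming caller-supplied integer timestamps and an in-memory log; the class `VersionedVolume` and its methods are hypothetical and do not describe the dissertation's storage-target implementation.

```python
import bisect


class VersionedVolume:
    """Toy block device that keeps every write so past states can be read."""

    def __init__(self, block_size: int = 512):
        self.block_size = block_size
        self._log = {}  # block number -> (timestamps, versions), both in write order

    def write(self, lba: int, data: bytes, timestamp: int) -> None:
        times, versions = self._log.setdefault(lba, ([], []))
        if times and timestamp < times[-1]:
            raise ValueError("timestamps must be non-decreasing per block")
        times.append(timestamp)
        versions.append(bytes(data))

    def read_at(self, lba: int, timestamp: int) -> bytes:
        """Return the block as it was at the given time."""
        times, versions = self._log.get(lba, ([], []))
        # Index of the first logged write strictly later than the requested time.
        pos = bisect.bisect_right(times, timestamp)
        if pos == 0:
            return bytes(self.block_size)  # not yet written: reads as zeros
        return versions[pos - 1]


volume = VersionedVolume()
volume.write(3, b"v1", timestamp=10)
volume.write(3, b"v2", timestamp=20)
before_overwrite = volume.read_at(3, 15)
```

Because the log is keyed by block number rather than by file, nothing in it depends on the file system or operating system of the client, which is the property the abstract attributes to doing versioning behind a network storage protocol.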
School code: 0186
Host Item Dissertation Abstracts International 72-06B
Subject Engineering, Computer
Physics, Solid State
Computer Science
0464
0600
0984
Alt Author University of Rhode Island. Electrical Engineering