Author Eng, Kenjone
Title Schema-based compression of XML streams and data
ISBN 9780549277835
Description 98 p.
Note Source: Masters Abstracts International, Volume: 46-02, page: 1010
Adviser: Christopher League
Thesis (M.S.)--Long Island University, The Brooklyn Center, 2007
XML was created with one purpose in mind: to allow the exchange of structured data over the web between applications running on different native platforms. The strength and popularity of XML lie in its extensibility and self-describing nature. Tags and attributes are defined by the user, so the format does not restrict how the data is defined. XML focuses on what the data is rather than how the data should look.
However, this extensibility comes at a cost. The verbosity of XML compromises efficiency in storage and transmission. XML markup is extremely tag-heavy. In some cases, the size of an XML document is unacceptable when compared with the same data in its original untagged format. This raises serious bandwidth concerns for data exchange over the web. Our approach to this problem is compression.
Compression ultimately pursues one of three goals: maximum compression ratio, maximum compression efficiency, or a balance of both. While the idea of XML-specific compression is not entirely new, we will look closely at a new schema-based (also called type-based) compression technique implemented in a tool called rngzip. This method is the first in the context of Relax NG.
We will show that, by using the design knowledge in the form of an automaton, the transitions at each choice point can be labeled strategically. This allows us to reconstruct the original document by recording only the transitions and the text. By separating the document structure from the data and then compressing them separately, better compression ratios can be achieved. In turn, we can transmit XML documents more compactly.
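The idea of recording only transitions and text can be illustrated with a minimal Python sketch. This is not rngzip and is not taken from the thesis: the toy schema (a list element holding zero or more item elements that contain only text), the labels 1 and 0, and the function names encode and decode are all assumptions made for illustration, and zlib stands in for whatever back-end compressor is actually used.

```python
import zlib
import xml.etree.ElementTree as ET

# Hypothetical toy schema (an assumption, not from the thesis):
# <list> holds zero or more <item> elements, each containing only text.
# The automaton has a single choice point, "another item or end of list",
# whose two transitions are labeled 1 and 0.
MORE_ITEMS, END_OF_LIST = 1, 0


def encode(xml_text):
    """Split a document into a transition stream and a text stream."""
    root = ET.fromstring(xml_text)
    transitions = bytearray()
    texts = []
    for item in root.findall("item"):
        transitions.append(MORE_ITEMS)   # choice taken: one more item
        texts.append(item.text or "")
    transitions.append(END_OF_LIST)      # choice taken: close the list
    # Tag names are never stored; the schema already implies them.
    # NUL cannot occur in XML text, so it is a safe separator here.
    structure = zlib.compress(bytes(transitions))
    data = zlib.compress("\0".join(texts).encode("utf-8"))
    return structure, data


def decode(structure, data):
    """Rebuild the document by replaying the recorded transitions."""
    transitions = zlib.decompress(structure)
    texts = iter(zlib.decompress(data).decode("utf-8").split("\0"))
    root = ET.Element("list")
    for label in transitions:
        if label == END_OF_LIST:
            break
        ET.SubElement(root, "item").text = next(texts)
    return ET.tostring(root, encoding="unicode")
```

Calling decode(*encode(doc)) on a document that fits the toy schema returns the same markup. A real schema has many choice points, so a practical encoder would derive the automaton from the schema itself rather than hard-code it as done here.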
School code: 0198
Host Item Masters Abstracts International 46-02
Subject Computer Science
Alt Author Long Island University, The Brooklyn Center