Author Lu, Caimei
Title Exploiting Social Tagging Network for Web Mining and Search
ISBN 9781267173775
Description 154 p
Note Source: Dissertation Abstracts International, Volume: 73-05, Section: A, page: 1602
Adviser: Tony Hu
Thesis (Ph.D.)--Drexel University, 2011
The rapidly growing body of social data created by users through Web 2.0 applications has stimulated active research in the data mining and information retrieval (IR) communities. The social tags created by users through social tagging systems are one type of such social data. This thesis investigates whether and how social tagging data can be used to enhance the performance of web mining and search methods.
First, to determine whether social tags are effective document features for representing and indexing web documents, the author compares social tags with other types of index terms, including expert-created subject terms, author-provided keywords and description terms, and the content words of web documents. The results of the comparison studies show that social tags contain both high-quality index terms and subjective, personal terms. In addition, like author-provided keywords and description terms, social tags provide information beyond the content words of the tagged web documents, but social tags are more effective than author-provided keywords and description terms as independent document features for web clustering.
The author further investigates different approaches to improving web clustering performance by leveraging social tagging data. The author proposes a novel clustering method called Tripartite Clustering, which clusters web documents, users and tags simultaneously based on the social tagging network. The author also investigates two other social tagging-based clustering approaches that use the K-means and Link K-means clustering methods. Experimental results show that all of the tag-based clustering methods significantly improve on the performance of content-based clustering. Compared to tag-based K-means and Link K-means, Tripartite Clustering achieves equivalent or better performance and produces more useful information.
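As an illustration of the general idea of tag-based clustering, the following minimal Python sketch represents each document by a normalized vector of its tags and groups the documents with a plain K-means loop. It is not the thesis's Tripartite Clustering or Link K-means method, whose details this record does not give; the documents, tags, function names and the deterministic choice of initial centroids are all assumptions made for the example.

```python
from collections import Counter
import math


def tag_vector(tags, vocabulary):
    """Turn a list of tags into a unit-length count vector over the vocabulary."""
    counts = Counter(tags)
    vec = [float(counts[t]) for t in vocabulary]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Plain K-means; the first k vectors serve as initial centroids (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        for i, v in enumerate(vectors):
            assignment[i] = min(range(k), key=lambda c: squared_distance(v, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [vectors[i] for i in range(len(vectors)) if assignment[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Hypothetical tagged documents, invented for illustration only.
docs = {
    "d1": ["python", "programming", "tutorial"],
    "d2": ["recipe", "cooking", "dinner"],
    "d3": ["python", "programming", "code"],
    "d4": ["cooking", "recipe", "baking"],
}
vocabulary = sorted({t for tags in docs.values() for t in tags})
names = list(docs)
vectors = [tag_vector(docs[n], vocabulary) for n in names]
labels = dict(zip(names, kmeans(vectors, k=2)))
print(labels)
```

In this toy data the two programming documents share tags and fall into one cluster, while the two cooking documents fall into the other. A content-based variant would build the vectors from page words instead of tags.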
The author also develops a novel personalized search framework based on a hyper-graph model of social tagging. During the search process, the proposed framework combines three types of relations from the social tagging network for query expansion and ranking: the social relation among users, the semantic relation among tags, and the tripartite relation among users, tags and web documents. Experiments demonstrate that the proposed personalized search framework is more efficient and effective than baseline search methods.
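To make the notion of personalized query expansion over (user, tag, document) relations more concrete, here is a deliberately simplified Python sketch. It does not reproduce the thesis's hyper-graph model, which this record does not specify; instead it assumes that a query tag is expanded with the tags the same user applied to the same documents, and that documents are ranked by how many expanded tags they carry. The triples, function names and scoring rule are all hypothetical.

```python
from collections import defaultdict

# Hypothetical (user, tag, document) annotations, invented for illustration.
annotations = [
    ("alice", "python", "doc1"),
    ("alice", "programming", "doc1"),
    ("bob", "python", "doc2"),
    ("bob", "snake", "doc2"),
    ("alice", "programming", "doc3"),
    ("bob", "reptile", "doc2"),
]


def tags_by_document(triples):
    """Collect the set of tags attached to each document by any user."""
    doc_tags = defaultdict(set)
    for _, tag, doc in triples:
        doc_tags[doc].add(tag)
    return doc_tags


def expand_query(query_tag, triples, user):
    """Add tags that `user` applied to the documents they tagged with query_tag."""
    user_docs = {doc for u, tag, doc in triples if u == user and tag == query_tag}
    expanded = {query_tag}
    for u, tag, doc in triples:
        if u == user and doc in user_docs:
            expanded.add(tag)
    return expanded


def rank(query_tags, triples):
    """Order documents by the number of query tags they carry (ties broken by name)."""
    doc_tags = tags_by_document(triples)
    scores = {doc: len(tags & query_tags) for doc, tags in doc_tags.items()}
    return sorted((d for d, s in scores.items() if s > 0), key=lambda d: (-scores[d], d))


alice_results = rank(expand_query("python", annotations, "alice"), annotations)
bob_results = rank(expand_query("python", annotations, "bob"), annotations)
print(alice_results)
print(bob_results)
```

The same query tag yields different rankings for the two users because each user's own tagging history drives the expansion, which is the basic intuition behind personalizing search with tagging data.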
Finally, the author proposes a novel topic model that simulates the generation of social tags and thereby discovers the topical structures of documents and users' tagging perspectives. Experimental analysis shows that the proposed Topic-Perspective model has better generative ability than topic models proposed in the existing literature. In addition, this model generates more useful information about document topics, user perspectives, and the impact of document topics and user perspectives on tag generation.
In sum, this thesis not only reveals the potential value of social tags as document features and index terms for web documents, but also demonstrates how the social tagging network can be used effectively for web mining and search.
School code: 0065
Host Item Dissertation Abstracts International 73-05A
Subject Web Studies
Information Science
Alt Author Drexel University