Author Losee, Robert M. (Robert MacLean), 1952- author
Title Predicting information retrieval performance / Robert M. Losee
Imprint [San Rafael, California] : Morgan & Claypool, 2018
ISBN 1681734737
9781681734736
Standard no. 10.2200/S00887ED1V01Y201811ICR065
Description 1 online resource (xix, 59 pages) : illustrations
text txt rdacontent
computer c rdamedia
online resource cr rdacarrier
Series Synthesis lectures on information concepts, retrieval, and services, 1947-9468 ; # 65
Synthesis lectures on information concepts, retrieval, and services ; # 65. 1947-9468
Note Title from PDF title page (viewed on January 3, 2019)
Includes bibliographical references (pages 57-58)
1. Information retrieval: a predictive science -- 1.1 Rules, measures, and science -- 1.2 The science of information retrieval --
2. Probabilities and probabilistic information retrieval -- 2.1 Probabilities -- 2.2 Probabilistic retrieval -- 2.2.1 Bayes rule -- 2.3 Describing a probability --
3. Information retrieval performance measures -- 3.1 Precision and recall -- 3.1.1 Precision -- 3.1.2 Recall -- 3.2 Precision-recall graphs -- 3.2.1 High-precision systems and high-recall systems -- 3.2.2 Generality and fallout -- 3.3 High-precision performance measures -- 3.3.1 Mean average precision -- 3.3.2 Precision at k -- 3.3.3 Discounted cumulative gain -- 3.3.4 F measure -- 3.3.5 Mean reciprocal rank -- 3.4 Receiver operating characteristic -- 3.5 Search lengths -- 3.5.1 Expected search length -- 3.5.2 Average search length and the expected position of a relevant document -- 3.6 Summary --
4. Single-term performance -- 4.1 A single binary term -- 4.2 Average search length and expected position of a relevant document -- 4.3 Probability of optimal ranking (Q) -- 4.3.1 Developing Q -- 4.3.2 Best-case ranking -- 4.3.3 Interpreting Pr(p > t) -- 4.3.4 Worst-case ranking -- 4.3.5 Random ranking -- 4.3.6 Q and ranking by inverse document frequency weighting -- 4.3.7 Q and ranking by decision-theoretic weighting -- 4.4 Expected position of a relevant document given Q and A values -- 4.4.1 Best- and worst-case performance -- 4.5 General discrete single feature distribution models -- 4.5.1 Binary feature model -- 4.5.2 A single Poisson feature -- 4.6 General continuous feature distribution models -- 4.6.1 Normal distribution -- 4.7 Advanced models of Q -- 4.7.1 Point probabilities -- 4.7.2 Distribution instead of a point probability -- 4.7.3 Inaccurate knowledge about a distribution -- 4.8 Predicting performance with high precision retrieval -- 4.8.1 A changing for different segments of an ordering -- 4.9 Summary --
5. Performance with multiple binary features -- 5.1 Partial dependence -- 5.1.1 Maximum spanning tree -- 5.1.2 Simple maximum spanning trees -- 5.1.3 Partial dependencies with Bahadur Lazarsfeld and generalized dependence models -- 5.2 Singular value decomposition -- 5.3 Teugels' models for full dependence -- 5.3.1 Covariances to probabilities -- 5.3.2 Averages to probabilities -- 5.3.3 Probabilities to covariances -- 5.3.4 Probabilities to averages -- 5.4 Example with documents -- 5.5 Predicting performance with feature dependence -- 5.6 Validation -- 5.7 Feature independence -- 5.8 Summary --
6. Applications: metadata and linguistic labels -- 6.1 Metadata and indexing -- 6.1.1 Assigning terms to relevant documents -- 6.1.2 Not assigning terms to relevant documents -- 6.1.3 Not assigning terms to non-relevant documents -- 6.1.4 Incorrectly assigning terms to non-relevant document -- 6.2 Validating metadata rules -- 6.3 Natural language part-of-speech tags -- 6.3.1 Best- and worst-case performance with tags -- 6.4 Summary --
7. Conclusion -- A. Variables -- Bibliography -- Author's biography
Compendex
INSPEC
Google scholar
Google book search
Information Retrieval performance measures are usually retrospective in nature, representing the effectiveness of an experimental process. However, in the sciences, phenomena may be predicted, given parameter values of the system. After developing a measure that can be applied retrospectively or can be predicted, performance of a system using a single term can be predicted given several different types of probabilistic distributions. Information Retrieval performance can be predicted with multiple terms, where statistical dependence between terms exists and is understood. These predictive models may be applied to realistic problems, and then the results may be used to validate the accuracy of the methods used. The application of metadata or index labels can be used to determine whether or not these features should be used in particular cases. Linguistic information, such as part-of-speech tag information, can increase the discrimination value of existing terminology and can be studied predictively. This work provides methods for measuring performance that may be used predictively. Means of predicting these performance measures are provided, both for the simple case of a single term in the query and for multiple terms. Methods of applying these formulae are also suggested.
Link Print version: 9781681734729 9781681734743
Subject Information retrieval -- Evaluation
LITERARY CRITICISM / Books & Reading bisacsh
Information retrieval -- Evaluation. fast (OCoLC)fst00972624
Information Retrieval
performance measures
science
predicting performance
single term models
statistical feature dependence
multiple features
metadata performance
natural language performance measures
Electronic books