Cover
Learning to Rank for Information Retrieval
ISBN 978-3-642-14266-6
Preface
Acknowledgements
Contents
Part I: Overview of Learning to Rank
Chapter 1: Introduction
1.1 Overview
1.2 Ranking in Information Retrieval
1.2.1 Conventional Ranking Models
1.2.1.1 Relevance Ranking Models
1.2.1.2 Importance Ranking Models
1.2.2 Query-Level Position-Based Evaluations
Mean Reciprocal Rank (MRR)
Mean Average Precision (MAP)
Discounted Cumulative Gain (DCG)
Rank Correlation (RC)
1.3 Learning to Rank
1.3.1 Machine Learning Framework
1.3.2 Definition of Learning to Rank
Feature Based
Discriminative Training
1.3.3 Learning-to-Rank Framework
1.3.3.1 The Pointwise Approach
1.3.3.2 The Pairwise Approach
1.3.3.3 The Listwise Approach
1.4 Book Overview
1.5 Exercises
References
Part II: Major Approaches to Learning to Rank
Chapter 2: The Pointwise Approach
2.1 Overview
2.2 Regression-Based Algorithms
2.2.1 Subset Ranking with Regression
2.3 Classification-Based Algorithms
2.3.1 Binary Classification for Ranking
SVM-Based Method
Logistic Regression-Based Method
2.3.2 Multi-class Classification for Ranking
Boosting Tree-Based Method
Association Rule Mining-Based Method
2.4 Ordinal Regression-Based Algorithms
2.4.1 Perceptron-Based Ranking (PRanking)
2.4.2 Ranking with Large Margin Principles
2.4.3 Ordinal Regression with Threshold-Based Loss Functions
2.5 Discussions
2.5.1 Relationship with Relevance Feedback
2.5.2 Problems with the Pointwise Approach
2.5.3 Improved Algorithms
2.6 Summary
2.7 Exercises
References
Chapter 3: The Pairwise Approach
3.1 Overview
3.2 Example Algorithms
3.2.1 Ordering with Preference Function
3.2.2 SortNet: Neural Network-Based Sorting Algorithm
3.2.3 RankNet: Learning to Rank with Gradient Descent
3.2.4 FRank: Ranking with a Fidelity Loss
3.2.5 RankBoost
3.2.6 Ranking SVM
3.2.7 GBRank
3.3 Improved Algorithms
3.3.1 Multiple Hyperplane Ranker
3.3.2 Magnitude-Preserving Ranking
3.3.3 IR-SVM
3.3.4 Robust Pairwise Ranking with Sigmoid Functions
3.3.5 P-norm Push
3.3.6 Ordered Weighted Average for Ranking
3.3.7 LambdaRank
3.3.8 Robust Sparse Ranker
3.4 Summary
3.5 Exercises
References
Chapter 4: The Listwise Approach
4.1 Overview
4.2 Minimization of Measure-Specific Loss
4.2.1 Measure Approximation
4.2.1.1 SoftRank
4.2.1.2 Decision Theoretic Framework for Ranking
4.2.1.3 Approximate Rank
4.2.1.4 SmoothRank
4.2.2 Bound Optimization
4.2.2.1 SVM^map
4.2.3 Non-smooth Optimization
4.2.3.1 AdaRank
4.2.3.2 Genetic Programming-Based Algorithms
4.2.4 Discussions
4.3 Minimization of Non-measure-Specific Loss
4.3.1 ListNet
4.3.2 ListMLE
4.3.3 Ranking Using Cumulative Distribution Networks
4.3.4 BoltzRank
4.4 Summary
4.5 Exercises
References
Chapter 5: Analysis of the Approaches
5.1 Overview
5.2 The Pointwise Approach
5.3 The Pairwise Approach
5.4 The Listwise Approach
5.4.1 Non-measure-Specific Loss
5.4.2 Measure-Specific Loss
5.5 Summary
5.6 Exercises
References
Part III: Advanced Topics in Learning to Rank
Chapter 6: Relational Ranking
6.1 General Relational Ranking Framework
6.1.1 Relational Ranking SVM
6.1.2 Continuous Conditional Random Fields
6.2 Learning Diverse Ranking
6.3 Discussions
References
Chapter 7: Query-Dependent Ranking
7.1 Query-Dependent Loss Function
7.2 Query-Dependent Ranking Function
7.2.1 Query Classification-Based Approach
7.2.2 K Nearest Neighbor-Based Approach
7.2.3 Query Clustering-Based Approach
7.2.4 Two-Layer Learning Approach
7.3 Discussions
References
Chapter 8: Semi-supervised Ranking
8.1 Inductive Approach
8.2 Transductive Approach
8.3 Discussions
References
Chapter 9: Transfer Ranking
9.1 Feature-Level Transfer Ranking
9.2 Instance-Level Transfer Ranking
9.3 Discussions
References
Part IV: Benchmark Datasets for Learning to Rank
Chapter 10: The LETOR Datasets
10.1 Overview
10.2 Document Corpora
10.2.1 The "Gov" Corpus and Six Query Sets
10.2.2 The OHSUMED Corpus
10.2.3 The "Gov2" Corpus and Two Query Sets
10.3 Document Sampling
10.4 Feature Extraction
10.5 Meta Information
10.6 Learning Tasks
10.7 Discussions
References
Chapter 11: Experimental Results on LETOR
11.1 Experimental Settings
11.2 Experimental Results on LETOR 3.0
11.3 Experimental Results on LETOR 4.0
11.4 Discussions
11.5 Exercises
References
Chapter 12: Other Datasets
12.1 Yahoo! Learning-to-Rank Challenge Datasets
12.2 Microsoft Learning-to-Rank Datasets
12.3 Discussions
References
Part V: Practical Issues in Learning to Rank
Chapter 13: Data Preprocessing for Learning to Rank
13.1 Overview
13.2 Ground Truth Mining from Logs
13.2.1 User Click Models
13.2.1.1 Dependent Click Model
13.2.1.2 Bayesian Browsing Model
13.2.1.3 Dynamic Bayesian Network Click Model
13.2.2 Click Data Enhancement
13.2.2.1 Learning a User Interaction Model
13.2.2.2 Smoothing Click-Through Data
13.3 Training Data Selection
13.3.1 Document and Query Selection for Labeling
13.3.1.1 Deep Versus Shallow Judgments
13.3.1.2 Actively Learning for Labeling
13.3.2 Document and Query Selection for Training
13.3.2.1 Document Selection Strategies
13.3.2.2 Data Selection by Optimizing PPC
13.3.3 Feature Selection for Training
13.4 Summary
13.5 Exercises
References
Chapter 14: Applications of Learning to Rank
14.1 Overview
14.2 Question Answering
14.2.1 Definitional QA
14.2.2 Quantity Consensus QA
14.2.3 Non-factoid QA
14.2.4 Why QA
14.3 Multimedia Retrieval
14.4 Text Summarization
14.5 Online Advertising
14.6 Summary
14.7 Exercises
References
Part VI: Theories in Learning to Rank
Chapter 15: Statistical Learning Theory for Ranking
15.1 Overview
15.2 Statistical Learning Theory
15.3 Learning Theory for Ranking
15.3.1 Statistical Ranking Framework
15.3.2 Generalization Analysis for Ranking
15.3.3 Statistical Consistency for Ranking
15.4 Exercises
References
Chapter 16: Statistical Ranking Framework
16.1 Document Ranking Framework
16.1.1 The Pointwise Approach
16.1.2 The Pairwise Approach
(1) The U-statistics View
(2) The Average View
16.1.3 The Listwise Approach
16.2 Subset Ranking Framework
16.2.1 The Pointwise Approach
16.2.2 The Pairwise Approach
16.2.3 The Listwise Approach
16.3 Two-Layer Ranking Framework
16.3.1 The Pointwise Approach
16.3.2 The Pairwise Approach
(1) The U-statistics View
(2) The Average View
16.3.3 The Listwise Approach
16.4 Summary
16.5 Exercises
References
Chapter 17: Generalization Analysis for Ranking
17.1 Overview
17.2 Uniform Generalization Bounds for Ranking
17.2.1 For Document Ranking
17.2.2 For Subset Ranking
17.2.3 For Two-Layer Ranking
17.3 Algorithm-Dependent Generalization Bound
17.3.1 For Document Ranking
17.3.2 For Subset Ranking
17.3.3 For Two-Layer Ranking
17.4 Summary
17.5 Exercises
References
Chapter 18: Statistical Consistency for Ranking
18.1 Overview
18.2 Consistency Analysis for Document Ranking
18.2.1 Regarding Pairwise 0-1 Loss
18.3 Consistency Analysis for Subset Ranking
18.3.1 Regarding DCG-Based Ranking Error
18.3.2 Regarding Permutation-Level 0-1 Loss
18.3.3 Regarding Top-k True Loss
18.3.4 Regarding Weighted Kendall's tau
18.4 Consistency Analysis for Two-Layer Ranking
18.5 Summary
18.6 Exercises
References
Part VII: Summary and Outlook
Chapter 19: Summary
References
Chapter 20: Future Work
20.1 Sample Selection Bias
20.2 Direct Learning from Logs
20.3 Feature Engineering
20.4 Advanced Ranking Models
20.5 Large-Scale Learning to Rank
20.6 Online Complexity Versus Accuracy
20.7 Robust Learning to Rank
20.8 Online Learning to Rank
20.9 Beyond Ranking
References
Part VIII: Appendix
Chapter 21: Mathematical Background
21.1 Probability Theory
21.1.1 Probability Space and Random Variables
21.1.2 Probability Distributions
21.1.2.1 Discrete Distribution
21.1.2.2 Continuous Distribution
21.1.2.3 Popular Distributions
21.1.3 Expectations and Variances
21.2 Linear Algebra and Matrix Computation
21.2.1 Notations
21.2.2 Basic Matrix Operations and Properties
21.2.2.1 Matrix Multiplication
21.2.2.2 Identity Matrix
21.2.2.3 Diagonal Matrix
21.2.2.4 Transpose
21.2.2.5 Symmetric Matrix
21.2.2.6 Trace
21.2.2.7 Norm
21.2.2.8 Inverse
21.2.2.9 Orthogonal Matrix
21.2.2.10 Determinant
21.2.2.11 Quadratic Form
21.2.3 Eigenvalues and Eigenvectors
21.3 Convex Optimization
21.3.1 Convex Set and Convex Function
21.3.2 Conditions for Convexity
21.3.2.1 First-Order Condition
21.3.2.2 Second-Order Condition
21.3.3 Convex Optimization Problem
21.3.4 Lagrangian Duality
21.3.5 KKT Conditions
References
Chapter 22: Machine Learning
22.1 Regression
22.1.1 Linear Regression
22.1.2 Probabilistic Explanation
22.2 Classification
22.2.1 Neural Networks
22.2.2 Support Vector Machines
22.2.3 Boosting
22.2.4 K Nearest Neighbor (KNN)
22.3 Statistical Learning Theory
22.3.1 Formalization
22.3.2 Bounds for |R(g) − R̂(g)|
22.3.2.1 Hoeffding's Inequality
22.3.2.2 Uniform Bounds in Finite Case
22.3.2.3 Uniform Bounds in Infinite Case
22.3.2.4 Uniform Bounds Based on Covering Number
22.3.2.5 Uniform Bounds Based on Rademacher Average
References
Index
Tie-Yan Liu
Learning to Rank for Information Retrieval
Tie-Yan Liu
Microsoft Research Asia
Bldg #2, No. 5, Dan Ling Street, Haidian District
Beijing 100080, People’s Republic of China
Tie-Yan.Liu@microsoft.com

ISBN 978-3-642-14266-6
e-ISBN 978-3-642-14267-3
DOI 10.1007/978-3-642-14267-3
Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2011927168

© Springer-Verlag Berlin Heidelberg 2011

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: KünkelLopka GmbH

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)
Preface

In recent years, with the fast growth of the World Wide Web and the difficulties in finding desired information, efficient and effective information retrieval systems have become more important than ever, and the search engine has become an essential tool for many people. The ranker, a central component in every search engine, is responsible for the matching between processed queries and indexed documents. Because of its central role, great attention has been paid to the research and development of ranking technologies. In addition, ranking is also pivotal for many other information retrieval applications, such as collaborative filtering, question answering, multimedia retrieval, text summarization, and online advertising. Leveraging machine learning technologies in the ranking process has led to innovative and more effective ranking models, and has also led to the emergence of a new research area named learning to rank.

This new book gives a comprehensive review of the major approaches to learning to rank, i.e., the pointwise, pairwise, and listwise approaches. For each approach, the basic framework, example algorithms, and their theoretical properties are discussed. Then some recent advances in learning to rank that are orthogonal to the three major approaches are introduced, including relational ranking, query-dependent ranking, semi-supervised ranking, and transfer ranking. Next, we introduce the benchmark datasets for research on learning to rank and discuss some practical issues regarding the application of learning to rank, such as click-through log mining and training data selection/preprocessing. After that, several examples that apply learning-to-rank technologies to solve real information retrieval problems are presented. The book concludes with theoretical discussions of guarantees for ranking performance, and an outlook on future research in learning to rank.

This book is written for researchers and graduate students in information retrieval and machine learning. Familiarity with machine learning, probability theory, linear algebra, and optimization would be helpful though not essential, as the book includes a self-contained brief introduction to the related knowledge in Chaps. 21 and 22. Because learning to rank is still a fast-growing research area, it is impossible to provide a complete list of references. Instead, the aim has been to give references that are representative and hopefully provide entry points into the short but rich literature of learning to rank. This book also provides several promising future research directions on learning to rank, hoping that readers will be inspired to work on these new topics and contribute to this emerging research area in person.

Beijing, People’s Republic of China
February 14, 2011

Tie-Yan Liu
I would like to dedicate this book to my wife and my lovely baby son!