
introduction to search with sphinx.pdf

Copyright
Table of Contents
Preface
Audience
Organization of This Book
Conventions Used in This Book
Using Code Examples
We’d Like to Hear from You
Safari® Books Online
Acknowledgments
Chapter 1. The World of Text Search
Terms and Concepts in Search
Thinking in Documents Versus Databases
Why Do We Need Full-Text Indexes?
Query Languages
Logical Versus Full-Text Conditions
Logical conditions
Full-text queries
Differences between logical and full-text searches
Natural Language Processing
From Text to Words
Linguistics Crash Course
Relevance, As Seen from Outer Space
Result Set Postprocessing
Full-Text Indexes
Search Workflows
Kinds of Data
Indexing Approaches
Full-Text Indexes and Attributes
Approaches to Searching
Kinds of Results
Chapter 2. Getting Started with Sphinx
Workflow Overview
Getting Started ... in a Minute
Basic Configuration
Defining Data Sources
Disk-based indexes
RT indexes
Distributed indexes
Declaring Fields and Attributes in SQL Data
Sphinx-Wide Settings
Managing Configurations with Inheritance and Scripting
Accessing searchd
Configuring Interfaces
Using SphinxAPI
Using SphinxQL
Building Sphinx from Source
Quick Build
Source Build Requirements
Configuring Sources and Building Binaries
Chapter 3. Basic Indexing
Indexing SQL Data
Main Fetch Query
Pre-Queries, Post-Queries, and Post-Index Queries
How the Various SQL Queries Work Together
Ranged Queries for Larger Data Sets
Indexing XML Data
Index Schemas for XML Data
XML Encodings
xmlpipe2 Elements Reference
Working with Character Sets
Handling Stop Words and Short Words
Chapter 4. Basic Searching
Matching Modes
Full-Text Query Syntax
Known Operators
Escaping Special Characters
AND and OR Operators and a Notorious Precedence Trap
NOT Operator
Field Limit Operator
Phrase Operator
Keyword Proximity Operator
Quorum Operator
Strict Order (BEFORE) Operator
NEAR Operator
SENTENCE and PARAGRAPH Operators
ZONE Limit Operator
Keyword Modifiers
Result Set Contents and Limits
Searching Multiple Indexes
Result Set Processing
Expressions
Filtering
Sorting
Grouping
Chapter 5. Managing Indexes
The “Divide and Conquer” Concept
Index Rotation
Picking Documents
Handling Updates and Deletions with K-Lists
Scheduling Rebuilds, and Using Multiple Deltas
Merge Versus Rebuild Versus Deltas
Scripting and Reloading Configurations
Chapter 6. Relevance and Ranking
Relevance Assessment: A Black Art
Relevance Ranking Functions
Sphinx Rankers Explained
BM25 Factor
Phrase Proximity Factor
Overview of the Available Rankers
Nitty-gritty Ranker Details
How Do I Draw Those Stars?
How Do I Rank Exact Field Matches Higher?
How Do I Force Document D to Rank First?
How Does Sphinx Ranking Compare to System XYZ?
Where to Go from Here
Introduction to Search with Sphinx
Andrew Aksyonoff
Beijing • Cambridge • Farnham • Köln • Sebastopol • Tokyo
Introduction to Search with Sphinx
by Andrew Aksyonoff

Copyright © 2011 Andrew Aksyonoff. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://my.safaribooksonline.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 or corporate@oreilly.com.

Editor: Andy Oram
Production Editor: Jasmine Perez
Copyeditor: Audrey Doyle
Proofreader: Jasmine Perez
Cover Designer: Karen Montgomery
Interior Designer: David Futato
Illustrator: Robert Romano

Printing History:
April 2011: First Edition.

Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. Introduction to Search with Sphinx, the image of the lime tree sphinx moth, and related trade dress are trademarks of O’Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

ISBN: 978-0-596-80955-3
[LSI]
1302874422