An introduction to Xapian, an open-source full-text search engine, and how it can be used with PHP to deliver near-instant search results over your data.

Getting Started with Xapian

This article covers:


Indexing concepts and terminology

Search concepts

Knowing your data and users

Installation and setup

Indexing your data

Querying the database

More than the Basics

What you will learn:

How to build a sub-second text search system; how to index and query your data from within PHP; what Xapian is and how it works.

What you should know:

You only need basic PHP and MySQL skills to follow this article; however, it helps to have a database with a large volume of data to index!


Good search is at the core of every web site these days. Users need to be able to find what they’re looking for quickly and accurately, and they need to be able to filter and sort their results in a variety of different ways.

I was introduced to Xapian many years ago as part of a project to improve the search on one of my client’s websites. They sell market research reports and have over 250,000 of them, comprising 1.6GB of textual data (titles, subtitles, summaries and tables of contents) plus a range of numeric and date information (prices, publication dates, etc.). We also wanted to be able to show users result counts by category and date range, in a similar way to eBay’s search.

We looked at a number of different products at the time and Xapian came out on top, primarily because it’s very lightweight, fast (results in microseconds) and provides range search and faceting capabilities. This means that you can run queries like “Find me all reports in vehicle manufacturing priced between £100 and £500” and “Find me all reports with all the words ‘tobacco industry revenue’ published within the last 6 months”.
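To make the terms "range search" and "faceting" concrete, here is a minimal sketch in plain PHP. It deliberately does not use the Xapian API: it only illustrates the two ideas on a small in-memory array. The report records and the function names filterByPriceRange() and countByCategory() are invented for this illustration.

```php
<?php
// Illustration only: plain PHP arrays, NOT the Xapian API.
// The report records below are made-up sample data.
$reports = [
    ['title' => 'Vehicle Manufacturing Outlook', 'category' => 'vehicle manufacturing', 'price' => 250],
    ['title' => 'Global Truck Makers',           'category' => 'vehicle manufacturing', 'price' => 900],
    ['title' => 'Tobacco Industry Revenue',      'category' => 'tobacco',               'price' => 400],
    ['title' => 'Electric Car Supply Chains',    'category' => 'vehicle manufacturing', 'price' => 100],
];

// Range search: keep reports whose price lies in [$min, $max] (inclusive).
function filterByPriceRange(array $reports, float $min, float $max): array
{
    return array_values(array_filter(
        $reports,
        function (array $r) use ($min, $max): bool {
            return $r['price'] >= $min && $r['price'] <= $max;
        }
    ));
}

// Faceting: count how many reports fall under each category.
function countByCategory(array $reports): array
{
    return array_count_values(array_column($reports, 'category'));
}

// "All reports priced between 100 and 500", plus per-category counts.
$matches = filterByPriceRange($reports, 100, 500);
$facets  = countByCategory($matches);

foreach ($facets as $category => $count) {
    echo "$category: $count\n";
}
// Prints:
// vehicle manufacturing: 2
// tobacco: 1
```

A search engine such as Xapian answers the same kind of question from an index rather than by scanning every record, which is what makes it practical at the scale of hundreds of thousands of documents.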

In this article, I’ll introduce some of the concepts behind Xapian, together with some simple examples of the code needed to start using the system, which you can expand upon to create a bespoke search engine for your own data.

