| year | 2001 |
| title | Building a Distributed Full-Text Index for the Web |
| abstract | We identify crucial design issues in building a distributed inverted index for a large collection of web pages. We introduce a novel pipelining technique for structuring the core index-building system that substantially reduces the index construction time. We also propose a storage scheme for creating and managing inverted files using an embedded database system. We propose and compare different strategies for addressing various issues relevant to distributed index construction. Finally, we present performance results from experiments on a testbed distributed indexing system that we have implemented. |
| keywords | Full-text index, Web, WebBase, Text retrieval |
| note | Extended version of paper submitted to ICDE 2001 |