BIB-VERSION:: CS-TR-v2.0
ID:: STAN//CS-TN-94-10
ENTRY:: July 12, 1994
ORGANIZATION:: Stanford University, Department of Computer Science
TITLE:: Precision and Recall of GlOSS Estimators for Database Discovery
TYPE:: Technical Note
AUTHOR:: Tomasic, Anthony
AUTHOR:: Gravano, Luis
AUTHOR:: Garcia-Molina, Hector
PAGES:: 23
ABSTRACT:: The availability of large numbers of network information sources has led to a new problem: finding which text databases (out of perhaps thousands of choices) are the most relevant to a query. We call this the text-database discovery problem. Our solution to this problem, GlOSS--Glossary-Of-Servers Server, keeps statistics on the available databases to decide which ones are potentially useful for a given query. In this paper we present different query-result size estimators for GlOSS and we evaluate them with metrics based on the precision and recall concepts of text-document information-retrieval theory. Our generalization of these metrics uses different notions of the set of relevant databases to define different query semantics.
NOTES:: [Adminitrivia V1/Prg/19940712]
END:: STAN//CS-TN-94-10