Data Mining References
Assigned Readings
Note: some of these links require access to an electronic library,
such as ACM's, and may not be available from non-Stanford machines.
-
Wednesday, 5/17:
H. Mannila, H. Toivonen, and A. I. Verkamo,
``Discovering Frequent Episodes in Sequences.''
First International Conference on Knowledge Discovery and Data Mining,
pp. 210-215, AAAI Press, 1995.
Postscript.
-
Monday, 5/15:
Christos Faloutsos, M. Ranganathan and Yannis Manolopoulos,
``Fast subsequence matching in time-series databases,''
SIGMOD, 1994, pp. 419-429.
PDF.
-
Wednesday, 5/10:
S. Guha, R. Rastogi, and K. Shim,
``CURE: An Efficient Clustering Algorithm for Large Databases,''
SIGMOD 1998.
PDF.
Note: this PDF file requires a large amount of temporary disk space (over 200 MB).
-
Monday, 5/8:
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison L. Powell, and
James C. French,
``Clustering Large Datasets in Arbitrary Metric Spaces,''
ICDE, pp. 502-511, 1999.
PDF.
-
Wednesday, 5/3:
Christos Faloutsos and King-Ip (David) Lin,
``FastMap: A Fast Algorithm for Indexing, Data-Mining and
Visualization of Traditional and Multimedia Datasets,''
ACM SIGMOD, May 1995, San Jose, CA, pp. 163-174.
Gzipped Postscript.
-
Wednesday, 4/26:
P. Bradley, U. Fayyad, and C. Reina,
``Scaling Clustering Algorithms to Large Databases,''
1998 KDD.
Postscript.
-
Monday, 4/24:
S. Brin, ``Extracting Patterns and Relations from the World-Wide Web.''
Postscript.
-
Wednesday, 4/19:
- a)
-
J. Kleinberg, ``Authoritative sources in a hyperlinked environment,''
J. ACM, Sept. 1999, pp. 604-632.
PDF.
- b)
-
S. Brin and L. Page, ``Dynamic Data Mining.''
Postscript.
-
Monday, 4/17:
S. Brin and L. Page, ``The Anatomy of a Large-Scale Hypertextual Web
Search Engine,''
WWW7/Computer Networks 30(1-7), 1998, pp. 107-117.
Postscript.
-
Wednesday, 4/12:
D. Tsur et al., ``Query Flocks: A Generalization of Association-Rule Mining,''
1998 SIGMOD.
Postscript.
-
Monday, 4/10:
E. Cohen et al.,
``Finding Interesting Associations without Support Pruning,''
ICDE 2000.
Postscript.
-
Wednesday, 4/5:
- a)
-
M. Fang, N. Shivakumar, H. Garcia-Molina, R. Motwani, and J. Ullman,
``Computing Iceberg Queries Efficiently,''
1998 VLDB.
Postscript.
- b)
-
H. Toivonen, ``Sampling Large Databases for Association Rules,''
VLDB 1996, pp. 134-145.
Postscript.
-
Monday, 4/3:
J. S. Park, M.-S. Chen, and P. S. Yu, ``An Effective Hash-Based Algorithm
for Mining Association Rules,''
1995 SIGMOD, pp. 175-186.
PDF.
-
Wednesday, 3/29:
- a)
-
R. Agrawal, T. Imielinski, and A. Swami, ``Mining Association Rules between Sets of Items
in Large Databases,'' Proc. of the ACM
SIGMOD Int'l Conference on Management of Data,
Washington D.C., May 1993, pp. 207-216.
Postscript.
PDF.
- b)
-
R. Agrawal and R. Srikant, ``Fast Algorithms for Mining Association Rules,''
Proc. of the 20th Int'l Conference on Very Large
Databases, Santiago, Chile, Sept. 1994.
Postscript.
PDF.
Resources
-
CS145 notes on Datalog.
Postscript;
PDF.
-
ACM SIGKDD (Knowledge Discovery in Databases) home page.
-
CS349, taught previously as a data mining course by Sergey Brin.
-
Heikki Mannila's Papers at the University of Helsinki.
-
The IBM Quest Project.
-
Shinichi Morishita's Papers at the University of Tokyo.
Also, his Recent Papers on genome mining.
-
CACM, Nov. 1996, Special Issue on Data Mining.
-
Univ. of Washington/Microsoft Summer 1997 Institute on data mining.
-
J. Gehrke, W.-Y. Loh, and R. Ramakrishnan, Tutorial on Classification
from the 1999 KDD Conference.
PDF.
Jeffrey D. Ullman
ullman @ cs.stanford.edu
650-494-8016 (home)
650-725-2588 (FAX)