Automatic Extraction of Data from 2-D Plots in Documents
Xiaonan Lu, James Z. Wang, Prasenjit Mitra and C. Lee Giles
The Pennsylvania State University
Two-dimensional (2-D) plots in digital documents contain important
information. Often, the results of scientific experiments and
performance of businesses are summarized using plots. Although 2-D
plots are easily understood by human users, current search engines
rarely utilize the information contained in the plots to enhance the
results returned in response to queries posed by end users. We propose
an automated algorithm for extracting information from line curves in
2-D plots. The extracted information can be stored in a database and
indexed to answer end-user queries and enhance search results. We have
collected 2-D plot images from a variety of resources and tested our
extraction algorithms. Experimental evaluation has demonstrated that
our method can produce results suitable for real-world use.
Xiaonan Lu, James Z. Wang, Prasenjit Mitra, and C. Lee Giles,
"Automatic Extraction of Data from 2-D Plots in Documents,"
Proceedings of the International Conference on Document Analysis and
Recognition, pp. 188-192, Parana, Brazil, September 2007.
Copyright 2007 ICDAR. Permission to make digital or hard copies of all or
part of this work for personal or classroom use is granted without fee
provided that copies are not made or distributed for profit or
commercial advantage and that copies bear this notice and the full
citation on the first page. To copy otherwise, to republish, to post
on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
August 15, 2007.
© 2007, James Z. Wang