Tracing the Provenance and Flow of Data

In this talk, I shall describe an annotation management system that can be used to "eagerly" trace the provenance (i.e., origins) or flow of a piece of data. In this system, every piece of data is assumed to have one or more annotations attached to it. As data is transformed, e.g., through a query, the relevant annotations are automatically propagated along with it. The system also has potential applications in other areas, such as data markup and quality control. We show that optimizing a query in such an annotation management system can differ substantially from traditional query optimization: two queries that a traditional query optimizer considers equivalent may not, in general, be annotation-equivalent (i.e., generate the same annotated outcome). Despite this, we show that the same annotated result is obtained whether the intermediate constructs of a query are evaluated with set or bag semantics. We also give a necessary and sufficient condition, via homomorphisms, for checking whether one query is annotation-contained in another. Even though our characterization suggests that annotation-containment is more complex than query containment, we show that the annotation-containment problem is NP-complete, which places it in the same complexity class as query containment. In addition, we show that the annotation placement problem, which was previously shown to be NP-hard, is in fact DP-hard; the exact complexity of this problem remains open.
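
To make the claim about annotation-equivalence concrete, here is a minimal sketch of two queries that return the same values but different annotations. It assumes a simple copy-based propagation rule (an output value carries only the annotations of the cell it was copied from), and the relations `R` and `S`, the annotation labels, and the helper `join_project` are all hypothetical names chosen for illustration, not part of the system described in the talk.

```python
from collections import defaultdict

# Each relation is a list of rows; each row maps a column name to a
# (value, annotations) pair. This representation is an assumption
# made for illustration only.
R = [{"A": (1, {"r1"})}, {"A": (2, {"r2"})}]
S = [{"A": (1, {"s1"})}, {"A": (3, {"s3"})}]


def join_project(left, right, take_from):
    """Equijoin left and right on column A, then project A from one side.

    Under the assumed copy-based rule, the output value carries only the
    annotations of the cell it was copied from ("left" or "right").
    Duplicate output values are merged and their annotations unioned.
    """
    out = defaultdict(set)
    for l_row in left:
        for r_row in right:
            if l_row["A"][0] == r_row["A"][0]:
                source = l_row if take_from == "left" else r_row
                value, notes = source["A"]
                out[value] |= notes
    return dict(out)


# Q1 projects R.A and Q2 projects S.A. Since the join forces R.A = S.A,
# a traditional optimizer would treat the two queries as equivalent.
q1 = join_project(R, S, "left")
q2 = join_project(R, S, "right")

same_values = set(q1) == set(q2)   # values agree
same_annotations = q1 == q2        # annotations do not

print(q1, q2, same_values, same_annotations)
```

In this sketch both queries return the value 1, but Q1 carries the annotation attached to R's cell while Q2 carries the one attached to S's cell, so an optimizer that freely swaps `R.A` for `S.A` would change the annotated result.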