Data-Driven Processing in Sensor Networks

Jun Yang
Duke University

Wireless sensor networks enable data collection from the physical environment on unprecedented scales. In this talk, I will describe some data processing problems that arise in building an environmental sensing network in Duke Forest, in collaboration with ecologists and statisticians. Because of severe resource constraints on battery-powered sensor nodes, it is infeasible to collect and report all raw readings for centralized processing. An effective approach is model-driven data acquisition, which avoids acquiring readings that can be accurately predicted from known spatio-temporal models of data. We argue for an alternative, data-driven approach, which exploits models in optimizing push-based reporting, but does not depend on the quality of models for correctness. A particularly thorny issue with push-based reporting is transmission failures, which are common in sensor networks, and make failed reports indistinguishable from intentionally suppressed ones. The cost of implementing reliable transmissions is prohibitively high. We show how to inject application-level redundancy in data reporting to enable efficient, effective, and principled resolution of uncertainty in the missing data.
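
The ambiguity between suppressed and failed reports can be made concrete with a small sketch. It assumes a simple value-based suppression rule (a node pushes a reading only when it differs from its last reported value by more than a threshold) and one possible form of application-level redundancy (each report also carries the time of the node's previous report). Both choices, and every name in the code (`Report`, `Node`, `classify`, `epsilon`, `prev_t`), are illustrative assumptions and not necessarily the scheme presented in the talk.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Report:
    t: int                 # time step of the reported reading
    value: float           # the reading itself
    prev_t: Optional[int]  # redundancy (assumed): time of this node's previous report


class Node:
    """Hypothetical sensor node using value-based temporal suppression."""

    def __init__(self, epsilon: float) -> None:
        self.epsilon = epsilon
        self.last_value: Optional[float] = None
        self.last_t: Optional[int] = None

    def step(self, t: int, reading: float) -> Optional[Report]:
        # Suppress the reading if it stays within epsilon of the last reported value.
        if self.last_value is not None and abs(reading - self.last_value) <= self.epsilon:
            return None
        report = Report(t, reading, self.last_t)
        self.last_value, self.last_t = reading, t
        return report


def classify(received: Dict[int, Report], horizon: int) -> Dict[int, str]:
    """Label each time step as 'received', 'suppressed', 'lost', or 'unknown'."""
    labels = {t: "unknown" for t in range(horizon)}
    for t in received:
        labels[t] = "received"
    for t, r in sorted(received.items()):
        # Nothing was sent strictly between prev_t and t, so those steps were suppressed.
        start = 0 if r.prev_t is None else r.prev_t + 1
        for s in range(start, t):
            labels[s] = "suppressed"
        # A report was sent at prev_t; if it never arrived, it was lost in transmission.
        if r.prev_t is not None and r.prev_t not in received:
            labels[r.prev_t] = "lost"
    return labels


if __name__ == "__main__":
    node = Node(epsilon=0.5)
    readings = [20.0, 20.1, 20.2, 21.5, 21.6, 23.0, 23.1, 23.2]
    sent = [r for t, x in enumerate(readings) if (r := node.step(t, x)) is not None]
    # Simulate a transmission failure: the report sent at t = 3 never arrives.
    received = {r.t: r for r in sent if r.t != 3}
    for t, label in classify(received, len(readings)).items():
        print(t, label)
```

In this run the node sends reports at steps 0, 3, and 5, and the report at step 3 is dropped. Without the `prev_t` field, the base station could not tell whether step 3 was suppressed or lost. With it, the report received at step 5 reveals that a report was sent at step 3 and that step 4 was suppressed. Steps 1, 2, 6, and 7 remain unknown, so a single back-pointer narrows the uncertainty without eliminating it.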