Highlight

Discovering Flow Anomalies: A SWEET Approach

Achievement/Results

Given a percentage threshold and readings from a pair of consecutive upstream and downstream sensors, flow anomaly discovery identifies dominant time intervals in which the fraction of time instants with significantly mismatched sensor readings exceeds the given threshold. Discovering flow anomalies (FAs) is an important problem in environmental flow monitoring networks and in early warning detection systems for water quality problems. However, mining FAs is computationally expensive because of the large (potentially infinite) number of time instants of measurement and the potentially long delays caused by stagnant (e.g., lakes) or slow-moving (e.g., wetlands) water bodies between consecutive sensors.
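The definition above can be illustrated with a minimal brute-force sketch. It is not the authors' code: the function names, the fixed `tolerance` used to decide that two readings are "significantly mismatched", and the assumption that the two series are already aligned for travel time between the sensors are all hypothetical choices made for illustration. The selection of dominant intervals is left out because the text does not define it.

```python
from typing import List, Tuple


def transient_flags(upstream: List[float], downstream: List[float],
                    tolerance: float) -> List[bool]:
    """Flag each time instant whose paired readings differ by more than
    `tolerance` (a hypothetical stand-in for a significance test)."""
    return [abs(u - d) > tolerance for u, d in zip(upstream, downstream)]


def qualifying_windows(flags: List[bool],
                       threshold: float) -> List[Tuple[int, int]]:
    """Check every window [start, end] and keep those whose fraction of
    flagged instants exceeds `threshold`. This examines O(n^2) windows,
    which is the cost the text describes as expensive."""
    hits = []
    n = len(flags)
    for start in range(n):
        flagged = 0  # flagged instants seen so far in [start, end]
        for end in range(start, n):
            flagged += flags[end]
            if flagged / (end - start + 1) > threshold:
                hits.append((start, end))
    return hits


# Made-up readings, for illustration only.
upstream = [5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
downstream = [5.0, 9.0, 9.0, 5.0, 9.0, 5.0]
flags = transient_flags(upstream, downstream, tolerance=1.0)
windows = qualifying_windows(flags, threshold=0.7)
```

With these made-up readings, instants 1, 2, and 4 are flagged, and the window from instant 1 to instant 4 qualifies because three of its four instants are mismatched.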

Traditional outlier detection methods (e.g., the t-test) are suited to detecting transient FAs (i.e., time instants of significant mismatch across consecutive sensors) but cannot detect persistent FAs (i.e., long, variable-length time windows containing a high fraction of transient FAs) because no window size is defined in advance. In contrast, we propose a Smart Window Enumeration and Evaluation of persistence-Thresholds (SWEET) method that efficiently explores the search space of all possible window lengths. Computational overhead is reduced significantly by restricting the start and end points of a window to coincide with transient FAs, and by using a smart counter and efficient pruning techniques. Experimental evaluation on a real dataset shows that the proposed approach outperforms naive alternatives.
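The endpoint restriction described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published SWEET algorithm: the function name is hypothetical, a window is treated as dominant when no other reported window contains it, and the single containment check stands in for the smart counter and pruning techniques, whose details the text does not give.

```python
from typing import List, Tuple


def endpoint_restricted_windows(flags: List[bool],
                                threshold: float) -> List[Tuple[int, int]]:
    """Report windows that start and end on a flagged (transient FA)
    instant, whose flagged fraction exceeds `threshold`, and that are
    not contained in another reported window."""
    pos = [t for t, flagged in enumerate(flags) if flagged]
    found = []
    best_end = -1  # furthest end instant reported so far
    for a in range(len(pos)):
        # Try the widest window first, so the first hit for this start
        # is the longest qualifying one.
        for b in range(len(pos) - 1, a - 1, -1):
            if pos[b] <= best_end:
                # Every remaining window for this start lies inside a
                # window reported for an earlier start: skip them all.
                break
            count = b - a + 1            # flagged instants in the window
            length = pos[b] - pos[a] + 1  # all instants in the window
            if count / length > threshold:
                found.append((pos[a], pos[b]))
                best_end = pos[b]
                break
    return found


# Made-up flags, for illustration only.
flags = [False, True, True, False, True, False]
result = endpoint_restricted_windows(flags, threshold=0.7)
```

Because both endpoints are flagged instants, the number of flagged instants in a window follows directly from the positions of its endpoints in `pos`, so no per-window recount is needed. The number of candidate windows also depends on the number of transient FAs rather than on the total number of time instants.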

Address Goals

Discovery: This interdisciplinary work has resulted in the discovery of novel computational methods, e.g., the SWEET algorithm. The results were accepted for publication at a highly regarded peer-reviewed forum, the IEEE Conference on Data Mining (2008), which is extremely selective (1 out of 7).

Research Infrastructure: This work also resulted in a software tool that helps environmental scientists analyze water quality measurements so they can quickly identify flow anomalies and locate pollution sources.