Engineering Riverside, CA 92521, USA
{wli, nkumar, vlolla, eamonn, stelo, ratana}@cs.ucr.edu

Abstract

Recent advancements in sensor technology have made it possible to collect enormous amounts of data in real time. However, because of the sheer volume of data, most of it will never be inspected by an algorithm, much less a human being. One way to mitigate this problem is to perform some type of anomaly (novelty/interestingness/surprisingness) detection and flag unusual patterns for further inspection by humans or more CPU-intensive algorithms. Most current solutions are custom-made for particular domains, such as ECG monitoring, valve pressure monitoring, etc. This customization requires extensive effort by domain experts. Furthermore, hand-crafted systems tend to be very brittle to concept drift. In this demonstration, we will show an online anomaly detection system that does not need to be customized for individual domains, yet performs with exceptionally high precision/recall. The system is based on the recently introduced idea of time series bitmaps. To demonstrate the universality of our system, we will allow testing on independently annotated datasets from domains as diverse as ECGs, Space Shuttle telemetry monitoring, video surveillance, and respiratory data. In addition, we invite attendees to test our system with any dataset available on the web.

1. Introduction

Recent advancements in sensor technology have made it possible to collect enormous amounts of data in real time. However, because of the sheer volume of data, most of it is never inspected by an algorithm, much less a human being. One way to mitigate this problem is to perform some type of anomaly (novelty/interestingness/surprisingness) detection and to flag unusual patterns for future inspection by humans or more CPU-intensive algorithms. Most current solutions are custom-made for particular domains, such as ECG monitoring, valve pressure monitoring, etc. This customization requires extensive effort by domain experts.
Furthermore, hand-crafted systems tend to be very brittle to concept drift.

In this demonstration, we will show an online anomaly detection system that does not need to be customized for individual domains, yet performs with exceptionally high precision/recall. The system is based on the recently introduced idea of time series bitmaps [11]. It allows users to efficiently navigate through a time series of arbitrary length and identify portions that require further investigation. Figure 1 illustrates the graphical interface of our system¹.

Figure 1. A snapshot of the anomaly detection tool.

To demonstrate the universality of our system, we will allow testing on independently annotated datasets from domains as diverse as ECGs, Space Shuttle telemetry monitoring, video surveillance, and respiratory data. In addition, we invite attendees to test our system with any dataset available on the web.

2. Background and Related Work

In this section, we briefly review chaos games and symbolic representations of time series, which are at the heart of our anomaly detection technique.

2.1 Chaos Game Representations

Our visualization technique is partly inspired by the Chaos game [1], an algorithm for drawing fractals. The method can produce a representation of DNA sequences in which both local and global patterns are displayed. The basic idea is to map frequency counts of DNA substrings of length L into a 2^L by 2^L matrix, as shown in Figure 2, and then color-code these frequency counts. From our point of view, the crucial observation is that the CGR