Editor: 丑伊 | 2019-07-16
200 pounds. The query is written as if the x location were precise.

Q1: Select group_id, sum(S.weight)
    From Locations S [Partition By tag_id Rows 1]
    Group By AreaId(S.x, AreaLength) as group_id
    Having sum(S.weight) > 200

Computational Astrophysics. There have been several recent initiatives to apply relational techniques to computational astrophysics. As detailed in a recent workshop paper [17], massive astrophysical surveys will soon generate observations of
10^8 stars and galaxies at nightly data rates of 0.5TB to 20TB. The observations are inherently noisy, as the objects can be too dim to be recognized in a single image. However, repeated observations (up to a thousand times) allow scientists to model the location, brightness, and color of objects using appropriate distributions, represented as (id, time, (x, y)^p, luminosity^p, color^p). Queries can then be issued to detect dynamic features, transient events, and anomalous behaviors. Query Q2 below detects regions of the sky of high luminosity from the observations in the past hour. Similar to Q1, it groups the objects into predefined regions and, for each region whose maximum luminosity exceeds a threshold, reports that maximum luminosity.

Q2: Select group_id, max(S.luminosity)
    From Observations S [Range 1 hour]
    Group By AreaId(S.(x,y), AreaDef) as group_id
    Having max(S.luminosity) > 20

There are several commonalities between the above two examples. First, the uncertain attributes are continuous-valued and usually modeled by a probability density function (pdf). Unfortunately, as noted in recent workshop papers [1, 17], such attributes have been under-addressed in the probabilistic databases and data streams literature. Second, both queries involve complex relational operations on continuous-valued uncertain attributes. In particular, group by's are a form of conditioning operation that restricts the pdf of an uncertain attribute to a region specified in the group condition. An aggregate is then applied to the tuples in each group, using their conditioned distributions. The aggregate result of each group can be further filtered using the Having clause (another form of conditioning operation). Third, such complex operations are performed in real time as tuples arrive. These commonalities characterize the problem we address in this paper: to support conditioning and aggregation operations on data streams involving continuous-valued uncertain attributes.

Challenges. The most salient challenge arises from the fact that to characterize the uncertainty of query results, one generally has to compute the probability distributions of uncertain att........
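To make the conditioning and aggregation in Q1 concrete, the following is a minimal sketch under stated assumptions, not the method developed in the paper. It assumes each tuple's x location is an independent Gaussian given by a mean and standard deviation, that weights are known exactly, and that areas are consecutive intervals of width `area_length`. All function names here are hypothetical. The group-by step is treated as the probability that a tuple falls in an area, and the Having step as the probability that the group's weight sum exceeds a threshold.

```python
import math
from collections import defaultdict


def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def membership_prob(mu, sigma, area_id, area_length):
    """Probability that an uncertain x lies in [area_id*L, (area_id+1)*L)."""
    lo = area_id * area_length
    hi = lo + area_length
    return normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)


def sum_distribution(items):
    """Exact distribution of a sum of independent 'weight with prob p' terms.

    items: list of (weight, p). Returns {total_weight: probability}.
    The support can grow exponentially with the number of tuples, so this
    is only practical for small groups.
    """
    dist = {0.0: 1.0}
    for weight, p in items:
        nxt = defaultdict(float)
        for total, prob in dist.items():
            nxt[total] += prob * (1.0 - p)      # tuple is outside the area
            nxt[total + weight] += prob * p     # tuple is inside the area
        dist = dict(nxt)
    return dist


def prob_sum_exceeds(tuples, area_id, area_length, threshold):
    """P(sum of weights in the area > threshold) for (mu, sigma, weight) tuples."""
    items = [(w, membership_prob(mu, sigma, area_id, area_length))
             for (mu, sigma, w) in tuples]
    dist = sum_distribution(items)
    return sum(p for total, p in dist.items() if total > threshold)


# Hypothetical readings: (mean x, std of x, weight in pounds).
readings = [(5.0, 1.0, 120.0), (5.0, 1.0, 100.0), (15.0, 1.0, 150.0)]
p_area0 = prob_sum_exceeds(readings, area_id=0, area_length=10.0, threshold=200.0)
p_area1 = prob_sum_exceeds(readings, area_id=1, area_length=10.0, threshold=200.0)
```

With these made-up readings, the first two objects almost certainly lie in area 0 and together weigh more than 200, so `p_area0` is close to 1, while `p_area1` is close to 0. The exponential growth of the exact distribution in `sum_distribution` hints at why computing result distributions efficiently on a stream is hard.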