Ocean_SST_Conceptual

Difference between version 21 and version 5:

Line 1 was replaced by line 1
- !! Conceptual Description of REAP SST Usecase
+ !! Conceptual Description of REAP SST Usecase
Line 3 was replaced by line 3
- The goal of the REAP SST usecase workflow (REAP-SST-UCW) is to compare and match-up existing sensor datasets in OpenDAP archives.
+ The goal of the REAP SST usecase workflow (REAP-SST-UCW) is to compare and match-up existing remote-sensed (satellite) images of sea surface temperature found in OPeNDAP archives.
Line 8 was replaced by line 8
- ## __The datasets to use:__ For now, we assume only satellite (MODUS, HYCON, etc.) and Level 3 mapped data. "The Level 3 mapped products are global gridded data sets with all points filled even over land." [Ref: ftp://podaac.jpl.nasa.gov/pub/documents/dataset_docs/modis_sst.html]
+ ## __The datasets to use:__ For now, we assume only 2D fields, either satellite or model output (MODIS, HYCOM, etc.). Furthermore, all 2D fields in a given data set will be in the same geographic projection; such data sets are referred to as Level 3 data sets. "The Level 3 mapped products are global gridded data sets with all points filled, even over land." [Ref | http://ilrs.gsfc.nasa.gov/reports/ilrs_reports/9809_attach7a.html]
Lines 10-11 were replaced by lines 10-11
- ## __Time sampling:__ Percentage of the dataset times within the timespan.
- ## __Time span delta:__ Maximum amount of time between a reference dataset time sample and the corresponding sample of another dataset (see Step 2.4) %%(color: red) We need the ppt figure from Peter C.%%
+ ## __Time sampling:__ Percentage of the dataset times within the timespan. This parameter defines the fraction of temporal fields in the selected subset of the dataset to be sampled. Sampling will be random based on this fraction. Figure 1 is a schematic of a data set, with each gray field corresponding to a latitude, longitude array at one instant in time. The blue slices correspond to fields randomly sampled from the time series. In this case the temporal sampling parameter was approximately 20%. [{Image src='ImageStackRandomlySelected.png' caption='Figure 1. Temporal samples from a time series of SST fields.' height=400 width=400}]
+ ## __Time span delta:__ Maximum amount of time between a reference dataset time sample and the corresponding sample of another dataset (see Step 2.4) %%(color:red)Not sure what you are referring to here?%%
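
The time sampling and time span delta parameters described above can be sketched roughly as follows. This is only an illustration, not the REAP workflow code: the function names `sample_times` and `closest_within_delta` are hypothetical, time stamps are assumed to be plain numbers (for example, hours since some epoch), and the time sampling percentage is assumed to be given as a fraction between 0 and 1.

```python
import random
from bisect import bisect_left


def sample_times(times, fraction, seed=None):
    """Randomly pick about `fraction` of the time steps (hypothetical helper).

    `times` is a list of time stamps already restricted to the time span.
    The picks are returned in sorted order.
    """
    if not times:
        return []
    rng = random.Random(seed)
    # At least one sample, never more than the number of available fields.
    count = min(len(times), max(1, round(len(times) * fraction)))
    return sorted(rng.sample(times, count))


def closest_within_delta(reference_time, other_times, max_delta):
    """Find the time in sorted `other_times` closest to `reference_time`.

    Returns None when the closest time is further away than `max_delta`
    (the "time span delta"), i.e. when no match-up is possible.
    """
    if not other_times:
        return None
    i = bisect_left(other_times, reference_time)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = other_times[max(0, i - 1):i + 1]
    best = min(candidates, key=lambda t: abs(t - reference_time))
    return best if abs(best - reference_time) <= max_delta else None
```

With a fraction of 0.2 and 100 fields in the time span, `sample_times` would return 20 time stamps, in line with the roughly 20% sampling described for Figure 1. Each sampled time can then be passed to `closest_within_delta` once per comparison dataset.
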
Lines 14-46 were replaced by lines 14-20
- ## __Spatial sampling:__ Percentage of the area defined by the min/max latitudes and longitudes to sample. %%(color: red) We need the ppt figure from Peter C.%%
- ## __Spatial window delta:__ Maximum amount of overlap between a reference dataset and the corresponding region of another dataset %%(color: red) We need the ppt figure from Peter C.%%
- ## __MinNumberOfPixels:__
- ## __Sampling ?__
- ## __TileAveraging:__ A ''tile'' is a rectangular region that includes a set of pixels. A ''pixel'' is atomic an area and characterized by an SST value at a lat,long that represents the center of a pixel.
- # __Build Match-Up Datasets:__\\
- # __Analyze Match-Up Datasets:__\\
-
-
-
-
-
- 1.
-
- 2. Once the user has input these parameters, the workflow builds a set of “tiles” or “match-ups”.
- 2.1. The metadata of the datasets is retrieved. The metadata describes both when and where the datasets occur.
- 2.2. In the time span specified from Step 1.2, the workflow determines the dataset with the coarsest granularity of timestamps.
- 2.3. The workflow randomly chooses time samples of the reference dataset selected in the previous step. The samples are bounded by the time span from Step 1.2 and the number chosen is the percentage in Step 1.3.
- 2.4. For each time sample selected in the previous step, the workflow finds the closest time sample for each of the other datasets. The maximum allowable difference in time between a time sample from the coarsest dataset and any other dataset is the time span delta specified in Step 1.4.
- 2.5. In the spatial area determined from Step 1.5 and 1.6, the workflow determines the dataset with the coarsest spatial granularity.
- 2.6. The workflow randomly chooses spatial samples or “tiles” of the reference dataset of selected in the previous step, using the time samples determined in Step 2.4: these are bounded by the min. and max. latitudes and longitudes from Step 1.5 and 1.6. The spatial samples are randomly selected such that they cover the spatial percentage (specified in Step 1.7) of the reference dataset.
- 2.7. For each spatial sample selected in the previous step, the workflow determines the corresponding sample area in each of the other datasets.
- 2.8. The SST values for the spatial samples are retrieved for each dataset.
- 2.9. A description of the samples retrieved in the previous step is written to a database. For each sample, the description includes:
- 2.9.1. Latitude center
- 2.9.2. Longitude center
- 2.9.3. Descriptions of the sample for each dataset:
- 2.9.3.1. Time sample
- 2.9.3.2. Array of latitudes
- 2.9.3.3. Array of longitudes
- 2.9.3.4. SST values
- 2.9.3.5. Number of good SST values
- 2.9.3.6. Sum of SST values
+ ## __Percentage Spatial sampling:__ Percentage of the area defined by the min/max latitudes and longitudes to sample. To better understand this, we define ''tile''. A tile is a rectangular region that includes a set of pixels. A ''pixel'' is an atomic area and is characterized by an SST value at a lat,long that represents the center of a pixel. The percentage of the area sampled is determined by the number of tiles per field times the number of pixels per tile divided by the total number of pixels in the defined subregion. Spatial sampling is performed on individual 2-d fields. Figure 2 shows the selection of a temporal sample that is to be sampled spatially. Figure 3 shows the randomly selected spatial samples, tiles, with the area covered by each. [{Image src='ImageStackRandomlySelectedOne.png' caption='Figure 2. Selection of a temporal sample.' height=400 width=400}] [{Image src='RandomSpatialSamples.png' caption='Figure 3. Spatial samples in this temporal sample.' height=400 width=400}]
+ ## __MinNumberOfPixels:__ This parameter is the minimum number of pixels in each spatial sample. For example, if the minimum number of pixels is 9, the sampled area will consist of 3x3 pixels centered on the randomly selected spatial location. In the comparison of several data sets, the number of pixels per sample varies from dataset to dataset. The minimum number in the coarsest dataset defines the spatial region of each sample. This region is then used in the other data sets to define the number of pixels to sample from them. The percentage of area sampled in each data set will be approximately the same, although the number of pixels will differ.
+ ## __Spatial window delta:__ Maximum amount of overlap between a reference dataset and the corresponding region of another dataset %%(color: red) We need the ppt figure from Peter C.<<I don't think that I have a ppt for this one. As I recall, this came up in our discussion of the search procedure. We want to find all datasets that overlap a reference dataset by some amount. If the data sets just touch, then the new one is not of a lot of interest. This is the stuff that was on the whiteboard, right?%%
+ ## __Sampling:__ This defines the method to use when matching sample locations from one data set to another. We should start with 'Nearest Neighbor' and 'Bi-linear Interpolation'. %%(color:red)We might just start with 'Nearest Neighbor'. Can add more later.%%
+ ## __TileAveraging:__ The tiles from different datasets will consist of different numbers of pixels determined by the relative resolution of each data set. Comparing the tile values requires some form of averaging. This parameter defines how the averaging is to be done. For now let's fix this at a straight linear average.
+ # __Build Match-Up Datasets:__\\Once the user has input the parameters in step 1, the workflow builds a set of “tiles” or “match-ups”. The match-up building starts by using the dataset metadata to find the dataset with the coarsest granularity of timesteps. That dataset is sampled in time, and for each sample the closest timeframe in each of the other datasets is determined.\\For each timestep, the workflow finds the dataset with the coarsest spatial granularity. Then the workflow randomly chooses spatial samples or “tiles” of this reference dataset. These are bounded by the min. and max. latitudes and longitudes. The spatial samples are randomly selected such that they cover the spatial percentage of the reference dataset.\\For each spatial sample selected in the previous step, the workflow determines the corresponding sample area in each of the other datasets. The SST values for the spatial samples are retrieved for each dataset. A description of the samples retrieved is written to a database. For each sample, the description includes the latitude, longitude center and descriptions of the sample for each dataset (Timeframe, Array of latitudes, Array of longitudes, SST values, Number of good SST values, Sum of SST values).
+ # __Analyze Match-Up Datasets:__\\ The selected and saved (in an RDBMS) SST match-up dataset values can then be used in statistical comparisons. %%(color: red) To be defined in detail after the implementation of the first two steps. %%
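
The spatial parameters in the new version (percentage spatial sampling, MinNumberOfPixels, nearest-neighbor matching and tile averaging) could be prototyped along the following lines. This is a minimal sketch under several assumptions that the page does not state: a field is a list of rows of SST values, missing values are marked with NaN, tiles are square and identified by their top-left pixel rather than their center, tiles may overlap, and all function names are hypothetical.

```python
import math
import random


def tiles_needed(fraction, rows, cols, pixels_per_tile):
    """Tiles per field so that tiles * pixels_per_tile / (rows * cols)
    is approximately the requested spatial sampling fraction."""
    return max(1, round(fraction * rows * cols / pixels_per_tile))


def random_tiles(rows, cols, min_pixels, fraction, seed=None):
    """Pick random square tiles as (top_row, left_col, side) tuples.

    The side is the smallest integer whose square is >= min_pixels,
    so min_pixels=9 gives 3x3 tiles. Overlap is not prevented here.
    """
    rng = random.Random(seed)
    side = math.isqrt(min_pixels)
    if side * side < min_pixels:
        side += 1
    count = tiles_needed(fraction, rows, cols, side * side)
    return [(rng.randrange(rows - side + 1), rng.randrange(cols - side + 1), side)
            for _ in range(count)]


def nearest_index(axis, value):
    """Nearest-neighbor match: index of the grid coordinate closest to `value`."""
    return min(range(len(axis)), key=lambda i: abs(axis[i] - value))


def tile_stats(field, tile):
    """Straight linear average over the good (non-NaN) pixels of one tile."""
    top, left, side = tile
    good = [field[r][c]
            for r in range(top, top + side)
            for c in range(left, left + side)
            if not math.isnan(field[r][c])]
    total = sum(good)
    return {"good": len(good), "sum": total,
            "mean": total / len(good) if good else None}
```

For a 30x30 subregion, a 10% spatial sampling fraction and MinNumberOfPixels of 9, `tiles_needed` gives 10 tiles of 3x3 pixels. The dictionary returned by `tile_stats` mirrors the "Number of good SST values" and "Sum of SST values" entries listed for the match-up description, and `nearest_index` can be applied to the latitude and longitude axes of another dataset to locate the corresponding sample area.
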
