Ecoinformatics site parent site of Partnership for Biodiversity Informatics site parent site of REAP - Home


 

 

 



O Pe NDAP Kepler Data Model Resolution

Difference between version 2 and version 1:

At line 8 added 6 lines.
+ Comments from Dan Higgins - 9/28/2007
+ # The Ptolemy array of tokens is inefficient while the Ptolemy Matrix is designed for 2-D info (images). I suggest that we add a new multidimensional array type to Kepler. Data would be stored linearly in memory (1-D Java array) with a second array indicating dimensionality (approach used in R). Thus a 1-D array with 12 elements could be dimensioned as a 1x12, a 2x6, 3x4, 2x3x2 etc arrays and the dimensionality can changed without moving any data.
+ # Although a new multidimensional array type would be more space efficient than arrays of array tokens, any purely RAM based implementation will rapidly run into memory limitations as we try to handle bigger data sets (e.g. multiple datasets over time with time the 3rd dimension). So why not consider a disk-based option now?
+ # Why not consider a new Kepler datatype that is file-based? i.e. store the opendap data in local files (perhaps CDF or HDF files? I think there are Java tools for reading such file(?)) and use 'file reference tokens' (these don't currently exist). (Currently we do use simple strings as file references in Kepler. We should add a 'ReferenceToken' that is immutable - i.e. for files this would involve file locking, etc.)
+ # Java NIO routines offer some methods for optimizing speed of random access of large disk-based files using OS disk caches and other methods. We might want to investigate these to optimize performance of disk-based data storage.
+

Back to O Pe NDAP Kepler Data Model Resolution, or to the Page History.