2016 ESIP Winter Meeting has ended
Friday, January 8 • 9:00am - 10:30am
A Framework for Comparing Data Containers

Data containers are infrastructures that facilitate the storage, retrieval, and analysis of data sets. Big data applications in Earth Science require a mix of processing techniques, data sources, and storage formats, which are supported by different data containers. Some of the most popular data containers used in Earth Science studies are Hadoop, Spark, SciDB, AsterixDB, Rasdaman, and HDF. The goal is to develop an evaluation plan for these infrastructures to assess their suitability for Earth Science data processing needs. We have identified a selection of test cases that are relevant to most data processing exercises in Earth Science applications, and we aim to evaluate the performance of these systems against each test case. The test cases identified as part of this study are (i) data fetching, (ii) data preparation for multivariate analysis, (iii) data normalization, (iv) distance (kernel) computation, and (v) optimization.

Technologies to be discussed:
Rasdaman - P. Yang and Q. Huang
SciSpark and AsterixDB - C. Mattmann
HDF - A. Jelenak and T. Haberman
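
As a rough illustration of two of the test cases named in the abstract, data normalization and distance (kernel) computation, the following is a minimal sketch in plain Python. It is not taken from the session materials: the z-score normalization, the Gaussian (RBF) kernel, the `gamma` value, the `timed` helper, and the sample values are all assumptions chosen for illustration. An actual evaluation would run equivalent operations inside each data container and compare timings.

```python
import math
import time


def zscore_columns(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / n) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def rbf_kernel(rows, gamma=0.5):
    """Pairwise Gaussian kernel: exp(-gamma * squared Euclidean distance)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(r1, r2))) for r2 in rows]
        for r1 in rows
    ]


def timed(fn, *args):
    """Hypothetical helper: run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Made-up sample rows (two variables per observation), for illustration only.
samples = [[280.0, 1013.0], [290.0, 1009.0], [300.0, 1005.0]]

normalized, t_norm = timed(zscore_columns, samples)
kernel, t_kern = timed(rbf_kernel, normalized)

print(f"normalization: {t_norm:.6f} s, kernel: {t_kern:.6f} s")
```

Timing each step separately mirrors the idea of scoring every system per test case instead of with a single end-to-end number.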

Friday January 8, 2016 9:00am - 10:30am PST
Thurgood Southwest