The potentially valuable information buried in Big Data is of interest to many data science applications, ranging from the natural sciences to marketing research. Analyzing and digesting such heterogeneous data raises many challenges for integration and distributed analysis. Resolving them requires scalable data preparation and analysis techniques, new distributed programming paradigms, and innovative hardware and software systems that can serve a variety of applications according to their needs. These applications, which typically involve data ingestion, preparation, integration, analysis, visualization and dissemination, are referred to as Data Science Workflows.
The WorDS Center of Excellence builds upon more than a decade of experience in building workflows for computational science, data science and engineering at the intersection of distributed computing, big data analysis and reproducible science, while fostering a collaborative working culture.
WorDS offers expertise and services to support data-driven applications, data analysis projects, data scientists and software engineers in their computational practices involving process management. We provide our academic and industrial partners with expertise-oriented services spanning:
- Consulting with world-class researchers and an A-Team of developers well-versed in data science and scientific computing technologies
- Workflow management technologies that resulted in the collaborative development of the popular Kepler Scientific Workflow System
- Development of data science workflow applications through a combination of tools, technologies and best practices
- Hands-on consulting on workflow technologies for big data and cloud systems, e.g., MapReduce, Hadoop, Spark, YARN, Cascading
- Technology briefings and applied classes on end-to-end support for data science