"Our WorDS center partners, who helped develop our applications using the free and open source Kepler scientific workflow, are of tremendous help in quickly transforming our Gateway Social Science XSEDE application into a Kepler workflow system that is much easier to assemble and modify from our existing R code for rapid discoveries and teaching applications. We're looking to partner to include phylogenetic modeling in our workflows."
--- Douglas White, Ph.D.
Housed in the San Diego Supercomputer Center, at UC San Diego, the Workflows for Data Science (WorDS) Center of Excellence is a hub for the development, promotion, and delivery of workflow services for a wide range of applications. Our mission is to support data analysis projects, data scientists and software engineers in their computational practices involving process management.
Expertise and services:
- Consulting with world-class researchers and an A-Team of developers well-versed in data science and scientific computing technologies
- Workflow management technologies that resulted in the collaborative development of the popular Kepler Scientific Workflow System
- Development of data science workflow applications through a combination of tools, technologies and best practices
- Hands-on consulting on workflow technologies for big data and cloud systems, e.g., MapReduce, Hadoop, YARN, Cascading
- Technology briefings and classes on end-to-end support for data science
Areas in which WorDS researchers conduct scientific collaborations include:
- Environmental Observatories
- Computational Chemistry
The WorDS Center of Excellence builds on more than a decade of experience in building workflows for computational science, data science and engineering at the intersection of distributed computing, big data analysis and reproducible science, while fostering a collaborative working culture.
The research and development efforts the WorDS team has conducted under SDSC's Scientific Workflow Automation Technologies Laboratory since 2004 include:
- Scientific workflow management, including:
- Data and process provenance
- Distributed execution using scientific workflows
- Engineering and streaming workflows for environmental observatories
- Fault tolerance in scientific workflows
- Sensor network management and monitoring
- Role of scientific workflows in eScience infrastructures
- Understanding collaborative work in workflow-driven eScience