WorDS of Data Science beginning with R
In computer science, reusability refers to using the same code segment in different applications or scenarios. The advantages are many: it saves time and resources, reduces errors, provides consistency, and simplifies application development. Reusability is also inherent in a workflow approach such as Kepler, where it offers similar benefits to the overall scientific process.
In a scientific workflow, specific tasks are represented as atomic components. Each component can be added to any workflow requiring that particular functionality. When a component is added to a workflow, the underlying processing is effectively reused. For example, the ability to add a Database Connection component to a workflow whenever a connection to a database system is needed, without having to write any code, saves time and greatly simplifies the workflow building process. Exposing functionality through workflow components also provides a consistent way to access the diverse resources, such as code scripts, entire programs, and data stores, that are often involved in scientific processes.
Reusability is offered at many levels in a workflow approach. Components implementing specific tasks can be reused. Sub-workflows consisting of several components, and even entire workflows, can also be reused. For example, tasks such as establishing a database connection, retrieving data from the database, computing summary statistics, and scaling the data can be steps in a sub-workflow that prepares data for further analysis. This sub-workflow can be reused across multiple applications in which data retrieval and preparation are necessary pre-processing steps. Components and sub-workflows can also be shared and reused across scientific disciplines.
Reusability in workflows can greatly aid the productivity and efficiency of the scientific process. Through reusability, workflows alleviate the need to worry about implementation details, simplify the workflow building process, and thus put the focus on the scientific work.
In both workflows above, code to read data (provided by ReadTable) is reused.
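As a rough Python analogy for this kind of component reuse (not Kepler code), the sketch below lets two small workflows share one reading step. The read_table function is a hypothetical stand-in for the ReadTable component, and both workflows, the column names, and the sample data are assumptions made for illustration.

```python
import csv
import io
import statistics


def read_table(source):
    """Reusable component: parse CSV text from a file-like object into row dicts."""
    return list(csv.DictReader(source))


def summary_workflow(source, column):
    """Hypothetical workflow 1: read a table, then summarize one numeric column."""
    rows = read_table(source)
    values = [float(row[column]) for row in rows]
    return {"min": min(values), "max": max(values), "mean": statistics.mean(values)}


def filter_workflow(source, column, threshold):
    """Hypothetical workflow 2: read a table, then keep rows above a threshold."""
    rows = read_table(source)
    return [row for row in rows if float(row[column]) > threshold]


# Made-up sample data, used by both workflows.
SAMPLE = "site,temperature\nA,10.0\nB,20.0\nC,30.0\n"

summary = summary_workflow(io.StringIO(SAMPLE), "temperature")
warm_sites = filter_workflow(io.StringIO(SAMPLE), "temperature", 15.0)
```

Both workflows call read_table and neither contains any parsing logic of its own, so a fix or improvement to the reading step benefits both at once.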