Frequently Asked Questions

This page contains a list of frequently asked questions and answers about bioKepler.

General

What is bioKepler?

How is bioKepler funded?

What license does bioKepler use?

What's the relationship between bioKepler and Kepler?

What is a bioActor?

What are the bioKepler demo workflows?

What is DDP?

Who can I contact about bioActor, demo, etc., problems?

How can I contribute to bioKepler (actors, workflows, etc.)?

Downloads and Releases

How can I get bioKepler?

How can I run bioKepler on Amazon EC2?

How can I run bioKepler on OpenStack?

How can I run bioKepler on a scalable distributed computing platform, e.g., a cluster?

How can I use bioKepler with related bioinformatics tools?

Is there documentation for bioKepler?

VNC

How can I use VNC to connect to the bioKepler cloud image?

How to avoid "d" keyboard shortcut to minimize VNC window?

General

What is bioKepler?

bioKepler is a Kepler suite that facilitates rapid development and scalable distributed execution of bioinformatics workflows, while simplifying access to a wide range of bioinformatics tools that run locally or on distributed resources. bioKepler contains a set of Kepler actors, called “bioActors”, that are specialized for running bioinformatics tools at scale, along with Kepler directors for distributed data-parallel (DDP) execution on the Hadoop and Stratosphere engines. For more information, see the About page.

How is bioKepler funded?

bioKepler is funded by NSF Award DBI-1062565 under the CI Reuse and Advances in Bioinformatics programs.

What license does bioKepler use?

bioKepler is released under the BSD license.

What's the relationship between bioKepler and Kepler?

bioKepler is a suite of add-on modules that includes Kepler. The bioKepler suite adds bioActors, demo bioinformatics workflows, and the DDP framework to the Kepler suite. 

What is a bioActor?

A bioActor is a Kepler actor that executes a single application or tool; for example, the blastall bioActor runs blastall and the bowtie bioActor runs bowtie. A bioActor can execute the tool on the local machine; some bioActors can additionally execute it on distributed computational resources. For more information, see the bioKepler 1.0 User Guide.

What are the bioKepler demo workflows?

The bioKepler suite contains over 40 example workflows that demonstrate the bioKepler actors and directors. They can be viewed here.

What is DDP?

The bioKepler suite includes the Kepler Distributed Data-Parallel (DDP) framework for data-parallel execution on distributed computational resources. Many DDP patterns facilitate data-intensive applications/workflows, which can execute in parallel with partitioned data on distributed computing nodes. For more information, see the bioKepler 1.0 User Guide.

Who can I contact about bioActor, demo, etc., problems?

You can send email to one of the Kepler mailing lists, submit a bug report, or chat on the Kepler IRC channel. See the Kepler Contact Us page for more information.

How can I contribute to bioKepler (actors, workflows, etc.)?

We welcome contributions to bioKepler, such as bioActors and demo workflows. Please contact us by emailing one of the Kepler mailing lists. See the Kepler Contact Us page for more information.

Downloads and Releases

How can I get bioKepler?

See the bioKepler 1.0 Release page.

How can I run bioKepler on Amazon EC2?

See the bioKepler Amazon EC2 page. 

How can I run bioKepler on OpenStack?

We are currently building an OpenStack image for bioKepler. We will announce its completion on the Kepler mailing lists.

How can I run bioKepler on a scalable distributed computing platform, e.g., a cluster?

If a bioActor has a DDP sub-workflow, such as 'HadoopMapOnly' in the blast bioActor, it can run in parallel on a scalable distributed computing platform, such as a cluster. First check whether Hadoop or Stratosphere is already running on the platform. If it is already configured and running, set the DDP engine config parameter of the sub-workflow in the bioActor (e.g., 'HadoopConf' in 'HadoopMapOnly') to the path of its config directory, and select this sub-workflow for the 'Choice' parameter. If you have nodes available but Hadoop/Stratosphere is not running on them, you can configure and use the Hadoop/Stratosphere included in bioKepler. For instance, you can change the configuration files in '$HOME/KeplerData/workflows/module/hadoop-1.0/tools/conf/'.
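
Before editing the bundled Hadoop configuration, it can help to confirm that the directory mentioned above exists on your machine and to see which files it contains. The following is a minimal sketch, not part of bioKepler: the helper name list_conf_files is hypothetical, and the default path is simply the one quoted in this FAQ, so adjust it if your Kepler data directory is elsewhere.

```python
import os

# Path quoted in this FAQ; an assumption about where your Kepler data lives.
DEFAULT_CONF_DIR = "$HOME/KeplerData/workflows/module/hadoop-1.0/tools/conf/"


def list_conf_files(conf_dir=DEFAULT_CONF_DIR):
    """Return the sorted file names in conf_dir, or [] if it does not exist."""
    # Expand $HOME (and a leading ~, if present) into an absolute path.
    path = os.path.expanduser(os.path.expandvars(conf_dir))
    if not os.path.isdir(path):
        return []
    return sorted(
        name for name in os.listdir(path)
        if os.path.isfile(os.path.join(path, name))
    )


if __name__ == "__main__":
    files = list_conf_files()
    if files:
        print("Configuration files:", ", ".join(files))
    else:
        print("No configuration files found under", DEFAULT_CONF_DIR)
```

If the list is empty, the bundled Hadoop may be installed in a different location, or the hadoop module may not have been downloaded yet.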

How can I use bioKepler with related bioinformatics tools?

bioKepler includes many bioActors that execute bioinformatics tools. Before you can use a bioActor in a workflow, you must first download and install the bioinformatics tool, and put the binary in your $PATH. For example, you must install blastall before you can use the blastall bioActor. There are several bioActors that use applications and databases from Weizhong Li's group at UCSD; they can be downloaded here.
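
A quick way to confirm that a tool is visible on your $PATH before running a workflow is to look it up programmatically. The sketch below is only an illustration, not part of bioKepler: the helper name missing_tools is hypothetical, and blastall and bowtie are used because they are the tools named in this FAQ.

```python
import shutil


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Example tool names taken from this FAQ; replace them with the tools
    # required by the bioActors in your workflow.
    missing = missing_tools(["blastall", "bowtie"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found on PATH.")
```

Run the check from the same shell environment that launches Kepler, since a different environment may have a different $PATH.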

Is there documentation for bioKepler?

Yes, the bioKepler 1.0 User Guide describes bioActors and DDP. For more general information about Kepler, see the Kepler Documentation page for the Getting Started Guide and User Manual.

VNC

How can I use VNC to connect to the bioKepler cloud image?   

See the Configuring VNC page.

How to avoid "d" keyboard shortcut to minimize VNC window?

Open Applications\System Tools\dconf Editor and navigate to org\gnome\desktop\wm\keybindings. Change the "show-desktop" keybinding to "[]" (without the quotation marks), and then restart the service.
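
If you prefer to script the change instead of using the dconf Editor, the same key can usually be set with the GNOME gsettings command-line tool. The sketch below rests on assumptions: that gsettings is installed on the image and that it is run inside the desktop session of the VNC user. The helper name build_command is hypothetical. You still need to restart the service afterwards, as described above.

```python
import shutil
import subprocess

# Schema and key corresponding to org\gnome\desktop\wm\keybindings, "show-desktop".
SCHEMA = "org.gnome.desktop.wm.keybindings"
KEY = "show-desktop"


def build_command(value="[]"):
    """Build the gsettings command that sets the show-desktop keybinding."""
    return ["gsettings", "set", SCHEMA, KEY, value]


if __name__ == "__main__":
    if shutil.which("gsettings") is None:
        # Assumption not met: fall back to the dconf Editor steps.
        print("gsettings not found; use the dconf Editor instead.")
    else:
        result = subprocess.run(build_command(), check=False)
        if result.returncode == 0:
            print("show-desktop keybinding cleared.")
        else:
            print("gsettings failed with exit code", result.returncode)
```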