Instructions for Finding and Starting the Amazon EC2 Image

This page contains the instructions for using the bioKepler AMI on Amazon EC2. You can also watch the video for these instructions.

  1. Sign in to your AWS account or create a new one if you don’t have one.

       To sign up, go to the following link and follow the instructions.


  1. If you do not already have a key pair, you'll need to create one and download the private key. This key pair is used to make secure connections to your instances. The following link describes how to create a key pair.


  1. Create a security group to act as a firewall controlling inbound and outbound traffic. The following link describes how to create a security group.
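As an alternative to the console, the key pair and security group steps can be scripted. The following is a minimal sketch, not part of the original instructions. It assumes the AWS CLI is installed and configured, that `us-east-1` is the US East region meant on this page, and that your account uses a default VPC (otherwise the security group commands need a VPC ID and group ID). The function names and the example names passed to them are hypothetical.

```shell
# Sketch only: AWS CLI equivalents of the key pair and security group steps.
# Assumptions (not from the original page): the AWS CLI is installed and
# configured, and us-east-1 is the intended US East region.
REGION="${REGION:-us-east-1}"

# Create a key pair and save the private key in the current directory.
# $1 = key pair name, e.g. biokepler-keypair
create_key_pair() {
  aws ec2 create-key-pair --region "$REGION" --key-name "$1" \
    --query 'KeyMaterial' --output text > "$1.pem" &&
  chmod 400 "$1.pem"   # ssh refuses private keys that other users can read
}

# Create a security group that allows inbound SSH from one address range.
# $1 = group name (hypothetical), $2 = CIDR allowed to connect
create_ssh_security_group() {
  aws ec2 create-security-group --region "$REGION" \
    --group-name "$1" --description "SSH access for bioKepler" &&
  aws ec2 authorize-security-group-ingress --region "$REGION" \
    --group-name "$1" --protocol tcp --port 22 --cidr "$2"
}
```

For example, `create_key_pair biokepler-keypair` followed by `create_ssh_security_group biokepler-sg 203.0.113.7/32`, where the address range is a placeholder for your own.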


  1. Open the EC2 Management Console. Select the US East region, then select AMIs under the IMAGES section on the left side.



  1. Search AMIs: Select Public Images and search by AMI name/ID bioKepler-1.0 20130926(ami-590d5930).



  1. Instantiate and Launch Instance: Select the image “bioKepler-1.0 20130926(ami-590d5930)” and click Launch. This opens the Request Instances Wizard. Provide the details for your instance(s) according to your needs. For demo purposes, choose M1 medium as the minimum instance type and keep the default options for everything else. The wizard also guides you through configuring the instance with a key pair and a security group.
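The search and launch steps can also be expressed with the AWS CLI. This is a sketch under assumptions rather than the procedure this page documents: it assumes a configured AWS CLI, treats `us-east-1` as the US East region, and expects the key pair and security group names you pass in to exist already. It does not check whether the image or the instance type is still offered. The function names are hypothetical.

```shell
# Sketch only: AWS CLI equivalents of the search and launch steps.
REGION="${REGION:-us-east-1}"   # assumption: the US East region meant above
AMI_ID="ami-590d5930"           # image ID given in the instructions
INSTANCE_TYPE="m1.medium"       # demo instance type named in the instructions

# Confirm that the public image is visible from your account.
find_biokepler_ami() {
  aws ec2 describe-images --region "$REGION" --image-ids "$AMI_ID"
}

# Launch one instance. $1 = key pair name, $2 = security group name.
launch_biokepler() {
  aws ec2 run-instances --region "$REGION" --image-id "$AMI_ID" \
    --instance-type "$INSTANCE_TYPE" --count 1 \
    --key-name "$1" --security-groups "$2"
}
```

A typical call would be `find_biokepler_ami` followed by `launch_biokepler biokepler-keypair biokepler-sg`, with both names replaced by your own.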



  1. Connect to an Instance: Go to Instances, where you will see the recently launched instance in the running state. Right click the instance and select Connect. The window shows the IP address of the instance and a command line for connecting to it. Open an SSH client and connect using that command. When connecting, use the ubuntu user instead of the root user.



  ssh -i /Users/bioKepler/biokepler-keypair.pem
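The full command has the form `ssh -i <key file> <user>@<host>`. The sketch below prints it for the ubuntu user. The key path is the one from the example above; the host value is a placeholder for the address shown in the Connect window, and the function name `ssh_command` is hypothetical.

```shell
# Sketch only: assemble the ssh command for the ubuntu user.
KEY_FILE="${KEY_FILE:-/Users/bioKepler/biokepler-keypair.pem}"
EC2_HOST="${EC2_HOST:-your-instance-address}"   # placeholder, not a real host

# $1 = private key file, $2 = instance address
ssh_command() {
  printf 'ssh -i %s ubuntu@%s\n' "$1" "$2"
}

ssh_command "$KEY_FILE" "$EC2_HOST"
```

If ssh complains that the key file permissions are too open, `chmod 400` on the key file resolves it.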


  1. Hadoop requires passwordless SSH access to manage its nodes. Set up authorized keys for Hadoop to use when connecting to localhost over SSH.

  ssh-keygen -t rsa -P ""
  cat ~/.ssh/ >> ~/.ssh/authorized_keys
  ssh localhost

The first time you connect, you will be warned that the authenticity of the host can't be established and asked whether you really want to connect. Answer yes and press Enter. This step confirms that SSH to localhost is passwordless. Exit and proceed to the next step.
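The key setup can be made safe to run more than once. The following is a minimal sketch, assuming the key pair uses the OpenSSH default file names `id_rsa` and `id_rsa.pub` (the file name is cut off in the `cat` command above, so this is an assumption). The function name `setup_localhost_key` is hypothetical.

```shell
# Sketch only: repeatable version of the passwordless SSH setup.
# Assumption: the key pair uses the OpenSSH default names id_rsa / id_rsa.pub.
# $1 = ssh directory (defaults to ~/.ssh)
setup_localhost_key() {
  ssh_dir="${1:-$HOME/.ssh}"
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"

  # Generate a key with an empty passphrase only if none exists yet.
  if [ ! -f "$ssh_dir/id_rsa" ]; then
    ssh-keygen -q -t rsa -P "" -f "$ssh_dir/id_rsa" || return 1
  fi

  # Append the public key once; repeated runs do not duplicate the entry.
  touch "$ssh_dir/authorized_keys"
  if ! grep -qxF -e "$(cat "$ssh_dir/id_rsa.pub")" "$ssh_dir/authorized_keys"; then
    cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
  fi
  chmod 600 "$ssh_dir/authorized_keys"
}
```

Running `setup_localhost_key && ssh localhost` then performs the same check as the manual steps.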


  1. Before starting Hadoop for the first time, go to /opt/kepler/hadoop-1.0/workflows/tools and format the namenode by running:



  1. Run bioKepler Workflows from


    a) Command line:

    Go to /opt/kepler and run bioKepler workflows. For example, the commands below run the blast workflow under different execution choices.

          Local Execution

      ./ -runwf -nogui /opt/kepler/biokepler-1.0/workflows/demos/Alignment/blast.xml


          Hadoop Execution

      ./ -runwf -nogui -blastall.control HadoopMapOnly /opt/kepler/biokepler-1.0/workflows/demos/Alignment/blast.xml


          Stratosphere Execution

      ./ -runwf -nogui -blastall.control StratosphereMapOnly /opt/kepler/biokepler-1.0/workflows/demos/Alignment/blast.xml
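The three commands differ only in the optional `-blastall.control` parameter. The sketch below builds the argument list for each execution choice. The launcher script name is cut off in the commands above, so it is deliberately left out here; the function name `blast_args` is hypothetical.

```shell
# Sketch only: build the arguments for the blast demo workflow.
WORKFLOW="/opt/kepler/biokepler-1.0/workflows/demos/Alignment/blast.xml"

# $1 = optional value for -blastall.control (HadoopMapOnly or
#      StratosphereMapOnly); leave it empty for local execution.
blast_args() {
  if [ -n "${1:-}" ]; then
    printf '%s\n' "-runwf -nogui -blastall.control $1 $WORKFLOW"
  else
    printf '%s\n' "-runwf -nogui $WORKFLOW"
  fi
}

# Print the argument list for each execution choice.
for mode in "" HadoopMapOnly StratosphereMapOnly; do
  blast_args "$mode"
done
```

Append the printed arguments to the launcher script in /opt/kepler to run the workflow in the chosen mode.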


    b) GUI:

    Read how to configure VNC on the EC2 instance

    After you connect to the GUI via VNC, you should see the desktop with a Kepler icon. Double click the icon to start Kepler (see below for screenshot).


  1. Stop/Terminate Instance: Go to the Instances page and locate the instance you want to stop or terminate. Right click the instance, and then click Stop or Terminate. Terminating an instance deletes any data stored on it, and you cannot reconnect to an instance after termination. Stopping an instance shuts it down, and you can restart it later.
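The same actions are available from the AWS CLI. This is a sketch under assumptions, not part of the original instructions: it assumes a configured AWS CLI and treats `us-east-1` as the US East region. The function names are hypothetical, and the instance ID is whatever the Instances page shows for your instance.

```shell
# Sketch only: AWS CLI equivalents of Stop, Start, and Terminate.
REGION="${REGION:-us-east-1}"   # assumption: the US East region meant above

# Stop an instance; it can be started again later. $1 = instance ID
stop_instance() {
  aws ec2 stop-instances --region "$REGION" --instance-ids "$1"
}

# Restart a previously stopped instance. $1 = instance ID
start_instance() {
  aws ec2 start-instances --region "$REGION" --instance-ids "$1"
}

# Terminate an instance. This cannot be undone. $1 = instance ID
terminate_instance() {
  aws ec2 terminate-instances --region "$REGION" --instance-ids "$1"
}
```

Keeping terminate in its own function makes it harder to run the irreversible action by accident.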



  1. Create an AMI from an Instance: You can customize an instance by installing software and applications, copying data, or adding EBS volumes as needed, and then create an image from it. Stop the instance before creating an image to guarantee file system integrity. Right click the instance and select Create Image (EBS AMI).



       Fill in the requested information and select Create Image. For more details about creating images, see:


Problems? Let us know.