Distributed Data Parallel PFAM Annotation Workflow
This workflow performs functional annotation by running the HMMER 3.0 program against the PFAM database.
Protein sequences in FASTA format
-E: E-value cutoff for prediction (default value = 0.001)
PFAM database 24.0
output.1: Table of HMMER hits
output.2: Table of GO mapping
output.3: Table of EC mapping
1). Select the input file (protein sequences in FASTA format).
2). Select an appropriate E-value cutoff for HMMER.
3). Check the results to confirm that the output format is OK.
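The steps above can be sketched as a small helper that assembles a HMMER command line. This is only an illustration: it assumes the workflow calls the hmmscan program from HMMER 3.0, and the function name, the database file name Pfam-A.hmm, and the output file name hits.tbl are hypothetical placeholders, not values taken from the workflow.

```python
from typing import List


def build_hmmscan_command(
    fasta_path: str,
    hmm_db: str = "Pfam-A.hmm",   # assumed name of the PFAM HMM database file
    evalue: float = 0.001,        # default cutoff stated for this workflow
    table_out: str = "hits.tbl",  # hypothetical output file name
) -> List[str]:
    """Return an argument list for hmmscan with an E-value cutoff.

    Uses the hmmscan options -E (E-value threshold) and --tblout
    (tabular per-sequence hits); the usage is
    hmmscan [options] <hmmdb> <seqfile>.
    """
    if evalue <= 0:
        raise ValueError("E-value cutoff must be positive")
    return [
        "hmmscan",
        "-E", str(evalue),
        "--tblout", table_out,
        hmm_db,
        fasta_path,
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where HMMER is installed and the database has been prepared with hmmpress.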
Yes, it needs parallelization to reduce the run time. Splitting the input sequences into smaller pieces lets HMMER run on multiple nodes at the same time, which speeds up the annotation.
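The splitting step might look like the following minimal sketch. It assumes a simple round-robin distribution of sequences across chunks; the function names are hypothetical, and the workflow's actual partitioning scheme is not described here.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    parts: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(parts)
            header = line[1:]
            parts = []
        elif header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def split_records(records: Iterable[Record], n_chunks: int) -> List[List[Record]]:
    """Distribute records round-robin into at most n_chunks non-empty chunks."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks: List[List[Record]] = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return [chunk for chunk in chunks if chunk]


def to_fasta(records: Iterable[Record]) -> str:
    """Serialize records back to FASTA text, one sequence per line."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)
```

Each chunk can then be written to its own file and submitted as a separate HMMER job, with the per-chunk hit tables concatenated afterwards. Round-robin assignment keeps the number of sequences per chunk balanced, although it does not balance by sequence length.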