Using CAP3 on Big Red at IU
CAP3 is a sequence assembly program. For a brief introduction, see the CAP3 home page. Sample input and output files are also available. At Indiana University, CAP3 is available on Big Red.
Note: The related program PCAP is for large-scale assembly of genomic sequences with quality values and with or without forward-reverse read pairs. For more, see Using PCAP on Big Red at IU.
Request a Big Red account
Access to Big Red is provided to all IU faculty and graduate students, and faculty-sponsored undergraduates and staff. Instructional use is limited to courses that have been approved by the Director for Research Technologies. If you use Big Red, you need to know the Big Red usage policies.
To request a Big Red account, use the Account Management System (AMS); see At IU, if I already have some computing accounts, how do I get others? For more, see Getting started on Big Red.
Send in a license agreement
CAP3 is restricted-use software. If you are at IU and want to use this package, read the license agreement and email Bioinformatics Support agreeing to the license terms and requesting the use of the CAP3 package on Big Red.
Set up SoftEnv and submit jobs
- To set up SoftEnv, edit your .soft file to add the line +cap3 and then execute the resoft command.
- Read the instruction file /N/soft/linux-sles9-ppc64/CAP3-64/bin/doc.
- To see the options for CAP3, enter cap3. You will see something like the following:

    VersionDate: 04/15/05  SizeOfLong: 4
    Usage: cap3 File_of_reads [options]

    File_of_reads is a file of DNA reads in FASTA format
    If the file of reads is named 'xyz', then
    the file of quality values must be named 'xyz.qual',
    and the file of constraints named 'xyz.con'.

    Options (default values):
      -a  N  specify band expansion size N > 10 (20)
      -b  N  specify base quality cutoff for differences N > 15 (20)
      -c  N  specify base quality cutoff for clipping N > 5 (12)
      -d  N  specify max qscore sum at differences N > 20 (200)
      -e  N  specify clearance between no. of diff N > 10 (30)
      -f  N  specify max gap length in any overlap N > 1 (20)
      -g  N  specify gap penalty factor N > 0 (6)
      -h  N  specify max overhang percent length N > 2 (20)
      -m  N  specify match score factor N > 0 (2)
      -n  N  specify mismatch score factor N < 0 (-5)
      -o  N  specify overlap length cutoff > 20 (40)
      -p  N  specify overlap percent identity cutoff N > 65 (80)
      -r  N  specify reverse orientation value N >= 0 (1)
      -s  N  specify overlap similarity score cutoff N > 400 (900)
      -t  N  specify max number of word matches N > 30 (300)
      -u  N  specify min number of constraints for correction N > 0 (3)
      -v  N  specify min number of constraints for linking N > 0 (2)
      -w  N  specify file name for clipping information (none)
      -x  N  specify prefix string for output file names (cap)
      -y  N  specify clipping range N > 5 (250)
      -z  N  specify min no. of good reads at clip pos N > 0 (3)

- If your job will run for fewer than 20 minutes, call CAP3 in the directory where your input files are.

  If your program will take more than 20 minutes to run, use the serialjob script to submit a batch job. To learn how to use the serialjob command, enter man serialjob or see On Big Red at IU, how do I use the serialjob script to submit jobs? If the program will use large amounts of memory while it runs, use the -memory option for the serialjob script.
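The SoftEnv step in the list above can also be scripted. The following is a minimal sketch, not an official procedure: the helper name add_soft_key is hypothetical, and it assumes that appending the line at the end of the file is acceptable for your setup. It adds the +cap3 line only if the file does not already contain it, so running it twice is harmless.

```shell
#!/bin/sh
# Sketch: add a SoftEnv key line (for example +cap3) to a .soft file once.
# add_soft_key is a hypothetical helper name, not part of SoftEnv itself.
add_soft_key() {
    soft_file=$1
    key=$2

    # Create the file if it does not exist yet, so grep has something to read.
    if [ ! -f "$soft_file" ]; then
        : > "$soft_file"
    fi

    # If the last line has no trailing newline, add one first so the new
    # key does not get glued onto the existing last line.
    if [ -n "$(tail -c 1 "$soft_file")" ]; then
        printf '\n' >> "$soft_file"
    fi

    # -x matches whole lines only, -F treats the key as a fixed string.
    if ! grep -qxF -- "$key" "$soft_file"; then
        printf '%s\n' "$key" >> "$soft_file"
    fi
}

# Typical use on Big Red (uncomment to apply to your own account):
# add_soft_key "$HOME/.soft" "+cap3"
# resoft
```

If the position of the line within your .soft file matters for your environment, edit the file by hand instead and then run resoft.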
Output that CAP3 generates will be in the same directory as your input files.
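For short interactive runs, a small wrapper can check the input files before calling CAP3. This is a sketch under stated assumptions: the helper names list_cap3_inputs and run_cap3 are hypothetical, the path /path/to/xyz is a placeholder, and the only facts it relies on are the file naming rule and the options shown in the usage message (xyz, xyz.qual, xyz.con, and options following the reads file).

```shell
#!/bin/sh
# Sketch: report which companion files CAP3 would find for a reads file,
# following the naming rule from the usage message.
# list_cap3_inputs and run_cap3 are hypothetical helper names.
list_cap3_inputs() {
    reads=$1
    if [ ! -f "$reads" ]; then
        echo "missing reads file: $reads" >&2
        return 1
    fi
    echo "reads: $reads"
    # Quality values and constraints are looked up by file name suffix.
    for ext in qual con; do
        if [ -f "$reads.$ext" ]; then
            echo "$ext: $reads.$ext"
        else
            echo "$ext: (none)"
        fi
    done
}

run_cap3() {
    reads=$1
    shift
    list_cap3_inputs "$reads" || return 1
    if ! command -v cap3 >/dev/null 2>&1; then
        echo "cap3 not found on PATH; check your .soft file and run resoft" >&2
        return 127
    fi
    # Run from the directory that holds the input files; remaining
    # arguments are passed to cap3 unchanged.
    ( cd "$(dirname "$reads")" && cap3 "$(basename "$reads")" "$@" )
}

# Example for a short job, using the default cutoffs from the usage message:
# run_cap3 /path/to/xyz -o 40 -p 80
```

Running the command inside a subshell keeps your current directory unchanged after the job finishes. For jobs expected to exceed 20 minutes, use the serialjob script rather than this wrapper.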
Last modified on December 18, 2008.






