SRA to fastq


How to extract Illumina format files from the SRA archives:

 

1.  Download the SRA toolkit from http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi? Choose the MacOS 64-bit version. The download unpacks to a folder called sratoolkit.2.0.0rc1-mac64, which you can put anywhere on your file system (e.g., the desktop).

2.  Open a terminal window and cd to the SRA toolkit folder.

3.  Copy the SRA file(s) you want to convert to the SRA toolkit folder (not strictly necessary, but it keeps the paths simple).

4.  Type:
     ./illumina-dump --table-path ./mySRAfile --outdir yourOutputDirectory -qseq 1

Replace mySRAfile with the SRA file you want to convert and yourOutputDirectory with the directory that should receive the output.
The output directory does not need to exist before you run this command.

This will generate multiple qseq files (*_qseq.txt) in the output directory.
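If you have several SRA files, the step 4 command can be repeated in a loop. The following is only a sketch: the function name dump_all, the DUMP_CMD variable, and the "_qseq" output directory naming are made up for illustration, and the illumina-dump options are copied unchanged from step 4.

```shell
# Hypothetical wrapper: run the step 4 command once per SRA file.
# DUMP_CMD is an assumed override; by default the binary in the
# current (toolkit) folder is used, as in step 4.
dump_all() {
    dump_cmd=${DUMP_CMD:-./illumina-dump}
    for sra in "$@"; do
        name=$(basename "$sra")
        # One output directory per input file, e.g. SRR065822_qseq
        "$dump_cmd" --table-path "$sra" --outdir "${name%.*}_qseq" -qseq 1 || return 1
    done
}
```

Usage, from inside the toolkit folder: dump_all ./*.sra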

 

To convert these files into fastq:

Copy the perl script (qseq2fastq.pl) to the directory where the qseq.txt files produced by the sratoolkit are located.

To convert all the files and concatenate the output into a single file:

cat SRR065822_7_0???_qseq.txt | perl qseq2fastq.pl > s_7.fastq

Adjust the SRR065822_7_0???_qseq.txt pattern to match your own data files (each ? is a wildcard that matches any single character).
The output name s_7.fastq can be anything you like.
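If qseq2fastq.pl is not at hand, the conversion itself is simple enough to sketch with awk. This is not the perl script, only a rough stand-in. It assumes the usual 11-column, tab-separated qseq layout (machine, run, lane, tile, x, y, index, read number, sequence, quality, filter flag). The header format and the function name qseq_to_fastq are assumptions, quality strings are passed through unchanged, and reads that failed the filter are kept.

```shell
# Hypothetical stand-in for qseq2fastq.pl: qseq on stdin, FASTQ on stdout.
qseq_to_fastq() {
    awk -F'\t' '
    NF >= 11 {
        seq = $9
        gsub(/\./, "N", seq)   # qseq writes uncalled bases as "."
        # Assumed header layout: @machine_run:lane:tile:x:y#index/read
        printf "@%s_%s:%s:%s:%s:%s#%s/%s\n%s\n+\n%s\n", $1, $2, $3, $4, $5, $6, $7, $8, seq, $10
    }'
}
```

It would be used the same way as the perl script:

cat SRR065822_7_0???_qseq.txt | qseq_to_fastq > s_7.fastq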

At this point, you should have a fastq file that can be used by bowtie.
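Before handing the file to bowtie, a quick sanity check can catch a truncated or mangled conversion. The sketch below (fastq_check is a made-up name) assumes four lines per read with no wrapped sequences. It prints the number of reads, or fails if the line count is not a multiple of four or the header and separator lines do not start with @ and +.

```shell
# Hypothetical check: print the read count, or fail on a malformed file.
fastq_check() {
    awk '
    NR % 4 == 1 && substr($0, 1, 1) != "@" { bad = 1 }   # header line
    NR % 4 == 3 && substr($0, 1, 1) != "+" { bad = 1 }   # separator line
    END {
        if (bad || NR % 4 != 0) {
            print "malformed FASTQ" > "/dev/stderr"
            exit 1
        }
        print NR / 4
    }' "$1"
}
```

Usage: fastq_check s_7.fastq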