The program "get_repeats.py" helps to identify repeats from the results of mapping reads to the genome assembly. It considers regions with coverage four times the average to be repeats in the genome.
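
The coverage criterion above can be sketched as follows. This is only an illustration, not the code of get_repeats.py: the function names are hypothetical, and it assumes that the average is taken over all windows, that "four times" means strictly greater than 4 x average, and that neighbouring flagged windows are merged into one region.

```python
def find_repeat_windows(counts, fold=4.0):
    """Return the indices of windows whose read count exceeds fold * mean.

    Assumption: the mean is computed over every window in the input.
    """
    if not counts:
        return []
    mean = sum(counts) / len(counts)
    return [i for i, c in enumerate(counts) if c > fold * mean]


def merge_adjacent(indices):
    """Collapse consecutive window indices into (first, last) runs.

    Assumption: neighbouring high-coverage windows belong to one repeat.
    """
    runs = []
    for i in indices:
        if runs and i == runs[-1][1] + 1:
            # Extend the current run by one window.
            runs[-1] = (runs[-1][0], i)
        else:
            # Start a new run.
            runs.append((i, i))
    return runs


if __name__ == "__main__":
    # 18 ordinary windows followed by 2 high-coverage windows.
    counts = [10] * 18 + [200, 200]
    flagged = find_repeat_windows(counts)
    print(merge_adjacent(flagged))
```

With these made-up counts the mean is 29, so the threshold is 116 and only the last two windows are flagged.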

Prepare the input:
The input file should give the number of reads mapped to every non-overlapping 100 bp window in the genome.
Each line in the input file shows the length of a window and the number of reads mapped to it.
The windows should be listed in the order in which they appear in the genome.
An example input file is "reads_in_window".

One line in the input file looks like:
scaffold1_cov26:1-101   100     30.8
The first field is the name of the scaffold and the range of the window, the second field is the length of the window, and the third field is the number of reads mapped to the window.
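
A minimal sketch of reading one such line, assuming the fields are separated by whitespace and the window is written as name:start-end. The names Window and parse_window_line are hypothetical and are not taken from get_repeats.py. The third field is read as a float because the example value is 30.8.

```python
from typing import NamedTuple


class Window(NamedTuple):
    scaffold: str
    start: int
    end: int
    length: int
    reads: float


def parse_window_line(line):
    """Parse a line such as 'scaffold1_cov26:1-101   100     30.8'."""
    region, length, reads = line.split()
    # Split on the last ':' in case the scaffold name itself contains one.
    scaffold, _, span = region.rpartition(":")
    start, end = span.split("-")
    return Window(scaffold, int(start), int(end), int(length), float(reads))


if __name__ == "__main__":
    print(parse_window_line("scaffold1_cov26:1-101   100     30.8"))
```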

The number of reads mapped to one window can be obtained with a command like:
samtools view ../[sorted_alignments].bam scaffold1_cov26:1-101 | wc -l
Here [sorted_alignments].bam is the alignment of the reads to the reference genome, made by BWA or Bowtie and processed by samtools.
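
To collect the count for many windows, the command above could be wrapped in a small script. The sketch below is one possible way, with hypothetical function names. It uses "samtools view -c", which prints the number of matching alignments instead of piping them to "wc -l", and it assumes that samtools is on the PATH and that the BAM file is sorted and indexed, since region queries need an index.

```python
import subprocess


def count_command(bam_path, region):
    """Build the samtools command that counts alignments in one region."""
    return ["samtools", "view", "-c", bam_path, region]


def count_reads(bam_path, region):
    """Run samtools and return the number of alignments in the region.

    Assumption: samtools is installed and bam_path has an index.
    """
    result = subprocess.run(
        count_command(bam_path, region),
        capture_output=True,
        text=True,
        check=True,  # raise an error if samtools fails
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # Only print the command here; running it needs a real BAM file.
    print(" ".join(count_command("sorted_alignments.bam", "scaffold1_cov26:1-101")))
```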

Run the program:
python get_repeats.py reads_in_window

Output:
A FASTA-format file containing the sequences of the repeats identified in the genome.
