Iterative Learning for Reference-Guided DNA Sequence Assembly from Short Reads: Algorithms and Limits of Performance

The recent emergence of next-generation DNA sequencing technology has enabled the acquisition of genetic information at an unprecedented scale. To determine the genetic blueprint of an organism, sequencing platforms typically employ the shotgun sequencing strategy, oversampling the target genome with a library of relatively short overlapping reads. The order of nucleotides in each read is determined by processing the noisy signals generated by the sequencing instrument.

