Frameshift repair & Shiftability

Additional Supplementary Materials for the preprint/manuscript:

Why are frameshift homologs widespread within and across species?

Xiaolong Wang, Quanjiang Dong, Gang Chen, Jianye Zhang, Yongqiang Liu, Jinqiao Zhao, Haibo Peng, Yalei Wang, Yujia Cai, Xuxiang Wang, Chao Yang. Why are frameshift homologs widespread within and across species? bioRxiv, August 25, 2016, doi:

Dear Editors/Reviewers,

Thank you for reviewing our manuscript!

The following Additional Supplementary Materials were used in this study. We list them on this page because they cannot be submitted through the online submission system.

1.  Java/Perl programs for frameshift analysis:

2. UCSC custom tracks in BED format

Human frameshift

3. Supplementary datasets in .xlsx format (these files are smaller and better than the .xls files required by the online submission system)

Supplementary Dataset 1-Frameshift homologs

Supplementary Dataset 2-FrameshiftSimilarity

Supplementary Dataset 3-Frameshift Substitutions Scores

Supplementary Dataset 4-Codon usage and their FSSs

Supplementary Dataset 5-Codon Pair usage and their FSSs

4. Relevant Preprint:

An earlier preprint of this manuscript was posted on PeerJ PrePrints. We received many valuable comments and suggestions through email correspondence, conferences, and discussions, which greatly improved this research and the manuscript.

Wang Xiaolong, Wang Xuxiang, Chen Gang, Zhang Jianye, Liu Yongqiang, Yang Chao. (2015) The shiftability of protein coding genes: the genetic code was optimized for frameshift tolerating. PeerJ PrePrints 3:e806v1

The main points of the preprint:

(1)  A frameshifted protein sequence is always highly similar to the wild-type.

(2)  The similarity between a wild-type and a frameshifted protein sequence is predefined mainly by the genetic code.

(3) A model of reading-frame restoration is proposed for the repair of frameshifted genes.
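
To make points (1) and (2) concrete, the following minimal Python sketch translates one DNA sequence in its original reading frame and in the two shifted frames using the standard genetic code. It is an illustration only, not the Java/Perl programs used in this study; the function name `translate` and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, offset=0):
    """Translate dna starting at offset (0, 1, or 2); incomplete trailing codons are ignored."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


# Hypothetical example sequence, chosen only for illustration.
dna = "ATGGCCAAGTTTGGA"
for offset in range(3):
    print(offset, translate(dna, offset))
```

Offset 0 gives the wild-type translation; offsets 1 and 2 give the two frameshifted translations of the same DNA. How the similarity between these translations is scored is left out of this sketch.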

The difference between the preprint and the submitted manuscript:

The present submission is an update of the preprint. We were advised that the reading-frame restoration model is independent of the theory of shiftability and that the evidence is not strong enough to support it, so it was removed from this version.

In the submitted manuscript, the following points were added:

(1)    Frameshift homologs, including frameshift orthologs and frameshift paralogs, are widespread within a genome and across species.

(2)    The frameshift tolerating ability of the natural genetic code ranks among the top ~6% of all compatible genetic codes.

(3)    The genetic code is symmetric in frameshift tolerating ability.

(4)    Frameshift-tolerable codon pairs are used more frequently in genomes, and sequence-level shiftability is achieved by a biased usage of codon pairs.
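
Point (4) rests on counting how often each pair of adjacent codons occurs in coding sequences. A minimal Python sketch of such a count is given below; it is an assumption about one simple way to tally codon pairs, not the procedure used in the manuscript, and the function name `count_codon_pairs` and the example sequence are hypothetical.

```python
from collections import Counter


def count_codon_pairs(cds):
    """Count adjacent in-frame codon pairs in a coding sequence."""
    cds = cds.upper()
    # Split into complete codons; an incomplete trailing codon is dropped.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    # Pair each codon with its immediate downstream neighbor.
    return Counter(zip(codons, codons[1:]))


# Hypothetical example sequence, chosen only for illustration.
pairs = count_codon_pairs("ATGGCCAAGGCCAAG")
for (first, second), n in pairs.most_common():
    print(first, second, n)
```

Summing such counts over all coding sequences of a genome gives the codon pair usage, which can then be compared against a per-pair frameshift score.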