Anjusha Mune et al.
684 GENOME-WIDE COMPARATIVE ANALYSIS OF THE CODON USAGE PATTERN IN FLAVIVIRIDAE FAMILY
BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS
et al., 2005; Gu et al., 2003; Wang et al., 2011). A possible explanation for the weak codon bias of RNA viruses is that a weak bias helps the virus replicate efficiently in host cells (Zhong et al., 2007).
MUTATION PRESSURE AFFECTS THE CODON
USAGE PATTERN
Mutational pressure and natural selection are considered the two major factors that shape codon usage patterns (Jenkins and Holmes, 2003). A general mutational pressure, which affects the whole genome, would certainly account for the majority of the codon usage among certain RNA viruses (Tatarinova et al., 2010). To identify whether the evolution and variation of codon usage had been driven by mutation pressure alone or also by natural selection, we computed, for each genus, Pearson's correlation between the overall nucleotide composition (A, U, C, G) and the nucleotide composition at the third codon position (A3, U3, C3, G3), and between nucleotide composition (A, U, C, G, A3, U3, C3, G3) and GC, GC3 and ENC [supplementary material (Tables 2-3)].
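The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names `composition` and `pearson_r` and the four toy sequences are hypothetical, and GC3 is computed here over all codons, whereas some tools exclude Met, Trp and stop codons.

```python
from math import sqrt

BASES = "AUCG"


def composition(cds):
    """Percent composition overall (A, U, C, G, GC) and at third
    codon positions (A3, U3, C3, G3, GC3) of one coding sequence."""
    seq = cds.upper().replace("T", "U")
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("expected a non-empty sequence of whole codons")
    third = seq[2::3]  # every third base, starting at codon position 3
    comp = {}
    for base in BASES:
        comp[base] = 100.0 * seq.count(base) / len(seq)
        comp[base + "3"] = 100.0 * third.count(base) / len(third)
    comp["GC"] = comp["G"] + comp["C"]
    comp["GC3"] = comp["G3"] + comp["C3"]
    return comp


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples of size >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Made-up coding sequences standing in for the genomes of one genus.
toy_genus = [
    "AUGGCGGCCGGGUAA",
    "AUGAAAUUUAUAUAA",
    "AUGGCAGAUCCGUAA",
    "AUGGGGCGCGCCUGA",
]
profiles = [composition(s) for s in toy_genus]
r = pearson_r([p["GC"] for p in profiles], [p["G3"] for p in profiles])
print(f"r(GC, G3) = {r:.2f}")
```

Repeating the last step for every pair of variables would give one correlation table per genus, of the kind the supplementary tables are described as containing.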
In genus Flavivirus, GC and GC3 show significant positive correlation with G (GC: r = 0.87, P < 0.01; GC3: r = 0.80, P < 0.01), C (GC: r = 0.75, P < 0.01; GC3: r = 0.80, P < 0.01), G3 (GC: r = 0.86, P < 0.01; GC3: r = 0.85, P < 0.01) and C3 (GC: r = 0.75, P < 0.01; GC3: r = 0.82, P < 0.01), and negative correlation with A (GC: r = -0.84, P < 0.01; GC3: r = -0.85, P < 0.01), U (GC: r = -0.70, P < 0.01; GC3: r = -0.65, P < 0.01), A3 (GC: r = -0.76, P < 0.01; GC3: r = -0.81, P < 0.01) and U3 (GC: r = -0.69, P < 0.01; GC3: r = -0.68, P < 0.01). ENC shows significant positive correlation with C (r = 0.73, P < 0.01) and C3 (r = 0.70, P < 0.01), negative correlation with A (r = -0.69, P < 0.01) and A3 (r = -0.65, P < 0.01), and non-significant correlation with U, G and U3.
A shows positive correlation with A3, negative correlation with C3 and G3, and non-significant correlation with U3. U shows significantly negative correlation with C3 and G3, positive correlation with U3, and non-significant correlation with A3. G and C show significantly negative correlation with A3 and U3 and significantly positive correlation with C3 and G3. When the correlations are examined by vector, the tick-borne and NKV viruses show significant correlation in comparison with the mosquito-borne viruses of the genus. In genus Pestivirus, an interesting and complex correlation was observed.
To sum up, GC, GC3 and ENC have highly significant positive correlation with C (GC: r = 0.84, P < 0.01; GC3: r = 0.86, P < 0.01; ENC: r = 0.87), C3 (GC: r = 0.90, P < 0.01; GC3: r = 0.94, P < 0.01; ENC: r = 0.89), G (GC: r = 0.94, P < 0.01; GC3: r = 0.91, P < 0.01; ENC: r = 0.74) and G3 (GC: r = 0.98, P < 0.01; GC3: r = 0.97, P < 0.01; ENC: r = 0.82), and significantly negative correlation with A (GC: r = -0.99, P < 0.01; GC3: r = -0.99, P < 0.01; ENC: r = -0.92), A3 (GC: r = -0.97, P < 0.01; GC3: r = -0.99, P < 0.01; ENC: r = -0.90), U (GC: r = -0.89, P < 0.01; GC3: r = -0.84, P < 0.01) and U3 (GC: r = -0.82, P < 0.01; GC3: r = -0.80, P < 0.01; ENC: r = -0.66, P < 0.05). A3 and U3 show significantly positive correlation with A and U and significantly negative correlation with C and G, whereas C3 and G3 show significantly negative correlation with A and U and significantly positive correlation with C and G.
In genus Hepacivirus, A3 shows positive correlation with A and no significant correlation with U, C, G, GC or GC3. Similarly, A shows no significant correlation with U3, C3, G3, GC, GC3 or ENC. GC and GC3 show significantly negative correlation with U (GC: r = -0.94, P < 0.01; GC3: r = -0.97, P < 0.01) and U3 (GC: r = -0.92, P < 0.01; GC3: r = -0.96, P < 0.01), and highly positive correlation with G (GC: r = 0.95, P < 0.01; GC3: r = 0.94, P < 0.01), G3 (GC: r = 0.96, P < 0.01; GC3: r = 0.97, P < 0.01), C (GC: r = 0.98, P < 0.01; GC3: r = 0.95, P < 0.01) and C3 (GC: r = 0.96, P < 0.01; GC3: r = 0.99, P < 0.01). ENC of Hepacivirus shows no significant correlation with A, U, C, G, U3 or G3.
In genus Pegivirus, GC and GC3 show significantly positive correlation with G (GC: r = 0.83, P < 0.01; GC3: r = 0.83, P < 0.01), C (GC: r = 0.84, P < 0.01; GC3: r = 0.71, P < 0.05) and C3 (GC: r = 0.82, P < 0.01; GC3: r = 0.84, P < 0.01), and significantly negative correlation with U (GC: r = -0.77, P < 0.05; GC3: r = -0.91, P < 0.01) and A3 (GC: r = -0.89, P < 0.01; GC3: r = -0.73, P < 0.05). ENC has highly significant correlation with A, U, C, G, A3 and C3, and no significant correlation with U3 and G3. A shows significant correlation with A3 but not with U3, C3 or G3. U shows significant correlation with U3, C3 and G3, and non-significant correlation with A3. C shows significant correlation with A3 and C3, and non-significant correlation with U3 and G3. G shows significant correlation with G3, and non-significant correlation with A3, U3 and C3.
The members of the unclassified group do not show significant correlation between different nucleotides; they show significant positive correlation only between a nucleotide and the same nucleotide at the third position, e.g. A with A3. GC and GC3 show positive correlation with C (GC: r = 0.73, P < 0.01; GC3: r = 0.62, P < 0.05) and G (GC: r = 0.81, P < 0.01; GC3: r = 0.67, P < 0.01), and negative correlation with A (GC: r = -0.75, P < 0.01; GC3: r = -0.69, P < 0.01) and U (GC: r = -0.78, P < 0.01; GC3: r = -0.59, P < 0.05). ENC does not show significant correlation with any nucleotide. This analysis collectively indicates that mutational pressure is most likely responsible for the patterns of nucleotide composition and, therefore, for the codon usage patterns in all four genera of the Flaviviridae family.
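The P values in this section accompany Pearson coefficients, but the text does not say which significance test was used. As a hedged illustration only, the sketch below assumes the conventional t-test for a correlation coefficient, t = r·sqrt(n - 2)/sqrt(1 - r²) on n - 2 degrees of freedom. The function name `t_statistic` is hypothetical, and taking n = 69 (the Flavivirus count quoted in the next section) as the sample size is likewise an assumption.

```python
from math import sqrt


def t_statistic(r, n):
    """t statistic for testing the null hypothesis of zero correlation,
    given Pearson's r computed from n paired observations."""
    if n < 3:
        raise ValueError("need at least three observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


# Assumed illustration: r = 0.87 (GC versus G in Flavivirus), with n = 69
# taken as the sample size; the source does not state n for this test.
n = 69
t = t_statistic(0.87, n)
print(f"t = {t:.2f} on {n - 2} degrees of freedom")
```

Under these assumptions t comes out well above 10, far beyond the usual two-tailed thresholds at 67 degrees of freedom, which is consistent with the reported P < 0.01. For the small genera the same r yields a much smaller t, so sample size matters when comparing significance across genera.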
VARIATION OF RELATIVE SYNONYMOUS
CODON USAGES IN FLAVIVIRIDAE FAMILY
In order to investigate the extent of codon usage bias in the Flaviviridae family, all RSCU values of the different codons in genus Flavivirus (69), Hepacivirus (14), Pegivirus (8), Pestivirus (11) and the unclassified members (12) were calculated. The heat map [supplementary material Fig. 1]