PhD student Turmagambetova A., prof. Ivashchenko A.

al-Farabi Kazakh National University, Almaty, Kazakhstan

HYDROPATHY OF EXONS, INTRONS AND SPLICING SITES IN HUMAN ONCOGENES

 

Computational methods make it possible to determine and predict many biochemical and molecular-genetic properties of DNA nucleotide sequences. Interest in the exon-intron organization of genes has grown as data on new properties and functions of introns have accumulated /1/. Introns have been shown to contain sites that regulate gene expression, and they increase the diversity of gene expression products through alternative splicing. Aberrant splicing frequently results in pathologies, which makes it of considerable practical interest. For example, it is important to establish the hydropathy of splice sites and to predict alternative splicing for genes responsible for the development of gastrointestinal tract cancer.

Methods

Nucleotide sequences of the genes were extracted from GenBank (http://www.ncbi.nlm.nih.gov/). Prediction of alternative splicing is based on determining the hydropathy of exons and introns. To characterize the biochemical properties of the pre-mRNA of genes responsible for gastrointestinal tract cancer, we define a general hydropathy profile as described below. A hydropathy coefficient HC(N) is associated with each base N. The coefficients, taken from /2/, are HC(A) = -1.07, HC(C) = -0.76, HC(G) = -1.36 and HC(U) = -0.76. We compute an average hydropathy value for each exon and intron as follows. For each base, its number of occurrences in the exon or intron is multiplied by its hydropathy coefficient. Summing over all the bases yields the average hydropathy value.
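The computation described above can be sketched in Python as follows. This is only an illustration: the function name average_hydropathy is hypothetical, the conversion of T to U is added because GenBank records are written in the DNA alphabet, and the division by sequence length is our reading of the word "average", not a step stated explicitly in the text.

```python
from collections import Counter

# Hydropathy coefficients as quoted in the text from /2/ (RNA alphabet).
HC = {"A": -1.07, "C": -0.76, "G": -1.36, "U": -0.76}


def average_hydropathy(seq):
    """Return the mean hydropathy coefficient of an exon or intron sequence.

    Assumption: "average" means the count-weighted sum of coefficients
    divided by the sequence length.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    if not rna:
        raise ValueError("empty sequence")
    counts = Counter(rna)
    unknown = set(counts) - set(HC)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    # Multiply each base count by its coefficient and sum over the bases.
    total = sum(counts[base] * HC[base] for base in HC)
    return total / len(rna)
```

Calling average_hydropathy on each exon and intron of a gene, in order, gives the kind of values plotted per exon and intron in Figure 1, under the assumptions stated above.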

To build hydropathy profiles of the splice sites of different intron types, we used the nucleotides in positions -8 to +30 at the 5’ splice sites and in positions -35 to +8 at the 3’ splice sites. For each intron set and each position, an average hydropathy value is computed: a hydropathy coefficient is associated with each base, and each coefficient is multiplied by the occurrence of its base in the set at the given position.
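A minimal sketch of this position-wise averaging is given below. It assumes that the splice-site windows of one intron set have already been cut out and aligned so that all have the same length; the function name splice_site_profile is hypothetical, and "occurrence" is interpreted here as the frequency of each base at a position.

```python
# Hydropathy coefficients as quoted in the text from /2/ (RNA alphabet).
HC = {"A": -1.07, "C": -0.76, "G": -1.36, "U": -0.76}


def splice_site_profile(windows):
    """Average hydropathy at each position of aligned splice-site windows.

    Assumption: every window covers the same positions relative to the
    splice site, so column i of every window is the same position.
    """
    if not windows:
        raise ValueError("no windows given")
    rna = [w.upper().replace("T", "U") for w in windows]
    length = len(rna[0])
    if any(len(w) != length for w in rna):
        raise ValueError("windows must have equal length")
    profile = []
    for column in zip(*rna):  # one column per position
        values = [HC[base] for base in column]
        profile.append(sum(values) / len(values))
    return profile
```

For example, splice_site_profile(["AG", "CG"]) averages A and C at the first position and returns the coefficient of G at the second.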

Results

We have determined that the hydropathy of exons and introns depends on their nucleotide composition. Exons are usually more hydrophilic than introns. Thus, introns mostly form the core of the pre-mRNA, while exons are mainly located on the surface of the molecule. Several genes responsible for the development of gastrointestinal tract cancer illustrate the difference between exon and intron hydropathy, both for genes without alternative splicing and for genes with several alternative products. In genes without alternative splicing, the hydropathy of exons is lower than that of introns: Figure 1 shows that exons have negative hydropathy and introns positive hydropathy. In genes with alternative splicing, the relationship between exon and intron hydropathy varies along the gene, and there are sections where the hydropathy of exons and introns differs insignificantly (Figure 1). On the basis of these data, it can be assumed that regions where the hydropathy of exons and introns differs insignificantly are alternatively spliced.
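The assumption in the preceding paragraph can be turned into a simple screening rule, sketched below. The text gives no numerical criterion for an "insignificant" difference, so the threshold parameter, its default value and the function name candidate_alternative_exons are all hypothetical. The requirement that an exon be close to every adjacent intron, rather than to at least one, is likewise an assumption of this sketch.

```python
def candidate_alternative_exons(exons, introns, threshold=0.02):
    """Return 1-based numbers of exons whose hydropathy is close to
    that of all adjacent introns.

    exons   -- average hydropathy of exon 1..n, in gene order
    introns -- average hydropathy of intron 1..n-1, in gene order
    threshold -- hypothetical cut-off; not taken from the text
    """
    if len(introns) != len(exons) - 1:
        raise ValueError("expected one intron fewer than exons")
    flagged = []
    for i, exon_value in enumerate(exons):
        neighbours = []
        if i > 0:
            neighbours.append(introns[i - 1])  # upstream intron
        if i < len(introns):
            neighbours.append(introns[i])      # downstream intron
        if neighbours and all(
            abs(exon_value - n) < threshold for n in neighbours
        ):
            flagged.append(i + 1)
    return flagged
```

With made-up values exons = [-1.00, -0.95, -1.05] and introns = [-0.94, -0.96], only exon 2 lies within 0.02 of both neighbouring introns and is flagged.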

The protein encoded by the CD44 gene is a cell-surface glycoprotein involved in cell-cell interactions, cell adhesion and migration. It is a receptor for hyaluronic acid and can also interact with other ligands, such as osteopontin, collagens, and matrix metalloproteinases. This protein participates in a wide variety of cellular functions, including lymphocyte activation, recirculation and homing, hematopoiesis, and tumor metastasis. Alternative splicing of the CD44 gene gives rise to 17 named isoforms, the majority of which may be related to tumor metastasis. In some cases alternative splicing results in an alternative translation frame or in premature termination of translation /3-5/. In the pre-mRNA of the CD44 gene, exons 3, 4, 5 and 17 are alternatively spliced (Figure 1). The hydropathy of exons 6, 7, 8, 9, 10, 11, 13 and 14 differs distinctly from that of the adjacent introns. These exons encode a strongly hydrophilic extracellular domain. The choice of these splice sites is made by special positive or negative trans-acting factors.

 

Figure 1. Hydropathy profiles of exons and introns of genes without alternative splicing (MCC, IL1B) and of alternatively spliced genes (CD44, WWOX); x-axis – exons (–♦–, numbered) and introns (–▲–), y-axis – base hydropathy coefficients.

Negative regulators would prevent the inclusion of variant exon sequences and thus give rise to the expression of the standard form of CD44 (CD44s) in most tissues. Alternatively, the choice of a specific splice isoform could be achieved by induced positive regulators that select the corresponding variant exon combination. In tumor cells, altered splicing patterns would then be caused by the up-regulation of such positive factors.

The WWOX gene encodes a protein that contains 2 WW domains and a short-chain dehydrogenase/reductase domain (SRD). WW domain-containing proteins are found in all eukaryotes and play an important role in the regulation of a wide variety of cellular functions such as protein degradation, transcription, and RNA splicing. The highest normal expression of this gene is detected in hormonally regulated tissues such as testis, ovary, and prostate. This expression pattern and the presence of an SRD domain suggest a role for this gene in steroid metabolism. The encoded protein is more than 90% identical to the mouse protein, which is an essential mediator of tumor necrosis factor-alpha-induced apoptosis, suggesting a similar, important role in apoptosis for the human protein. In addition, there is evidence that this gene behaves as a suppressor of tumor growth /6, 7/. Figure 1 shows that only the 3’ end of the pre-mRNA (exons 6 and 7) is alternatively spliced. According to previous work by others, all alternative variants of the protein differ only at the 3' end. This study suggests that sites of alternative splicing can be predicted by the given method. The method is insufficient if the gene has special elements that regulate alternative splicing.

Introns are more hydrophobic, contact each other and form the core of the pre-mRNA molecule. The mostly hydrophilic exons are located predominantly on the surface of the molecule, as are the 5' and 3' splice sites, which facilitates the interaction and ligation of exons. Hence, it can be assumed that the hydropathy profiles of the 5' and 3' splice sites should be similar. We determined the hydropathy profiles of all splice sites of a gene, using the nucleotides in positions -8 to +8 at the 5’ splice sites and in positions +8 to -8 at the 3' splice sites. The average hydropathy value over all splice sites of a gene is computed to build the hydropathy profile of its splice sites. The difference between the hydropathy profiles of the 5' and 3' splice sites is then shown for several genes responsible for the development of gastrointestinal tract cancer. We point out that the 3' end of the intron is always more hydrophobic than the other parts of the 5' and 3' splice sites.
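Cutting out the 16-nucleotide windows around the two splice sites of one intron might look like the sketch below. The coordinate convention (0-based, half-open intron coordinates within the pre-mRNA sequence), the function name splice_site_windows and the flank parameter are assumptions made for illustration. Both windows are returned in 5' to 3' order, so the 3' window would need to be reversed if the "+8 to -8" ordering is required.

```python
def splice_site_windows(seq, intron_start, intron_end, flank=8):
    """Return the (5' site, 3' site) windows around one intron.

    seq          -- pre-mRNA sequence containing the intron
    intron_start -- 0-based index of the first intron nucleotide
    intron_end   -- 0-based index one past the last intron nucleotide
    flank        -- nucleotides taken on each side of the boundary

    The 5' window spans the last `flank` exon bases and the first
    `flank` intron bases; the 3' window spans the last `flank` intron
    bases and the first `flank` bases of the next exon.
    """
    if intron_start - flank < 0 or intron_end + flank > len(seq):
        raise ValueError("window extends beyond the sequence")
    if intron_end - intron_start < flank:
        raise ValueError("intron shorter than the flank")
    five_prime = seq[intron_start - flank:intron_start + flank]
    three_prime = seq[intron_end - flank:intron_end + flank]
    return five_prime, three_prime
```

The windows collected from all introns of a gene can then be averaged position by position to obtain the 5' and 3' splice-site profiles that are compared in the text.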

In addition to introns with conserved sequences at the 5' and 3' splice sites, there is a small group of introns with different terminal dinucleotides: the AT-AC introns. These are U12 snRNP-dependent introns, which are excised by spliceosomes containing the U4atac, U6atac, U5, U11 and U12 snRNPs. This is a minor class of introns: AT-AC introns are two orders of magnitude less numerous than conventional introns. The molecular mechanism of splicing is based on complementary base pairing and hydropathy interactions between the nucleotides of the pre-mRNA and snRNA. These interactions play a significant role in splicing catalysis. To characterize the biochemical properties of the splice sites of genes responsible for gastrointestinal tract cancer, we defined a general hydropathy profile as described in Methods. There are at least two classes of splice sites in the human genome, which differ in their nucleotide consensus. Each class has two types of splice sites: GT-AG and GC-AG for U2 snRNA, and GT-AG and AT-AC for U12 snRNA.

The spliceosome apparatus is highly conserved, so conserved regions must exist near splice sites to provide signals for splice site recognition. Because the motifs of U2- and U12-type introns are degenerate, it is clear that interactions between the spliceosome and the pre-mRNA cannot always be based simply on complementary base pairing. The participation of more than 200 proteins in splicing supports this idea. We suggest that recognition of RNA by the spliceosome is based on hydropathy interactions, which mediate contacts between pre-mRNA and snRNA as well as between pre-mRNA and snRNP.

Figure 2 depicts the hydropathy profiles of the splice sites of three genes responsible for the development of gastrointestinal tract cancer (CD44, KRAS and DLEC1). We find that splice sites can differ substantially within one gene and can even belong to different intron types.

           

Figure 2. Hydropathy profiles of the splice sites of U2- and U12-dependent introns for the CD44, DLEC1 and KRAS genes; x-axis – nucleotide positions, y-axis – hydropathy values, а.с. – alternative variant.

 

As seen in Figure 2, for both the CD44 gene and the KRAS gene, the preferred 3' splice site is retained in the isoforms that encode a normally functioning protein.

The character of splicing is controlled by proteins that interact with the pre-mRNA at the ends of introns or at exon-intron boundaries. Such proteins can block the excision of some introns while activating the excision of others. The pre-mRNA forms a tertiary structure in order to interact with other components of the spliceosome. Splicing-regulating proteins could prevent the ligation of one pair of exons and effectively promote the ligation of another. Directed regulation of splicing pathways could be used in molecular medicine.

References:

1. Nott A., Meislin S.H., Moore M.J. (2003) A quantitative analysis of intron effect on mammalian gene expression. RNA, 9, 607-617.

2. Guckian K.M., Schweitzer B.A., Ren R.X.-F., Sheils C.J., Tahmassebi D.C., Kool E.T. (2000) Factors contributing to aromatic stacking in water: evaluation in the context of DNA. J. Am. Chem. Soc., 122, 2213-2222.

3. Chen J., Zhan W., He Y., Peng J., Wang J., Cai Sh., Ma J. (2004) Expression of heparanase gene, CD44v6, MMP-7 and nm23 protein and their relationship with the invasion and metastasis of gastric carcinomas. World J Gastroenterol, 15, 776-782.

4. Liu Y., Yan P., Li J., Jia J. (2005) Expression and significance of CD44s, CD44v6, and nm23 mRNA in human cancer. World J Gastroenterol, 14, 6601-6606.

5. Cheng Ch., Sharp Ph. (2006) Regulation of CD44 alternative splicing by SRm160 and its potential role in tumor cell invasion. Molecular and Cellular Biology, 26, 362-370.

6. Aqeilan R., Kuroki T., Pekarsky Y., Albagha O., Trapasso F., Baffa R., Huebner K., Edmonds P., Croce C. (2004) Loss of WWOX Expression in Gastric Carcinoma. Clinical Cancer Research, 10, 3053-3058.

7. Ishii H., Mimori K., Inageta T., Murakumo Y., Vecchione A., Mori M., Furukawa Y. (2005) Components of DNA Damage Checkpoint Pathway Regulate UV Exposure–Dependent Alterations of Gene Expression of FHIT and WWOX at Chromosome Fragile Sites. Molecular Cancer Research, 3, 130-138.