GAG-POL Polyprotein

The HIV pol gene encodes the viral enzymes protease, reverse transcriptase, and integrase. These enzymes are produced as part of a Gag-Pol precursor polyprotein, which is processed by the viral protease.






Protease - p15
Reverse Transcriptase - p51
Reverse Transcriptase and RNase H - p66
RNase H - p15
Integrase - p31



ViralZone: HIV-1
PDB: none available
SwissProt: P04585 (HIV-1 HXB2 Pol)
Chime Tutorial: not available
Los Alamos HIV structure DB: not available
EMBL: K03455 [EMBL/GenBank/DDBJ]
BioAfrica: Pol Protein Data Mining Tool

Isoforms:

  • GAG-POL Polyprotein (1003 amino acids)
  • p160 - GAG-POL Polyprotein

Cleavage site:

Localization:

  • Virion
  • Cell cytoplasm

Function:

  • precursor for viral enzymes
  • during viral maturation, the viral protease cleaves the Pol polypeptide away from Gag and further digests it into 4 proteins: protease, reverse transcriptase, RNase H, and integrase

Additional Information:

  • All of the pol gene products can be found in the capsid of the virion
  • the Gag-Pol precursor is generated by a -1 ribosomal frameshift near the 3' end of the gag gene (Ref. #4 & #5)
  • the ribosomal frameshift is triggered by a specific cis-acting RNA motif (Ref. #4 & #5)
  • the cis-acting RNA motif consists of a heptanucleotide sequence followed by a short stem-loop in the distal region of the gag RNA
  • without interrupting translation, the ribosome shifts into the pol reading frame about 5% of the time it encounters the cis-acting RNA motif
  • this frameshift frequency is consistent with the roughly 20:1 ratio of Gag to Gag-Pol precursors
  • protease cleavage is not fully efficient: about 50% of the reverse transcriptase protein remains covalently linked to RNase H
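
The frameshift described in the list above can be illustrated with a minimal Python sketch. It is not taken from this page: the short RNA segment, the heptanucleotide UUUUUUA, and the helper names are assumptions chosen for illustration, and the codon table covers only the codons that occur in the example. The sketch reads the segment in the gag frame, then re-reads it with the ribosome stepping back one nucleotide after the heptanucleotide, and finally converts a frameshift frequency into a Gag to Gag-Pol ratio.

```python
# Partial codon table: only the codons needed for this short example.
CODONS = {
    "AAU": "N", "UUU": "F", "UUA": "L", "GGG": "G", "AAG": "K",
    "AUC": "I", "UGG": "W", "CCU": "P", "UCC": "S", "AGG": "R",
    "GAA": "E", "GAU": "D", "CUG": "L", "GCC": "A", "UUC": "F",
}

# Illustrative segment (assumption): starts in the gag frame and
# contains the heptanucleotide at positions 2-8.
RNA = "AAUUUUUUAGGGAAGAUCUGGCCUUCC"


def translate(rna, start=0):
    """Translate complete codons from `start`; a trailing partial codon is ignored."""
    return "".join(CODONS[rna[i:i + 3]] for i in range(start, len(rna) - 2, 3))


def translate_minus_one(rna, slippery="UUUUUUA"):
    """Read in frame through the heptanucleotide, then step back one nucleotide.

    Assumes `rna` starts in frame and the heptanucleotide ends on a codon boundary.
    """
    site = rna.find(slippery)
    if site < 0:
        raise ValueError("heptanucleotide not found")
    end = site + len(slippery)  # index just past the heptanucleotide
    return translate(rna[:end]) + translate(rna, start=end - 1)


def gag_per_gagpol(frameshift_frequency):
    """Number of Gag chains made per Gag-Pol chain for a given frameshift frequency."""
    return (1.0 - frameshift_frequency) / frameshift_frequency


print(translate(RNA))            # gag frame
print(translate_minus_one(RNA))  # gag frame, then pol frame after the slip
print(gag_per_gagpol(0.05))      # about 19 Gag per Gag-Pol, i.e. roughly 20:1
```

Under these assumptions the shifted reading continues as REDLAF, which matches residues 3 to 8 of the HXB2 Pol reference sequence listed on this page.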

Genomic Location:

Reference Sequences:

HIV-1 (HXB2):

        10         20         30         40         50         60         70
         |          |          |          |          |          |          |
FFREDLAFLQ GKAREFSSEQ TRANSPTRRE LQVWGRDNNS PSEAGADRQG TVSFNFPQVT LWQRPLVTIK
        80         90        100        110        120        130        140
         |          |          |          |          |          |          |
IGGQLKEALL DTGADDTVLE EMSLPGRWKP KMIGGIGGFI KVRQYDQILI EICGHKAIGT VLVGPTPVNI
       150        160        170        180        190        200        210
         |          |          |          |          |          |          |
IGRNLLTQIG CTLNFPISPI ETVPVKLKPG MDGPKVKQWP LTEEKIKALV EICTEMEKEG KISKIGPENP
       220        230        240        250        260        270        280
         |          |          |          |          |          |          |
YNTPVFAIKK KDSTKWRKLV DFRELNKRTQ DFWEVQLGIP HPAGLKKKKS VTVLDVGDAY FSVPLDEDFR
       290        300        310        320        330        340        350
         |          |          |          |          |          |          |
KYTAFTIPSI NNETPGIRYQ YNVLPQGWKG SPAIFQSSMT KILEPFRKQN PDIVIYQYMD DLYVGSDLEI
       360        370        380        390        400        410        420
         |          |          |          |          |          |          |
GQHRTKIEEL RQHLLRWGLT TPDKKHQKEP PFLWMGYELH PDKWTVQPIV LPEKDSWTVN DIQKLVGKLN
       430        440        450        460        470        480        490
         |          |          |          |          |          |          |
WASQIYPGIK VRQLCKLLRG TKALTEVIPL TEEAELELAE NREILKEPVH GVYYDPSKDL IAEIQKQGQG
       500        510        520        530        540        550        560
         |          |          |          |          |          |          |
QWTYQIYQEP FKNLKTGKYA RMRGAHTNDV KQLTEAVQKI TTESIVIWGK TPKFKLPIQK ETWETWWTEY
       570        580        590        600        610        620        630
         |          |          |          |          |          |          |
WQATWIPEWE FVNTPPLVKL WYQLEKEPIV GAETFYVDGA ANRETKLGKA GYVTNRGRQK VVTLTDTTNQ
       640        650        660        670        680        690        700
         |          |          |          |          |          |          |
KTELQAIYLA LQDSGLEVNI VTDSQYALGI IQAQPDQSES ELVNQIIEQL IKKEKVYLAW VPAHKGIGGN
       710        720        730        740        750        760        770
         |          |          |          |          |          |          |
EQVDKLVSAG IRKVLFLDGI DKAQDEHEKY HSNWRAMASD FNLPPVVAKE IVASCDKCQL KGEAMHGQVD
       780        790        800        810        820        830        840
         |          |          |          |          |          |          |
CSPGIWQLDC THLEGKVILV AVHVASGYIE AEVIPAETGQ ETAYFLLKLA GRWPVKTIHT DNGSNFTGAT
       850        860        870        880        890        900        910
         |          |          |          |          |          |          |
VRAACWWAGI KQEFGIPYNP QSQGVVESMN KELKKIIGQV RDQAEHLKTA VQMAVFIHNF KRKGGIGGYS
       920        930        940        950        960        970        980
         |          |          |          |          |          |          |
AGERIVDIIA TDIQTKELQK QITKIQNFRV YYRDSRNPLW KGPAKLLWKG EGAVVIQDNS DIKVVPRRKA
       990       1000
         |          |
KIIRDYGKQM AGDDCVASRQ DED

Length: 1003 amino acids
Molecular Weight: 113779 Da
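
The length and molecular weight above can be recomputed from the numbered listing. The following is a minimal Python sketch, not the method used by this resource: the helper names are hypothetical, the mass table holds standard average residue masses, and one water molecule is added per chain. Only the last two rows of the listing are embedded as a demonstration; the result for the full sequence may differ slightly from the listed value depending on the mass table used.

```python
import re

# Average residue masses in daltons (standard values for the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the free N- and C-termini


def parse_numbered_block(text):
    """Keep upper-case residue letters; drop ruler digits, tick marks and whitespace."""
    return "".join(re.findall(r"[A-Z]", text))


def average_mass(seq):
    """Average molecular mass of one unmodified polypeptide chain, in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


# Last two rows of the listing, copied as an excerpt.
tail = """
920 930 940 950 960 970 980
| | | | | | |
AGERIVDIIA TDIQTKELQK QITKIQNFRV YYRDSRNPLW KGPAKLLWKG EGAVVIQDNS DIKVVPRRKA
990 1000
| |
KIIRDYGKQM AGDDCVASRQ DED
"""

excerpt = parse_numbered_block(tail)
print(len(excerpt))  # 93 residues: positions 911 to 1003
print(round(average_mass(excerpt), 1))
```

Passing the complete listing to parse_numbered_block should give a string of 1003 residues if the listing is intact.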


Protein Domains/Folds/Motifs:

p15 - Protease (99 amino acids)
p51 - Reverse Transcriptase (440 amino acids)
p66 - RT and RNase H (560 amino acids)
p15 - RNase H (120 amino acids)
p31 - Integrase (288 amino acids)
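
The product sizes above can be checked against the 1003-residue precursor with simple arithmetic. The Python sketch below assumes that p66 is p51 plus the RNase H domain and that protease, p66 and integrase are contiguous and non-overlapping; the page does not state either point explicitly, and the variable names are illustrative.

```python
# Lengths in amino acids, as listed on this page.
TOTAL_LENGTH = 1003
PRODUCTS = {
    "protease": 99,
    "p51": 440,
    "rnase_h": 120,
    "p66": 560,
    "integrase": 288,
}

# Assumption: p66 = p51 + RNase H domain.
p66_from_parts = PRODUCTS["p51"] + PRODUCTS["rnase_h"]

# Assumption: protease, p66 and integrase do not overlap.
assigned = PRODUCTS["protease"] + PRODUCTS["p66"] + PRODUCTS["integrase"]
unassigned = TOTAL_LENGTH - assigned

print(p66_from_parts == PRODUCTS["p66"])  # True
print(assigned)                           # 947
print(unassigned)                         # 56
```

Under these assumptions 56 residues are not covered by any listed product. In the HXB2 reference sequence on this page the stretch PQVTLWQRPL begins at residue 57, which would be consistent with those 56 residues lying upstream of protease.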


Secondary Structure prediction:

Low Complexity Region - seg:


Antigenic Sites - EMBOSS:

Predicted Motifs:

N-glycosylation:
N-myristoylation:
Amidation:
Protein kinase C:
Casein kinase II:
Tyrosine kinase:
cAMP / cGMP kinase:
Cell attachment motif:
Asp Protease motif:
Asp Prot Retro motif:
Cysteine-rich Region:
Tryptophan-rich Region:
Zinc-finger CCHC motif:
Leucine Zipper motif:

Protein-Protein Interactions:



Primary and Secondary Database Entries:

Identifiers:

ViralZone: HIV-1
PDB/MMDB: Search for HIV-1 & POL

SwissProt: P04585 (HIV-1 HXB2 Pol)
EMBL: K03455; AAB50259.1 [EMBL/GenBank/DDBJ]

PIR: UNKNOWN
HIV: K03455; POL$HXB2
MEROPS: A02.001
InterPro: IPR000477 - RNA-directed DNA polymerase (RT) family / IPR001037 - Integrase C-terminal family
IPR001584 - Integrase catalytic domain / IPR001969 - Eukaryotic/viral aspartic protease active site
IPR001995 - Retroviral Aspartic Protease family/ IPR002156 - RNase H domain
IPR003308 - Integrase N-terminal zinc-binding domain / IPR009007 - Acid Protease domain
Pfam: PF00078 - RVT / PF00665 - RVE / PF00077 - RVP / PF00075 - RNase H / PF00552 - Integrase
PF02022 - Integrase Zinc-binding
Prints: none
ProDom: PD186096 (residues 13 - 72) / PD000261 (residues 156 - 217) / PD580497 (residues 184 - 227) /
PD492067 (residues 218 - 260) / PD404869 (residues 218 - 285) / PD000379 (residues 261 - 303) /
PD513590 (residues 276 - 316) / PD474846 (residues 294 - 389) / PD000698 (residues 390 - 451) /
PD495523 (residues 462 - 593) / PD390352 (residues 589 - 705) / PD000727 (residues 594 - 661) /
PD416714 (residues 664 - 704) / PD582846 (residues 675 - 712) / PD685225 (residues 676 - 705) /
PD502558 (residues 699 - 758) / PD000915 (residues 716 - 770) / PD000348 (residues 771 - 926) /
PD000723 (residues 934 - 985) / PD371748 (residues 940 - 981)
SCOP: SSF56672 - DNA/RNA polymerase / SSF50630 - Acid protease / SSF53098 - RNase H-like protein
SSF46919 - Integrase N-terminal Zn-binding domain / SSF50122 - Integrase C-terminal DNA-binding domain
BLOCKS: P04585
Prosite: P04585
ProtoNet: P04585
ProtoMap: P04585
PRESAGE: P04585
Database of Interacting Proteins: P04585
ModBase: P04585
Swiss-2DPAGE: 2D gel

BioAfrica Tools:
- Pol Protein Data Mining Tool provides real-time analysis of HIV-1 Pol isolates
- HIV Structure BLAST searches for similar HIV sequences that have known structures
- HIV Proteomics Resource contains protein sequence and structure analysis tools

Reviews and References:

Cite the resource by citing the following paper:
Doherty R et al. BioAfrica's HIV-1 Proteomics Resource: Combining protein data with bioinformatics tools. Retrovirology 2: 18 (2005)

1 - HIV Sequence Compendium 2000
Kuiken CL, Foley B, Hahn B, Korber B, Marx PA, McCutchan F, Mellors JW, Mullins JI, Sodroski J, Wolinsky S.
Theoretical Biol. & Biophys. Group, Los Alamos Nat Lab, LA-UR 01-3860 [Read it online: Compendium]
2 - Retroviruses
Coffin JM, Hughes SH, Varmus HE.
CD-ROM ed. (2002) Cold Spring Harbor Laboratory Press [Read it online: NCBI Bookshelf]
3 - Molecular Characteristics of HIV-1 Subtype C Viruses from KwaZulu-Natal, South Africa:
Implications for Vaccine and Antiretroviral Control Strategies.
Gordon M, De Oliveira T, Bishop K, Coovadia HM, Madurai L, Engelbrecht S, Janse van Rensburg E, Mosam A, Smith A, Cassol S.
Journal of Virology 77(4): 2587-2599 (2003) [pubmed: 12551997]
4 - Characterization of ribosomal frameshifting in HIV-1 Gag-Pol expression.
Jacks T, Power MD, Masiarz FR, Luciw PA, Barr PJ, Varmus HE.
Nature 331: 280-283 (1988) [pubmed: 2447506]
5 - Human immunodeficiency virus type 1 gag-pol frameshifting is dependent on
mRNA secondary structure: Demonstration by expression in vivo.
Parkin NT, Chamorro M, Varmus HE.
J Virol 66: 5147-5151 (1992) [pubmed: 1321294]



Page last updated by Tulio de Oliveira.