PSI Structural Biology Knowledgebase




Families in Gene Neighborhoods

SBKB [doi:10.1038/sbkb.2015.20]
Technical Highlight - June 2015
Short description: A bioinformatics strategy takes advantage of the proximal organization of genes encoding proteins involved in metabolic pathways to predict protein function.

Sequence similarity networks for the proline racemase superfamily, displayed for genes with 35% sequence identity. The identified clusters are color coded. Figure from reference 1.

As sequencing data accumulate, effective approaches are needed to decipher functions of the enzymes encoded within those genomes. For organisms such as eubacteria and archaea, genes encoding enzymes and other proteins involved in the same metabolic pathway often cluster together in operons. Taking advantage of the localization in such clusters or gene neighborhoods, the groups of Jacobson, Gerlt and Almo (PSI NYSGRC) developed a new bioinformatics approach to predict in vitro activities of the encoded proteins as well as their metabolic functions in cells.

Using this strategy, termed genome neighborhood networks (GNNs), they analyzed 2,333 unique sequences encoding proteins in the proline racemase superfamily. The authors constructed a sequence similarity network in which the clustering threshold can be set to correspond to a chosen level of sequence identity; in this study, 35% and 60% cutoffs were used. Querying all sequences simultaneously amplifies the signal from genes for functionally related proteins; importantly, genes for unrelated proteins that occur within these neighborhoods in only some species are filtered out as noise. For this reason, the authors suggest that this large-scale, aggregate approach is more efficient than single-genome analyses for identifying proteins involved in metabolic pathways. The GNN approach predicted functions for >85% of the proteins, which the authors verified using in vitro enzyme activity measurements, phenotypic assays, transcriptomics and X-ray crystallography.
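To make the two ideas in the paragraph above concrete, here is a minimal Python sketch. It is not the authors' software. The sequence labels, pairwise identity values, neighbor family names and the min_fraction parameter are all hypothetical, chosen only for illustration. The sketch assumes that a cluster is a connected component of the graph whose edges join sequences at or above the identity cutoff, and that "noise" means a neighboring family seen in too small a fraction of a cluster's members.

```python
from collections import Counter, defaultdict


def ssn_clusters(identities, threshold):
    """Cluster sequences as connected components of the similarity graph.

    identities maps (seq_a, seq_b) pairs to percent identity; an edge is
    drawn only when the identity is >= threshold.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), ident in identities.items():
        root_a, root_b = find(a), find(b)  # registers both sequences
        if ident >= threshold:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for seq in parent:
        clusters[find(seq)].add(seq)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


def neighborhood_profile(cluster, neighbors, min_fraction):
    """Count neighbor families across a cluster and drop rare ones.

    neighbors maps each sequence to the protein families encoded near it
    in its genome. Families found next to fewer than min_fraction of the
    cluster members are treated as noise (an assumption of this sketch).
    """
    counts = Counter()
    for seq in cluster:
        counts.update(set(neighbors.get(seq, ())))
    cutoff = min_fraction * len(cluster)
    return {fam: n for fam, n in counts.items() if n >= cutoff}


# Hypothetical toy data: five sequences with pairwise percent identities.
identities = {
    ("A", "B"): 72, ("A", "C"): 41, ("B", "C"): 38,
    ("C", "D"): 30, ("D", "E"): 65, ("A", "D"): 22,
}
# Hypothetical families encoded in each sequence's genome neighborhood.
neighbors = {
    "A": ["dehydrogenase", "transporter"],
    "B": ["dehydrogenase", "transporter", "transposase"],
    "C": ["dehydrogenase"],
    "D": ["epimerase"],
    "E": ["epimerase", "kinase"],
}

for cutoff in (35, 60):
    print(cutoff, ssn_clusters(identities, cutoff))

for cluster in ssn_clusters(identities, 35):
    print(cluster, neighborhood_profile(cluster, neighbors, min_fraction=0.6))
```

In this toy example, raising the cutoff from 35% to 60% splits the three-member cluster, mirroring how a stricter threshold separates a superfamily into finer groups. Aggregating neighbors over the whole cluster keeps the families shared by most members and drops those seen next to a single member.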

For more complex superfamilies, information from multiple sources will need to be integrated. For example, when bacterial genes are located in polycistronic transcriptional units, that information can be combined to identify pathways and predict enzyme function.

Irene Kaganman


  1. S. Zhao et al. Prediction and characterization of enzymatic activities guided by sequence similarity and genome neighborhood networks.
    eLife. 3 (2014). doi:10.7554/eLife.03275

Structural Biology Knowledgebase ISSN: 1758-1338
Funded by a grant from the National Institute of General Medical Sciences of the National Institutes of Health