NCGR :: Nat'l Center for Genome Resources

Our Work - Collaborative Projects and Scientific Software

The success of NCGR depends on collaborative research at the intersection of bioscience, computing and mathematics. Today at NCGR, our scientists and partners study the influence of genetic variability of both host and pathogen on infectious disease progression. NCGR's software engineers develop scientific software solutions to support and enable those studies. A range of federally and state-funded programs supports our projects in Human Health, Infectious Disease, Legume Crop Improvement, and Food Security.

Alpheus Software System

In 2006, NCGR commenced development of the Alpheus™ (N. Miller, Project Lead) web-based software system for analysis of data in massively parallel resequencing projects. Specifically, Alpheus™ was designed for resequencing-based case-control association studies to identify the genetic basis of complex diseases and traits. Alpheus™ provides massively parallel sequence pipelining, visualization, analysis, and project management capabilities.

Alpheus™ provides dynamic queries and visualization of read data, variant data and results via an intuitive user interface. Alpheus™ reports sSNPs, nsSNPs, indels, premature stop codons, and splice isoforms. Read coverage statistics are reported by gene or transcript together with a visualization module based upon an individual transcript or genomic segment.
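The variant categories named above (sSNPs, nsSNPs and premature stop codons) can be illustrated with a minimal sketch. This is not Alpheus™ code: the function name `classify_snp` and its arguments are hypothetical, and the sketch assumes the standard genetic code, a single-base substitution within one in-frame codon, and no handling of indels or splice isoforms.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(ref_codon: str, offset: int, alt_base: str) -> str:
    """Classify a single-base substitution within one in-frame codon.

    Hypothetical helper for illustration only. Returns "sSNP" (synonymous),
    "nsSNP" (non-synonymous) or "premature stop".
    """
    ref_codon = ref_codon.upper()
    alt_base = alt_base.upper()
    if ref_codon not in CODON_TABLE:
        raise ValueError(f"not a valid DNA codon: {ref_codon!r}")
    if offset not in (0, 1, 2):
        raise ValueError("offset must be 0, 1 or 2")
    if alt_base not in BASES:
        raise ValueError(f"not a valid DNA base: {alt_base!r}")

    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    if alt_codon == ref_codon:
        raise ValueError("alternate base equals the reference base")

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "sSNP"
    if alt_aa == "*":
        return "premature stop"
    return "nsSNP"


print(classify_snp("CTT", 2, "C"))  # CTT (Leu) -> CTC (Leu): sSNP
print(classify_snp("GAA", 1, "T"))  # GAA (Glu) -> GTA (Val): nsSNP
print(classify_snp("GAA", 0, "T"))  # GAA (Glu) -> TAA (stop): premature stop
```

A production pipeline would also need transcript coordinates, strand handling and read-level evidence, none of which is modeled here.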

Alpheus™ is ideal for data from all current DNA sequencing platforms, including:

  • Sanger
  • Roche-454
  • Illumina-Solexa
  • ABI SOLiD

It handles datasets of hundreds of gigabases and supports identification of nucleotide variants and splice isoforms.

Alpheus™ provides data management services, an analysis pipeline, and internet-accessible software for variant discovery and analysis of ultra-high-throughput next-generation sequence data with minimal human manipulation. Alpheus™ is available on a software-as-a-service basis to academic and industry clients. Upon provision of sequence data and reference database coordinates, NCGR provides clients with a secure, custom web interface in which to analyze aligned reads and discover differences between samples. Alpheus™ is also available for local installation.
Contact Faye Schilkey (fds@ncgr.org) for details or pricing information.
 
Human Health/Infectious Disease
 
Schizophrenia Genome Project

The Schizophrenia Genome Project (SGP; S. Kingsmore, PI) was established in 2007 to identify the genetic basis of schizophrenia. The SGP is a collaboration with Dr. Nora Bizzozero at the University of New Mexico (UNM) and Dr. Gary Schroth at Illumina Inc.

To date, the SGP has sequenced the transcriptomes of 20 case and control samples, generating more than 15 billion nucleotides of sequence. Most of the samples analyzed to date have been from cerebellar cortex, an affected tissue in schizophrenia. Investigators are performing case-control comparisons to identify non-synonymous nucleotide variants that are associated with schizophrenia. The National Institute of Mental Health has generously provided thousands of archived samples for validation studies.

GEYSIR/deCODE Genetics

This NIAID-funded Population Genetics project (J. Gulcher, PI, deCODE Genetics) is aimed at discovering host genes involved in immune response and adverse effects to vaccination. Specifically, the teams at deCODE, NCGR and the University of New Mexico Health Sciences Center (UNM-HSC) will collaborate to study four different populations having

  1. adverse effect to smallpox vaccination,
  2. clinical tuberculosis infection versus seroconversion,
  3. serious influenza infections, and
  4. one or more severe infections associated with encapsulated bacteria such as S. pneumoniae, H. influenzae, and N. meningitidis.

Using the Icelandic genealogy database, deCODE is identifying extended families affected in each category and carrying out genome-wide linkage and case-control association studies to map host genes. Following identification of host genes that confer substantial risk for infection or vaccine response, the UNM-HSC team will functionally validate them by testing protein and mRNA expression differences in monocyte and dendritic cells of patients with infection susceptibility versus controls, with or without in vitro pathogen exposure. NCGR (B. Beavis, PI; S. Baxter, PM) is using its expertise in creating informatics systems and analyzing large datasets to create and update a discovery platform of linkage analysis and validation results called GEYSIR.

Diagnostics for severe sepsis and community-acquired pneumonia (CAPSOD)

This NIAID-funded program, titled "CAPSOD", is a public-private, multidisciplinary collaboration involving investigators at ten organizations: NCGR; Duke University Medical Center, Durham, NC; Henry Ford Hospital, Detroit, MI; Durham Veterans Administration Medical Center, Durham, NC; Eli Lilly and Co., Indianapolis, IN; Monarch Life Sciences, Indianapolis, IN; Pfizer, Inc., Groton, CT; Metabolon, Inc., Durham, NC; Roche Diagnostics Corp., Indianapolis, IN; and ProSanos Corp., La Jolla, CA. CAPSOD is a five-year program that will prospectively enroll patients with sepsis and community-acquired pneumonia (CAP) at Duke University Medical Center and Henry Ford Hospital. The study will use advanced bioinformatic and proteomic technologies to identify specific protein changes, or biomarkers, in patient blood samples that predict outcome in sepsis and CAP. Biomarker-based tests would allow patients to be selected for appropriate disposition, such as the intensive care unit, and for intensive medical therapies, thereby reducing mortality and making resource allocation more effective. See the full CAPSOD description at ClinicalTrials.gov.

New Mexico IDeA Networks of Biomedical Research Excellence (NM-INBRE)

The NIH/NCRR-funded NM-INBRE program is a collaboration among a number of New Mexico institutions, including New Mexico State University (NMSU), the University of New Mexico (UNM), Eastern New Mexico University (ENMU), New Mexico Institute of Mining and Technology (NMT), New Mexico Highlands University (NMHU), and NCGR. INBRE aims to strengthen biomedical research in New Mexico's institutions of higher education and to prepare faculty and students for participation in the research programs of the National Institutes of Health. NCGR provides bioinformatics training and research, develops customized bioinformatic tools, hosts and maintains the NM-INBRE website to support collaboration, and hosts an annual Bioinformatics Symposium. NCGR's work also includes an outreach program for students at other 4-year undergraduate institutions and at tribal and community colleges in the state, with the aim of increasing matriculation in graduate biomedical research programs.

Tuberculosis Archive Project

Dr. Damian Gessler at NCGR was the lead author of a proposal in the journal Science [Gessler, Dye, Farmer, Murray, Navin, Reves, Shinnick, Small, Yates & Simpson (2006) Science 311: 1245-1246] to establish the nation's first integrated clinical and biological information resource for tuberculosis (TB). In coordination with the universal genotyping program of the Centers for Disease Control and Prevention, the Tuberculosis Archive would contain a sample from every verified case of TB in the United States, together with comprehensive clinical, epidemiological, and genomic information. At present, these samples and information are dispersed among independent organizations, geographic locations, and databases. This lack of integration greatly hampers rapid response to outbreaks and analysis of the spread of drug-resistant TB. Currently, the authors of the Science article, who represent an international team of nine leading health and research organizations, are working to make the Tuberculosis Archive a reality.

Plant Biology/Nutrition
 
Legume Information System (LIS)

The Legume Information System (LIS) is the result of a cooperative research agreement between NCGR (G. May, PI) and the USDA Agricultural Research Service (ARS) as part of the Model Plant Initiative (MPI). The LIS project provides a publicly accessible legume resource that integrates genetic and molecular data from multiple legume species and enables genomic, transcript and map cross-species comparisons.

Virtual Plant Information Network (VPIN)

The VPIN is a collaborative technical project funded by NSF (D. Gessler, PI). Together with technical teams at TIGR and Cold Spring Harbor Laboratory, NCGR is building a technology framework for a virtual plant network aimed at integrating data and services provided by independently evolving information resources. Participating plant information resources include DragonDB, Gramene, IRIS, IWIS, LIS, TAIR, and TIGR. The technical foundation of the VPIN is based on evolving semantic web and web services technologies, first developed as part of the previous NSF-funded MOBY projects. The ongoing development of the VPIN platform is run as an open source project. The code is currently hosted at the Open Bioinformatics Foundation, where you can view the CVS code repository for the Semantic MOBY/VPIN project.

Phytophthora Studies

Oomycetes, or water molds, are among the most important eukaryotic plant pathogens. Annually, they cause hundreds of billions of dollars in damage to agricultural and ecological systems worldwide, impacting the productivity and sustainability of food crops, ornamentals, forest products, and seafood. Several oomycete species pose sufficient threats to the safety of the nation’s food supply to merit inclusion on the Animal and Plant Health Inspection Service agricultural bioterrorism list or the USDA regulated plant pest list. There is an urgent need to identify the genetic determinants of virulence and host range in order to develop improved control methods.

Phytophthora capsici Genome Project

P. capsici is a pathogen that is not indigenous to the US. It was first reported in the US in 1922 on chili peppers in New Mexico and spread to vegetable production areas in Colorado and Florida in the 1930s and 1940s, affecting tomatoes, eggplants, squash, and melons.

NCGR (S. Kingsmore, PI), along with biologists at University of Tennessee and Ohio State University, is funded by USDA/NSF to sequence the P. capsici genome.  The rationale for these studies is:

  1. P. capsici is a devastating pathogen of vegetable crops of national economic importance;
  2. P. capsici is an excellent genetic model. This project will create broadly applicable resources for gene models and population genetic studies of oomycete biology and hemibiotroph-induced disease;
  3. 454 sequencing technology will be evaluated and benchmarked for de novo sequencing and resequencing in the largest genome studied with this technology to date (65 Mb).

In collaboration with, and with support from, DOE's JGI sequencing group, the aims are to use novel 454 Life Sciences sequencing technology to generate:

  1. a 20X draft genome sequence of the vegetable pathogen Phytophthora capsici,
  2. 2X-coverage resequencing of 4 outbred isolates, and
  3. a catalog of single nucleotide variation.

These resources will be disseminated at the Phytophthora Functional Genomic Database (PFGD).
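The coverage targets in the aims above translate into raw sequencing volume by simple arithmetic: mean coverage is total sequenced bases divided by genome size. The sketch below works this through for the 65 Mb genome. The genome size, coverage depths and isolate count come from the text; the mean read length of 250 bases is purely an assumption for illustration, and the function names are hypothetical.

```python
import math

GENOME_SIZE_BP = 65_000_000        # 65 Mb, as stated in the project rationale
ASSUMED_MEAN_READ_LENGTH_BP = 250  # assumption for illustration, not from the source


def bases_required(genome_size_bp: int, coverage: int) -> int:
    """Total bases needed so that mean coverage = total bases / genome size."""
    return genome_size_bp * coverage


def reads_required(genome_size_bp: int, coverage: int, mean_read_length_bp: int) -> int:
    """Number of reads of the given mean length needed to reach the coverage."""
    return math.ceil(bases_required(genome_size_bp, coverage) / mean_read_length_bp)


draft_bases = bases_required(GENOME_SIZE_BP, 20)      # 20X draft genome
reseq_bases = 4 * bases_required(GENOME_SIZE_BP, 2)   # 2X in each of 4 isolates
draft_reads = reads_required(GENOME_SIZE_BP, 20, ASSUMED_MEAN_READ_LENGTH_BP)

print(f"20X draft: {draft_bases:,} bases")                     # 1,300,000,000
print(f"2X x 4 isolates: {reseq_bases:,} bases")               # 520,000,000
print(f"20X draft at 250 bp/read: {draft_reads:,} reads")      # 5,200,000
```

These figures are mean values only; actual coverage varies along the genome, so real projects typically plan for additional depth.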

Phytophthora Functional Genomic Database (PFGD)

PFGD is a web-based, clade-oriented information resource that builds upon data formerly available from the Phytophthora Genome Consortium (PGC) and the Oomycete Genomics Database, as well as all publicly available P. infestans transcript data. PFGD is funded by NSF (S. Kamoun, Ohio State University, PI). Oomycete sequence data are analyzed and automatically annotated using NCGR's XGI system. PFGD includes functional assays and gene expression data, combined with transcript and genomic analysis and annotation. PFGD also integrates the P. sojae and P. ramorum genomes and their annotations for comparative analysis. In addition, host species data (available at solgd.org) are integrated at PFGD. Going forward, P. capsici sequence data and variant analysis will also be available at PFGD.

Software Tools and Active Software Development Projects

NCGR's programs have produced a number of software tools that are freely available to the scientific community, either for use over the internet or as software downloads. Some software packages are released under open source licenses; others are freely available to non-profit organizations under a licensing agreement. For additional information on NCGR's software tools, contact us at info@ncgr.org.

Virtual Plant Information Network (VPIN)

The technical foundation of the VPIN (M. Montoya, PM) is based on evolving semantic web and web services technologies, first developed as part of the previous NSF-funded MOBY projects. The ongoing development of the VPIN platform is run as an open source project. The code is currently hosted at the Open Bioinformatics Foundation.

The X Genome Initiative (XGI)

The X Genome Initiative (K. Gajendran, PM) is NCGR's high-throughput, species-independent computational sequence analysis pipeline and database software system. XGI uses a variety of algorithms for sequence pattern recognition, comparison and annotation of genomic, EST or ORF sequence data types. XGI is the annotation and database engine behind both PFGD and the Legume Information System. The system is being modified to incorporate assembly algorithms for 454 Life Sciences and Sanger reads, in addition to handling sequence variant detection. XGI jobs are queued across NCGR's Linux cluster and results are stored in a relational database. XGI operates on batch files and can be configured to perform any series of sequence similarity or motif searching operations based on user preference or sequence type. Pipeline analyses can include BLAST analyses, InterProScan algorithms and a variety of tools for gene prediction and assembly. Automated post-analysis annotation links best-match annotations to Gene Ontology entries and annotations.
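The idea of configuring a series of analyses by sequence type can be sketched as follows. This is only an illustration of the general pattern, not XGI's actual design or API: every name here (`STEPS`, `PIPELINES`, `run_pipeline`) is hypothetical, and the steps are placeholders that return a label rather than invoking BLAST, InterProScan or a gene predictor.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Placeholder analysis steps. A real pipeline would launch external tools
# and store results in a database; these stand-ins only return a label.
def similarity_search(sequence: str) -> str:
    return f"similarity_search:{len(sequence)}"


def motif_search(sequence: str) -> str:
    return f"motif_search:{len(sequence)}"


def gene_prediction(sequence: str) -> str:
    return f"gene_prediction:{len(sequence)}"


STEPS: Dict[str, Callable[[str], str]] = {
    "similarity_search": similarity_search,
    "motif_search": motif_search,
    "gene_prediction": gene_prediction,
}

# Hypothetical configuration: an ordered series of steps per sequence type.
PIPELINES: Dict[str, List[str]] = {
    "genomic": ["gene_prediction", "similarity_search"],
    "EST": ["similarity_search"],
    "ORF": ["similarity_search", "motif_search"],
}


def run_pipeline(
    batch: Iterable[Tuple[str, str, str]],
    pipelines: Dict[str, List[str]] = PIPELINES,
) -> Dict[str, List[str]]:
    """Run the configured steps, in order, for each (id, type, sequence) record."""
    results: Dict[str, List[str]] = {}
    for seq_id, seq_type, sequence in batch:
        if seq_type not in pipelines:
            raise ValueError(f"no pipeline configured for sequence type {seq_type!r}")
        results[seq_id] = [STEPS[name](sequence) for name in pipelines[seq_type]]
    return results


batch = [("est1", "EST", "ATGGCC"), ("orf1", "ORF", "MKV")]
print(run_pipeline(batch))
```

Keeping the configuration as data separate from the step implementations is one way to let the series of operations change per user or per sequence type without touching the code that runs them.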

Comparative Map and Trait Viewer (CMTV)

The Comparative Map and Trait Viewer (A. Farmer, PM) is a graphical client for integrating various types of genomic data from different sources, including annotated sequences, genetic maps and QTL data. The tool allows comparison of maps using a variety of different algorithms. The results of these comparisons are used to integrate data from multiple maps into a common framework. As a component of the ISYS integration platform, it provides a structural/comparative perspective on data that may be simultaneously viewed in relation to functional classification systems such as GO or biochemical interaction networks. Our collaborators at CIMMYT have used the tool to construct drought tolerance consensus maps for Zea mays based on the results of multiple trait mapping experiments under different genetic backgrounds and environmental conditions. The Legume Information Network has used the tool via Java WebStart to provide a client that aggregates genomic data provided via semantic web services and explores synteny relationships among legume species.

The earliest versions of CMTV were developed in collaboration with four CGIAR centers (CIAT, CIMMYT, CIP and IRRI). The current funding for the project comes from a USAID Linkage Grant with CIMMYT. CMTV source code is available from SourceForge.

Genomic Explorer y Survey of Immune Response (GEYSIR)

GEYSIR (Faye Schilkey, PM) is an interactive, web-based genomic visualization tool developed as part of the NIAID/deCODE population genetics project. Web-based tools for exploring genomic data are typically statically rendered HTML pages, which lack live interactivity. Given the volume of genomic data and the range of scales involved, this lack of dynamic interaction is not only cumbersome for the user but often inadequate for scientific discovery. To address this issue in the context of population genetics studies, GEYSIR was developed to enable exploration of genomic data across a wide range of scales, from single nucleotide polymorphisms to gene neighborhoods to marker sets and association data spanning all chromosomes. GEYSIR is designed to be a highly interactive, dynamic, and responsive web application. Additionally, it was designed from the outset for extensibility and reusability so that the code base and architecture can be reused for a wide variety of genomic data, organisms, research, and data models.

Integrated Software Systems (ISYS)

ISYS (A. Farmer, PM) is a dynamic, flexible platform for the integration of bioinformatics software tools and databases. ISYS offers a component-based architecture that enables scientists to "plug and play" among tools of interest. These tools may be separately developed and independently evolving. In addition, ISYS allows web-based resources to be integrated with programs running on the scientist's desktop.

ISYS's DynamicDiscovery™ technology creates an exploratory environment in which scientists can navigate freely among registered components. DynamicDiscovery™ helps to guide the user by suggesting appropriate registered components to process selected data objects. In addition, ISYS supports visual synchronization among components, which helps each one to complement the others. ISYS is written in Java for platform independence and is supported on Windows and Solaris. A distribution without a bundled Java Virtual Machine is also available for Linux and other types of UNIX. The ISYS Platform code has been released under an Open Source license and is available for download from SourceForge.
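The idea of suggesting registered components for a selected data object can be sketched with a simple type-based registry. This is an illustration of the general pattern only and makes no claim about how DynamicDiscovery™ is implemented; ISYS itself is written in Java, and the class and component names below (`ComponentRegistry`, `SequenceData`, `GeneticMapData`, and so on) are hypothetical.

```python
from typing import Dict, List, Tuple, Type


class SequenceData:
    """Hypothetical data object: a biological sequence."""

    def __init__(self, residues: str) -> None:
        self.residues = residues


class GeneticMapData:
    """Hypothetical data object: an ordered list of map markers."""

    def __init__(self, markers: List[str]) -> None:
        self.markers = markers


class ComponentRegistry:
    """Tracks which data types each registered component can process."""

    def __init__(self) -> None:
        self._accepted: Dict[str, Tuple[Type, ...]] = {}

    def register(self, name: str, *accepted_types: Type) -> None:
        if not accepted_types:
            raise ValueError("a component must accept at least one data type")
        self._accepted[name] = accepted_types

    def suggest(self, data_object: object) -> List[str]:
        """Return the names of components able to process the selected object."""
        return sorted(
            name
            for name, types in self._accepted.items()
            if isinstance(data_object, types)
        )


registry = ComponentRegistry()
registry.register("similarity search", SequenceData)
registry.register("map viewer", GeneticMapData)
registry.register("annotation browser", SequenceData, GeneticMapData)

print(registry.suggest(SequenceData("ATG")))     # ['annotation browser', 'similarity search']
print(registry.suggest(GeneticMapData(["m1"])))  # ['annotation browser', 'map viewer']
```

Because components declare what they accept rather than referring to one another, separately developed tools can be added to or removed from the registry without changes to the others.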

Outreach
NCGR Outreach Program

Part of NCGR's founding mission was to enrich New Mexico by providing educational science and research opportunities. NCGR has established a multi-faceted outreach strategy that is focused on encouraging students and faculty in New Mexico to study science and math. We seek to establish relationships with the New Mexico science community through working partnerships. Together with regional programs and universities, we work to directly involve students in the research process through mentored training opportunities.

Core Infrastructure for New Mexico:

An example of our ongoing outreach efforts to regional students and faculty is the NIH-funded New Mexico IDeA Networks of Biomedical Research Excellence (NM-INBRE) program, of which NCGR is a participating institution. As part of this program, NCGR hosts an annual New Mexico Bioinformatics Symposium (NMBIS). Now in its third year, NMBIS brings together about 120 students and faculty from New Mexico, eastern Arizona and west Texas for research presentations, student poster presentations, and hands-on bioinformatic workshops covering such topics as Microarray Experiment Design and Microarray Data Handling, Bioinformatic Tools for Gene Discovery and Comparative Genomics. For many students, this represents their only opportunity to meet and listen to nationally recognized scientists. One highlight of the meeting is the catered student and faculty poster sessions, at which students and faculty have an opportunity to chat informally with one another and with nationally recognized scientists.

Summer Internships:

NCGR offers summer research internships (eight to ten weeks in duration, funded by NSF and NIH) to give New Mexico's undergraduates and faculty the opportunity to couple their classroom knowledge to a research experience. Our interns work on various research projects (for example: studying the relationship between peanut protein structure and allergenicity, using bioinformatic tools and resources to study the evolution of antibiotic resistance, and performing advanced computer simulations of protein dynamics). Interns are given the opportunity to see 'science at work', to see science as a viable career path, and to think about continuing their education at the Master's or Doctoral level.

Educational Outreach:

NCGR scientists and staff travel to institutions within New Mexico to give hands-on workshops on topics such as genomics and protein structure visualization and manipulation. Furthermore, we participate in group-targeted outreach activities such as the Sandia National Laboratories Dream Catcher Science Program, a program intended for American Indian students interested in science, math and engineering.