Our Work - Collaborative Projects and Scientific Software
The success of NCGR depends on collaborative research at the
intersection of bioscience, computing and mathematics. Today at
NCGR, our scientists and partners study the influence of genetic
variability of both host and pathogen on infectious disease
progression. NCGR's software engineers develop scientific
software solutions to support and enable those studies. A range
of federally and state-funded programs supports our
projects in Human Health, Infectious Disease, Legume Crop
Improvement, and Food Security.
Alpheus Software System
In 2006, NCGR commenced development of the
Alpheus™
(N. Miller, Project Lead) web-based software system for
analysis of data in massively parallel resequencing projects.
Specifically, Alpheus™ was designed for resequencing-based
case-control association studies to identify the genetic basis
of complex diseases and traits. Alpheus™ provides massively
parallel sequence pipelining, visualization, analysis, and project
management capabilities.
Alpheus™ provides dynamic queries and visualization of read data, variant
data and results via an intuitive user interface. Alpheus™ reports synonymous SNPs (sSNPs),
non-synonymous SNPs (nsSNPs), indels, premature stop codons, and splice isoforms. Read coverage
statistics are reported by gene or transcript together with a visualization
module based upon an individual transcript or genomic segment.
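The distinction between sSNPs (synonymous), nsSNPs (non-synonymous) and premature stop codons can be illustrated with a minimal sketch. It assumes the standard genetic code and a single-base substitution inside a known in-frame codon; the function name `classify_coding_snp` is hypothetical and this is not Alpheus™ code.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, offset, alt_base):
    """Classify a single-base substitution within one in-frame codon.

    ref_codon: reference codon (three DNA letters)
    offset:    0-based position of the substituted base within the codon
    alt_base:  the variant base
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "sSNP"            # same amino acid: synonymous
    if alt_aa == "*":
        return "premature stop"  # amino acid codon changed to a stop codon
    return "nsSNP"               # different amino acid (or loss of a stop codon)


if __name__ == "__main__":
    print(classify_coding_snp("GAA", 2, "G"))  # GAA -> GAG
    print(classify_coding_snp("GAA", 1, "T"))  # GAA -> GTA
    print(classify_coding_snp("GAA", 0, "T"))  # GAA -> TAA
```

A production pipeline would also need transcript coordinates, strand handling and splice-aware codon lookup, none of which are modeled here.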
Alpheus™ supports reads from all current DNA sequencing platforms, including:
- Sanger
- Roche-454
- Illumina-Solexa
- ABI SOLiD
It is also suited to:
- datasets of 100s of gigabases
- nucleotide variant and splice isoform identification
Alpheus™ provides data management services, an analysis pipeline, and
internet-accessible software for variant discovery and analysis for
ultra-high throughput next-generation sequence data with minimal human
manipulation. Alpheus™ is available on a software-as-a-service
basis to academic and industry clients. Upon provision of sequence data
and reference database coordinates, NCGR provides clients with a secure,
custom web interface in which to analyze aligned reads and discover
differences between samples. Alpheus™ is also available for local
installation.
Contact Faye Schilkey (fds@ncgr.org) for details or pricing information.
Human Health/Infectious Disease
Schizophrenia Genome Project
The Schizophrenia Genome Project (S. Kingsmore, PI)
was established in 2007 to identify the genetic basis of schizophrenia.
The SGP is a collaboration with Dr. Nora Bizzozero at the University of
New Mexico (UNM) and Dr. Gary Schroth at Illumina Inc.
To date, the SGP has sequenced the transcriptome of 20 case and control
samples, generating more than 15 billion nucleotides of sequence. Most
of the samples analyzed to date have been from cerebellar cortex, an
affected tissue in schizophrenia. Investigators are performing case-control
comparisons to identify non-synonymous nucleotide variants that are
associated with schizophrenia. The National Institute of Mental Health
has generously provided thousands of archived samples for validation studies.
GEYSIR/deCODE Genetics
This NIAID-funded Population Genetics project
(J. Gulcher, PI, deCODE Genetics) is aimed at discovering host genes
involved in immune response and in adverse effects of
vaccination. Specifically, the teams at deCODE, NCGR and the
University of New Mexico Health Sciences Center (UNM-HSC) will
collaborate to study four different populations with:
- adverse effects of smallpox vaccination,
- clinical tuberculosis infection versus seroconversion,
- serious influenza infections, and
- one or more severe infections associated with encapsulated bacteria such as
S. pneumoniae, H. influenzae, and N. meningitidis.
Using the Icelandic genealogy database, deCODE is
identifying extended families affected in each category and carrying
out genome-wide linkage and case-control association studies to map
host genes. Following identification of host genes that confer
substantial risk for infection or vaccine response, the UNM-HSC team
will functionally validate them by testing protein and mRNA expression
differences in monocytes and dendritic cells of patients with infection
susceptibility versus controls, with or without in vitro
pathogen exposure. NCGR (B. Beavis, PI; S. Baxter, PM) is using its
expertise in creating informatics systems and analyzing large datasets
to create and update a discovery platform of linkage analysis and
validation results called GEYSIR.
Diagnostics for severe sepsis and community acquired pneumonia (CAPSOD)
This NIAID-funded program, titled "CAPSOD",
is a public-private, multidisciplinary collaboration involving
investigators at ten organizations: NCGR; Duke University Medical
Center, Durham, NC; Henry Ford Hospital, Detroit, MI; Durham Veterans
Administration Medical Center, Durham, NC; Eli Lilly and Co.,
Indianapolis, IN; Monarch Life Sciences, Indianapolis, IN; Pfizer, Inc.,
Groton, CT; Metabolon, Inc., Durham, NC; Roche Diagnostics Corp.,
Indianapolis, IN; and ProSanos Corp., La Jolla, CA.
CAPSOD
is a five-year program that will prospectively enroll patients with
sepsis and CAP at Duke University Medical Center and Henry Ford
Hospital. The study will use advanced bioinformatic and proteomic
technologies to identify specific protein changes, or biomarkers,
in patient blood samples that predict outcome in sepsis and CAP.
Development of biomarker-based tests will permit patient selection
for appropriate disposition, such as the intensive care unit, and
use of intensive medical therapies, thereby reducing mortality and
increasing effectiveness of resource allocation. See the full CAPSOD
description at
ClinicalTrials.gov.
New Mexico IDeA Networks of Biomedical Research Excellence (NM-INBRE)
The NIH/NCRR-funded NM-INBRE program is a collaboration among a
number of New Mexico institutions, including New Mexico State
University (NMSU), the University of New Mexico (UNM), Eastern New
Mexico University (ENMU), New Mexico Institute of Mining and
Technology (NMT), New Mexico Highlands University (NMHU), and
NCGR. INBRE aims to strengthen biomedical research in New Mexico's
institutions of higher education and to prepare faculty and students
for participation in the research programs of the National Institutes
of Health. NCGR provides bioinformatics training and research,
develops customized bioinformatic tools, hosts and maintains the
NM-INBRE website to support collaboration, and hosts an annual
Bioinformatics
Symposium. NCGR's work also includes an outreach program for
students at other four-year undergraduate institutions and at tribal and
community colleges in the state, aimed at increasing matriculation in graduate
biomedical research programs.
Tuberculosis Archive Project
National Tuberculosis Archive
Dr. Damian Gessler at NCGR was the lead author of
an article in the journal
Science
[Gessler, Dye, Farmer, Murray, Navin, Reves, Shinnick, Small, Yates &
Simpson (2006) Science. 311: 1245-1246.] proposing the establishment
of the nation's first integrated clinical and biological information
resource for tuberculosis (TB). In coordination with the universal
genotyping program of the Centers for Disease Control, the Tuberculosis
Archive would contain a sample from every verified case of TB in the
United States, together with comprehensive clinical, epidemiological,
and genomic information. At present, these samples and information
are dispersed among independent organizations, geographic locations,
and databases. This lack of integration greatly hampers rapid
response to outbreaks and analysis of the spread of drug-resistant
TB. Currently, the Science article authors, representing an
international team of nine leading health and research organizations,
are working to make the Tuberculosis Archive a reality.
Plant Biology/Nutrition
Legume Information System (LIS)
The Legume Information
System (LIS) is the result of a cooperative research agreement
between NCGR (G. May, PI) and the USDA Agricultural
Research Service (ARS) as part of the Model Plant Initiative
(MPI). The LIS project provides a publicly accessible legume resource
that integrates genetic and molecular data from multiple legume
species and enables cross-species comparisons of genomic, transcript
and map data.
Virtual Plant Information Network (VPIN)
The VPIN is a
collaborative technical project funded by NSF (D. Gessler, PI).
Together with technical teams at TIGR and Cold
Spring Harbor Laboratory, NCGR is building a technology framework
for a virtual plant network aimed at integrating data and services
provided by independently evolving information resources.
Participating plant information resources include DragonDB,
Gramene,
IRIS,
IWIS,
LIS,
TAIR, and
TIGR. The technical foundation of the
VPIN is based on evolving semantic web and web services technologies,
first developed as part of the previous NSF-funded MOBY projects. The
ongoing development of the VPIN platform is run as an open source
project. The code is currently hosted at the
Open Bioinformatics
Foundation, where you can view
the CVS code repository for the Semantic MOBY/VPIN
project.
Phytophthora Studies
Oomycetes, or water molds,
are among the most important eukaryotic plant pathogens. Annually, they cause
hundreds of billions of dollars in damage to agricultural and ecological systems worldwide,
impacting the productivity and sustainability of food crops, ornamentals,
forest products, and seafood. Several oomycete species represent sufficient
threats to the safety of the nation’s food supply to merit inclusion on the Animal
and Plant Health Inspection Service agricultural bioterrorism list or the USDA
regulated plant pest list. There exists an urgent need to identify the genetic
determinants of virulence and host range in order to develop improved control
methods.
Phytophthora capsici Genome Project
P. capsici is not indigenous to the US.
It was first reported in the US in 1922 on chili peppers in
New Mexico and spread to vegetable production areas in Colorado and
Florida in the 1930s and 1940s, affecting tomatoes, eggplants,
squash, and melons.
NCGR (S. Kingsmore, PI), along
with biologists at University of Tennessee and Ohio State University, is funded by
USDA/NSF to sequence the P. capsici genome. The rationale for
these studies is:
- P. capsici is a devastating pathogen of vegetable crops of
national economic importance;
- P. capsici is an excellent genetic model. This project
will create broadly applicable resources for gene models and
population genetic studies of oomycete biology and
hemibiotroph-induced disease;
- 454 sequencing technology will be evaluated and benchmarked for
de novo and re-sequencing in the largest genome studied to date
(65 Mb).
In collaboration with, and with support from, DOE's JGI sequencing group,
the aims are to use novel 454 Life Sciences sequencing technology to generate:
- 20X draft genome sequence of the vegetable pathogen
Phytophthora capsici,
- 2X coverage resequencing in 4 outbred isolates, and
- a catalog of single nucleotide variation.
These resources will be disseminated at the Phytophthora Functional Genomic Database (PFGD).
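As a rough worked example of what the coverage targets above imply, the sketch below estimates the raw sequence needed. It assumes that coverage means total sequenced bases divided by genome size, that the quoted genome size is 65 million bases, and that read overlap, quality filtering and assembly losses are ignored; the function name `bases_required` is hypothetical.

```python
def bases_required(genome_size_bp, coverage, samples=1):
    """Total bases to sequence for a target average coverage.

    Assumes coverage = total sequenced bases / genome size, per sample.
    """
    return genome_size_bp * coverage * samples


GENOME_SIZE_BP = 65_000_000  # genome size quoted in the text, taken as 65 million bases

draft_bases = bases_required(GENOME_SIZE_BP, coverage=20)
reseq_bases = bases_required(GENOME_SIZE_BP, coverage=2, samples=4)

if __name__ == "__main__":
    print(f"20X draft genome:          {draft_bases / 1e9:.2f} Gb")
    print(f"2X resequencing, 4 isolates: {reseq_bases / 1e9:.2f} Gb")
    print(f"Total:                     {(draft_bases + reseq_bases) / 1e9:.2f} Gb")
```

Under these assumptions the draft genome needs about 1.3 billion bases and the four resequenced isolates about 0.52 billion bases in total.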
Phytophthora Functional Genomic Database (PFGD)
PFGD is a web-based,
clade-oriented information resource that builds upon data
formerly available from the Phytophthora Genome Consortium (PGC) and
at the Oomycete Genomics Database, as
well as all publicly available P. infestans transcript
data. PFGD is funded by NSF (S. Kamoun, Ohio State University, PI).
Oomycete sequence data is analyzed and
automatically annotated using NCGR's XGI system. PFGD includes
functional assays and gene expression data, combined with transcript
and genomic analysis and annotation. PFGD also integrates the
P. sojae and P. ramorum genomes and their annotations
for comparative analysis. In addition, host species data
(available at solgd.org) is integrated at PFGD. Going forward,
P. capsici sequence data and variant analysis will also be
available at PFGD.
Software Tools and Active Software Development Projects
NCGR's programs have
produced a number of software tools that are freely available to the scientific
community, either over the internet or as software downloads. Some software packages
are released under open source licenses; others are freely available to
non-profit organizations under a licensing agreement. For additional
information on NCGR's software tools, contact us at info@ncgr.org.
Virtual Plant Information Network (VPIN)
The technical foundation of the VPIN (M. Montoya, PM) is based on
evolving semantic web and web services technologies, first developed
as part of the previous NSF-funded MOBY projects. The ongoing
development of the VPIN platform is run as an open source project. The
code is currently hosted
at the Open Bioinformatics Foundation.
The X Genome Initiative (XGI)
The X Genome
Initiative (K. Gajendran, PM) is NCGR's high-throughput
computational, species-independent sequence analysis pipeline and
database software system. XGI uses a variety of algorithms for
sequence pattern recognition, comparison and annotation of genomic,
EST or ORF sequence data types. XGI is the annotation and database
engine behind both PFGD and the
Legume Information
System. The system is being modified to incorporate assembly
algorithms for 454 Life Sciences and Sanger reads, in addition to
handling sequence variant detection. XGI jobs are queued across
NCGR's Linux cluster and results are stored in a relational
database. XGI operates on batch files and can be configured to perform
any series of sequence similarity or motif searching operations based
on user preference or sequence type. Pipeline analyses can include
BLAST analyses, InterProScan algorithms and a variety of tools for
gene prediction and assembly. Automated post-analysis annotation
links each best-match annotation to Gene Ontology entries and annotations.
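The idea of choosing a series of analysis steps by sequence type or user preference can be sketched as follows. This is only an illustration: the step names, the default step lists and the function `plan_pipeline` are hypothetical placeholders and do not reflect XGI's actual configuration format or code.

```python
# Hypothetical default analysis steps per sequence type (placeholders, not XGI's).
DEFAULT_STEPS = {
    "genomic": ["gene_prediction", "blast", "interproscan"],
    "est": ["assembly", "blast", "interproscan"],
    "orf": ["blast", "interproscan"],
}


def plan_pipeline(sequence_type, user_steps=None):
    """Return the ordered list of analysis steps for one batch.

    A user-supplied list takes precedence; otherwise the default steps
    for the sequence type are used.
    """
    if user_steps is not None:
        return list(user_steps)
    key = sequence_type.lower()
    if key not in DEFAULT_STEPS:
        raise ValueError(f"no default pipeline for sequence type {sequence_type!r}")
    return list(DEFAULT_STEPS[key])


if __name__ == "__main__":
    print(plan_pipeline("EST"))
    print(plan_pipeline("genomic", user_steps=["blast"]))
```

In a real system each step name would map to a job submitted to the cluster queue, with results written to the database; that part is omitted here.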
Comparative Map and Trait Viewer (CMTV)
The Comparative Map and Trait
Viewer (A. Farmer, PM) is a graphical client for integrating
various types of genomic data from different sources, including
annotated sequences, genetic maps and QTL data. The tool allows
comparison of maps using a variety of different algorithms. The
results of these comparisons are used to integrate
data from multiple maps into a common framework. As a component of
the ISYS integration platform,
it provides a structural/comparative perspective on data that may
be simultaneously viewed in relation to functional classification
systems such as GO or biochemical interaction networks. Our
collaborators at CIMMYT
have used the tool to construct drought tolerance consensus maps
for Zea mays based on the results of multiple trait mapping
experiments under different genetic backgrounds and environmental
conditions. The Legume Information
Network has used the tool, delivered via Java WebStart, as a client
that aggregates genomic data provided via semantic web services and
explores synteny relationships among legume species.
The earliest versions of CMTV were developed in
collaboration with four CGIAR centers (CIAT, CIMMYT, CIP and IRRI). The current funding for the
project comes from a USAID Linkage Grant with CIMMYT. CMTV source code is available
from SourceForge.
Genomic Explorer y Survey of Immune Response (GEYSIR)
GEYSIR (Faye
Schilkey, PM) is an interactive, web-based genomic visualization tool
developed as part of the NIAID/deCODE population genetics
project. Web-based tools for exploring genomic data are typically
statically rendered HTML pages, which lack live interactivity. Given
the volume of genomic data and the range of scales involved,
this lack of dynamic interaction is usually
not only cumbersome for the user but inadequate for scientific
discovery. To address this issue in the context of population
genetics studies, GEYSIR was developed to enable exploration of genomic
data across a wide range of scales, from single nucleotide polymorphisms to gene
neighborhoods to marker sets and association data spanning all
chromosomes. GEYSIR is designed to be a highly interactive, dynamic,
and responsive web application. Additionally, it was designed from
the outset for extensibility and reusability so that the code base and
architecture can be reused for a wide variety of genomic data,
organisms, research, and data models.
Integrated Software Systems (ISYS)
ISYS (A. Farmer, PM) is a
dynamic, flexible platform for the integration of bioinformatics
software tools and databases. ISYS offers a component-based
architecture that enables scientists to "plug and play"
among tools of interest. These tools may be separately developed and
independently evolving. In addition, ISYS allows web-based resources
to be integrated with programs running on the scientist's desktop.
ISYS's DynamicDiscovery™ technology creates
an exploratory environment in which scientists can navigate freely
among registered components. DynamicDiscovery helps to guide the user
by suggesting appropriate registered components to process selected
data objects. In addition, ISYS supports visual synchronization among
components, which helps each one to complement the others. ISYS is
written in Java for platform independence and is supported on Windows
and Solaris. It is also available, without a bundled Java Virtual Machine, for
Linux and other types of UNIX. The ISYS Platform code has been
released under an Open Source license and is available for download from
SourceForge.
Outreach
NCGR Outreach Program
Part of NCGR's founding mission was to enrich New
Mexico by providing educational science and research opportunities. NCGR
has established a multi-faceted outreach strategy that is focused on
encouraging students and faculty in New Mexico to study science and
math. We seek to establish relationships with the New Mexico science
community through working partnerships. Together with regional programs
and universities, we work to directly involve students in the research
process through mentored training opportunities.
Core Infrastructure for New Mexico:
An example of our ongoing outreach efforts to regional
students and faculty is the NIH-funded New Mexico IDeA Networks of
Biomedical Research Excellence (NM-INBRE) program, of
which NCGR is a participating institution. As part of this program, NCGR
hosts an annual New Mexico Bioinformatics Symposium (NMBIS).
Now in its third year, NMBIS brings together about 120 students and faculty
from New Mexico, eastern Arizona and west Texas for research
presentations, student poster presentations, and hands-on bioinformatic
workshops covering such topics as Microarray Experiment Design and
Microarray Data Handling, Bioinformatic Tools for Gene Discovery and
Comparative Genomics. For many students, this represents their only
opportunity to meet and listen to nationally recognized scientists. One
highlight of the meeting is the catered student and faculty poster
sessions, at which students and faculty have an opportunity to chat
informally with one another and with nationally recognized scientists.
Summer Internships:
NCGR offers summer research internships (eight to
ten weeks in duration, funded by NSF and NIH) to give New Mexico's
undergraduates and faculty the opportunity to couple their classroom
knowledge to a research experience. Our interns work on various research
projects (for example: studying the relationship between peanut protein
structure and allergenicity, using bioinformatic tools and resources to
study the evolution of antibiotic resistance, and performing advanced
computer simulations of protein dynamics). Interns are given the
opportunity to see 'science at work', to see science as a viable career
path, and to think about continuing their education at the Master's or
Doctoral level.
Educational Outreach:
NCGR scientists and staff travel to institutions
within New Mexico to give hands-on workshops on topics such as genomics
and protein structure visualization and manipulation. Furthermore, we
participate in group-targeted outreach activities such as the Sandia
National Laboratories Dream
Catcher Science Program, a program intended for American Indian
students interested in science, math and engineering.