A Computing Facility for Bioinformatics and Life Sciences Research |
| Project Active: Start Date 2004-09-29 |
Broadly, we can view computing associated with life sciences research as the application of advanced information technology to solve biological problems. These technologies are aimed at organizing biological data, analyzing the data, and then facilitating its interpretation. The knowledge gained from this process can then be used to build predictive models of biological systems. Thus, life sciences research needs information technologies, e.g., advanced algorithms, high performance computing, data and information management (including databases and data mining systems), and software to support communications and collaboration. |
|
Faculty Investigator(s): Victor Frost (PI), Xue-wen Chen, Terry Clark Staff Investigator(s): Adam Hock, David Johnson Student Investigator(s): Doug Herbers, Justin Ward |
| Primary Sponsor(s): NIH-Health Resources and Services Administration (HRSA) (of U.S. Dept. of Health and Human Services) |
An Online Clearinghouse for Bioinformatics Software Sharing and Evaluation |
| Project Active: Start Date 2006-09-29, Projected End Date 2007-04-30 |
The registry will help prevent the duplication of comparable tools and publicize technical developments. It will collect user feedback and provide reliable cross-platform documentation. |
| Faculty Investigator(s): Gerald Lushington (PI), Jianwen Fang |
| Primary Sponsor(s): University of Kansas Medical Center Research Institute |
CAREER: Machine Learning Approaches for Genome-wide Biological Network Inference |
| Project Active: Start Date 2007-02-22, Projected End Date 2010-04-30 |
Because of technological limitations, molecular biology research has had to focus on individual genes and gene products. This has led to a wealth of knowledge about individual cellular components and their functions. However, knowledge of isolated cellular components is not sufficient to understand most cellular functions, which are carried out by complex networks. It is therefore imperative to employ network-based approaches to address the complexity of living systems. |
|
Faculty Investigator(s): Xue-wen Chen (PI) Student Investigator(s): Mei Liu, Jong Jeong, Bing Han, Michael Wasikowski, Jae Kim, Alexander Senf, Matthew Mandelbaum, Patrick Dermyer |
| Primary Sponsor(s): NSF |
CAREER: Mining Genome-wide Chemical-Structure Activity Relationships in Emergent Chemical Genomics Databases |
| Project Active: Start Date 2009-07-01, Projected End Date 2014-06-30 |
ITTC will develop an integrated research and education program for advancing the underlying theoretical and computational principles of data mining in emergent chemical genomics databases. The core technical innovations are advances in (i) developing effective kernel-based representations and structure pattern extraction and selection methods to capture the intrinsic characteristics of irregular and discrete spaces such as the chemical space, (ii) designing methods for adaptive and scalable similarity search in large databases of complex data and methods for accurate classification model construction with imbalanced and out-of-domain data, and (iii) deriving application-oriented validation. |
|
Faculty Investigator(s): Jun Huan (PI) Student Investigator(s): Brian Quanz |
| Primary Sponsor(s): National Science Foundation |
Computational Prediction of Beta-Sheet Arrangement (K-INBRE) |
| Project Active: Start Date 2005-07-01, Projected End Date 2006-04-30 |
It is widely believed that protein misfolding into beta aggregates or fibrils is a significant contributor to the onset of Alzheimer’s, Parkinson’s, and other neurodegenerative diseases. Although knowledge of the mechanism for conformational change may be critical to control of these diseases, considerable uncertainty exists about the nature of aggregate formation and the nature of the fibrils. Computational prediction of the transformation may present a plausible approach to resolving some of the uncertainties. Use of long-range interactions in prediction of β-strand arrangement in the formation of β-sheets may well be an essential step to forecasting/determining the 3-D structure of proteins from amino acid sequences. Thus, a better understanding of β-strand arrangement in β-sheets will not only provide possible solutions to prevent intermolecular β-sheet formation associated with neurodegenerative diseases but may also contribute to the success of 3-D structure prediction. |
| Faculty Investigator(s): Jianwen Fang (PI) |
| Primary Sponsor(s): University of Kansas Medical Center Research Institute, Inc. (KUMCRI) |
Computational Proteomics: Protein Interaction Prediction |
| Project Active: Start Date 2004-09-01, Projected End Date 2006-06-30 |
Proteins perform biological functions by interacting with other molecules. During a protein-protein interaction, the conserved domains physically interact with each other. Thus, understanding protein interactions at the domain level gives detailed functional insights into proteins that are either characterized or newly discovered. However, unlike protein-protein interactions, which can be discovered by some high-throughput technologies, domain-domain interactions largely remain unknown. This project addresses this issue by developing computational models to infer domain-domain interactions from protein-protein interactions; the models can then be used to validate and predict unknown protein interactions. |
| Faculty Investigator(s): Xue-wen Chen (PI) |
| Primary Sponsor(s): National Institutes of Health |
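As a rough illustration of inferring domain-domain interactions from protein-protein interactions, the sketch below scores each domain pair by the fraction of protein pairs containing it that are known to interact. This simple co-occurrence ratio is an assumption chosen for illustration, not the project's actual model; the function name, protein identifiers, and domain names are all hypothetical.

```python
from collections import Counter
from itertools import combinations, product
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def domain_pair_scores(
    protein_domains: Dict[str, Set[str]],
    interactions: Iterable[Tuple[str, str]],
) -> Dict[FrozenSet[str], float]:
    """Score each domain pair as (interacting protein pairs containing it)
    divided by (all protein pairs containing it)."""
    interacting = {frozenset(pair) for pair in interactions}
    seen: Counter = Counter()
    hits: Counter = Counter()
    # Visit every unordered protein pair once.
    for a, b in combinations(sorted(protein_domains), 2):
        # Domain pairs that this protein pair could bring together.
        domain_pairs = {
            frozenset((da, db))
            for da, db in product(protein_domains[a], protein_domains[b])
        }
        for dp in domain_pairs:
            seen[dp] += 1
            if frozenset((a, b)) in interacting:
                hits[dp] += 1
    return {dp: hits[dp] / seen[dp] for dp in seen}


# Toy, made-up data: four proteins and two known interactions.
toy_domains = {"P1": {"SH3"}, "P2": {"PRM"}, "P3": {"SH3", "KIN"}, "P4": {"PRM"}}
toy_interactions = [("P1", "P2"), ("P3", "P4")]
toy_scores = domain_pair_scores(toy_domains, toy_interactions)
```

Domain pairs with high scores would be candidates for mediating the observed protein interactions; a realistic model would also need to account for noisy and incomplete interaction data.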
Constructing Gene Networks from Microarray Data for Age-Dependent Epileptogenesis |
| Project Active: Start Date 2004-07-01, Projected End Date 2005-06-30 |
Epilepsy, characterized by the repetitive occurrence of seizures, currently afflicts approximately 4 percent of Americans of all backgrounds and ages. No currently available therapy can completely arrest the epileptic process in most individuals. To develop effective prevention and therapeutic intervention approaches, the molecular mechanisms of epilepsy must be identified. Bioinformatics approaches will unravel relationships among the specific genes and generate hypotheses on the molecular mechanisms of the epileptogenic process. |
| Faculty Investigator(s): Xue-wen Chen (PI) |
| Primary Sponsor(s): Center of Biomedical Research Excellence (COBRE)-NIH |
Development of an Integrated Bioinformatics Information Infrastructure |
| Project Active: Start Date 2004-10-13, Projected End Date 2006-09-29 |
The Army's chemical and biological defense research and development interests span numerous activities that should benefit significantly from the easier data flow and hypothesis testing that an enhanced informatics infrastructure provides. Chemical and biological defense research is multifaceted, involving issues from the sub-cellular level through the ecological and geographic dynamics of a disease. The same is true of current life science research activities at the University of Kansas. Because several KU efforts are closely related to those under way within the Edgewood Chemical Biological Center (ECBC), the bioinformatics infrastructure developed under this effort at KU, in conjunction with local research activities, should be relevant to, and readily extensible to, the information management needs within the ECBC. |
|
Faculty Investigator(s): Victor Frost (PI), Terry Clark, Susan Gauch, Gary Minden Student Investigator(s): Alexander Garrett, Lance Feagan, Justin Rohrer, Jesse Stanley, Keith Preston, Doug Herbers, Andrew Ozor, Justin Ward, Heather Amthauer |
| Primary Sponsor(s): U.S. Army |
First Award: Rapid Integration of Genomic Data from Multiple Sources |
| Project Active: Start Date 2005-03-21, Projected End Date 2006-05-31 |
The research will automate data integration and schema extensions toward intuitive and flexible interfaces to object-oriented databases for biologists, expert and non-expert users, and software systems. The target application scenario involves large collections of primary genomic data stored in an object-oriented genomic database. Here users are interested in integrating data and schemas from external sources with a comprehensive warehouse. In this work, XML is chosen as the input format for data; the target genomic data warehouse is the public domain Genomic Unified Schema, GUS. A framework is designed and developed to admit new data types (schema) and dynamically incorporate them through the database object layer using an automatically generated interface. This interface will automatically generate mappings between input data and data warehouse objects from compliant (based on a current prototype) and schema definitions. Input data may conform to the target GUS schema, or to new schemas. The proposed functionality will extend the XMLGUS data loading system developed by the PI. This interface has proven successful in application settings, yet generating the interface grammar manually is tedious. Thus, to address and manage the complexity of GUS, a part of the three-year proposed work extends the XMLGUS framework to generate the variable components automatically.
|
Faculty Investigator(s): Terry Clark (PI) Student Investigator(s): Krishna Kotcherlakota, Yi Jia |
| Primary Sponsor(s): NSF & KTEC |
K-INBRE Cellular Pathogen Gene Identification via Graph Data Mining |
| Project Active: Start Date 2007-06-27, Projected End Date 2008-04-30 |
Genomics efforts continue to yield a myriad of new protein sequences. They offer unprecedented opportunities for knowledge-based sequence annotations that aim to automatically transfer experimentally gained biological knowledge from model organisms to newly sequenced genomes to expedite biological discovery. Applying rigorous data mining methods to large, sequentially diverse, and clinically important protein families, like the immunologic proteins, can yield reliable, intuitively predictive models readily extensible to annotating novel sequences. This would enable rational experimental design that may lead to improved medicine against refractory pathogens. Specifically, for characterizing and annotating immunological proteins, we plan to devise, refine, and disseminate statistical geometric analysis methods. These will include rigorous protein structure representation using geometric graphs, identifying conserved substructure patterns in protein structures based on graph database mining, mapping structure patterns to sequence motifs, and annotating genes using the obtained sequence motifs with advanced statistical learning methods such as support vector machines. |
|
Faculty Investigator(s): Jun Huan (PI) Student Investigator(s): Lin Yi, Vincent Buhr, Yi Jia, Jae Kim, Xiaotong (Cindy) Lin |
| Primary Sponsor(s): KUMCRI (flow-through from NIH) |
K-INBRE: Complete, Upgrade and Enhance Data Handling in the Analytical Proteomics Laboratory |
| Project Active: Start Date 2007-06-26, Projected End Date 2008-04-30 |
Researchers will refine preliminary software designed to generate statistically justifiable and robust protein identifications, especially for KU investigators studying targeted proteomes of hundreds to thousands of proteins (e.g., mitochondria, lipid rafts, liver microsomes, and protein-protein interaction pull-downs).
| Faculty Investigator(s): Gerald Lushington (PI), Jianwen Fang |
K-INBRE: Web Server Tracker, an Automated Literature, Protein/DNA Sequence and Domain Tracking System |
| Project Active: Start Date 2007-06-26, Projected End Date 2008-04-30 |
Tracker is a widely used automatic literature and protein/DNA sequence and domain tracking system developed at KU under the K-BRIN program. ITTC investigators are procuring a powerful web application server to replace an old server that currently runs Tracker. Researchers also plan to update the application to meet new needs as specified by users. |
|
Faculty Investigator(s): Jianwen Fang (PI), Gerald Lushington Student Investigator(s): Brian Quanz, Raymond Anderson |
| Primary Sponsor(s): KUMCRI (flow-through from NIH) |
Unified Data Format for Mass Spectrometry Analysis (UDF) |
| Project Active: Start Date 2005-01-13, Projected End Date 2005-06-30 |
Despite the similarity of information content across a wide variety of vendor-specific mass spectrometry formats (i.e., the pervasive mass/charge ratio), tools developed to process the data coming from one instrument are rarely capable of processing data derived from another platform. There is a great desire to be able to do so, since specific analysis options available on one platform are often of value to (and unavailable to) data arising from another. This communications issue can be largely overcome by a) constructing a set of conversion routines to deposit all data (except that arising from a small number of vendors that contractually forbid format reverse-engineering) into a consistent and unified format, and b) developing commensurate routines for back-converting from this unified format to vendor-specific structures. |
|
Faculty Investigator(s): John Gauch Student Investigator(s): Praveen Lakkaraju |
| Primary Sponsor(s): Kansas IDeA Network of Biomedical Research Excellence (K-INBRE)-NIH |
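The conversion and back-conversion idea described for the UDF project can be sketched as follows. This is a minimal sketch under stated assumptions: the `UnifiedSpectrum` record and both "vendor" text layouts are invented for illustration and do not correspond to the project's actual format or to any real instrument format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class UnifiedSpectrum:
    """Hypothetical unified record: one spectrum as (m/z, intensity) peaks."""
    scan_id: str
    peaks: List[Tuple[float, float]]


def from_vendor_a(text: str) -> UnifiedSpectrum:
    """Parse a made-up 'vendor A' layout: 'SCAN=<id>' then 'mz,intensity' lines."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    scan_id = lines[0].split("=", 1)[1]
    peaks = []
    for ln in lines[1:]:
        mz, intensity = ln.split(",")
        peaks.append((float(mz), float(intensity)))
    return UnifiedSpectrum(scan_id, peaks)


def to_vendor_a(spec: UnifiedSpectrum) -> str:
    """Back-convert the unified record to the made-up 'vendor A' layout."""
    body = [f"{mz},{intensity}" for mz, intensity in spec.peaks]
    return "\n".join([f"SCAN={spec.scan_id}"] + body)


def from_vendor_b(text: str) -> UnifiedSpectrum:
    """Parse a made-up 'vendor B' layout: '<id>;mz:intensity;mz:intensity'."""
    fields = text.strip().split(";")
    peaks = []
    for field in fields[1:]:
        mz, intensity = field.split(":")
        peaks.append((float(mz), float(intensity)))
    return UnifiedSpectrum(fields[0], peaks)


# The same toy spectrum expressed in both invented layouts.
spec_a = from_vendor_a("SCAN=s1\n100.5,20.0\n200.25,5.0\n")
spec_b = from_vendor_b("s1;100.5:20.0;200.25:5.0")
```

Once both inputs map to the same unified record, a tool written against that record can process data from either source, and `to_vendor_a` shows the reverse direction.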
First Award: Identify Informative Genes for Cancer Classification |
| Project Expired On: 2005-06-30 |
With the completion of the Human Genome Project and the advance of microarray technologies, it is now possible to explore the whole genome both systematically and comprehensively. Microarrays have been extensively used for screening gene expressions and for exploiting important clues to understanding the role of genes and the underlying gene regulatory networks. Use of microarrays is rapidly generating large amounts of data (typically terabytes) that create both opportunities and challenging problems. Conventional methods are increasingly unable to deal with the huge amount of data. For example, when applied to cancer classification, microarray data overwhelm conventional machine learning algorithms because the number of samples is much smaller than the number of features (genes). A major challenge is the identification of informative genes for cancer classification from gene expression measurements. In fact, it has been demonstrated that only a small number of genes are relevant to a specific cancer classification problem. Identifying these relevant genes is important in numerous microarray-based applications such as drug discovery, early disease detection, and proper treatment guidance. |
|
Faculty Investigator(s): Xue-wen Chen (PI) Student Investigator(s): Mei Liu, Manjunath Narayana |
| Primary Sponsor(s): NSF and KTEC |
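To make the idea of selecting informative genes concrete, the sketch below ranks genes by a simple signal-to-noise score: the absolute difference between class means divided by the sum of the class standard deviations. This scoring rule is an assumption chosen for illustration and is not claimed to be the method used in the project; the gene names and expression values are made up.

```python
from statistics import mean, stdev
from typing import Dict, List


def rank_genes(expr: Dict[str, List[float]], labels: List[int]) -> List[str]:
    """Return gene names ordered from most to least class-discriminating.

    expr maps a gene name to one expression value per sample;
    labels holds 1 (e.g., tumor) or 0 (e.g., normal) per sample.
    Each class needs at least two samples so stdev is defined.
    """
    eps = 1e-9  # avoids division by zero when both classes are constant
    scores = {}
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(mean(pos) - mean(neg)) / (stdev(pos) + stdev(neg) + eps)
    return sorted(scores, key=scores.get, reverse=True)


# Toy, made-up data: six samples (three per class) and three genes.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_expr = {
    "geneA": [5.0, 5.2, 4.8, 1.0, 1.1, 0.9],
    "geneB": [2.0, 3.0, 1.0, 2.1, 2.9, 1.2],
    "geneC": [3.0, 3.5, 2.5, 2.0, 2.4, 1.6],
}
toy_ranking = rank_genes(toy_expr, toy_labels)
```

Keeping only the top few genes from such a ranking reduces the number of features relative to the number of samples, which is the imbalance described above. A per-gene score like this ignores interactions among genes, so it should be read only as a starting point.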
Copyright © 2008 by the University of Kansas
