Matching Items (82)
130362-Thumbnail Image.png
Description
Background
Multicellular organisms consist of cells of many different types that are established during development. Each type of cell is characterized by the unique combination of expressed gene products as a result of spatiotemporal gene regulation. Currently, a fundamental challenge in regulatory biology is to elucidate the gene expression controls that

Background
Multicellular organisms consist of cells of many different types that are established during development. Each type of cell is characterized by the unique combination of expressed gene products as a result of spatiotemporal gene regulation. Currently, a fundamental challenge in regulatory biology is to elucidate the gene expression controls that generate the complex body plans during development. Recent advances in high-throughput biotechnologies have generated spatiotemporal expression patterns for thousands of genes in the model organism fruit fly Drosophila melanogaster. Existing qualitative methods enhanced by a quantitative analysis based on computational tools we present in this paper would provide promising ways for addressing key scientific questions.
Results
We develop a set of computational methods and open source tools for identifying co-expressed embryonic domains and the associated genes simultaneously. To map the expression patterns of many genes into the same coordinate space and account for the embryonic shape variations, we develop a mesh generation method to deform a meshed generic ellipse to each individual embryo. We then develop a co-clustering formulation to cluster the genes and the mesh elements, thereby identifying co-expressed embryonic domains and the associated genes simultaneously. Experimental results indicate that the gene and mesh co-clusters can be correlated to key developmental events during the stages of embryogenesis we study. The open source software tool has been made available at http://compbio.cs.odu.edu/fly/.
Conclusions
Our mesh generation and machine learning methods and tools improve upon the flexibility, ease-of-use and accuracy of existing methods.
ContributorsZhang, Wenlu (Author) / Feng, Daming (Author) / Li, Rongjian (Author) / Chernikov, Andrey (Author) / Chrisochoides, Nikos (Author) / Osgood, Christopher (Author) / Konikoff, Charlotte (Author) / Newfeld, Stuart (Author) / Kumar, Sudhir (Author) / Ji, Shuiwang (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)
Created2013-12-28
130363-Thumbnail Image.png
Description
Background
Fruit fly embryogenesis is one of the best understood animal development systems, and the spatiotemporal gene expression dynamics in this process are captured by digital images. Analysis of these high-throughput images will provide novel insights into the functions, interactions, and networks of animal genes governing development. To facilitate comparative analysis,

Background
Fruit fly embryogenesis is one of the best understood animal development systems, and the spatiotemporal gene expression dynamics in this process are captured by digital images. Analysis of these high-throughput images will provide novel insights into the functions, interactions, and networks of animal genes governing development. To facilitate comparative analysis, web-based interfaces have been developed to conduct image retrieval based on body part keywords and images. Currently, the keyword annotation of spatiotemporal gene expression patterns is conducted manually. However, this manual practice does not scale with the continuously expanding collection of images. In addition, existing image retrieval systems based on the expression patterns may be made more accurate using keywords.
Results
In this article, we adapt advanced data mining and computer vision techniques to address the key challenges in annotating and retrieving fruit fly gene expression pattern images. To boost the performance of image annotation and retrieval, we propose representations integrating spatial information and sparse features, overcoming the limitations of prior schemes.
Conclusions
We perform systematic experimental studies to evaluate the proposed schemes in comparison with current methods. Experimental results indicate that the integration of spatial information and sparse features lead to consistent performance improvement in image annotation, while for the task of retrieval, sparse features alone yields better results.
ContributorsYuan, Lei (Author) / Woodard, Alexander (Author) / Ji, Shuiwang (Author) / Jiang, Yuan (Author) / Zhou, Zhi-Hua (Author) / Kumar, Sudhir (Author) / Ye, Jieping (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / Ira A. Fulton School of Engineering (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor)
Created2012-05-23
130364-Thumbnail Image.png
Description
Background
Drosophila melanogaster has been established as a model organism for investigating the developmental gene interactions. The spatio-temporal gene expression patterns of Drosophila melanogaster can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images will provide biological insights into the

Background
Drosophila melanogaster has been established as a model organism for investigating the developmental gene interactions. The spatio-temporal gene expression patterns of Drosophila melanogaster can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images will provide biological insights into the gene functions, interactions, and networks. To facilitate pattern recognition and comparison, many web-based resources have been created to conduct comparative analysis based on the body part keywords and the associated images. With the fast accumulation of images from high-throughput techniques, manual inspection of images will impose a serious impediment on the pace of biological discovery. It is thus imperative to design an automated system for efficient image annotation and comparison.
Results
We present a computational framework to perform anatomical keywords annotation for Drosophila gene expression images. The spatial sparse coding approach is used to represent local patches of images in comparison with the well-known bag-of-words (BoW) method. Three pooling functions including max pooling, average pooling and Sqrt (square root of mean squared statistics) pooling are employed to transform the sparse codes to image features. Based on the constructed features, we develop both an image-level scheme and a group-level scheme to tackle the key challenges in annotating Drosophila gene expression pattern images automatically. To deal with the imbalanced data distribution inherent in image annotation tasks, the undersampling method is applied together with majority vote. Results on Drosophila embryonic expression pattern images verify the efficacy of our approach.
Conclusion
In our experiment, the three pooling functions perform comparably well in feature dimension reduction. The undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding and image-level scheme leads to consistent performance improvement in keywords annotation.
ContributorsSun, Qian (Author) / Muckatira, Sherin (Author) / Yuan, Lei (Author) / Ji, Shuiwang (Author) / Newfeld, Stuart (Author) / Kumar, Sudhir (Author) / Ye, Jieping (Author) / Biodesign Institute (Contributor) / Center for Evolution and Medicine (Contributor) / College of Liberal Arts and Sciences (Contributor) / School of Life Sciences (Contributor) / Ira A. Fulton School of Engineering (Contributor)
Created2013-12-03
Description
Twitter has become a very popular social media site that is used daily by many people and organizations. This paper will focus on the financial aspect of Twitter, as a process will be shown to be able to mine data about specific companies' stock prices. This was done by writing

Twitter has become a very popular social media site that is used daily by many people and organizations. This paper will focus on the financial aspect of Twitter, as a process will be shown to be able to mine data about specific companies' stock prices. This was done by writing a program to grab tweets about the stocks of the thirty companies in the Dow Jones.
ContributorsLarson, Grant Elliott (Author) / Davulcu, Hasan (Thesis director) / Ye, Jieping (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created2014-05
Description
Methyl-CpG binding protein 2 (MECP2) is a widely abundant, multifunctional regulator of gene expression with highest levels of expression in mature neurons. In humans, both loss- and gain-of-function mutations of MECP2 cause mental retardation and motor dysfunction classified as either Rett Syndrome (RTT, loss-of-function) or MECP2 Duplication Syndrome (MDS, gain-of-function).

Methyl-CpG binding protein 2 (MECP2) is a widely abundant, multifunctional regulator of gene expression with highest levels of expression in mature neurons. In humans, both loss- and gain-of-function mutations of MECP2 cause mental retardation and motor dysfunction classified as either Rett Syndrome (RTT, loss-of-function) or MECP2 Duplication Syndrome (MDS, gain-of-function). At the cellular level, MECP2 mutations cause both synaptic and dendritic defects. Despite identification of MECP2 as a cause for RTT nearly 16 years ago, little progress has been made in identifying effective treatments. Investigating major cellular and molecular targets of MECP2 in model systems can help elucidate how mutation of this single gene leads to nervous system and behavioral defects, which can ultimately lead to novel therapeutic strategies for RTT and MDS. In the work presented here, I use the fruit fly, Drosophila melanogaster, as a model system to study specific cellular and molecular functions of MECP2 in neurons. First, I show that targeted expression of human MECP2 in Drosophila flight motoneurons causes impaired dendritic growth and flight behavioral performance. These effects are not caused by a general toxic effect of MECP2 overexpression in Drosophila neurons, but are critically dependent on the methyl-binding domain of MECP2. This study shows for the first time cellular consequences of MECP2 gain-of-function in Drosophila neurons. Second, I use RNA-Seq to identify KIBRA, a gene associated with learning and memory in humans, as a novel target of MECP2 involved in the dendritic growth phenotype. I confirm bidirectional regulation of Kibra by Mecp2 in mouse, highlighting the translational utility of the Drosophila model. Finally, I use this system to identify a novel role for the C-terminus in regulating the function of MECP in apoptosis and verify this finding in mammalian cell culture. In summary, this work has established Drosophila as a translational model to study the cellular effects of MECP2 gain-of-function in neurons, and provides insight into the function of MECP2 in dendritic growth and apoptosis.
ContributorsWilliams, Alison (Author) / Duch, Carsten (Thesis advisor) / Orchinik, Miles (Committee member) / Gallitano, Amelia (Committee member) / Huentelman, Matthew (Committee member) / Narayanan, Vinodh (Committee member) / Newfeld, Stuart (Committee member) / Arizona State University (Publisher)
Created2015
Description
Discriminative learning when training and test data belong to different distributions is a challenging and complex task. Often times we have very few or no labeled data from the test or target distribution, but we may have plenty of labeled data from one or multiple related sources with different distributions.

Discriminative learning when training and test data belong to different distributions is a challenging and complex task. Often times we have very few or no labeled data from the test or target distribution, but we may have plenty of labeled data from one or multiple related sources with different distributions. Due to its capability of migrating knowledge from related domains, transfer learning has shown to be effective for cross-domain learning problems. In this dissertation, I carry out research along this direction with a particular focus on designing efficient and effective algorithms for BioImaging and Bilingual applications. Specifically, I propose deep transfer learning algorithms which combine transfer learning and deep learning to improve image annotation performance. Firstly, I propose to generate the deep features for the Drosophila embryo images via pretrained deep models and build linear classifiers on top of the deep features. Secondly, I propose to fine-tune the pretrained model with a small amount of labeled images. The time complexity and performance of deep transfer learning methodologies are investigated. Promising results have demonstrated the knowledge transfer ability of proposed deep transfer algorithms. Moreover, I propose a novel Robust Principal Component Analysis (RPCA) approach to process the noisy images in advance. In addition, I also present a two-stage re-weighting framework for general domain adaptation problems. The distribution of source domain is mapped towards the target domain in the first stage, and an adaptive learning model is proposed in the second stage to incorporate label information from the target domain if it is available. Then the proposed model is applied to tackle cross lingual spam detection problem at LinkedIn’s website. Our experimental results on real data demonstrate the efficiency and effectiveness of the proposed algorithms.
ContributorsSun, Qian (Author) / Ye, Jieping (Committee member) / Xue, Guoliang (Committee member) / Liu, Huan (Committee member) / Li, Jing (Committee member) / Arizona State University (Publisher)
Created2015
Description
One of the most remarkable outcomes resulting from the evolution of the web into Web 2.0, has been the propelling of blogging into a widely adopted and globally accepted phenomenon. While the unprecedented growth of the Blogosphere has added diversity and enriched the media, it has also added complexity. To

One of the most remarkable outcomes resulting from the evolution of the web into Web 2.0, has been the propelling of blogging into a widely adopted and globally accepted phenomenon. While the unprecedented growth of the Blogosphere has added diversity and enriched the media, it has also added complexity. To cope with the relentless expansion, many enthusiastic bloggers have embarked on voluntarily writing, tagging, labeling, and cataloguing their posts in hopes of reaching the widest possible audience. Unbeknown to them, this reaching-for-others process triggers the generation of a new kind of collective wisdom, a result of shared collaboration, and the exchange of ideas, purpose, and objectives, through the formation of associations, links, and relations. Mastering an understanding of the Blogosphere can greatly help facilitate the needs of the ever growing number of these users, as well as producers, service providers, and advertisers into facilitation of the categorization and navigation of this vast environment. This work explores a novel method to leverage the collective wisdom from the infused label space for blog search and discovery. The work demonstrates that the wisdom space can provide a most unique and desirable framework to which to discover the highly sought after background information that could aid in the building of classifiers. This work incorporates this insight into the construction of a better clustering of blogs which boosts the performance of classifiers for identifying more relevant labels for blogs, and offers a mechanism that can be incorporated into replacing spurious labels and mislabels in a multi-labeled space.
ContributorsGalan, Magdiel F (Author) / Liu, Huan (Thesis advisor) / Davulcu, Hasan (Committee member) / Ye, Jieping (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created2015
Description
Software-as-a-Service (SaaS) has received significant attention in recent years as major computer companies such as Google, Microsoft, Amazon, and Salesforce are adopting this new approach to develop software and systems. Cloud computing is a computing infrastructure to enable rapid delivery of computing resources as a utility in a dynamic, scalable,

Software-as-a-Service (SaaS) has received significant attention in recent years as major computer companies such as Google, Microsoft, Amazon, and Salesforce are adopting this new approach to develop software and systems. Cloud computing is a computing infrastructure to enable rapid delivery of computing resources as a utility in a dynamic, scalable, and virtualized manner. Computer Simulations are widely utilized to analyze the behaviors of software and test them before fully implementations. Simulation can further benefit SaaS application in a cost-effective way taking the advantages of cloud such as customizability, configurability and multi-tendency.

This research introduces Modeling, Simulation and Analysis for Software-as-Service in Cloud. The researches cover the following topics: service modeling, policy specification, code generation, dynamic simulation, timing, event and log analysis. Moreover, the framework integrates current advantages of cloud: configurability, Multi-Tenancy, scalability and recoverability.

The following chapters are provided in the architecture:

Multi-Tenancy Simulation Software-as-a-Service.

Policy Specification for MTA simulation environment.

Model Driven PaaS Based SaaS modeling.

Dynamic analysis and dynamic calibration for timing analysis.

Event-driven Service-Oriented Simulation Framework.

LTBD: A Triage Solution for SaaS.
ContributorsLi, Wu (Author) / Tsai, Wei-Tek (Thesis advisor) / Sarjoughian, Hessam S. (Committee member) / Ye, Jieping (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)
Created2015
Description
Understanding the complexity of temporal and spatial characteristics of gene expression over brain development is one of the crucial research topics in neuroscience. An accurate description of the locations and expression status of relative genes requires extensive experiment resources. The Allen Developing Mouse Brain Atlas provides a large number of

Understanding the complexity of temporal and spatial characteristics of gene expression over brain development is one of the crucial research topics in neuroscience. An accurate description of the locations and expression status of relative genes requires extensive experiment resources. The Allen Developing Mouse Brain Atlas provides a large number of in situ hybridization (ISH) images of gene expression over seven different mouse brain developmental stages. Studying mouse brain models helps us understand the gene expressions in human brains. This atlas collects about thousands of genes and now they are manually annotated by biologists. Due to the high labor cost of manual annotation, investigating an efficient approach to perform automated gene expression annotation on mouse brain images becomes necessary. In this thesis, a novel efficient approach based on machine learning framework is proposed. Features are extracted from raw brain images, and both binary classification and multi-class classification models are built with some supervised learning methods. To generate features, one of the most adopted methods in current research effort is to apply the bag-of-words (BoW) algorithm. However, both the efficiency and the accuracy of BoW are not outstanding when dealing with large-scale data. Thus, an augmented sparse coding method, which is called Stochastic Coordinate Coding, is adopted to generate high-level features in this thesis. In addition, a new multi-label classification model is proposed in this thesis. Label hierarchy is built based on the given brain ontology structure. Experiments have been conducted on the atlas and the results show that this approach is efficient and classifies the images with a relatively higher accuracy.
ContributorsZhao, Xinlin (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Thesis advisor) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created2016
Description
One of the fundamental questions in molecular biology is how genes and the control of their expression give rise to so many diverse phenotypes in nature. The mRNA molecule plays a key role in this process as it directs the spatial and temporal expression of genetic information contained in the

One of the fundamental questions in molecular biology is how genes and the control of their expression give rise to so many diverse phenotypes in nature. The mRNA molecule plays a key role in this process as it directs the spatial and temporal expression of genetic information contained in the DNA molecule to precisely instruct biological processes in living organisms. The region located between the STOP codon and the poly(A)-tail of the mature mRNA, known as the 3′Untranslated Region (3′UTR), is a key modulator of these activities. It contains numerous sequence elements that are targeted by trans-acting factors that dose gene expression, including the repressive small non-coding RNAs, called microRNAs.

Recent transcriptome data from yeast, worm, plants, and humans has shown that alternative polyadenylation (APA), a mechanism that enables expression of multiple 3′UTR isoforms for the same gene, is widespread in eukaryotic organisms. It is still poorly understood why metazoans require multiple 3′UTRs for the same gene, but accumulating evidence suggests that APA is largely regulated at a tissue-specific level. APA may direct combinatorial variation between cis-elements and microRNAs, perhaps to regulate gene expression in a tissue-specific manner. Apart from a few single gene anecdotes, this idea has not been systematically explored.

This dissertation research employs a systems biology approach to study the somatic tissue dynamics of APA and its impact on microRNA targeting networks in the small nematode C. elegans. In the first aim, tools were developed and applied to isolate and sequence mRNA from worm intestine and muscle tissues, which revealed pervasive tissue-specific APA correlated with microRNA regulation. The second aim provides genetic evidence that two worm genes use APA to escape repression by microRNAs in the body muscle. Finally, in aim three, mRNA from five additional somatic worm tissues was sequenced and their 3′ends mapped, allowing for an integrative study of APA and microRNA targeting dynamics in worms. Together, this work provides evidence that APA is a pervasive mechanism operating in somatic tissues of C. elegans with the potential to significantly rearrange their microRNA regulatory networks and precisely dose their gene expression.
ContributorsBlazie, Stephen M (Author) / Mangone, Marco (Thesis advisor) / LaBaer, Josh (Committee member) / Lake, Doug (Committee member) / Newfeld, Stuart (Committee member) / Arizona State University (Publisher)
Created2016