
Learning Sparse Representations for Fruit-Fly Gene Expression Pattern Image Annotation and Retrieval
Background
Fruit fly embryogenesis is one of the best understood animal development systems, and the spatiotemporal gene expression dynamics in this process are captured by digital images. Analysis of these high-throughput images will provide novel insights into the functions, interactions, and networks of animal genes governing development. To facilitate comparative analysis, web-based interfaces have been developed to conduct image retrieval based on body part keywords and images. Currently, the keyword annotation of spatiotemporal gene expression patterns is conducted manually. However, this manual practice does not scale with the continuously expanding collection of images. In addition, existing image retrieval systems based on the expression patterns may be made more accurate using keywords.
Results
In this article, we adapt advanced data mining and computer vision techniques to address the key challenges in annotating and retrieving fruit fly gene expression pattern images. To boost the performance of image annotation and retrieval, we propose representations integrating spatial information and sparse features, overcoming the limitations of prior schemes.
Conclusions
We perform systematic experimental studies to evaluate the proposed schemes in comparison with current methods. Experimental results indicate that the integration of spatial information and sparse features leads to consistent performance improvement in image annotation, while for the task of retrieval, sparse features alone yield better results.





X-ray free-electron lasers provide novel opportunities to conduct single particle analysis on nanoscale particles. Coherent diffractive imaging experiments were performed at the Linac Coherent Light Source (LCLS), SLAC National Laboratory, exposing single inorganic core-shell nanoparticles to femtosecond hard-X-ray pulses. Each facetted nanoparticle consisted of a crystalline gold core and a differently shaped palladium shell. Scattered intensities were observed up to about 7 nm resolution. Analysis of the scattering patterns revealed the size distribution of the samples, which is consistent with that obtained from direct real-space imaging by electron microscopy. Scattering patterns resulting from single particles were selected and compiled into a dataset that can be valuable for algorithm development in single particle scattering research.

Single particle diffractive imaging data from Rice Dwarf Virus (RDV) were recorded using the Coherent X-ray Imaging (CXI) instrument at the Linac Coherent Light Source (LCLS). RDV was chosen as it is a well-characterized model system, useful for proof-of-principle experiments, system optimization and algorithm development. RDV, an icosahedral virus of about 70 nm in diameter, was aerosolized and injected into the approximately 0.1 μm diameter focused hard X-ray beam at the CXI instrument of LCLS. Diffraction patterns from RDV with signal to 5.9 Ångström were recorded. The diffraction data are available through the Coherent X-ray Imaging Data Bank (CXIDB) as a resource for algorithm development, the contents of which are described here.



The membrane proximal region (MPR, residues 649–683) and transmembrane domain (TMD, residues 684–705) of the gp41 subunit of HIV-1’s envelope protein are highly conserved and are important in viral mucosal transmission, virus attachment and membrane fusion with target cells. Several structures of the trimeric membrane proximal external region (residues 662–683) of MPR have been reported at the atomic level; however, the atomic structure of the TMD remains unknown. To elucidate the structure of both MPR and TMD, we expressed the region spanning both domains, MPR-TM (residues 649–705), in Escherichia coli as a fusion protein with maltose binding protein (MBP). MPR-TM was initially fused to the C-terminus of MBP via a 42-amino-acid linker containing a TEV protease recognition site (MBP-linker-MPR-TM).
Biophysical characterization indicated that the purified MBP-linker-MPR-TM protein was a monodisperse and stable candidate for crystallization. However, crystals of the MBP-linker-MPR-TM protein could not be obtained in extensive crystallization screens. It is possible that the 42 residue-long linker between MBP and MPR-TM was interfering with crystal formation. To test this hypothesis, the 42 residue-long linker was replaced with three alanine residues. The fusion protein, MBP-AAA-MPR-TM, was similarly purified and characterized. Significantly, both the MBP-linker-MPR-TM and MBP-AAA-MPR-TM proteins strongly interacted with broadly neutralizing monoclonal antibodies 2F5 and 4E10. With epitopes accessible to the broadly neutralizing antibodies, these MBP/MPR-TM recombinant proteins may be in immunologically relevant conformations that mimic a pre-hairpin intermediate of gp41.