
Description
Through decades of clinical progress, cochlear implants have brought the world of speech and language to thousands of profoundly deaf patients. However, the technology has many possible areas for improvement, including conveying non-linguistic cues, also called indexical properties of speech. The field of sensory substitution, which conveys information from one sense through another, offers a potential avenue to further assist those with cochlear implants, in addition to the promise it holds for those without existing aids. A user study with a vibrotactile device is evaluated to demonstrate the effectiveness of this approach in an auditory gender discrimination task. Additionally, preliminary computational work is included that demonstrates advantages and limitations encountered when expanding the complexity of future implementations.
ContributorsButts, Austin McRae (Author) / Helms Tillery, Stephen (Thesis advisor) / Berisha, Visar (Committee member) / Buneo, Christopher (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created2015

Description
In some scenarios, true temporal ordering is required to identify the actions occurring in a video. Recently, a new synthetic dataset named CATER was introduced, containing 3D objects such as spheres, cones, and cylinders that undergo simple movements such as slide and pick-and-place. The task defined in the dataset is to identify compositional actions with temporal ordering. In this thesis, a rule-based system and a window-based technique are proposed to identify individual (atomic) actions and multiple actions with temporal ordering (composite) on the CATER dataset. The rule-based system is a heuristic algorithm that evaluates the magnitude and direction of object movement across frames to determine the atomic-action temporal windows, and uses these windows to predict the composite actions in the videos. The performance of the rule-based system is validated using the frame-level object coordinates provided in the dataset, and it outperforms the baseline models on the CATER dataset. A window-based training technique is proposed for identifying composite actions in the videos. A pre-trained deep neural network (an I3D model) is used as the base network for action recognition. During inference, non-overlapping windows are passed through the I3D network to obtain atomic-action predictions, which are then passed through the rule-based system to determine the composite actions. The approach outperforms state-of-the-art composite action recognition models by 13.37% (mAP 66.47% vs. mAP 53.1%).
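The heuristic described above, thresholding per-frame object movement to find atomic-action temporal windows, can be sketched as follows. This is an illustrative reconstruction, not the thesis's actual code; the function name and the motion threshold are assumptions.

```python
import numpy as np

def atomic_action_windows(coords, motion_thresh=0.05):
    """Sketch of the rule-based heuristic: find frame spans where an
    object is moving, given its per-frame (x, y, z) coordinates
    (as provided by the CATER frame-level annotations).
    The threshold value is illustrative."""
    # Magnitude of displacement between consecutive frames.
    disp = np.linalg.norm(np.diff(coords, axis=0), axis=1)
    moving = disp > motion_thresh
    windows, start = [], None
    for t, m in enumerate(moving):
        if m and start is None:
            start = t                      # motion begins
        elif not m and start is not None:
            windows.append((start, t))     # motion ends
            start = None
    if start is not None:
        windows.append((start, len(moving)))
    return windows
```

A composite action such as "slide, then pick-and-place" would then be predicted from the temporal ordering of these windows across objects.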
ContributorsMaskara, Vivek Kumar (Author) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Thesis advisor) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created2022

Description
Working memory plays an important role in human activities across academic, professional, and social settings. Working memory is defined as the memory extensively involved in goal-directed behaviors in which information must be retained and manipulated to ensure successful task execution. The aim of this research is to understand the effect of image captioning with image description on an individual's working memory. A study was conducted with eight neutral images depicting situations relatable to daily life, such that each image could have a positive or negative description associated with the outcome of the situation in the image. The study consisted of three rounds: the first and second rounds involved two parts each, and the third round consisted of one part. Each image was captioned a total of five times across the entire study. The findings highlighted that only 25% of participants were able to recall the captions they had written for an image after a span of 9-15 days; when comparing recall rates across rounds, 50% of participants were able to recall their image caption from the previous round in the present round; and of the positive and negative descriptions associated with the images, 65% of participants recalled the former rather than the latter. The conclusions drawn from the study are that participants tend to retain information for longer than the expected duration of working memory, possibly because they could relate the images to everyday life situations, and that, given a situation with positive and negative information, the human brain is biased toward positive information over negative information.
ContributorsUppara, Nithiya Shree (Author) / McDaniel, Troy (Thesis advisor) / Venkateswara, Hemanth (Thesis advisor) / Bryan, Chris (Committee member) / Arizona State University (Publisher)
Created2021

Description
Endowing machines with the ability to understand digital images is a critical task for a host of high-impact applications, including pathology detection in radiographic imaging, autonomous vehicles, and assistive technology for the visually impaired. Computer vision systems rely on large corpora of annotated data in order to train task-specific visual recognition models. Despite significant advances made over the past decade, the fact remains that collecting and annotating the data needed to successfully train a model is a prohibitively expensive endeavor. Moreover, these models are prone to rapid performance degradation when applied to data sampled from a different domain. Recent works in the development of deep adaptation networks seek to overcome these challenges by facilitating transfer learning between source and target domains. In parallel, the unification of dominant semi-supervised learning techniques has illustrated unprecedented potential for utilizing unlabeled data to train classification models in defiance of discouragingly meager sets of annotated data.
In this thesis, a novel domain adaptation algorithm -- Domain Adaptive Fusion (DAF) -- is proposed, which encourages a domain-invariant linear relationship between the pixel-space of different domains and the prediction-space while being trained under a domain adversarial signal. The thoughtful combination of key components in unsupervised domain adaptation and semi-supervised learning enables DAF to effectively bridge the gap between source and target domains. Experiments performed on computer vision benchmark datasets for domain adaptation endorse the efficacy of this hybrid approach, outperforming all of the baseline architectures on most of the transfer tasks.
ContributorsDudley, Andrew, M.S (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created2019

Description
The knee joint has essential functions in supporting body weight and maintaining normal walking. Neurological diseases like stroke and musculoskeletal disorders like osteoarthritis can affect the function of the knee. Besides physical therapy, robot-assisted therapy using wearable exoskeletons and exosuits has shown potential as an efficient therapy that helps patients restore limb function. Exoskeletons and exosuits are being developed either for human performance augmentation or for medical purposes like rehabilitation. Although research on exoskeletons started earlier, research and development on exosuits has recently grown rapidly, as exosuits have advantages that exoskeletons lack. The objective of this research is to develop a soft exosuit for knee flexion assistance and validate its ability to reduce the EMG activity of the knee flexor muscles. The exosuit was developed with a novel soft fabric actuator and novel 3D-printed adjustable braces that attach the actuator in alignment with the knee. An analytical torque model was derived and validated experimentally to characterize and predict the torque output of the actuator. In addition, the actuator's deflation and inflation times were characterized experimentally, a controller was implemented, and the exosuit was tested on a healthy human subject. The analytical torque model predicted the torque output in the flexion-angle range from 0° to 60° more precisely than analytical models in the literature. Deviations beyond 60° may stem from factors such as fabric extensibility and the actuator's bending behavior. Human testing showed that, for the subject tested, the exosuit performed best when the controller was tuned to inflate at 31.9% of the gait cycle. At this inflation timing, the biceps femoris, semitendinosus, and vastus lateralis muscles showed average electromyography (EMG) reductions of 32.02%, 23.05%, and 2.85%, respectively. Finally, it is concluded that the developed exosuit may assist knee flexion for more diverse healthy subjects and may potentially be used in the future for human performance augmentation and rehabilitation of people with disabilities.
ContributorsHasan, Ibrahim Mohammed Ibrahim (Author) / Zhang, Wenlong (Thesis advisor) / Aukes, Daniel (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created2021

Description
Touch plays a vital role in maintaining human relationships through social and emotional communication. This research proposes a multi-modal haptic display capable of generating vibrotactile and thermal haptic signals individually and simultaneously. The main objective in creating this device is to explore the importance of touch in social communication, which is absent in traditional communication modes like phone and video calls. By studying how humans interpret haptically generated messages, this research aims to create a new communication channel for humans. The novel device is worn on the user's forearm and has a broad scope of applications such as navigation, social interaction, notifications, health care, and education. The research methods include testing patterns in the vibro-thermal modality while noting their realizability and accuracy. Different patterns can be controlled and generated through an Android application connected to the proposed device via Bluetooth. Experimental results indicate that the patterns SINGLE TAP and HOLD/SQUEEZE were easily identifiable and more relatable to social interactions. In contrast, other patterns like UP-DOWN, DOWN-UP, LEFT-RIGHT, LEFT-DIAGONAL, and RIGHT-DIAGONAL were less identifiable and less relatable to social interactions. Finally, design modifications are required if complex social patterns are to be displayed on the forearm.
ContributorsGharat, Shubham Shriniwas (Author) / McDaniel, Troy (Thesis advisor) / Redkar, Sangram (Thesis advisor) / Zhang, Wenlong (Committee member) / Arizona State University (Publisher)
Created2021

Description
Currently, one of the biggest limiting factors for long-term deployment of autonomous systems is the power constraint of the platform. In particular, for aerial robots such as unmanned aerial vehicles (UAVs), the energy resource is the main driver of mission planning and operation definitions, as everything revolves around flight time. The focus of this work is to develop a new method of energy storage and charging for autonomous UAV systems, for use during long-term deployments in a constrained environment. We developed a charging solution that allows pre-equipped UAV systems to land on designated charging pads and rapidly replenish their battery reserves using a contact charging point. The system is designed to work with all types of rechargeable batteries, focusing on Lithium Polymer (LiPo) packs, and incorporates a battery management system for increased reliability. The project also explores optimization methods for fleets of UAV systems to increase charging efficiency and extend battery lifespans. Each component of this project was first designed and tested in computer simulation. Following positive results, prototypes for each part of the system were developed and rigorously tested. Results show that the contact charging method is able to charge LiPo batteries at a 1-C rate, the industry-standard rate, while maintaining the same safety and efficiency standards as modern-day direct-connection chargers. Control software for the base stations was also created, to be integrated with a fleet management system; it optimizes UAV charge levels and distribution to extend LiPo battery lifetimes while still meeting expected mission demand. Each component of this project (hardware and software) was designed for manufacturing and implementation using industry-standard tools, making it ideal for large-scale deployment.
This system has been successfully tested with a fleet of UAVs at Arizona State University and is currently being integrated into an Arizona smart city environment for deployment.
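The 1-C figure mentioned above has a simple arithmetic meaning: the charge current in amps is numerically equal to the pack capacity in amp-hours, so an idealized constant-current charge completes in roughly one hour. A minimal sketch (the 5000 mAh capacity is an illustrative value, not one stated in the abstract):

```python
def one_c_charge_current(capacity_mah):
    """1-C rate: current (A) numerically equal to capacity in amp-hours."""
    return capacity_mah / 1000.0

def approx_charge_time_hours(capacity_mah, current_a):
    """Idealized constant-current estimate; real LiPo charging tapers
    during the constant-voltage phase, so actual times run longer."""
    return (capacity_mah / 1000.0) / current_a

# Example: a 5000 mAh pack charged at 1 C draws 5 A for about one hour.
current = one_c_charge_current(5000)   # 5.0 A
hours = approx_charge_time_hours(5000, current)  # 1.0 h
```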
ContributorsMian, Sami (Author) / Panchanathan, Sethuraman (Thesis advisor) / Berman, Spring (Committee member) / Yang, Yezhou (Committee member) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)
Created2018

Description
Visual odometry is one of the key aspects of robotic localization and mapping. It consists of geometric-based approaches that convert visual data (images) into pose estimates of where the robot is in space. Classical geometric methods have shown promising results; they are carefully crafted and built explicitly for these tasks. However, such geometric methods require extreme fine-tuning and extensive prior knowledge to set up for different scenarios. Classical geometric approaches also require significant post-processing and optimization to minimize the error between the estimated pose and the ground truth. In this body of work, a deep learning model was formed by combining SuperPoint and SuperGlue. The resulting model does not require any prior fine-tuning and has been trained for both outdoor and indoor settings. The proposed deep learning model is applied to the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset alongside classical geometric visual odometry models. The deep learning model was not trained on the KITTI dataset; it first encounters the dataset during experimentation. Monocular grayscale images from the KITTI visual odometry files are used in the experiment to test the viability of the models on different sequences. The experiment was performed on eight different sequences, recording the Absolute Trajectory Error and the computation time for each sequence. From the obtained results, inferences are drawn about the classical and deep learning approaches.
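The Absolute Trajectory Error mentioned above is conventionally the root-mean-square of the translational distance between corresponding poses of the estimated and ground-truth trajectories. A minimal sketch of that metric (not the thesis's evaluation code; a full ATE computation first aligns the two trajectories with a similarity transform, which is omitted here):

```python
import numpy as np

def absolute_trajectory_error(est, gt):
    """RMSE of translational error between an estimated and a
    ground-truth trajectory, given as (N, 3) position arrays that
    are assumed to be time-associated and already aligned."""
    err = np.linalg.norm(est - gt, axis=1)  # per-pose distance
    return float(np.sqrt(np.mean(err ** 2)))
```

For example, an estimate offset from the ground truth by a constant 1 m in x yields an ATE of exactly 1.0.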
ContributorsVaidyanathan, Venkatesh (Author) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Thesis advisor) / Michael, Katina (Committee member) / Arizona State University (Publisher)
Created2022

Description
Recruitment of students in engineering programs is a critical endeavor for universities striving to thrive in an increasingly competitive landscape. This master’s thesis investigates the effectiveness of utilizing an ASU-inspired Mandalorian armor set as a recruitment prop at engineering recruitment events. The research questions posed in this study delve into the behavioral response of event attendees and evaluate the prop's effectiveness in generating interest and initiating interactions with ASU recruiting staff. Drawing on a combination of observational data, thematic analysis, and insights from the literature review, this study evaluates the prop's impact on booth traffic, attendee engagement, and overall recruitment efforts. The observational data collected from two recruitment events on March 22, 2024, and March 24, 2024, revealed fluctuations in attendee engagement with the prop, with substantially more visitor traffic observed on March 22, 2024, than on March 24, 2024. The thematic analysis provided deeper insights into the prop's role as a conversation starter and attraction for both adults and children, highlighting its ability to spark curiosity and inquiries about its significance and association with ASU and engineering programs. The literature review supported these findings, offering insights into the dynamics of college choice processes and the strategic deployment of marketing practices in higher education recruitment. Concepts from studies by Han (2014), Rogers et al. (2010), Muller et al. (2010), and Brignull & Rogers (2003) informed the design attributes, booth interaction strategies, and logistical considerations associated with the prop's deployment.
ContributorsReynolds, Zane (Author) / Jordan, Shawn (Thesis advisor) / McDaniel, Troy (Committee member) / Nichols, Kevin (Committee member) / Arizona State University (Publisher)
Created2024

Description
The impact of Artificial Intelligence (AI) on daily life has increased significantly. AI is taking big strides into areas of life that are critical, such as healthcare, but also into areas such as entertainment and leisure. Deep neural networks have been pivotal in making these advancements possible. However, a well-known problem with deep neural networks is the lack of explanations for the choices they make. To combat this, several methods have been tried in the field of research. One example is assigning rankings to individual features according to how influential they are in the decision-making process. In contrast, a newer class of methods focuses on Concept Activation Vectors (CAVs), which extract higher-level concepts from the trained model to capture more information as a mixture of several features rather than just one. The goal of this thesis is to employ concepts in a novel domain: to explain how a deep learning model uses computer vision to classify music into different genres. Given the advances in deep learning for image classification tasks, it is now standard practice to convert an audio clip into corresponding spectrograms and use those spectrograms as image inputs to the deep learning model. Thus, a pre-trained model can classify the spectrogram images (representing songs) into musical genres. The proposed explanation system, called “Why Pop?”, tries to answer certain questions about the classification process, such as which parts of the spectrogram influence the model the most, which concepts were extracted, and how they differ across classes. These explanations help the user gain insight into the model's learnings, biases, and decision-making process.
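The audio-to-spectrogram conversion described above can be sketched with a plain short-time Fourier transform. This is an illustrative, numpy-only version (the function name, frame sizes, and log scaling are assumptions; real pipelines typically use mel-scaled spectrograms via a library such as librosa):

```python
import numpy as np

def log_spectrogram(signal, n_fft=512, hop=256):
    """Minimal log-magnitude STFT spectrogram from a 1-D audio signal,
    sketching the audio-to-image step: the returned 2-D array can be
    fed to a pre-trained vision model as an image."""
    window = np.hanning(n_fft)
    # Slice the signal into overlapping windowed frames.
    frames = [signal[s:s + n_fft] * window
              for s in range(0, len(signal) - n_fft + 1, hop)]
    mag = np.abs(np.fft.rfft(np.asarray(frames), axis=1))
    # Log-compress and transpose to (frequency bins, time frames).
    return np.log1p(mag).T
```

A pure sine at 0.1 cycles/sample, for instance, produces a spectrogram whose energy concentrates in a single frequency row, which is the kind of structure the concept-based explanations then reason about.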
ContributorsSharma, Shubham (Author) / Bryan, Chris (Thesis advisor) / McDaniel, Troy (Committee member) / Sarwat, Mohamed (Committee member) / Arizona State University (Publisher)
Created2022