Matching Items (65)
Filtering by
- Genre: Doctoral Dissertation
- Creators: Yang, Yezhou

Description
Reasoning with commonsense knowledge is an integral component of human behavior. It is due to this capability that people know, for example, that a weak person may not be able to lift someone. It has been a long-standing goal of the Artificial Intelligence community to simulate such commonsense reasoning abilities in machines. Over the years, many advances have been made and various challenges have been proposed to test these abilities. The Winograd Schema Challenge (WSC) is one such Natural Language Understanding (NLU) task, which was also proposed as an alternative to the Turing Test. It consists of textual question-answering problems that require resolving a pronoun to its correct antecedent.
In this thesis, two approaches to developing NLU systems that solve the Winograd Schema Challenge are demonstrated. To this end, a semantic parser is presented, various kinds of commonsense knowledge are identified, techniques to extract commonsense knowledge are developed, and two commonsense reasoning algorithms are presented. The usefulness of the developed tools and techniques is shown by applying them to solve the challenge.
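For readers unfamiliar with the challenge, the sketch below illustrates the kind of commonsense resolution a WSC problem calls for. It is only a toy, hand-written rule applied to one example; it is not the semantic parser or reasoning algorithms developed in the thesis, and all names and the rule itself are hypothetical.

```python
# Hypothetical sketch of pronoun resolution for one Winograd-style problem.
# The sentence, candidates, and the single commonsense rule are illustrative
# stand-ins, not the semantic parser or reasoning algorithms from the thesis.

from dataclasses import dataclass

@dataclass
class WinogradProblem:
    sentence: str
    pronoun: str
    candidates: list      # possible antecedents of the pronoun
    facts: set            # commonsense facts as (entity, property) pairs

def resolve(problem: WinogradProblem) -> str:
    # Toy rule: if the sentence says something "does not fit", prefer the
    # candidate that commonsense knowledge says is large.
    if "does not fit" in problem.sentence:
        for cand in problem.candidates:
            if (cand, "large") in problem.facts:
                return cand
    # Fall back to the first candidate when no rule applies.
    return problem.candidates[0]

problem = WinogradProblem(
    sentence="The trophy does not fit into the suitcase because it is too large.",
    pronoun="it",
    candidates=["trophy", "suitcase"],
    facts={("trophy", "large")},
)
print(resolve(problem))   # -> "trophy"
```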
Contributors: Sharma, Arpita (Author) / Baral, Chitta (Thesis advisor) / Lee, Joohyung (Committee member) / Papotti, Paolo (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2019

Description
A massive volume of data is generated at an unprecedented rate in the information age. The growth of data significantly exceeds the computing and storage capacities of the existing digital infrastructure. In the past decade, many methods have been invented for data compression, compressive sensing and reconstruction, and compressed learning (learning directly upon compressed data) to overcome the data-explosion challenge. While prior works are predominantly model-based, focus on small models, and are not suitable for task-oriented sensing or hardware acceleration, the number of available models for compression-related tasks has escalated by orders of magnitude in the past decade. Motivated by this significant growth and the success of big data, this dissertation proposes to revolutionize both compressive sensing reconstruction (CSR) and compressed learning (CL) methods from the data-driven perspective. In this dissertation, a series of topics on data-driven CSR are discussed. Individual data-driven models are proposed for the CSR of bio-signals, images, and videos with an improved trade-off between compression ratio and recovery fidelity. Specifically, a scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) is proposed for single-image CSR. LAPRAN progressively reconstructs images following the concept of the Laplacian pyramid through the concatenation of multiple reconstructive adversarial networks (RANs). For the CSR of videos, CSVideoNet is proposed to improve the spatial-temporal resolution of reconstructed videos. Apart from CSR, data-driven CL is discussed in the dissertation. A CL framework is proposed to extract features directly from compressed data for image classification, object detection, and semantic/instance segmentation. In addition, the spectral bias of neural networks is analyzed from the frequency perspective, leading to a learning-based frequency selection method for identifying the trivial frequency components that can be removed without accuracy loss. Compared with conventional spatial downsampling approaches, the proposed frequency-domain learning method can achieve higher accuracy with reduced input data size. The methodologies proposed in this dissertation are not restricted to the above-mentioned applications. The dissertation also discusses other potential applications and directions for future research.
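To make the two compression ideas above concrete, here is a minimal numpy/scipy sketch of (1) a random compressive measurement and (2) frequency-domain input reduction via the DCT. The sizes, keep-ratio, and signals are arbitrary stand-ins; this is not LAPRAN, CSVideoNet, or the learned frequency-selection method.

```python
# Illustrative sketch, not the dissertation's models:
# (1) a random compressive measurement y = Phi @ x, and
# (2) keeping only low-frequency DCT coefficients instead of
#     spatially downsampling the input image.

import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)

# (1) Compressive sensing: measure a length-n signal with m << n projections.
n, m = 1024, 256                       # 4x compression ratio
x = rng.standard_normal(n)             # stand-in signal
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x                            # measurements fed to a CSR model

# (2) Frequency-domain reduction: keep a low-frequency block of the 2-D DCT.
img = rng.standard_normal((32, 32))    # stand-in image patch
coeffs = dctn(img, norm="ortho")
k = 16                                 # keep a 16x16 low-frequency corner
reduced = coeffs[:k, :k]               # 4x fewer inputs for a downstream model
approx = idctn(np.pad(reduced, ((0, 32 - k), (0, 32 - k))), norm="ortho")
print(y.shape, reduced.shape, float(np.abs(img - approx).mean()))
```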
Contributors: Xu, Kai (Author) / Ren, Fengbo (Thesis advisor) / Li, Baoxin (Committee member) / Turaga, Pavan (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
Imagery data has become important for civil infrastructure operation and maintenance because it can capture detailed visual information at high frequencies. Computer vision can be useful for acquiring the spatiotemporal details needed to support the timely maintenance of critical civil infrastructures that serve society. Some examples: irrigation canals need leaking sections repaired to avoid water loss; project engineers need to identify deviating parts of the workflow to finish projects on time and within budget; abnormal behaviors of air traffic controllers must be detected to reduce operational errors and avoid air traffic accidents. Identifying such outliers of the civil infrastructure can help engineers focus on targeted areas. However, large amounts of imagery data bring the difficulty of information overloading. Anomaly detection combined with contextual knowledge could help address this information overloading and support the operation and maintenance of civil infrastructures.
Several challenges make such identification of anomalies difficult. The first is that diverse, large civil infrastructures span various geospatial environments, so previous algorithms cannot handle anomaly detection of civil infrastructures across different environments. The second is that crowded and rapidly changing workspaces hinder reliable detection of deviating parts of the workflow. The third is that limited studies have examined how to detect abnormal behaviors of diverse people in a real-time and non-intrusive manner. Using video and relevant data sources (e.g., biometric and communication data) could be promising but still needs a baseline of normal behaviors for outlier detection.
This dissertation presents an anomaly detection framework that uses contextual knowledge, contextual information, and contextual data for filtering visual information extracted by computer vision techniques (ADCV) to address the challenges described above. The framework categorizes the anomaly detection of civil infrastructures into two categories: with and without a baseline of normal events. The author uses three case studies to illustrate how the developed approaches address ADCV challenges in the different categories of anomaly detection. Detailed data collection and experiments validate the developed ADCV approaches.
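As a rough illustration of the "with a baseline of normal events" category, the sketch below scores observations by their deviation from a baseline and then filters alarms with a contextual rule. The data, thresholds, and maintenance-window context are all hypothetical; this is not the ADCV implementation.

```python
# Schematic sketch of baseline-plus-context anomaly filtering (not ADCV):
# score each observation by its deviation from a normal-period baseline,
# then drop alarms that contextual information already explains.

import numpy as np

rng = np.random.default_rng(1)

# Stand-in per-frame feature extracted by a computer-vision model,
# e.g. measured flow along a canal section.
baseline = rng.normal(loc=10.0, scale=0.5, size=500)            # normal history
observed = np.concatenate([rng.normal(10.0, 0.5, 95),
                           rng.normal(6.0, 0.5, 5)])             # drop at the end

mu, sigma = baseline.mean(), baseline.std()
z = np.abs(observed - mu) / sigma                                # deviation score
candidate_anomalies = np.flatnonzero(z > 3.0)

# Contextual filter: ignore deviations during a known maintenance window,
# when low readings are expected rather than anomalous (hypothetical context).
maintenance_window = set(range(95, 97))
anomalies = [i for i in candidate_anomalies if i not in maintenance_window]
print(anomalies)
```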
Contributors: Chen, Jiawei (Author) / Tang, Pingbo (Thesis advisor) / Ayer, Steven (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2020

Description
Graph matching is a fundamental but notoriously difficult problem due to its NP-hard nature, and serves as a cornerstone for a series of applications in machine learning and computer vision, such as image matching, dynamic routing, and drug design, to name a few. Although there has been massive previous investigation on high-performance graph matching solvers, it remains a challenging task to tackle the matching problem under real-world scenarios with severe graph uncertainty (e.g., noise, outliers, misleading or ambiguous links). The main focus of this dissertation is to investigate the essence of, and propose solutions to, graph matching with higher reliability under such uncertainty. To this end, the proposed research was conducted from three perspectives related to reliable graph matching: modeling, optimization, and learning. For modeling, graph matching is extended from the typical quadratic assignment problem to a more generic mathematical model by introducing a specific family of separable functions, achieving higher capacity and reliability. In terms of optimization, a novel, highly gradient-efficient determinant-based regularization technique is proposed in this research, showing high robustness against outliers. Then the learning paradigm for graph matching under intrinsic combinatorial characteristics is explored. First, a study is conducted on filling the gap between the discrete problem and its continuous approximation under a deep learning framework. The dissertation then investigates the necessity of a more reliable latent topology of graphs for matching, and proposes an effective and flexible framework to obtain it. Coherent findings in this dissertation include theoretical studies and several novel algorithms, with extensive experiments demonstrating their effectiveness.
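As a point of reference for the modeling and optimization discussion above, the sketch below shows a generic doubly-stochastic (Sinkhorn) relaxation of graph matching on a toy node-affinity matrix. It is a standard baseline shown for illustration only, not the separable-function model or the determinant-based regularizer proposed in the dissertation.

```python
# Generic graph-matching relaxation sketch (not the dissertation's method):
# build a node-affinity matrix, project it toward doubly-stochastic form
# with Sinkhorn normalization, then discretize greedily.

import numpy as np

def sinkhorn(A, iters=50):
    """Alternately normalize rows and columns of a positive matrix."""
    S = A.copy()
    for _ in range(iters):
        S /= S.sum(axis=1, keepdims=True)
        S /= S.sum(axis=0, keepdims=True)
    return S

rng = np.random.default_rng(2)
n = 5
feat1 = rng.standard_normal((n, 3))                          # graph-1 node features
perm = rng.permutation(n)
feat2 = feat1[perm] + 0.05 * rng.standard_normal((n, 3))     # noisy permuted copy

# Node affinity: higher when features are close.
affinity = np.exp(-np.linalg.norm(feat1[:, None] - feat2[None, :], axis=-1))
S = sinkhorn(affinity)
match = S.argmax(axis=1)                                     # greedy discretization

inv = np.argsort(perm)   # graph-1 node i truly corresponds to graph-2 node inv[i]
print(np.array_equal(match, inv))   # usually True for this small, low-noise example
```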
Contributors: Yu, Tianshu (Author) / Li, Baoxin (Thesis advisor) / Wang, Yalin (Committee member) / Yang, Yezhou (Committee member) / Yang, Yingzhen (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
Statistical Shape Modeling is widely used to study the morphometrics of deformable objects in computer vision and biomedical studies. There are mainly two viewpoints from which to understand shapes. On one hand, the outer surface of the shape can be taken as a two-dimensional embedding in space. On the other hand, the outer surface along with its enclosed internal volume can be taken as a three-dimensional embedding of interest. Most studies focus on the surface-based perspective by leveraging the intrinsic features on the tangent plane. However, a two-dimensional model may fail to fully represent the realistic properties of shapes with both intrinsic and extrinsic properties. In this thesis, several Stochastic Partial Differential Equations (SPDEs) are thoroughly investigated, and several methods originating from these SPDEs are developed to address both two-dimensional and three-dimensional shape analysis. The unique physical meanings of these SPDEs inspired the features, shape descriptors, metrics, and kernels found in this series of works. Initially, the data generation of high-dimensional shapes, here tetrahedral meshes, is introduced. The cerebral cortex is taken as the study target, and an automatic pipeline for generating the gray-matter tetrahedral mesh is introduced. Then, a discretized Laplace-Beltrami operator (LBO) and a Hamiltonian operator (HO) in the tetrahedral domain are derived with the Finite Element Method (FEM). Two high-dimensional shape descriptors are defined based on the solutions of the heat equation and Schrödinger’s equation. Considering the fact that high-dimensional shape models usually contain massive redundancies, and the demand for effective landmarks in many applications, a Gaussian process landmarking on tetrahedral meshes is further studied. A SIWKS-based metric space is used to define a geometry-aware Gaussian process. The study of the periodic potential diffusion process further inspired the idea of a new kernel called the geometry-aware convolutional kernel. A series of Bayesian learning methods are then introduced to tackle the problems of shape retrieval and classification. Experiments on each of these components are demonstrated. From popular SPDEs such as the heat equation and Schrödinger’s equation to the general potential diffusion equation and the specific periodic potential diffusion equation, it is clear that classical SPDEs play an important role in discovering new features, metrics, shape descriptors, and kernels. I hope this thesis can serve as an example of using interdisciplinary knowledge to solve problems.
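To illustrate the kind of heat-equation-based descriptor referenced above, the sketch below evaluates a heat-kernel-signature-style quantity from the eigenpairs of a discretized Laplacian. A tiny graph Laplacian stands in for the FEM Laplace-Beltrami operator on tetrahedral meshes that the thesis actually constructs; the mesh and scales are made up.

```python
# Heat-kernel-signature-style descriptor sketch (not the thesis pipeline):
#   HKS(x, t) = sum_i exp(-lambda_i * t) * phi_i(x)**2
# computed from the eigenpairs of a small stand-in Laplacian.

import numpy as np

# Adjacency of a toy 6-vertex "mesh" (a cycle) as a stand-in domain.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A                 # combinatorial graph Laplacian

evals, evecs = np.linalg.eigh(L)               # eigenpairs of the operator

def heat_kernel_signature(t):
    # One descriptor value per vertex at diffusion scale t.
    return (np.exp(-evals * t)[None, :] * evecs**2).sum(axis=1)

scales = [0.1, 1.0, 10.0]
descriptor = np.stack([heat_kernel_signature(t) for t in scales], axis=1)
print(descriptor.shape)                        # (n_vertices, n_scales)
```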
Contributors: Fan, Yonghui (Author) / Wang, Yalin (Thesis advisor) / Lepore, Natasha (Committee member) / Turaga, Pavan (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
Applications built on a gesture-based human-computer interface (HCI) require a new gesture-based user login method because such an interface lacks traditional input devices. For example, a user may be asked to verify their identity to unlock a device on a mobile or wearable platform, or to sign in to a virtual site over a Virtual Reality (VR) or Augmented Reality (AR) headset, where no physical keyboard or touchscreen is available. This dissertation presents a unified user login framework and an identity input method using 3D In-Air-Handwriting (IAHW), where a user can log in to a virtual site by writing a passcode in the air quickly, much like a signature. The presented research covers multiple tasks spanning motion signal modeling, user authentication, user identification, template protection, and a thorough evaluation of both security and usability. The results of this research show an Equal Error Rate (EER) of around 0.1% to 3% for user authentication under different conditions, as well as 93% accuracy in user identification, on a dataset with over 100 users and two types of gesture input devices. In addition, current research in this area is severely limited by the availability of gesture input devices, datasets, and software tools. This study provides an infrastructure for IAHW research with an open-source library and open datasets of more than 100K IAHW hand movement signals. Additionally, the proposed user identity input method can be extended to a general word input method for both English and Chinese using limited training data. Hence, this dissertation can help the research community in both cybersecurity and HCI explore IAHW as a new direction, and potentially pave the way to practical adoption of such technologies in the future.
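For context on the reported error rates, the sketch below shows one common way an Equal Error Rate is computed from genuine and impostor match scores. The score distributions are synthetic stand-ins rather than the IAHW dataset, and the thesis' actual matching pipeline is not reproduced here.

```python
# Self-contained EER sketch over synthetic match scores (not the IAHW data).

import numpy as np

rng = np.random.default_rng(3)
genuine = rng.normal(0.8, 0.1, 1000)      # scores for same-user attempts
impostor = rng.normal(0.3, 0.1, 1000)     # scores for different-user attempts

def eer(genuine_scores, impostor_scores, n_thresholds=1000):
    thresholds = np.linspace(0.0, 1.0, n_thresholds)
    # False rejection: genuine attempts scoring below the threshold.
    frr = np.array([(genuine_scores < t).mean() for t in thresholds])
    # False acceptance: impostor attempts scoring at or above the threshold.
    far = np.array([(impostor_scores >= t).mean() for t in thresholds])
    i = np.argmin(np.abs(frr - far))       # where the two error rates cross
    return (frr[i] + far[i]) / 2, thresholds[i]

rate, threshold = eer(genuine, impostor)
print(f"EER = {rate:.3%} at threshold {threshold:.2f}")
```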
Contributors: Lu, Duo (Author) / Huang, Dijiang (Thesis advisor) / Li, Baoxin (Committee member) / Zhang, Junshan (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
Autonomous Vehicles (AV) are inevitable entities in future mobility systems that demand safety and adaptability as two critical factors in replacing/assisting human drivers. Safety arises in defining, standardizing, quantifying, and monitoring requirements for all autonomous components. Adaptability, on the other hand, involves efficient handling of uncertainty and inconsistencies in models and data. First, I address safety by presenting a search-based test-case generation framework that can be used in training and testing deep-learning components of AV. Next, to address adaptability, I propose a framework based on multi-valued linear temporal logic syntax and semantics that allows autonomous agents to perform model-checking on systems with uncertainties. The search-based test-case generation framework provides safety assurance guarantees through formalizing and monitoring Responsibility Sensitive Safety (RSS) rules. I use the RSS rules in signal temporal logic as qualification specifications for monitoring and screening the quality of generated test-drive scenarios. Furthermore, to extend the expressivity of existing temporal-based formal languages, I propose a new spatio-temporal perception logic that enables formalizing qualification specifications for perception systems. Altogether, my test-generation framework can be used for reasoning about the quality of perception, prediction, and decision-making components in AV. Finally, my efforts resulted in publicly available software. One is an offline monitoring algorithm based on the proposed logic to reason about the quality of perception systems. The other is an optimal planner (model checker) that accepts mission specifications and model descriptions in the form of multi-valued logic and multi-valued sets, respectively. My monitoring framework is distributed with the publicly available S-TaLiRo and Sim-ATAV tools.
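As a simplified illustration of RSS-style monitoring, the sketch below checks a commonly cited form of the RSS longitudinal safe-distance rule over a toy trace. The parameter values and trace are made up, and this is not the S-TaLiRo/Sim-ATAV monitoring implementation.

```python
# Simplified offline check of the RSS longitudinal safe-distance rule,
# in the spirit of the monitoring described above (illustrative only).

import numpy as np

RHO = 0.5          # response time [s]
A_ACCEL = 2.0      # max acceleration of the rear (ego) car [m/s^2]
B_MIN = 4.0        # min braking of the rear car [m/s^2]
B_MAX = 8.0        # max braking of the front car [m/s^2]

def rss_min_gap(v_rear, v_front):
    """Commonly cited RSS minimum longitudinal gap (clipped at zero)."""
    d = (v_rear * RHO
         + 0.5 * A_ACCEL * RHO**2
         + (v_rear + RHO * A_ACCEL) ** 2 / (2 * B_MIN)
         - v_front ** 2 / (2 * B_MAX))
    return max(d, 0.0)

# Toy trace: per-step speeds of the ego and lead vehicle, and their gap.
v_ego = np.array([20.0, 20.0, 21.0, 22.0])
v_lead = np.array([20.0, 18.0, 15.0, 12.0])
gap = np.array([60.0, 45.0, 30.0, 18.0])

violations = [t for t in range(len(gap))
              if gap[t] < rss_min_gap(v_ego[t], v_lead[t])]
print(violations)   # time steps where the safe-distance rule is violated
```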
Contributors: Hekmatnejad, Mohammad (Author) / Fainekos, Georgios (Thesis advisor) / Deshmukh, Jyotirmoy V (Committee member) / Karam, Lina (Committee member) / Pedrielli, Giulia (Committee member) / Shrivastava, Aviral (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
Many real-world engineering problems require simulations to evaluate design objectives and constraints. Often, due to the complexity of the system model, simulations can be prohibitive in terms of computation time. One approach to overcome this issue is to construct a surrogate model, which approximates the original model. The focus of this work is on data-driven surrogate models, in which empirical approximations of the output are performed given the input parameters. Recently, neural networks (NN) have re-emerged as a popular method for constructing data-driven surrogate models. Although NNs have achieved excellent accuracy and are widely used, they pose their own challenges. This work addresses two common challenges: the need for (1) hardware acceleration and (2) uncertainty quantification (UQ) in the presence of input variability. The high demand in the inference phase of deep NNs on cloud servers and edge devices calls for the design of low-power custom hardware accelerators. The first part of this work describes the design of an energy-efficient long short-term memory (LSTM) accelerator. The overarching goal is to aggressively reduce the power consumption and area of the LSTM components using approximate computing, and then use architectural-level techniques to boost performance. The proposed design is synthesized and placed and routed as an application-specific integrated circuit (ASIC). The results demonstrate that this accelerator is 1.2X and 3.6X more energy-efficient and area-efficient than the baseline LSTM. In the second part of this work, a robust framework is developed based on an alternate data-driven surrogate model referred to as polynomial chaos expansion (PCE) for addressing UQ. In contrast to many existing approaches, no assumptions are made on the elements of the function space, and UQ is a function of the expansion coefficients. Moreover, the sensitivity of the output with respect to any subset of the input variables can be computed analytically by post-processing the PCE coefficients. This provides a systematic and incremental method for pruning or changing the order of the model. The framework is evaluated on several real-world applications from different domains and is extended to classification tasks as well.
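To illustrate how UQ follows directly from PCE coefficients, the sketch below fits a one-dimensional Hermite chaos by least squares on samples of a made-up model and reads the output mean and variance off the coefficients. It is a minimal illustration of the general idea under an assumed standard-normal input, not the dissertation's framework.

```python
# Minimal 1-D polynomial chaos expansion sketch (illustrative stand-in):
# fit Hermite-chaos coefficients by least squares, then obtain the output
# mean and variance directly from the coefficients.

import math
import numpy as np
from numpy.polynomial.hermite_e import hermevander

def g(x):
    # Stand-in "expensive simulation" with a standard-normal input.
    return np.sin(x) + 0.1 * x**2

rng = np.random.default_rng(4)
x = rng.standard_normal(2000)                  # samples of the random input
deg = 6

# Probabilists' Hermite basis, normalized to be orthonormal under N(0, 1).
Psi = hermevander(x, deg) / np.sqrt([math.factorial(k) for k in range(deg + 1)])
coeffs, *_ = np.linalg.lstsq(Psi, g(x), rcond=None)

pce_mean = coeffs[0]                           # mean is the constant coefficient
pce_var = np.sum(coeffs[1:] ** 2)              # variance from the remaining ones
mc_mean, mc_var = g(x).mean(), g(x).var()
print(f"mean: PCE {pce_mean:.3f} vs MC {mc_mean:.3f}")
print(f"var:  PCE {pce_var:.3f} vs MC {mc_var:.3f}")
```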
Contributors: Azari, Elham (Author) / Vrudhula, Sarma (Thesis advisor) / Fainekos, Georgios (Committee member) / Ren, Fengbo (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
Extensive efforts have been dedicated to the development of AI agents that can independently carry out sequential decision-making tasks. Learning-based solutions, particularly deep reinforcement learning (RL) and reward learning, have captured significant attention due to their scalability and the range of problems they can tackle. Nevertheless, the adoption of RL also presents multiple practical challenges, with high sample complexity being one of the most well-known limitations. This manifests in the substantial environment samples and human input (e.g., measured by the number of scalar or preference feedback labels from humans in the loop) required in the two most fundamental aspects of agentification, namely the modeling of user objectives and the acquisition of task knowledge. This dissertation posits that a fundamental cause of high sample complexity in RL is the insufficient exploitation of explicit human guidance and prior task knowledge. Symbolic lingua francas, including human languages, have played a key role in facilitating efficient preference communication among humans and enabling humans to rapidly acquire new skills by leveraging explicit prior knowledge available through mediums such as textbooks and scholarly publications. Building on this intuition, for preference learning from human feedback, this dissertation investigates (a) methodologies for constructing symbolic lingua franca in various forms to support human advice that can convey rich information; and (b) approaches for gathering and utilizing existing preference knowledge in web data as a means to reduce reliance on online human guidance. In a similar vein, this dissertation also investigates how prior domain knowledge, encoded as symbolic planning models, can effectively guide reinforcement learning in complex long-horizon tasks. Two key challenges addressed include how to acquire prior knowledge at scale by leveraging large language models (LLMs) and how to robustly extract useful knowledge from a symbolic domain model even if it only approximately characterizes the environment. The discussions to be presented clarify several misconceptions about how large pre-trained models like LLMs can be reliably applied in agentic systems.
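As one concrete instance of preference learning from pairwise human feedback, the sketch below fits a linear reward model with a Bradley-Terry-style logistic objective on synthetic preference labels. It is a generic illustration of the kind of feedback discussed above, not the methods developed in the dissertation; the features and labels are made up.

```python
# Generic reward learning from pairwise preferences (Bradley-Terry style),
# on synthetic data; not the dissertation's approach.

import numpy as np

rng = np.random.default_rng(5)
d = 4
w_true = rng.standard_normal(d)                 # hidden "true" reward weights

# Each preference compares two trajectory feature vectors; label = 1 means
# the first trajectory is preferred.
A = rng.standard_normal((500, d))
B = rng.standard_normal((500, d))
labels = (A @ w_true > B @ w_true).astype(float)

w = np.zeros(d)
lr = 0.5
for _ in range(200):
    # P(A preferred over B) under the current reward model.
    p = 1.0 / (1.0 + np.exp(-(A - B) @ w))
    grad = (A - B).T @ (p - labels) / len(labels)   # gradient of the log-loss
    w -= lr * grad

# The learned weights should align with the hidden reward direction.
cos = w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true))
print(f"cosine similarity with true reward: {cos:.2f}")
```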
Contributors: Guan, Lin (Author) / Kambhampati, Subbarao (Thesis advisor) / Amor, Heni (Committee member) / Yang, Yezhou (Committee member) / Stone, Peter (Committee member) / Arizona State University (Publisher)
Created: 2024