Matching Items (70)
Description
Covid-19 is unlike any coronavirus we have seen before, characterized mostly by the ease with which it spreads. This analysis utilizes an SEIR model built to accommodate various populations to understand how different testing and infection rates may affect hospitalization and death. This analysis finds that infection rates have a significant impact on Covid-19 impact regardless of the population whereas the impact that testing rates have in this simulation is not as pronounced. Thus, policy-makers should focus on decreasing infection rates through targeted lockdowns and vaccine rollout to contain the virus, and decrease its spread.
ContributorsShah, Aashney Lalita (Author) / Pedrielli, Giulia (Thesis director) / Candan, Kasim Selcuk (Committee member) / Industrial, Systems & Operations Engineering Prgm (Contributor) / Industrial, Systems & Operations Engineering Prgm (Contributor) / Economics Program in CLAS (Contributor) / Barrett, The Honors College (Contributor)
Created2021-05
Description
The performance of modern machine learning algorithms depends upon the selection
of a set of hyperparameters. Common examples of hyperparameters are learning
rate and the number of layers in a dense neural network. Auto-ML is a branch
of optimization that has produced important contributions in this area. Within
Auto-ML, multi-fidelity approaches, which eliminate poorly-performing configurations
after evaluating them at low budgets, are among the most effective. However, the
performance of these algorithms strongly depends on how effectively they allocate
the computational budget to various hyperparameter configurations. We first present
Parameter Optimization with Conscious Allocation 1.0 (POCA 1.0), a hyperband-
based algorithm for hyperparameter optimization that adaptively allocates the inputted
budget to the hyperparameter configurations it generates following a Bayesian sampling
scheme. We then present its successor Parameter Optimization with Conscious
Allocation 2.0 (POCA 2.0), which follows POCA 1.0’s successful philosophy while
utilizing a time-series model to reduce wasted computational cost and providing a
more flexible framework. We compare POCA 1.0 and 2.0 to its nearest competitor BOHB
at optimizing the hyperparameters of a multi-layered perceptron and find that both
POCA algorithms exceed BOHB in low-budget hyperparameter optimization while
performing similarly in high-budget scenarios.
ContributorsInman, Joshua (Author) / Sankar, Lalitha (Thesis director) / Pedrielli, Giulia (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)
Created2024-05

Description
This dissertation addresses access management problems that occur in both emergency and outpatient clinics with the objective of allocating the available resources to improve performance measures by considering the trade-offs. Two main settings are considered for estimating patient willingness-to-wait (WtW) behavior for outpatient appointments with statistical analyses of data: allocation of the limited booking horizon to patients of different priorities by using time windows in an outpatient setting considering patient behavior, and allocation of hospital beds to admitted Emergency Department (ED) patients. For each chapter, a different approach based on the problem context is developed and the performance is analyzed by implementing analytical and simulation models. Real hospital data is used in the analyses to provide evidence that the methodologies introduced are beneficial in addressing real life problems, and real improvements can be achievable by using the policies that are suggested.
This dissertation starts with studying an outpatient clinic context to develop an effective resource allocation mechanism that can improve patient access to clinic appointments. I first start with identifying patient behavior in terms of willingness-to-wait to an outpatient appointment. Two statistical models are developed to estimate patient WtW distribution by using data on booked appointments and appointment requests. Several analyses are conducted on simulated data to observe effectiveness and accuracy of the estimations.
Then, this dissertation introduces a time windows based policy that utilizes patient behavior to improve access by using appointment delay as a lever. The policy improves patient access by allocating the available capacity to the patients from different priorities by dividing the booking horizon into time intervals that can be used by each priority group which strategically delay lower priority patients.
Finally, the patient routing between ED and inpatient units to improve the patient access to hospital beds is studied. The strategy that captures the trade-off between patient safety and quality of care is characterized as a threshold type. Through the simulation experiments developed by real data collected from a hospital, the achievable improvement of implementing such a strategy that considers the safety-quality of care trade-off is illustrated.
This dissertation starts with studying an outpatient clinic context to develop an effective resource allocation mechanism that can improve patient access to clinic appointments. I first start with identifying patient behavior in terms of willingness-to-wait to an outpatient appointment. Two statistical models are developed to estimate patient WtW distribution by using data on booked appointments and appointment requests. Several analyses are conducted on simulated data to observe effectiveness and accuracy of the estimations.
Then, this dissertation introduces a time windows based policy that utilizes patient behavior to improve access by using appointment delay as a lever. The policy improves patient access by allocating the available capacity to the patients from different priorities by dividing the booking horizon into time intervals that can be used by each priority group which strategically delay lower priority patients.
Finally, the patient routing between ED and inpatient units to improve the patient access to hospital beds is studied. The strategy that captures the trade-off between patient safety and quality of care is characterized as a threshold type. Through the simulation experiments developed by real data collected from a hospital, the achievable improvement of implementing such a strategy that considers the safety-quality of care trade-off is illustrated.
ContributorsKilinc, Derya (Author) / Gel, Esma (Thesis advisor) / Pasupathy, Kalyan (Committee member) / Sefair, Jorge (Committee member) / Sir, Mustafa (Committee member) / Yan, Hao (Committee member) / Arizona State University (Publisher)
Created2019

Description
With the development of modern technological infrastructures, such as social networks or the Internet of Things (IoT), data is being generated at a speed that is never before seen. Analyzing the content of this data helps us further understand underlying patterns and discover relationships among different subsets of data, enabling intelligent decision making. In this thesis, I first introduce the Low-rank, Win-dowed, Incremental Singular Value Decomposition (SVD) framework to inclemently maintain SVD factors over streaming data. Then, I present the Group Incremental Non-Negative Matrix Factorization framework to leverage redundancies in the data to speed up incremental processing. They primarily tackle the challenges of using factorization models in the scenarios with streaming textual data. In order to tackle the challenges in improving the effectiveness and efficiency of generative models in this streaming environment, I introduce the Incremental Dynamic Multiscale Topic Model framework, which identifies multi-scale patterns and their evolutions within streaming datasets. While the latent factor models assume the linear independence in the latent factors, the generative models assume the observation is generated from a set of latent variables with various distributions. Furthermore, some models may not be accessible or their underlying structures are too complex to understand, such as simulation ensembles, where there may be thousands of parameters with a huge parameter space, the only way to learn information from it is to execute real simulations. When performing knowledge discovery and decision making through data- and model-driven simulation ensembles, it is expensive to operate these ensembles continuously at large scale, due to the high computational. Consequently, given a relatively small simulation budget, it is desirable to identify a sparse ensemble that includes the most informative simulations, while still permitting effective exploration of the input parameter space. Therefore, I present Complexity-Guided Parameter Space Sampling framework, which is an intelligent, top-down sampling scheme to select the most salient simulation parameters to execute, given a limited computational budget. Moreover, I also present a Pivot-Guided Parameter Space Sampling framework, which incrementally maintains a diverse ensemble of models of the simulation ensemble space and uses a pivot guided mechanism for future sample selection.
ContributorsChen, Xilun (Author) / Candan, K. Selcuk (Thesis advisor) / Davulcu, Hasan (Committee member) / Pedrielli, Giulia (Committee member) / Sapino, Maria Luisa (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)
Created2019

Description
Cyber-Physical Systems (CPS) are becoming increasingly prevalent around the world. Co-simulation of cyber and physical components has shown to be an effective way towards the development of time-sensitive and reliable CPS. Correctly combining continuous models with discrete models for co-simulation can often be challenging. In this thesis, the Functional Markup Interface (FMI) is used to develop an adapter called DEVS-FMI for the DEVS-Suite simulator. The adapter, implemented using JavaFMI 2.0, allows any Functional Mock-Up Unit (FMU) to be co-simulated with a Discrete Event System Specification (DEVS) model. This approach enables taking advantage of the parallel DEVS formalism to model cyber systems and using Modelica to model physical systems. An FMU serves as a slave simulator while the DEVS-Suite serves as a master simulator. The Four-Variable model is used as a guide to define the requirements for the inputs and outputs of actuator and sensor devices used in cyber and physical systems. The input and output data as non-functional abstractions of the sensor and actuator devices. Select cyber and physical parts of an electric scooter are chosen, modeled, simulated, and evaluated using the integrated OpenModelica and the DEVS-Suite simulators. Closely related research is briefly examined and expanding this work with support for implicit state-changes for continuous models and distributed co-simulation is noted.
ContributorsLin, Xuanli (Author) / Sarjoughian, Hessam S (Thesis advisor) / Pedrielli, Giulia (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)
Created2021

Description
Recent advances in manufacturing system, such as advanced embedded sensing, big data analytics and IoT and robotics, are promising a paradigm shift in the manufacturing industry towards smart manufacturing systems. Typically, real-time data is available in many industries, such as automotive, semiconductor, and food production, which can reflect the machine conditions and production system’s operation performance. However, a major research gap still exists in terms of how to utilize these real-time data information to evaluate and predict production system performance and to further facilitate timely decision making and production control on the factory floor. To tackle these challenges, this dissertation takes on an integrated analytical approach by hybridizing data analytics, stochastic modeling and decision making under uncertainty methodology to solve practical manufacturing problems.
Specifically, in this research, the machine degradation process is considered. It has been shown that machines working at different operating states may break down in different probabilistic manners. In addition, machines working in worse operating stage are more likely to fail, thus causing more frequent down period and reducing the system throughput. However, there is still a lack of analytical methods to quantify the potential impact of machine condition degradation on the overall system performance to facilitate operation decision making on the factory floor. To address these issues, this dissertation considers a serial production line with finite buffers and multiple machines following Markovian degradation process. An integrated model based on the aggregation method is built to quantify the overall system performance and its interactions with machine condition process. Moreover, system properties are investigated to analyze the influence of system parameters on system performance. In addition, three types of bottlenecks are defined and their corresponding indicators are derived to provide guidelines on improving system performance. These methods provide quantitative tools for modeling, analyzing, and improving manufacturing systems with the coupling between machine condition degradation and productivity given the real-time signals.
Specifically, in this research, the machine degradation process is considered. It has been shown that machines working at different operating states may break down in different probabilistic manners. In addition, machines working in worse operating stage are more likely to fail, thus causing more frequent down period and reducing the system throughput. However, there is still a lack of analytical methods to quantify the potential impact of machine condition degradation on the overall system performance to facilitate operation decision making on the factory floor. To address these issues, this dissertation considers a serial production line with finite buffers and multiple machines following Markovian degradation process. An integrated model based on the aggregation method is built to quantify the overall system performance and its interactions with machine condition process. Moreover, system properties are investigated to analyze the influence of system parameters on system performance. In addition, three types of bottlenecks are defined and their corresponding indicators are derived to provide guidelines on improving system performance. These methods provide quantitative tools for modeling, analyzing, and improving manufacturing systems with the coupling between machine condition degradation and productivity given the real-time signals.
ContributorsKang, Yunyi (Author) / Ju, Feng (Thesis advisor) / Pedrielli, Giulia (Committee member) / Wu, Teresa (Committee member) / Yan, Hao (Committee member) / Arizona State University (Publisher)
Created2020

Description
In conventional supervised learning tasks, information retrieval from extensive collections of data happens automatically at low cost, whereas in many real-world problems obtaining labeled data can be hard, time-consuming, and expensive. Consider healthcare systems, for example, where unlabeled medical images are abundant while labeling requires a considerable amount of knowledge from experienced physicians. Active learning addresses this challenge with an iterative process to select instances from the unlabeled data to annotate and improve the supervised learner. At each step, the query of examples to be labeled can be considered as a dilemma between exploitation of the supervised learner's current knowledge and exploration of the unlabeled input features.
Motivated by the need for efficient active learning strategies, this dissertation proposes new algorithms for batch-mode, pool-based active learning. The research considers the following questions: how can unsupervised knowledge of the input features (exploration) improve learning when incorporated with supervised learning (exploitation)? How to characterize exploration in active learning when data is high-dimensional? Finally, how to adaptively make a balance between exploration and exploitation?
The first contribution proposes a new active learning algorithm, Cluster-based Stochastic Query-by-Forest (CSQBF), which provides a batch-mode strategy that accelerates learning with added value from exploration and improved exploitation scores. CSQBF balances exploration and exploitation using a probabilistic scoring criterion based on classification probabilities from a tree-based ensemble model within each data cluster.
The second contribution introduces two more query strategies, Double Margin Active Learning (DMAL) and Cluster Agnostic Active Learning (CAAL), that combine consistent exploration and exploitation modules into a coherent and unified measure for label query. Instead of assuming a fixed clustering structure, CAAL and DMAL adopt a soft-clustering strategy which provides a new approach to formalize exploration in active learning.
The third contribution addresses the challenge of dynamically making a balance between exploration and exploitation criteria throughout the active learning process. Two adaptive algorithms are proposed based on feedback-driven bandit optimization frameworks that elegantly handle this issue by learning the relationship between exploration-exploitation trade-off and an active learner's performance.
Motivated by the need for efficient active learning strategies, this dissertation proposes new algorithms for batch-mode, pool-based active learning. The research considers the following questions: how can unsupervised knowledge of the input features (exploration) improve learning when incorporated with supervised learning (exploitation)? How to characterize exploration in active learning when data is high-dimensional? Finally, how to adaptively make a balance between exploration and exploitation?
The first contribution proposes a new active learning algorithm, Cluster-based Stochastic Query-by-Forest (CSQBF), which provides a batch-mode strategy that accelerates learning with added value from exploration and improved exploitation scores. CSQBF balances exploration and exploitation using a probabilistic scoring criterion based on classification probabilities from a tree-based ensemble model within each data cluster.
The second contribution introduces two more query strategies, Double Margin Active Learning (DMAL) and Cluster Agnostic Active Learning (CAAL), that combine consistent exploration and exploitation modules into a coherent and unified measure for label query. Instead of assuming a fixed clustering structure, CAAL and DMAL adopt a soft-clustering strategy which provides a new approach to formalize exploration in active learning.
The third contribution addresses the challenge of dynamically making a balance between exploration and exploitation criteria throughout the active learning process. Two adaptive algorithms are proposed based on feedback-driven bandit optimization frameworks that elegantly handle this issue by learning the relationship between exploration-exploitation trade-off and an active learner's performance.
ContributorsShams, Ghazal (Author) / Runger, George C. (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Escobedo, Adolfo (Committee member) / Pedrielli, Giulia (Committee member) / Arizona State University (Publisher)
Created2020

Description
Graphs are commonly used visualization tools in a variety of fields. Algorithms have been proposed that claim to improve the readability of graphs by reducing edge crossings, adjusting edge length, or some other means. However, little research has been done to determine which of these algorithms best suit human perception for particular graph properties. This thesis explores four different graph properties: average local clustering coefficient (ALCC), global clustering coefficient (GCC), number of triangles (NT), and diameter. For each of these properties, three different graph layouts are applied to represent three different approaches to graph visualization: multidimensional scaling (MDS), force directed (FD), and tsNET. In a series of studies conducted through the crowdsourcing platform Amazon Mechanical Turk, participants are tasked with discriminating between two graphs in order to determine their just noticeable differences (JNDs) for the four graph properties and three layout algorithm pairs. These results are analyzed using previously established methods presented by Rensink et al. and Kay and Heer.The average JNDs are analyzed using a linear model that determines whether the property-layout pair seems to follow Weber's Law, and the individual JNDs are run through a log-linear model to determine whether it is possible to model the individual variance of the participant's JNDs. The models are evaluated using the R2 score to determine if they adequately explain the data and compared using the Mann-Whitney pairwise U-test to determine whether the layout has a significant effect on the perception of the graph property. These tests indicate that the data collected in the studies can not always be modelled well with either the linear model or log-linear model, which suggests that some properties may not follow Weber's Law. Additionally, the layout algorithm is not found to have a significant impact on the perception of some of these properties.
ContributorsClayton, Benjamin (Author) / Maciejewski, Ross (Thesis advisor) / Kobourov, Stephen (Committee member) / Sefair, Jorge (Committee member) / Arizona State University (Publisher)
Created2019

Description
Breeding seeds to include desirable traits (increased yield, drought/temperature resistance, etc.) is a growing and important method of establishing food security. However, besides breeder intuition, few decision-making tools exist that can provide the breeders with credible evidence to make decisions on which seeds to progress to further stages of development. This thesis attempts to create a chance-constrained knapsack optimization model, which the breeder can use to make better decisions about seed progression and help reduce the levels of risk in their selections. The model’s objective is to select seed varieties out of a larger pool of varieties and maximize the average yield of the “knapsack” based on meeting some risk criteria. Two models are created for different cases. First is the risk reduction model which seeks to reduce the risk of getting a bad yield but still maximize the total yield. The second model considers the possibility of adverse environmental effects and seeks to mitigate the negative effects it could have on the total yield. In practice, breeders can use these models to better quantify uncertainty in selecting seed varieties
ContributorsOzcan, Ozkan Meric (Author) / Armbruster, Dieter (Thesis advisor) / Gel, Esma (Thesis advisor) / Sefair, Jorge (Committee member) / Arizona State University (Publisher)
Created2019

Description
Complex systems appear when interaction among system components creates emergent behavior that is difficult to be predicted from component properties. The growth of Internet of Things (IoT) and embedded technology has increased complexity across several sectors (e.g., automotive, aerospace, agriculture, city infrastructures, home technologies, healthcare) where the paradigm of cyber-physical systems (CPSs) has become a standard. While CPS enables unprecedented capabilities, it raises new challenges in system design, certification, control, and verification. When optimizing system performance computationally expensive simulation tools are often required, and search algorithms that sequentially interrogate a simulator to learn promising solutions are in great demand. This class of algorithms are black-box optimization techniques. However, the generality that makes black-box optimization desirable also causes computational efficiency difficulties when applied real problems. This thesis focuses on Bayesian optimization, a prominent black-box optimization family, and proposes new principles, translated in implementable algorithms, to scale Bayesian optimization to highly expensive, large scale problems. Four problem contexts are studied and approaches are proposed for practically applying Bayesian optimization concepts, namely: (1) increasing sample efficiency of a highly expensive simulator in the presence of other sources of information, where multi-fidelity optimization is used to leverage complementary information sources; (2) accelerating global optimization in the presence of local searches by avoiding over-exploitation with adaptive restart behavior; (3) scaling optimization to high dimensional input spaces by integrating Game theoretic mechanisms with traditional techniques; (4) accelerating optimization by embedding function structure when the reward function is a minimum of several functions. In the first context this thesis produces two multi-fidelity algorithms, a sample driven and model driven approach, and is implemented to optimize a serial production line; in the second context the Stochastic Optimization with Adaptive Restart (SOAR) framework is produced and analyzed with multiple applications to CPS falsification problems; in the third context the Bayesian optimization with sample fictitious play (BOFiP) algorithm is developed with an implementation in high-dimensional neural network training; in the last problem context the minimum surrogate optimization (MSO) framework is produced and combined with both Bayesian optimization and the SOAR framework with applications in simultaneous falsification of multiple CPS requirements.
ContributorsMathesen, Logan (Author) / Pedrielli, Giulia (Thesis advisor) / Candan, Kasim (Committee member) / Fainekos, Georgios (Committee member) / Gel, Esma (Committee member) / Montgomery, Douglas (Committee member) / Zabinsky, Zelda (Committee member) / Arizona State University (Publisher)
Created2021