Description
This project aspires to develop an AI capable of playing on a variety of maps in a Risk-like board game. While AI has been successfully applied to many other board games, such as Chess and Go, most research is confined to a single board and is inflexible to topological changes. Further, almost all of these games are played on a rectangular grid. By contrast, this project develops an AI player, referred to as GG-net, to play the online strategy game Warzone, which is based on the classic board game Risk. Warzone is played on a wide variety of irregularly shaped maps. Prior research has struggled to create an effective AI for Risk-like games due to the immense branching factor. The most successful attempts tended to rely on manually restricting the set of actions the AI considered while also engineering useful features for the AI to consider. GG-net uses no human knowledge, but rather a genetic algorithm combined with a graph neural network. Together, these methods allow GG-net to perform competitively across a multitude of maps. GG-net outperformed the built-in rule-based AI by 413 Elo (representing an 80.7% chance of winning) and an approach based on AlphaZero using graph neural networks by 304 Elo (representing a 74.2% chance of winning). This same advantage holds across both seen and unseen maps. GG-net appears to be a strong opponent on both small and medium maps; however, on large maps with hundreds of territories, inefficiencies in GG-net become more significant and GG-net struggles against the rule-based approach. Overall, GG-net was able to successfully learn the game and generalize across maps of a similar size, although further work is required for GG-net to become more successful on large maps.
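A minimal sketch of the kind of genetic-algorithm loop described above, assuming a simple mutation-only scheme over a flattened weight vector; the population size, mutation scale, and stand-in fitness function here are illustrative, not GG-net's actual settings:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 100                      # stand-in for a flattened GNN weight vector
target = rng.normal(size=dim)  # toy optimum so the sketch runs end to end

def fitness(params: np.ndarray) -> float:
    # Stand-in objective; in GG-net this would be a win rate measured by
    # playing Warzone games with a GNN policy parameterized by `params`.
    return -float(np.linalg.norm(params - target))

pop_size, n_parents, sigma, generations = 50, 10, 0.05, 200
population = rng.normal(0.0, 1.0, size=(pop_size, dim))
for _ in range(generations):
    scores = np.array([fitness(p) for p in population])
    parents = population[np.argsort(scores)[-n_parents:]]           # keep the fittest
    children = parents[rng.integers(0, n_parents, pop_size)]        # clone parents
    population = children + rng.normal(0.0, sigma, children.shape)  # mutate

best = max(population, key=fitness)
print("best fitness:", fitness(best))
```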
Contributors: Bauer, Andrew (Author) / Yang, Yezhou (Thesis director) / Harrison, Blake (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2022-05
Description

Machine learning is a rapidly growing field, no doubt in part due to its countless applications to other fields, including pedagogy and the creation of computer-aided tutoring systems. To extend the functionality of FACT, an automated teaching assistant, we want to predict, using metadata produced by student activity, whether a student is capable of fixing their own mistakes. Logs were collected from previous FACT trials with middle school math teachers and students. The data was converted to time-series sequences for deep learning, and ordinary features were extracted for statistical machine learning. Ultimately, deep learning models attained an accuracy of 60%, while tree-based methods attained an accuracy of 65%, showing that some correlation, although small, exists between how a student fixes their mistakes and whether their correction is correct.
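A hedged sketch of the tree-based pipeline described above, with synthetic stand-ins for the log-derived features; the feature names, label rule, and data here are hypothetical, not the actual FACT log schema:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 500  # synthetic stand-in for the FACT log records

# Hypothetical per-mistake features derived from activity logs: time until
# the student's first correction attempt, edit actions, and hints requested.
X = np.column_stack([
    rng.exponential(30.0, n),  # seconds before the first correction attempt
    rng.poisson(3, n),         # number of edit actions
    rng.poisson(1, n),         # hints requested
])
y = (X[:, 0] < 30).astype(int)  # toy rule standing in for "fixed correctly"

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```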

Contributors: Zhou, David (Author) / VanLehn, Kurt (Thesis director) / Wetzel, Jon (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2022-05
Description
The goal of this research project is to determine how beneficial machine learning (ML) techniques can be in predicting recessions. Past work has utilized a multitude of classification methods, from probit models to linear Support Vector Machines (SVMs), and obtained accuracies nearing 60-70%, where some models even predicted the Great Recession based on data from the previous 50 years. This paper builds on past work by starting with less complex classification techniques that are more broadly used in recession forecasting and ending with more complex ML models that produce higher accuracies than their more primitive counterparts. Many models were tested in this analysis; the findings corroborate past work showing that the SVM methodology produces more accurate results than the probit models currently in use, and add that other ML models produced sufficient accuracy as well.
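As a rough illustration of the SVM approach the paper builds on, here is a minimal sketch with synthetic stand-ins for macroeconomic indicators; the features, label rule, and data are hypothetical, not the paper's dataset:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 600  # synthetic stand-in for monthly macroeconomic observations

# Hypothetical predictors: yield-curve spread, unemployment change, and
# industrial production growth.
X = rng.normal(size=(n, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, n) < -0.5).astype(int)  # 1 = recession

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
svm = make_pipeline(StandardScaler(), SVC(kernel="linear")).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, svm.predict(X_test)))
```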
Contributors: Hogan, Carter (Author) / McCulloch, Robert (Thesis director) / Pereira, Claudiney (Committee member) / Barrett, The Honors College (Contributor) / School of International Letters and Cultures (Contributor) / Economics Program in CLAS (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2022-05
Description
Deforestation in the Amazon rainforest has the potential to have devastating effects on ecosystems on both a local and global scale, making it one of the most environmentally threatening phenomena occurring today. In order to minimize deforestation in the Amazon and its consequences, it is helpful to analyze its occurrence using machine learning architectures such as the U-Net. The U-Net is a type of Fully Convolutional Network that has shown significant capability in performing semantic segmentation. It is built upon a symmetric series of downsampling and upsampling layers that propagate feature information into higher spatial resolutions, allowing for the precise identification of features on the pixel scale. Such an architecture is well-suited for identifying features in satellite imagery. In this thesis, we construct and train a U-Net to identify deforested areas in satellite imagery of the Amazon through semantic segmentation.
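A minimal one-level U-Net sketch in PyTorch showing the symmetric down/up structure and skip connection described above; the depth, channel widths, and input bands of the thesis's actual network are assumptions here:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # Two 3x3 convolutions with ReLU: the basic unit of the U-Net.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """One-level U-Net: downsample, upsample, and a skip connection that
    carries high-resolution features to the decoder for per-pixel labels."""
    def __init__(self, in_ch=3, n_classes=2):
        super().__init__()
        self.enc = conv_block(in_ch, 32)
        self.down = nn.MaxPool2d(2)
        self.mid = conv_block(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = conv_block(64, 32)            # 64 = 32 skip + 32 upsampled
        self.head = nn.Conv2d(32, n_classes, 1)  # per-pixel class logits

    def forward(self, x):
        e = self.enc(x)             # high-resolution features
        m = self.mid(self.down(e))  # coarse features at half resolution
        u = self.up(m)              # back to input resolution
        return self.head(self.dec(torch.cat([e, u], dim=1)))

logits = TinyUNet()(torch.randn(1, 3, 64, 64))  # (1, 2, 64, 64) pixel scores
```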
Contributors: Giel, Joshua (Author) / Douglas, Liam (Co-author) / Espanol, Malena (Thesis director) / Cochran, Douglas (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / School of Sustainability (Contributor)
Created: 2024-05
Description
This thesis aims to advance healthcare and heart disease prevention by utilizing the Python programming language and various machine learning algorithms for heart disease detection. Cardiovascular disease, one of the main causes of death worldwide, is a serious global health concern: in the United States alone, one person dies of cardiovascular disease every 33 seconds. Because it is the leading cause of death, early identification is critical for early intervention and prevention. The study addresses key research questions, including the role of machine learning in enhancing heart disease detection, a comparative analysis of six machine learning models, and the importance of predictive indicators. By leveraging machine learning algorithms for medical data interpretation, the thesis contributes insights into early disease detection.
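The abstract does not list which six models were compared; as a sketch of this style of comparison, here are six common classifiers evaluated on a synthetic stand-in for a heart disease dataset:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# Synthetic stand-in for clinical features (age, cholesterol, blood pressure, ...).
X, y = make_classification(n_samples=600, n_features=13, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbors": KNeighborsClassifier(),
    "support vector machine": SVC(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validated accuracy
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```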
Contributors: La, Nikki (Author) / Sheehan, Connor (Thesis director) / Connor, Dylan (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2024-05
Description

Breast cancer is one of the most common types of cancer worldwide. Early detection and diagnosis are crucial for improving the chances of successful treatment and survival. In this thesis, many different machine learning algorithms were evaluated and compared to predict breast cancer malignancy from diagnostic features extracted from digitized images of breast tissue samples, called fine-needle aspirates. Breast cancer diagnosis typically involves a combination of mammography, ultrasound, and biopsy. However, machine learning algorithms can assist in the detection and diagnosis of breast cancer by analyzing large amounts of data and identifying patterns that may not be discernible to the human eye. By using these algorithms, healthcare professionals can potentially detect breast cancer at an earlier stage, leading to more effective treatment and better patient outcomes. The results showed that the gradient boosting classifier performed the best, achieving an accuracy of 96% on the test set. This indicates that this algorithm can be a useful tool for healthcare professionals in the early detection and diagnosis of breast cancer, potentially leading to improved patient outcomes.
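A minimal sketch of the winning approach: the Wisconsin Diagnostic Breast Cancer dataset bundled with scikit-learn contains exactly these fine-needle aspirate features, though whether the thesis used this copy of the data or this train/test split is an assumption:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# 30 diagnostic features computed from digitized fine-needle aspirate images;
# the label is malignant vs. benign.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```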

Contributors: Mallya, Aatmik (Author) / De Luca, Gennaro (Thesis director) / Chen, Yinong (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2023-05
Description

Graph neural networks (GNNs) offer a potential method of bypassing the Kohn-Sham equations in density functional theory (DFT) calculations by learning both the Hohenberg-Kohn (HK) mapping from external potential to electron density and the mapping from electron density to energy, allowing for calculations of much larger atomic systems and time scales and enabling large-scale MD simulations with DFT-level accuracy. In this work, we investigate the feasibility of GNNs to learn the HK map from the external potential, approximated as Gaussians, to the electron density n(r), and the mapping from n(r) to the energy density e(r), using PyTorch Geometric. We develop a graph representation for densities on radial grid points and determine that a k-nearest neighbor algorithm for determining node connections is an effective approach compared to a distance-cutoff model, with average graph sizes of 6.31 MB and 32.0 MB for the k = 10 and k = 50 datasets, respectively. Furthermore, we develop two GNNs in PyTorch Geometric and demonstrate a decrease in training losses for the n(r) to e(r) mapping of 8.52 · 10^14 and 3.10 · 10^14 for the k = 10 and k = 20 datasets, respectively, suggesting the model could be further trained and optimized to learn the electron-density-to-energy functional.
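A minimal sketch of the k-nearest-neighbor graph construction described above, using PyTorch Geometric; the grid coordinates and density values are random stand-ins, and knn_graph requires the torch-cluster package:

```python
import torch
from torch_geometric.data import Data
from torch_geometric.nn import knn_graph  # requires torch-cluster

# Hypothetical sketch: nodes are radial grid points carrying the density
# n(r) as a feature; edges connect each point to its k nearest neighbors.
pos = torch.rand(1000, 3)      # stand-in grid point coordinates
density = torch.rand(1000, 1)  # stand-in n(r) values at each point

edge_index = knn_graph(pos, k=10)  # k = 10 as in the smaller dataset
graph = Data(x=density, pos=pos, edge_index=edge_index)
print(graph)  # Data(x=[1000, 1], pos=[1000, 3], edge_index=[2, 10000])
```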

Contributors: Hayes, Matthew (Author) / Muhich, Christopher (Thesis director) / Oswald, Jay (Committee member) / Barrett, The Honors College (Contributor) / Chemical Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2023-05
Description
The performance of modern machine learning algorithms depends upon the selection of a set of hyperparameters. Common examples of hyperparameters are the learning rate and the number of layers in a dense neural network. Auto-ML is a branch of optimization that has produced important contributions in this area. Within Auto-ML, multi-fidelity approaches, which eliminate poorly performing configurations after evaluating them at low budgets, are among the most effective. However, the performance of these algorithms strongly depends on how effectively they allocate the computational budget to various hyperparameter configurations. We first present Parameter Optimization with Conscious Allocation 1.0 (POCA 1.0), a Hyperband-based algorithm for hyperparameter optimization that adaptively allocates the input budget to the hyperparameter configurations it generates following a Bayesian sampling scheme. We then present its successor, Parameter Optimization with Conscious Allocation 2.0 (POCA 2.0), which follows POCA 1.0's successful philosophy while utilizing a time-series model to reduce wasted computational cost and providing a more flexible framework. We compare POCA 1.0 and 2.0 to their nearest competitor, BOHB, at optimizing the hyperparameters of a multi-layer perceptron and find that both POCA algorithms exceed BOHB in low-budget hyperparameter optimization while performing similarly in high-budget scenarios.
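For context, a minimal sketch of the successive-halving budget schedule that Hyperband-style methods, including POCA, build on; the toy objective and parameters are illustrative, and POCA's Bayesian sampling and time-series model are not reproduced here:

```python
import random

def successive_halving(sample_config, evaluate, n=27, min_budget=1, eta=3):
    """Multi-fidelity sketch: evaluate n random configurations at a small
    budget, keep the best 1/eta of them, and repeat with eta times the
    budget until one configuration remains (lower score = better)."""
    configs = [sample_config() for _ in range(n)]
    budget = min_budget
    while len(configs) > 1:
        scores = [evaluate(c, budget) for c in configs]
        ranked = sorted(zip(scores, configs), key=lambda t: t[0])
        configs = [c for _, c in ranked[: max(1, len(configs) // eta)]]
        budget *= eta
    return configs[0]

# Toy usage: minimize a noisy quadratic whose noise shrinks as budget grows,
# mimicking a validation loss that becomes more reliable with more training.
best = successive_halving(
    sample_config=lambda: random.uniform(-2, 2),
    evaluate=lambda c, b: (c - 0.5) ** 2 + random.gauss(0, 1.0 / b),
)
print("best configuration:", best)
```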
Contributors: Inman, Joshua (Author) / Sankar, Lalitha (Thesis director) / Pedrielli, Giulia (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2024-05
Description
This study presents a comparative analysis of machine learning models on their ability to determine match outcomes in the English Premier League (EPL), focusing on optimizing prediction accuracy. The research leverages a variety of models, including logistic regression, decision trees, random forests, gradient boosting machines, support vector machines, k-nearest neighbors, and extreme gradient boosting, to predict the outcomes of soccer matches in the EPL. Utilizing a comprehensive dataset from Kaggle, the study uses the Sport Result Prediction CRISP-DM framework for data preparation and model evaluation, comparing the accuracy, precision, recall, F1-score, ROC-AUC score, and confusion matrices of each model used in the study. The findings reveal that ensemble methods, notably Random Forest and Extreme Gradient Boosting, outperform other models in accuracy, highlighting their potential in sports analytics. This research contributes to the field of sports analytics by demonstrating the effectiveness of machine learning in sports outcome prediction, while also identifying the challenges and complexities inherent in predicting the outcomes of EPL matches. This research not only highlights the significance of ensemble learning techniques in handling sports data complexities but also opens avenues for future exploration into advanced machine learning and deep learning approaches for enhancing predictive accuracy in sports analytics.
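A hedged sketch of the ensemble evaluation described above, using a synthetic stand-in for the engineered match features and the three-way outcome label (home win / draw / away win); the actual Kaggle features and preprocessing are not reproduced here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Synthetic stand-in for engineered match features (form, goals, shots, ...).
X, y = make_classification(n_samples=2000, n_features=20, n_classes=3,
                           n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)
pred = clf.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred))
print("macro F1:", f1_score(y_test, pred, average="macro"))
print("ROC-AUC (ovr):", roc_auc_score(y_test, clf.predict_proba(X_test),
                                      multi_class="ovr"))
```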
Contributors: Tashildar, Ninad (Author) / Osburn, Steven (Thesis director) / Simari, Gerardo (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Economics Program in CLAS (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2024-05
Description

For my Honors Thesis, I decided to create an artificial intelligence project to predict fantasy NFL football points for players and team defenses. I created a TensorFlow Keras regression model, a Flask API that serves the model, and a Django try-it page that lets the user run it. These services are hosted on ASU's AWS service. The Flask API actively gathers data from Pro-Football-Reference and then calculates the fantasy points. For example, if the current year is 2022, the model trains on all available data from 2000 to 2020, tests on the 2021 data, and predicts for the 2022 season. The Django website asks the user to input the current year; clicking the submit button runs the AI model through the process explained above. The user then enters a player's name, and the website displays five rows: the previous four years of fantasy points and a fifth row containing the prediction.
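A minimal sketch of a Keras regression model in the spirit described above; the feature count, layer sizes, and training data here are stand-ins, not the project's actual configuration:

```python
import numpy as np
from tensorflow import keras

# Stand-in training data: per-season player statistics as features and
# fantasy points as the continuous regression target.
X_train = np.random.rand(500, 8).astype("float32")
y_train = np.random.rand(500, 1).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),  # single continuous output: predicted points
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=0)
print(model.predict(X_train[:1]))  # predicted fantasy points for one player
```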

Contributors: Panikulam, Caleb (Author) / De Luca, Gennaro (Thesis director) / Chen, Yinong (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2022-12