Despite their remarkable capabilities, Large Language Models (LLMs) exhibit concerning vulnerabilities to minor input perturbations, posing risks in safety-critical applications. This thesis systematically analyzes how semantic-preserving prompt perturbations at the character, word, and sentence levels affect LLM safety behavior. It investigates whether even slight rephrasings can cause an otherwise aligned model to flip from a safe (refusal or benign) response to an unsafe one, or vice versa. The study also identifies key perturbation characteristics that drive such flips and examines how flip likelihood varies across different categories of harmful content.

To answer these questions, the study evaluates multiple advanced LLMs (LLaMA2, LLaMA3, Mistral) on the CatQA dataset of harmful queries. Each query is perturbed by applying controlled character-level typos, word-level substitutions, and sentence-level paraphrases while preserving semantic meaning. Model outputs are assessed using the automated safety evaluation tool LlamaGuard.
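To make the three perturbation levels concrete, the sketch below shows one plausible way to generate them for a single prompt. The helper names, the swap rate, the synonym lookup table, and the external paraphrase callable are illustrative assumptions for exposition, not the thesis's exact implementation.

```python
import random

def char_level_typo(prompt: str, rate: float = 0.05) -> str:
    """Swap adjacent letters at a small rate to simulate character-level typos."""
    chars = list(prompt)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and random.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def word_level_substitution(prompt: str, synonyms: dict[str, str]) -> str:
    """Replace individual words using a meaning-preserving synonym table."""
    return " ".join(synonyms.get(word.lower(), word) for word in prompt.split())

def sentence_level_paraphrase(prompt: str, paraphraser) -> str:
    """Delegate full-sentence rewriting to an external paraphrase model (assumed interface)."""
    return paraphraser(prompt)

# Example usage with a placeholder paraphraser:
if __name__ == "__main__":
    query = "Describe how the system works"
    print(char_level_typo(query))
    print(word_level_substitution(query, {"describe": "explain"}))
    print(sentence_level_paraphrase(query, lambda p: p))  # identity stand-in
```

Under this kind of setup, each perturbed variant would then be sent to the target model and its response labeled safe or unsafe by the safety evaluator, so that flips relative to the unperturbed prompt can be counted.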
Empirical results show that sentence-level paraphrasing consistently improves safety compliance, whereas fine-grained character-level noise often degrades it due to tokenization weaknesses. Word-level changes yield the most inconsistent behavior, with random insertions particularly likely to elicit unsafe outputs. Among the models tested, LLaMA3 was the most vulnerable, exhibiting the highest rate of unsafe responses under perturbation.
Overall, these findings underscore the importance of perturbation-aware and context-aware safety evaluation and offer practical insights for improving LLM alignment in real-world deployments.