Matching Items (23)
Filtering by
- Member of: ASU Electronic Theses and Dissertations

Description
As the gap widens between the number of security threats and the number of security professionals, the need for automated security tools becomes increasingly important. These automated systems assist security professionals by identifying and/or fixing potential vulnerabilities before they can be exploited. One such category of tools is exploit generators, which craft exploits to demonstrate a vulnerability and provide guidance on how to repair it. Existing exploit generators largely use the application code, either through static or dynamic analysis, to locate crashes and craft a payload.
This thesis proposes the Automated Reflection of CTF Hostile Exploits (ARCHES), an exploit generator that learns by example. ARCHES uses an inductive programming library named IRE to generate exploits from exploit examples. In doing so, ARCHES can create an exploit only from example exploit payloads without interacting with the service. By representing each component of the exploit interaction as a collection of theories for how that component occurs, ARCHES can identify critical state information and replicate an executable exploit. This methodology learns rapidly and works with only a few examples. The ARCHES exploit generator is targeted towards Capture the Flag (CTF) events as a suitable environment for initial research.
The effectiveness of this methodology was evaluated on four exploits with features that demonstrate the capabilities and limitations of this methodology. ARCHES is capable of reproducing exploits that require an understanding of state dependent input, such as a flag id. Additionally, ARCHES can handle basic utilization of state information that is revealed through service output. However, limitations in this methodology result in failure to replicate exploits that require a loop, intricate mathematics, or multiple TCP connections.
Inductive programming has potential as a security tool to augment existing automated security tools. Future research into these techniques will provide more capabilities for security professionals in academia and in industry.
This thesis proposes the Automated Reflection of CTF Hostile Exploits (ARCHES), an exploit generator that learns by example. ARCHES uses an inductive programming library named IRE to generate exploits from exploit examples. In doing so, ARCHES can create an exploit only from example exploit payloads without interacting with the service. By representing each component of the exploit interaction as a collection of theories for how that component occurs, ARCHES can identify critical state information and replicate an executable exploit. This methodology learns rapidly and works with only a few examples. The ARCHES exploit generator is targeted towards Capture the Flag (CTF) events as a suitable environment for initial research.
The effectiveness of this methodology was evaluated on four exploits with features that demonstrate the capabilities and limitations of this methodology. ARCHES is capable of reproducing exploits that require an understanding of state dependent input, such as a flag id. Additionally, ARCHES can handle basic utilization of state information that is revealed through service output. However, limitations in this methodology result in failure to replicate exploits that require a loop, intricate mathematics, or multiple TCP connections.
Inductive programming has potential as a security tool to augment existing automated security tools. Future research into these techniques will provide more capabilities for security professionals in academia and in industry.
ContributorsCrosley, Zackary (Author) / Doupe, Adam (Thesis advisor) / Shoshitaishvili, Yan (Committee member) / Wang, Ruoyu (Committee member) / Arizona State University (Publisher)
Created2019

Description
Reverse engineering is critical to reasoning about how a system behaves. While complete access to a system inherently allows for perfect analysis, partial access is inherently uncertain. This is the case foran individual agent in a distributed system. Inductive Reverse Engineering (IRE) enables analysis under
such circumstances. IRE does this by producing program spaces consistent with individual input-output examples for a given domain-specific language. Then, IRE intersects those program spaces to produce a generalized program consistent with all examples. IRE, an easy to use framework, allows this domain-specific language to be specified in the form of Theorist s, which produce Theory s, a succinct way of representing the program space.
Programs are often much more complex than simple string transformations. One of the ways in which they are more complex is in the way that they follow a conversation-like behavior, potentially following some underlying protocol. As a result, IRE represents program interactions as Conversations in order to
more correctly model a distributed system. This, for instance, enables IRE to model dynamically captured inputs received from other agents in the distributed system.
While domain-specific knowledge provided by a user is extremely valuable, such information is not always possible. IRE mitigates this by automatically inferring program grammars, allowing it to still perform efficient searches of the program space. It does this by intersecting conversations prior to synthesis in order to understand what portions of conversations are constant.
IRE exists to be a tool that can aid in automatic reverse engineering across numerous domains. Further, IRE aspires to be a centralized location and interface for implementing program synthesis and automatic black box analysis techniques.
such circumstances. IRE does this by producing program spaces consistent with individual input-output examples for a given domain-specific language. Then, IRE intersects those program spaces to produce a generalized program consistent with all examples. IRE, an easy to use framework, allows this domain-specific language to be specified in the form of Theorist s, which produce Theory s, a succinct way of representing the program space.
Programs are often much more complex than simple string transformations. One of the ways in which they are more complex is in the way that they follow a conversation-like behavior, potentially following some underlying protocol. As a result, IRE represents program interactions as Conversations in order to
more correctly model a distributed system. This, for instance, enables IRE to model dynamically captured inputs received from other agents in the distributed system.
While domain-specific knowledge provided by a user is extremely valuable, such information is not always possible. IRE mitigates this by automatically inferring program grammars, allowing it to still perform efficient searches of the program space. It does this by intersecting conversations prior to synthesis in order to understand what portions of conversations are constant.
IRE exists to be a tool that can aid in automatic reverse engineering across numerous domains. Further, IRE aspires to be a centralized location and interface for implementing program synthesis and automatic black box analysis techniques.
ContributorsNelson, Connor David (Author) / Doupe, Adam (Thesis advisor) / Shoshitaishvili, Yan (Committee member) / Wang, Ruoyu (Committee member) / Arizona State University (Publisher)
Created2019

Description
Cyberspace has become a field where the competitive arms race between defenders and adversaries play out. Adaptive, intelligent adversaries are crafting new responses to the advanced defenses even though the arms race has resulted in a gradual improvement of the security posture. This dissertation aims to assess the evolving threat landscape and enhance state-of-the-art defenses by exploiting and mitigating two different types of emerging security vulnerabilities. I first design a new cache attack method named Prime+Count which features low noise and no shared memory needed.I use the method to construct fast data covert channels. Then, I propose a novel software-based approach, SmokeBomb, to prevent cache side-channel attacks for inclusive and non-inclusive caches based on the creation of a private space in the L1 cache. I demonstrate the effectiveness of SmokeBomb by applying it to two different ARM processors with different instruction set versions and cache models and carry out an in-depth evaluation. Next, I introduce an automated approach that exploits a stack-based information leak vulnerability in operating system kernels to obtain sensitive data. Also, I propose a lightweight and widely applicable runtime defense, ViK, for preventing temporal memory safety violations which can lead attackers to have arbitrary code execution or privilege escalation together with information leak vulnerabilities. The security impact of temporal memory safety vulnerabilities is critical, but,they are difficult to identify because of the complexity of real-world software and the spatial separation of allocation and deallocation code. Therefore, I focus on preventing not the vulnerabilities themselves, but their exploitation. ViK can effectively protect operating system kernels and user-space programs from temporal memory safety violations, imposing low runtime and memory overhead.
ContributorsCho, Haehyun (Author) / Ahn, Gail-Joon (Thesis advisor) / Doupe, Adam (Thesis advisor) / Shoshitaishvili, Yan (Committee member) / Wang, Ruoyu (Committee member) / Wu, Carole-Jean (Committee member) / Arizona State University (Publisher)
Created2021

Description
With the rise of the Internet of Things, embedded systems have become an integral part of life and can be found almost anywhere. Their prevalence and increased interconnectivity has made them a prime target for malicious attacks. Today, the vast majority of embedded devices are powered by ARM processors. To protect their processors from attacks, ARM introduced a hardware security extension known as TrustZone. It provides an isolated execution environment within the embedded device in which to deploy various memory integrity and malware detection tools.
Even though Secure World can monitor the Normal World, attackers can attempt to bypass the security measures to retain control of a compromised system. CacheKit is a new type of rootkit that exploits such a vulnerability in the ARM architecture to hide in Normal World cache from memory introspection tools running in Secure World by exploiting cache locking mechanisms. If left unchecked, ARM processors that provide hardware assisted cache locking for performance and time-critical applications in real-time and embedded systems would be completely vulnerable to this undetectable and untraceable attack. Therefore, a new approach is needed to ensure the correct use of such mechanisms and prevent malicious code from being hidden in the cache.
CacheLight is a lightweight approach that leverages the TrustZone and Virtualization extensions of the ARM architecture to allow the system to continue to securely provide these hardware facilities to users while preventing attackers from exploiting them. CacheLight restricts the ability to lock the cache to the Secure World of the processor such that the Normal World can still request certain memory to be locked into the cache by the secure operating system (OS) through a Secure Monitor Call (SMC). This grants the secure OS the power to verify and validate the information that will be locked in the requested cache way thereby ensuring that any data that remains in the cache will not be inconsistent with what exists in main memory for inspection. Malicious attempts to hide data can be prevented and recovered for analysis while legitimate requests can still generate valid entries in the cache.
Even though Secure World can monitor the Normal World, attackers can attempt to bypass the security measures to retain control of a compromised system. CacheKit is a new type of rootkit that exploits such a vulnerability in the ARM architecture to hide in Normal World cache from memory introspection tools running in Secure World by exploiting cache locking mechanisms. If left unchecked, ARM processors that provide hardware assisted cache locking for performance and time-critical applications in real-time and embedded systems would be completely vulnerable to this undetectable and untraceable attack. Therefore, a new approach is needed to ensure the correct use of such mechanisms and prevent malicious code from being hidden in the cache.
CacheLight is a lightweight approach that leverages the TrustZone and Virtualization extensions of the ARM architecture to allow the system to continue to securely provide these hardware facilities to users while preventing attackers from exploiting them. CacheLight restricts the ability to lock the cache to the Secure World of the processor such that the Normal World can still request certain memory to be locked into the cache by the secure operating system (OS) through a Secure Monitor Call (SMC). This grants the secure OS the power to verify and validate the information that will be locked in the requested cache way thereby ensuring that any data that remains in the cache will not be inconsistent with what exists in main memory for inspection. Malicious attempts to hide data can be prevented and recovered for analysis while legitimate requests can still generate valid entries in the cache.
ContributorsGutierrez, Mauricio (Author) / Zhao, Ziming (Thesis advisor) / Doupe, Adam (Committee member) / Shoshitaishvili, Yan (Committee member) / Arizona State University (Publisher)
Created2018

Description
The rise in popularity of applications and services that charge for access to proprietary trained models has led to increased interest in the robustness of these models and the security of the environments in which inference is conducted. State-of-the-art attacks extract models and generate adversarial examples by inferring relationships between a model’s input and output. Popular variants of these attacks have been shown to be deterred by countermeasures that poison predicted class distributions and mask class boundary gradients. Neural networks are also vulnerable to timing side-channel attacks. This work builds on top of Subneural, an attack framework that uses floating point timing side channels to extract neural structures. Novel applications of addition timing side channels are introduced, allowing the signs and arrangements of leaked parameters to be discerned more efficiently. Addition timing is also used to leak network biases, making the framework applicable to a wider range of targets. The enhanced framework is shown to be effective against models protected by prediction poisoning and gradient masking adversarial countermeasures and to be competitive with adaptive black box adversarial attacks against stateful defenses. Mitigations necessary to protect against floating-point timing side-channel attacks are also presented.
ContributorsVipat, Gaurav (Author) / Shoshitaishvili, Yan (Thesis advisor) / Doupe, Adam (Committee member) / Srivastava, Siddharth (Committee member) / Arizona State University (Publisher)
Created2023

Description
The Web plays an indispensable role in modern life, enabling countless interactions, from social engagement to critical business operations. However, these interactions expose users to a variety of security and privacy threats. This dissertation focuses on safeguarding users' Web interactions by addressing three key challenges: content injection vulnerabilities in web applications, privacy risks from browser extension fingerprinting, and Account Takeover (ATO) attacks carried out by fraud browsers, all of which directly impact users' safety. First, CONTEXT-AUDITOR introduces a novel technique to mitigate content injection vulnerabilities, including Cross-Site Scripting (XSS), scriptless attacks, and command injections. By detecting unintended context switches in the browser's parsing engine, CONTEXT-AUDITOR provides robust protection for web applications, ensuring that users' interactions with them remain secure. Second, Simulacrum enhances privacy protection by defending against DOM-based extension fingerprinting using the DOM Reality Shifting concept. It conceals browser extension behaviors from websites, preventing over 95% of vulnerable extensions from being fingerprinted. This approach directly addresses user privacy concerns, shielding them from tracking and profiling based on browser extensions. Finally, BROWSER POLYGRAPH provides a scalable, privacy-preserving solution to detect fraud browsers used in ATO attacks. By leveraging coarse-grained browser fingerprints, it identifies suspicious sessions, improving the accuracy of risk-based authentication systems and protecting users from fraudulent account compromises. In summary, this dissertation presents practical, deployable solutions that enhance the security and privacy of users' Web interactions. By safeguarding web applications, protecting user privacy, and defending against ATO fraud, these contributions play a key role in ensuring the safety of users in the increasingly adversarial Web ecosystem.
ContributorsKalantari, Faezeh (Author) / Doupe, Adam (Thesis advisor) / Shoshitaishvili, Yan (Committee member) / Wang, Ruoyu (Committee member) / Bao, Tiffany (Committee member) / Polakis, Jason (Committee member) / Arizona State University (Publisher)
Created2024

Description
Cryptographic voting systems such as Helios rely heavily on a trusted party to maintain privacy or verifiability. This tradeoff can be done away with by using distributed substitutes for the components that need a trusted party. By replacing the encryption, shuffle, and decryption steps described by Helios with the Pedersen threshold encryption and Neff shuffle, it is possible to obtain a distributed voting system which achieves both privacy and verifiability without trusting any of the contributors. This thesis seeks to examine existing approaches to this problem, and their shortcomings. It provides empirical metrics for comparing different working solutions in detail.
ContributorsBouck, Spencer Joseph (Author) / Bazzi, Rida (Thesis advisor) / Boscovic, Dragan (Committee member) / Shoshitaishvili, Yan (Committee member) / Arizona State University (Publisher)
Created2021

Description
Reverse engineers use decompilers to analyze binaries when their source code is unavailable. A binary decompiler attempts to transform binary programs to their corresponding high-level source code by recovering and inferring the information that was lost during the compilation process. One type of information that is lost during compilation is variable names, which are critical for reverse engineers to analyze and understand programs. Traditional binary decompilers generally use automatically generated, placeholder variable names that are meaningless or have little correlation with their intended semantics. Having correct or meaningful variable names in decompiled code, instead of placeholder variable names, greatly increases the readability of decompiled binary code. Decompiled Identifier Renaming Engine (DIRE) is a state-of-the-art, deep-learning-based solution that automatically predicts variable names in decompiled binary code. However, DIRE's prediction result is far from perfect. The first goal of this research project is to take a close look at the current state-of-the-art solution for automated variable name prediction on decompilation output of binary code, assess the prediction quality, and understand how the prediction result can be improved. Then, as the second goal of this research project, I aim to improve the prediction quality of variable names. With a thorough understanding of DIRE's issues, I focus on improving the quality of training data. This thesis proposes a novel approach to improving the quality of the training data by normalizing variable names and converting their abbreviated forms to their full forms. I implemented and evaluated the proposed approach on a data set of over 10k and 20k binaries and showed improvements over DIRE.
ContributorsBajaj, Ati Priya (Author) / Wang, Ruoyu (Thesis advisor) / Baral, Chitta (Committee member) / Shoshitaishvili, Yan (Committee member) / Arizona State University (Publisher)
Created2021

Description
Network Management is a critical process for an enterprise to configure and monitor the network devices using cost effective methods. It is imperative for it to be robust and free from adversarial or accidental security flaws. With the advent of cloud computing and increasing demands for centralized network control, conventional management protocols like Simple Network Management Protocol (SNMP) appear inadequate and newer techniques like Network Management Datastore Architecture (NMDA) design and Network Configuration (NETCONF) have been invented. However, unlike SNMP which underwent improvements concentrating on security, the new data management and storage techniques have not been scrutinized for the inherent security flaws.
In this thesis, I identify several vulnerabilities in the widely used critical infrastructures which leverage the NMDA design. Software Defined Networking (SDN), a proponent of NMDA, heavily relies on its datastores to program and manage the network. I base my research on the security challenges put forth by the existing datastore’s design as implemented by the SDN controllers. The vulnerabilities identified in this work have a direct impact on the controllers like OpenDayLight, Open Network Operating System and their proprietary implementations (by CISCO, Ericsson, RedHat, Brocade, Juniper, etc). Using the threat detection methodology, I demonstrate how the NMDA-based implementations are vulnerable to attacks which compromise availability, integrity, and confidentiality of the network. I finally propose defense measures to address the security threats in the existing design and discuss the challenges faced while employing these countermeasures.
In this thesis, I identify several vulnerabilities in the widely used critical infrastructures which leverage the NMDA design. Software Defined Networking (SDN), a proponent of NMDA, heavily relies on its datastores to program and manage the network. I base my research on the security challenges put forth by the existing datastore’s design as implemented by the SDN controllers. The vulnerabilities identified in this work have a direct impact on the controllers like OpenDayLight, Open Network Operating System and their proprietary implementations (by CISCO, Ericsson, RedHat, Brocade, Juniper, etc). Using the threat detection methodology, I demonstrate how the NMDA-based implementations are vulnerable to attacks which compromise availability, integrity, and confidentiality of the network. I finally propose defense measures to address the security threats in the existing design and discuss the challenges faced while employing these countermeasures.
ContributorsDixit, Vaibhav Hemant (Author) / Ahn, Gail-Joon (Thesis advisor) / Doupe, Adam (Thesis advisor) / Shoshitaishvili, Yan (Committee member) / Zhao, Ziming (Committee member) / Arizona State University (Publisher)
Created2018

Description
Web applications are ubiquitous. Accessible from almost anywhere, web applications support multiple platforms and can be easily customized. Most people interact with web applications daily for social media, communication, research, purchases, etc. Node.js has gained popularity as a programming language for web applications. A server-side JavaScript implementation, Node.js, allows both the front-end and back-end to be coded in JavaScript. Node.js contains many features such as dynamic inclusion of other modules using a built-in function named require which dynamically locates and loads code.
To be effective, web applications must perform actions quickly while avoiding unexpected interruptions. However, dynamically linked libraries can cause delays and thus downtime, because dynamically linked code must load multiple files, often from disk. As loading is one of the slowest operations a computer performs, seeking from disk can have a negative impact on performance which causes the server to feel less responsive for users. Dynamically linked code can also break when the underlying library is updated. Normally, when trying to update a server, developers will use test servers. However, if the developer accidentally updates a library in a dynamically linked system, it may be incompatible with another portion of the program.
Statically linking code makes it more reliable and faster (to load) than dynamically linking code. The static linking process varies by programming language. Therefore, different static linkers need to be developed for different languages. This thesis describes the creation of a static linker, called FrozenNode, for the popular back-end web application language, Node.js. FrozenNode resolves Node.js applications into a single file that does not rely on dynamic libraries. FrozenNode was built on top of Closure Compiler to accurately process JavaScript. We found that the resolved application was faster and self-contained yielding significant advantages over the dynamically loaded application. Furthermore, both had the same output.
Vulnerabilities in web applications can be found using static analysis tools, however static analysis tools must reason about dynamically linked application. FrozenNode can be used to statically link a Node.js application before being used by a JavaScript static analysis tool.
To be effective, web applications must perform actions quickly while avoiding unexpected interruptions. However, dynamically linked libraries can cause delays and thus downtime, because dynamically linked code must load multiple files, often from disk. As loading is one of the slowest operations a computer performs, seeking from disk can have a negative impact on performance which causes the server to feel less responsive for users. Dynamically linked code can also break when the underlying library is updated. Normally, when trying to update a server, developers will use test servers. However, if the developer accidentally updates a library in a dynamically linked system, it may be incompatible with another portion of the program.
Statically linking code makes it more reliable and faster (to load) than dynamically linking code. The static linking process varies by programming language. Therefore, different static linkers need to be developed for different languages. This thesis describes the creation of a static linker, called FrozenNode, for the popular back-end web application language, Node.js. FrozenNode resolves Node.js applications into a single file that does not rely on dynamic libraries. FrozenNode was built on top of Closure Compiler to accurately process JavaScript. We found that the resolved application was faster and self-contained yielding significant advantages over the dynamically loaded application. Furthermore, both had the same output.
Vulnerabilities in web applications can be found using static analysis tools, however static analysis tools must reason about dynamically linked application. FrozenNode can be used to statically link a Node.js application before being used by a JavaScript static analysis tool.
ContributorsHutchins, James (Author) / Doupe, Adam (Thesis advisor) / Shoshitaishvili, Yan (Committee member) / Zhao, Ziming (Committee member) / Arizona State University (Publisher)
Created2018