BARED: STRUCTURED ANALYSIS OF DISASSEMBLY THROUGH A RELATIONAL DATABASE.
Date: 22 September 2020
Author: Stroschein, Josh
-
INTRODUCTION
The advancement of technology has not come without its perils. Massive data breaches, invasions of privacy, disruption to business operations and attacks against critical infrastructure are now greatly facilitated by the proliferation and interconnected nature of computing devices [1]. This has given rise to a new profession, commonly referred to as "cyber security," in which professionals are tasked with providing the necessary insight and technical know-how to minimize the risk posed by these devices, both commercially and at home [2].
To evaluate the resilience of computing systems to malicious attack, it is crucial to be able to audit not only core operating system software but also third-party software installed within the operating system. Therefore, the purpose of a software audit is not only to identify systems susceptible to known vulnerabilities, but also to discover unknown vulnerabilities [3]. The auditing process provides owners of computing systems with the opportunity to update their software or apply other defenses to mitigate these vulnerabilities, reducing their exposure to attack.
The process of performing a software audit centers around the activity of reverse engineering, which is defined by Regalado [3] as "simply taking a product apart to understand how it works." The level of effort required to audit software varies with several conditions: the complexity of the software, the scope of the audit, the availability of source code, the technical proficiency of those performing the review and their familiarity with the software under scrutiny. Where source code is not available, which occurs frequently with closed-source, proprietary software, binary analysis must be performed [3]. According to Andriesse, Chen, van der Veen, Slowinska and Bos [4], "Disassembly is thus crucial for analyzing or securing untrusted or proprietary binaries, where source code is simply not available." Binary analysis requires that the software be reviewed in its binary state, without the benefits available when analyzing source code, such as variable and function naming, programmer comments, control flow constructs, and class definitions. This level of analysis demands greater knowledge from the reviewer in areas such as compiler behavior, operating system internals, assembly language, file formats and other topics that correspond to the inner workings of the computer system [3].
Due to the complexities involved with binary analysis, this work proposes BARED, which is intended to be used by security researchers and malware analysts. Security researchers can use the framework to audit software for known vulnerabilities as well as to perform research to potentially identify unknown vulnerabilities, providing an opportunity for remediation before the software is exploited. Malware analysts can use the framework to perform binary analysis and to expand their investigative capabilities through the framework's search capabilities. Furthermore, the framework can be used to analyze mitigation techniques implemented by operating system and third-party vendors, which assists in evaluating a computer system's defensive measures. BARED takes a novel approach to system-level security by introducing a structured framework for binary analysis of software, backed by a relational database for permanent storage of the disassembled binary instructions. BARED also introduces novel ways of searching and interacting with the disassembled instructions.
-
PROPOSED FRAMEWORK
In this paper, we propose BARED, a framework that assists in reverse engineering software in its binary state.
-
BINARY ANALYSIS FRAMEWORKS
Binary analysis frameworks have been created to streamline the process of performing binary analysis. These frameworks often exhibit common characteristics in performance, support and output provided to the end-user. This section will discuss prevailing frameworks, features provided and limitations. The section concludes by identifying how BARED addresses these limitations.
-
PEV
Efforts have been made to streamline the process of reverse engineering, whether under a static or dynamic context. PEV was introduced by Merces and Weyrich [5] as a "fast, scriptable, multiplatform, feature-rich, free and open-source" [5] framework for the analysis of portable executable (PE) files. In addition to providing an automated framework for parsing and displaying information about a PE file, it also provides functionality to assist in identifying malformations, indicators of potential malicious activity, and the ability to generate disassembly. Although it is a versatile platform, it is limited to parsing PE files, does not permanently store data, and does not provide an architecture for searching the instructions of the program under analysis.
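To make concrete what parsing a PE file and flagging a malformation involves, the following is a minimal Python sketch that reads two fields from the COFF header and rejects input lacking the expected signatures. It is not PEV's implementation; the function names `read_pe_summary` and `make_sample_header` and the synthetic header bytes are hypothetical and exist only for illustration.

```python
import struct

# Two well-known COFF machine identifiers; other values are shown in hex.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x86-64"}


def read_pe_summary(data: bytes) -> dict:
    """Return the machine type and section count of a PE image.

    Raises ValueError when the MZ or PE signature is missing, which is
    the simplest kind of malformation check.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The COFF header follows the 4-byte signature: Machine, NumberOfSections.
    machine, section_count = struct.unpack_from("<HH", data, pe_offset + 4)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": section_count,
    }


def make_sample_header() -> bytes:
    """Build a synthetic header (illustrative only, not a runnable program)."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature directly after
    # Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics
    coff = struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 0xF0, 0x0022)
    return bytes(dos) + b"PE\x00\x00" + coff
```

Calling `read_pe_summary(make_sample_header())` yields the machine type and section count of the synthetic header. Nothing is retained once the function returns, which mirrors the lack of persistence noted above.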
-
BARF
A framework that aims to be multi-platform, BARF [6] has also been released as an open-source project. BARF consists of three components: the core of the framework, architecture support and analysis functionality. At the core of the architecture support is the use of the Capstone disassembly engine. This framework provides broader support than that offered by Merces and Weyrich [5] in PEV as it supports multiple file formats, supports the ARM architecture, utilizes an intermediate language and offers a satisfiability modulo theories ("SMT") solver. Tools that utilize the BARF framework have also been developed and include the ability to enumerate return-oriented programming ("ROP") gadgets, produce control-flow graphs of selected functions and generate call graphs of a selected function. Its limitations resemble those of the previous framework: it has no persistence model, lacks arbitrary search of disassembly output, and does not create a project-based profile that links related binaries.
-
ANGR
Angr is a Python-based binary analysis framework [7]. Angr differs from the previous frameworks in that it provides both "static and dynamic symbolic analysis" [7]. A framework that provides symbolic analysis requires additional levels of translation of disassembly output before allowing for user interaction and providing output [8]. As with the previous frameworks, interaction with Angr begins with binary analysis. The output from this stage is the disassembled machine code. Frameworks such as BARF and PEV cease program interpretation at this point and allow for user interaction with the output; it is up to the user to derive meaning from the disassembly. Angr, however, provides additional stages of processing before finishing binary analysis, including the use of an intermediate representation of machine code to provide multi-architecture support. Angr additionally provides a solver engine, machine state emulation, program path analysis, semantic representation and full program analysis. BARED does not intend to provide a feature set like Angr's, but rather to expand analysis capabilities on disassembly output in the manner of frameworks such as PEV and BARF.
-
LIMITATIONS ADDRESSED BY BARED
BARED attempts to address several of the limitations discussed earlier in this section. BARED is not intended to provide features or address limitations in frameworks such as Angr, which employ symbolic or dynamic symbolic analysis. Instead, BARED was designed to address limitations in frameworks like PEV and BARF, which produce disassembly as their final output. The primary interaction with such frameworks is in analyzing this output, which provides the context in which the analyst derives program meaning and behavior.
BARED addresses several key limitations in current binary analysis frameworks. The framework uses an architecture that stores analyzed data using a persistent open-source database system. The benefits of this design are: reduced time in performing program analysis, an architecture for grouping related binaries and a novel search architecture based on...
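The general idea of storing disassembly in a relational database, grouping related binaries under a project, and searching across them can be sketched as follows. This is a minimal illustration under stated assumptions, not BARED's actual design: SQLite is substituted for whichever database system the framework uses, and the table names, column names, helper functions and sample instructions are all hypothetical.

```python
import sqlite3

# Hypothetical schema: names are illustrative assumptions, not BARED's design.
SCHEMA = """
CREATE TABLE project (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE binary_file (
    id         INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES project(id),
    name       TEXT NOT NULL
);
CREATE TABLE instruction (
    id        INTEGER PRIMARY KEY,
    binary_id INTEGER NOT NULL REFERENCES binary_file(id),
    address   INTEGER NOT NULL,
    mnemonic  TEXT NOT NULL,
    operands  TEXT NOT NULL
);
CREATE INDEX idx_instruction_mnemonic ON instruction(mnemonic);
"""


def build_demo_db(path=":memory:"):
    """Create the schema and load a few made-up instructions.

    Passing a file path instead of ':memory:' would persist the data
    between sessions, so a binary need only be disassembled once.
    """
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO project (id, name) VALUES (1, 'demo-project')")
    conn.executemany(
        "INSERT INTO binary_file (id, project_id, name) VALUES (?, ?, ?)",
        [(1, 1, "app.exe"), (2, 1, "helper.dll")],
    )
    rows = [  # (binary_id, address, mnemonic, operands), sample data only
        (1, 0x401000, "push", "ebp"),
        (1, 0x401001, "mov", "ebp, esp"),
        (1, 0x401003, "call", "0x402000"),
        (2, 0x10001000, "xor", "eax, eax"),
        (2, 0x10001002, "call", "0x10002000"),
        (2, 0x10001007, "ret", ""),
    ]
    conn.executemany(
        "INSERT INTO instruction (binary_id, address, mnemonic, operands) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def find_by_mnemonic(conn, project_name, mnemonic):
    """Search every binary in a project for instructions with a mnemonic."""
    query = """
        SELECT b.name, i.address, i.operands
        FROM instruction AS i
        JOIN binary_file AS b ON b.id = i.binary_id
        JOIN project AS p ON p.id = b.project_id
        WHERE p.name = ? AND i.mnemonic = ?
        ORDER BY b.name, i.address
    """
    return conn.execute(query, (project_name, mnemonic)).fetchall()
```

For example, `find_by_mnemonic(build_demo_db(), "demo-project", "call")` returns the call sites from both sample binaries in a single query. A search spanning related binaries is the kind of operation that frameworks without a persistence model do not readily support.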
-
COPYRIGHT GALE, Cengage Learning. All rights reserved.