Algorithm
1 depicts the algorithm of ESF. ESF requires additional preprocessing steps, such as acquiring firmware or identifying interfaces. It then applies traditional fuzzing steps, such as input generation, sending inputs, and target monitoring. Table
6 presents the fuzzing steps of popular embedded system fuzzers. We provide their details in the following subsections.
5.1 Firmware Acquisition
Although some works [
20,
28,
49,
145] gathered many firmware images via a web crawler, all firmware images are difficult to collect as they are companies’ property. Sometimes, firmware images can be extracted via UART or JTAG, but device vendors tend to disable debug ports nowadays. Another method, dumping flash memory of embedded devices, is sometimes possible. However, it requires desoldering the flash memory and hence becomes more difficult. This firmware acquisition is critical to emulation-based fuzzing as subsequent works can no longer proceed. In summary, firmware acquisition is an indispensable step in emulation-based fuzzing.
To overcome this limitation, several direct fuzzing techniques [
27,
86,
138] have been investigated. IoTFuzzer [
22] is targeted at network protocols for IoT devices. In the case of network-based fuzzing, fuzzing targets are mainly network applications visible from the network. This narrow view is a limitation of the network-based fuzzer, whereas fuzzing through firmware emulation can test any firmware applications. In addition, debug ports, such as UART or JTAG, can be used to connect a fuzzing system to the target device. For example, [
12,
83] used the UART connection for a fuzzer, and Avatar2 [
86] used a debug interface when it orchestrated an emulator with the actual device.
5.4 Fuzzing
Many publications and surveys [
74,
76,
82] on traditional fuzzing have intensively described fuzzing algorithms and techniques. Meanwhile, most ESF publications have briefly described them as setting up a fuzzing environment, such as firmware acquisition or emulation, is more important. Therefore, we briefly discuss fuzzing techniques in this section, based on the fuzzing features described in ESF articles.
The first type of ESF is network fuzzing. As depicted in Table
4, Prospect, FirmFuzz, and IoTFuzzer include network fuzzers. Prospect [
61] uses black-box fuzzing, which sets up a proxy to connect an emulation environment to an actual embedded system. The proxy captures network packets, and the inside fuzzer generates random traffic. Specifically, it takes packets from the captured network traffic, randomizes 1 byte at an arbitrary position, and replays the communication without a skip. This is a typical black-box fuzzing operation, but it is not smart. Accordingly, a smart fuzzer, FirmFuzz [
115], was proposed. FirmFuzz adapts a generation-based gray-box fuzzer. It conducts static analysis and uses a command-line browser free from creating standard HTTP packets, which easily creates a generation-based fuzzer. IoTFuzzer [
22] conducted a dynamic analysis to recognize the message of the IoT application and mutated the message to formulate test cases for the target device. This mutation is both a simple method without complex protocol analysis and a useful technique as it does not require consideration of encryption.
Table
8 presents a comparison of network fuzzers for embedded systems. Prospect uses simple black-box fuzzing and does not need preprocessing, so it is fast, but the result is unsatisfactory. Meanwhile, FirmFuzz leverages static analysis for smart fuzzing to obtain satisfactory results. Finally, IoTFuzzer requires dynamic IoT application analysis; thus, it is relatively slow, but it exhibits promising results. In summary, every network fuzzer has pros and cons depending on its policy and purpose.
The second type of embedded fuzzing is symbolic execution [
9]. Symbolic execution regards a variable as a symbol. When it meets a path constraint, the symbol can be any value that satisfies the constraint. Consequently, theoretically, symbolic execution explores all paths in a program. For example, FIE [
32] uses the KLEE [
19] symbolic execution engine to present an extensible inspection of firmware programs. However, symbolic execution has the disadvantage of
state explosion, which produces numerous program paths that it should explore. FIE addresses this problem through state pruning and memory smudging [
32]. Several solutions for the state explosion problem have been proposed to date, but they are beyond the scope of this article.
Hybrid fuzzing combines fuzzing and concolic (or symbolic) execution to achieve broader and deeper program testing coverage [
56,
81,
97,
117,
137,
144]. Thus far, no hybrid fuzzer is available in the ESF domain. However, several hybrid fuzzers in traditional fuzzing have shown promising results. For instance, QSYM [
137] implemented a fast concolic execution engine that leveraged the native execution with symbolic emulation to overcome the bottlenecks of the concolic executors. As hybrid fuzzers suffer from finding vulnerabilities in real-world applications, QSYM proposed such a solution. Driller [
117], another example, augmented fuzzing with selective concolic execution to find deeper bugs. It used fuzzing as an exerciser executing blocks of an application and concolic execution as an input generator that satisfies the path constraints. Consequently, Driller avoided path explosion and found the vulnerabilities successfully.
As coverage-based gray-box fuzzers [
17,
139] have shown exemplary performance in traditional fuzzing, they can be adapted to ESF. The coverage-based fuzzer measures the code coverage by calculating the proportion of executed basic blocks and total basic blocks of the program, and then it uses this information to identify unvisited program blocks (i.e., widen the code coverage). The coverage-based fuzzer aims to test every path branch of the program (i.e., PUT). A typical example of these fuzzers is the Firm-AFL [
145]. This fuzzer presents high throughput as its augmented process emulation renders it possible to test a target program in user-mode emulation quickly. However, as better coverage-based fuzzers exist in the traditional fuzzing area, we expect better coverage-based embedded fuzzers be proposed in the future.
While the feature of coverage-based fuzzers is to expand the code coverage, directed fuzzers [
16,
21,
41,
126] spend most of their execution time on arriving at specific target points (e.g., bug-suspicious area). For example, FirmCorn [
49] conducts directed fuzzing using a vulnerable code search algorithm. Owing to the effectiveness of the vulnerable code search algorithm, FirmCorn could achieve a very efficient time to crash. Directed fuzzing in traditional fuzzing research has two types: directed gray-box fuzzing and directed white-box fuzzing.
The authors of AFLGo [
16] introduced directed gray-box fuzzing, which generates inputs guiding the fuzzer to the target points. They proposed a new power scheduler that assigns more power to test inputs that lead to the target points. In another example, Hawkeye [
21] conducted a static analysis to gather information on the program and target locations and then executed the program along with the information. This strategy renders the Hawkeye fulfill fuzzing toward the target and shows better results.
BuzzFuzz [
41] leveraged dynamic taint tracing to fuzz an instrumented program and then ran the program on the generated inputs to determine whether the inputs contained any bugs. This method enabled random fuzzing to explore a deep program code while preserving the semantic form of the input. This method is useful when a user identifies the program attack points, but it cannot detect unexpected bugs and takes a long time to test many points. In another example, TaintScope [
126] used dynamic taint analysis and symbolic execution to bypass checksum mechanisms that block program execution from reaching the deep program code section. The execution monitor in TaintScope identified which input bytes control the arguments and which input bytes are related to the checksum. Then, it generated a bypass input. This method is helpful when a program has checksum mechanisms, but this requires preprocessing overhead (i.e., dynamic taint tracing). Moreover, white-box fuzzing has a disadvantage in that it takes a long time to explore all possible paths.
5.5 Target Monitoring
As described in Section
3.3, crashes in the embedded system are difficult to detect. Therefore, several response-monitoring methods have been proposed. First, network heartbeat checking is widely used for network-based fuzzers to detect some embedded systems’ hang or no effect. Second, crash detection methods in traditional fuzzing can be used in ESF, which are signal handlers or memory sanitizers [
109,
116] in an emulator. For example, signals such as signal 11 (segmentation violation) or signal 10 (SIGUSR) are issued when a fuzzer finds crashes. The monitoring module in a fuzzer catches and handles this signal. Third, some embedded devices reboot when they encounter memory corruption. This phenomenon commonly occurs in Type 3 devices, but the fuzzer does not handle it automatically.
Table
4 shows how related pieces of research monitor the crash. First, network-based fuzzers, such as IoTFuzzer [
22], check the liveness of a target device. It guesses program liveness by sending an arbitrary live check message. Muench et al. [
87] also used a liveness check. Notably, they analyzed a target device’s response behaviors thoroughly and classified the responses into six types: observable crash, reboot, hang, late crash, malfunctioning, and no effect. FirmFuzz [
115] detects vulnerabilities by monitoring the logs generated by emulating firmware. This technique is advantageous when it emulates the target firmware. Further, it can detect other vulnerabilities such as command injection [
118], null pointer dereference [
37], and cross-site scripting [
127] as these vulnerabilities can be detected by execution logs. Another detection method of memory corruption is conducted by a signal handler [
61,
145]. They monitored the program execution in an emulator and detected signals such as segmentation fault. This method has been used widely in traditional fuzzing and is better used with
AddressSanitizer (ASAN) [
109]. ASAN can also help detect a memory error by using compiler instrumentation and runtime libraries.
5.7 Seed Scheduling
After analyzing the exceptions, fuzzers generate seeds and select the next seed. Coverage-based fuzzers generate new seeds to expand the code coverage, and directed fuzzers generate new seeds to keep closer to the target points. For instance, Firm-AFL [
145] uses lightweight instrumentation to widen the branch coverage, and this instrumentation includes coarse branch-taken hit counts in calculating the branch coverage [
139]. Another fuzzer, FirmCorn [
49], calculates vulnerability feature ranking of the API functions in the static analysis stage, hooks the suspicious functions, and mutates the functions’ inputs in order to widen the code coverage. In particular, FirmCorn uses heuristic-based mutations such as the bit-flip operation. Meanwhile, AFLGo [
16] (i.e., a directed gray-box fuzzer) instruments the distance between the chosen seed and the set of target points. Then, it calculates the shortest path to the target points in the
control flow graph (CFG) and executes the program along the path with a triggering seed. This triggering seed guides the fuzzer to a specific program point.