Fuzzing - Or the Infinite Monkey Theorem

This is how the "Fuzzware" system by HGI researchers Tobias Scharnowski and Thorsten Holz works.

The researchers evaluate their fuzzer’s code coverage, i.e. how much of the program code can be analysed with their tool. The result: code coverage is higher by a factor of 4 than with other algorithms. Copyright: CASA, Michael Schwettmann

The Bochum team is interested in the firmware of industrial control units. Copyright: CASA, Michael Schwettmann

Tobias Scharnowski is a PhD student at the Horst Görtz Institute for IT Security. Copyright: CASA, Michael Schwettmann

Thorsten Holz is one of the Principal Investigators of the Cluster of Excellence CASA. Copyright: CASA, Michael Schwettmann

Program code is a bit like a jungle: complex in structure, difficult to view from the outside, with countless paths that can be taken through it. Finding vulnerabilities in such code is like looking for animals among the trees in the jungle: you know they are there, but you can’t see them directly. This is why PhD student Tobias Scharnowski is developing new methods to efficiently detect programming errors in the jungle of ones and zeros. He is conducting research at the Chair for System Security at the Horst Görtz Institute for IT Security of Ruhr-Universität Bochum, supervised by Professor Thorsten Holz.

The researchers are primarily interested in embedded systems: “We are trying to increase the security of computers that most people don’t even know are computers at all,” explains Scharnowski. Examples of such embedded systems include smart light bulbs, refrigerators connected to the internet and intelligent thermostats, to name but a few. All these objects contain electronic control technology with many lines of program code in which errors may have crept in. But household appliances are not the only things on the IT experts’ agenda. Above all, they are interested in industrial control systems, for example in critical infrastructures such as energy supply. These are areas where security gaps could have dramatic consequences.

Crashing the software on purpose
Scharnowski and Holz use what is known as fuzzing to detect errors in program code. Fuzzers are algorithms that feed the tested software with random inputs and check whether they can crash the application with them. Such crashes indicate programming errors. The fuzzer keeps varying the input in order to explore as many program components as possible step by step.
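The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' tool: the target function parse_packet, its deliberate bug, and the helpers mutate and fuzz are all hypothetical, and a real fuzzer would additionally use coverage feedback to decide which inputs to keep.

```python
import random


def parse_packet(data: bytes) -> int:
    """Hypothetical target: a toy parser with a deliberate bug."""
    if len(data) < 2:
        return 0
    length = data[0]
    if data[1] == 0x7F:
        # Bug: trusts the length field without checking the buffer size.
        return data[2 + length]  # may raise IndexError (the "crash")
    return length


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Vary the input: overwrite one random byte with a random value."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] = rng.randrange(256)
    return bytes(buf)


def fuzz(target, seed_input: bytes, iterations: int, rng_seed: int = 0):
    """Run the target on mutated inputs and collect those that crash it."""
    rng = random.Random(rng_seed)
    crashes = []
    current = seed_input
    for _ in range(iterations):
        candidate = mutate(current, rng)
        try:
            target(candidate)
        except Exception as exc:
            # An unhandled exception plays the role of a crash here.
            crashes.append((candidate, type(exc).__name__))
        else:
            current = candidate  # keep exploring from a non-crashing input
    return crashes


crashes = fuzz(parse_packet, seed_input=bytes(4), iterations=20000)
print(f"{len(crashes)} crashing inputs found")
```

Every entry in crashes is an input that made the toy parser fail, which is the signal a tester would then investigate as a possible programming error.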

Fuzzing is already established for certain areas of application, for example to test operating systems such as Windows or Linux. It has not yet been widely used to test embedded systems, however, because they pose a number of challenges: The software – the so-called firmware – is embedded in hardware with which it interacts. Researchers usually have little information about the hardware and how it works. “It’s like a black box for us,” describes Thorsten Holz. In addition, this black box is usually not particularly powerful – often the systems have relatively little memory and slow processors. This is a problem if the researchers want to carry out fuzzing directly on the system. It would take far too long to try out all possible inputs and wait for the system’s response.

Virtual imitation of hardware
This is why the team doesn’t analyse the firmware directly in the industrial control unit or in the light bulb. Instead, they recreate the hardware virtually – this process is called emulation. The emulator makes the firmware believe that it is inside the real device. For this, it has to interact with the program in exactly the same way as the real hardware would. “This means we have to imitate all the interfaces that exist between hardware and firmware,” explains Thorsten Holz. Once this is accomplished, the researchers can test the firmware in a powerful system.
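To make the idea of imitating an interface concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the firmware talks to its hardware by reading and writing numbered registers; the class EmulatedBus, the register addresses and the firmware_step logic are hypothetical and are not taken from Fuzzware.

```python
class EmulatedBus:
    """Stands in for the hardware: answers register reads from fuzz input."""

    def __init__(self, fuzz_input: bytes):
        self._input = fuzz_input
        self._cursor = 0
        self.writes = []  # record of (address, value) pairs the firmware wrote

    def read(self, address: int) -> int:
        # Serve the next input byte; report 0 once the input is exhausted.
        if self._cursor >= len(self._input):
            return 0
        value = self._input[self._cursor]
        self._cursor += 1
        return value

    def write(self, address: int, value: int) -> None:
        self.writes.append((address, value))


# Hypothetical register addresses of a refrigerator controller.
TEMP_SENSOR = 0x4000_0000
COMPRESSOR = 0x4000_0004


def firmware_step(bus) -> None:
    """Hypothetical firmware logic: switch the compressor on above 8 degrees."""
    temperature = bus.read(TEMP_SENSOR)
    bus.write(COMPRESSOR, 1 if temperature > 8 else 0)


bus = EmulatedBus(bytes([5, 12]))
firmware_step(bus)
firmware_step(bus)
print(bus.writes)
```

The firmware logic cannot tell whether a real sensor or the emulator answered its read, so the same code can run on a fast desktop machine while the fuzzer supplies the "sensor" values.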
Still, it would take a long time if they let their fuzzer try out all theoretically conceivable inputs. That’s why the researchers add another step to the fuzzing process by narrowing down the possible inputs. First, they model the framework in which the inputs must be located in order to be logical for the firmware. For example: let’s assume that the hardware is a refrigerator with a temperature sensor. The refrigerator hardware can report the measured temperatures to the refrigerator’s software, i.e., its firmware. Realistically, it’s not possible for just any given temperature to occur, it has to fall within a certain range. Therefore, the firmware is only programmed for a certain temperature range. It could not process other values at all, so there is no need to fuzz them.
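The refrigerator example can be sketched as a small Python model. The register width of 16 bits and the temperature range of -30 to 50 degrees are assumptions chosen for illustration, and RangeModel is a hypothetical helper; Fuzzware's actual input models are described in the original publication and may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeModel:
    """Describes the values the firmware is assumed to expect from a sensor."""
    low: int
    high: int

    def size(self) -> int:
        return self.high - self.low + 1

    def apply(self, raw: int) -> int:
        """Fold an arbitrary raw fuzz value into the modelled range."""
        return self.low + (raw % self.size())


RAW_VALUES = 2 ** 16                  # assumed: a 16-bit sensor register
model = RangeModel(low=-30, high=50)  # assumed: plausible temperatures

values = [model.apply(raw) for raw in (0, 80, 81, 65535)]
discarded = 1 - model.size() / RAW_VALUES
print(values, f"{discarded:.1%} of raw values never need to be tried")
```

Whatever the fuzzer generates, the firmware only ever sees a temperature inside the modelled range, so no time is spent on readings it was never programmed to handle. The share printed here follows only from the assumed numbers and is unrelated to the figures the researchers report.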

Limited inputs facilitate efficient analysis
“We only use the inputs in the fuzzing process that the firmware expects and can handle,” points out Thorsten Holz and compares the process to the Infinite Monkey Theorem: “This theorem states that, if you let monkeys type on a keyboard for long enough, they would eventually come up with the works of Shakespeare.” The same applies to the fuzzer: if you let it try again and again, it would, by chance, eventually use meaningful inputs. But it would take a long time. “We want to make our monkeys a bit more intelligent, though,” says Tobias Scharnowski. “We take away all the keys they don’t need and try to get them to press only useful keys. With the inputs that are left, we can still test the code all the way down.” This makes fuzzing with the Bochum system – known as Fuzzware – particularly efficient.
Together with colleagues from Santa Barbara and Amsterdam, the Bochum team tested 77 firmware images using Fuzzware. Compared to conventional fuzzing methods, they were able to rule out up to 95.5 per cent of all possible inputs. This enables Fuzzware to check up to three times more of the program code than conventional methods in the same amount of time. In the process, the group also identified additional vulnerabilities that had remained undetected with other fuzzing methods.

Vulnerabilities are always there
“You can always find something,” says Thorsten Holz. “If a system has never been tested with fuzzing, it will have undiscovered vulnerabilities.” In the case of embedded systems in particular, it is almost impossible for programmers to create the perfect code. “In order to talk to the hardware of embedded systems, you have to use a low-level programming language,” explains Tobias Scharnowski. For many applications, programmers can’t simply fall back on code snippets that have been developed for other applications. They have to build their code from scratch. Edge cases – namely states that the system rarely encounters – may then not be taken into account. “For our fuzzers, however, these states are easy to analyse,” says Scharnowski. “They can therefore help make the systems more robust.” By reporting any vulnerability they identify to the manufacturers, the researchers contribute to greater security in industry, light bulbs and refrigerators, to name but a few.

Original publication
Tobias Scharnowski, Nils Bars, Moritz Schloegel, Eric Gustafson, Marius Muench, Giovanni Vigna, Christopher Kruegel, Thorsten Holz, Ali Abbasi: Fuzzware: Using precise MMIO modeling for effective firmware fuzzing, 31st Usenix Security Symposium, Boston, USA, 2022

The article is published as part of the IT security special issue of the science magazine Rubin 2022/23.
