Addressing the Saturation Effect in Compiler Testing

12/5/2023, 4:30pm

Speaker

Cristian Cadar

Abstract

Compiler testing techniques such as Csmith have shown remarkable results, with hundreds of bugs being found in mature compilers such as GCC and LLVM. Despite their success, these techniques show signs of saturation, i.e. they are less able to generate programs that trigger further compiler bugs.

In the context of compilers for languages with extensive undefined behaviour, such as C/C++, we identify two key reasons for this saturation effect: the restrictive nature of the programs they generate, which must meet various constraints; and the blackbox nature of the testing techniques, which receive no feedback from the compilers being tested.

In this talk, I present two approaches that address this saturation effect in compiler testing. First, we show that by relaxing the constraints imposed during generation, we can create programs that find bugs which are beyond the reach of the original techniques. Second, we show that greybox fuzzing of compilers for languages with extensive undefined behaviour, particularly C/C++, is possible, by devising custom semantics-aware mutations.

This is based on joint work with Karine Even-Mendoza, Arindam Sharma and Alastair Donaldson.

Bio

Cristian Cadar is a Professor in the Department of Computing at Imperial College London, where he leads the Software Reliability Group (http://srg.doc.ic.ac.uk), working on automatic techniques for increasing the reliability and security of software systems. Cristian's research has been recognised by several prestigious awards, including the IEEE TCSE New Directions Award, BCS Roger Needham Award, HVC Award, EuroSys Jochen Liedtke Award, and two test-of-time awards. Many of the research techniques he co-authored have been used in both academia and industry. In particular, he is co-author and maintainer of the KLEE symbolic execution system, a popular system with a large user base. Cristian has a PhD in Computer Science from Stanford University, and undergraduate and Master's degrees from the Massachusetts Institute of Technology.