Thomas Durieux

PhD student in software engineering. He focuses on automatic techniques to fix software in production environments.

Experience

  • KTH
    / Apr. 2018 - Jun. 2018

    PhD internship - International internship at KTH in the Theoretical Computer Science department.
  • KTH
    / Sep. 2017 - Dec. 2017

    PhD internship - International internship at KTH in the Theoretical Computer Science department.
  • INRIA
    / Sep. 2015 - Aug. 2018

    PhD Student - The motivation of this PhD is to create new techniques to automatically fix software in production environments.
  • INRIA
    / Sep. 2014 - Aug. 2015

    Research intern - I worked on the Nopol project, a test-suite-based tool for automated bug repair which outputs patches. First, relying on the Java PathFinder (JPF) library, I integrated symbolic execution into Nopol in order to extend its repair scope to buggy arithmetic statements. Second, I worked on the synthesis of patches based on the context of buggy statements. I used the approach proposed in CodeHint, a tool developed at UC Berkeley that provides dynamic and interactive synthesis of code snippets.
  • CERN
    / Summer 2014

    Summer student - I joined the CERN security team, where I developed a set of tools that scan thousands of CERN web servers in order to detect misconfigurations. First, I had to determine the common types of misconfigurations present in the CERN network. Second, I implemented the 8 detection tools, focusing on the reduction of false positives. Third, I presented the results to the security team.
  • Microsoft Innovation Center
    / Feb. - Jun. 2013

    Industrial entrepreneurship at Aproove - To create a distributed version of the Aproove product, a leading annotation and validation tool for high-resolution graphic documents, I developed, on a very tight schedule, a new backend system for managing multiple instances of the Aproove product. This system allows database management and Aproove instance management on multiple hosts.

Teaching

  • University of Lille
    / Sep. 2016 - Jun. 2018

    Software Engineering (Master 1) and Algorithms and Programming (Bachelor 1)

PC Member

  • SANER
    / 2018

    Early Research Achievement Track

Education

Publications

  • Fully Automated HTML and Javascript Rewriting for Constructing a Self-healing Web Proxy
    / 2018

    Durieux, T., Hamadi, Y., and Monperrus, M.
    Over the last few years, the complexity of web applications has increased to provide more dynamic web applications to users. The drawback of this complexity is the growing number of errors in the front-end applications. In this paper, we present BikiniProxy, a novel technique to provide self-healing for the web. BikiniProxy is designed as an HTTP proxy that uses five self-healing strategies to rewrite the buggy HTML and Javascript code. We evaluate BikiniProxy with a new benchmark of 555 reproducible Javascript errors, DeadClick. We create DeadClick by randomly crawling the Internet and collecting all web pages that contain Javascript errors. Then, we observe how BikiniProxy heals those errors by collecting and comparing the traces of the original and healed pages. To sum up, BikiniProxy is a novel fully-automated self-healing approach that is specific to the web, evaluated on 555 real Javascript errors, and based on original self-healing rewriting strategies for HTML and Javascript.

    PDF | Source Code
  • Alleviating Patch Overfitting with Automatic Test Generation: A Study of Feasibility and Effectiveness for the Nopol Repair System
    / 2018

    Yu, Z., Martinez, M., Danglot, B., Durieux, T., and Monperrus, M. Empirical Software Engineering.
    Among the many different kinds of program repair techniques, one widely studied family of techniques is called test suite based repair. However, test suites are in essence input-output specifications and are thus typically inadequate for completely specifying the expected behavior of the program under repair. Consequently, the patches generated by test suite based repair techniques can just overfit to the used test suite, and fail to generalize to other tests. We deeply analyze the overfitting problem in program repair and give a classification of this problem. This classification will help the community to better understand and design techniques to defeat the overfitting problem. We further propose and evaluate an approach called UnsatGuided, which aims to alleviate the overfitting problem for synthesis-based repair techniques with automatic test case generation. The approach uses additional automatically generated tests to strengthen the repair constraint used by synthesis-based repair techniques. We analyze the effectiveness of UnsatGuided: 1) analytically with respect to alleviating two different kinds of overfitting issues; 2) empirically based on an experiment over the 224 bugs of the Defects4J repository. The main result is that automatic test generation is effective in alleviating one kind of overfitting issue–regression introduction, but due to oracle problem, has minimal positive impact on alleviating the other kind of overfitting issue–incomplete fixing.

    PDF | Source Code
  • Exhaustive Exploration of the Failure-oblivious Computing Search Space
    / 2018

    Durieux, T., Hamadi, Y., Yu, Z., and Monperrus, M. Proceedings of the 11th IEEE Conference on Software Testing, Validation and Verification (ICST'18)
    High availability of software systems requires automated handling of crashes in the presence of errors. Failure-oblivious computing is one technique that aims to achieve high availability. We note that failure-obliviousness has not been studied in depth yet, and there are very few studies that help understand why failure-oblivious techniques work. For failure-oblivious computing to have an impact in practice, we need to deeply understand failure-oblivious behaviors in software. In this paper, we study, design and perform an experiment that analyzes the size and the diversity of the failure-oblivious behaviors. Our experiment consists of exhaustively computing the search space of 16 field failures of large-scale open-source Java software. The outcome of this experiment is a much better understanding of what really happens when failure-oblivious computing is used, and this opens new promising research directions.

    PDF | Source Code | Slide
  • Dissection of a Bug Dataset: Anatomy of 395 Patches from Defects4J
    / 2018

    Sobreira, V., Durieux, T., Madeiral, F., Monperrus, M., and Maia, M. A. Proceedings of the 25th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER '18)
    Well-designed and publicly available datasets of bugs are an invaluable asset to advance research fields such as fault localization and program repair, as they allow direct and fair comparison between competing techniques and also the replication of experiments. These datasets need to be deeply understood by researchers: the answer to questions like "which bugs can my technique handle?" and "for which bugs is my technique effective?" depends on the comprehension of properties related to bugs and their patches. However, such properties are usually not included in the datasets, and there is still no widely adopted methodology for characterizing bugs and patches. In this work, we deeply study 395 patches of the Defects4J dataset. Quantitative properties (patch size and spreading) were automatically extracted, whereas qualitative ones (repair actions and patterns) were manually extracted using a thematic analysis-based approach. We found that 1) the median size of Defects4J patches is four lines, and almost 30% of the patches contain only addition of lines; 2) 92% of the patches change only one file, and 38% have no spreading at all; 3) the top-3 most applied repair actions are addition of method calls, conditionals, and assignments, occurring in 77% of the patches; and 4) nine repair patterns were found for 95% of the patches, where the most prevalent, appearing in 43% of the patches, is on conditional blocks. These results are useful for researchers to perform advanced analysis on their techniques' results based on Defects4J. Moreover, our set of properties can be used to characterize and compare different bug datasets.

    PDF | Source Code | Website
  • Production-Driven Patch Generation
    / 2017

    Durieux, T., Hamadi, Y., and Monperrus, M. ICSE NIER
    We present an original concept for patch generation: we propose to do it directly in production. Our idea is to generate patches on-the-fly based on automated analysis of the failure context. By doing this in production, the repair process has complete access to the system state at the point of failure. We propose to perform live regression testing of the generated patches directly on the production traffic, by feeding a sandboxed version of the application with a copy of the production traffic, the "shadow traffic". Our concept widens the applicability of program repair because it removes the requirement of having a failing test case.

    PDF | Slide
  • Dynamic Patch Generation for Null Pointer Exceptions Using Metaprogramming
    / 2017

    Durieux, T., Cornu, B., Seinturier, L., and Monperrus, M. IEEE International Conference on Software Analysis, Evolution and Reengineering
    Null pointer exceptions (NPE) are the number one cause of uncaught crashing exceptions in production. In this paper, we aim at exploring the search space of possible patches for null pointer exceptions with metaprogramming. Our idea is to transform the program under repair with automated code transformation, so as to obtain a metaprogram. This metaprogram contains automatically injected hooks, that can be activated to emulate a null pointer exception patch. This enables us to perform a fine-grain analysis of the runtime context of null pointer exceptions. We set up an experiment with 16 real null pointer exceptions that have happened in the field. We compare the effectiveness of our metaprogramming approach against simple templates for repairing null pointer exceptions.

    PDF | Slide | Source Code | Experiment Results
  • Automatic Repair of Real Bugs in Java: A Large-Scale Experiment on the Defects4J Dataset
    / 2016

    Martinez, M., Durieux, T., Sommerard, R., Xuan, J., and Monperrus, M. Empirical Software Engineering.
    Defects4J is a large, peer-reviewed, structured dataset of real-world Java bugs. Each bug in Defects4J comes with a test suite and at least one failing test case that triggers the bug. In this paper, we report on an experiment to explore the effectiveness of automatic test-suite based repair on Defects4J. The result of our experiment shows that the considered state-of-the-art repair methods can generate patches for 47 out of 224 bugs. However, those patches are only test-suite adequate, which means that they pass the test suite and may potentially be incorrect beyond the test-suite satisfaction correctness criterion. We have manually analyzed 84 different patches to assess their real correctness. In total, 9 real Java bugs can be correctly repaired with a test-suite based repair. This analysis shows that test-suite based repair suffers from under-specified bugs, for which trivial or incorrect patches still pass the test suite. With respect to practical applicability, it takes on average 14.8 minutes to find a patch. The experiment was done on a scientific grid, totaling 17.6 days of computation time. All the repair systems and experimental results are publicly available on Github in order to facilitate future research on automatic repair.

    PDF | Experiment Results | Slide
  • Nopol: Automatic Repair of Conditional Statement Bugs in Java Programs
    / 2016

    Xuan, J., Martinez, M., Demarco, F., Clément, M., Marcote, S.L., Durieux, T., Le Berre, D., and Monperrus, M. IEEE Transactions on Software Engineering, Institute of Electrical and Electronics Engineers.
    We propose Nopol, an approach to automatic repair of buggy conditional statements (i.e., if-then-else statements). This approach takes a buggy program as well as a test suite as input and generates a patch with a conditional expression as output. The test suite is required to contain passing test cases to model the expected behavior of the program and at least one failing test case that reveals the bug to be repaired. The process of Nopol consists of three major phases. First, Nopol employs angelic fix localization to identify expected values of a condition during the test execution. Second, runtime trace collection is used to collect variables and their actual values, including primitive data types and object-oriented features (e.g., nullness checks), to serve as building blocks for patch generation. Third, Nopol encodes these collected data into an instance of a Satisfiability Modulo Theory (SMT) problem; then a feasible solution to the SMT instance is translated back into a code patch. We evaluate Nopol on 22 real-world bugs (16 bugs with buggy if conditions and six bugs with missing preconditions) on two large open-source projects, namely Apache Commons Math and Apache Commons Lang. Empirical analysis on these bugs shows that our approach can effectively fix bugs with buggy if conditions and missing preconditions. We illustrate the capabilities and limitations of Nopol using case studies of real bug fixes.

    PDF | Source Code | Experiment Results
  • DynaMoth: Dynamic Code Synthesis for Automatic Program Repair
    / 2016

    Durieux, T., and Monperrus, M. 11th International Workshop on Automation of Software Test (AST 2016), May 2016, Austin, United States.
    Automatic software repair is the process of automatically fixing bugs. The Nopol repair system repairs Java code using code synthesis. We have designed a new code synthesis engine for Nopol based on dynamic exploration, called DynaMoth. The main design goal is to be able to generate patches with method calls. We evaluate DynaMoth over 224 bugs of the Defects4J dataset. The evaluation shows that Nopol with DynaMoth is capable of synthesizing patches and enables Nopol to repair new bugs of the dataset.

    PDF | Slide | Source Code
  • IntroClassJava: A Benchmark of 297 Small and Buggy Java Programs
    / 2016

    Durieux, T., and Monperrus, M.
    Reproducible and comparative research requires well-designed and publicly available benchmarks. We present IntroClassJava, a benchmark of 297 small Java programs, specified by JUnit test cases, and usable by any fault localization or repair system for Java. The dataset is based on the IntroClass benchmark and is publicly available on GitHub.

    PDF | Dataset

Projects

  • BanditRepair
    / 2016

    Researcher
  • DynaMoth
    / 2015

    Researcher - DynaMoth is an open-source research project that aims to generate Java expressions, built from the methods and variables available in a project, that satisfy a specified behaviour.
  • NPEFix
    / 2015

    Researcher - NPEFix is an open-source research project that aims to tolerate null pointer dereferences at runtime.
  • NoPol
    / 2015

    Researcher - NoPol is an automatic software repair tool developed at INRIA Lille.
  • \BlueLaTeX
    / 2014 - current

    Developer - \BlueLaTeX is an open-source project that aims to provide a tool chain for easily writing LaTeX documents collaboratively. I developed the new user interface using the MVC framework Angular.js.
  • SyncTeX JS Parser
    / 2015

    Developer - SyncTeX JS Parser is a SyncTeX parser written in JavaScript. This parser is used in the \BlueLaTeX project.
  • MultiAgents
    / 2015

    Developer - A small multi-agent system written in JavaScript.
  • BibTeX2Wiki
    / 2014

    Developer - BibTeX2Wiki is a small tool that transforms a BibTeX file into Wikipedia-formatted references.
Location:
Lille, France
Email:
thomas|at|durieux.me