Automated Classification of Overfitting Patches with Statically Extracted Code Features

Proceedings of IEEE Transactions on Software Engineering (TSE)

Y. He, J. Gu, M. Martinez, T. Durieux, M. Monperrus



Automatic program repair (APR) aims to reduce the cost of manually fixing defects. However, APR suffers from generating overfitting patches. This paper presents a novel overfitting detection system called ODS. ODS first statically extracts code features from the AST edit script between the generated patch and the buggy program. ODS then automatically learns an ensemble probabilistic model from those features, and the learned model is used to classify and rank new, potentially overfitting patches. We conduct a large-scale experiment to evaluate the effectiveness of ODS on classification and ranking, based on 713 patches for Defects4J. The empirical evaluation shows that ODS correctly detects 57% of overfitting patches and is significantly faster than related work. ODS is easy to apply and can be used as a post-processing step to rank the patches generated by any APR system.
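
The classify-and-rank step described above can be illustrated with a minimal sketch. It is not the ODS implementation: the feature names, the two logistic scorers, their weights, and the 0.5 threshold are all hypothetical, and the weights are hand-picked here rather than learned from data. The sketch only shows the general idea of averaging the probabilities of several models and ordering patches so that the least likely overfitting ones come first.

```python
from dataclasses import dataclass
from math import exp
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]          # hypothetical features taken from an edit script
Model = Callable[[Features], float]  # maps features to P(patch is overfitting)


@dataclass
class Patch:
    patch_id: str
    features: Features


def logistic_model(weights: Features, bias: float) -> Model:
    """Build a toy logistic scorer; the weights are made up, not learned."""
    def predict(features: Features) -> float:
        score = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
        return 1.0 / (1.0 + exp(-score))
    return predict


def ensemble_probability(models: List[Model], features: Features) -> float:
    """Average the overfitting probabilities of all models in the ensemble."""
    return sum(model(features) for model in models) / len(models)


def rank_patches(
    models: List[Model], patches: List[Patch], threshold: float = 0.5
) -> List[Tuple[str, float, bool]]:
    """Return (patch_id, probability, is_overfitting), least suspicious first."""
    scored = [(p.patch_id, ensemble_probability(models, p.features)) for p in patches]
    scored.sort(key=lambda item: item[1])
    return [(pid, prob, prob >= threshold) for pid, prob in scored]


# Hypothetical ensemble and patches, for illustration only.
models = [
    logistic_model({"deleted_statements": 1.0}, bias=-1.0),
    logistic_model({"deleted_statements": 0.5, "inserted_statements": -0.5}, bias=0.0),
]
patches = [
    Patch("p1", {"deleted_statements": 3.0, "inserted_statements": 0.0}),
    Patch("p2", {"deleted_statements": 0.0, "inserted_statements": 1.0}),
]
ranking = rank_patches(models, patches)
for patch_id, probability, is_overfitting in ranking:
    print(patch_id, round(probability, 3), is_overfitting)
```

With these made-up weights, the patch that only deletes statements receives the higher overfitting probability and is ranked last. A real system would learn the models from labeled patches instead of fixing the weights by hand.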

author={Ye, He and Gu, Jian and Martinez, Matias and Durieux, Thomas and Monperrus, Martin},
journal={IEEE Transactions on Software Engineering},
title={Automated Classification of Overfitting Patches with Statically Extracted Code Features},
Last Updated: 28/07/2021