IntroClassJava
If you use IntroClassJava, please cite the following technical report: Thomas Durieux and Martin Monperrus. "IntroClassJava: A Benchmark of 297 Small and Buggy Java Programs". Technical Report hal-01272126, University of Lille, 2016.
@techreport{durieux:hal-01272126,
TITLE = {{IntroClassJava: A Benchmark of 297 Small and Buggy Java Programs}},
AUTHOR = {Durieux, Thomas and Monperrus, Martin},
URL = {https://hal.archives-ouvertes.fr/hal-01272126/document},
INSTITUTION = {{Universite Lille 1}},
YEAR = {2016},
HAL_ID = {hal-01272126},
}
This benchmark has been automatically generated from the IntroClass benchmark for C, part of the Autorepair Benchmark Suite, a joint project between Carnegie Mellon University and the University of Massachusetts (http://repairbenchmarks.cs.umass.edu/).
Main statistics
In the table below, wb and bb appear to stand for the whitebox and blackbox test suites, and ok / ko for programs that pass / fail the corresponding suite; "both ko" counts programs that fail both suites.
Project | # wb ok | # wb ko | # bb ok | # bb ko | # both ko | # program |
---|---|---|---|---|---|---|
digits | 15 | 60 | 24 | 51 | 36 | 75 |
grade | 1 | 88 | 0 | 89 | 88 | 89 |
checksum | 0 | 11 | 4 | 7 | 7 | 11 |
median | 9 | 48 | 6 | 51 | 42 | 57 |
smallest | 7 | 45 | 5 | 47 | 40 | 52 |
syllables | 1 | 12 | 0 | 13 | 12 | 13 |
6 Projects | 33 | 264 | 39 | 258 | 225 | 297 |
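The columns of this table are tied together by simple arithmetic, which makes the figures easy to sanity-check. The sketch below is illustrative only: the rows are copied by hand from the table, and it assumes that "ok" and "ko" partition the programs of each project for a given test suite. The names `ROWS`, `TOTALS` and `check_table` are hypothetical and not part of the benchmark.

```python
# Rows copied by hand from the table above:
# (project, wb_ok, wb_ko, bb_ok, bb_ko, both_ko, programs)
ROWS = [
    ("digits", 15, 60, 24, 51, 36, 75),
    ("grade", 1, 88, 0, 89, 88, 89),
    ("checksum", 0, 11, 4, 7, 7, 11),
    ("median", 9, 48, 6, 51, 42, 57),
    ("smallest", 7, 45, 5, 47, 40, 52),
    ("syllables", 1, 12, 0, 13, 12, 13),
]
# Last row of the table ("6 Projects").
TOTALS = (33, 264, 39, 258, 225, 297)


def check_table(rows, totals):
    """Return a list of inconsistencies found in the table (empty if none)."""
    problems = []
    for name, wb_ok, wb_ko, bb_ok, bb_ko, both_ko, programs in rows:
        # Assumption: ok + ko covers every program of the project.
        if wb_ok + wb_ko != programs:
            problems.append(f"{name}: wb ok + wb ko != programs")
        if bb_ok + bb_ko != programs:
            problems.append(f"{name}: bb ok + bb ko != programs")
        # A program failing both suites is counted in each "ko" column.
        if both_ko > min(wb_ko, bb_ko):
            problems.append(f"{name}: both ko exceeds a single-suite ko count")
    # Each total must equal the sum of its column.
    column_sums = tuple(sum(row[i] for row in rows) for i in range(1, 7))
    if column_sums != totals:
        problems.append(f"column sums {column_sums} != totals {totals}")
    return problems


print(check_table(ROWS, TOTALS) or "table is consistent")
```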
Directory Overview
introclassJava/
├─lib/
│ ├─data/
│ │ └─dataset.xml
│ ├─CToJava.py
│ └─evalIntroClassJava.py
├─dataset/
│ ├─checksum/
│ │ ├─f4a823174201234546789abcdeffff<repository ID hex string>.../
│ │ │ ├─000/
│ │ │ │ └─src/
│ │ │ │   ├─main/
│ │ │ │   │ └─java/introclassJava/
│ │ │ │   │   └─checksum_f4a823174201234546789abcdeffff_000.java
│ │ │ │   └─test/
│ │ │ │     └─java/introclassJava/
│ │ │ │       └─checksum_f4a823174201234546789abcdeffff_000BlackboxTest.java
│ │ │ └─001/
│ │ │   └─<same as above>
│ │ └─09F911029D74E35BD84156C5635688C0<next repository ID hex string>.../
│ └─digits/
└─...
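The layout above (`dataset/<assignment>/<repository ID>/<revision>/src/...`) can be walked programmatically. The following is a minimal sketch, not one of the scripts shipped in `lib/`; `Revision` and `iter_revisions` are hypothetical names, and it assumes that every revision directory contains a `src/` folder as shown in the tree.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class Revision(NamedTuple):
    assignment: str  # e.g. "checksum"
    repository: str  # the student's repository ID (hex string)
    revision: str    # e.g. "000"
    path: Path       # directory of the revision


def _subdirs(directory: Path):
    """Return the sub-directories of `directory`, sorted by name."""
    return sorted(p for p in directory.iterdir() if p.is_dir())


def iter_revisions(dataset_root: Path) -> Iterator[Revision]:
    """Yield every dataset/<assignment>/<repository>/<revision>/ directory."""
    for assignment in _subdirs(dataset_root):
        for repository in _subdirs(assignment):
            for revision in _subdirs(repository):
                # Assumption: a revision is recognisable by its src/ folder.
                if (revision / "src").is_dir():
                    yield Revision(
                        assignment.name, repository.name, revision.name, revision
                    )
```

For example, `sum(1 for _ in iter_revisions(Path("dataset")))` counts the revisions found under a local checkout.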
The folder lib contains the Python scripts used to transform the C dataset into a Java dataset.
The file lib/data/dataset.xml contains the IntroClass dataset converted to XML with the following command:
srcml --language=C --literal --operator --modifier `find IntroClass -name "*.c"` -o IntroClassJava/lib/data/dataset.xml
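The backtick-and-`find` construction in this command depends on the shell. As a hedged alternative, the same argument list can be assembled in Python; this is only a sketch that reuses the flags shown above verbatim, assumes the `srcml` executable is on the `PATH`, and uses hypothetical helper names (`build_srcml_command`, `run_srcml`). Unlike `find`, it sorts the file list and skips directories.

```python
import subprocess
from pathlib import Path
from typing import List


def build_srcml_command(introclass_dir: Path, output_xml: Path) -> List[str]:
    """Build the srcml invocation for every .c file under introclass_dir."""
    # Equivalent of `find IntroClass -name "*.c"`, restricted to regular files.
    c_files = sorted(str(p) for p in introclass_dir.rglob("*.c") if p.is_file())
    return [
        "srcml",
        "--language=C",
        "--literal",
        "--operator",
        "--modifier",
        *c_files,
        "-o",
        str(output_xml),
    ]


def run_srcml(introclass_dir: Path, output_xml: Path) -> None:
    """Run srcml; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_srcml_command(introclass_dir, output_xml), check=True)
```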
The folder dataset contains the assignment programs:
- checksum -- compute a simple checksum of a string
- digits -- reverse the digits of an integer, e.g. "123" -> "321"
- grade -- compute the letter grade corresponding to a percentage
- median -- give the median of three numbers
- smallest -- give the smallest of three numbers
- syllables -- give the number of English syllables in a string
Each subdirectory of an assignment folder corresponds to one student's submitted repository, which contains several revisions. Each revision is a Maven project.
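Because each revision is a Maven project, its test suite can in principle be run with a plain `mvn test` from the revision directory. The sketch below illustrates this under several assumptions: `mvn` is on the `PATH`, the revision directory holds the project's `pom.xml`, and the optional test class name follows the pattern visible in the directory overview. The helper names are hypothetical and are not taken from `evalIntroClassJava.py`.

```python
import subprocess
from pathlib import Path
from typing import List


def maven_test_command(test_class: str = "") -> List[str]:
    """Build a `mvn test` command, optionally limited to one test class."""
    cmd = ["mvn", "-q", "test"]  # -q keeps Maven's output short
    if test_class:
        # Surefire's -Dtest property selects the test class to run.
        cmd.append(f"-Dtest={test_class}")
    return cmd


def run_revision_tests(revision_dir: Path, test_class: str = "") -> bool:
    """Return True if Maven exits with status 0 for the given revision."""
    # Assumption: revision_dir contains the pom.xml of the Maven project.
    result = subprocess.run(maven_test_command(test_class), cwd=revision_dir)
    return result.returncode == 0
```

Since the benchmark consists of buggy programs, a `False` result is the expected outcome for at least one of the test suites of each revision.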