data/glued_time_sha_origin_all.csv: contains the working data set used to address the study research questions. The file reports one entry for each changed churn of each commit.
data/All.csv: contains the working data set used to address the research questions in the original paper, before its extension. The file reports one entry for each changed churn of each commit.
The file columns can be read as follows:
- ID: line id
- PROJECT: project name
- COMMIT: commit sha
- FILE: file name
- FROM: change starting line
- TO: change ending line
- BUGINTRO: whether the change induced a fix
- FUNCCHANGE: whether the change affected a functional construct
- OVERLAP: whether the fix-inducing change and the functional change occur on the same line
- FUNCADD: whether the change added a new functional construct
- BUGANDFUNCADD: whether the change added a new functional construct and induced a fix

The remaining columns have a name in the form BugIntro.Construct.ChangeType where:
- BugIntro is True or False depending on whether the given type of change induced a fix
- Construct is the name of the construct being changed (lambda, listcomp, setcomp, dictcomp, map, reduce, filter)
- ChangeType is either "add" (the construct has been added) or "changed" (the construct has been changed)

Such columns count the number of changes of each kind included in the churn.
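
As a minimal sketch of how the BugIntro.Construct.ChangeType naming scheme can be decoded, the Python snippet below splits such a column name into its three parts. The helper name `parse_count_column` and the example name `True.lambda.add` are illustrative assumptions and are not part of the replication package; check the CSV header for the exact spelling and casing it uses.

```python
# Hypothetical helper, not part of the replication package.
# The construct and change-type names are taken from the column description above.
CONSTRUCTS = {"lambda", "listcomp", "setcomp", "dictcomp", "map", "reduce", "filter"}
CHANGE_TYPES = {"add", "changed"}


def parse_count_column(name):
    """Split 'BugIntro.Construct.ChangeType' into (bool, str, str).

    Returns None when the name does not follow the scheme (e.g. 'PROJECT').
    """
    parts = name.split(".")
    if len(parts) != 3:
        return None
    bug_intro, construct, change_type = parts
    # Assumption: BugIntro is spelled exactly 'True' or 'False' in the header.
    if bug_intro not in ("True", "False"):
        return None
    if construct not in CONSTRUCTS or change_type not in CHANGE_TYPES:
        return None
    return (bug_intro == "True", construct, change_type)
```

Under these assumptions, `parse_count_column("True.lambda.add")` yields `(True, "lambda", "add")`, while a plain column such as `PROJECT` yields `None`, so the function can also be used to separate the count columns from the descriptive ones.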
data/ToSample.csv: contains the list of fixing changes affecting lines where a functional construct directly induced a fix. This is the population from which the sample for the manual analysis was drawn. The file contains the following columns:
- PROJECT: project name
- COMMIT: fix commit sha
- FILE: file affected by the fix of the functional construct
- LINES: list of affected lines (separated by semicolons)
- MESSAGE: commit message
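
The LINES column packs several values into one field, so it usually needs to be split before use. The Python sketch below shows one way to do that; the function name `parse_lines_field` is hypothetical, and it assumes each entry is a plain integer line number, which should be verified against the actual file.

```python
def parse_lines_field(value):
    """Turn a semicolon-separated LINES value into a list of ints."""
    # Assumption: entries are plain integer line numbers.
    # Empty pieces (e.g. from a trailing ';') are skipped.
    return [int(piece) for piece in value.split(";") if piece.strip()]
```

For instance, `parse_lines_field("12;15;20")` gives `[12, 15, 20]`, and an empty field gives an empty list.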
code/FuncBugs.R: script used to perform the statistical analyses
code/statsmodels.R: script used to perform the statistical analyses and fit the survival models
ManualValidation.csv: redacted manual validation spreadsheet. It contains the following columns:
- PROJECT, COMMIT, FILE, LINES, MESSAGE: as in data/ToSample.csv
- Val1: classification by Validator 1
- Val2: classification by Validator 2
- Resolution: classification resolution after discussion
- Classif: change classification, where appropriate
- Final note, Note 1-Note 4: notes added during the validation rounds and during the final discussion
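
As a rough illustration of how the Val1 and Val2 columns can be compared, the Python sketch below computes the share of rows on which the two validators assigned the same label. This is only a simple raw-agreement figure and not necessarily the measure used in the study; the function name `raw_agreement` is hypothetical, and the sketch assumes labels can be compared as plain strings.

```python
def raw_agreement(rows):
    """Fraction of rows where Val1 and Val2 hold the same label.

    `rows` is any iterable of dict-like records with 'Val1' and 'Val2' keys,
    for example the rows produced by csv.DictReader.
    """
    rows = list(rows)
    if not rows:
        return 0.0
    # Assumption: labels are compared as stripped strings, case-sensitively.
    same = sum(1 for r in rows if r["Val1"].strip() == r["Val2"].strip())
    return same / len(rows)
```

A typical call would pass `csv.DictReader(open("ManualValidation.csv", newline=""))` as `rows`; rows on which the validators disagree are the ones whose outcome is recorded in the Resolution column.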
data/detailed: this directory contains the raw data (results of bug-fix identification, SZZ, and functional change analysis). See its README for details.