If you use this data set, please cite this R package and the following paper: Barbara Kitchenham, Lech Madeyski, David Budgen, Jacky Keung, Pearl Brereton, Stuart Charters, Shirley Gibbs, and Amnart Pohthong, 'Robust Statistical Methods for Empirical Software Engineering', Empirical Software Engineering, vol. 22, no. 2, pp. 579-630, 2017. DOI: 10.1007/s10664-016-9437-5 (https://dx.doi.org/10.1007/s10664-016-9437-5), URL: https://madeyski.e-informatyka.pl/download/KitchenhamMadeyskiESE.pdf
KitchenhamMadeyskiBudgen16.PolishSubjects
A data frame with variables:
The identifier for each subject
The identifier for each abstract. The code starts with a three-character alphanumeric string that defines the source of the abstract
Each judge assessed 4 abstracts in sequence. This data item identifies the order in which the subject viewed the specified abstract
Assessment by judge of question 1: Is the reason for the project clear? Can take values: Yes/No/Partly
Assessment by judge of question 2: Is the specific aim/purpose of the study clear? Can take values: Yes/No/Partly
Assessment by judge of question 3: If the aim is to describe a new or enhanced software technology (e.g. method, tool, procedure or process) is the method used to develop this technology defined? Can take values: Yes/No/Partly/NA
Assessment by judge of question 4: Is the form (e.g. experiment, general empirical study, data mining, case study, survey, simulation etc.) that was used to evaluate the technology made clear? Can take values: Yes/No/Partly
Assessment by judge of question 5: Is there a description of how the evaluation process was organised? Can take values: Yes/No/Partly
Assessment by judge of question 6: Are the results of the evaluation clearly described? Can take values: Yes/No/Partly
Assessment by judge of question 7: Are any limitations of the study reported? Can take values: Yes/No/Partly
Assessment by judge of question 8: Are any ideas for future research presented? Can take values: Yes/No/Partly
Assessment by judge of question regarding the overall understandability of the abstract: Please give an assessment of the clarity of this abstract by circling a number on the scale of 1-10 below, where a value of 1 represents Very Obscure and 10 represents Extremely Clearly Written.
A numerical value for completeness question 1, where No=0, Partly=0.5, Yes=1
A numerical value for completeness question 2, where No=0, Partly=0.5, Yes=1, NA means not applicable
A numerical value for completeness question 3, where No=0, Partly=0.5, Yes=1, NA means not applicable or not answered
A numerical value for completeness question 4, where No=0, Partly=0.5, Yes=1, NA means not applicable
A numerical value for completeness question 5, where No=0, Partly=0.5, Yes=1, NA means not applicable
A numerical value for completeness question 6, where No=0, Partly=0.5, Yes=1, NA means not applicable
A numerical value for completeness question 7, where No=0, Partly=0.5, Yes=1, NA means not applicable
A numerical value for completeness question 8, where No=0, Partly=0.5, Yes=1, NA means not applicable
The sum of the numerical completeness questions excluding those labelled NA
The count of the questions related to completeness, excluding questions considered not applicable
Sum/TotalQuestions
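The numeric coding and the derived completeness ratio described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper `score_answer` and the sample answers are hypothetical and are not taken from the data set or from the package.

```r
# Hypothetical helper (not part of the package): map a categorical answer
# to the numeric coding No = 0, Partly = 0.5, Yes = 1.
score_answer <- function(answer) {
  coding <- c(No = 0, Partly = 0.5, Yes = 1)
  # Any other value (for example "NA" or a blank) has no match and becomes NA.
  unname(coding[as.character(answer)])
}

# Made-up answers to the 8 completeness questions for one abstract.
answers <- c("Yes", "Partly", "NA", "No", "Yes", "Partly", "No", "Yes")

num_values <- score_answer(answers)

# Sum of the numeric values, excluding those coded NA.
sum_score <- sum(num_values, na.rm = TRUE)

# Number of completeness questions that were not coded NA.
total_questions <- sum(!is.na(num_values))

# Completeness ratio: Sum / TotalQuestions.
completeness <- sum_score / total_questions
```

With these made-up answers, seven questions are counted and the ratio is 4/7. Dividing by the count of applicable questions rather than by 8 keeps an NA answer from lowering the ratio.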
Data set collected at Wroclaw University of Technology (POLAND) by Lech Madeyski. It includes a separate entry for each abstract assessed by a judge, that is, 4 entries for each judge. Data were collected from 16 subjects recruited from Wroclaw University of Technology, who were each asked to assess 4 abstracts.
Note: Only completeness question 2 was expected to be context dependent and to have an NA (not applicable) answer. If other completeness answers were left blank, BAK coded the answer as NA.
polishsubjects.txt
KitchenhamMadeyskiBudgen16.PolishSubjects