If you use this data set, please cite this R package and the following paper: Barbara Kitchenham, Lech Madeyski, David Budgen, Jacky Keung, Pearl Brereton, Stuart Charters, Shirley Gibbs, and Amnart Pohthong, 'Robust Statistical Methods for Empirical Software Engineering', Empirical Software Engineering, vol. 22, no. 2, pp. 579-630, 2017. DOI: 10.1007/s10664-016-9437-5 (https://dx.doi.org/10.1007/s10664-016-9437-5), URL: https://madeyski.e-informatyka.pl/download/KitchenhamMadeyskiESE.pdf
KitchenhamMadeyskiBudgen16.COCOMO
A data frame with 35 variables, listed here in the column order of COCOMO.txt and labelled with the column names from that file's header:
pro: Project ID
type: A categorical variable describing the type of the project
year: The year the project was completed
Lang: A categorical variable describing the development language used
Rely: Ordinal value defining the required software reliability
Data: Ordinal value defining the data complexity / Database size
CPLX: Ordinal value defining the complexity of the software / Process complexity
aaf: ??
time: Ordinal value defining the stringency of timing constraints / Time constraint for CPU
store: Ordinal value defining the stringency of the data storage requirements / Main memory constraint
virt: Virtual machine volatility
turn: Turnaround time
type2: A categorical variable defining the hardware type: mini, max=mainframe, midi
acap: Ordinal value defining the analyst capability
aexp: Ordinal value defining the analyst experience / Application experience
pcap: Ordinal value defining the programming capability of the team / Programmers' capability
vexp: Ordinal value defining the virtual machine experience of the team
lexp: Ordinal value defining the programming language experience of the team
cont: ??
modp: Modern programming practices
TOOL: Ordinal value defining the extent of tool use / Use of software tools
TOOLcat: Recoding of Tool to a labelled ordinal scale
SCED: Ordinal value defining the stringency of the schedule requirements / Schedule constraint
RVOL: Ordinal value defining the requirements volatility of the project
Select: Categorical value calculated by BAK for an analysis example
rvolcat: Recoding of Rvol to a labelled ordinal scale
Modecat: Mode of the project: O=Organic, E=Embedded, SD=Semi-Detached
Mode1: Dummy variable calculated by BAK: 1 if the project is Organic, 0 otherwise
Mode2: Dummy variable calculated by BAK: 1 if the project is Semi-detached, 0 otherwise
Mode3: Dummy variable calculated by BAK: 1 if the project is Embedded, 0 otherwise
KDSI: Product size in thousands of source instructions
AKDSI: Adjusted product size for the project in thousands of source instructions; differs from KDSI for enhancement projects
Effort: Project effort in man-months
Dur: Duration in months
Productivity: Productivity of the project, calculated by BAK as AKDSI/Effort, so the larger the value, the better the productivity
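The derived variables described above (the three mode dummies and the productivity ratio) can be sketched in R as follows. This is only an illustration on a made-up toy data frame, not the code BAK used. It assumes the column names from the COCOMO.txt header (Modecat, AKDSI, Effort), which may differ from the names in the packaged data frame, and the numeric values are hypothetical.

```r
# Toy rows with made-up values; column names follow the COCOMO.txt header
# and are an assumption about the packaged data frame.
toy <- data.frame(
  Modecat = c("O", "E", "SD", "O"),
  AKDSI   = c(10, 50, 20, 4),
  Effort  = c(40, 500, 100, 8),
  stringsAsFactors = FALSE
)

# One 0/1 indicator per development mode, following the descriptions
# of the three dummy variables.
toy$Mode1 <- as.integer(toy$Modecat == "O")   # Organic
toy$Mode2 <- as.integer(toy$Modecat == "SD")  # Semi-detached
toy$Mode3 <- as.integer(toy$Modecat == "E")   # Embedded

# Productivity as adjusted size per man-month of effort (larger is better).
toy$Productivity <- toy$AKDSI / toy$Effort

print(toy)
```

Each toy row has exactly one of the three dummies set to 1, which gives a quick consistency check against Modecat.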
Data set collected at TRW by Barry Boehm; see B.W. Boehm. 1981. Software Engineering Economics. Prentice-Hall.
Variable explanations by Barbara Kitchenham; where a second description follows a slash, it is taken from https://terapromise.csc.ncsu.edu:8443/!/#repo/view/head/effort/cocomo/cocomo1/nasa93/nasa93.arff
COCOMO.txt: pro type year Lang Rely Data CPLX aaf time store virt turn type2 acap aexp pcap vexp lexp cont modp TOOL TOOLcat SCED RVOL Select rvolcat Modecat Mode1 Mode2 Mode3 KDSI AKDSI Effort Dur Productivity
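The header line above can be split into individual column names to check how many columns there are and where a given one sits. The sketch below works on the header string only. It assumes the names are separated by single spaces, as they appear here; the delimiter used in the actual file is not stated in the source.

```r
# Header string copied from the COCOMO.txt line above.
header <- paste(
  "pro type year Lang Rely Data CPLX aaf time store virt turn type2",
  "acap aexp pcap vexp lexp cont modp TOOL TOOLcat SCED RVOL Select",
  "rvolcat Modecat Mode1 Mode2 Mode3 KDSI AKDSI Effort Dur Productivity"
)

# Split on single spaces (assumed delimiter) to get one name per column.
cols <- strsplit(header, " ", fixed = TRUE)[[1]]

n_cols    <- length(cols)               # number of columns in the header
has_dupes <- anyDuplicated(cols) > 0    # TRUE if any name is repeated
akdsi_pos <- match("AKDSI", cols)       # position of the AKDSI column

print(n_cols)
print(has_dupes)
print(akdsi_pos)
```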
KitchenhamMadeyskiBudgen16.COCOMO