This data set is taken from the UCI Machine Learning Repository (see References). Past usage includes predicting car prices from all numeric and Boolean attributes (Kibler et al., 1989).
data(auto)
A data frame with 205 observations on 26 variables, of which 15 are quantitative and 11 are categorical. The following description is extracted from the UCI repository (Frank and Asuncion, 2010):
Normalized-losses | the relative average loss payment per insured vehicle year; ranges from 65 to 256 |
Make | Vehicle's make |
Fuel-type | diesel, gas |
Aspiration | std, turbo |
Num-of-doors | four, two |
Body-style | hardtop, wagon, sedan, hatchback, convertible |
Drive-wheels | 4wd, fwd, rwd |
Engine-location | front, rear |
Wheel-base | continuous from 86.6 to 120.9 |
Length | continuous from 141.1 to 208.1 |
Width | continuous from 60.3 to 72.3 |
Height | continuous from 47.8 to 59.8 |
Curb-weight | continuous from 1488 to 4066 |
Engine-type | dohc, dohcv, l, ohc, ohcf, ohcv, rotor |
Num-of-cylinders | eight, five, four, six, three, twelve, two |
Engine-size | continuous from 61 to 326 |
Fuel-system | 1bbl, 2bbl, 4bbl, idi, mfi, mpfi, spdi, spfi |
Bore | continuous from 2.54 to 3.94 |
Stroke | continuous from 2.07 to 4.17 |
Compression-ratio | continuous from 7 to 23 |
Horsepower | continuous from 48 to 288 |
Peak-rpm | continuous from 4150 to 6600 |
City-mpg | continuous from 13 to 49 |
Highway-mpg | continuous from 16 to 54 |
Price | continuous from 5118 to 45400 |
Symboling | assigned insurance risk rating: -3, -2, -1, 0, 1, 2, 3 |
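The split into quantitative and categorical variables described above can be checked with a few lines of R. The following is a minimal sketch, not part of the package: the helper `count_types` and the toy data frame are hypothetical, the toy rows are made-up values chosen within the documented ranges, and it assumes quantitative columns are stored as numeric and categorical ones as factor or character.

```r
# Count quantitative vs. categorical columns in a data frame.
# Assumption: quantitative columns are numeric/integer; everything else
# (factor, character, logical) is treated as categorical.
count_types <- function(df) {
  is_quant <- vapply(df, is.numeric, logical(1))
  c(quantitative = sum(is_quant), categorical = sum(!is_quant))
}

# Toy stand-in with made-up rows (NOT real records from the data set);
# the column names are hypothetical.
toy <- data.frame(
  fuel_type  = factor(c("gas", "diesel", "gas")),
  aspiration = factor(c("std", "turbo", "std")),
  wheel_base = c(88.6, 99.8, 105.8),
  price      = c(13495, 17450, 23875)
)

counts <- count_types(toy)
print(counts)
```

After loading the data with `data(auto)`, calling `count_types(auto)` should give a comparable summary. If Symboling is stored as an integer rather than a factor, this helper would count it as quantitative, so the result may differ from the 15/11 split stated above.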
Frank, A. & Asuncion, A. (2010). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
Kibler, D., Aha, D.W., & Albert, M. (1989). Instance-based prediction of real-valued attributes. Computational Intelligence, Vol 5, 51-57.