kernlab (version 0.9-24)

ticdata: The Insurance Company Data

Description

This data set, used in the CoIL 2000 Challenge, contains information on customers of an insurance company. The data consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was collected to answer the following question: Can you predict who would be interested in buying a caravan insurance policy, and explain why?

Usage

data(ticdata)

Format

ticdata: Dataset to train and validate prediction models and build a description (9822 customer records). Each record consists of 86 attributes, containing sociodemographic data (attributes 1-43) and product ownership (attributes 44-86). The sociodemographic data is derived from zip codes. All customers living in areas with the same zip code have the same sociodemographic attributes. Attribute 86, CARAVAN: Number of mobile home policies, is the target variable.

Data Format
Nr  Name      Description                                       Domain
1   STYPE     Customer Subtype
2   MAANTHUI  Number of houses                                  1 - 10
3   MGEMOMV   Avg size household                                1 - 6
4   MGEMLEEF  Average age
5   MOSHOOFD  Customer main type
6   MGODRK    Roman catholic
7   MGODPR    Protestant                                        ...
8   MGODOV    Other religion
9   MGODGE    No religion
10  MRELGE    Married
11  MRELSA    Living together
12  MRELOV    Other relation
13  MFALLEEN  Singles
14  MFGEKIND  Household without children
15  MFWEKIND  Household with children
16  MOPLHOOG  High level education
17  MOPLMIDD  Medium level education
18  MOPLLAAG  Lower level education
19  MBERHOOG  High status
20  MBERZELF  Entrepreneur
21  MBERBOER  Farmer
22  MBERMIDD  Middle management
23  MBERARBG  Skilled labourers
24  MBERARBO  Unskilled labourers
25  MSKA      Social class A
26  MSKB1     Social class B1
27  MSKB2     Social class B2
28  MSKC      Social class C
29  MSKD      Social class D
30  MHHUUR    Rented house
31  MHKOOP    Home owners
32  MAUT1     1 car
33  MAUT2     2 cars
34  MAUT0     No car
35  MZFONDS   National Health Service
36  MZPART    Private health insurance
37  MINKM30   Income < 30.000
38  MINK3045  Income 30-45.000
39  MINK4575  Income 45-75.000
40  MINK7512  Income 75-122.000
41  MINK123M  Income > 123.000
42  MINKGEM   Average income
43  MKOOPKLA  Purchasing power class
44  PWAPART   Contribution private third party insurance
45  PWABEDR   Contribution third party insurance (firms)
46  PWALAND   Contribution third party insurance (agriculture)
47  PPERSAUT  Contribution car policies
48  PBESAUT   Contribution delivery van policies
49  PMOTSCO   Contribution motorcycle/scooter policies
50  PVRAAUT   Contribution lorry policies
51  PAANHANG  Contribution trailer policies
52  PTRACTOR  Contribution tractor policies
53  PWERKT    Contribution agricultural machines policies
54  PBROM     Contribution moped policies
55  PLEVEN    Contribution life insurances
56  PPERSONG  Contribution private accident insurance policies
57  PGEZONG   Contribution family accidents insurance policies
58  PWAOREG   Contribution disability insurance policies
59  PBRAND    Contribution fire policies
60  PZEILPL   Contribution surfboard policies
61  PPLEZIER  Contribution boat policies
62  PFIETS    Contribution bicycle policies
63  PINBOED   Contribution property insurance policies
64  PBYSTAND  Contribution social security insurance policies
65  AWAPART   Number of private third party insurance           1 - 12
66  AWABEDR   Number of third party insurance (firms)           ...
67  AWALAND   Number of third party insurance (agriculture)
68  APERSAUT  Number of car policies
69  ABESAUT   Number of delivery van policies
70  AMOTSCO   Number of motorcycle/scooter policies
71  AVRAAUT   Number of lorry policies
72  AAANHANG  Number of trailer policies
73  ATRACTOR  Number of tractor policies
74  AWERKT    Number of agricultural machines policies
75  ABROM     Number of moped policies
76  ALEVEN    Number of life insurances
77  APERSONG  Number of private accident insurance policies
78  AGEZONG   Number of family accidents insurance policies
79  AWAOREG   Number of disability insurance policies
80  ABRAND    Number of fire policies
81  AZEILPL   Number of surfboard policies
82  APLEZIER  Number of boat policies
83  AFIETS    Number of bicycle policies
84  AINBOED   Number of property insurance policies
85  ABYSTAND  Number of social security insurance policies
86  CARAVAN   Number of mobile home policies                    0 - 1

Note: All the variables starting with M are zipcode variables. They give information on the distribution of that variable, e.g., Rented house, in the zipcode area of the customer.
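
The attribute layout described in this section can be captured in a small helper. The following is a minimal sketch, assuming the column order of the loaded data frame follows the attribute numbers (1-43 sociodemographic, 44-85 product ownership, 86 the CARAVAN target). The name split_attributes is hypothetical and not part of kernlab.

```r
# Hypothetical helper (not part of kernlab): split a data frame laid out
# like ticdata into the attribute groups described in the Format section.
# ASSUMPTION: column order matches the attribute numbers 1-86.
split_attributes <- function(df) {
  stopifnot(ncol(df) == 86)
  list(
    socio   = df[, 1:43, drop = FALSE],   # zip-code-derived attributes
    product = df[, 44:85, drop = FALSE],  # contributions and policy counts
    target  = df[[86]]                    # CARAVAN
  )
}

# Runs only when kernlab is installed.
if (requireNamespace("kernlab", quietly = TRUE)) {
  data(ticdata, package = "kernlab")
  parts <- split_attributes(ticdata)
  print(dim(parts$socio))
  print(table(parts$target))
}
```

Selecting by position rather than by name prefix keeps attribute 1 (STYPE) with the sociodemographic group, even though its name does not start with M.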

Source

  • UCI KDD Archive: http://kdd.ics.uci.edu
  • Donor: Peter van der Putten, Sentient Machine Research, Baarsjesweg 224, 1058 AA Amsterdam, The Netherlands, +31 20 6186927, pvdputten@hotmail.com, putten@liacs.nl

Details

Information about the insurance company customers consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was supplied by the Dutch data mining company Sentient Machine Research and is based on a real-world business problem. The training set contains over 5000 descriptions of customers, including whether or not they have a caravan insurance policy. The test set contains 4000 customers. The training and test sets are merged in the ticdata set. More information about the data set and the CoIL 2000 Challenge, along with publications based on the data set, can be found at http://www.liacs.nl/~putten/library/cc2000/.
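
Because the two sets are merged, a model evaluation may need to separate them again. The sketch below assumes, without confirmation from this page, that the 4000 test customers are the last rows of the merged data, which would leave 9822 - 4000 = 5822 rows for training. The name split_train_test is hypothetical and not part of kernlab; check the row order before relying on it.

```r
# Hypothetical helper (not part of kernlab): recover a train/test split.
# ASSUMPTION: the test customers are the last `n_test` rows of the data.
split_train_test <- function(df, n_test = 4000) {
  stopifnot(nrow(df) > n_test)
  n_train <- nrow(df) - n_test
  list(
    train = df[seq_len(n_train), , drop = FALSE],
    test  = df[seq(n_train + 1, nrow(df)), , drop = FALSE]
  )
}

# Runs only when kernlab is installed.
if (requireNamespace("kernlab", quietly = TRUE)) {
  data(ticdata, package = "kernlab")
  sets <- split_train_test(ticdata)
  print(c(train = nrow(sets$train), test = nrow(sets$test)))
}
```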

References

Peter van der Putten, Michel de Ruiter, Maarten van Someren. CoIL Challenge 2000 Tasks and Results: Predicting and Explaining Caravan Policy Ownership. http://www.liacs.nl/~putten/library/cc2000/