This data set used in the CoIL 2000 Challenge contains information on customers of an insurance company. The data consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was collected to answer the following question: Can you predict who would be interested in buying a caravan insurance policy and give an explanation why ?
data(ticdata)
ticdata: Dataset to train and validate prediction models and build a description (9822
customer records). Each record consists of 86 attributes, containing
sociodemographic data (attribute 1-43) and product ownership (attributes 44-86).
The sociodemographic data is derived from zip codes. All customers
living in areas with the same zip code have the same sociodemographic
attributes. Attribute 86, CARAVAN:Number of mobile home policies
, is the target variable.
Data Format
1 | STYPE | Customer Subtype | |
2 | MAANTHUI | Number of houses 1 - 10 | |
3 | MGEMOMV | Avg size household 1 - 6 | |
4 | MGEMLEEF | Average age | |
5 | MOSHOOFD | Customer main type | |
6 | MGODRK | Roman catholic | |
7 | MGODPR | Protestant ... | |
8 | MGODOV | Other religion | |
9 | MGODGE | No religion | |
10 | MRELGE | Married | |
11 | MRELSA | Living together | |
12 | MRELOV | Other relation | |
13 | MFALLEEN | Singles | |
14 | MFGEKIND | Household without children | |
15 | MFWEKIND | Household with children | |
16 | MOPLHOOG | High level education | |
17 | MOPLMIDD | Medium level education | |
18 | MOPLLAAG | Lower level education | |
19 | MBERHOOG | High status | |
20 | MBERZELF | Entrepreneur | |
21 | MBERBOER | Farmer | |
22 | MBERMIDD | Middle management | |
23 | MBERARBG | Skilled labourers | |
24 | MBERARBO | Unskilled labourers | |
25 | MSKA | Social class A | |
26 | MSKB1 | Social class B1 | |
27 | MSKB2 | Social class B2 | |
28 | MSKC | Social class C | |
29 | MSKD | Social class D | |
30 | MHHUUR | Rented house | |
31 | MHKOOP | Home owners | |
32 | MAUT1 | 1 car | |
33 | MAUT2 | 2 cars | |
34 | MAUT0 | No car | |
35 | MZFONDS | National Health Service | |
36 | MZPART | Private health insurance | |
37 | MINKM30 | Income >30.000 | |
38 | MINK3045 | Income 30-45.000 | |
39 | MINK4575 | Income 45-75.000 | |
40 | MINK7512 | Income 75-122.000 | |
41 | MINK123M | Income <123.000 | |
42 | MINKGEM | Average income | |
43 | MKOOPKLA | Purchasing power class | |
44 | PWAPART | Contribution private third party insurance | |
45 | PWABEDR | Contribution third party insurance (firms) | |
46 | PWALAND | Contribution third party insurance (agriculture) | |
47 | PPERSAUT | Contribution car policies | |
48 | PBESAUT | Contribution delivery van policies | |
49 | PMOTSCO | Contribution motorcycle/scooter policies | |
50 | PVRAAUT | Contribution lorry policies | |
51 | PAANHANG | Contribution trailer policies | |
52 | PTRACTOR | Contribution tractor policies | |
53 | PWERKT | Contribution agricultural machines policies | |
54 | PBROM | Contribution moped policies | |
55 | PLEVEN | Contribution life insurances | |
56 | PPERSONG | Contribution private accident insurance policies | |
57 | PGEZONG | Contribution family accidents insurance policies | |
58 | PWAOREG | Contribution disability insurance policies | |
59 | PBRAND | Contribution fire policies | |
60 | PZEILPL | Contribution surfboard policies | |
61 | PPLEZIER | Contribution boat policies | |
62 | PFIETS | Contribution bicycle policies | |
63 | PINBOED | Contribution property insurance policies | |
64 | PBYSTAND | Contribution social security insurance policies | |
65 | AWAPART | Number of private third party insurance 1 - 12 | |
66 | AWABEDR | Number of third party insurance (firms) ... | |
67 | AWALAND | Number of third party insurance (agriculture) | |
68 | APERSAUT | Number of car policies | |
69 | ABESAUT | Number of delivery van policies | |
70 | AMOTSCO | Number of motorcycle/scooter policies | |
71 | AVRAAUT | Number of lorry policies | |
72 | AAANHANG | Number of trailer policies | |
73 | ATRACTOR | Number of tractor policies | |
74 | AWERKT | Number of agricultural machines policies | |
75 | ABROM | Number of moped policies | |
76 | ALEVEN | Number of life insurances | |
77 | APERSONG | Number of private accident insurance policies | |
78 | AGEZONG | Number of family accidents insurance policies | |
79 | AWAOREG | Number of disability insurance policies | |
80 | ABRAND | Number of fire policies | |
81 | AZEILPL | Number of surfboard policies | |
82 | APLEZIER | Number of boat policies | |
83 | AFIETS | Number of bicycle policies | |
84 | AINBOED | Number of property insurance policies | |
85 | ABYSTAND | Number of social security insurance policies | |
86 | CARAVAN | Number of mobile home policies 0 - 1 |
Note: All the variables starting with M are zipcode variables. They give information on the distribution of that variable, e.g., Rented house, in the zipcode area of the customer.
Information about the insurance company customers consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was supplied by the Dutch data mining company Sentient Machine Research and is based on a real world business problem. The training set contains over 5000 descriptions of customers, including the information of whether or not they have a caravan insurance policy. The test set contains 4000 customers. The test and data set are merged in the ticdata set. More information about the data set and the CoIL 2000 Challenge along with publications based on the data set can be found at http://www.liacs.nl/~putten/library/cc2000/.
Peter van der Putten, Michel de Ruiter, Maarten van
Someren CoIL Challenge 2000 Tasks and Results: Predicting and
Explaining Caravan Policy Ownership
http://www.liacs.nl/~putten/library/cc2000/