Statistica Applicata Vol. 18, n. 3, 2006
AN INTER-MODELS DISTANCE FOR CLUSTERING
UTILITY FUNCTIONS¹
Elvira Romano, Carlo Lauro
Dipartimento di Matematica e Statistica, Università di Napoli Federico II,
Complesso Universitario di Monte S. Angelo, Via Cinthia, I - 80126 Napoli Italy
Giuseppe Giordano
Dipartimento di Scienze Economiche e Statistiche Università di Salerno
Via Ponte Don Melillo, 84084 Fisciano (Sa), Italy
Abstract
Conjoint Analysis is one of the most widely used techniques for assessing consumer
behavior. The method estimates partial utility coefficients through a statistical
model that links the overall preference rating to the attribute levels describing
the stimuli. Conjoint Analysis results are useful in new-product positioning
and market segmentation. This paper proposes a cluster-based segmentation strategy
built on a new metric. The proposed distance is a convex linear combination of two
Euclidean distances, embedding information on both the estimated parameters and the
model fit. Market segments can then be defined according to the proximity of the
part-worth coefficients and to the explanatory power of the estimated models.
Key words: Multiattribute Preference Data, Conjoint Analysis, Cluster Analysis, Market
Segmentation.
¹ This paper was financially supported by MIUR grants: “Multivariate Statistical and Visualization
Methods to Analyze, to Summarize, to Evaluate Performance Indicators”, coordinated by prof.
M.R. D’Esposito, and “Models for Designing and Measuring Customer Satisfaction”, coordinated
by prof. C.N. Lauro.
Romano E., Lauro C., Giordano G.
1. INTRODUCTION
2. THE DATA STRUCTURE AND THE INTER-MODELS DISTANCE
An inter-models distance for clustering utility functions
Tab. 1: The data collection.
Utility models   Coefficients                          Model fitting
Model 1          w_1^1, ..., w_k^1, ..., w_{K-1}^1     w_K^1
…                …                                     …
Model j          w_1^j, ..., w_k^j, ..., w_{K-1}^j     w_K^j
…                …                                     …
Model J          w_1^J, ..., w_k^J, ..., w_{K-1}^J     w_K^J
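As a rough illustration of how the layout in Tab. 1 can feed a distance computation, the sketch below assumes that each model is a row whose first K-1 entries are the coefficients and whose last entry is the fitting index, and combines the two Euclidean distances with a weight λ in [0, 1]. Placing λ on the coefficient term (so that λ = 1 ignores the fit) is an assumption, as are the function name `inter_model_distance` and the toy numbers; any rescaling of the two distances is not reproduced here.

```python
import numpy as np

def inter_model_distance(W, lam):
    """Pairwise distance between J models stored as rows of W (J x K).

    Assumed layout (as in Tab. 1): columns 0..K-2 hold the estimated
    coefficients, the last column holds the model-fitting index.
    Assumption: lam weights the coefficient distance, (1 - lam) the
    fitting distance.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    W = np.asarray(W, dtype=float)
    coef, fit = W[:, :-1], W[:, -1]
    # Euclidean distance between coefficient vectors, all pairs at once.
    d_coef = np.linalg.norm(coef[:, None, :] - coef[None, :, :], axis=2)
    # Euclidean distance between the scalar fitting values.
    d_fit = np.abs(fit[:, None] - fit[None, :])
    return lam * d_coef + (1.0 - lam) * d_fit

# Toy data (hypothetical): three models, two coefficients plus a fit value.
W = np.array([[1.0, 2.0, 0.9],
              [1.0, 2.0, 0.4],
              [4.0, 6.0, 0.9]])
D = inter_model_distance(W, lam=0.5)
```

In this toy example the first two models share their coefficients and differ only in fit, so their distance comes entirely from the fitting term; with `lam=1.0` they would become indistinguishable.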
3. CLUSTERING UTILITY FUNCTIONS
4. SIMULATIONS STUDY
Tab. 2: Simulation plan for the three classes of models with different coefficients and similar fitting
values.
In order to generate three sets of models with a reasonably good fit, we
build the global preference ratings according to model (10), with the coefficients
given in Table 2. These ratings are used as dependent variables in a multivariate
multiple regression model whose dummy explanatory variables are defined by the
orthogonal experimental design in Table 3.
Tab. 3: Experimental design.
      Intercept    x1    x2    x3    x4
 1        1         1    -1    -1     1
 2        1        -1    -1    -1     1
 3        1         1     1     0     1
 4        1        -1     1     0     1
 5        1         1     0     1     1
 6        1        -1     0     1     1
 7        1         1    -1    -1    -1
 8        1        -1    -1    -1    -1
 9        1         1     1     0    -1
10        1        -1     1     0    -1
11        1         1     0     1    -1
12        1        -1     0     1    -1
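To make the role of the design concrete, the following sketch rebuilds the 12-run design of Tab. 3 as a matrix, generates one simulated set of ratings from a linear model with normal errors, and re-estimates the coefficients by ordinary least squares together with the adjusted R². The coefficient vector is the Class A row of Tab. 4; the seed, the function name `fit_utility_model` and the reading of N(0, 1) as unit variance are assumptions made only for illustration.

```python
import numpy as np

# Design of Tab. 3: intercept, x1, x2, x3, x4 for the 12 stimuli.
x1 = np.tile([1, -1], 6)
x2 = np.tile([-1, -1, 1, 1, 0, 0], 2)
x3 = np.tile([-1, -1, 0, 0, 1, 1], 2)
x4 = np.repeat([1, -1], 6)
X = np.column_stack([np.ones(12), x1, x2, x3, x4]).astype(float)

def fit_utility_model(X, y):
    """OLS coefficients and adjusted R^2 for one vector of ratings."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ w
    n, p = X.shape                      # p counts the intercept
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    return w, adj_r2

rng = np.random.default_rng(0)          # arbitrary seed
w_true = np.array([6.50, -1.33, 1.00, 1.25, -1.83])  # Class A, Tab. 4
y = X @ w_true + rng.normal(0.0, 1.0, size=12)       # assumed sd = 1
w_hat, adj_r2 = fit_utility_model(X, y)
```

Each fitted model then contributes one row of the form (coefficients, fitting index), which is the layout of Tab. 1.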
Fig. 1: Box-plot of the adjusted R², the intercept and the four estimated coefficients.
Fig. 2: The tree structure of the simulated models.
Fig. 3: Distribution of the optimum λ-values.
Tab. 4: Simulation plan involving a hidden tree structure.
           N.     w0      w1      w2      w3      w4     e_i
Class A    20    6.50   -1.33    1.00    1.25   -1.83   N(0, 1)
Class B    20    6.50   -1.33    1.00    1.25   -1.83   N(0, 3)
Class C    20    6.50   -0.50   -0.25    3.00    0.00   N(0, 1)
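The plan in Tab. 4 can be explored with a short simulation. The sketch below generates 20 models per class on the design of Tab. 3, estimates each model by least squares, and scans a grid of λ values, keeping the one whose hierarchical clustering gives the largest cophenetic correlation. Several choices are assumptions rather than the authors' stated procedure: average linkage, the adjusted R² as fitting index, λ weighting the coefficient distance, the second parameter of N(0, ·) read as a variance, the 11-point grid, and the seed.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)  # arbitrary seed

# 12-run design of Tab. 3 (intercept, x1..x4).
X = np.column_stack([
    np.ones(12),
    np.tile([1, -1], 6),
    np.tile([-1, -1, 1, 1, 0, 0], 2),
    np.tile([-1, -1, 0, 0, 1, 1], 2),
    np.repeat([1, -1], 6),
]).astype(float)

# Tab. 4: (number of models, coefficients w0..w4, assumed error variance).
classes = {
    "A": (20, [6.50, -1.33, 1.00, 1.25, -1.83], 1.0),
    "B": (20, [6.50, -1.33, 1.00, 1.25, -1.83], 3.0),
    "C": (20, [6.50, -0.50, -0.25, 3.00, 0.00], 1.0),
}

def estimate(y):
    """Return the OLS coefficients followed by the adjusted R^2."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    n, p = X.shape
    r2 = 1.0 - ((y - X @ w) ** 2).sum() / ((y - y.mean()) ** 2).sum()
    return np.append(w, 1.0 - (1.0 - r2) * (n - 1) / (n - p))

rows = []
for n_models, w, var in classes.values():
    for _ in range(n_models):
        y = X @ np.array(w) + rng.normal(0.0, np.sqrt(var), size=12)
        rows.append(estimate(y))
W = np.vstack(rows)                 # 60 models x (5 coefficients + fit)

def condensed_distance(W, lam):
    # pdist returns the condensed (upper-triangle) Euclidean distances.
    return lam * pdist(W[:, :-1]) + (1.0 - lam) * pdist(W[:, -1:])

lambdas = np.linspace(0.0, 1.0, 11)
coph = []
for lam in lambdas:
    d = condensed_distance(W, lam)
    Z = linkage(d, method="average")    # assumed linkage rule
    c, _ = cophenet(Z, d)               # cophenetic correlation
    coph.append(c)
best_lam = lambdas[int(np.argmax(coph))]
```

Repeating the whole procedure over many seeds and collecting `best_lam` would give a distribution of selected λ values for this assumed setup; the outcome depends on the assumptions listed above.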
5. CONCLUDING REMARKS
Fig. 4: Distribution of the λ-values corresponding to the maximum cophenetic coefficient in 100
replications of the clustering.
Fig. 5: Dendrogram of the models. λ = 1: the poorly fitted models are hidden in the two-cluster
structure.
Fig. 6: Dendrogram of the models. λ = 0.2: the cophenetic coefficient is at its maximum and a
three-cluster structure is more evident.
CLUSTERING UTILITY FUNCTIONS THROUGH
AN INTER-MODELS DISTANCE
Summary
Conjoint Analysis is one of the most widely used techniques for assessing
consumer behavior. The method estimates partial utility coefficients
through a statistical model that links the overall preference rating to
the descriptive characteristics of the stimuli (products or services). The results of
Conjoint Analysis are widely applied in market segmentation.
This paper proposes a clustering strategy based on a new
metric. The proposed distance is defined as a convex combination of two distances.
It accounts for two kinds of information about each model:
the values of the estimated coefficients and the goodness of fit. Consequently, market segments are differentiated by considering both the proximity of the estimated individual utility
models and their predictive capability.