Study designs
A Kansei study is multifaceted and, of course, the particulars vary from study to study. In general, the steps are outlined in the following subsections.
Design product prototypes for testing
Similar to conjoint analysis, you need something to show customers so they can evaluate the new product concept. Conjoint analysis relies on card descriptions, although more modern presentations are conducted online utilizing images if the images can clearly illustrate the different attributes and their levels. In many instances, this is difficult to do, but not completely impossible. For a Kansei study, images could be used but the focus is not the attributes per se but the product itself as an entity. An effective presentation includes physical prototypes customers could touch, handle, and examine. A diamond engagement ring, for example, does not have the same effect if viewed in a picture as being placed on a finger and held up to light. The problem with a physical prototype is that many versions might be needed which may be too costly to create. In addition, the sample size of customers may be restricted because time has to be allowed for them to examine the prototype which means fewer customers may be allowed per session. Since most studies are time restricted (i.e., the field work has to be completed within a set period and within a set budget) this further means that fewer customers can be recruited. An online presentation in which images could be shown would be more cost effective, faster, and would allow for more customers to participate, but it would be less effective. The image designers could create a number of variations unlike physical designers who would be more restricted in what they can physically produce.
Some Kansei studies use images or physical objects that are readily available for already existing competitive products with only a few of their new versions included in the mix. The competitive images could easily be obtained from media or the competitors’ web sites while a physical object could simply be purchased for the study. For example, a car manufacturer could conduct a clinic study in which consumers could look at, sit in, and perhaps drive different cars. Several could be competitor cars and several could be new versions of their car. The competitor cars could be masked so that the consumers would not be able to (easily) identify the make or model. I discuss clinics in Chapter 4.
Create an experimental design
An experimental design is still needed since it may be impractical to present all the prototypes to customers; it could be overwhelming to say the least. Some Kansei studies have used a large number of prototypes. See Matsubara et al. [2011], Lai et al. [2006], and Chuang et al. [2001] for discussions and examples. A possible experimental design is a Balanced Incomplete Block Design (BIBD), which I discuss in Chapter 5. Fundamentally, a BIBD creates sets of products such that no single set contains all the prototypes, yet every prototype is represented somewhere across the sets. A single customer could be shown only one set. There are restrictions on the BIBD which I describe in Chapter 5.
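As a rough illustration of what "balanced" means here, the sketch below checks a small hypothetical design: seven prototypes (labeled 1 to 7 purely as placeholders) arranged in seven sets of three. The block layout and the helper name `bibd_summary` are assumptions made for this example and are not taken from any of the cited studies.

```python
from collections import Counter
from itertools import combinations

# Hypothetical design: 7 prototypes shown in 7 sets (blocks) of 3 each.
blocks = [
    (1, 2, 3), (1, 4, 5), (1, 6, 7),
    (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6),
]

def bibd_summary(blocks):
    """Count how often each prototype and each pair of prototypes appears."""
    item_counts = Counter(p for block in blocks for p in block)
    pair_counts = Counter(
        pair for block in blocks for pair in combinations(sorted(block), 2)
    )
    return item_counts, pair_counts

item_counts, pair_counts = bibd_summary(blocks)
all_pairs = list(combinations(sorted(item_counts), 2))

# Balanced: every prototype appears equally often, and every possible
# pair of prototypes appears together equally often.
is_balanced = (
    len(set(item_counts.values())) == 1
    and all(pair in pair_counts for pair in all_pairs)
    and len(set(pair_counts.values())) == 1
)
print(is_balanced)
```

In this layout no customer sees more than three of the seven prototypes, yet each prototype is evaluated in the same number of sets.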
Create a list of emotional descriptors
A list of emotional descriptors is needed. These are usually adjectives such as “beautiful”, “strong”, “powerful”, “artistic” and so forth. The list could be constructed, for example, from a perusal of publications (e.g., popular magazines, newspapers) and online reviews. Focus groups could be conducted in which customers would be asked to brainstorm a list of adjectives. Matsubara et al. [2011], Lai et al. [2006], and Chuang et al. [2001] discuss how they compiled a list of adjectives for their studies. Also see the articles in Lokman et al. [2018].
TABLE 3.3 A semantic differential scale might use adjective pairs such as these for a new sheet music page turning product. The adjectives represent (in order): sound level, comfort, responsiveness, aesthetic appeal, obtrusiveness, and innovativeness. A musician would be asked to rate how well each adjective pair describes the product, making one rating per pair.
| | 1 | 2 | 3 | 4 | 5 | |
|---|---|---|---|---|---|---|
| Noisy | | | | | | Silent |
| Uncomfortable | | | | | | Comfortable |
| Unresponsive | | | | | | Responsive |
| Boring | | | | | | Stylish |
| Obtrusive | | | | | | Unobtrusive |
| Mundane | | | | | | Revolutionary |
The adjectives per se are not used, however. Rather, each adjective is matched with its polar opposite to form a differential pair. So “beautiful” is matched with “not beautiful,” “strong” with “weak,” “artistic” with “unimaginative.” A semantic differential scale, designed to measure the “connotative meaning of objects, events, and concepts,”^{7} is a listing of the extremes of the adjectives so it is a bipolar scale. One end of the scale is the positive use of the adjective while the other is the negative use. For example, if the adjective is “beautiful”, then the two extremes are simply “beautiful” and “not beautiful.” Other extremes might be “good-bad”, “big-little”, “worthwhile-worthless”, and “fast-slow.” The scale between the two extremes is usually on a 1-5 or 1-7 basis with the former the most common. An example is shown in Table 3.3. The words are called Kansei words. The scale is certainly not without controversy. See Beltran et al. [2009] for some discussion of semantic differential scales. Also see Wegman [1990] for some technical discussions.
Present the prototypes and semantic differential questions
Each customer is shown all the prototype images one at a time and for each image is asked to rate or describe it using the semantic differential scale question. If there are four prototypes and 10 semantic differential questions, then each customer is asked to do 40 evaluations.
Basic data arrangement
If there are N respondents who see P prototypes and rate each on W Kansei words using a semantic differential scale, then the data form a cube that is N × P × W. An example is shown in Figure 3.7. Such a cube is actually common in statistical analysis although it is usually not discussed, being more subtle than provoking.^{8} Yet it is important to note that there are several dimensions to any data set, the Kansei data being only one example. See Lemahieu et al. [2018], Basford and McLachlan [1985], and Vermunt [2007] for the use and analysis of three-way data.
FIGURE 3.7 This is the data cube resulting from a Kansei experiment.
The cube is usually collapsed by aggregating across the N respondents. The aggregation is done by averaging the Kansei word ratings for all respondents for each prototype and word. The result is a two-dimensional plane that is P × W with cells equal to the average across all respondents. That is, if A is the resulting P × W data matrix, then cell $a_{pw}$, $p = 1, \ldots, P$ and $w = 1, \ldots, W$, is

$$a_{pw} = \bar{r}_{\cdot pw} = \frac{1}{N} \sum_{n=1}^{N} r_{npw}$$

where $r_{npw}$ is the rating respondent $n$ gave prototype $p$ on Kansei word $w$ and the “dot” notation indicates average. In essence, the data cube shown in Figure 3.7 is squashed down or flattened to just the Prototype × Words plane. The resulting matrix is shown in Figure 3.8.
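The flattening step can be sketched in a few lines. The ratings below are made-up numbers for N = 3 respondents, P = 2 prototypes, and W = 2 Kansei words, and the function name `collapse_cube` is an assumption for illustration only.

```python
# Hypothetical ratings cube: ratings[n][p][w] is respondent n's rating
# of prototype p on Kansei word w, measured on a 1-5 scale.
ratings = [
    [[1, 4], [3, 5]],  # respondent 1
    [[2, 4], [3, 3]],  # respondent 2
    [[3, 1], [3, 4]],  # respondent 3
]

def collapse_cube(cube):
    """Average over respondents, returning a P x W list of lists."""
    n_resp = len(cube)
    n_proto = len(cube[0])
    n_words = len(cube[0][0])
    return [
        [
            sum(cube[n][p][w] for n in range(n_resp)) / n_resp
            for w in range(n_words)
        ]
        for p in range(n_proto)
    ]

A = collapse_cube(ratings)
print(A)
```

Each cell of `A` is the mean rating for one prototype and word combination, so the respondent dimension no longer appears.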
Analyze the data
The objective is to use the prototypes to explain or account for the Kansei word score in each cell of the matrix. The prototypes per se are not used, but their attributes and the levels of those attributes are used to account for the Kansei word scores. A regression model is immediately suggested with the attributes and their levels appropriately encoded. Dummy variable coding or effects coding can be used, and the choice between them is not a major issue. As I noted above, dummy and effects coding are commonly used and are easily handled by most statistical and econometric software packages.
There is an issue with the specification of a model. Since there are W Kansei words whose average scores have to be explained or accounted for by the attributes and their levels, a model would be

$$Y = X\beta + \epsilon \qquad (3.3)$$

where $\epsilon$ is the usual OLS disturbance term such that $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$ and independently and identically distributed (iid), and X is the dummy coded matrix of attributes. The Y is not a column vector as in traditional OLS with only one dependent variable, but is now a P × W matrix. This matrix is often reduced in size using a data reduction method such as factor analysis resulting in a P × W' matrix with 1 ≤ W' < W. If W' = 1 then the usual OLS model results.
FIGURE 3.8 This is the data matrix resulting from collapsing a three-dimensional Kansei data cube.
The X matrix is of order P × L where L is the total number of levels across all attributes. If each attribute has two levels as an example, then X has size P × A where A is the number of attributes.^{9} The number of attributes can be large enough so it is safe to assume that P ≪ A; that is, the number of observations, P, is much less than the number of variables, A. This is a problem for OLS because OLS requires that P > A for estimation so in this situation estimation is impossible. In addition, the attributes are usually correlated to some degree which, despite the use of an encoding scheme for any one attribute, still introduces multicollinearity. The collinearity jeopardizes estimation by making the estimated parameters unstable and possibly giving them the wrong signs and magnitudes. Two ways to handle the model in (3.3) are with partial least squares (PLS) estimation and neural networks.
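To see why estimation breaks down when there are fewer prototypes than attributes, the following sketch builds a small hypothetical dummy-coded X with P = 3 rows and A = 5 columns and computes the rank of X'X. The matrix values and the helper names `matrix_rank` and `gram` are assumptions for illustration. Because the rank of X'X can never exceed P, the 5 × 5 matrix X'X is singular and cannot be inverted.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank by Gaussian elimination using exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(n_rows):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def gram(x):
    """Return X'X for a matrix stored as a list of rows."""
    cols = list(zip(*x))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

# Hypothetical design: P = 3 prototypes, A = 5 two-level attributes.
X = [
    [1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
]
xtx = gram(X)
rank = matrix_rank(xtx)
print(rank, len(xtx))  # rank of X'X versus its dimension
```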
PLS is a variant of OLS that allows for
- 1. a dependent variable matrix rather than a dependent variable vector;
- 2. P ≪ A; and
- 3. multicollinearity among the independent variables.
I review PLS in the next section. Neural networks, which are sometimes considered black boxes, are meant to show the connections among items much as the human brain is considered to have many connecting parts that allow you to understand and reason. I review neural networks below.
78 Deep Data Analytics for New Product Development
Partial least squares review
The Partial Least Squares (PLS) methodology has been known for a while but is only now becoming more widely available as data analysts become more aware of it, the problems it deals with become more widespread, and software becomes available. This section provides a high-level overview of this methodology. More technical detail is provided in Appendix C of this chapter. For even more detail, see Tobias [1995], Sawatsky et al. [2015], Ng [2013], and Geladi and Kowalski [1986]. Also see Cox and Gaudard [2013] for a book-length treatment using the JMP software.
To understand partial least squares, it is helpful to briefly state two key assumptions (among many) that enable OLS to produce estimates of the model parameters. These are:
- 1. the independent variables are linearly independent (i.e., there is no multicollinearity); and
- 2. the number of observations exceeds the number of independent variables (i.e., n > p where p is the number of independent variables).
If the first is violated, then the matrix formed by the independent variables cannot be inverted. See Appendix C for an explanation. If the second is violated, then there is a chance the model would overfit the data. Overfitting means the OLS procedure would attempt to account for each observation rather than some average of the observations. The implication is that the model learned from the data used in the estimation but will most likely be unable to apply those estimates to new data that are unlike those used for estimation. The purpose of a model is not only to indicate the effects on the dependent variable but also to enable predictions for new data. With n < p, and especially n ≪ p, this ability is jeopardized if not lost altogether.
Partial least squares regression avoids both potential issues by first finding a reduced set of independent variables, the set containing factors that summarize the independent variables. These factors are identified in such a way that they are linearly independent so the first property will be satisfied. Since there is a reduced number of factors, the second property is also satisfied. These factors can then be used in a regression model. If you are familiar with principal components analysis (PCA), then this may seem like principal components regression (PCR). It differs from PCR in that the information is taken from both the reduced independent factors and the dependent variable at the same time. This is a more complicated procedure. See Appendix C for some high-level details.
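The core idea can be sketched for the simplest case: one dependent variable and a single extracted factor. This is only a minimal illustration under those assumptions, not the full procedure in Appendix C; the data and the function names `center` and `pls1_one_component` are hypothetical. The weights are built from the covariance between each independent variable and the dependent variable, which is where the approach departs from principal components regression.

```python
def center(values):
    """Subtract the mean from every element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def pls1_one_component(X, y):
    """Extract one PLS factor and regress y on it.

    Returns the unit-length weight vector w, the factor scores t
    (one per observation), and the slope q of centered y on t.
    """
    cols = [center(col) for col in zip(*X)]  # centered columns of X
    yc = center(y)
    # Weights are proportional to the covariance of each column with y.
    w = [sum(a * b for a, b in zip(col, yc)) for col in cols]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores: a single derived factor summarizing the columns of X.
    t = [sum(w[j] * cols[j][i] for j in range(len(cols))) for i in range(len(y))]
    # Simple regression of centered y on the factor.
    q = sum(a * b for a, b in zip(t, yc)) / sum(a * a for a in t)
    return w, t, q

# Hypothetical data: n = 3 observations but p = 5 independent variables.
X = [
    [1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
]
y = [2.0, 3.5, 4.0]
w, t, q = pls1_one_component(X, y)
fitted = [q * ti for ti in t]  # fitted values for centered y
print(len(w), len(t))
```

Even though n < p, the regression is on one factor rather than five variables, so it can be estimated. A complete PLS fit would extract further factors after removing the part of X already explained, which is omitted here.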
Neural networks
Neural networks are sometimes viewed as the quintessential black box for analyzing data. They are difficult to explain and understand primarily because they were originally developed to mimic how the human brain operates. The human brain is a complex organ that has been studied for a very long time. Although we know a lot about how it works, it is still a mystery. Neural networks are equally challenging. For an intuitive explanation, see Hardesty [2017].
FIGURE 3.9 Example parallel chart or “spaghetti chart” for a semantic differential question. A five-point Likert Scale was used. See Table 3.3 for an example of the question format.
Other analysis methods
The semantic differential scale data are often analyzed with a parallel lines graph, sometimes called a “spaghetti chart.” This is illustrated in Figure 3.9. Another possibility is to simply calculate the mean of each item and use these to create a bar chart. See Figure 3.10. These, of course, do not fully utilize the data to provide the richest information. Instead, the final response array can be analyzed using various multivariate methods.
FIGURE 3.10 Example bar chart for a semantic differential question. The mean for each item measured on a five-point Likert Scale was calculated and plotted. See Table 3.3 for an example of the question format.
Another possibility is to do a correspondence analysis to determine which adjectives are most closely associated with each prototype. The purpose of a correspondence analysis is to reproduce in a two-dimensional plot the relationship in a frequency table (e.g., a crosstab where each cell of the table has the number or frequency of observations sharing the features labeling the row and column for that cell) that can be complex to read. A small table, say a 2 × 2, is easy to read and interpret but is typically uninformative. A large table, say a 10 × 10, is an order of magnitude more challenging to read and interpret and also typically uninformative merely because of its size. Interpretation is dramatically improved when the complex table of frequencies is plotted in a lower-dimensional space: a two-dimensional plot of the rows and columns of the table. The two-dimensional plot is sometimes called a map. The map reproduces the differences between the rows and columns of a table in a simple two-dimensional, X-Y graph or scatter plot. A two-dimensional plot is typically used because three dimensions are difficult to interpret and more than three cannot be drawn. Two dimensions in most cases completely display the data relationships.
To interpret a map, you have to examine the distances between points on the map for rows, for columns, and, depending on the method used, for both simultaneously. Relative proximities of the points count. If a point on the map for a row category is close to a point on the map for a column category, you cannot say anything about the magnitude of their interaction in an absolute sense, but you can interpret the positions in a relative sense saying that the categories are associated. The word “relative” is important. You cannot say anything about the absolute level of association. You can only say that a pair of points that are close are more strongly associated than a pair that are further apart.
The map is formally called a biplot because it simultaneously plots (measures of) the rows and columns of the table on one plot. The “bi” in “biplot” refers to the joint display of rows and columns, not to the dimensionality of the plot, which is two (X-Y). In essence, a biplot is one X-Y plot overlaid on top of another. The biplot allows you to visualize on one map the relationship both within a structure (e.g., rows) and between structures (e.g., rows and columns) of a table. An example is shown in Figure 3.12. See Gower and Hand [1996] for a detailed, technical discussion of biplots.
The correspondence analysis is done using the Singular Value Decomposition (SVD) mentioned in Chapter 2 and reviewed in that chapter’s appendix. The SVD divides the frequency table into three components: a left matrix, a center matrix, and a right matrix. The left matrix contributes to the plotting coordinates for the rows of the table, the right matrix contributes to the plotting coordinates for the columns, and the center matrix contributes to a measure of the variance of the table. See Greenacre [2007] for a classic discussion of correspondence analysis. See the Appendix to Chapter 2 on the SVD.
The tables for a Kansei study contain numbers much like those from paired comparisons, ratings, and distance measures, to name some examples. These measures are not directly frequencies, but as noted by Greenacre [2007], correspondence analysis can be applied to these tables after they are transformed. In the case of ratings, as for the music semantic scale, the data have to be transformed into something that has the interpretation of a frequency, a count, so that correspondence analysis can be used. Assume a 1-5 scale as for the music semantic scale. A transformation is to subtract 1 from each value provided by the customer so that a rating of “1”, the minimum value possible, becomes “0”; “2” becomes “1”, and “5” becomes “4”. The new value of “0” is interpreted to mean that the customer is “0” steps from the beginning of the scale (which is “1” on the original scale) and “4” steps from the end of the scale (which is “5” on the original scale); a new value of “4” means the customer is “4” steps from the beginning and “0” steps from the end. These steps are counts of what the customer must do to get to the beginning and the end from whatever original rating he/she gave. They could also be viewed as offsets from either end of the original scale.^{10} Table 3.4 shows all pairs for a five-point rating scale. Greenacre [2007] and Greenacre [1984] refer to this as scale doubling. The steps are counts, which is exactly what the correspondence analysis requires. This is illustrated in Figure 3.11.
TABLE 3.4 For a five-point rating scale, the pairs are shown here. Each pair sums to 4 since for any point on the original scale there are only 4 steps in both directions that someone could take.
| Rating | Pair |
|---|---|
| 1 | (0, 4) |
| 2 | (1, 3) |
| 3 | (2, 2) |
| 4 | (3, 1) |
| 5 | (4, 0) |
FIGURE 3.11 This rating scale transformation provides a pair of counts for each single rating given by a customer: the number of steps to the beginning of the original scale and the number of steps to the end of the original scale. So the original rating of “2” becomes the pair (1, 3); an original rating of “1” becomes the pair (0, 4); and so forth. Notice that each pair sums to 4, the total number of steps in both directions.
The advantage of the pairs is that a single original rating is transformed into two values: the first of the pair shows the steps to a low or negative part of the scale while the second shows the steps to a high or positive part. A polarity is established. For a semantic differential scale with polar opposites for an adjective, the transformation automatically provides data for the negative pole and the positive pole of the scale. As an example, for a sound level adjective pair “Noisy/Silent”, the first of the scale pair is a value for “Noisy” and the second of the pair is a value for “Silent.” A “2” rating for the sound level results in a value of “1” for “Noisy” and “3” for “Silent”. The music example has six adjective pairs so a single customer provides six ratings on a 1-5 scale as in Table 3.3. After the transformation, there are 12 ratings representing valuations of polar extremes. These 12 ratings are used in a correspondence analysis.
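The doubling scheme in Table 3.4 is easy to express in code. The sketch below follows the assignment described in the text, in which the first value of each pair goes to the left-hand (negative) adjective and the second to the right-hand (positive) adjective. The function names and the sample ratings are assumptions for illustration.

```python
# Adjective pairs from Table 3.3, listed as (negative pole, positive pole).
ADJECTIVE_PAIRS = [
    ("Noisy", "Silent"),
    ("Uncomfortable", "Comfortable"),
    ("Unresponsive", "Responsive"),
    ("Boring", "Stylish"),
    ("Obtrusive", "Unobtrusive"),
    ("Mundane", "Revolutionary"),
]

def double_rating(rating, low=1, high=5):
    """Return (steps from the low end, steps to the high end)."""
    if not low <= rating <= high:
        raise ValueError(f"rating {rating} is outside {low}-{high}")
    return rating - low, high - rating

def double_responses(ratings):
    """Turn six 1-5 ratings into twelve doubled values keyed by adjective."""
    doubled = {}
    for (negative, positive), rating in zip(ADJECTIVE_PAIRS, ratings):
        doubled[negative], doubled[positive] = double_rating(rating)
    return doubled

# Hypothetical ratings from one customer, in the order of Table 3.3.
customer = [2, 4, 3, 5, 1, 3]
doubled = double_responses(customer)
print(doubled["Noisy"], doubled["Silent"])
```

Summing these doubled values over all customers for each prototype would give the kind of crosstab that a correspondence analysis takes as input.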
The customer data for the music example were transformed from the original five-point scale to the polar opposite pairs following the scheme in Table 3.4. The labels for the polar opposites were the ones shown in Table 3.3. There is an underlying data table that is a crosstab of the three prototypes and 12 polar opposite labels. The top number in each cell is the sum of steps for that combination of prototype and semantic polar word. This table is shown in Table 3.5. A correspondence analysis was run on this crosstab data and a correspondence map was created. This is shown in Figure 3.12. This map is quite revealing about the relationships among the products and semantic ratings.
Notice that:
- • Prototype A is judged as somewhat obtrusive, boring, and revolutionary;
- • Prototype B is associated with being judged as stylish, unobtrusive, yet noisy; and
- • Prototype C is closely associated with being judged as unresponsive, uncomfortable, but silent.
FIGURE 3.12 This is a correspondence map for the means of a semantic differential set of questions in Table 3.3 but for three products. The numbers in parentheses on the axes labels show the amount of variation in the underlying table (i.e., the variance of the data) accounted for by the two dimensions plotted. In this case, the first dimension (the X-axis) accounts for almost 97% while the second dimension (the Y-axis) accounts for 2.96%. Together, the two axes account for almost 100% of the variation so that the two dimensions almost completely reproduce the data.
The associations for prototypes B and C are somewhat strong since the prototype points are close to the attribute points, but A is somewhat further away from the attribute labels indicating that customers may have had some difficulty judging this product mock-up.
The map in Figure 3.12 has more information about the points in the biplot. The two axes are the first and second dimensions extracted from the SVD of the underlying data table. These are the two that account for the most variation in the table. The maximum number of dimensions that could be extracted equals min(r − 1, c − 1) where r is the number of rows of a table and c is the number of columns. For this problem, r = 3 and c = 12 so the maximum number of dimensions that could be extracted is min(2, 11) = 2. The first dimension is used for the X-axis and the second for the Y-axis, although you can interchange these depending on the software. Certainly, if more dimensions could be extracted, then you could use these for the axes. Typically, only the first two are used. The first dimension accounts for the largest proportion of the variation in the data table; in this case it is shown in the label on the X-axis: 97%. The second dimension accounts for the next largest proportion, which is 2.96%. Together, the first two dimensions account for almost 100% of the variation. But this should be expected since there are only two dimensions possible for this problem. This is summarized in Figure 3.13. See the Appendix to this chapter for more details on the computations.
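The arithmetic behind these percentages can be checked with the values reported for this example: a first singular value of 0.14457, a total chi-square of 92.613, and a sample size of 4300. The sketch below is only a back-of-the-envelope check; the variable and function names are assumptions, and the share for the second dimension is obtained as the remainder because only two dimensions exist here.

```python
def max_dimensions(n_rows, n_cols):
    """Largest number of dimensions a correspondence analysis can extract."""
    return min(n_rows - 1, n_cols - 1)

# Values reported for the music example.
first_singular_value = 0.14457
total_chi_square = 92.613
sample_size = 4300

# Total inertia is the chi-square statistic divided by the sample size.
total_inertia = total_chi_square / sample_size
# The inertia of a dimension is the square of its singular value.
first_inertia = first_singular_value ** 2

first_share = first_inertia / total_inertia
second_share = 1.0 - first_share  # valid only because two dimensions exist

print(max_dimensions(3, 12))
print(round(100 * first_share, 2), round(100 * second_share, 2))
```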
Combining conjoint and Kansei analyses
FIGURE 3.13 The singular values and their transformed values are shown in this table. The first singular value, corresponding to the first dimension extracted, is 0.14457. The square of this is the inertia. The total inertia is the total variation in the table. The total inertia times the sample size, which is 4300 from Table 3.5, is the total chi-square value for the table: 92.613. The corresponding percents and cumulative percents are also shown. The cumulative percents are in the two dimension labels in the map.
Conjoint and Kansei analyses have been presented as two separate methodologies for optimizing a new product design. Conjoint is associated with the designer’s perspective and Kansei with the customer’s emotional perspective. Both can, and should, be combined for a total perspective. One procedure for a unified approach involves the following steps:^{11}
- 1. Determine the correlation between the conjoint preference rating and each image evaluation for all prototypes. If there are s image evaluations, then there are s correlations. That is, $cor_i = r(\text{Rating}, \text{Image}_i)$, $i = 1, \ldots, s$, where the function $r(\cdot)$ is the correlation function. Note that there is only one conjoint preference rating. These correlations will be used to determine weights as described below.
- 2. Estimate a key driver model for preference as a function of the image evaluations. The model has preference as the dependent variable and each image rating as an independent variable: $\text{Preference}_c = \beta_0 + \sum_{i=1}^{s} \beta_i \times \text{Image}_{ci} + \epsilon_c$, $c = 1, \ldots, n$ customers. Estimating the model with a step-wise approach will narrow the list of independent variables to a few key ones.
- 3. Based on the design elements of the prototypes, estimate a conjoint regression model using the preference rating as the dependent variable and the design elements as the independent variables. Dummy or effects coding can be used but usually effects coding is used as described earlier. The result will show which design elements have the largest weight based on their part-worths. A total utility or worth can be calculated for each configuration. This is standard conjoint analysis.
- 4. Based on the design elements for the prototypes, estimate additional conjoint models using the key driver image evaluation scores as the dependent variable and the design elements, appropriately encoded, as the independent variables. If there are s' key drivers from Step 2, then there will be s' conjoint models. Each model will show the importance of the design elements on the key image evaluations. A total utility or worth can be calculated for each configuration as in Step 3 but they will now be for the key image words.
- 5. Calculate a weighted average of the image word total utilities from Step 4 using the correlations from Step 1 as the weights. If $I_i$ is the $i$th key image word and $r_i$ is its correlation coefficient from Step 1, then the weight for the $i$th key image word is $w_i = r_i / \sum_{j=1}^{s'} r_j$, where $\sum_{i=1}^{s'} w_i = 1$. These weighted scores are called simple additive weighted (SAW) scores.
- 6. Compare the SAW scores and the conjoint total utility scores from Step 3 and select the product configurations with the highest scores for both. One way to do this is to calculate a weighted average or index number of the SAW scores and the conjoint total utility scores, perhaps using the first principal component as the weighted index. The corresponding product configuration is the one that should be developed.
TABLE 3.5 This is the crosstab table underlying the correspondence analysis. The first value in each cell is the sum of the steps on the doubled scale for the prototype shown in the column and the polar adjective shown in the row. In the first cell in the upper left, there were 125 steps for Prototype A and negative polar word Boring.
FIGURE 3.14 These are the chi-square tests for the crosstab data in Table 3.5. The Pearson Chi-square is 92.613 which is the sum of the two chi-square values in Figure 3.13. See Appendix 3.A for an overview of the chi-square statistic and tests.
FIGURE 3.15 This flowchart illustrates the steps used in the conjoint analysis of the preference ratings and the image words.
These steps are illustrated in Figure 3.15.
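The weighting in Step 5 can be sketched as follows. The sketch assumes that the weights are the correlations rescaled to sum to one and that all correlations are positive; the correlations, utilities, and function names are hypothetical and serve only to show the arithmetic.

```python
def saw_weights(correlations):
    """Rescale the correlations so that the weights sum to one."""
    total = sum(correlations)
    return [r / total for r in correlations]

def saw_scores(utilities_by_word, weights):
    """Weighted average of total utilities for each product configuration.

    utilities_by_word[i][k] is the total utility of configuration k
    from the conjoint model for key image word i (Step 4).
    """
    n_configs = len(utilities_by_word[0])
    return [
        sum(w * u[k] for w, u in zip(weights, utilities_by_word))
        for k in range(n_configs)
    ]

# Hypothetical inputs: three key image words, two product configurations.
correlations = [0.8, 0.4, 0.2]  # from Step 1
utilities_by_word = [
    [1.0, 2.0],  # key image word 1
    [3.0, 1.0],  # key image word 2
    [2.0, 4.0],  # key image word 3
]

weights = saw_weights(correlations)
scores = saw_scores(utilities_by_word, weights)
best = max(range(len(scores)), key=scores.__getitem__)
print(best)  # index of the configuration with the highest SAW score
```

In Step 6 these scores would be set beside the conjoint total utilities from Step 3 before a configuration is chosen.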