Parameter Learning
Parameter learning determines the prior conditional probability table (CPT) of each node of the network, given the link structure and the data. It can therefore be used to examine quantitatively the strength of the identified effect. As mentioned above, a conditional probability table P(A | B_1, ..., B_n) has to be attached to each variable A with parents B_1, ..., B_n. Note that if A has no parents, the table reduces to the unconditional probabilities P(A). According to this logic, for the example Bayesian network depicted in Figure 7.1, the prior unconditional and conditional probabilities to specify are: P(Driving License); P(Gender); P(Number of cars); P(Mode Choice | Driving License, Gender, Number of cars). Since the variables 'Number of cars', 'Gender' and 'Driving License' have no parents and thus do not depend on other variables in the network, calculating their prior frequency distributions is straightforward. Calculating the initial probabilities for the 'Mode Choice' variable is computationally more demanding.
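As a minimal sketch of how such prior frequency distributions and a CPT could be estimated from data, the following Python fragment uses plain relative frequencies. The records are hypothetical toy data invented for illustration (they are not the data behind Table 7.1a), and the helper names `marginal` and `conditional` are likewise assumptions, not part of any particular Bayesian network package.

```python
from collections import Counter, defaultdict

# Hypothetical toy records, NOT taken from the text:
# (gender, driving_license, number_of_cars, mode_choice)
records = [
    ("male", "yes", "1", "car"),
    ("male", "yes", ">1", "bike"),
    ("male", "no", "1", "bike"),
    ("female", "yes", ">1", "bike"),
    ("female", "no", ">1", "car"),
    ("male", "yes", ">1", "car"),
    ("female", "yes", "1", "car"),
    ("male", "no", ">1", "car"),
]

def marginal(rows, index):
    """Relative-frequency estimate of P(X) for the column at `index`."""
    counts = Counter(row[index] for row in rows)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

def conditional(rows, child_index, parent_indices):
    """Relative-frequency estimate of P(child | parents).

    Returns a dict keyed by parent configuration; each value is a
    distribution over the child states observed for that configuration.
    """
    counts = defaultdict(Counter)
    for row in rows:
        parents = tuple(row[i] for i in parent_indices)
        counts[parents][row[child_index]] += 1
    return {
        parents: {state: n / sum(c.values()) for state, n in c.items()}
        for parents, c in counts.items()
    }

# Nodes without parents: unconditional distribution, e.g. P(Gender)
p_gender = marginal(records, 0)

# Node with parents: P(Mode Choice | Gender, Driving License, Number of cars)
cpt_mode = conditional(records, 3, (0, 1, 2))
```

Parent configurations that never occur in the data receive no row in this sketch; in practice some form of smoothing or prior counts would be needed for them.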
In order to calculate the prior probabilities for the 'Mode Choice' variable, the conditional probability table for P(Mode Choice | Driving License, Gender, Number of cars) was set up in the first part of Table 7.1a. Again, this requires only straightforward calculation.

Table 7.1a: Conditional and joint prior probability tables for the transport mode choice variable.

Conditional prior probability table specifying P(Mode Choice | Gender, Driving License, Number of cars)

| Gender | Male | Male | Male | Male | Female | Female | Female | Female |
|---|---|---|---|---|---|---|---|---|
| Driving license | Yes | Yes | No | No | Yes | Yes | No | No |
| Number of cars | 1 | >1 | 1 | >1 | 1 | >1 | 1 | >1 |
| Mode choice = bike | 0.2 | 0.6 | 0.7 | 0.4 | 0.4 | 0.8 | 0.1 | 0.3 |
| Mode choice = car | 0.8 | 0.4 | 0.3 | 0.6 | 0.6 | 0.2 | 0.9 | 0.7 |

Joint prior probability table for P(Mode Choice, Gender, Driving License, Number of cars)

| Gender | Male | Male | Male | Male | Female | Female | Female | Female |
|---|---|---|---|---|---|---|---|---|
| Driving license | Yes | Yes | No | No | Yes | Yes | No | No |
| Number of cars | 1 | >1 | 1 | >1 | 1 | >1 | 1 | >1 |
| Mode choice = bike | 0.018 | 0.216 | 0.042 | 0.096 | 0.012 | 0.096 | 0.002 | 0.024 |
| Mode choice = car | 0.072 | 0.144 | 0.018 | 0.144 | 0.018 | 0.024 | 0.018 | 0.056 |

In order to get the prior probabilities for the 'Mode Choice' variable, we first have to calculate the joint probability P(Choice, Gender, Number of cars, Driving License) and then marginalize 'Number of cars', 'Driving License' and 'Gender' out. This can be done by applying Bayes' rule, which states that:

$$P(\text{Choice}, \text{Gender}, \text{Number of cars}, \text{Driving License}) = P(\text{Choice} \mid \text{Gender}, \text{Number of cars}, \text{Driving License}) \times P(\text{Gender}, \text{Number of cars}, \text{Driving License})$$
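Since 'Gender', 'Number of cars' and 'Driving License' are independent, the equation can be simplified for this example as:

$$P(\text{Choice}, \text{Gender}, \text{Number of cars}, \text{Driving License}) = P(\text{Choice} \mid \text{Gender}, \text{Number of cars}, \text{Driving License}) \times P(\text{Gender}) \times P(\text{Number of cars}) \times P(\text{Driving License})$$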
Note that P(Gender = male; Gender = female) = (0.75; 0.25), P(Driving License = yes; Driving License = no) = (0.6; 0.4) and P(Number of cars = 1; Number of cars > 1) = (0.2; 0.8), which are the prior frequency distributions for those three variables. By using this information, the joint probabilities were calculated in the second part of Table 7.1a. Marginalizing 'Gender', 'Number of cars' and 'Driving License' out of P(Choice, Gender, Number of cars, Driving License) yields P(Mode Choice = bike; Mode Choice = car) = (0.506; 0.494).
These are the prior probabilities for the 'Mode Choice' variable. Of course, computations become more complex when 'Gender', 'Number of cars' and 'Driving License' are dependent. Fortunately, in these cases, probabilities can be calculated automatically by means of probabilistic inference algorithms that are implemented in Bayesian network-enabled software.
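For the independent case worked through above, the joint-then-marginalize calculation can be reproduced in a few lines of plain Python. This is only a sketch: the probabilities are the ones given in the text and in Table 7.1a, while the variable names and dictionary layout are assumptions chosen for illustration, and no Bayesian network library is used.

```python
from itertools import product

# Prior frequency distributions of the three parent variables (from the text)
p_gender = {"male": 0.75, "female": 0.25}
p_license = {"yes": 0.6, "no": 0.4}
p_cars = {"1": 0.2, ">1": 0.8}

# P(Mode Choice = bike | gender, license, cars), first part of Table 7.1a;
# P(Mode Choice = car | ...) is the complement.
p_bike_given = {
    ("male", "yes", "1"): 0.2, ("male", "yes", ">1"): 0.6,
    ("male", "no", "1"): 0.7, ("male", "no", ">1"): 0.4,
    ("female", "yes", "1"): 0.4, ("female", "yes", ">1"): 0.8,
    ("female", "no", "1"): 0.1, ("female", "no", ">1"): 0.3,
}

# Joint probability: conditional times the product of the independent priors
joint = {}
for g, d, n in product(p_gender, p_license, p_cars):
    weight = p_gender[g] * p_license[d] * p_cars[n]
    joint[("bike", g, d, n)] = p_bike_given[(g, d, n)] * weight
    joint[("car", g, d, n)] = (1.0 - p_bike_given[(g, d, n)]) * weight

# Marginalize gender, license and cars out by summing over them
prior_mode = {"bike": 0.0, "car": 0.0}
for key, p in joint.items():
    prior_mode[key[0]] += p

print({mode: round(p, 3) for mode, p in prior_mode.items()})
# {'bike': 0.506, 'car': 0.494}
```

The entries of `joint` correspond to the cells in the second part of Table 7.1a, for example `joint[("bike", "male", "yes", ">1")]` is 0.216 up to floating-point rounding. When the parent variables are dependent, the single product of priors in `weight` no longer applies and the parents' own conditional tables have to be used instead.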