Properties of probabilities
When working with probabilities it is important to understand some of their most basic properties, which we briefly discuss below.
1. 0 ≤ P(A) ≤ 1. By definition, a probability can never be larger than 1 or smaller than 0.
2. If the events A, B, ... are mutually exclusive we have that P(A + B +...) = P(A) + P(B) +...
Example 1.4
Suppose we pick a card at random from a deck of cards. Event A represents drawing a club, and event B represents drawing a spade. These two events are mutually exclusive. Therefore the probability of the event C = A + B, which represents drawing a black card, is given by P(A + B) = P(A) + P(B). In a standard 52-card deck this is 13/52 + 13/52 = 1/2.
3. If the events A, B, ... form a mutually exclusive and collectively exhaustive set of events, then P(A + B + ...) = P(A) + P(B) + ... = 1
Example 1.5
Suppose we pick a card from a deck of cards. Event A represents picking a black card and event B represents picking a red card. These two events are mutually exclusive and collectively exhaustive. Therefore P(A + B) = P(A) + P(B) = 1.
4. If events A and B are statistically independent, then P(AB) = P(A)P(B), where P(AB) is called a joint probability.
5. If events A and B are not mutually exclusive, then P(A + B) = P(A) + P(B) - P(AB)
Example 1.6
Assume that we carry out a survey asking people whether they have read two newspapers (A and B) on a given day. Some have read paper A only, some have read paper B only, and some have read both. To calculate the probability that a randomly chosen individual has read newspaper A and/or B, we must recognize that the two events are not mutually exclusive, since some individuals have read both papers. Therefore P(A + B) = P(A) + P(B) - P(AB); subtracting P(AB) ensures that those who read both papers are not counted twice. The two events would have been mutually exclusive only if it had been impossible to read both papers.
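The addition rule for events that are not mutually exclusive can be checked on a small data set. The sketch below assumes a hypothetical survey of ten people, labelled 1 to 10; the readership sets and the helper name prob are invented for illustration and are not taken from the text.

```python
from fractions import Fraction

# Hypothetical survey of 10 people, labelled 1..10 (illustrative data only).
respondents = set(range(1, 11))
read_a = {1, 2, 3, 4, 5}   # people who read newspaper A
read_b = {4, 5, 6, 7}      # people who read newspaper B (4 and 5 read both)


def prob(event, sample_space):
    """Classical probability: favourable outcomes divided by all outcomes."""
    return Fraction(len(event), len(sample_space))


p_a = prob(read_a, respondents)
p_b = prob(read_b, respondents)
p_ab = prob(read_a & read_b, respondents)       # joint probability P(AB)
p_a_or_b = prob(read_a | read_b, respondents)   # P(A + B), counted directly

# Addition rule: subtracting P(AB) removes the double counting of readers of both.
assert p_a_or_b == p_a + p_b - p_ab
print(p_a, p_b, p_ab, p_a_or_b)
```

With these made-up numbers, simply adding P(A) and P(B) would give 9/10, whereas only 7 of the 10 respondents read at least one paper.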
Suppose that we would like to know the probability that event A occurs given that event B has already occurred. We must then ask whether event B has any influence on event A or whether events A and B are independent. If there is a dependency, we might be interested in how it affects the probability that event A occurs. The conditional probability of event A given event B is computed using the formula:
$$P(A \mid B) = \frac{P(AB)}{P(B)}, \qquad P(B) > 0 \qquad (1.2)$$
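A conditional probability is the joint probability divided by the probability of the conditioning event, P(A|B) = P(AB)/P(B). As a minimal sketch, the snippet below applies this to the card events used earlier, assuming a standard 52-card deck with A = "club" and B = "black card". The function name conditional_probability is our own choice.

```python
from fractions import Fraction


def conditional_probability(p_joint, p_given):
    """Return P(A|B) = P(AB) / P(B); undefined when P(B) = 0."""
    if p_given == 0:
        raise ValueError("P(B) must be positive to condition on B")
    return p_joint / p_given


# Assumed standard 52-card deck: 13 clubs, 26 black cards (clubs and spades).
p_club = Fraction(13, 52)             # P(A)
p_black = Fraction(26, 52)            # P(B)
p_club_and_black = Fraction(13, 52)   # P(AB): every club is a black card

p_club_given_black = conditional_probability(p_club_and_black, p_black)
print(p_club_given_black)
```

Here P(A|B) = 1/2 differs from P(A) = 1/4, so knowing that the card is black changes the probability of a club. The two events are therefore not independent.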
Example 1.7
We are interested in smoking habits in a population and carry out the following survey. We ask 100 people whether they smoke or not. The results are shown in Table 1.2.
            Smoker
            Yes     No      Total
Male        19      41      60
Female      12      28      40
Total       31      69      100
Table 1.2 A survey on smoking
Using the information in the survey we may now answer the following questions:
i) What is the probability of a randomly selected individual being a male who smokes?
This is just the joint probability. Using the classical definition, we start by asking how large the sample space is: 100. We then find the number of smoking males: 19. The corresponding probability is therefore 19/100 = 0.19.
ii) What is the probability that a randomly selected smoker is a male?
In this case we focus on smokers. We can therefore say that we condition on smokers when we ask for the probability of being a male in that group. To answer the question we use the conditional probability formula (1.2). First we need the joint probability of being a smoker and a male, which was 0.19 according to the calculation above. Second, we need the probability of being a smoker. Since 31 of the 100 individuals we asked were smokers, the probability of being a smoker is 31/100 = 0.31. We can now calculate the conditional probability: 0.19/0.31 = 0.6129. Hence there is about a 61% chance that a randomly selected smoker is a man.
The probability function - the discrete case
In this section we will derive what is called the probability mass function, or just the probability function, for a discrete random variable. Using the probability function we may form the corresponding probability distribution. By the probability distribution of a random variable we mean the possible values taken by that variable and the probabilities of occurrence of those values. Let us take an example to illustrate the meaning of those concepts.
Example 1.8
Consider a simple experiment where we toss a coin three times. Each trial of the experiment results in an outcome. The following 8 outcomes represent the sample space for this experiment: (HHH), (HHT), (HTH), (HTT), (THH), (THT), (TTH), (TTT). Observe that, for a fair coin, each sample point is equally likely to occur, so each of them has probability 1/8.
The random variable we are interested in is the number of heads obtained in one trial. We denote this random variable X. X can therefore take the values 0, 1, 2, 3, and the probabilities of occurrence differ among these values. The table of probabilities for each value of the random variable is referred to as the probability distribution. Using the classical definition of probabilities we obtain the following probability distribution.
X           0       1       2       3
P(X)        1/8     3/8     3/8     1/8
Table 1.3 Probability distribution for X
From Table 1.3 you can read that the probability that X = 0, which is denoted P(X = 0), equals 1/8, whereas P(X = 1) equals 3/8, and so forth.
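The probability distribution of the number of heads can be reproduced by listing the sample space and counting. The following is a small sketch that assumes a fair coin, so that all eight outcomes are equally likely; the variable names are our own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely outcomes of three tosses, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# Classical definition: favourable outcomes divided by all outcomes.
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(heads_counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
```

The probabilities sum to one because the values 0, 1, 2, 3 are mutually exclusive and collectively exhaustive.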
The cumulative probability function - the discrete case
Related to the probability mass function of a discrete random variable X is its cumulative distribution function, F(X), usually abbreviated CDF. It is defined in the following way:
$$F(a) = P(X \le a) = \sum_{x \le a} P(X = x)$$
Example 1.9
Consider the random variable and the probability distribution given in Example 1.8. Using that information we may form the cumulative distribution for X:
X           0       1       2       3
F(X)        1/8     4/8     7/8     1
Table 1.4 Cumulative distribution for X
The important thing to remember is that the outcomes in Table 1.3 are mutually exclusive. Hence, when calculating probabilities with the cumulative probability function, we simply sum the corresponding values of the probability mass function. As an example:
$$P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = \tfrac{1}{8} + \tfrac{3}{8} + \tfrac{3}{8} = \tfrac{7}{8}$$
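Summing the probability mass function up to each value can be sketched as a running total. The snippet below starts from the distribution of the number of heads in three tosses of a fair coin from Example 1.8; the dictionary names pmf and cdf are our own.

```python
from fractions import Fraction
from itertools import accumulate

# Probability distribution of X (number of heads in three tosses of a fair coin).
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Running sums of the probabilities give F(x) = P(X <= x) at each value of X.
values = sorted(pmf)
cdf = dict(zip(values, accumulate(pmf[x] for x in values)))

for x, p in cdf.items():
    print(f"P(X <= {x}) = {p}")
```

Note that Fraction reduces 4/8 to 1/2 when printing, and that the last cumulative probability equals 1.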
The probability function - the continuous case
When the random variable is continuous it is no longer meaningful to ask for the probability of a specific value, since that probability is zero. Hence, when working with continuous random variables, we are concerned with the probability that the random variable takes a value within a certain interval. Formally we may express the probability in the following way:
$$P(a < X < b) = \int_{a}^{b} f(X)\,dX$$
In order to find the probability, we need to integrate over the probability function, f(X), which is called the probability density function (pdf) for a continuous random variable. There exist a number of standard probability functions, but the single most common one is related to the standard normal random variable.
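To illustrate what integrating over a density means in practice, the sketch below approximates an interval probability for the standard normal random variable with a simple midpoint rule. The interval (-1.96, 1.96), the number of subintervals and the function names are our own choices for illustration. The result is compared with the closed-form value obtained from the error function in Python's math module.

```python
import math


def standard_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


# P(-1.96 < X < 1.96) for a standard normal X, which is close to 0.95.
p_interval = integrate(standard_normal_pdf, -1.96, 1.96)

# Cross-check: for a symmetric interval (-b, b) the probability is erf(b / sqrt(2)).
p_exact = math.erf(1.96 / math.sqrt(2.0))
print(round(p_interval, 4), round(p_exact, 4))

# An interval of zero width has zero probability, as for any single value.
p_point = integrate(standard_normal_pdf, 1.0, 1.0)
```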
Example 1.10
Assume that X is a continuous random variable with the following probability function:
Find the probability P(0 < X < 0.5). Using integral calculus we find that
The cumulative probability function - the continuous case
Associated with the probability density function of a continuous random variable X is its cumulative distribution function (CDF). It is denoted in the same way as for the discrete random variable. However, for the continuous random variable we have to integrate from minus infinity up to the chosen value, that is:
$$F(a) = P(X \le a) = \int_{-\infty}^{a} f(X)\,dX$$
The following properties should be noted:
1) F(-∞) = 0 and F(∞) = 1, which represent the left and right limits of the CDF.
2) P(X > a) = 1 - F(a)
3) P(a < X < b) = F(b) - F(a)
To evaluate problems of this kind we typically use standard statistical tables, which are located in the appendix.
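As an alternative to looking values up in a table, the properties of the CDF can be checked numerically. The following sketch assumes a standard normal random variable and uses NormalDist from Python's statistics module (available from Python 3.8); the values of a and b are arbitrary choices for illustration.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1); an assumed example.
z = NormalDist(mu=0.0, sigma=1.0)

a, b = -1.0, 1.0

p_upper = 1.0 - z.cdf(a)          # property 2: P(X > a) = 1 - F(a)
p_between = z.cdf(b) - z.cdf(a)   # property 3: P(a < X < b) = F(b) - F(a)

# Property 1: F approaches 0 far to the left and 1 far to the right.
far_left, far_right = z.cdf(-10.0), z.cdf(10.0)

print(far_left, far_right)
print(round(p_upper, 4), round(p_between, 4))
```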