A familiar example of statistical independence is the result of two tosses of a coin. Neither coin toss can influence the outcome of the other, so they yield statistically independent results. More generally, if two variables are statistically independent, then neither one affects probabilities that involve the other variable. This is a very strong sense in which there is “no relationship” between the two variables. This idea is formalized using *conditional probability*; to define it, some notation is now introduced that applies to the rest of this chapter.

The conditional probability that one variable, **X**_{2}, has the value *x*_{2}, given (or conditional on) the fact that another variable, **X**_{1}, has the value *x*_{1}, is commonly denoted by

P(**X**_{2} = *x*_{2} | **X**_{1} = *x*_{1}). (1)

In the example of two tosses of a coin, **X**_{1} could denote the outcome of the first toss and **X**_{2} the outcome of the second toss. In this example, *x*_{1} and *x*_{2} each take one of the values “heads” or “tails.”
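As an illustrative sketch (not part of the original text), the conditional probability in equation (1) can be estimated for the coin-toss example by simulating many pairs of independent fair tosses and counting outcomes; the variable names below are assumptions chosen for illustration.

```python
import random

random.seed(0)
N = 100_000

# Simulate N pairs of independent fair coin tosses (X1, X2).
tosses = [(random.choice(["heads", "tails"]),
           random.choice(["heads", "tails"])) for _ in range(N)]

# Estimate P(X2 = "heads" | X1 = "heads"): restrict attention to trials
# where the first toss was heads, then count how often the second was heads.
second_given_first_heads = [x2 for x1, x2 in tosses if x1 == "heads"]
p_cond = second_given_first_heads.count("heads") / len(second_given_first_heads)
print(round(p_cond, 2))  # close to 0.5: the first toss carries no information
```

The estimate hovers near 0.5, the unconditional probability of heads, which is exactly what independence predicts.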

In terms of conditional probability, the *statistical independence* of **X**_{1} and **X**_{2} is expressed by

P(**X**_{2} = *x*_{2} | **X**_{1} = *x*_{1}) = P(**X**_{2} = *x*_{2}). (2)

The probability on the right side of equation (2) is just the ordinary, *marginal*, or *unconditional* probability that **X**_{2} = *x*_{2}. The equality of the two probabilities in equation (2) means that the probability distribution of **X**_{2} is *unaffected* by the value of **X**_{1}. In other words, the conditional probability is *constant* as a function of *x*_{1}.
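A minimal numerical sketch of this constancy follows; the joint distribution below is a made-up example of two independent variables, not taken from the text.

```python
# Hypothetical joint distribution of two independent variables X1, X2.
# X1 takes values in {"a", "b"}; X2 takes values in {"c", "d"}.
joint = {
    ("a", "c"): 0.12, ("a", "d"): 0.28,  # P(X1 = "a") = 0.4
    ("b", "c"): 0.18, ("b", "d"): 0.42,  # P(X1 = "b") = 0.6
}

def marginal_x1(x1):
    """Marginal probability P(X1 = x1), summing the joint over x2."""
    return sum(p for (a, _), p in joint.items() if a == x1)

def cond_x2_given_x1(x2, x1):
    """P(X2 = x2 | X1 = x1) = P(X1 = x1, X2 = x2) / P(X1 = x1)."""
    return joint[(x1, x2)] / marginal_x1(x1)

# The conditional probability of X2 = "c" is the same (0.3) for every
# value of x1, as the constant-conditional-probability rule requires.
for x1 in ("a", "b"):
    print(x1, round(cond_x2_given_x1("c", x1), 2))
```

Changing the value conditioned on, from "a" to "b", leaves the conditional distribution of **X**_{2} untouched, which is the content of equation (2).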

It is well known (for example, see Parzen, 1960) that the constant conditional probability rule in equation (2) is equivalent to the following “product rule” for joint probabilities of independent variables

P(**X**_{1} = *x*_{1}, **X**_{2} = *x*_{2}) = P(**X**_{1} = *x*_{1}) P(**X**_{2} = *x*_{2}). (3)

The product rule means that the *joint probability* that **X**_{1} = *x*_{1} *and* that **X**_{2} = *x*_{2}, the left side of equation (3), is found by multiplying together the two marginal probabilities for each variable separately, the right side of equation (3). Both the constant conditional probability rule in equation (2) and the product rule in equation (3) are important for understanding the structure of latent variable models.
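The product rule can be checked directly on the two fair coin tosses; this is an illustrative sketch, with the dictionaries below assumed for the example rather than drawn from the text.

```python
from itertools import product

outcomes = ["heads", "tails"]
p_x1 = {o: 0.5 for o in outcomes}  # marginal distribution of the first toss
p_x2 = {o: 0.5 for o in outcomes}  # marginal distribution of the second toss

# For independent tosses, the joint probability of each (x1, x2) pair is
# the product of the two marginals, per the product rule.
joint = {(x1, x2): p_x1[x1] * p_x2[x2]
         for x1, x2 in product(outcomes, outcomes)}

print(joint[("heads", "tails")])      # 0.25 = 0.5 * 0.5
print(round(sum(joint.values()), 2))  # 1.0: a valid probability distribution
```

Each of the four joint probabilities is 0.25, and they sum to one, so multiplying marginals yields a legitimate joint distribution whenever the variables are independent.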