These are reading notes on Introduction to Probability.
Contents
- CONDITIONAL PROBABILITY
- Conditional Probabilities Specify a Probability Law
- Using Conditional Probability for Modeling
CONDITIONAL PROBABILITY
Conditional probability provides us with a way to reason about the outcome of an experiment, based on partial information. For instance,
- In an experiment involving two successive rolls of a die, you are told that the sum of the two rolls is 9. How likely is it that the first roll was a 6?
- In a word guessing game, the first letter of the word is a "t". What is the likelihood that the second letter is an "h"?
In more precise terms, given an experiment, a corresponding sample space, and a probability law, suppose that we know that the outcome is within some given event $B$. We wish to quantify the likelihood that the outcome also belongs to some other given event $A$. We thus seek to construct a new probability law that takes into account the available knowledge: a probability law that, for any event $A$, specifies the conditional probability of $A$ given $B$, denoted by $P(A \mid B)$.
We introduce the following definition of conditional probability:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$
where we assume that $P(B) > 0$.
If all possible outcomes are equally likely, we can also compute $P(A \mid B)$ using a shortcut: divide the number of elements shared by $A$ and $B$ by the number of elements of $B$.
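The counting shortcut can be checked on the two-dice question from the start of this section (the sum is 9; how likely is it that the first roll was a 6?). This is a minimal Python sketch, assuming two fair six-sided dice; the helper name `conditional_probability` is made up for this illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered pairs of rolls (assumes fair dice).
OUTCOMES = list(product(range(1, 7), repeat=2))

def conditional_probability(event_a, event_b, outcomes=OUTCOMES):
    """Return P(A | B) by counting; valid only when outcomes are equally likely."""
    in_b = [w for w in outcomes if event_b(w)]
    if not in_b:
        raise ValueError("P(B) must be positive")
    in_both = [w for w in in_b if event_a(w)]
    return Fraction(len(in_both), len(in_b))

# A: the first roll is a 6.  B: the sum of the two rolls is 9.
p = conditional_probability(lambda w: w[0] == 6, lambda w: sum(w) == 9)
print(p)  # 1/4
```

Four outcomes have sum 9, and only one of them, (6, 3), starts with a 6.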
Conditional Probabilities Specify a Probability Law
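For a fixed event $B$, it can be verified that the conditional probabilities $P(A \mid B)$ form a legitimate probability law that satisfies the three axioms. Indeed, nonnegativity is clear. Furthermore, writing $\Omega$ for the sample space,
$$P(\Omega \mid B) = \frac{P(\Omega \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1,$$
and the normalization axiom is also satisfied. To verify the additivity axiom, we write for any two disjoint events $A_1$ and $A_2$,
$$P(A_1 \cup A_2 \mid B) = \frac{P((A_1 \cup A_2) \cap B)}{P(B)} = \frac{P((A_1 \cap B) \cup (A_2 \cap B))}{P(B)} = \frac{P(A_1 \cap B) + P(A_2 \cap B)}{P(B)} = P(A_1 \mid B) + P(A_2 \mid B),$$
where the third equality holds because $A_1 \cap B$ and $A_2 \cap B$ are disjoint.
Since conditional probabilities constitute a legitimate probability law, all general properties of probability laws remain valid. For example, a fact such as $P(A \cup C) \leq P(A) + P(C)$ translates to the new fact
$$P(A \cup C \mid B) \leq P(A \mid B) + P(C \mid B).$$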
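As a concrete check that conditioning on a fixed event yields a probability law, the sketch below verifies normalization, additivity for two disjoint events, and the union bound on a small example. It assumes two fair dice with the conditioning event "the sum is 9"; the events `A1`, `A2`, `C` and the helpers `prob` and `cond` are illustrative choices, not taken from the book.

```python
from fractions import Fraction
from itertools import product

outcomes = set(product(range(1, 7), repeat=2))  # two fair dice (assumption)

def prob(event):
    """Unconditional probability of a set of outcomes under the uniform law."""
    return Fraction(len(event & outcomes), len(outcomes))

def cond(event, given):
    """P(event | given), computed from the definition P(A and B) / P(B)."""
    return prob(event & given) / prob(given)

B = {w for w in outcomes if sum(w) == 9}   # conditioning event
A1 = {w for w in outcomes if w[0] == 6}    # first roll is 6
A2 = {w for w in outcomes if w[0] == 3}    # first roll is 3, disjoint from A1
C = {w for w in outcomes if w[0] >= 5}     # first roll is 5 or 6, overlaps A1

normalization = cond(outcomes, B)
additivity_lhs = cond(A1 | A2, B)
additivity_rhs = cond(A1, B) + cond(A2, B)
union_bound_holds = cond(A1 | C, B) <= cond(A1, B) + cond(C, B)
print(normalization, additivity_lhs, additivity_rhs, union_bound_holds)
# 1 1/2 1/2 True
```

A single example does not prove the axioms, but it shows each identity at work on actual numbers.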
Example 1.8
A conservative design team, call it $C$, and an innovative design team, call it $N$, are asked to separately design a new product within a month. From past experience we know that:
(a) The probability that team $C$ is successful is $2/3$.
(b) The probability that team $N$ is successful is $1/2$.
(c) The probability that at least one team is successful is $3/4$.
Assuming that exactly one successful design is produced, what is the probability that it was designed by team $N$?
SOLUTION
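There are four possible outcomes:
- $SS$: both teams succeed,
- $FF$: both teams fail,
- $SF$: $C$ succeeds and $N$ fails,
- $FS$: $C$ fails and $N$ succeeds.
We were given that the probabilities of these outcomes satisfy
$$P(SS) + P(SF) = \frac{2}{3}, \qquad P(SS) + P(FS) = \frac{1}{2}, \qquad P(SS) + P(SF) + P(FS) = \frac{3}{4}.$$
From these relations, together with the normalization equation
$$P(SS) + P(SF) + P(FS) + P(FF) = 1,$$
we can obtain the probabilities of individual outcomes:
$$P(SS) = \frac{5}{12}, \qquad P(SF) = \frac{1}{4}, \qquad P(FS) = \frac{1}{12}, \qquad P(FF) = \frac{1}{4}.$$
The desired conditional probability is
$$P(FS \mid \{SF, FS\}) = \frac{P(FS)}{P(SF) + P(FS)} = \frac{1/12}{1/4 + 1/12} = \frac{1}{4}.$$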
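The arithmetic of Example 1.8 can be reproduced with exact fractions. This is a small sketch that solves the three given relations by substitution; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Data given in Example 1.8.
p_c = Fraction(2, 3)      # team C succeeds: P(SS) + P(SF)
p_n = Fraction(1, 2)      # team N succeeds: P(SS) + P(FS)
p_any = Fraction(3, 4)    # at least one succeeds: P(SS) + P(SF) + P(FS)

# Solve the linear relations by substitution.
p_ss = p_c + p_n - p_any  # both succeed (inclusion-exclusion)
p_sf = p_c - p_ss         # only C succeeds
p_fs = p_n - p_ss         # only N succeeds
p_ff = 1 - p_any          # neither succeeds (normalization)

# Condition on "exactly one success", i.e. the event {SF, FS}.
p_n_given_one = p_fs / (p_sf + p_fs)
print(p_ss, p_sf, p_fs, p_ff, p_n_given_one)  # 5/12 1/4 1/12 1/4 1/4
```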
Problem 1.15
A coin is tossed twice. Alice claims that the event of two heads is at least as likely if we know that the first toss is a head than if we know that at least one of the tosses is a head. Is she right? Does it make a difference if the coin is fair or unfair?
SOLUTION
Let $A$ be the event that the first toss is a head and let $B$ be the event that the second toss is a head. We must compare the conditional probabilities $P(A \cap B \mid A)$ and $P(A \cap B \mid A \cup B)$. We have
$$P(A \cap B \mid A) = \frac{P((A \cap B) \cap A)}{P(A)} = \frac{P(A \cap B)}{P(A)},$$
and
$$P(A \cap B \mid A \cup B) = \frac{P((A \cap B) \cap (A \cup B))}{P(A \cup B)} = \frac{P(A \cap B)}{P(A \cup B)}.$$
Since $P(A \cup B) \geq P(A)$, the first conditional probability above is at least as large, so Alice is right, regardless of whether the coin is fair or not.
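To see Alice's claim in numbers, the sketch below evaluates both conditional probabilities for a coin with head probability `p`. It assumes the two tosses are independent, which the argument in the text does not need; the function name `alice_comparison` is hypothetical.

```python
from fractions import Fraction

def alice_comparison(p):
    """Return (P(HH | first is H), P(HH | at least one H)) for head probability p.

    Assumes independent tosses and 0 < p <= 1. This is only a concrete check;
    the inequality itself holds without independence.
    """
    p = Fraction(p)
    p_hh = p * p                        # P(A and B)
    p_first = p                         # P(A)
    p_at_least_one = 1 - (1 - p) ** 2   # P(A or B)
    return p_hh / p_first, p_hh / p_at_least_one

for p in (Fraction(1, 2), Fraction(1, 10), Fraction(9, 10)):
    given_first, given_any = alice_comparison(p)
    print(p, given_first, given_any, given_first >= given_any)
```

For a fair coin the two values are 1/2 and 1/3, so knowing that the first toss is a head is the stronger piece of evidence.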
Using Conditional Probability for Modeling
When constructing probabilistic models for experiments that have a sequential character, it is often natural and convenient to first specify conditional probabilities and then use them to determine unconditional probabilities. The rule $P(A \cap B) = P(B) P(A \mid B)$ is often helpful in this process.
Example 1.9.
If an aircraft is present in a certain area, a radar detects it and generates an alarm signal with probability 0.99. If an aircraft is not present, the radar generates a (false) alarm with probability 0.10. We assume that an aircraft is present with probability 0.05. What is the probability of no aircraft presence and a false alarm? What is the probability of aircraft presence and no detection?
SOLUTION
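A sequential representation of the experiment is appropriate here, as shown in Fig. 1.9. Let $A$ and $B$ be the events
$$A = \{\text{an aircraft is present}\}, \qquad B = \{\text{the radar generates an alarm}\},$$
and consider also their complements
$$A^c = \{\text{an aircraft is not present}\}, \qquad B^c = \{\text{the radar does not generate an alarm}\}.$$
Each possible outcome corresponds to a leaf of the tree, and its probability is equal to the product of the probabilities associated with the branches in a path from the root to the corresponding leaf. The desired probabilities are
$$P(A^c \cap B) = P(A^c) P(B \mid A^c) = 0.95 \cdot 0.10 = 0.095,$$
$$P(A \cap B^c) = P(A) P(B^c \mid A) = 0.05 \cdot 0.01 = 0.0005.$$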
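The tree for the radar example has only four leaves, so it can be written out directly. This sketch uses the three probabilities stated in Example 1.9 and multiplies along each root-to-leaf path; the dictionary layout and variable names are illustrative choices.

```python
from fractions import Fraction

# Branch probabilities from Example 1.9 (exact fractions avoid rounding noise).
p_present = Fraction(5, 100)               # P(A)
p_alarm_given_present = Fraction(99, 100)  # P(B | A)
p_alarm_given_absent = Fraction(10, 100)   # P(B | A^c)

# Each leaf probability is the product of the branch probabilities on its path.
leaves = {
    ("present", "alarm"): p_present * p_alarm_given_present,
    ("present", "no alarm"): p_present * (1 - p_alarm_given_present),
    ("absent", "alarm"): (1 - p_present) * p_alarm_given_absent,
    ("absent", "no alarm"): (1 - p_present) * (1 - p_alarm_given_absent),
}

false_alarm = leaves[("absent", "alarm")]
missed_detection = leaves[("present", "no alarm")]
print(float(false_alarm), float(missed_detection))  # 0.095 0.0005
```

The four leaf probabilities sum to 1, which is a quick sanity check on any tree built this way.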
Extending the preceding example, we have a general rule for calculating various probabilities in conjunction with a tree-based sequential description of an experiment. In particular:
- We view the occurrence of the event as a sequence of steps and set up the tree so that an event of interest is associated with a leaf.
- We record the conditional probabilities associated with the branches of the tree.
- We obtain the probability of a leaf by multiplying the probabilities recorded along the corresponding path of the tree.
Example 1.10.
Three cards are drawn from an ordinary 52-card deck without replacement (drawn cards are not placed back in the deck). We wish to find the probability that none of the three cards is a heart. We assume that at each step, each one of the remaining cards is equally likely to be picked.
SOLUTION
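A cumbersome approach, which we will not use, is to count the number of all card triplets that do not include a heart, and divide it by the number of all possible card triplets. Instead, we use a sequential description of the experiment in conjunction with the multiplication rule. Let $A_i$ be the event that the $i$th card is not a heart, $i = 1, 2, 3$. Since 39 of the 52 cards are not hearts, $P(A_1) = 39/52$. Given that the first card is not a heart, 38 of the remaining 51 cards are not hearts, so $P(A_2 \mid A_1) = 38/51$, and similarly $P(A_3 \mid A_1 \cap A_2) = 37/50$.
The desired probability is now obtained by multiplying the probabilities recorded along the corresponding path of the tree:
$$P(A_1 \cap A_2 \cap A_3) = \frac{39}{52} \cdot \frac{38}{51} \cdot \frac{37}{50}.$$
Note that once the probabilities are recorded along the tree, the probability of several other events can be similarly calculated. For example,
$$P(\text{1st is not a heart and 2nd is a heart}) = \frac{39}{52} \cdot \frac{13}{51},$$
$$P(\text{first two are not hearts and 3rd is a heart}) = \frac{39}{52} \cdot \frac{38}{51} \cdot \frac{13}{50}.$$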
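Both approaches to Example 1.10 are easy to compare numerically. The sketch below multiplies the conditional probabilities along the "no heart" path and cross-checks the result against the counting approach that the solution sets aside; the function name `no_heart_sequential` and its parameters are assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb

def no_heart_sequential(draws=3, deck=52, hearts=13):
    """Multiply the conditional probabilities along the 'no heart' path."""
    prob = Fraction(1)
    non_hearts = deck - hearts
    for k in range(draws):
        # After k non-heart draws, (non_hearts - k) of the (deck - k) cards left are non-hearts.
        prob *= Fraction(non_hearts - k, deck - k)
    return prob

sequential = no_heart_sequential()
counting = Fraction(comb(39, 3), comb(52, 3))  # heart-free triplets / all triplets
print(sequential, counting)  # 703/1700 703/1700
```

The probability is 703/1700, roughly 0.41, and the two methods agree exactly.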
Example 1.11.
A class consisting of 4 graduate and 12 undergraduate students is randomly divided into 4 groups of 4. What is the probability that each group includes a graduate student?
SOLUTION
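We interpret "randomly" to mean that given the assignment of some students to certain slots, any of the remaining students is equally likely to be assigned to any of the remaining slots. We then calculate the desired probability using the multiplication rule, based on the sequential description. Let us denote the four graduate students by 1, 2, 3, 4, and consider the events
$$A_1 = \{\text{students 1 and 2 are in different groups}\},$$
$$A_2 = \{\text{students 1, 2, and 3 are in different groups}\},$$
$$A_3 = \{\text{students 1, 2, 3, and 4 are in different groups}\}.$$
Once student 1 is placed, 15 slots remain and 12 of them are in other groups, so $P(A_1) = 12/15$. By the same reasoning, $P(A_2 \mid A_1) = 8/14$ and $P(A_3 \mid A_1 \cap A_2) = 4/13$.
Thus, the desired probability is
$$P(A_3) = P(A_1) P(A_2 \mid A_1) P(A_3 \mid A_1 \cap A_2) = \frac{12}{15} \cdot \frac{8}{14} \cdot \frac{4}{13}.$$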
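The answer to Example 1.11 works out to 12/15 · 8/14 · 4/13 = 64/455. As a sanity check, the sketch below compares that product with a brute-force count over all ways to choose which 4 of the 16 slots hold the graduate students. The slot numbering (slot `s` belongs to group `s // 4`) is an assumption made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Multiplication rule: place graduate students 2, 3, 4 one at a time.
sequential = Fraction(12, 15) * Fraction(8, 14) * Fraction(4, 13)

# Cross-check by counting: slots 0..15, where slot s belongs to group s // 4.
# Every 4-subset of slots is equally likely to hold the graduate students.
placements = list(combinations(range(16), 4))
favorable = sum(1 for slots in placements if len({s // 4 for s in slots}) == 4)
counting = Fraction(favorable, len(placements))

print(sequential, counting)  # 64/455 64/455
```

There are 1820 possible placements, and 4 · 4 · 4 · 4 = 256 of them put one graduate student in each group.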
Example 1.12. The Monty Hall Problem (also known as the three-door problem).
You are told that a prize is equally likely to be found behind any one of three closed doors in front of you. You point to one of the doors. A friend opens for you one of the remaining two doors, after making sure that the prize is not behind it.
Consider the following strategies:
(a) Stick to your initial choice.
(b) Switch to the other unopened door.
(c) You first point to door 1. If door 2 is opened, you do not switch. If door 3 is opened, you switch.
Which is the best strategy?
SOLUTION
To answer the question, let us calculate the probability of winning under each of the three strategies.
(a) Under the strategy of no switching, your initial choice will determine whether you win or not, and the probability of winning is $1/3$.
(b) Under the strategy of switching, if the prize is behind the initially chosen door (probability $1/3$), you do not win. If it is not (probability $2/3$), and given that another door without a prize has been opened for you, you will get to the winning door once you switch. Thus, the probability of winning is now $2/3$, so (b) is a better strategy than (a).
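Note that intuition suggests the probability of winning should be $50\%$ no matter what you do. Imagine that after the host has opened an empty door, a second contestant $B$ arrives who does not know which door was picked first and chooses one of the two closed doors at random. That contestant does win with probability $50\%$.
The first contestant is in a different position: his probability of winning is a conditional probability that depends on his first choice, whereas contestant $B$ faces what is, for him, a fresh and independent choice between two doors.
A better explanation I found online is to consider an extreme case. Suppose the prize is behind one of 10000 doors, so the chance of picking it directly is tiny. After you pick, the host opens 9998 empty doors. Do you switch or not?
Clearly you should switch: whenever your first pick was wrong, switching wins the prize, so the probability of winning after switching is much higher.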
(c) The answer depends on the way that your friend chooses which door to open. Let us consider two possibilities.
Suppose that if the prize is behind door 1, your friend always chooses to open door 2. (If the prize is behind door 2 or 3, your friend has no choice.) If the prize is behind door 1, your friend opens door 2, you do not switch, and you win. If the prize is behind door 2, your friend opens door 3, you switch, and you win. If the prize is behind door 3, your friend opens door 2, you do not switch, and you lose. Thus, the probability of winning is $2/3$, so strategy (c) in this case is as good as strategy (b).
Suppose now that if the prize is behind door 1, your friend is equally likely to open either door 2 or 3. If the prize is behind door 1 (probability $1/3$), and if your friend opens door 2 (probability $1/2$), you do not switch and you win (probability $1/6$). But if your friend opens door 3, you switch and you lose. If the prize is behind door 2, your friend opens door 3, you switch, and you win (probability $1/3$). If the prize is behind door 3, your friend opens door 2, you do not switch, and you lose. Thus, the probability of winning is $1/6 + 1/3 = 1/2$, so strategy (c) in this case is inferior to strategy (b).
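All three strategies can be checked by exact enumeration over where the prize is and which door the friend opens. This is a sketch under the assumptions of the example: you always point to door 1 first, and the prize is equally likely to be behind each door. The function `win_probability`, the strategy labels, and the parameter `q_open_2` (the probability that the friend opens door 2 when the prize is behind door 1) are names invented for this illustration.

```python
from fractions import Fraction

def win_probability(strategy, q_open_2=Fraction(1, 2)):
    """Exact win probability when you first point to door 1.

    strategy: "stick" (a), "switch" (b), or "mixed" (c) from the text.
    q_open_2: probability the friend opens door 2 when the prize is behind door 1.
    """
    total = Fraction(0)
    for prize in (1, 2, 3):
        # Doors the friend may open, with their probabilities.
        if prize == 1:
            openings = {2: q_open_2, 3: 1 - q_open_2}
        else:
            openings = {5 - prize: Fraction(1)}  # the only empty door among 2 and 3
        for opened, p_open in openings.items():
            other = 5 - opened  # the unopened door among 2 and 3
            if strategy == "stick":
                final = 1
            elif strategy == "switch":
                final = other
            elif strategy == "mixed":
                final = 1 if opened == 2 else other
            else:
                raise ValueError(f"unknown strategy: {strategy}")
            if final == prize:
                total += Fraction(1, 3) * p_open
    return total

print(win_probability("stick"), win_probability("switch"))  # 1/3 2/3
print(win_probability("mixed", Fraction(1)), win_probability("mixed", Fraction(1, 2)))  # 2/3 1/2
```

The result for strategy (c) depends on `q_open_2`, while (a) and (b) do not depend on the friend's policy at all.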