UPDATE: I have corrected an inaccuracy in my probabilistic analysis of the Gemara, and I have added to it an example using concrete numbers.
The Mishnah states:
האשה שנתארמלה או שנתגרשה היא אומרת בתולה נשאתני והוא אומר לא כי אלא אלמנה נשאתיך אם יש עדים שיצאת בהינומא וראשה פרוע כתובתה מאתים [1]
The Gemara analyzes:
וכיון דרוב נשים בתולות נישאות כי לא אתו עדים מאי הוי
אמר רבינא משום דאיכא למימר רוב נשים בתולות נישאות ומיעוט אלמנות וכל הנשאת בתולה יש לה קול וזו הואיל ואין לה קול איתרע לה רובא
אי כל הנשאת בתולה יש לה קול כי אתו עדים מאי הוי הנך סהדי שקרי נינהו
אלא אמר רבינא רוב הנשאת בתולה יש לה קול וזו הואיל ואין לה קול איתרע לה רובא [2]
This is a classic example of Bayesian inference, albeit expressed in qualitative, rather than quantitative, terms:
Bayesian inference uses aspects of the scientific method, which involves collecting evidence that is meant to be consistent or inconsistent with a given hypothesis. As evidence accumulates, the degree of belief in a hypothesis ought to change. With enough evidence, it should become very high or very low. …
Bayesian inference uses a numerical estimate of the degree of belief in a hypothesis before evidence has been observed and calculates a numerical estimate of the degree of belief in the hypothesis after evidence has been observed. …
Bayes’ theorem adjusts probabilities given new evidence in the following way:
P(H | E) = P(E | H) P(H) / P(E)
where:
- H represents a specific hypothesis, which may or may not be some null hypothesis.
- P(H) is called the prior probability of H that was inferred before new evidence, E, became available.
- P(E | H) is called the conditional probability of seeing the evidence E if the hypothesis H happens to be true. It is also called a likelihood function when it is considered as a function of H for fixed E.
- P(E) is called the marginal probability of E: the a priori probability of witnessing the new evidence E under all possible hypotheses. It can be calculated as the sum of the product of all probabilities of any complete set of mutually exclusive hypotheses and corresponding conditional probabilities: P(E) = Σ P(E | Hi) P(Hi), summed over all the hypotheses Hi in the set.
- P(H | E) is called the posterior probability of H given E.
The factor P(E | H) / P(E) represents the impact that the evidence has on the belief in the hypothesis. If it is likely that the evidence E would be observed when the hypothesis under consideration is true, but unlikely that E would have been the outcome of the observation, then this factor will be large. Multiplying the prior probability of the hypothesis by this factor would result in a larger posterior probability of the hypothesis given the evidence. Conversely, if it is unlikely that the evidence E would be observed if the hypothesis under consideration is true, but a priori likely that E would be observed, then the factor would reduce the posterior probability for H. Under Bayesian inference, Bayes’ theorem therefore measures how much new evidence should alter a belief in a hypothesis.
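The update described above can be sketched in a few lines of Python. This is only an illustration, assuming the simplest complete set of mutually exclusive hypotheses, namely H and its negation; the function and parameter names are my own and do not come from the quoted text.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) when the only alternatives are H and not-H."""
    # Marginal probability P(E), summed over the complete set {H, not-H}.
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    if p_e == 0:
        raise ValueError("P(E) is zero: the evidence is impossible under both hypotheses")
    # Bayes' theorem: the prior scaled by the factor P(E | H) / P(E).
    return p_e_given_h * prior_h / p_e


# Evidence that is less likely under H than under not-H lowers the belief in H.
print(posterior(0.5, 0.2, 0.8))
# Evidence that is equally likely either way leaves the belief unchanged.
print(posterior(0.5, 0.3, 0.3))
```

The first call moves a prior of .5 down to about .2, and the second leaves it at .5, since the factor P(E | H) / P(E) is then exactly 1.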
In our Gemara:
- H is the hypothesis that the woman was a virgin at her marriage
- E is the absence of a קול
- P(H) is רוב נשים בתולות נישאות
- P(E | H) is low, since רוב הנשאת בתולה יש לה קול
- P(E) is not stated
- P(H | E) is הואיל ואין לה קול איתרע לה רובא
The Gemara’s point, expressed in the language of Bayesian probability, is that in the absence of a קול [E], the likelihood of בתולה נשאת [P(H | E)] is significantly lower than our initial assumption of the same [P(H)], given that the likelihood of the absence of a קול had she been a virgin [P(E | H)] is low. [This is true whenever P(E | H) is less than P(E), i.e., whenever the absence of a קול is less likely for a virgin than for a non-virgin.] In the הוה אמינא of the Gemara, P(E | H) was actually zero, and we would therefore reject even the testimony of witnesses to her virginity, normally the gold standard of evidence in Halachah, having no choice but to conclude that they are liars.
For concreteness’s sake, here is the logic expressed using specific, albeit somewhat arbitrary, numbers. Let us assign the value of 75% to any רוב, and let us assume that the rate of false positives, i.e., the presence of a קול for a woman who is not a virgin, is quite low, say 5%. So:
- P(H) is .75
- P(E | H) is .25, since 75% of virgins have a קול
- P(E) is (.75 * .25) + (.25 * .95) = .425, where .95 is the likelihood of the absence of a קול for a woman who is not a virgin
And plugging these values into Bayes’ theorem tells us that P(H | E) = .25 * .75 / .425 ≈ .441
As the Gemara says, איתרע לה רובא.
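For readers who wish to check the arithmetic, here is a short Python sketch of the same calculation, using exact fractions. The numbers are the arbitrary ones chosen above (75% for any רוב, 5% for false positives); the function and variable names are my own. It also runs the הוה אמינא, in which every virgin has a קול.

```python
from fractions import Fraction


def posterior_virgin_given_no_kol(p_virgin, p_no_kol_if_virgin, p_no_kol_if_not_virgin):
    """P(H | E): the probability that she married as a virgin, given no קול."""
    # P(E): the total probability that there is no קול.
    p_no_kol = (p_no_kol_if_virgin * p_virgin
                + p_no_kol_if_not_virgin * (1 - p_virgin))
    return p_no_kol_if_virgin * p_virgin / p_no_kol


rov = Fraction(3, 4)              # the arbitrary 75% assigned to any רוב
false_positive = Fraction(1, 20)  # the assumed 5% of non-virgins who have a קול

# The Gemara's conclusion: most, but not all, virgins have a קול.
conclusion = posterior_virgin_given_no_kol(rov, 1 - rov, 1 - false_positive)
print(conclusion, float(conclusion))

# The הוה אמינא: every virgin has a קול, so P(E | H) is zero.
hava_amina = posterior_virgin_given_no_kol(rov, Fraction(0), 1 - false_positive)
print(hava_amina)
```

The first result is 15/34, or about .441, and the second is exactly zero, which is why the הוה אמינא leaves no room for the witnesses to be telling the truth.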
The principle of this Gemara arose during a presentation that I heard yesterday. The question being discussed was the status of a בעלת תשובה vis-a-vis marrying a Cohen. The problem is that many Poskim rule that an apostate who returns to observance is assumed to be טמאה, with her protestations to the contrary not believed, based on the principle of רוב גוים פרוצים בעריות [3], and it would seem to follow that the same should apply to an irreligious Jew, since she has moved in non-Jewish society and followed its norms.
One of several arguments raised by the presenter was as follows: He noted that he had been informed, by a Posek with much experience in these matters, that 95% of בעלות תשובה actually admit that they are טמאות, and even though we have no idea what percentage of the remaining 5% are telling the truth, it is nevertheless indubitably clear that רוב טמאות admit to their history [4]. He therefore reasoned, by analogy to the above Gemara, that whenever a woman insists that she is טהורה, the argument against her purity from רוב גוים פרוצים בעריות is weakened by the countervailing argument in her favor that רוב טמאות admit to their status, and she does not.
The reaction to this suggestion by the audience was mild pandemonium. While I staunchly defended the plausibility of the idea, nearly everyone else who expressed an opinion, as well as a couple of my colleagues to whom I later related it, seemed to find it at least dubious, if not downright preposterous. I did not, however, hear anyone articulate a clear objection. Of course, one can simply deny the claim of 95% of בעלות תשובה admitting to being טמאות, but if, for the sake of argument, we grant it, I challenge the reader to either concede the point, or clearly articulate why it is incorrect!
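The structure of the presenter's argument, and its dependence on the assumption that admission and denial track her true status, can be sketched as follows. Every number here is hypothetical, chosen only for illustration and not derived from the 95% figure reported by the Posek; the function and variable names are likewise my own.

```python
def posterior_impure_given_denial(prior_impure, p_deny_if_impure, p_deny_if_pure):
    """P(impure | she denies it), by Bayes' theorem over {impure, pure}."""
    # Total probability of a denial.
    p_denial = (p_deny_if_impure * prior_impure
                + p_deny_if_pure * (1 - prior_impure))
    return p_deny_if_impure * prior_impure / p_denial


PRIOR = 0.75  # hypothetical value for the רוב; an assumption, not a datum

# Hypothetical case: most of the impure admit, and nearly all of the pure deny.
informative = posterior_impure_given_denial(PRIOR, 0.25, 0.95)

# Hypothetical case: denial is independent of her true status.
uninformative = posterior_impure_given_denial(PRIOR, 0.5, 0.5)

print(round(informative, 3), round(uninformative, 3))
```

In the first case her denial pulls the probability of טומאה below one half, while in the second it remains exactly at the prior, so nothing at all is learned from her denial.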
[1] כתובות ריש פרק שני
[2] שם דף ט”ז ע”א – ע”ב
[3] See שולחן ערוך אה”ע סימן ז’ סעיף י”א בהגה
[4] We are tacitly assuming that the rate of false positives, i.e., admissions of טומאה by טהורות, is very low. Obviously, if we assume that admission or denial of טומאה is completely independent of her true status, then we can derive nothing at all from her denial.