Hi! The formula I used was Definition 1. Definition 2 is incorrect, isn't it? (Or at least "elusion" and "prevalence" are being used loosely to mean "discard yield" and "collection yield".)

William

Definition 1. In his earlier work (cited above) he uses prevalence to estimate TP+FN (the total number of relevant documents) and elusion to estimate FN (the total number of missed relevant documents). He then plugs these estimates into a contingency table with N (the total number of documents) and D (the total number of discarded documents). If I am not mistaken, the resulting formula is

eRecall = 1 - (elusion/prevalence * D/N)

Definition 2. In his most recent work (http://orcatec.com/2014/11/04/the-pendulum-swings-practical-measurement-in-ediscovery/), he defines

eRecall = 1 - elusion/prevalence
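To see how far the two definitions can drift apart, here is a minimal Python sketch. It assumes elusion is the fraction of relevant documents in the discard pile (so FN is roughly elusion * D) and prevalence is the fraction of relevant documents in the whole collection (so TP+FN is roughly prevalence * N). The function names and the example numbers are hypothetical and chosen only for illustration; they do not come from either cited source.

```python
def erecall_contingency(elusion, prevalence, discarded, total):
    """Definition 1: recall computed from estimated contingency-table counts."""
    missed = elusion * discarded      # estimated FN
    relevant = prevalence * total     # estimated TP + FN
    return 1 - missed / relevant


def erecall_ratio(elusion, prevalence):
    """Definition 2: the same ratio without the D/N factor."""
    return 1 - elusion / prevalence


if __name__ == "__main__":
    # Hypothetical collection: N = 100,000 documents, D = 80,000 discarded,
    # prevalence = 10%, elusion = 2%.
    n_total, n_discarded = 100_000, 80_000
    prevalence, elusion = 0.10, 0.02

    d1 = erecall_contingency(elusion, prevalence, n_discarded, n_total)
    d2 = erecall_ratio(elusion, prevalence)
    print(f"Definition 1: {d1:.2f}")  # prints "Definition 1: 0.84"
    print(f"Definition 2: {d2:.2f}")  # prints "Definition 2: 0.80"
```

The two values coincide only when D = N, that is, when every document is discarded. Whenever some documents are retained, D/N is below 1 and Definition 2 reports the lower recall.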
