A technology-assisted review (TAR) process frequently begins with the creation of a control set---a set of documents randomly sampled from the collection and coded for relevance by a human expert. The control set can then be used to estimate the richness (proportion relevant) of the collection, and to gauge the effectiveness of a predictive coding (PC) system as training is undertaken. We might also want to use the control set to estimate the completeness of the TAR process as a whole; attempting to do so directly, however, runs into problems.
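To make the richness estimate concrete, the following is a minimal sketch in Python (the function and variable names are illustrative, not drawn from any TAR system) of a point estimate with a normal-approximation binomial confidence interval over a coded control set:

```python
import math

def estimate_richness(control_labels, z=1.96):
    """Estimate collection richness from a randomly sampled control set.

    control_labels: booleans, True where the human expert coded the
    document as relevant. Returns (point_estimate, lower, upper) using
    a normal-approximation (Wald) interval at the given z (1.96 ~ 95%).
    """
    n = len(control_labels)
    r = sum(control_labels)                 # number coded relevant
    p = r / n                               # point estimate of richness
    se = math.sqrt(p * (1 - p) / n)         # standard error of a proportion
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Example: a 500-document control set with 35 documents coded relevant.
labels = [True] * 35 + [False] * 465
print(estimate_richness(labels))  # richness ~0.07, roughly (0.048, 0.092)
```

Because the sample is random, the same interval logic applies to any proportion measured on the control set, which is what makes it usable for monitoring the PC system during training.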
The control set can be used to estimate the effectiveness of the PC system on the collection precisely because it is a random sample of that collection. As training proceeds, however, the relevance of some documents in the collection will become known through human assessment---even more so if review begins before training is complete (as is often the case). Direct measures of process effectiveness on the control set will fail to take account of the relevant and irrelevant documents already found through human assessment: a relevant document a reviewer has already identified is retrieved by the process whether or not the PC system would flag it.
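The gap is easiest to see in a small sketch. Purely for illustration, assume each control-set document carries three flags (the field names, and the simplification that control-set documents can themselves be marked as already reviewed, are this sketch's assumptions, not the paper's): a classifier-only recall then diverges from a process-level recall that credits documents already found by review.

```python
def control_set_recall(control_docs):
    """Contrast naive classifier recall on a control set with a
    process-level recall crediting documents already found by review.

    Each doc is a dict with illustrative fields:
      'relevant'  -- expert coding of the control-set document
      'reviewed'  -- already judged during training/review
      'predicted' -- PC system predicts the document relevant
    """
    relevant = [d for d in control_docs if d['relevant']]
    # Naive: score the classifier alone over relevant control documents.
    naive = sum(d['predicted'] for d in relevant) / len(relevant)
    # Process-level: a reviewed relevant document counts as found
    # regardless of the classifier's prediction on it.
    process = sum(d['reviewed'] or d['predicted'] for d in relevant) / len(relevant)
    return naive, process

docs = [
    {'relevant': True,  'reviewed': True,  'predicted': False},  # found by review, missed by PC
    {'relevant': True,  'reviewed': False, 'predicted': True},
    {'relevant': True,  'reviewed': False, 'predicted': False},
    {'relevant': False, 'reviewed': False, 'predicted': True},
]
print(control_set_recall(docs))  # naive 1/3 vs. process-level 2/3
```

The naive figure penalizes the process for classifier misses on documents the review has already secured, which is exactly the accounting failure described above.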