Tutorial on confidence intervals in e-discovery

August 2nd, 2012

Ever since Judge Grimm opined that random sampling was a prudent means of checking the reliability of a production (Victor Stanley v. Creative Pipe, 269 F.R.D. 497), there has been strong interest in the topic of sampling within e-discovery, including from lawyers themselves. Ralph Losey, for instance, has devoted a post on his blog to the topic of sampling, and his recent blog posts narrating an example predictive coding exercise have contained much sampling-related material.

I've written some research work on more advanced topics in confidence intervals, but I thought it might be useful to write some more introductory material as well. I originally intended to write a series of blog posts giving a brief tutorial on sampling and estimation, but the brief tutorial worked out to be around 5,000 words, so I've made it into a separate document: A tutorial on interval estimation for a proportion, with particular reference to e-discovery. The tutorial aims to give an understanding of the workings behind confidence intervals, while avoiding as much math as possible. (If you want a still more high-level discussion of sampling, estimation, and intervals, then I recommend Venkat Rangan's post on Predictive Coding -- Measurement Challenges.) The tutorial is marked as Version 0.1; I'd be very grateful for any corrections, comments, or suggestions for improvement, and will work them into later versions.
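To give a flavour of the sort of calculation the tutorial discusses, here is a minimal Python sketch (not an excerpt from the tutorial; the function name and figures are illustrative) of the Wilson score interval for a binomial proportion, which behaves better than the textbook normal ("Wald") interval when the observed proportion is near 0 or 1 -- a common situation when sampling for residual relevant documents:

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 gives an (approximately) 95% confidence interval.
    """
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (max(0.0, centre - half), min(1.0, centre + half))

# Say 15 relevant documents turn up in a random sample of 200:
lo, hi = wilson_interval(15, 200)
```

Note that the interval is not symmetric around the sample proportion of 0.075; unlike the Wald interval, it pulls away from the boundary at 0, and never extends below 0 or above 1.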

Do document reviewers need legal training?

July 15th, 2012

In my last post, I discussed an experiment in which we had two assessors re-assess TREC Legal documents with less and more detailed guidelines, and found that the more detailed guidelines did not make the assessors more reliable. Another natural question to ask of these results, though not one the experiment was directly designed to answer, is how well our assessors compared with the first-pass assessors employed for TREC, who for this particular topic (Topic 204 from the 2009 Interactive task) happened to be a review team from a vendor of professional legal review services. How well do our non-professional assessors compare to the professionals?
Read the rest of this entry »

Detailed guidelines don't help assessors

July 2nd, 2012

Social scientists are often accused of running studies that confirm the obvious, such as that people are happier on the weekends, or that having many meetings at work makes employees feel fatigued. The best response is that what seems obvious may not actually be true. That, indeed, is what we found in a recent experiment. We set out to confirm that giving relevance assessors more detailed guidelines would make them more reliable. We found it didn't.
Read the rest of this entry »

"Approximate Recall Confidence Intervals", updated and in submission

May 18th, 2012

Much later than I intended, after painstaking editing to get the length down from 39 to 31 pages, I've prepared a revised version of "Approximate Recall Confidence Intervals", which is now in submission. Aside from tightening up the text and excluding a few inessential results, the main change from the first version has been to force interval upper edges to 1 where no relevant documents are found in the unretrieved sample, and to 0 where none are found in the retrieved sample. I've also released recci, an R package for computing recall confidence intervals, along with other R packages for generating figures and tables and re-running the experiments reported in the paper.

Recall confidence intervals

February 25th, 2012

Frequent readers of this blog will know of my burning desire to move IR research away from dry technical topics and towards questions that directly impact and excite the retrieval user. In pursuit of this goal, I have for the past year been working on a paper on estimating two-tailed confidence intervals for recall under simple and stratified random sampling of assessments. I posted a pre-print of this article, Approximate Recall Confidence Intervals, to arXiv about a week ago.
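By way of background (this is an illustrative sketch, not the interval methods analyzed in the paper; all names and numbers are made up), recall is typically estimated in this setting by sampling the retrieved and unretrieved segments separately, scaling each sample's relevant count up to a segment total, and taking the ratio. The hard part -- the paper's subject -- is placing a reliable interval around this estimate, particularly when relevant documents are rare in the unretrieved segment:

```python
def estimate_recall(n_ret, sample_ret, rel_ret,
                    n_unret, sample_unret, rel_unret):
    """Point estimate of recall from simple random samples of the
    retrieved and unretrieved segments of a collection.

    n_ret, n_unret         -- segment sizes
    sample_ret, sample_unret -- documents sampled from each segment
    rel_ret, rel_unret     -- relevant documents found in each sample
    """
    # Scale each sample count up to an estimated segment total.
    R_ret = n_ret * rel_ret / sample_ret
    R_unret = n_unret * rel_unret / sample_unret
    return R_ret / (R_ret + R_unret)

# 10,000 docs retrieved, sample 200, find 120 relevant;
# 90,000 docs unretrieved, sample 300, find 6 relevant.
est = estimate_recall(10000, 200, 120, 90000, 300, 6)
```

Notice how heavily the estimate leans on the 6 relevant documents in the unretrieved sample: small absolute changes in that count swing the estimated recall substantially, which is precisely why naive interval constructions behave poorly here.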

Sampling distribution of recall


Read the rest of this entry »

Attention-enhancing information retrieval

February 19th, 2012

Last week I was at SWIRL, the occasional talkshop on the future of information retrieval. To me the most important of the presentations was Diane Kelly's "Rage against the Machine Learning", in which she observed that the way information retrieval currently works has changed the way people think. In particular, she proposed that the combination of short query with snippet response has reworked people's plastic brains to focus on working memory, and forgo the processing of information required for it to lay its tracks down in our long-term memory. In short, it makes us transactionally adept, but stops us from learning.
Read the rest of this entry »

How accurate can manual review be?

December 18th, 2011

One of the chief pleasures for me of this year's SIGIR in Beijing was attending the SIGIR 2011 Information Retrieval for E-Discovery Workshop (SIRE 2011). The smaller and more selective the workshop, it often seems, the more focused and interesting the discussion.

Read the rest of this entry »

Assessor disagreement and court sanctions

September 4th, 2011

I mentioned Cross and Kerksiek's suggestion of vocabulary discovery in my previous post. Their paper also contains an interesting reference to a case (Felman Products, Inc. v. Industrial Risk Insurers) in which the defendant was penalized for the carelessness of their production. The defendant inadvertently produced privileged documents, and sought to have them ruled inadmissible. Two judges, the original and an appellate, ruled against the defendant, on the grounds that the defendant had not shown sufficient care in their production.
Read the rest of this entry »

Corpus characterization in e-discovery

September 4th, 2011

In e-discovery (document retrieval for civil litigation), one side has the documents, the other side proposes the query. This creates an information asymmetry; the requesting side cannot view the corpus to decide what keywords to use and what queries to propose, and opportunities for query iteration are limited, expensive, and liable to being contested.
Read the rest of this entry »


July 21st, 2011

Harvard researcher and open-access advocate Aaron Swartz faces 35 years in jail for downloading 4.8 million articles from JSTOR. Still relaxed and comfortable about publishing in closed-access journals?

Correct spelling and grammar more important than positivity or negativity of product reviews -- Panos Ipeirotis.

Fitting an elephant with four parameters.

Placebos as effective as real medicine in improving subjectively-measured asthma symptoms, but ineffective in improving objectively-measured symptoms -- Science-based medicine.

The Spread of Evidence-Poor Medicine via Flawed Social-Network Analysis. You don't need to worry about friends catching your obesity -- but you might need to worry (even more) about being subjected to interventions based upon poor statistics and faulty peer reviewing.