Patent Number: 8,793,271

Title: Searching documents using a dynamically defined ignore string

Abstract: Techniques are disclosed for searching a plurality of documents using a dynamically defined ignore string. The ignore string may be specified by a user. An overlay index may be generated over the plurality of documents. The overlay index may include a posting list for each term in the ignore string. Each posting list may specify the documents of the plurality of documents in which the respective term occurs outside of the ignore string. The overlay index may also include a posting list that specifies all occurrences of the ignore string in the plurality of documents. Once the overlay index is generated, a user may search the plurality of documents while occurrences of the ignore string in the plurality of documents are ignored in text-based searches.
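
The overlay index described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes whitespace tokenization, lowercase matching, single-term queries, and document-level posting lists, and the names tokenize, build_base_index, build_overlay_index, and search are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption made for this sketch)."""
    return text.lower().split()


def build_base_index(docs):
    """Ordinary inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def build_overlay_index(docs, ignore_string):
    """Return (term_postings, occurrences) for the given ignore string.

    term_postings maps each ignore-string term to the documents in which
    that term occurs outside of any occurrence of the ignore string.
    occurrences lists every (doc_id, start_position) of the ignore string.
    """
    ignore_terms = tokenize(ignore_string)
    n = len(ignore_terms)
    if n == 0:
        return {}, []
    term_postings = {term: set() for term in ignore_terms}
    occurrences = []
    for doc_id, text in enumerate(docs):
        tokens = tokenize(text)
        covered = set()  # token positions that lie inside an occurrence
        for start in range(len(tokens) - n + 1):
            if tokens[start:start + n] == ignore_terms:
                occurrences.append((doc_id, start))
                covered.update(range(start, start + n))
        for pos, term in enumerate(tokens):
            if term in term_postings and pos not in covered:
                term_postings[term].add(doc_id)
    return term_postings, occurrences


def search(term, base_index, term_postings):
    """Single-term search that ignores occurrences of the ignore string."""
    term = term.lower()
    if term in term_postings:
        # The overlay posting list replaces the base one for this term.
        return sorted(term_postings[term])
    return sorted(base_index.get(term, set()))


docs = [
    "this email is confidential do not forward",
    "please forward the report",
    "confidential memo do not forward please forward it",
]
base_index = build_base_index(docs)
term_postings, occurrences = build_overlay_index(docs, "do not forward")

print(search("forward", base_index, term_postings))       # [1, 2]
print(search("confidential", base_index, term_postings))  # [0, 2]
print(occurrences)                                        # [(0, 4), (2, 2)]
```

In this sketch, document 0 contains "forward" only inside the ignore string, so it is excluded from the results for that term, while the base index is left unchanged. Keeping the overlay separate means a different ignore string can be applied by rebuilding only the overlay.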

Inventors: Balakrishnan; Sreeram V. (Los Altos, CA), Busch; Michael (San Jose, CA), Hinrichs; Christopher J. (San Jose, CA), Jacopi; Tom W. (San Jose, CA), Neumann; Andreas (Sunnyvale, CA)

Assignee: International Business Machines Corporation

International Classification: G06F 17/30 (20060101)

Expiration Date: 7/29/2018