DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4, 10, 12, 16 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Doermann¹ and further in view of Shaw et al.² or Musgrove.
With regard to claim 1, Doermann teaches a method for generating and searching a document feature repository, the method comprising: receiving a plurality of documents (see abstract: indexing a plurality of documents; see also § 1.2 ¶¶ 4-5); extracting a set of image features from one or more documents within the plurality of documents and generating the document feature repository by storing the one or more sets of image features, by document (see abstract, §§ 1.1-1.2: document image databases generated from text and graphic features); and receiving a document search query (see § 2.1 ¶ 3: provide a query as a set of terms). Doermann fails to explicitly teach pre-processing the document search query to generate a set of variable document search queries. However, Shaw et al. teach pre-processing the document search query to generate a set of variable document search queries (see ¶¶ 2, 4, 24-25: substitute words for query term); and searching the document feature repository using the set of variable document search queries and presenting the search results to a user (see ¶¶ 2, 4, 52: searching the database using the substitute terms and presenting the retrieved documents). See also Musgrove ¶¶ 5, 41-42: searching using alternative search terms and presenting the retrieved documents.
One skilled in the art before the effective filing date would have found it obvious to combine the teachings to arrive at the claimed invention. In particular, it would have been obvious to incorporate known teachings of using alternative or substitute terms for the query as taught by Shaw et al. and Musgrove into the configuration of Doermann for retrieving documents stored in the database. The motivation would have been to enhance the likelihood of finding a matching document by using synonyms or alternative terms.
With regard to claim 4, Shaw et al. teach parsing the document search query into a set of query subcomponents (see abstract: parsing into first and second query terms); modifying the set of query subcomponents by replacing a first query subcomponent with a second query subcomponent, wherein the first and second query subcomponents are similar (see abstract, ¶¶ 24-25: first query term replaced with synonym or similar word); and outputting the modified set of query subcomponents as a first variable document search query within the set of variable document search queries (see abstract, ¶¶ 49-52: extracting documents matching the revised search query).
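For illustration only, the parse, substitute, and output sequence discussed for claim 4 can be sketched as follows. This is a minimal sketch that is not drawn from Shaw et al. or Musgrove; the synonym table and the function name `variable_queries` are hypothetical.

```python
from typing import Dict, List

# Hypothetical synonym table; a real system might use a thesaurus or a learned model.
SYNONYMS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
    "invoice": ["bill"],
}


def variable_queries(query: str, synonyms: Dict[str, List[str]] = SYNONYMS) -> List[str]:
    """Return the normalized query plus one variant per single-term substitution."""
    terms = query.lower().split()  # parse the query into subcomponents
    variants = [" ".join(terms)]   # keep the original query as the first entry
    for i, term in enumerate(terms):
        for alt in synonyms.get(term, []):
            # replace one subcomponent with a similar subcomponent
            replaced = terms[:i] + [alt] + terms[i + 1:]
            variants.append(" ".join(replaced))
    return variants
```

Under these assumptions, each returned string is one variable document search query that could be run against the repository in turn.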
With regard to claims 9 and 16, see claim 1. 
With regard to claims 12 and 19, see claim 4.
Claims 2-3, 5-6, 10-11, 13-14, 17-18 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Doermann³ and further in view of Shaw et al.⁴ or Musgrove and further in view of Tian.⁵
	With regard to claim 2, Doermann fails to explicitly teach wherein extracting the set of image features comprises: parsing the one or more documents into a set of page images, by document; generating one or more sets of character blocks for each page image within the set of page images; extracting a plurality of character block features for each character block; and storing the plurality of character block features in the document feature repository, by page image. However, Tian teaches the missing features. Tian teaches wherein extracting the set of image features comprises: parsing the one or more documents into a set of page images, by document (see fig. 3, ¶ 24: analyzing each page of the document); generating one or more sets of character blocks for each page image within the set of page images (see figs. 1A, 3, ¶¶ 24, 26: extracting text regions for each page); extracting a plurality of character block features for each character block (see figs. 1A, 3, ¶¶ 25-27: features of text regions extracted); and storing the plurality of character block features in the document feature repository, by page image (see figs. 1, 3, ¶¶ 22, 37: storing the character or text block features in the database).
	One skilled in the art before the effective filing date would have found it obvious to combine the teachings to arrive at the claimed invention. In particular, it would have been obvious to incorporate known teachings of parsing the document by character block and by page as taught by Tian into the configuration of Doermann, yielding predictable results. The motivation would have been to index documents efficiently by using character block features for each page of the document and improving retrieval results. 
	With regard to claim 3, Tian teaches binarizing the one or more documents (see ¶ 24: pre-processing includes binarization); and performing morphological dilation on the binarized document to generate the one or more sets of character blocks (see ¶ 26: morphological operations). The motivation for incorporating the image binarization and morphological operation would have been to enhance extraction of text or character regions for indexing.
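For illustration only, binarization followed by morphological dilation can be sketched on a small grayscale grid. This is a minimal sketch that is not drawn from Tian; the threshold of 128, the square structuring element, and the function names are assumptions.

```python
from typing import List

Image = List[List[int]]


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Mark dark pixels (ink) as 1 and light pixels (background) as 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def dilate(binary: Image, radius: int = 1) -> Image:
    """Dilate with a square structuring element of the given radius.

    Every set pixel sets all pixels within `radius` rows and columns of it,
    so nearby characters merge into a single connected block.
    """
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x]:
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 1
    return out
```

In this sketch, two ink pixels separated by a one-pixel gap end up in the same connected region after dilation, which is the sense in which dilation groups characters into blocks.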
	With regard to claim 5, Tian teaches wherein the plurality of character block features includes a set of character features, a set of word features and a set of sentence features (see ¶¶ 25, 27, fig 3: number of lines or sentences, number or length of characters and pixel density, word lengths). The motivation for combining the references is the same as stated above. 
	With regard to claim 6, Tian teaches generating a set of match scores between each set within the plurality of character block features and the set of variable document search queries, determining a best match according to the set of match scores, and presenting the document associated with the best match score to the user (see fig. 3, ¶¶ 50-51: comparing character block features to determine similarity scores and presenting the results to the user).
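For illustration only, scoring stored character block features against a set of query variants and selecting the best match can be sketched as follows. This is a minimal sketch that is not drawn from Tian; representing a block by its sequence of word lengths, the scoring rule, and all names are assumptions.

```python
from typing import Dict, List, Optional, Tuple


def word_lengths(text: str) -> List[int]:
    """Hypothetical feature: the length of each word in the text."""
    return [len(word) for word in text.split()]


def match_score(block: List[int], query: List[int]) -> float:
    """Fraction of aligned positions whose word lengths agree."""
    n = max(len(block), len(query))
    if n == 0:
        return 0.0
    hits = sum(1 for a, b in zip(block, query) if a == b)
    return hits / n


def best_match(
    repository: Dict[str, List[List[int]]], queries: List[str]
) -> Tuple[Optional[str], float]:
    """Score every stored block against every query variant; return the top document."""
    best_doc: Optional[str] = None
    best_score = 0.0
    for doc_id, blocks in repository.items():
        for block in blocks:
            for query in queries:
                score = match_score(block, word_lengths(query))
                if score > best_score:
                    best_doc, best_score = doc_id, score
    return best_doc, best_score
```

Here `repository` maps a hypothetical document identifier to the feature vectors of its character blocks, and the document holding the highest-scoring block is the one that would be presented to the user.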
 	With regard to claims 10-11 and 13, see discussion of claims 2-3 and 5, respectively. 
	With regard to claim 14, see claim 6. 
	With regard to claims 17-18 and 20, see discussion of claims 2-3 and 5, respectively.	

Claims 7-8 and 15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AVINASH YENTRAPATI whose telephone number is (571)270-7982.  The examiner can normally be reached on 8AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz, can be reached on (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AVINASH YENTRAPATI/ Primary Examiner, Art Unit 2662

1 Doermann, David. "The indexing and retrieval of document images: A survey." Computer Vision and Image Understanding 70.3 (1998): 287-298.
2 US Publication No. 2015/0205866.
3 Doermann, David. "The indexing and retrieval of document images: A survey." Computer Vision and Image Understanding 70.3 (1998): 287-298.
4 US Publication No. 2015/0205866.
5 US Publication No. 2013/0170749.