DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP §§ 706.02(l)(1) - 706.02(l)(3) for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).

Claims 1-7 are provisionally rejected on the ground of nonstatutory double patenting as being unpatentable over claims 8-20 of copending Application 15/795071. Although the claims are not identical, they are not patentably distinct from each other.

Notes on the Prior Art
A few notes on what is well-known in the art.
Locality-sensitive hashing (LSH) and its variations are well-known indexing techniques for approximate similarity search. LSH constructs indexing data structures (multiple hash tables) for similarity search. To perform a similarity search, the indexing method hashes a query object to find the matches most similar to the query. Such matching uses well-known similarity or distance measures, such as Jaccard (also known as the Jaccard index or Jaccard coefficient), Hamming, Dice, Levenshtein, Jaro-Winkler or cosine. The most common and widely used is the Jaccard index, which calculates an LSH or MinHash value for all values in the query. This creates an array/index structure of multiple hashes. To perform a search, a query is also hashed, and the similarity between the query hash and the Jaccard index hash (MinHash, LSH, SimHash, etc.) is determined. Usually the similarity is weighted by the probability of occurrence of the hash in the index, which yields the weighted Jaccard index as the collision probability.
See https://en.wikipedia.org/wiki/MinHash
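For illustration only, the MinHash technique noted above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the function names (`minhash_signature`, `estimated_jaccard`, `exact_jaccard`), the use of seeded BLAKE2b as the hash family, and the signature length are all hypothetical choices. Each set of tokens is reduced to an array of minimum hash values, and the fraction of positions at which two arrays agree estimates the Jaccard index of the underlying sets.

```python
import hashlib


def _hash(token: str, seed: int) -> int:
    # One member of a hash family: BLAKE2b salted with the seed (illustrative choice).
    digest = hashlib.blake2b(
        token.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(tokens, num_hashes=128):
    # For each hash function, keep the minimum hash over the (non-empty) token set.
    unique = set(tokens)
    return [min(_hash(t, seed) for t in unique) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    # Fraction of signature positions that collide; assumes equal-length signatures.
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def exact_jaccard(a, b):
    # Reference value: |A intersect B| / |A union B|.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)
```

In this sketch a query and a document are hashed the same way, so the comparison is made between two arrays of hashes rather than between the raw token sets. The estimate approaches the exact Jaccard index as `num_hashes` grows.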

Claim Interpretations / Constructions
A few notes on the claim interpretation. Claim 1 requires processing a query with multiple Boolean operators and constructing the AND and OR indices. However, the way the claim is written, the results are only returned from the OR index. Thus, the AND index is never actually used in the independent claims and its purpose is unknown.
Further, once the OR index is constructed, there is no difference between such an index and any other hashed index, i.e., it is just an array of hashed words (i.e., a plurality of OR conditions of the query).

Therefore, on the record, the limitation – 
“a query similarity score for a first document in the set of documents, wherein the query similarity score is based on a first hash value computed for the OR condition of the query, a weight value for the OR condition, and a second hash value for the first document specified in an OR index”- 
- is construed to be analogous to performing a similarity search on the weighted Jaccard index, as shown above, for the OR condition of the query.
In view of the above, please see the alternative rejection of the independent claims below.
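For illustration only, the weighted Jaccard similarity referred to in the above construction can be sketched as follows. This is a minimal sketch assuming the common definition of weighted Jaccard (sum of element-wise minima divided by sum of element-wise maxima); the function name `weighted_jaccard` and the dictionary representation of term weights are hypothetical and are not taken from the claims or from any cited reference. Hash-based schemes aim to estimate this quantity without comparing the weight vectors directly.

```python
def weighted_jaccard(query_weights, doc_weights):
    # Inputs map a term (e.g. an OR condition) to a non-negative weight.
    keys = set(query_weights) | set(doc_weights)
    numerator = sum(
        min(query_weights.get(k, 0.0), doc_weights.get(k, 0.0)) for k in keys
    )
    denominator = sum(
        max(query_weights.get(k, 0.0), doc_weights.get(k, 0.0)) for k in keys
    )
    # Two empty inputs are treated as having zero similarity (a convention, not a rule).
    return numerator / denominator if denominator else 0.0
```

For example, a query weighted `{"red": 2.0, "blue": 1.0}` compared with a document weighted `{"red": 1.0, "green": 1.0}` gives a numerator of 1.0 and a denominator of 4.0, i.e., a score of 0.25.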

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3 and 5 are rejected under 35 U.S.C. 103 as being unpatentable over Cao et al. (US 8,166,021) in view of STOICA et al. (US 2017/0161375) and in further view of Ryger et al. (US 2015/0310005).

Regarding claim 1, Cao teaches a method comprising: 
receiving a query specifying an AND condition and an OR condition (C23L16-19, 45-47, C29L1-5); 
determining, based on an AND index structure, a set of documents, of a plurality of documents in a corpus, satisfying the AND condition of the query (C16L22-24, 44-49, C25L55-59, C26L22-24,67-68 where "index server assigned to the AND nodes" is an AND index); 
computing, by a processor, a query similarity score for a first document in the set of documents (C16L34-47, C31L43-50 “generate a phrase relevance score for a document with respect to a phrase”), wherein the query similarity score is based on a first hash value (C14L65-67, C16L1-3) 
returning an indication of the first document and the query similarity score as responsive to the query (C31L50-59).

Note - Cao teaches that Boolean query is divided into nodes (AND/OR) and each node is assigned to a set of index servers (C26L22-32, 59-60).  An index assigned to the OR node is an OR index and likewise an index assigned to the AND node is an AND index.  Clearly, different portions of the query 
Still, Cao does not explicitly teach, but STOICA discloses, wherein the query similarity score is based on a first hash value computed for the OR condition of the query ([0063], [0076]).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Cao to include a first hash value computed for the OR condition of the query as disclosed by STOICA. Doing so would provide location information stored in the index with minimal computing resources and improve the speed and efficiency with which the engine responds to a query for documents (STOICA [0092], [0109]).

Cao as modified by STOICA does not explicitly teach a weight value for the OR condition. However, Ryger discloses the same in [0088], [0149].
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Cao to include a weight value for the OR condition as disclosed by Ryger. Doing so would improve relevancy ranking and allow scoring that better matches the user’s intention (Ryger [0102], [0155]).

Regarding claim 2, Cao as modified teaches the method of claim 1, wherein the AND index comprises a posting list configured to store a document identifier (ID) for each document including a 

Regarding claim 3, Cao as modified teaches the method of claim 2, wherein the query specifies a plurality of AND conditions, wherein the determined set of documents satisfy each of the plurality of AND conditions, wherein determining the set of documents comprises: 
generating a search query including an indication of each of the plurality of AND conditions specified in the query (Cao C15L11-15, C24L1-50, C29L1-67); 
processing the search query against the AND index (Cao C16L1-3, C26L18-24); and receiving, from the AND index, the set of documents comprising the document ID of each document in the set of documents (Cao C26L44-49, 66-67, C28L36-44, C31L14-21, STOICA [0047]).
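For illustration only, an index of posting lists that is queried with a plurality of AND conditions, as recited in claims 2 and 3, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names (`build_and_index`, `match_all`) and the in-memory dictionary layout are hypothetical, and the sketch is not a characterization of how Cao, STOICA, or the application implements the index.

```python
from collections import defaultdict


def build_and_index(docs):
    # docs maps a document ID to the terms it contains.
    # The index maps each term to a sorted posting list of document IDs.
    postings = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def match_all(index, and_terms):
    # Return the IDs of documents that satisfy every AND condition.
    if not and_terms:
        return []
    # Start from the shortest posting list to keep intermediate results small.
    lists = sorted((index.get(term, []) for term in and_terms), key=len)
    result = set(lists[0])
    for posting_list in lists[1:]:
        result &= set(posting_list)
        if not result:
            break
    return sorted(result)
```

In this sketch, the set returned by `match_all` corresponds to the "set of documents" received from the AND index, expressed as document IDs.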

Regarding claim 5, Cao as modified teaches the method of claim 1, further comprising prior to computing the similarity score for the first document: 
receiving a document identifier (ID) for the first document from the OR index (Cao C26L59-61, C30L59-63, STOICA [0047]); and
determining that the document ID for the first document is included in the set of documents (Cao C32L5-17, 25-26, STOICA [0047]).
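For illustration only, the ordering recited in claim 5 (a document ID is received from the OR index, checked for membership in the set of documents satisfying the AND condition, and only then scored) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `score_candidates` and its parameters are hypothetical, and `score_fn` stands in for whatever query similarity score is used.

```python
def score_candidates(or_candidates, and_matches, score_fn):
    # or_candidates: document IDs received from the OR index, in arrival order.
    # and_matches: document IDs in the set satisfying the AND condition.
    allowed = set(and_matches)
    results = {}
    for doc_id in or_candidates:
        if doc_id not in allowed:
            # Not in the AND set: no score is computed and nothing is returned for it.
            continue
        results[doc_id] = score_fn(doc_id)
    return results
```

Because the membership check precedes the call to `score_fn`, a document absent from the AND set incurs no scoring cost in this sketch.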

Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Newman et al. as modified and in further view of Liddy et al. (US 5,963,940).


receiving a document identifier (ID) for a second document of the plurality of documents in the corpus from the OR index (Cao C26L59-61, C30L59-63).
Newman as modified does not explicitly teach, but Liddy discloses  
determining that the document ID for the second document is not included in the set of documents; refraining from computing a query similarity score for the second document (Liddy C22L42-49 as in “no PTS score is generated based on the query term”, where PTS score is one of the “five individual measures of similarity (scores) between the query and a given document”); and 
refraining from returning the second document as responsive to the query (Liddy C24L64-67 – C25L1-15 “cutoff criteria on a ranked list of relevant documents for individual queries based on the similarity of documents to queries”).
It would have been obvious to one of ordinary skill in the art at the time of invention to modify the teachings of Newman as modified to refrain from computing a query similarity score as disclosed by Liddy.  Doing so would enable improved query processing efficiency (Cao C6L2).

Regarding claim 7, Cao as modified teaches the method of claim 1, wherein the AND index and the OR index are generated (Cao C26L13-32, 59-67) during a preprocessing phase of the plurality of documents in the corpus (Cao C26L13-32, 59-67).

Claim 1 is alternatively rejected under 35 U.S.C. 103 as being unpatentable over Bawa et al., “LSH Forest: Self-Tuning Indexes for Similarity Search”, in view of Cao et al. (US 8,166,021) and in further view of what is known in the art.

Regarding claim 1, Bawa teaches a method comprising: 
a range (page 651 C2, see “range queries”); 
determining, based on a range of the query (p.653 C2 “construct many LSH indexes … thus ensuring that there is one index that works well for every query … captures the nearest-neighbor distance for most queries, and construct the LSH index … the indexes work well for m-nearest neighbors”); 
computing, by a processor, a query similarity score for a first document in the set of documents (p.651 C2 “actual similarity to the query is computed”), wherein the query similarity score is based on a first hash value computed for the range of the query (p.653 §4.2, where distance, nearest neighbor is the similarity measure, also see query to document matching in §5.1 and p.657 C2 “Similarity Metric”), a weight value for the range (p.657 see C2 “Similarity Metric” where Jaccard similarity is computed for weighted terms), and a second hash value for the first document specified in an 
returning an indication of the first document and the query similarity score as responsive to the query (p.659, F5(b) where similarity of the results is output, also see p.651 “The results returned by LSH are guaranteed to have a similarity”).

Bawa does not explicitly teach, but Cao discloses, receiving a query specifying an AND condition and an OR condition (C23L16-19, 45-47, C29L1-5); 
determining, based on an AND index structure, a set of documents, of a plurality of documents in a corpus, satisfying the AND condition of the query (C16L22-24, 44-49, C25L55-59, C26L22-24,67-68 where "index server assigned to the AND nodes" is an AND index) and 
OR index (C26L30-32) and OR condition (F11A-B). 


The dependent claims are rejected in further view of Liddy, STOICA and Ryger as indicated in the primary rejection above.

Allowable Subject Matter
Claim 4 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims and upon overcoming the Double Patenting rejection. 

Conclusion
The prior art made of record and not relied upon, which is considered pertinent to applicant's disclosure, is indicated on PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to POLINA G PEACH whose telephone number is (571)270-7646.  The examiner can normally be reached on Monday-Friday, 9:30 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/POLINA G PEACH/
Primary Examiner, Art Unit 2165
January 13, 2021