DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This Office Action is in response to correspondence filed 28 June 2019 in reference to application 16/457,372.  Claims 1-20 are pending and claims 1-15 have been examined.

Election/Restrictions
Restriction to one of the following inventions is required under 35 U.S.C. 121:
I. Claims 1-15, drawn to semantically indexing a corpus, classified in G06F40/40.
II. Claims 16-20, drawn to searching a corpus, classified in G06F16/90335.
Inventions I and II are directed to related fields of corpus searching. The related inventions are distinct if: (1) the inventions as claimed are either not capable of use together or can have a materially different design, mode of operation, function, or effect; (2) the inventions do not overlap in scope, i.e., are mutually exclusive; and (3) the inventions as claimed are not obvious variants.  See MPEP § 806.05(j). In the instant case, the inventions as claimed are different in operation and function as one invention is for preparing a corpus for search, while the other is for processing a query in a search.  Furthermore, the inventions as claimed do not encompass overlapping subject matter and there is nothing of record to show them to be obvious variants.
Restriction for examination purposes as indicated is proper because all the inventions listed in this action are independent or distinct for the reasons given above and there would be a serious search and/or examination burden if restriction were not required because one or more of the following reasons apply:
Different prior art is relevant to each invention and thus would create a serious search burden.
Applicant is advised that the reply to this requirement to be complete must include (i) an election of an invention to be examined even though the requirement may be traversed (37 CFR 1.143) and (ii) identification of the claims encompassing the elected invention.
The election of an invention may be made with or without traverse. To reserve a right to petition, the election must be made with traverse. If the reply does not distinctly and specifically point out supposed errors in the restriction requirement, the election shall be treated as an election without traverse. Traversal must be presented at the time of election in order to be considered timely. Failure to timely traverse the requirement will result in the loss of right to petition under 37 CFR 1.144. If claims are added after the election, applicant must indicate which of these claims are readable upon the elected invention.
Should applicant traverse on the ground that the inventions are not patentably distinct, applicant should submit evidence or identify such evidence now of record showing the inventions to be obvious variants or clearly admit on the record that this is the case. In either instance, if the examiner finds one of the inventions unpatentable over the prior art, the evidence or admission may be used in a rejection under 35 U.S.C. 103 or pre-AIA 35 U.S.C. 103(a) of the other invention.
During a telephone conversation with Weiguo Chen on 25 January 2021 a provisional election was made without traverse to prosecute the invention of I, claims 1-15.  Affirmation of this election must be made by applicant in replying to this Office action.  Claims 16-20 are withdrawn from further consideration by the examiner, 37 CFR 1.142(b), as being drawn to a non-elected invention.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4 and 9-12 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (US PAP 2017/0083610) in view of Rangan (US PAP 2012/0209847).

Consider claim 1, Wang teaches a corpus generating method (0004, 0045, gathering and organizing corpus for search), implementable by a computing device (0105, computer systems), the method comprising:
generating a corpus according to corpus content (0045-49, organizing information and documents for search); and

but does not specifically teach generating a corpus vector; 
determining a vector type of the corpus vector; and 
indexing according to the vector type and the corpus vector. 
In the same field of organizing information for search, Rangan teaches
generating a corpus vector (0089-90, determining vectors for words found in each of the documents of the corpus); 
determining a vector type of the corpus vector (0097-99, projecting vectors into document vectors); and 
indexing according to the vector type and the corpus vector (0098, indexing).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to use vector-based indexing as taught by Rangan in the system of Wang in order to improve relevancy in document retrieval (Wang 0008-10).

Consider claim 2, Wang and Rangan teach the method according to claim 1, wherein the generating a corpus vector according to corpus content comprises:
determining a word segmentation result of the corpus content, the word segmentation result including a plurality of words (Wang 0047 text segmentation); 
determining word vectors corresponding to the words in the word segmentation result (Wang 0048, multidimensional features based on segmentation; Rangan 0089-90, determining word vectors); and


Consider claim 3, Rangan teaches the method according to claim 2, wherein the determining word vectors corresponding to the words in the word segmentation result comprises: searching a preset word vector library to determine the word vectors corresponding to the words in the word segmentation result (0097, word frequency vectors determined from preset documents).

Consider claim 4, Rangan teaches the method according to claim 2, wherein the generating the corpus vector corresponding to the corpus content according to the word vectors corresponding to the words in the word segmentation result comprises: summing the word vectors corresponding to the words in the word segmentation result to generate the corpus vector (0097, equation 1).
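For illustration only, the summing limitation of claim 4 can be sketched as follows. This is a minimal sketch and is not drawn from Rangan, Wang, or the application: the names WORD_VECTORS and corpus_vector, the three-dimensional values, and the choice to skip words missing from the library are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical preset word vector library (illustrative values only).
WORD_VECTORS: Dict[str, List[float]] = {
    "patent": [1.0, 0.0, 2.0],
    "search": [0.5, 1.5, 0.0],
    "index": [0.0, 1.0, 1.0],
}


def corpus_vector(words: List[str], library: Dict[str, List[float]]) -> List[float]:
    """Sum the word vectors of the segmented words.

    Words absent from the library are skipped (an assumption of this sketch).
    """
    dim = len(next(iter(library.values())))
    total = [0.0] * dim
    for word in words:
        vec = library.get(word)
        if vec is None:
            continue
        # Element-wise addition of the word vector into the running sum.
        total = [t + v for t, v in zip(total, vec)]
    return total
```

Here the word list stands in for the word segmentation result of claim 2, and the dictionary lookup stands in for searching the preset word vector library of claim 3.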

Consider claim 9, Wang teaches a corpus generating apparatus (abstract), comprising: one or more processors (0111) and one or more non-transitory computer-readable memories (0112) coupled to the one or more processors and configured with instructions executable by the one or more processors to cause the apparatus to perform operations comprising:
generating a corpus according to corpus content (0045-49, organizing information and documents for search); and

but does not specifically teach generating a corpus vector; 
determining a vector type of the corpus vector; and 
indexing according to the vector type and the corpus vector. 
In the same field of organizing information for search, Rangan teaches
generating a corpus vector (0089-90, determining vectors for words found in each of the documents of the corpus); 
determining a vector type of the corpus vector (0097-99, projecting vectors into document vectors); and 
indexing according to the vector type and the corpus vector (0098, indexing).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to use vector-based indexing as taught by Rangan in the system of Wang in order to improve relevancy in document retrieval (Wang 0008-10).

Claim 10 contains similar limitations as claim 2 and is therefore rejected for the same reasons.

Claim 11 contains similar limitations as claim 3 and is therefore rejected for the same reasons.

Claim 12 contains similar limitations as claim 4 and is therefore rejected for the same reasons.

Claims 5-7 and 13-15 are rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Rangan as applied to claim 1 above, and further in view of Jegou (Product Quantization for Nearest Neighbor Search).

Consider claim 5, Wang and Rangan teach the method according to claim 1, but do not specifically teach wherein the determining a vector type of the corpus vector comprises: determining the vector type of the corpus vector through Product Quantization.
In the same field of search, Jegou teaches wherein the determining a vector type of the corpus vector comprises: determining the vector type of the corpus vector through Product Quantization (section 2.2, product quantization).
Therefore it would have been obvious to one of ordinary skill in the art at the time of effective filing to generate vectors with product quantization as taught by Jegou in the system of Wang and Rangan to increase computational efficiency (Jegou introduction).

Consider claim 6, Wang, Rangan and Jegou teach the method according to claim 5, further comprising generating a plurality of corpus vectors (Rangan 0089-90, determining vectors for words found in each of the documents of the corpus), and wherein the determining the vector type of the corpus vector through Product Quantization comprises: 

clustering each of the sub-vector sets using a clustering algorithm (Jegou section 2.2, clustering), and generating m class centers for each of the sub-vector sets, wherein m is a positive integer equal to or greater than one (Jegou section 2.2, generating m centroids);
determining a class center to which each sub-vector of each corpus vector belongs (Jegou section 2.2, generating a centroid); and
determining a vector type of the corpus vector according to the class centers to which the k sub-vectors of the each corpus vector belong (Jegou section 2.2, determining vector quantization based on centroids).
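For illustration only, the product quantization steps recited in claim 6 can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Jegou or of the application: the function names are hypothetical, k and m follow the claim language (k sub-vectors per corpus vector, m class centers per sub-vector set), the clustering algorithm is a plain k-means seeded with the first m points, and the vector dimension is assumed to be divisible by k.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def split(vec: Sequence[float], k: int) -> List[Vector]:
    """Divide a vector into k equal-length sub-vectors."""
    step = len(vec) // k
    return [list(vec[i * step:(i + 1) * step]) for i in range(k)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two sub-vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(sub: Sequence[float], centers: List[Vector]) -> int:
    """Index of the class center closest to a sub-vector."""
    return min(range(len(centers)), key=lambda j: sq_dist(sub, centers[j]))


def kmeans(points: List[Vector], m: int, iters: int = 10) -> List[Vector]:
    """Plain k-means; the first m points seed the centers (an assumption)."""
    centers = [list(p) for p in points[:m]]
    for _ in range(iters):
        groups: List[List[Vector]] = [[] for _ in range(m)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old center if no point was assigned to it
                centers[j] = [sum(c) / len(g) for c in zip(*g)]
    return centers


def vector_types(vectors: List[Vector], k: int, m: int) -> List[Tuple[int, ...]]:
    """Return, for each vector, the tuple of its k class-center indices."""
    subs = [split(v, k) for v in vectors]
    # One set of m class centers per sub-vector position.
    codebooks = [kmeans([s[i] for s in subs], m) for i in range(k)]
    return [tuple(nearest(s[i], codebooks[i]) for i in range(k)) for s in subs]
```

In this sketch the "vector type" is the tuple of class-center indices, so two corpus vectors share a type exactly when each of their k sub-vectors falls to the same class center.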

Consider claim 7, Wang, Rangan and Jegou teach the method according to claim 6, wherein the generating, according to the vector type and the corpus vector, a corpus having an inverted chain index comprises: generating the corpus having the inverted chain index according to record data, each piece of the record data including corpus vectors having a same vector type and the corresponding vector type (Wang abstract, 0019, 0046, generating an inverted chain index based on keyword frequencies; Rangan 0089-90, determining vectors for words found in each of the documents of the corpus).
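For illustration only, the record data recited in claim 7 can be sketched as a mapping from each vector type to the corpus vectors that share it. This is a minimal sketch and is not taken from Wang, Rangan, or the application: the name build_inverted_index and the use of a dictionary of lists as the inverted chain structure are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

VectorType = Tuple[int, ...]


def build_inverted_index(
    vectors: List[List[float]], types: List[VectorType]
) -> Dict[VectorType, List[List[float]]]:
    """Group corpus vectors under their vector type.

    Each key/value pair plays the role of one piece of record data:
    a vector type together with all corpus vectors having that type.
    """
    index: Dict[VectorType, List[List[float]]] = defaultdict(list)
    for vec, vtype in zip(vectors, types):
        index[vtype].append(vec)
    return dict(index)
```

A lookup by vector type then returns only the corpus vectors of that type, rather than requiring a scan of every vector in the corpus.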

Claim 13 contains similar limitations as claim 5 and is therefore rejected for the same reasons.

Claim 14 contains similar limitations as claim 6 and is therefore rejected for the same reasons.

Claim 15 contains similar limitations as claim 7 and is therefore rejected for the same reasons.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Wang and Rangan as applied to claim 1 above, and further in view of Weare et al. (US PAP 2006/0242135).

Consider claim 8, Wang and Rangan teach the method according to claim 1, further comprising: generating an incremental corpus vector according to incremental corpus content (Rangan 0099, 0112, 0146, incrementally added documents); generating an incremental corpus according to the incremental corpus vector (Rangan 0099, 0112, 0146, incrementally added documents, updating models); and forming a new corpus based on the corpus and the incremental corpus (Rangan 0099, 0112, 0146, incrementally added documents, updating corpus).
Wang and Rangan fail to specifically teach having a tile index. 
In the same field of search, Weare teaches having a tile index (0062 indexing using content tiles).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and is listed on the Notice of References Cited.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS C GODBOLD whose telephone number is (571)270-1451.  The examiner can normally be reached on 7:30-12 Monday and Friday, 7:30-6 Tuesday-Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached on (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only.


DOUGLAS GODBOLD
Examiner
Art Unit 2658



/DOUGLAS GODBOLD/Primary Examiner, Art Unit 2658