Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 4-5, 7, 10-11, 13, 16-17, and 19-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Bai et al. US 2009/0204605 (hereinafter “Bai”).
Regarding claim 1, Bai discloses a computer-implemented method (see paragraph 0007, a computer process for searching for information contained in a database of documents), comprising: for each file from a plurality of files: chunking the file into a plurality of chunks (see figure 1 and paragraph 0014: documents are collected in step 10 of figure 1 and processed in step 13 with a sentence splitter; the Examiner interprets the sentence splitter as equivalent to chunking the file [document] into a plurality of chunks [sentences]);

loading the plurality of chunks into a neural network and operating the neural network to generate a plurality of feature vectors, one feature vector for each of the plurality of chunks, wherein the neural network is pre-trained to generate similar feature vector for similar chunks and dissimilar vector for dissimilar chunks (see paragraph 0014 above: the sentences at the output 14 of the sentence splitter are processed by a tagger comprising a neural network, which computes part-of-speech [POS] tags for each sentence [the Examiner interprets the POS tags as equivalent to feature vectors; see paragraph 0016, which supports that the neural network tagger generates vectors, since the first layer of the neural network tagger converts each word into a low dimensional vector]; the POS tags created would inherently be similar for similar sentences and dissimilar for dissimilar sentences; note that the POS tags are used in a similarity ranking in an index of figure 1); assemble the plurality of feature vectors into a single vector (see figure 2 and paragraph 0018: in the third layer of the neural network tagger, the columns or vectors of the resulting matrix from the second layer are converted into a single vector in step 24);

loading the single vector into the neural network to generate a file feature vector (see figure 2: step 25 takes the single vector as an input and generates a semantic class prediction, which is read as the file feature vector), wherein the neural network is pre-trained to generate similar file feature vector for similar assembled feature vectors and dissimilar file feature vector for dissimilar assembled feature vectors (again, this is inherent, as similar words and sentences will generate similar classifications and dissimilar words and sentences will generate dissimilar classifications); and storing the file feature vectors of the plurality of files into a file feature vector database (see figure 1, 101 and paragraph 0014: the sentences and their POS tags...are stored in a database in step 19).
Regarding claim 4, Bai discloses upon receiving a query file (see query 17 of figure 1 and paragraph 0007 above) performing the operations: chunking the query file into a plurality of chunks (see figure 2 above, the words of the query are separated [into chunks]); loading the plurality of chunks into the neural network and operating the neural network to generate a plurality of feature vectors, one feature vector for each of the plurality of chunks (see figure 2 and paragraph 0014 above); concatenating the plurality of feature vectors into a query vector (see figure 1 and paragraphs 0007 and 0014 above); loading the query vector into the neural network and operating the neural network to search the file feature vector database for all file feature vectors within a specified distance from the query vector (see paragraph 0016 and figure 1 above).
Regarding claim 5, Bai discloses wherein a distance between the query vector and a file feature vector equals a distance between the query file and a file corresponding to the file feature vector to within a predefined error (see paragraph 0018 above: the neural network tagger is trained by backpropagation to minimize training error).
Claim 7 is similarly analyzed to claim 1. 
Claims 10-11 are similarly analyzed to claims 4-5.
Claim 13 is similarly analyzed to claim 1. 
Claims 16-17 are similarly analyzed to claims 4-5.
Regarding claim 19, Bai discloses a data processing system, comprising: a backup database comprising a plurality of files (see paragraph 0007 and figure 1 as well as the explanation above for claim 1); a similarity index database comprising a plurality of similarity vectors, one similarity vector representing each of the files of the plurality of files (see paragraph 0014 above as well as the explanation given above for the similar limitation of claim 1), wherein each of the similarity vectors is generated from a concatenation of a plurality of chunk feature vectors (see paragraph 0018 and figure 2 as well as the explanation above where the POS features are converted into a single vector), each of the chunk feature vectors representing a chunk of a file from the plurality of files (see paragraphs 0007 and 0018 above); a neural network pre-trained to receive a query file, generate a query vector, and provide an indication of all similarity vectors from the plurality of similarity vectors that are within a specified distance from the query vector (see paragraph 0007 above).
Regarding claim 20, Bai discloses a chunk feature database storing the plurality of chunk feature vectors (101 of figure 1), and wherein the neural network is further pre-trained to receive a query chunk (17 of figure 1 above), generate a chunk query vector, and provide an indication of all similarity chunk vectors from the plurality of chunk feature vectors that are within a specified distance from the chunk query vector (similarity ranking 12 above and results 18, see also paragraph 0007 above).

Allowable Subject Matter
Claims 2-3, 6, 8-9, 12, 14-15, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Please see the attached 892 notice of references cited.


Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHN B STREGE whose telephone number is (571)272-7457. The examiner can normally be reached M-F 9-5 (PST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park, can be reached at (571)272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/JOHN B STREGE/           Primary Examiner, Art Unit 2669