Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION 

2.	Claims 1, 5-10, 14-18 and 20 are presented for examination.
3.	This office action is in response to the RCE filed 02/24/2021.
4.	Claims 1, 10 and 18 are independent claims. 
5.	Various claims have been amended to more particularly define the invention. No new matter has been added. Claims 22-23 have been cancelled without prejudice or disclaimer. 
6.	The office action is made Non-Final.

Continued Examination under 37 CFR 1.114
7.	A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 02/24/2021 has been entered.


Examiner Note
8.         The Examiner cites particular columns and line numbers in the references as applied to the claims below for the convenience of the Applicant(s). Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the Applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.


Claim Rejections - 35 USC § 103
9.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

10.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains.  Patentability shall not be negatived by the manner in which the invention was made.

11.	Claims 1, 5, 7, 10, 14, 16 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Essafi et al (US 20070271224) in view of Zhang et al (US 20170228361 A1) and Martineau et al (US 20190392330 A1) and further in view of Lee et al (US 7975003 B1).


12.	Regarding claim 1 (Currently amended), Essafi teaches a computer-implemented concept discovery method, the method comprising: 
preparing a concept index for concepts built over a set of input data comprising structured data having input terms (Figs 1 to 5, [0110]-[0111], “indexing multimedia documents by extracting terms for each document (a set of input data comprising structured data having input terms), where the terms are constituted by vectors characterizing properties of the document to be indexed and further extracting concepts for structuring the concept dictionary 5 or fingerprint base (concept index)”, see [0009] and [0038], [0049] and [0304], [0307], extracting terms describing the content of a structural component of documents/images (input data)); 
building a vector representation of the concepts in the input data ([0209], “the signature vectors (a vector representation) of each concept of the dictionary 5. The signature vector is constituted by the documents where the concept is present and by the positions and the weight of said concept in the document”, more specifically in Fig 4, [0241], the fingerprint base 10 (a vector representation) is constituted by the set of concepts representing the terms of the documents to be protected (input data)); 
receiving a set of query terms as an additional input (Figs 6 and 7, step 331, [0251], [0258], [0279], query terms 331 are received as an additional input in addition to the fingerprint base (after the concept index was built)); 
mapping the set of query terms to the concepts in the concept index (Figs 6 and 7, step 331, [0251], “search for concepts in the concept index that match (mapping) the set of query terms (step 33 of Fig 6 and step 332, Fig 7)”, [0255], for each term in the query (step 331 of Fig 7), the concept Ci is determined (step 332)); 
calculating: 
a co-occurrence score for each of the concepts in the concept index by measuring their frequency of co-occurrence with the input terms' concepts (Fig 7, [0017], “for each document in which a concept Ci is present, a fingerprint of the concept Ci is registered in the document, said fingerprint containing the frequency with which the concept occurs”, see also [0046] and more specifically Fig 4, [0241], [0305], “The occurrences of these concepts and their positions and frequencies constitute the "fingerprint" of a document.  These fingerprints then act as links between a query document and documents in a database while searching for a document”. More specifically, see Fig 7, [0252]-[0257]: for each term ti (input terms) in the query provided in step 331 (FIG. 7), the concept Ci that represents it is determined (step 332). For each document dj where the concept is present, the function f(frequency, score) = frequency × score (a co-occurrence score) is evaluated, where frequency designates the number of occurrences of concept Ci in document dj and where score designates the mean of the resemblance scores of the terms of document dj with concept Ci. The pdj are ordered, and those that are greater than a given threshold (step 333) are retained.  Then the responses are confirmed and validated (step 34)), the calculation relying on an efficient index to measure a level of co-occurrence of concepts in a collection of events in the structured data and uses this as a measure of relevance (from the mapping of the previous limitation, “calculating the co-occurrence score for each of the concepts in the concept index by measuring their frequency of co-occurrence with the input terms' concepts”, the calculating relies on an efficient index (document/fingerprint base (index) of Fig 4), see also [0110], [0183], [0187]); and 
a similarity score for each of the concepts in the concept index by measuring the similarity of their vector representations according to a vector similarity measure ([0017], “a score which is a mean value of similarity measurements between the concept ci and the terms ti of the document that are the closest to the concept ci”, [0060], [0198]-[0199]); and further implicitly teaches
ranking the concepts with respect to their relevance to the input terms by a combination of the co-occurrence score and the similarity score (Fig 7, [0017], “a score which is a mean value of similarity measurements between the concept ci and the terms ti of the document that are the closest to the concept Ci.”, more specifically see Fig 7, step 34 and [0252]-[0258], “for each term in the query, the concept Ci is determined. For each document dj, where the concept is present, pj is updated wherein pj is the degree of resemblance between document dj and the query document, pj are ordered”, see also [0110], [0183], [0187] and [0241]).
Essafi further teaches wherein the set of input data is prepared without interaction with an existing database, for both event databases and knowledge sources (Figs 1 to 5, [0110]-[0111], see [0009] and [0038], [0049] and [0304], [0307], extracting terms describing the content of a structural component of documents/images (input data)).

Essafi did not specifically teach where every value in the input data is transformed into an embedding vector using a variation of a skip-gram model of word2vec; or wherein the set of input data is prepared by a common ingestion pipeline operating on a cluster in a cloud node.
However, Zhang explicitly teaches where every value in the input data is transformed into an embedding vector using a variation of a skip-gram model of word2vec ([0060]); and further teaches wherein the set of input data is prepared by a common ingestion pipeline operating on a cluster in a cloud node, without interaction with an existing database, for both event databases and knowledge sources and the common ingestion pipeline crawls remote sources, cleaning invalid records and applying filters, and then storing a result as the concept index ([0059]-[0063], “The preprocessing module 142 can employ and pipeline natural language processing (NLP) sub-processes to extract features from the data set of text, such as features from an email communication.”, [0089], “the majority method wherein extracted sentences are ordered based on their relative frequency among the email communications; and the topic method wherein the extracted sentences are ordered based on topic clusters among the electronic communications ordering extracts sentences from their original documents and clusters them into topic clusters.”, [0096], “The first party preferably can filter the types of texts that the retrieval agent module 150 retrieves”, see also [0055], for validating records).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings suggested by Zhang into Essafi's system because both systems relate generally to the field of information retrieval (IR), and the combination would yield an IR system that retrieves content from electronic messages and that quickly and accurately analyzes and stores the information for later use.

Further, Martineau teaches ranking the concepts with respect to their relevance to the input terms by a combination of the co-occurrence score and the similarity score ([0096], the semantic similarity scores and the co-occurrence scores are combined).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the concept of combining two scores suggested by Martineau into the combined system of Essafi and Zhang because all of the systems relate generally to recommendation systems, and the combination would generate aspect-enhanced explainable description-based recommendations.
For the recently added limitation “working on top of an existing database, the preparing being based on a payment and performing a minimal curation by a lightweight mapping of known entities and linking the known entities using a Uniform Resource Indicator (URI) until a cost of a running of the minimal curation meets the payment”, support for this limitation can be found in the pre-grant publication at [0027], which presents "pay-as-you-go" as a well-known technology that the invention adopts.
Further, Lee teaches the same limitation and the same technology.
preparing a concept index for concepts built over a set of input data comprising structured data having input terms (col 2, lines 3-9, lines 52-67, col 11, lines 29-44 and col 12, lines 4-14).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the pay-as-you-go concept suggested by Lee into the combined system of Essafi, Zhang and Martineau because all of the systems relate generally to recommendation systems, and the combination would enable the tracking of payable media content transactions.
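
For illustration of the ranking step recited in claim 1 only, the combination of a co-occurrence score and a vector similarity score can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the method of Essafi, Zhang, Martineau or Lee: the function names, the weight `alpha`, the use of cosine similarity as the vector similarity measure, the weighted sum as the combination, and the sample data are all hypothetical.

```python
import math
from collections import Counter


def cooccurrence_scores(events, query_concepts):
    """Fraction of events in which a concept appears together with a query concept."""
    query = set(query_concepts)
    counts = Counter()
    for event in events:
        concepts = set(event)
        if concepts & query:
            for concept in concepts - query:
                counts[concept] += 1
    total = len(events) or 1
    return {concept: n / total for concept, n in counts.items()}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_scores(vectors, query_concepts):
    """Best cosine similarity between each candidate concept and any query concept."""
    query = [q for q in query_concepts if q in vectors]
    return {
        concept: max((cosine(vec, vectors[q]) for q in query), default=0.0)
        for concept, vec in vectors.items()
        if concept not in query_concepts
    }


def rank_concepts(events, vectors, query_concepts, alpha=0.5):
    """Rank candidates by a weighted sum of both scores (alpha is an assumed weight)."""
    co = cooccurrence_scores(events, query_concepts)
    sim = similarity_scores(vectors, query_concepts)
    combined = {
        concept: alpha * co.get(concept, 0.0) + (1 - alpha) * sim.get(concept, 0.0)
        for concept in set(co) | set(sim)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample data: each event is a list of concepts; vectors are toy embeddings.
events = [["fever", "flu"], ["fever", "cough", "flu"], ["cough", "asthma"]]
vectors = {
    "fever": [1.0, 0.0],
    "flu": [0.9, 0.1],
    "cough": [0.5, 0.5],
    "asthma": [0.0, 1.0],
}
ranking = rank_concepts(events, vectors, ["fever"])
```

In this sample, "flu" ranks first because it both co-occurs with the query concept most often and has the closest vector, while "asthma" ranks last because it never co-occurs with the query concept and its vector is orthogonal to it.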


13.	Regarding claim 5 (Original), Essafi, Zhang, Martineau and Lee teach the invention as claimed in claim 1 above and further Essafi teaches wherein the input data comprises event databases and knowledge bases (Fig 8, element 51).  

14.	Regarding claim 7 (Original), Essafi, Zhang, Martineau and Lee teach the invention as claimed in claim 1 above and further Essafi teaches receiving at least one of an image and a video as a further input ([0004], [0009], [0054], [0058]-[0059], Fig 9, [0076], processing a query as an image and extracting query terms by analyzing image properties such as shape/structure, see also [0279]); and extracting query terms using at least one of optical character recognition (OCR), speech recognition, examination of captions, and machine recognition of objects and people ([0004], [0009], [0054], [0058]-[0059], Fig 9, [0076], processing a query as an image and extracting query terms by analyzing image properties such as shape/structure, see also [0279]).

15.	Regarding claims 10, 14 and 16, those claims recite a computer program product for concept discovery, the computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform the method of claims 1, 5 and 7, respectively, and are rejected under the same rationale.

16.	Regarding claim 18, this claim recites a system that performs the method of claim 1 and is rejected under the same rationale.

17.	Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Essafi et al (US 20070271224) in view of Zhang et al (US 20170228361 A1), Martineau et al (US 20190392330 A1) and Lee et al (US 7975003 B1) as applied to claim 1 above and further in view of Lastra Diaz et al (US 20160179945 A1), hereinafter Lastra.

18.	Regarding claim 6 (Original), Essafi, Zhang, Martineau and Lee teach the invention as claimed in claim 1 above. Essafi, Zhang, Martineau and Lee did not specifically teach receiving a natural language question as a further input; and extracting query terms from the question.
However, Lastra teaches receiving a natural language question as a further input; and extracting query terms from the question ([0050], question answering system applied in the context of any natural language processing (NLP), [0053], potential answer to a question and [0089], the system parses any query into lexical elements defined by words or phrases, see also [0287]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the concept of receiving a natural language question as a further input and extracting query terms from the question, as suggested by Lastra, into the combined system of Essafi, Zhang, Martineau and Lee because all of the systems relate generally to the field of information retrieval (IR), and the combination would improve the results expected by the final users of any information search system.

19.	Regarding claim 15, this claim recites a computer program product for concept discovery, the computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform the method of claim 6 and is rejected under the same rationale.

20.	Claims 8 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Essafi et al (US 20070271224) in view of Zhang et al (US 20170228361 A1), Martineau et al (US 20190392330 A1) and Lee et al (US 7975003 B1) as applied to claim 1 above and further in view of Lamba et al (US 20140067832 A1), hereinafter Lamba.

21.	Regarding claim 8 (Previously presented), Essafi, Zhang, Martineau and Lee teach the invention as claimed in claim 1 above. Essafi, Zhang, Martineau and Lee further implicitly teach measuring a relatedness of the concept index to the query terms (same as co-occurrence score/similarity score of terms in the query to the term of the document to be returned as a search result).
Further, Lamba explicitly teaches measuring a relatedness to the query terms ([0005], measuring relatedness of terms in the query to a document to be returned as a search result).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the concept of measuring a relatedness to the query terms suggested by Lamba into the combined system of Essafi, Zhang, Martineau and Lee because all of the systems relate generally to the field of information retrieval (IR), and the combination would establish relationships between terms of a query and a category in a taxonomy.

22.	Regarding claim 17, this claim recites a computer program product for concept discovery, the computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform the method of claim 8 and is rejected under the same rationale.

23.	Claims 9 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Essafi et al (US 20070271224) in view of Zhang et al (US 20170228361 A1), Martineau et al (US 20190392330 A1) and Lee et al (US 7975003 B1) as applied to claim 1 above and further in view of Cohen et al (US 9672279 B1), hereinafter Cohen.

24.	Regarding claim 9, Essafi, Zhang, Martineau and Lee teach the invention as claimed in claim 1 above. Essafi, Zhang, Martineau and Lee did not specifically teach the computer-implemented method of claim 1, embodied in a cloud-computing environment.
However, Cohen explicitly teaches the computer-implemented method of claim 1, embodied in a cloud-computing environment (Fig 1, Fig 6, the cloud infrastructure 600).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the concept of the cloud-computing environment suggested by Cohen into the combined system of Essafi, Zhang, Martineau and Lee because all of the systems relate generally to the field of information retrieval (IR), and the combination would implement a cluster labeling system for documents comprising unstructured text data.

25.	Regarding claim 20, this claim recites a system to perform the method of claim 9 and is rejected under the same rationale.

Response to Amendments and Arguments
26.	Applicant argued that the combination of Essafi in view of Zhang and further in view of Martineau and/or Zhang, Lastra Diaz, Lamba and Cohen does not teach or suggest "wherein the set of input data is prepared by a common ingestion pipeline operating on a cluster in a cloud node, without interaction with an existing database, for both event databases and knowledge sources and the common ingestion pipeline crawls remote sources, cleaning invalid records and applying filters, and then storing a result as the concept index", as recited in exemplary claim 1.
Applicant's arguments received on 02/24/2021 have been fully considered but they are not persuasive. Referring to the previous Office action, Examiner has cited relevant portions of the references as a means to illustrate the systems as taught by the prior art. As a means of providing further clarification as to what is taught by the references used in the first Office action, Examiner has expanded the teachings for comprehensibility while maintaining the same grounds of rejection of the claims, except as noted above in the section labeled “Status of Claims.” This information is intended to assist in illuminating the teachings of the references while providing evidence that establishes further support for the rejections of the claims.

CONCLUSION

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HICHAM SKHOUN whose telephone number is (571)272-9466.  The examiner can normally be reached on Normal schedule: Mon-Fri 10am-6:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at (571) 272-4046.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/HICHAM SKHOUN/Primary Examiner, Art Unit 2169