DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office action is responsive to the following communication:  Preliminary Amendment filed on 12 October 2018.
The instant application claims foreign priority to Chinese Application No. 201611002092.2, with a filing date of 14 November 2016.
Claims 1-10 and 16-20 are pending and present for examination.  Claims 1, 6, and 16 are in independent form.

Examiner’s Note
Examiner cites particular columns and/or paragraphs and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may be applied as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.

Priority
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.


Information Disclosure Statement
The information disclosure statements (IDS) submitted on 18 October 2018 and 24 June 2021 are being considered by the examiner.

Drawings
The drawings were received on 12 October 2018.  These drawings are accepted.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-10 and 16-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 
As per claims 1, 6, and 16, the claim(s) recite(s) in part “acquiring a question set…”, “performing feature extraction…”, “determining whether the question feature set meets a preset splitting condition”, “performing segmenting clustering…”, “updating the question feature…”, and “determining whether the question feature set meets the preset splitting condition.”
The limitations directed towards “acquiring”, “performing”, “determining”, and “updating” are interpreted to be observations or judgments about the actions the examiner has taken and therefore, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components. That is, other than reciting “a device” and “a server”, nothing in the claim elements precludes the steps from practically being performed in the mind.
For example, the “acquiring” feature in the context of this claim encompasses the user mentally evaluating a question set. For example, “performing feature extraction” in the context of this claim encompasses mentally or physically recording the outcome of a feature extraction.  For example, “determining” whether a preset splitting condition is met in the context of this claim encompasses mentally  evaluating whether a feature meets a condition. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer 
This judicial exception is not integrated into a practical application by additional elements. In particular, the claim recites using a processor to perform the steps. The processor is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component.  Additionally, the claimed feature of “receiving” and “outputting” is merely insignificant extra-solution activity, i.e., necessary data outputting. See MPEP 2106.05(g). At step 2A, prong two, considering these limitations individually and the claim as a whole, the claim fails to integrate the abstract idea into a practical application.  The elements directed to “receiving” and “outputting” do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra-solution activity that is mere data gathering in conjunction with the abstract idea.
At step 2B, the “receiving” and “outputting” limitations are clearly well-understood, routine, and conventional; see MPEP 2106.05(d)(II), “receiving or transmitting data over a network.”  The claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the outputting of a question feature set only adds well-understood, routine, and conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 1681, 1701 (Fed. Cir. 2015); OIP Techs., 788 F.3d at 1363, 115 USPQ2d at 1092-93).  Therefore, the outputting is nothing more than what can be handled by a conventional search engine and does not provide significantly more than the judicial exception. The claim(s) is/are not patent eligible.
As per claims 2, 7, and 17, the limitations are directed towards “determining whether the question feature set can be segmented” and “determining whether the number of question features of the question feature set is greater than a preset splitting number”, which are additional elements beyond the above identified judicial exception. The limitations elaborate upon the aforementioned “Mental Process” of determining whether the question feature set meets a preset splitting condition, and are interpreted to be the observation or judgment, therefore, under its broadest reasonable interpretation, covers performance 
As per claims 3, 8, and 18, the limitations are directed towards “performing feature extraction” and “performing feature mapping”, which are additional elements beyond the above identified judicial exception. The limitations elaborate upon the aforementioned “Mental Process” of performing feature extraction on the question set, and are interpreted to be observations or judgments that, under their broadest reasonable interpretation, cover performance of the limitations in the mind but for the recitation of generic computer components.  There are no additional elements that would tie the limitations to a practical application and/or that would amount to significantly more than the judicial exception.
As per claims 4, 9, and 19, the limitations are directed towards “preprocessing the question set to be clustered… [via] stop word removal”, which are additional elements beyond the above identified judicial exception. The limitations elaborate upon the “Mental Process” of applying stop word removal to a set of words, and are interpreted to be the observation or judgment, therefore, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components.  There are no additional elements that would tie the limitations to a practical application and/or that would amount to significantly more than the judicial exception.
As per claims 5, 10 and 20, the limitations are directed towards “performing a database field matching process… and storing the processed clustering class cluster”, which are additional elements beyond the above identified judicial exception. These additional elements represent mere extra-solution activities to the judicial exception.  These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and provide only insignificant extra solution activity that is mere data processing in conjunction with the abstract idea.
The claim(s) do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the matching and storing processes only add well-understood, routine and conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (See Versata Dev. Group, Inc. v. SAP Am., Inc., 793 F.3d 1306, 1334, 115 USPQ2d 

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:

An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 

Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “a clustering request receiving unit configured to…”, “a clustering question set acquiring unit configured to…”, “a feature extracting unit configured to…”, “a splitting determining unit configured to…”, “a first processing unit configured to…”, and “a second processing unit configured to…” in claim 6.  Claims 7-10 are replete with similar limitations which also recite units which are “configured to” perform some action.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.


Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 6, and 16 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Chua et al., USPGPUB No. 2013/0024457, filed on 6 April 2011, published on 24 January 2013, and claiming priority to 6 April 2010.
As per independent claims 1, 6, and 16, Chua teaches:
	A method for processing question clustering in an automatic question and answering system, wherein the method comprises:

receiving a clustering request input by a writer {See Chua, [0014], wherein this reads over “In one embodiment, CQA data comprises user-generated question-answer pairs structured so that one or more answers are associated with a question. A CQA source 115 may associate stored question-answer pairs with a topic, theme or other category to facilitate subsequent data retrieval”};

acquiring a question set to be clustered from a database of unanswered questions based on the clustering request, wherein the question set to be clustered comprises at least one question to be clustered {See Chua, [0003], wherein this reads over “Community-based Question Answering (CQA) and Frequently Asked Question (FAQ) data are similar in that both provide information using pairs of questions and answers”; and [0025], wherein this reads over “In one embodiment, the selection module 330 groups CQA associated with a leaf theme into a predefined number of clusters using K-means clustering, or another suitable clustering method, and identifies representative CQA data having a high quality from a plurality of clusters associated with the leaf node”};

performing feature extraction on the question set to be clustered with a text feature extraction algorithm to output a question feature set, wherein the question feature set comprises at least one question feature {See Chua, [0026], wherein this reads over “In one embodiment, the selection module 330 selects a set of features used to estimate the quality of different question and answer pairs included in CQA data.”};

determining whether the question feature set meets a preset splitting condition {See Chua, [0026], wherein this reads over “Because some features may not be available for each cluster of CQA data or for each question and answer pair within a cluster of CQA data, a subset of the features may also be used to obtain an approximate measure of the quality of a question and answer pair within a cluster of CQA data”};

performing segmenting clustering on the question feature set with a segmenting clustering algorithm if the preset splitting condition is met, to output at least two question feature subsets {See Chua, [0027], wherein this reads over “In one embodiment, a labeled training set is generated for CQA quality estimation, where the labeled training set provides the ground-truth for question-answer pairs of a topic or theme. The inter annotator agreement on question and answer pairs having high quality, "positive instances," is found to be low because of the subjective determination of what constitutes a high quality question and answer pair”; and [0028], wherein this reads over “the low-quality question and answer pairs are obtained from a separate developing CQA source 115 or CQA sources 115A, 115B. Examples of low quality question and answer pairs, or "negative instances," include questions for chatting, questions seeking personal opinions or questions with ungrammatical English”; wherein the sets of “positive instances” and “negative instances” would read upon the claimed “two question feature subsets”};

updating the question feature subsets to a question feature set {See Chua, [0029], wherein this reads over “In one embodiment, the quality score and the representative score are linearly combined to produce a suitability metric describing the suitability of a question and answer pair to be put into the final FAQ. For example, question and answer pairs having a suitability metric exceeding a specified threshold are stored and included in the final FAQ”}, and determining whether the question feature set meets the preset splitting condition {See Chua, [0029], wherein this reads over “In one embodiment, the quality score and the representative score are linearly combined to produce a suitability metric describing the suitability of a question and answer pair to be put into the final FAQ. For example, question and answer pairs having a suitability metric exceeding a specified threshold are stored and included in the final FAQ”}; and

outputting the question feature set as a clustering class cluster if the preset splitting condition is not met {See Chua, [0032], wherein this reads over “For each cluster, the FAQ generator 132 selects 524 a number of representative data of the cluster, and measures 526 the quality of the representative data. For example, the FAQ generator 130 measures the quality of the representative data using a labeled training set, which provides ground-truth of question-answer pairs of a topic or theme. The quality of a representative data in a cluster can be represented by a quality score.”}. 
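For illustration only, the control flow recited in the independent claims (test a splitting condition, segment the feature set into at least two subsets, treat each subset as a new feature set, and output a set as a clustering class cluster once the condition is no longer met) can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from the application or from Chua: the splitting condition is assumed to be the feature count exceeding a preset splitting number (compare claims 2, 7, and 17), the segmenting step is assumed to be a simple two-way k-means-style bisection, and the names `should_split`, `bisect`, and `cluster_questions` are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def should_split(features: List[Vector], split_number: int) -> bool:
    # Assumed splitting condition: more features than the preset number.
    return len(features) > split_number


def bisect(features: List[Vector], iterations: int = 10) -> Tuple[List[Vector], List[Vector]]:
    """Assumed segmenting step: split one feature set into two subsets."""
    c0 = features[0]
    c1 = max(features, key=lambda v: sq_dist(v, c0))  # farthest point from c0
    left, right = features, []
    for _ in range(iterations):
        left = [v for v in features if sq_dist(v, c0) <= sq_dist(v, c1)]
        right = [v for v in features if sq_dist(v, c0) > sq_dist(v, c1)]
        if not left or not right:
            # Degenerate case (e.g. identical vectors): fall back to halves.
            mid = len(features) // 2
            return features[:mid], features[mid:]
        c0, c1 = mean(left), mean(right)
    return left, right


def cluster_questions(features: List[Vector], split_number: int) -> List[List[Vector]]:
    """Split repeatedly until no feature set meets the splitting condition."""
    if split_number < 1:
        raise ValueError("split_number must be at least 1")
    pending = [features]
    clusters = []
    while pending:
        current = pending.pop()
        if should_split(current, split_number):
            pending.extend(bisect(current))  # each subset becomes a new feature set
        else:
            clusters.append(current)  # output as a clustering class cluster
    return clusters
```

With five two-dimensional feature vectors and a splitting number of 3, the sketch performs one split and outputs two clusters.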




Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3, 8, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Chua, in view of Li et al., USPGPUB No. 2013/0226846, filed on 25 February 2013, claiming priority to 24 February 2012, and published on 29 August 2013.
As per dependent claims 3, 8, and 18, Chua, in combination with Li, discloses:
The method for processing question clustering in an automatic question and answering system according to claim 1, wherein performing feature extraction on the question set to be clustered with a text feature extraction algorithm to output a question feature set comprises:

performing feature extraction on the question set to be clustered with a vector space model of an IT-IDF algorithm to output an initial feature set {See Li, [0055], wherein this reads over “In another embodiment, template surfaces within a tf-idf (term frequency inverse document frequency) distance are merged together. The tf-idf is a well-tested statistical measure used to evaluate how important a word is to a document in corpus. Computing tf-idf in the collection of template surfaces can provide empirical observations on each word's weight for reflecting the topic of the template surface. The similarity between word sequences can be defined as the cosine similarity of their tf-idf vectors”}; and

performing feature mapping on the initial feature set with an LSI model to output the question feature set {See Li, [0057], wherein this reads over “Therefore, matrix analysis techniques can be applied to reduce noises and/or detect correlativeness in relationship path distributions. Such techniques include, but are not limited to, Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Canonical Correlation Analysis (CCA), and Latent Semantic Indexing (LSI).”}. 
	Chua is directed to the invention of a system for automatically compiling frequently asked questions from a community-based question answer archive.  Chua fails to disclose the claimed features of “performing feature extraction on the question set to be clustered with a vector space model of an IT-IDF algorithm to output an initial feature set” and “performing feature mapping on the initial feature set with an LSI model to output the question feature set.”

	As per the claimed feature of “performing feature extraction on the question set to be clustered with a vector space model of an IT-IDF algorithm to output an initial feature set,” Li discloses that “template surfaces within a tf-idf (term frequency inverse document frequency) distance are merged together” wherein “[t]he tf-idf is a well-tested statistical measure used to evaluate how important a word is to a document in corpus.”  See Li, [0055].  Additionally, Li discloses that “[t]he similarity between word sequences can be defined as the cosine similarity of their tf-idf vectors.”  Id.  That is, Li discloses a system wherein tf-idf may be utilized to cluster word sequences using a cosine similarity (i.e., a vector space model) of tf-idf vectors (i.e., the claimed IT-IDF algorithm).  It is noted that while applicant claims an “IT-IDF algorithm”, said “IT-IDF algorithm” has been described within the instant Specification as “term frequency-inverse document frequency”, which is presented as “tf-idf” by the prior art of Li. 
	As per the claimed feature of “performing feature mapping on the initial feature set with an LSI model to output the question feature set,” Li discloses “matrix analysis techniques can be applied to reduce noises and/or detect correlativeness in relationship path distributions” wherein “[s]uch techniques include, but are not limited to, Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Canonical Correlation Analysis (CCA), and Latent Semantic Indexing (LSI).”  See Li, [0057].  That is, Li discloses a system wherein Latent Semantic Indexing may be utilized to reduce noise within similarity matrices (i.e., perform feature mapping on the initial feature set).
Accordingly, wherein Li discloses a system for utilizing tf-idf to determine distances and similarities of words within a question answer system, it would have been obvious to one of ordinary skill in the art before the effective filing date to improve the prior art of Chua with that of Li such that the CQA data of Chua may be further analyzed via tf-idf and LSI.
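For illustration only, the tf-idf vector space model and cosine similarity discussed above with respect to Li, [0055], can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Li or of the instant application: questions are assumed to be already tokenized, the weighting is assumed to be relative term frequency multiplied by log(N/df), and the function names `tfidf_vectors` and `cosine` are hypothetical. The LSI feature mapping step is omitted because it requires a singular value decomposition, which is typically taken from a numerical library.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Build one sparse tf-idf vector (term -> weight) per tokenized question."""
    n = len(docs)
    # Document frequency: number of questions containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under these assumptions, two questions that share weighted terms receive a positive similarity, while questions with no terms in common receive a similarity of zero.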
Claims 4, 9, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Chua, in view of Fang et al., U.S. Patent No. 10,049,148, filed on 13 August 2015, claiming priority to 14 August 2014, and issued on 14 August 2018.
As per dependent claims 4, 9, and 19, Chua, in combination with Fang, discloses:


preprocessing the question set to be clustered with a text preprocessing algorithm, wherein the text preprocessing algorithm comprises at least one of unification of traditional Chinese and simplified Chinese, unification of upper case and lower case, Chinese word segmentation, and stop word removal {See Fang, column 4, lines 51-67, wherein this reads over “The identified stop words can be removed or marked. The remaining words in the text are non-stop words.”}. 

Chua is directed to the invention of a system for automatically compiling frequently asked questions from a community-based question answer archive.  Chua fails to disclose the claimed features of “preprocessing the question set to be clustered with a text preprocessing algorithm, wherein the text preprocessing algorithm comprises at least one of unification of traditional Chinese and simplified Chinese, unification of upper case and lower case, Chinese word segmentation, and stop word removal.”
	Fang is directed to the invention of a system for enhanced text clustering based on topic clusters.  Specifically, Fang discloses that “[t]he identified stop words can be removed or marked” while “[t]he remaining words in the text are non-stop words.” See Fang, column 4, lines 51-67.  Wherein Fang discloses a system for removing stop words, it would have been obvious to one of ordinary skill in the art before the effective filing date to improve the prior art of Chua with that of Fang such that stop words may be removed from the question set via a preprocessing algorithm.
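For illustration only, two of the recited preprocessing alternatives (unification of upper case and lower case, and stop word removal) can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce Fang or the instant application: the stop word list `STOP_WORDS` and the function name `preprocess` are hypothetical, and tokenization is assumed to be whitespace splitting with surrounding punctuation stripped.

```python
from typing import FrozenSet, List

# Hypothetical stop word list, chosen only for this example.
STOP_WORDS: FrozenSet[str] = frozenset(
    {"a", "an", "the", "is", "to", "of", "how", "do", "i"}
)


def preprocess(question: str, stop_words: FrozenSet[str] = STOP_WORDS) -> List[str]:
    """Lowercase a question, tokenize it, and drop stop words."""
    tokens = question.lower().split()  # case unification and tokenization
    tokens = [t.strip(".,?!;:\"'") for t in tokens]  # trim surrounding punctuation
    return [t for t in tokens if t and t not in stop_words]
```

Under these assumptions, `preprocess("How do I reset the Password?")` yields the tokens `reset` and `password`.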
Claims 5, 10, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chua, in view of Asadullah et al., USPGPUB No. 2012/0254162, filed on 19 May 2011, and published on 4 October 2012.
As per dependent claims 5, 10, and 20, Chua, in combination with Asadullah, discloses:
The method for processing question clustering in an automatic question and answering system according to claim 1, further comprising: 

performing a database field matching process on the clustering class cluster and storing the processed clustering class cluster in a cluster question database {See Asadullah, [0127], wherein this reads over “[t]he index documents 1540 can have fields such as the fields of index document 1545. In index document 1545 there can be a query result field that includes text for a source-code query result that can be displayed. Also index document 1545 can include a code field 1545E that includes the code of the code element. The index document 1545 can include a cluster identification field 1545C that can include words, data, or information used in determining a cluster ID”; and [0154], wherein this reads over “For example, an index document can include a cluster identification field that can include words, data, or information used in determining a cluster ID for a cluster that includes the source-code search result of the index document”}. 


	Asadullah is directed to the invention of a system of facet support and clustering for source code query results.  Specifically, Asadullah discloses that “[t]he index document 1545 can include a cluster identification field 1545C that can include words, data, or information used in determining a cluster ID.” See Asadullah, [0127].  Additionally, Asadullah discloses that “an index document can include a cluster identification field that can include words, data, or information used in determining a cluster ID for a cluster that includes the source-code search result of the index document.”  See Asadullah, [0154]. Wherein Asadullah discloses a system for determining and searching for cluster IDs, it would have been obvious to one of ordinary skill in the art before the effective filing date to improve the prior art of Chua with that of Asadullah such that the clusters of Chua may be further identified and stored according to the cluster ID determination of Asadullah.

Allowable Subject Matter
Claims 2, 7, and 17 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL KIM whose telephone number is (571)272-2737.  The examiner can normally be reached on Monday-Friday, 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.


/Paul Kim/
Examiner
Art Unit 2152



/PK/