DETAILED ACTION
This action is responsive to the Amendment filed on 28 October 2022. Claims 1-12 are pending in the case. Claims 1 and 8 are the independent claims.
This office action is FINAL.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Applicant’s Response
In Applicant’s response dated 28 October 2022 (hereinafter Response), Applicant amended claims 1, 4, 6, and 8; amended the drawings; and argued against all objections and rejections previously set forth in the Office Action dated 30 June 2022 (hereinafter Previous action).
Applicant's amendment to the drawings is acknowledged.
Applicant’s amendment to claims 1, 4, 6, and 8 to further clarify the metes and bounds of the invention is acknowledged.
Response to Amendment/Arguments
In response to Applicant's amendment to the drawings, the objection to the drawings in the Previous action is respectfully withdrawn.
In response to Applicant’s amendment with respect to the 35 U.S.C. § 112 rejection of claim 4, Applicant has added the limitation “wherein the obtained original data includes different types of preprocessed data for extracting the different feature values” without providing any citation of support for this limitation.
The portions of the detailed written description relevant to “preprocessing” are [031] and [036-039], none of which clearly states that the obtained original data includes different types of preprocessed data for extracting the different feature values. At best, there may be Chinese-language preprocessing and English-language preprocessing (see [031], [036]), but the claim is not limited by language.
Further, the disclosure states at [037] “In some embodiments, different preprocessing methods are selected depending on the type of the feature value” (which is the language of the original limitation) and at [038] “Specifically, in some embodiments, the preprocessing may be different processing performed on the news text depending on the type of feature value. For example, the following four processing methods a, b, c, and d are performed in the preprocessing: a. deleting invalid characters; b. deleting stop words; c. deleting numerals; and d. extracting stems and restoring parts of speech” (none of which are described as language dependent).
Accordingly, new grounds of rejection under 35 U.S.C. § 112(a), for failure to comply with the written description requirement, are appropriate in response to Applicant’s amendment.
New grounds of rejection in view of the art are also appropriate, under the following interpretation: if the reference or combination of references teaches preprocessing which includes the possible preprocessing mechanisms (deleting invalid characters, deleting stop words, deleting numerals, and extracting stems/restoring parts of speech), then the art teaches the ability to choose to apply one or more of these mechanisms, individually or in any combination, to the obtained original data to obtain preprocessed data, where the preprocessed data may then be used to obtain the claimed feature value types.
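For illustration only, the interpretation above (a set of preprocessing mechanisms that may be applied individually or in any combination) can be sketched as follows. This is a minimal sketch under stated assumptions and does not represent the implementation of the instant application or of any cited reference: the function names, the stop-word list, and the crude suffix-stripping "stemmer" are all hypothetical.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to"}


def delete_invalid_characters(tokens):
    # Mechanism a: strip non-alphanumeric characters; drop tokens left empty.
    cleaned = (re.sub(r"[^A-Za-z0-9]", "", t) for t in tokens)
    return [t for t in cleaned if t]


def delete_stop_words(tokens):
    # Mechanism b: drop tokens found in the (hypothetical) stop-word list.
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def delete_numerals(tokens):
    # Mechanism c: drop tokens consisting only of digits.
    return [t for t in tokens if not t.isdigit()]


def extract_stems(tokens):
    # Mechanism d: crude suffix stripping, standing in for real stemming.
    stemmed = []
    for t in tokens:
        for suffix in ("ing", "ed", "s"):
            if t.endswith(suffix) and len(t) > len(suffix) + 2:
                t = t[: -len(suffix)]
                break
        stemmed.append(t)
    return stemmed


MECHANISMS = {
    "a": delete_invalid_characters,
    "b": delete_stop_words,
    "c": delete_numerals,
    "d": extract_stems,
}


def preprocess(text, steps):
    """Apply the chosen mechanisms, in order, to whitespace-split tokens."""
    tokens = text.split()
    for key in steps:
        tokens = MECHANISMS[key](tokens)
    return tokens


text = "The shares rose 5 % in 2022 !"
keyword_tokens = preprocess(text, "abcd")  # all four mechanisms
metadata_tokens = preprocess(text, "a")    # numerals retained
```

Under this sketch, the same original data yields differently preprocessed data depending on which mechanisms are selected, for example retaining numerals where a quantity of numerals is to be counted.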
In response to Applicant's argument with respect to the 35 U.S.C. § 112 rejection of claims 6-7 (see Response, page 8), Applicant has amended claim 6 to depend on claim 1 (rather than, as previously, on claim 5) and to recite “wherein the extracting the probability model feature value comprises the following steps:…”; however, the disclosure does not clearly support this limitation as written, and Applicant has provided no citation of support.
The written description supports extracting a probability model at [012, 048-052]; using a probability model feature value module at [016, 057]; extracting a probability model feature value at [032], without details; and, at [050], using a probability model to process a news text in the original data, for example, extracting a stem, restoring a part of speech, deleting a stop word, filtering out an invalid word, etc. (though generically for processing text, not specifically for probability model feature values).
Accordingly, new grounds of rejection under 35 U.S.C. § 112(a) are made in response to Applicant’s amendment. Further, as it remains unclear how extracting the probability model feature value comprises obtaining a latent Dirichlet allocation model (used for generating a document) but not using the obtained model to perform the extraction, claim 6 remains rejected under 35 U.S.C. § 112(b) as indefinite.
In response to Applicant's argument against the rejection of claims 1-2 and 10-12 under 35 USC 102 as anticipated by SISK (see Response, starting page 8), Examiner respectfully disagrees.
Applicant does not argue against specific citations in SISK relied upon in the rejection of claim 1, and instead provides only a brief summary (“As such, analyzing past news and the resulting response of the stock price can help build a model to predict stock behavior when examining short, recent spans of news stories. See, e.g., claim 1 and paragraph [13] of Sisk. Sisk does not teach all of the limitations of claim 1, as amended, such as extracting three different and particular feature values (i.e., metadata, a keyword, and a probability model feature value); obtaining a score for each of those three feature values, and then evaluating significance of the text news based on such scores”).
The instant application does not have any specific definitions for the types of feature values which are extracted (“metadata”, “keyword”, “probability model”).

The previous action rejected the “extracting … feature values…” limitation of claim 1 relying on SISK FIG 3 (304), with additional information regarding “feature engineering” found in [0053-0054], which clearly teaches at least three different feature values having different types: metadata or other descriptors (e.g. about the article contents), keywords (e.g. company, words, phrases), and attribute relevance (a probability value).

The instant application does not have any specific definition for “a score of each feature value according to a weight ratio corresponding to each feature value.” The limitation was interpreted as “determining a weighted value associated with each feature value to be used when calculating the overall sentiment score”.


The previous action rejected the “obtaining a score of each feature value…” limitation of claim 1 relying on SISK FIG 3 (306) with additional evidence from [0088-0089]:

[0088] …uses a sentiment engine… ingests (reads) news stories, identifies the news stories as relating to Company A and then scores the news stories as follows: +1 for positive story; 0 for neutral (e.g., mere mention) news story; and -1 for negative news story. Types of news stories {i.e. metadata}…
[0089] … The NAS may also look to the content and context of the news stories and weigh them in accordance with a predetermined taxonomy. The weighting may factor in recency, criticality, repeatedness, trustworthiness, etc.
The “context and content” of the news story includes the identified metadata (e.g. the type of news story), the identified words and phrases (e.g. company, other attributes), and the determined relevance (probability) of an attribute to a particular company. Each “feature value” is weighted (thus an intermediate score is determined) for its contribution to the overall sentiment score of the news story for a particular company. The overall sentiment score is considered the “significance” of the text news, which was made clear in the rejection of this element in claim 1 which also relied on (306) and [0088].
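For illustration only, the interpretation applied to this limitation (a weighted value determined for each feature value and used when calculating the overall score) can be sketched as follows. This is a minimal sketch under stated assumptions: the feature names, raw values, and weights are hypothetical numbers chosen for the example and are not taken from SISK or from the instant application.

```python
def feature_scores(feature_values, weights):
    """Intermediate score per feature: raw feature value times its weight."""
    return {name: feature_values[name] * weights[name] for name in feature_values}


def overall_score(feature_values, weights):
    """Overall score: sum of the weighted per-feature scores."""
    return sum(feature_scores(feature_values, weights).values())


# Hypothetical raw feature values and weights for a single news story.
values = {"metadata": 1.0, "keyword": 0.5, "probability": 0.8}
weights = {"metadata": 0.2, "keyword": 0.5, "probability": 0.3}

per_feature = feature_scores(values, weights)
total = overall_score(values, weights)
```

Here each entry of `per_feature` plays the role of the intermediate score of a feature value, and `total` plays the role of the overall score of the story.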
The rejection of claim 1 is respectfully maintained. Applicant makes no other arguments with respect to claims 2 and 10-12, thus these rejections are maintained for the same reasons.
In response to Applicant's argument against the rejection of claims 3 and 8-9 under 35 USC 103 as unpatentable over SISK in view of MITTERMAYER (see Response, starting page 9), Examiner respectfully disagrees because:
Applicant argues patentability of claim 3 based on its dependence on claim 1;
Applicant argues patentability of claim 8 for at least the same reasons as claim 1; and
Applicant argues patentability of claim 9 based on its dependence from claim 8.
These are not persuasive for the reasons discussed above.
In response to Applicant's argument against the rejection of claims 5 and 8-9 under 35 USC 103 as unpatentable over SISK in view of MITTERMAYER, further in view of CSOMAI and KIM (see Response starting page 12), Examiner respectfully disagrees.
Applicant first argues patentability based on dependence of claim 5 from claim 1 (see Response page 13). This is not persuasive for the reasons discussed above.
Applicant then argues that CSOMAI fails to teach all the elements of S1 in claim 5, but does not argue against the rejection of record for this limitation (see Response, starting bottom of page 13):
constructing a multivariate dictionary, further comprising: selecting financial sector keywords to form a static dictionary; dynamically obtaining training set keywords through natural language processing and neural network training to form a dynamic dictionary, wherein the training set keywords do not overlap with the financial sector keywords; and combining the static dictionary and the dynamic dictionary to form a multivariate dictionary;
The rejection of record relies on SISK to teach “financial sector keywords”, “recognizing named entities”, and “dictionary” (see Previous action item 31), and notes that the deficiency is that SISK uses this dictionary (a database of data) without explanation of how it was created other than natural language processing and other linguistics technology.
The rejection of record then relies on MITTERMAYER to teach selecting a static dictionary and dynamically generating a dictionary of feature keywords (see Previous action items 33-34).
The rejection of record then adds the teachings of CSOMAI to teach constructing a multivariate dictionary which is a combination of a static dictionary and a dynamic dictionary, where the dynamic dictionary is generated through the use of a trained neural network and natural language processing. CSOMAI is not relied upon to teach the entirety of S1 in claim 5; it is used to improve the teachings of SISK in view of MITTERMAYER, which already teach a multivariate dictionary (static dictionary plus dynamic dictionary). CSOMAI is used to explain how the dynamic dictionary may be generated using a neural network.
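For illustration only, the structure of the multivariate dictionary recited in S1 (a static dictionary combined with a non-overlapping dynamic dictionary) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and keyword lists are hypothetical, and the list of candidate keywords merely stands in for the output of natural language processing and neural network training, which is not modeled here.

```python
def build_multivariate_dictionary(static_keywords, candidate_keywords):
    """Combine a static dictionary with a non-overlapping dynamic dictionary."""
    static_dict = set(static_keywords)
    # Keep only candidates that do not overlap with the static dictionary.
    dynamic_dict = set(candidate_keywords) - static_dict
    multivariate = static_dict | dynamic_dict
    return multivariate, static_dict, dynamic_dict


# Hypothetical financial sector keywords (static) and learned candidates (dynamic).
static_keywords = ["dividend", "merger"]
candidate_keywords = ["merger", "lawsuit", "recall"]

multivariate, static_dict, dynamic_dict = build_multivariate_dictionary(
    static_keywords, candidate_keywords
)
```

In this sketch the candidate "merger" is excluded from the dynamic dictionary because it already appears in the static dictionary, so the two dictionaries do not overlap before being combined.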
Applicant makes no argument against the reasonable combination of teachings for this limitation (see Previous action, item 37). As further evidence, Examiner provided KIM to tie all the other claim elements together (see Previous action, items 38-42) to ensure that it was clear that the trained neural network of CSOMAI (for learning keywords generally) could be used specifically for ranking (scoring) named entities in financial news articles in order to determine the relevancy of the news text to financial markets.
Accordingly, this rejection is respectfully maintained.
As there was no art rejection provided for claims 4 and 6-7 due to the 35 USC 112(b) rejection, any new grounds of rejection required for these claims are made in response to Applicant’s amendment.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) are, in claims 8-9:
a text news reading module, configured to read text news…
a text news preprocessing module, configured to preprocess the text news read by the text news reading module…
a feature value extraction module, configured to extract, from the original data, feature values…
a feature value weight determining module, configured to calculate a weight ratio corresponding to each feature value, and determine a score of each feature value.
a text news significance evaluation module, configured to evaluate significance of the text news…
a metadata module, configured to calculate a quantity of numerals…
a keyword module, configured to dynamically obtain keywords and sort the keywords…
a probability model feature value module, configured to perform dynamic distribution training on a probability model, and obtain a model and training set keywords for significance evaluation
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
The apparatus of claims 8 and 9 is shown in FIG 2 [056-057] without any additional description. It is noted that the instant application generically describes in [058] an “execution module that can run on the electronic device of FIG 3”. The operations performed by each named module are, at best, described with respect to the operational steps of FIG 1.
Should Applicant disagree with the analysis of claims 8-9 under 35 USC 112(f), the claims would be open to a rejection under 35 USC 101 as being directed to software per se.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 4 and 6-7 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 
Regarding dependent claim 4, as explained in detail in the Response to Arguments section above, the limitation “wherein the obtained original data includes different types of preprocessed data for extracting the different feature values” recited in claim 4 fails to find support in the portions of the disclosure as originally filed relevant to “preprocessing” ([031], [036-039]). There may be Chinese-language preprocessing and English-language preprocessing (see [031], [036]), but the claim is not limited by language. Paragraph [037] states “In some embodiments, different preprocessing methods are selected depending on the type of the feature value” (which is the language of the original limitation), and [038] states “Specifically, in some embodiments, the preprocessing may be different processing performed on the news text depending on the type of feature value. For example, the following four processing methods a, b, c, and d are performed in the preprocessing: a. deleting invalid characters; b. deleting stop words; c. deleting numerals; and d. extracting stems and restoring parts of speech” (none of which are described as language dependent).
Regarding dependent claim 6 (and thus claim 7, which inherits the deficiency of claim 6), as explained in detail in the Response to Arguments section above, the limitation “wherein the extracting the probability model feature value comprises the following steps…” is not supported in the disclosure as originally filed. This conclusion follows consideration of: extracting a probability model at [012, 048-052]; using a probability model feature value module at [016, 057]; extracting a probability model feature value at [032], without details; and, at [050], using a probability model to process a news text in the original data, for example, extracting a stem, restoring a part of speech, deleting a stop word, filtering out an invalid word, etc. (though generically for processing text, not specifically for probability model feature values).
Claims 6-7 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Regarding dependent claim 6 (and thus claim 7 which inherits the deficiencies of claim 6), it is not clear how extracting the probability model feature value comprises obtaining a latent Dirichlet allocation model (used for generating a document) without using the obtained model to perform the extraction of the feature value.
As there is no reasonable interpretation for the subject matter of claims 6-7 in view of the disclosure as originally filed, no art rejections are provided below. Note, however, the additional references at the end of this Office action, as well as the Previous Office action, which may be relevant to the claimed subject matter, in addition to any references cited in the art rejections of claims 1-5 and 8-12.
Claim Rejections – 35 USC 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-2 and 10-12 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by SISK, Jacob (Pub. No.: US 2013/0138577 A1, previously cited).
Regarding claim 1, SISK teaches the text-based news significance evaluation method, comprising (see e.g. method FIG 3 [0051], performed by the system in FIG 2 [0045] NAS includes a News Article Processing Engine that cooperates with IIT 126 and metadata module 116, that includes or may cooperate with one or more search engines for receiving and processing against metadata and aggregating, scoring, and filtering, recommending, and presenting results):
reading text news ((302) obtain news articles from internal or external sources);
preprocessing the text news to obtain original data ((304) apply pre-processing to documents…; [0041] Metadata module 116 includes is adapted to identify, extract or apply, or otherwise discern metadata associated with news stories. Such meta data may be used by NAS 100 to pre-process news stories, e.g., sentence splitting, speech tagging, parsing of text, tokenization, etc., to facilitate association of stories with one or more companies and to prepare the content for the application of computational linguistic processes and for sentiment analysis);
extracting at least three different feature values from the original data, wherein the feature values comprise metadata, a keyword, and a probability model feature value ((304)…to identify embedded metadata or other descriptors, process text, words, phrases, and attribute relevance to one or more companies; [0083] Features are encoded representations and analytics of how people think about, process and respond to news; [0084] shows examples of feature engineering including news type, genre, source, relevance level (probability), novelty; these are metadata, keywords, and probability);
obtaining a score of each feature value ((306) apply sentiment analysis…[0088] uses a sentiment engine) according to a weight ratio corresponding to each feature value ([0089] NAS may also look to the content and context of the news stories and weigh them in accordance with a predetermined taxonomy…weighting may factor in recency, criticality, repeatedness, trustworthiness, etc.; note that the computation of a “sentiment” requires some weighted computation of the elements which are used to determine the overall sentiment; the weighted computation for a news story is determined from the “context and content” of the news story; where the context and content are represented by the extracted feature values); and
evaluating significance of the text news according to the score of each feature value ((306) …and arrive at sentiment score associated with document as it relates to each company identified therein; see example in [0088] Assuming a sentiment engine that ingests (reads) news stories, identifies the news stories as relating to Company A and then scores the news stories as follows: +1 for positive story; 0 for neutral (e.g., mere mention) news story; and -1 for negative news story. Types of news stories or content may include: credit related; merger and acquisition; transaction; dividends; forecast; research and development; and FDA activity… NAS may use the derivative of sentiment or other consistent process, e.g., ratio, by looking at the most recent 10 stories and weighing them more heavily, as more relevant, for predicting stock price behavior (short-term or longer term)).
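For illustration only, the example discussed in connection with SISK [0088] (stories scored +1, 0, or -1, with the most recent 10 stories weighed more heavily) can be sketched as follows. This is a minimal sketch under stated assumptions and is not SISK's implementation: the function name, the specific weight of 2.0 for recent stories, and the use of a weighted average are hypothetical choices made for the example.

```python
def weighted_sentiment(story_scores, recent_n=10, recent_weight=2.0):
    """Weighted average of per-story scores (+1, 0, -1), ordered oldest to newest.

    The most recent `recent_n` stories receive `recent_weight`; older
    stories receive a weight of 1.0. Both parameters are hypothetical.
    """
    if not story_scores:
        return 0.0
    cutoff = len(story_scores) - recent_n
    total = 0.0
    weight_sum = 0.0
    for index, score in enumerate(story_scores):
        weight = recent_weight if index >= cutoff else 1.0
        total += weight * score
        weight_sum += weight
    return total / weight_sum


# Ten older negative stories followed by ten recent positive stories.
history = [-1] * 10 + [1] * 10
sentiment = weighted_sentiment(history)
```

An unweighted average of `history` would be zero; weighing the recent stories more heavily yields a positive overall score, which shows how weighting changes the result computed from the same per-story scores.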
Regarding dependent claim 2, incorporating the rejection of claim 1, SISK further teaches wherein the text news comprises a news text in a txt or pdf format (recited in the alternative, thus only one must be shown in the art; note [0041] parsing of text in the news content database which is necessary to prepare content for computational linguistics and sentiment analysis).
Regarding dependent claim 10, SISK further teaches the text-based news significance evaluation electronic device, comprising a memory and a processor, wherein the memory stores a computer program that runs on the processor, and the processor implements a method according to claim 1 (system in FIG 2 [0045] NAS includes a News Article Processing Engine that cooperates with IIT 126 and metadata module 116, that includes or may cooperate with one or more search engines for receiving and processing against metadata and aggregating, scoring, and filtering, recommending, and presenting results; [0040] NAS may be implemented in a variety of deployments and architectures. NAS data can be delivered as a deployed solution at a customer or client site, via a hosting solution(s) or central server, or through a dedicated service [0043] server 120).
Regarding dependent claim 11, incorporating the rejection of claim 10, SISK further teaches a bus and a communication interface, wherein the memory, the processor, and the communication interface are connected through the bus (these elements are inherent in the hardware of a server capable of executing stored code using a processor and communicating with a client device (e.g. serving data)).
Regarding dependent claim 12, incorporating the rejection of claim 1, SISK further teaches wherein the method is implemented by a processor in a computing device (e.g. system in FIG 2 in environment of FIG 1; note discussion in rejection of claim 10 for variety of implementation architectures [0040]; could be server 120 [0043] or could be client/access device 130 [0046]).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 3-4 and 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over SISK in view of MITTERMAYER, Marc-Andre’ (Forecasting Intraday Stock Price Trends with Text Mining Techniques. Proceedings of the 37th Hawaii International Conference on System Sciences – 2004. 0-7695-2056-1/04 © 2004 IEEE. 10 pages, previously cited).
Regarding dependent claim 3, incorporating the rejection of claim 1, while SISK clearly teaches preprocessing text news as explained in the rejection of claim 1, SISK does not appear to expressly disclose wherein the preprocessing comprises converting a character sequence to a lowercase character, selecting a word within a specific length range, deleting an invalid character, deleting a numeral, deleting a stop word, or extracting a stem and restoring a part of speech (recited in the alternative, thus only one must be shown in the art). Note that SISK states in [0041] metadata may be used by NAS 100 to pre-process news stories, e.g., sentence splitting, speech tagging, parsing of text, tokenization, etc ., to facilitate association of stories with one or more companies and to prepare the content for the application of computational linguistic processes and for sentiment analysis.
MITTERMAYER is directed to (abstract) predicting stock trends for the time immediately after the publication of press releases {examples of news articles}, using NewsCATS (News Categorization and Trading System), that consists mainly of three components. The first component retrieves relevant information from press releases through the application of text preprocessing techniques. The second component sorts the press releases into predefined categories. Finally, appropriate trading strategies are derived by the third component by means of the earlier categorization.
MITTERMAYER states (section 2.1, page 2) Most algorithms used in automatic text categorization (ATC) are familiar from data mining applications. The data analyzed by data mining are numeric, which means they are already in the format required by the algorithms. These algorithms can be applied in ATC, but first it is necessary to convert the content of the documents to a numeric representation. This step is called text preprocessing, and it is often divided into the activities feature extraction, feature selection, and document representation. MITTERMAYER goes on to explain each of these activities. 
Feature extraction is used to generate a dictionary of words and phrases (i.e. features), where candidates are compared against a list of stop words and the dictionary that is generated is then usually free of noise (e.g., articles, prepositions, numbers). Word stemming techniques can be applied so that features that differ only in the affix (suffix or prefix), i.e., words with the same stem, are treated as single features.
Feature selection is used to eliminate those features that provide few or less important items of information (e.g. using TFxIDF so that only the top n words with the highest scores are selected as features).
Document representation is when the documents are represented in terms of the features to which the dictionary has been reduced in the preceding steps. Thus, the representation of a document is a feature vector of n elements, where n is the number of features remaining when the selection process is complete.
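For illustration only, the three preprocessing activities summarized above (feature extraction, feature selection, and document representation) can be sketched as follows. This is a minimal sketch and not code from MITTERMAYER or SISK: the stop-word list, the function names, and the choice to compute TF over the whole collection are assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "is", "for"}


def extract_features(text):
    """Feature extraction: lowercase the text, keep alphabetic runs only
    (which drops numerals and punctuation), and remove stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def select_features(docs_tokens, n):
    """Feature selection: keep the n terms with the highest TF x IDF score.
    TF is counted over the whole collection (an assumption of this sketch)."""
    m = len(docs_tokens)
    tf = Counter(t for doc in docs_tokens for t in doc)
    df = Counter(t for doc in docs_tokens for t in set(doc))
    scores = {t: tf[t] * math.log(m / df[t]) for t in tf}
    # Sort by descending score; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:n]


def represent(doc_tokens, features):
    """Document representation: a feature vector of n term counts."""
    counts = Counter(doc_tokens)
    return [counts[f] for f in features]


docs = [
    "The profit of the company rose 10%",
    "The company announced a merger",
    "Profit warning for the company",
]
tokenized = [extract_features(d) for d in docs]
features = select_features(tokenized, 3)
matrix = [represent(doc, features) for doc in tokenized]  # m x n feature matrix
```

Stacking the per-document vectors gives the m×n feature matrix described in the reference, with one row per document and one column per selected feature.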
MITTERMAYER describes the basic architecture of NewsCATS in (section 3.1, page 3) which is designed to (1) automatically preprocess incoming press releases, (2) categorize them into different news types, and (3) derive trading rules. Note that the implementation (section 3.2, page 4) uses the above described pre-processing operations in a Document Preprocessing Engine (During the feature extraction phase the engine is able to select from various stemming algorithms (table lookup, peak & plateau, and Porter's Algorithm) and to remove predefined stop words. Feature selection is performed by choosing TF, IDF, or TF×IDF as the measure of frequency. Document representation can be performed with a boolean measure of frequency or with TF, IDF, or TF×IDF. The Document Preprocessing Engine is further able to create local dictionaries if required.).
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of SISK (determining relevancy of news to stock prices by analyzing the textual content) and MITTERMAYER (determining relevancy of news to stock prices by analyzing the textual content, where the analysis is performed over three different phases; focusing on the Document Preprocessing Engine), to have used the document preprocessing of MITTERMAYER when analyzing document features in SISK, thus teaching wherein the preprocessing comprises converting a character sequence to a lowercase character, selecting a word within a specific length range, deleting an invalid character, deleting a numeral, deleting a stop word, or extracting a stem and restoring a part of speech (where only one must be shown in the art). The combination is motivated by the explicit statements in MITTERMAYER in section 2.1, which explain that automatic text categorization requires preprocessing the documents in order to get them into a numeric representation which is suitable for data mining techniques.
Regarding dependent claim 4, incorporating the rejection of claim 3, SISK in view of MITTERMAYER, combined at least for the reasons discussed above, further teaches wherein the obtained original data includes different types of preprocessed data for extracting the different feature values (see e.g. SISK [0041] metadata may be used by NAS 100 to pre-process news stories, e.g., sentence splitting, speech tagging, parsing of text, tokenization, etc., to facilitate association of stories with one or more companies and to prepare the content for the application of computational linguistic processes and for sentiment analysis; see also MITTERMAYER (section 2.1, page 2) Most algorithms used in automatic text categorization (ATC) are familiar from data mining applications. The data analyzed by data mining are numeric, which means they are already in the format required by the algorithms. These algorithms can be applied in ATC, but first it is necessary to convert the content of the documents to a numeric representation. This step is called text preprocessing, and it is often divided into the activities feature extraction, feature selection, and document representation); wherein for extracting metadata, the text news is preprocessed by one or more methods selected from the group consisting of deleting an invalid character and deleting a stop word (recited in the alternative, thus only one must be shown in the art; see discussion claim 3 as taught by MITTERMAYER); for extracting the keyword the text news is preprocessed by one or more methods selected from the group consisting of deleting an invalid character, deleting a stop word, and deleting a numeral (recited in the alternative, thus only one must be shown in the art; see discussion claim 3 as taught by MITTERMAYER); and for extracting the probability model feature value, the text news is preprocessed by one or more methods selected from the group consisting of deleting an invalid character, deleting a stop word, deleting a numeral, and extracting a stem and restoring a part of speech (recited in the alternative, thus only one must be shown in the art; see discussion claim 3 as taught by MITTERMAYER).
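As a hedged illustration of the claim 4 limitation discussed above, the selection of different preprocessing methods per feature type can be sketched as a lookup of step lists. Nothing below is taken from the application or the cited art: the function names, the stop-word list, the definition of an "invalid character", and the crude suffix stripper standing in for stemming are all assumptions of the sketch.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "and"}


def delete_invalid_characters(text):
    # Assumption: anything other than letters, digits and whitespace is invalid.
    return re.sub(r"[^A-Za-z0-9\s]", " ", text)


def delete_stop_words(text):
    return " ".join(w for w in text.split() if w.lower() not in STOP_WORDS)


def delete_numerals(text):
    return re.sub(r"\d+", " ", text)


def crude_stem(text):
    # Stand-in for a real stemmer: strips a trailing "s" from longer words.
    return " ".join(
        w[:-1] if len(w) > 3 and w.endswith("s") else w for w in text.split()
    )


# One step list per feature type, mirroring the groupings recited in claim 4.
PIPELINES = {
    "metadata": [delete_invalid_characters, delete_stop_words],
    "keyword": [delete_invalid_characters, delete_stop_words, delete_numerals],
    "probability_model": [
        delete_invalid_characters,
        delete_stop_words,
        delete_numerals,
        crude_stem,
    ],
}


def preprocess(text, feature_type):
    """Apply the step list for the given feature type, then normalize spaces."""
    for step in PIPELINES[feature_type]:
        text = step(text)
    return " ".join(text.split())


sample = "The shares of ACME rose 12% in 2022!"
by_type = {name: preprocess(sample, name) for name in PIPELINES}
```

In this sketch the metadata pipeline keeps numerals, so a later step could still count them, while the keyword and probability-model pipelines remove them.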
Regarding claim 8, SISK in view of MITTERMAYER, combined at least for the reasons discussed above, similarly teaches the text-based news significance evaluation apparatus (e.g. method FIG 3 [0051], performed by the system in FIG 2 [0045] NAS includes a News Article Processing Engine that cooperates with IIT 126 and metadata module 116, that includes or may cooperate with one or more search engines for receiving and processing against metadata and aggregating, scoring, and filtering, recommending, and presenting results), comprising:
a text news reading module, configured to read text news, wherein the text news comprises a news text in a txt or pdf format (executable software to perform elements, as rejected in claims 1, 2);
a text news preprocessing module, configured to preprocess the text news read by the text news reading module to obtain original data, wherein the preprocessing comprises converting a character sequence to a lowercase character, selecting a word within a specific length range, deleting an invalid character, deleting a numeral, extracting a stem and restoring a part of speech, or deleting a stop word (executable software to perform elements, as rejected in claims 1, 3);
a feature value extraction module, configured to extract at least three different feature values, from the original data, said feature values comprising metadata, a keyword, and a probability model feature value (executable software to perform element as rejected in claim 1);
a feature value weight determining module, configured to calculate a weight ratio corresponding to each feature value, and determine a score of each feature value (executable software to perform element as rejected in claim 1); and
a text news significance evaluation module, configured to evaluate significance of the text news according to the score of each feature value (executable software to perform element as rejected in claim 1).
Regarding dependent claim 9, incorporating the rejection of claim 8, SISK in view of MITTERMAYER, combined at least for the reasons discussed above, further teaches wherein the feature value extraction module further comprises:
a metadata module, configured to calculate a quantity of numerals comprised in the original data (executable software to determine metadata within the articles; see e.g. SISK [0041] Metadata module 116 is adapted to identify, extract or apply, or otherwise discern metadata associated with news stories; additional information explaining feature extraction may be found in MITTERMAYER (section 2.1));
a keyword module, configured to dynamically obtain keywords and sort the keywords according to a popularity value (executable software to determine relevant keywords within articles; see e.g. SISK [0044] NAP Engine 209 includes one or more feature vector builders or feature engine 206, predictive modeling module 207, and learning or training engine or module 208; additional information explaining feature selection may be found in MITTERMAYER (section 2.1) particularly using TFxIDF; “With the combined procedure TF×IDF the two measures are aggregated into one variable. Whatever metric is used, at the end of the feature selection process only the top n words with the highest scores are selected as features”);
a probability model feature value module, configured to perform dynamic distribution training on a probability model, and obtain a model and training set keywords for significance evaluation (executable software to determine probabilities used for scoring; see e.g. SISK [0044] NAP Engine 209 includes one or more feature vector builders or feature engine 206, predictive modeling module 207, and learning or training engine or module 208; additional information explaining document representation may be found in MITTERMAYER (section 2.1) “representation of a document is a feature vector of n elements, where n is the number of features remaining when the selection process is complete. The whole document collection can therefore be seen as an m×n-feature matrix F (with m as the number of documents), where the element fij represents the frequency of occurrence of feature j in document i…”).
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over SISK in view of MITTERMAYER, further in view of CSOMAI et al. (Pub. No.: US 2010/0145678 A1), further in view of KIM et al. (US 11,334,949 B2, priority to provisional application No. 62/913,885, filed on Oct. 11, 2019; as the provisional application fully supports the relied-upon subject matter, the patent publication is relied upon for convenience in citation, previously cited).
Regarding dependent claim 5, SISK clearly teaches extracting keywords from news text as explained in the rejection of claim 1. SISK is not silent with respect to recognizing selected financial sector keywords or recognizing named entity keywords ([0016] metadata may include, for example: company identifiers; topic codes- identifying subject matter; stage of the story alert, article, update, etc.; and business sector and geographic classification codes; index references to similar articles…[0037-0038] NAS employs natural language processing and other linguistic technology to score text across various dimensions and may utilize a variety of text scoring and metadata types). Further, SISK clearly uses [0050] databases store documents, collections, and data (e.g. dictionary) associated with processing such information. SISK merely does not describe the processes for creating the databases of keywords (the dictionaries) or recognizing the keywords within the texts because SISK relies on [0036-0037] natural language processing and other linguistics technology without providing any additional details.
Nonetheless, SISK cannot be relied upon to expressly disclose the underlined portions: wherein the extracting the keyword comprises the following steps:
S1: constructing a multivariate dictionary, further comprising: selecting financial sector keywords to form a static dictionary; dynamically obtaining training set keywords through natural language processing and neural network training to form a dynamic dictionary, wherein the training set keywords do not overlap with the financial sector keywords; and combining the static dictionary and the dynamic dictionary to form a multivariate dictionary;
S2: recognizing a named entity, further comprising: obtaining a named entity for evaluation through natural language processing and neural network training; and recognizing the named entity as a keyword by using a neural network model; and
S3: sorting keywords, further comprising: extracting, through popularity search, a popularity value from the keyword in the multivariate dictionary described in step S1 and the keyword obtained by recognizing the named entity in step S2, and sorting the keywords according to the popularity value.
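For orientation only, the data flow recited in steps S1 and S3 (combining a static and a dynamic dictionary without overlap, then sorting by a popularity value) can be sketched as below. The neural network training and named entity recognition of steps S1 and S2 are not modeled: the learned keywords, the named entities, and the popularity values are hypothetical inputs, and the function names are assumptions of the sketch.

```python
def build_multivariate_dictionary(static_keywords, learned_keywords):
    """S1: combine a static dictionary with a dynamic one.
    Learned keywords already in the static dictionary are dropped so the
    two dictionaries do not overlap."""
    static = set(static_keywords)
    dynamic = set(learned_keywords) - static
    return static | dynamic, dynamic


def sort_by_popularity(keywords, popularity):
    """S3: sort keywords by descending popularity value.
    Keywords with no known popularity count as 0; ties sort alphabetically."""
    return sorted(keywords, key=lambda k: (-popularity.get(k, 0), k))


# Hypothetical inputs, for illustration only.
static_keywords = ["dividend", "merger"]
learned_keywords = ["merger", "lockdown"]   # assumed output of a trained model
named_entities = ["ACME"]                   # assumed output of an NER model
popularity = {"merger": 50, "ACME": 80, "lockdown": 50}

multivariate, dynamic = build_multivariate_dictionary(static_keywords, learned_keywords)
ranked = sort_by_popularity(multivariate | set(named_entities), popularity)
```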
It is noted that the instant application as originally filed provides little to no details regarding the claimed training of neural networks for keyword analysis (see e.g. [011] which is a verbatim recitation of the claim elements; [018] broadly describing the benefits of using dynamically-determined keywords; [043-046] restatement of claim elements with respect to specific example words; [062] summary of invention).
MITTERMAYER, as discussed above, explains that most automatic text categorization includes feature extraction and feature selection in order to generate a dictionary of words and phrases (features) that describes the document, and ultimately convert the documents into feature vectors (an m×n feature matrix, with m as the number of documents, where the element fij represents the frequency of occurrence of feature j in document i; typical frequency measures are, again, TF, IDF, and TF×IDF). In MITTERMAYER’s NewsCATS system (see section 3.2 page 4), the system uses these techniques to extract features of financial news articles and creates local dictionaries if needed.
Thus, SISK-MITTERMAYER, combined for the reasons discussed above, may be relied upon to teach known natural language processing operations for extracting keywords, namely using a predefined static dictionary as well as dynamically generating a dictionary of feature keywords using natural language processing, extracting the keywords for a particular document from the static and dynamically-generated dictionaries with a popularity value (frequency measure of the keyword as it appears in the document), so that they may be ranked and sorted (MITTERMAYER: as explained in section 2.1 “When TF is used it is assumed that important terms occur in the document collection more often than unimportant ones… IDF presupposes that the rarest terms in the document collection have the highest explanatory power… only the top n words with the highest scores are selected as features”).
SISK-MITTERMAYER is silent with respect to neural network training for determining the words in the dynamic dictionary. Further, SISK-MITTERMAYER does not provide any specific details with respect to obtaining a named entity for evaluation through natural language processing and neural network training; and recognizing the named entity as a keyword by using a neural network model.
CSOMAI is an explicit teaching for constructing a multivariate dictionary, further comprising: selecting {domain} keywords to form a static dictionary; dynamically obtaining training set keywords through natural language processing and neural network training to form a dynamic dictionary, wherein the training set keywords do not overlap with the {domain} keywords; and combining the static dictionary and the dynamic dictionary to form a multivariate dictionary ([0035] manually-constructed lists of keywords form "gold standard" references for a "good" index or keyword list in terms of usefulness, relevance and importance to the subject matter or domain-specific knowledge within the document, and inclusion of an appropriate number of keywords (example of static dictionary); [0034] unsupervised learning method to identify candidate entries for consideration for inclusion as the final keywords, [0032] where unsupervised learning methods include using neural networks; and where [0035] supervised training (e.g. using the constructed list of keywords) may be performed before the unsupervised training; collectively generating a neural network capable of recognizing both domain-specific static dictionary keywords and dynamically determined keywords (collectively the “multivariate dictionary”)).
Note that one of the intended benefits of the teachings in CSOMAI is improved ranking (mechanism used for sorting) of keywords [0031], such as “Page Rank” [0061] (a well-known measure of the relevance of documents to search terms). Further, CSOMAI explains how the improved keyword dictionary was evaluated [0080]: candidates were ranked by the confidence score assigned by the machine learning algorithm, and the top entries in this ranking were selected.
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of SISK using the known natural-language processing for feature (keyword) extraction taught by MITTERMAYER, to have improved keyword recognition through the use of supervised and unsupervised machine learning techniques, including the use of neural networks using the teachings of CSOMAI, with a reasonable expectation of success, the combination motivated by the need to do so as stated in CSOMAI [0010].
KIM is broadly directed to (abstract) a framework for automated news recommendation system for financial analysis [that] includes the automated ingestion, relevancy, clustering, and ranking of news events for financial analysts in the capital markets. The framework is adaptable to any form of input news data and can seamlessly integrate with other data used for analysis like financial data.
Of particular interest (col 8 line 1) example of supervised learning algorithm that can be used to train artificial intelligence models is Named Entity Recognition (NER), Deep Neural Network Classification. (col 13 line 36) Named entities are first extracted from each news article using a statistical model. Each article is represented by a one-hot vector built based on its named entities. (col 13 line 64) The two-phase clustering is compared with 1) a name entity-based (NER), in which spaCy natural language processing and one-hot encoding are used to extract and model named entities; and 2) a latent dirichlet allocation (LDA) based model, in which an LDA model is used to compute low dimensional news representation.
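The representation quoted above, in which each article is represented by a one-hot vector built from its named entities, can be illustrated with the following minimal sketch. It assumes the named entities have already been extracted by some statistical model; the entity lists and function names are hypothetical and are not taken from KIM.

```python
def build_entity_index(articles_entities):
    """Assign one vector position to every distinct named entity."""
    vocabulary = sorted({e for entities in articles_entities for e in entities})
    return {entity: i for i, entity in enumerate(vocabulary)}


def encode(entities, index):
    """Represent one article as a 0/1 vector over the entity vocabulary."""
    vector = [0] * len(index)
    for entity in entities:
        if entity in index:          # entities outside the vocabulary are ignored
            vector[index[entity]] = 1
    return vector


# Hypothetical, pre-extracted entities for two articles.
articles_entities = [["ACME Corp", "NYSE"], ["Globex", "NYSE"]]
index = build_entity_index(articles_entities)
vectors = [encode(entities, index) for entities in articles_entities]
```

Articles that share entities then have overlapping non-zero positions, which is what makes such vectors usable for clustering or ranking.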
Note also process 600 in FIG 6, which is a flowchart illustrating a process for identifying and recommending relevant news articles (discussed in col 16, lines 1-19).
From KIM, it is clear that named entities recognized in financial news articles using a trained neural network can be used for ranking the relevance of the financial news article (obtaining a named entity for evaluation through natural language processing and neural network training; and recognizing the named entity as a keyword by using a neural network model), which is how company information is used in SISK as previously explained.
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of SISK (as improved by MITTERMAYER and CSOMAI) and KIM before them, to have used trained neural networks for the recognition of company entities (as taught in KIM) as part of the keyword extraction process taught in SISK, for the purpose of identifying the most relevant keywords (as taught in MITTERMAYER) in order to determine the relevancy of the news text to financial markets, with a reasonable expectation of success. The combination is motivated by the need in SISK to use natural language processing and linguistics in order to identify the keywords (where keywords include named entities) and the teaching in KIM of a specific natural language processing for identifying named entities that uses trained neural networks, where the identification may also be used for ranking the relevancy of news articles to financial markets.
Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over SISK as applied to claim 1 above, further in view of KIM, further in view of GRIFFITHS et al. (Finding scientific topics. Published 04/06/2004 in PNAS, vol. 101, suppl. 1, www.pnas.org/cgi/doi/10.1073/pnas.0307752101. pp. 5228–5235; based on the Arthur M. Sackler Colloquium of the National Academy of Sciences, “Mapping Knowledge Domains,” held May 9–11, 2003, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA).
For purposes of rejection only, claims 6-7 are interpreted as if the extraction of probability model feature value requires obtaining a model, training the model, and then using the trained model for the extraction, even though not all of these elements are recited in the claims. This rejection is a “best guess” interpretation of the claim elements which are neither described nor clearly recited as explained in the 35 USC 112 rejections above.

[Image: media_image3.png]
Regarding dependent claim 6, incorporating the rejection of claim 1, while SISK may be relied upon to teach an interpretation of extracting the probability model feature value as well as other feature values as explained in claim 1, SISK cannot be relied upon to expressly disclose: obtaining a model and training set keywords for significance evaluation by training the probability model, wherein the probability model is a latent Dirichlet allocation model; and in the latent Dirichlet allocation model, a document is generated in the following method:

[Image: media_image4.png, containing the claimed terms and equations for generating a document in the latent Dirichlet allocation model]

Note that this claim only obtains and trains a latent Dirichlet allocation (LDA) model which can be used to generate another document and doesn’t actually perform any feature value extraction using the trained LDA model.
SISK is silent regarding the claimed feature value LDA extraction model and at best describes in [0037] employing natural language processing and other linguistics technology to score text across several dimensions.
KIM is broadly directed to (abstract) a framework for automated news recommendation system for financial analysis [that] includes the automated ingestion, relevancy, clustering, and ranking of news events for financial analysts in the capital markets. The framework is adaptable to any form of input news data and can seamlessly integrate with other data used for analysis like financial data.
Of particular interest (col 8 line 1) example of supervised learning algorithm that can be used to train artificial intelligence models is Named Entity Recognition (NER), Deep Neural Network Classification. (col 13 line 36) Named entities are first extracted from each news article using a statistical model. Each article is represented by a one-hot vector built based on its named entities. (col 13 line 64) The two-phase clustering is compared with 1) a name entity-based (NER), in which spaCy natural language processing and one-hot encoding are used to extract and model named entities; and 2) a latent dirichlet allocation (LDA) based model, in which an LDA model is used to compute low dimensional news representation.
Thus, from KIM it is clear that the use of an LDA-based model for the analysis of news articles was known in the art, thus teaching obtaining a model and training set keywords for significance evaluation by training the probability model, wherein the probability model is a latent Dirichlet allocation model. However, KIM provides no details of how the LDA is generated, merely how it is used.
GRIFFITHS teaches a technique for generating a statistical model to analyze technical papers in order to discover a set of topics covered by the papers which is based on a generative model (see page 5228, col 1). The particular generative model we use, called Latent Dirichlet Allocation, was introduced in [Blei, D. M., Ng, A. Y.&Jordan, M. I. (2003) J. Machine Learn. Res. 3, 993–1022].
GRIFFITHS states (page 5229, col 1) Viewing documents as mixtures of probabilistic topics makes it possible to formulate the problem of discovering the set of topics that are used in a collection of documents and then provides a number of term definitions and formulas for maximizing the likelihood expectations that any particular word is a topic. Latent Dirichlet Allocation (1) is one such model, combining Eq. 1 with a prior probability distribution on (theta) to provide a complete generative model for documents. The process of using the LDA generative model is explained.
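For reference, the mixture form referred to as Eq. 1 and the Dirichlet priors that complete the generative model are commonly written as follows. This is a restatement in the usual notation of the LDA literature, offered as an aid to reading and not as a reproduction of the claim language; the symbols (T topics, θ for a document's topic proportions, φ for a topic's word distribution, α and β as hyperparameters) follow common convention.

```latex
P(w_i) = \sum_{j=1}^{T} P(w_i \mid z_i = j)\, P(z_i = j)

\theta^{(d)} \sim \mathrm{Dirichlet}(\alpha), \qquad
z_i \mid \theta^{(d_i)} \sim \mathrm{Discrete}\big(\theta^{(d_i)}\big)

\phi^{(j)} \sim \mathrm{Dirichlet}(\beta), \qquad
w_i \mid z_i, \phi \sim \mathrm{Discrete}\big(\phi^{(z_i)}\big)
```

Here w_i is the i-th word token, z_i is its latent topic, and d_i is the document containing it.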
GRIFFITHS then explains Using Gibbs Sampling to Discover Topics (page 5229 col 1) as using the probability model for Latent Dirichlet Allocation and, after defining further calculations (page 5229 col 2), explaining a technique applied from statistical physics for sampling terms in order to (top of page 5230 col 1) obtain the full conditional distribution.
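The sampling procedure summarized above can be sketched, under stated assumptions, as a small collapsed Gibbs sampler. This is an illustrative sketch and not the code of GRIFFITHS, KIM, or the applicant: documents are assumed to be lists of integer word ids, the hyperparameter defaults are arbitrary, and the function name is hypothetical. Each token's topic is resampled in proportion to (word-topic count + β) / (topic total + Wβ) times (document-topic count + α), with the token's own assignment removed from the counts first.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA over documents given as word-id lists."""
    rng = random.Random(seed)
    n_wt = [[0] * num_topics for _ in range(vocab_size)]  # word-topic counts
    n_dt = [[0] * num_topics for _ in docs]               # document-topic counts
    n_t = [0] * num_topics                                # tokens per topic
    z = []                                                # topic of every token

    # Random initial assignment of a topic to every token.
    for d, doc in enumerate(docs):
        assignments = []
        for w in doc:
            t = rng.randrange(num_topics)
            assignments.append(t)
            n_wt[w][t] += 1
            n_dt[d][t] += 1
            n_t[t] += 1
        z.append(assignments)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                t = z[d][i]
                n_wt[w][t] -= 1
                n_dt[d][t] -= 1
                n_t[t] -= 1
                # Unnormalized full conditional for each topic j. The
                # document-length denominator is constant in j and is omitted.
                weights = [
                    (n_wt[w][j] + beta) / (n_t[j] + vocab_size * beta)
                    * (n_dt[d][j] + alpha)
                    for j in range(num_topics)
                ]
                t = rng.choices(range(num_topics), weights=weights)[0]
                # Record the new assignment.
                z[d][i] = t
                n_wt[w][t] += 1
                n_dt[d][t] += 1
                n_t[t] += 1
    return z, n_wt, n_dt


# Toy corpus of word ids, for illustration only.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0]]
z, n_wt, n_dt = gibbs_lda(docs, num_topics=2, vocab_size=4)
```

The returned counts can be normalized (after adding β or α) to estimate each topic's word distribution and each document's topic proportions.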
The claimed terms and equations above are restatements of the terms and equations presented in the discussion of GRIFFITHS.
In other words, GRIFFITHS is a clear teaching of the claimed model for obtaining topic words from a document with some probability. Note the result of applying the model on page 5232, col 2:
Finding strong diagnostic topics for almost all of the minor categories suggests that these categories have differences that can be expressed in terms of the statistical structure recovered by our algorithm. The topics discovered by the algorithm are found in a completely unsupervised fashion, using no information except the distribution of the words themselves, implying that the minor categories capture real differences in the content of abstracts, at the level of the words used by authors. It also shows that this algorithm finds genuinely informative structure in the data, producing topics that connect with our intuitive understanding of the semantic content of documents.
Accordingly, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of SISK-KIM and GRIFFITHS before them, to have combined SISK-KIM (using an LDA generative model for determining topics) and GRIFFITHS (explaining the steps and equations needed for using an LDA generative model with Gibbs sampling to generate topics) by using the known technique in GRIFFITHS to generate the model used in SISK-KIM with a reasonable expectation of success, the combination motivated at least by the statement in GRIFFITHS that topics are found in a completely unsupervised fashion, using no information except the distribution of the words themselves.
Regarding dependent claim 7, incorporating the rejection of claim 6, SISK-KIM-GRIFFITHS, combined at least for the reasons discussed above, further teaches wherein 
the obtaining of the training set keywords is dynamic, a target word that has not appeared in a training set is added as a keyword in the training set by training (see at least KIM (col 8 lines 14-19) which explains that supervised learning and unsupervised learning learn from a dataset, reinforcement learning learns from interactions with an environment… (as new information is obtained, the models can learn the new information as well); see also (col 14 line 33) Character convolutions are used to learn morphological features and form a valid representation for out-of-vocabulary words.; see also GRIFFITHS (page 5230, col 1) With a set of samples from the posterior distribution P(z|w), statistics that are independent of the content of individual topics can be computed by integrating across the full set of samples; equations for estimating probability distributions; These values correspond to the predictive distributions over new words w and new topics z conditioned on w and z.), and 
the training set keywords are sorted in real time through popularity search and then used for evaluation {of the news article} (SISK, when determining the relevance of different features in feature engineering (thus based on the created dictionary of words found in the news articles) [0084] considers News Topic Differentiation (e.g., Was the news item associated with an important topic code? (AAA, MRG, DIV, CORPD, RES, RESF, RSCH));… Framing or regime (e.g., Was the previous day's (sentiment, return, volatility) low or high?)).

CONCLUSION
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 



Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMY M LEVY whose telephone number is (571)270-3771. The examiner can normally be reached Mon-Fri 8am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KIEU VU can be reached on (571) 272-4057. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Amy M Levy/Primary Examiner, Art Unit 2173