DETAILED ACTION
This action is responsive to the Application filed on 14 June 2021. Claims 1-12 are pending in the case. Claims 1 and 6 are the independent claims.
This action is non-final.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, or 365(c) is acknowledged.
In particular, the instant application is a national stage of PCT/KR2019/015553 (filed 11/14/2019) which relies on a benefit of priority to KR10-2018-0162214 (filed 12/14/2018).
Acknowledgement of References Cited By Applicant
As required by MPEP 609 (c), the Applicants’ submission of the Information Disclosure Statement(s) on 06/14/2021 is acknowledged by the examiner and the cited references have been considered in the examination of the claims now pending. 
As required by MPEP 609 (c)(2), a copy of each PTOL-1449, initialed and dated by the Examiner, is attached to the instant office action.
Specification
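The disclosure is objected to for minor informalities: in [49] it is not clear how the context anomaly detector neural network 40 may cause itself to perform learning (see the passage reproduced below, which states that the context anomaly detector neural network 40 “causes the context anomaly detector neural network 40 to perform learning”):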
Again returning to FIG. 1, by using the embedding vector sequence generated by the context embedder neural network 32 subjected to learning as described above, the context anomaly detector neural network 40 causes the context anomaly detector neural network 40 to perform learning so as to calculate a result value indicating whether a contextually-anomalous sentence exists in the input document data.
Applicant’s assistance is required in identifying and correcting any deficiencies in the disclosure discovered during prosecution.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) are: 
a sentence encoder configured to… encode each of a plurality of sentences …; a context embedder neural network configured to convert the generated encoding vectors into a plurality of context embedding vectors…; a context anomaly detector neural network configured to calculate a result value…; a detector learning unit configured to… cause the context anomaly detector neural network to perform learning…; a detection unit configured to …determine whether the anomalous sentence exists in the suspected data… in claim 1;
a sentence sampling module configured to generate context learning data…; a distance learning neural network configured to … calculate a distance value between the pairs of embedding vectors; an embedder learning unit configured to cause the context embedder neural network to perform learning… in claim 2;
…encoding, by a sentence encoder, each of a plurality of sentences…; converting, by a context embedder neural network, the generated encoding vectors into a plurality of context embedding vectors…; …calculating, by a context anomaly detector neural network, a result value…; …determining, by a detection unit, whether the anomalous sentence exists in the suspected data… in claim 6;
…causing, by a detector learning unit, the context anomaly detector neural network to perform learning… in claim 7; and
…generating, by a sentence sampling module, context learning data…; …calculating, by a distance learning neural network, a distance value…; …causing, by an embedder learning unit, the context embedder neural network to perform learning … in claim 8.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. 
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.
Note that should Applicant disagree with the above interpretation under 35 U.S.C. 112(f), the apparatus of claims 1-5 would be subject to a rejection under 35 U.S.C. 101 as software per se, as there are no hardware/structural elements in the written description of the instant application other than the explanation in [55] of using program modules executed by a computer.
Note that [28] makes clear that the terms should not be construed as limited to a conventional or lexical meaning; thus “module”, “unit”, and “neural network” may be interpreted simply as any software mechanism which performs a claimed function and/or has a claimed characteristic.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  The claim(s) does/do not fall within at least one of the four categories of patent eligible subject matter because the claimed invention encompasses signals per se. 
During examination, the claims must be interpreted as broadly as their terms reasonably allow (In re American Academy of Science Tech Center, 367 F.3d 1359, 1369, 70 U.S.P.Q.2d 1827, 1834 (Fed. Cir. 2004)).
Claim 12 recites A computer-readable recording medium in which a program for performing the method for detecting a contextually-anomalous sentence in a document according to claim 6 is recorded. The instant application does not provide a specific definition for computer-readable recording medium (see [55] which provides only examples of “storage medium”).
The broadest reasonable interpretation of a claim drawn to a computer-readable recording medium covers forms of non-transitory tangible media and transitory propagating signals per se in view of the ordinary and customary meaning of computer readable media (Ex parte Mewherter (PTAB 2013) finding that under the "broadest reasonable interpretation" a "machine readable storage medium" continues to encompass unpatentable transitory signals). Transitory propagating signals are non-statutory subject matter (In re Nuijten, 500 F.3d 1346, 1356-57, 84 U.S.P.Q.2d 1495, 1502 (Fed. Cir. 2007) (transitory embodiments are not directed to statutory subject matter).  See also Subject Matter Eligibility of Computer Readable Media, 1351 Off. Gaz. Pat. Office 212 (Feb. 23, 2010)).
Claim Rejections – 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-12 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Regarding claim 1, the claim recites in part …a context anomaly detector neural network configured to calculate a result value indicating whether a contextually-anomalous sentence exists in the document data from the embedding vector sequence; … perform learning so that a difference between the result value calculated by the context anomaly detector neural network and an expected value indicating whether an anomalous sentence exists in the learning data falls within a predetermined range; and a detection unit configured to, if the document data is suspected data, determine whether the anomalous sentence exists in the suspected data based on the result value calculated by the context anomaly detector neural network. It is not clear whether “contextually-anomalous sentence” and “anomalous sentence” are intended to refer to the same or different elements. 
For purposes of interpretation, it is assumed the claim recites …perform learning so that a difference between the result value calculated by the context anomaly detector neural network and an expected value indicating whether [[an]] the contextually-anomalous sentence exists in the learning data falls within a predetermined range; and a detection unit configured to, if the document data is suspected data, determine whether the contextually-anomalous sentence exists in the suspected data based on the result value calculated by the context anomaly detector neural network.
Regarding claim 6, the claim recites … a third step of calculating, by a context anomaly detector neural network, a result value indicating whether a contextually-anomalous sentence exists in the document data from the embedding vector sequence; and a fourth step of, if the document data is suspected data, determining, by a detection unit, whether the anomalous sentence exists in the suspected data based on the result value calculated by the context anomaly detector neural network. It is not clear whether “a contextually-anomalous sentence” and “the anomalous sentence” are intended to refer to the same or different elements.
For purposes of interpretation, it is assumed the claim recites …a fourth step of, if the document data is suspected data, determining, by a detection unit, whether the contextually-anomalous sentence exists in the suspected data based on the result value calculated by the context anomaly detector neural network.
Regarding dependent claim 7, the claim recites …before the fourth step, if the document data is learning data, causing, by a detector learning unit, the context anomaly detector neural network to perform learning so that a difference between the result value calculated by the context anomaly detector neural network and an expected value indicating whether an anomalous sentence exists in the learning data falls within a predetermined range. This recitation is unclear for similar reasons, and “an anomalous sentence” is interpreted as referring to the same element (i.e., the contextually-anomalous sentence).
Regarding dependent claims 2-5 and 8-12, dependent claims necessarily inherit the deficiencies of the parent claim.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 6-7, and 12 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by CAO et al. (Pub. No.: US 2020/0134016 A1, priority to provisional application no. 62/753,621, filed on Oct. 31, 2018; the relevant portions relied upon below are fully supported in the provisional, thus the published application is relied upon for ease of citation).
Regarding claim 1, CAO teaches the apparatus for detecting a contextually-anomalous sentence in a document ((abstract) generating a coherence score for text data based on aggregating the local coherence scores for adjacent sentence pairs within the text data using a neural network), the apparatus (FIG 1, FIG 3) comprising:
a sentence encoder configured to, when document data is input, encode each of a plurality of sentences included in the document data to generate an encoding vector sequence including a plurality of encoding vectors ([0094-0095] neural model…some sentence encoder transforming sentences into real-valued vectors S and T; see also [0099] and alternatively [0101-0102] which is convolutional variant of the proposed approach, “an embedding layer, mapping the sequence of tokens in a sentence to a vector representation”; see also [0052-0056]);
a context embedder neural network configured to convert the generated encoding vectors into a plurality of context embedding vectors corresponding to each of the plurality of encoding vectors included in the encoding vector sequence to generate an embedding vector sequence ([0101-0102] which is convolutional variant of the proposed approach “a feature layer mapping each pair of sentences to a feature space”; see also [0057] extracting features from adjacent sentence pairs);
a context anomaly detector neural network (the neural model) configured to calculate a result value (the global coherence score for the entire text) indicating whether a contextually-anomalous sentence exists in the document data from the embedding vector sequence ([0058] neural network trained using string tokens of adjacent sentence pairs of the training text as positive examples, string tokens of non-adjacent sentence pairs as negative examples; [0061] for each adjacent sentence pair, a local coherence score engine generates a local coherence score for the pair…; [0062] local coherence scores are aggregated to generate global coherence score);
a detector learning unit ([0058] neural network engine 106 maintains the neural network) configured to, if the document data is learning data ([0058] trained against a plurality of corpuses… multiple topics), cause the context anomaly detector neural network to perform learning [0058] so that a difference between the result value calculated by the context anomaly detector neural network and an expected value indicating whether an {the contextually-} anomalous sentence exists in the learning data falls within a predetermined range ([0091] training objective is to condition a loss function [0092] to encourage high score for positive training example and low score for negative training example; see alternatively [0103] which relies on the construction of negative examples from training corpus and margin loss function that strives to encourage the model to assign low scores to positive pairs of sentences and a high score to negative pairs of sentences); and
a detection unit (software generally) configured to, if the document data is suspected data (interpreting “suspected data” as merely input data to be tested for coherence), determine whether the {contextually-}anomalous sentence exists in the suspected data based on the result value calculated by the context anomaly detector neural network (output the result; note [0066] system can be utilized as a coherence checking device, part of a larger system; note also [0106] evaluation/use of model).
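The mapping of claim 1 onto CAO set out above can be sketched as follows. This is a minimal illustration under stated assumptions, not CAO's implementation: the names `cosine_score`, `local_coherence_scores`, `global_coherence_score`, and `anomaly_suspected` are hypothetical, cosine similarity merely stands in for the trained neural pair scorer, the mean is one possible aggregation, and the 0.6 threshold is arbitrary.

```python
import math
from statistics import mean
from typing import Callable, List, Sequence

Vector = Sequence[float]
PairScorer = Callable[[Vector, Vector], float]


def cosine_score(a: Vector, b: Vector) -> float:
    """Hypothetical stand-in for a trained pair scorer: cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def local_coherence_scores(sentences: Sequence[Vector], scorer: PairScorer) -> List[float]:
    """Score every adjacent sentence pair (one local score per pair)."""
    return [scorer(a, b) for a, b in zip(sentences, sentences[1:])]


def global_coherence_score(local_scores: Sequence[float]) -> float:
    """Aggregate local scores into one document-level score (assumed: the mean)."""
    return mean(local_scores)


def anomaly_suspected(sentences: Sequence[Vector],
                      scorer: PairScorer = cosine_score,
                      threshold: float = 0.6) -> bool:
    """Flag the document when the global score falls below an assumed threshold."""
    local = local_coherence_scores(sentences, scorer)
    if not local:  # fewer than two sentences: nothing to compare
        return False
    return global_coherence_score(local) < threshold
```

Under these assumptions, a document whose sentence vectors all point the same way is not flagged, while one whose last sentence is orthogonal to its neighbor is flagged because the aggregated score drops below the threshold.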
Regarding claim 6, CAO teaches the method for detecting a contextually-anomalous sentence in a document (FIG 2, the operations performed by elements of the apparatus as explained in the rejection of claim 1), the method comprising:
a first step of, when document data is input, encoding, by a sentence encoder, each of a plurality of sentences included in the document data to generate an encoding vector sequence including a plurality of encoding vectors (202, 206, arranging string tokens [representing document portions, e.g. sentences] to represent adjacent sentence pairs);
a second step of converting, by a context embedder neural network, the generated encoding vectors into a plurality of context embedding vectors corresponding to each of the plurality of encoding vectors included in the encoding vector sequence to generate an embedding vector sequence (202,206; interpreting here the “context” as adjacent or non-adjacent sentence pairs);
a third step of calculating, by a context anomaly detector neural network, a result value indicating whether a contextually-anomalous sentence exists in the document data from the embedding vector sequence (208, for each adjacent pair, using neural network to determine local coherence level of adjacent pair; 210, aggregating local coherence to generate global coherence score); and
a fourth step of, if the document data is suspected data, determining, by a detection unit, whether the {contextually-}anomalous sentence exists in the suspected data based on the result value calculated by the context anomaly detector neural network (an optional step, not required if the document data is not suspected data; nonetheless [0066-0067] describes different ways the global coherence score may be used, for example if the coherence score is not high (the document contains some anomaly) then a new document may be requested).
Regarding dependent claim 7, incorporating the rejection of claim 6, CAO further teaches before the fourth step, if the document data is learning data, causing, by a detector learning unit, the context anomaly detector neural network to perform learning so that a difference between the result value calculated by the context anomaly detector neural network and an expected value indicating whether an {the contextually} anomalous sentence exists in the learning data falls within a predetermined range (an optional step; not required if the document data is not learning data; nonetheless [0064] system is trained prior to usage; training is done with positive and negative examples using various reward/optimization functions; [0091] training objective is to condition a loss function [0092] to encourage high score for positive training example and low score for negative training example; see alternatively [0103] which relies on the construction of negative examples from training corpus and margin loss function that strives to encourage the model to assign low scores to positive pairs of sentences and a high score to negative pairs of sentences).
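The two training notions compared above can be sketched side by side. This is an illustrative simplification, not code from CAO or from the instant application: `margin_loss` is a generic hinge-style ranking loss following the direction described for [0092] (high score for the positive example, low score for the negative example), `within_predetermined_range` is a hypothetical reading of the claimed stopping condition, and the margin and tolerance values are assumptions.

```python
def margin_loss(positive_score: float, negative_score: float, margin: float = 1.0) -> float:
    """Generic hinge-style ranking loss.

    The loss is zero once the positive example outscores the negative
    example by at least `margin`; otherwise it grows with the shortfall.
    """
    return max(0.0, margin - positive_score + negative_score)


def within_predetermined_range(result_value: float,
                               expected_value: float,
                               tolerance: float) -> bool:
    """Claimed condition, read literally: |result - expected| <= tolerance."""
    return abs(result_value - expected_value) <= tolerance
```

For example, `margin_loss(2.0, 0.5)` is zero because the positive score already exceeds the negative score by more than the margin, whereas `within_predetermined_range(0.3, 1.0, 0.2)` is false because the result value is still too far from the expected value.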
Regarding dependent claim 12, CAO further teaches the computer-readable recording medium in which a program for performing the method for detecting a contextually-anomalous sentence in a document according to claim 6 is recorded (see e.g. [0016] computer systems, methods, devices, and computer program products (e.g., machine interpretable instruction sets affixed into computer readable media)).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 2-5 and 8-11 are rejected under 35 U.S.C. 103 as being unpatentable over CAO in view of HO et al. (Pub. No.: US 2017/0132288 A1).
Regarding dependent claim 2, incorporating the rejection of claim 1, CAO further teaches (see training procedure generally; [0058] and [0103-0104]) a sentence sampling module (software) configured to generate context learning data including pairs of encoding vectors corresponding to two or more sentences selected from the encoding vector sequence generated by the sentence encoder (a positive pair of sentences or a negative pair of sentences) and a reference value (is this a positive example or negative example) indicating whether the contexts between the two or more sentences match with each other (positive is adjacent coherence, negative adjacent incoherence, where sample is negative for a number of different reasons).
However, CAO cannot be relied upon to expressly disclose a distance learning neural network configured to, when generating the pairs of encoding vectors included in the context learning data as pairs of embedding vectors corresponding thereto by the context embedder neural network, calculate a distance value between the pairs of embedding vectors; and an embedder learning unit configured to cause the context embedder neural network to perform learning so that a difference between the distance value between the pairs of embedding vectors calculated by the distance learning neural network and the reference value included in the context learning data falls within a predetermined range (interpreted as learning to generate a context-embedded sentence vector from a sequence of words by mapping the sequence of words and the context to a representative vector in a cluster (small distance neighborhood) of similar sentences with common context).
CAO describes the limitations of other coherence models in [0011-0014] and makes clear that the described approach is an improvement to other known coherence models [0015]. CAO further describes at least one convolutional model which includes some topic embedding, as explained above, but stops short of explaining how the sentence vectors might be generated or compared.
Also in the field of natural language processing, HO teaches [0003] a system to generate or extract a sequence of concepts from an information or text source and to compare a distributed representation of each concept to one or more distributed representations of the concept(s) (i.e., concept vectors) extracted from the same information or text source to determine if there is a problem with the identified sequence of concepts… a set of concept/annotations may be assessed for consistency with the full context of the source information or text by using similarities computed between their vector representations to validate if the set of concept/annotations is internally coherent to an acceptable extent, to identify any outliers which may indicate annotation errors, and/or to select between potential alternative annotations. As disclosed herein, the similarity metric values can be used with the reference concept vectors to evaluate a set of candidate concepts or annotations which each uniquely correspond to a piece of text in the corpus and/or to evaluate a set of candidate concepts or annotations which each non-uniquely correspond to a piece of text in the corpus (e.g., there are two or more candidate concepts/annotations which correspond to the same piece of text).
[0047] Concept vectors are deduced from word vectors extracted from other documents. For example, the learning task in the concept vector extractor 13 may be configured to implement a scanning method where learning takes place by presenting examples from a very large corpus of Natural Language (NL) sentences. [0048] the concept vector extractor 13 may be configured to use a concept to predict its neighboring concepts, and the training result produces the vectors. 
A specific example in [0050] explains how different concept vectors may be generated from different sequences, and then compared. If both versions of annotated text sources are included in the embedding process, by way of association with other concepts and non-concept words, the respective concept vectors can be brought to close proximity in the embedding space. Computing similarities between the vectors could reveal the linkage between such alternative annotations. 
Thus, HO teaches it was known in the art of natural language processing to convert simple extracted vectors of sentences into context vectors such that the context vectors which are similar will have a close proximity (small distance) in the context embedding space. HO goes on to explain one possible usage is to [0053] detect errors or outliers by using similarity to find reference concept vectors, e.g. by applying a similarity threshold value. HO goes on to explain how concept vectors may be generated based on a system which has been trained to do so (see e.g. [0076-0078]).
In other words, HO teaches it was known in the art to determine whether two sentences (sequences of word vectors) are similar or not by generating concept vectors and further, determining a representative concept vector for a concept vector which is based on a distance measurement (proximity) between the representative concept vector and the target concept vector. The concept vectors and representative concept vectors can then be used to determine errors within a document. The concept vectors themselves may be generated by a system which has learned from training examples in which the system tries to minimize the distance (proximity) for similar concepts.
HO may thus be relied upon to teach a distance learning neural network configured to, when generating the pairs of encoding vectors included in the context learning data as pairs of embedding vectors corresponding thereto by the context embedder neural network, calculate a distance value between the pairs of embedding vectors; and an embedder learning unit configured to cause the context embedder neural network to perform learning so that a difference between the distance value between the pairs of embedding vectors calculated by the distance learning neural network and the reference value included in the context learning data falls within a predetermined range.
Accordingly, it would have been obvious to one having ordinary skill in natural language processing before the effective filing date of the claimed invention, having the teachings of CAO and HO before them, to have combined CAO (an improvement to known coherency detection methods for a document which relies on the adjacency of sentences and on the generation of sentence vectors from sequences of words) and HO (a known coherency detection method for a document which relies on detecting context matches/mismatches using concept vectors and representative concept vectors based on sentences (sequences of words) in the document) by using as the sentence vectors in CAO the concept vectors (or representative concept vectors) and obtaining a system that can compare the concepts of two adjacent sentences to determine whether they are similar or not, thus improving the determination of document coherency, with a reasonable expectation of success, the combination motivated by explicit teaching in CAO that known coherency detection methods can be improved by considering the adjacency of sentences, where HO is capable only of detecting concept mismatches.
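The distance-based embedder learning limitation of claim 2, as interpreted above, can be sketched as follows. This is a minimal illustration and not the method of HO or of the instant application: Euclidean distance stands in for the claimed distance learning neural network, the reference value is assumed to be a target distance (0.0 for matching contexts, 1.0 for non-matching contexts), and the function names and tolerance are hypothetical.

```python
import math
from typing import Iterable, Sequence, Tuple

Vector = Sequence[float]
# (embedding of sentence A, embedding of sentence B, assumed reference distance)
LabeledPair = Tuple[Vector, Vector, float]


def distance_error(emb_a: Vector, emb_b: Vector, reference: float) -> float:
    """Difference between the computed distance and the reference value."""
    return abs(math.dist(emb_a, emb_b) - reference)


def embedder_needs_training(pairs: Iterable[LabeledPair], tolerance: float) -> bool:
    """True while any pair's distance error is outside the assumed tolerance."""
    return any(distance_error(a, b, ref) > tolerance for a, b, ref in pairs)
```

With these assumptions, learning would continue while a pair labeled as matching in context still sits far apart in the embedding space, and would stop once every pair's distance is within the tolerance of its reference value.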
Regarding dependent claim 3, incorporating the rejection of claim 2, CAO further teaches wherein the reference value indicating whether the contexts between the two or more sentences match with each other is set based on whether the two or more sentences are arranged adjacent to each other in the document data (in training data, two adjacent sentences are assumed to be a positive example for coherence, while a negative example could be two non-adjacent sentences).
Regarding dependent claim 4, incorporating the rejection of claim 2, while CAO does not appear to expressly disclose wherein the reference value indicating whether the contexts between the two or more sentences match with each other is set based on whether the two or more sentences belong to the same paragraph in the document data, this is an obvious variant of selecting non-adjacent sentences during training (e.g. one from a first paragraph and one from a second paragraph).
Regarding dependent claim 5, incorporating the rejection of claim 2, while CAO does not appear to expressly disclose wherein the reference value indicating whether the contexts between the two or more sentences match with each other is set based on whether the two or more sentences belong to the same category, incorporating the teachings of HO which are with respect to determining similar concepts (categories) discussed in the rejection of claim 2 cures this deficiency.
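By way of illustration only, the three ways of setting the reference value addressed in claims 3-5 (adjacency, same paragraph, same category) can be sketched as a single labeling routine. This sketch is not taken from CAO or HO. The data layout, the function name `label_pairs`, and the convention that 1 denotes matching contexts and 0 denotes non-matching contexts are assumptions made for the example.

```python
from itertools import combinations

def label_pairs(sentences, criterion="adjacent"):
    """Return (i, j, reference_value) for every sentence pair with i < j.

    sentences: list of dicts with hypothetical keys 'text', 'paragraph',
    and 'category', given in document order.
    reference_value: 1 if the contexts are treated as matching, else 0.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(sentences), 2):
        if criterion == "adjacent":
            match = (j - i == 1)                       # claim 3 style
        elif criterion == "paragraph":
            match = a["paragraph"] == b["paragraph"]   # claim 4 style
        elif criterion == "category":
            match = a["category"] == b["category"]     # claim 5 style
        else:
            raise ValueError(f"unknown criterion: {criterion}")
        pairs.append((i, j, 1 if match else 0))
    return pairs

# Assumed toy document: two sentences in paragraph 0, one in paragraph 1.
doc = [
    {"text": "S0", "paragraph": 0, "category": "a"},
    {"text": "S1", "paragraph": 0, "category": "b"},
    {"text": "S2", "paragraph": 1, "category": "a"},
]
adjacent_labels = label_pairs(doc, "adjacent")
paragraph_labels = label_pairs(doc, "paragraph")
category_labels = label_pairs(doc, "category")
```

In this toy document, sentences 1 and 2 are adjacent but fall in different paragraphs, so the adjacency criterion and the paragraph criterion label that pair differently.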
Regarding dependent claim 8, incorporating the rejection of claim 6, CAO further teaches before the second step, generating, by a sentence sampling module, context learning data including pairs of encoding vectors corresponding to two or more sentences selected from the encoding vector sequence generated by the sentence encoder and a reference value indicating whether the contexts between the two or more sentences match with each other (see training procedure generally; [0058] and [0103-0104], a positive pair of sentences or a negative pair of sentences, where reference value indicates whether the sentences are a positive training set or a negative training set (coherent-adjacent sample or incoherent-non-adjacent sample)). However, CAO does not appear to expressly disclose when generating each of the pairs of encoding vectors included in the context learning data as pairs of embedding vectors corresponding thereto by the context embedder neural network, calculating, by a distance learning neural network, a distance value between the pairs of embedding vectors; and causing, by an embedder learning unit, the context embedder neural network to perform learning so that a difference between the distance value between the pairs of embedding vectors calculated by the distance learning neural network and the reference value included in the context learning data falls within a predetermined range. Incorporating the teachings of HO as discussed in the rejection of claim 2 above cures this deficiency.
Regarding dependent claim 9, incorporating the rejection of claim 8, CAO further teaches wherein the reference value indicating whether the contexts between the two or more sentences match with each other is set based on whether the two or more sentences are arranged adjacent to each other in the document data (coherent-adjacent sample or incoherent-non-adjacent sample).
Regarding dependent claim 10, incorporating the rejection of claim 8, while CAO does not appear to expressly disclose wherein the reference value indicating whether the contexts between the two or more sentences match with each other is set based on whether the two or more sentences belong to the same paragraph in the document data, this is an obvious variant of selecting non-adjacent sentences during training (e.g. one from a first paragraph and one from a second paragraph).
Regarding dependent claim 11, incorporating the rejection of claim 8, CAO does not appear to expressly disclose wherein the reference value indicating whether the contexts between the two or more sentences match with each other is set based on whether the two or more sentences belong to the same category. Incorporating the teachings of HO which are with respect to determining similar concepts (categories) discussed in the rejection of claim 2 cures this deficiency.


It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way. “The use of patents as references is not limited to what the patentees describe as their own inventions or to the problems with which they are concerned. They are part of the literature of the art, relevant for all they contain.” In re Heck, 699 F.2d 1331, 1332-33, 216 USPQ 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 USPQ 275, 277 (CCPA 1968)). Further, a reference may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art, including nonpreferred embodiments. Merck & Co. v. Biocraft Laboratories, 874 F.2d 804, 10 USPQ2d 1843 (Fed. Cir.), cert. denied, 493 U.S. 975 (1989). See also Upsher-Smith Labs. v. Pamlab, LLC, 412 F.3d 1319, 1323, 75 USPQ2d 1213, 1215 (Fed. Cir. 2005); Celeritas Technologies Ltd. v. Rockwell International Corp., 150 F.3d 1354, 1361, 47 USPQ2d 1516, 1522-23 (Fed. Cir. 1998).


CONCLUSION
The prior art made of record is considered pertinent to applicant’s disclosure and is recorded on Form PTO-892. Applicant is required under 37 C.F.R. § 1.111(c) to consider these references fully when responding to this action.
BHARUCHA et al. Detection of coherence-disrupting and coherence-conferring alterations in text. Memory & Cognition 1985, 13(6), 573-578. (explains that anomalous sentence pairs are incoherent pairs and coherent pairs are non-anomalous)
HUANG et al. Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv: 1508.01991v1 [cs.CL] 9 Aug 2015. Retrieved from [arXiv.com] on [13 May 2022]. (background information)
RODER et al. Exploring the Space of Topic Coherence Measures. WSDM’15, February 2–6, 2015, Shanghai, China. http://dx.doi.org/10.1145/2684822.2685324. 10 pages. (a unifying framework that spans a configuration space of coherence definitions… exhaustively search this space for the coherence definition with the best overall correlation with respect to all available human topic ranking data…discuss applications to search, advertising and automatic translation)
CUI et al. Text Coherence Analysis Based on Deep Neural Network. CIKM’17, November 6–10, 2017, Singapore, Singapore © 2017 Association for Computing Machinery. ACM ISBN 978-1-4503-4918-5/17/11. 4 pages.
LOGESWARAN et al. Sentence Ordering and Coherence Modeling using Recurrent Neural Networks. Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). arXiv:1611.02654v2 [cs.CL] 22 Dec 2017. Retrieved from [arXiv.com] on [09/22/2022]. 8 pages.
TSUNOO et al. Hierarchical Recurrent Neural Network for Story Segmentation. INTERSPEECH 2017, August 20–24, 2017, Stockholm, Sweden. http://dx.doi.org/10.21437/Interspeech.2017-392. 5 pages. (broadcast news stream consists of a number of stories and each story consists of several sentences. We capture this structure using a hierarchical model based on a word-level Recurrent Neural Network (RNN) sentence modeling layer and a sentence-level bidirectional Long Short-Term Memory (LSTM) topic modeling layer).
McCLURE et al. Context is Key: New Approaches to Neural Coherence Modeling. Published at [arXiv:1812.04722v1 [cs.CL]] on [6 Dec 2018]. 7 pages. (formulate coherence modeling as a regression task and propose two novel methods to combine techniques from our setup with pairwise approaches)
NOURBAKHSH et al. A framework for anomaly detection using language modeling, and its applications to finance. Published at [arXiv:1908.09156v1 [cs.CL]] on [24 Aug 2019]. 5 pages.
US-20090067719-A1 (SRIDHAR) contextually cohesive sentence detection
US-20200285737-A1 (KRAUS) sequence anomalies
US-20200394364-A1 (VENKATESHWARAN) sentence clustering for tagging
US-20120150534-A1 (SHEEHAN) [0038] expected cohesive devices may not be determined across pairs of consecutive sentences that span two paragraphs. In many situations, having distinct thoughts conveyed in different paragraphs connotes easier to understand text. Thus, in some implementations, a text may not be identified as being more difficult based on a lack of cohesive devices in pairs of sentences in disparate paragraphs.
US-20130138665-A1 (HU) [0038] A semantic space is used to encode the semantics of a large corpus of documents and to analyze new texts. Then semantic similarity is used to measure semantic cohesion between adjacent sentences, paragraphs, and documents, as well as nonadjacent text segments.
WO-2005045695-A1 (BURSTEIN) [0020] a method that captures the expressive quality of sentences in the discourse elements of an essay is described. For example, two global coherence aspects and, for example, two local coherence aspects may define the expressive quality of an essay. The global coherence aspects may include (a) the correlation of a sentence to an essay question (topic) and (b) the correlation between discourse elements. The local coherence aspects may include (c) the interrelation of sentences within a discourse element and (d) intra-sentence quality.
	

	
	
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMY M LEVY whose telephone number is (571)270-3771. The examiner can normally be reached Mon-Fri 8am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, KIEU VU, can be reached at (571) 272-4057. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Amy M Levy/Primary Examiner, Art Unit 2173