DETAILED ACTION

Introduction
This office action is in response to Applicant’s submission filed on 4/23/2020. Claims
1-20 are pending in the application. As such, claims 1-20 have been examined.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) was submitted on 11/18/2020. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.

Drawings
The drawings filed on 4/23/2020 are accepted and considered by the Examiner.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claims 1, 10 and 19 recite “receive, from at least one external database, requirements data comprising first text that indicates a plurality of restrictions associated with one or more processes of an organization; determine second text indicating a current configuration of the organization, wherein the current configuration of the organization corresponds to operations of the organization; remove, based on comparing the first text with a predetermined list of terms, a portion of the first text; process, using a lemmatization algorithm, the non-removed portion of the first text to generate simplified text; generate, based on the simplified text, a first vector corresponding to one or more terms in the simplified text; determine a frequency of use of each term, of the one or more terms, in the simplified text; weight, based on the frequency of use of each term in the simplified text, each element in the first vector; normalize, based on semantic analysis of the simplified text, the first vector; remove, based on comparing the second text with the predetermined list of terms, a portion of the second text; process, using the lemmatization algorithm, the non-removed portion of the second text to generate second simplified text; generate, based on the second simplified text, a second vector corresponding to one or more second terms in the second simplified text; determine a second frequency of use of each term, of the one or more terms, in the second simplified text; weight, based on the second frequency of use of each term in the second simplified text, each element in the second vector; normalize, based on semantic analysis of the second simplified text, the second vector; determine, based on comparing first elements of the first vector and second elements of the second vector, a portion of the second vector corresponding to the first vector; generate, based on the portion of the second vector corresponding to the first vector, third text; and transmit, to a computing device and based on a quantity of elements of the portion of the second vector satisfying a threshold, the third text.”
The limitations of “receive…”, “determine…”, “remove…”, “process…”, “generate…”, “weight…”, “normalize…”, and “transmit…”, as drafted, cover a mental process that “can be performed in the human mind or by a human using a pen and paper.” More specifically, the claims read on a person obtaining a first list of requirements relating to a process in an organization; determining, locating, or writing up a second list of the current configuration of the organization as it relates to the requirements; then, based on comparing the two lists, removing non-essential terms/words; using a generic computer model or pen and paper to simplify the list of key terms or keywords into their basic or root form; generating a vector, i.e., placing the words into a digital format that a computer can understand; determining or counting the frequency of use of each term; weighting, or placing emphasis on, terms depending on their frequency of use; performing semantic analysis on the simplified text; repeating that process for the second list; comparing the two lists; generating another document/checklist to show how well the two lists compare to each other; and, if the two lists match sufficiently based on a predetermined threshold/requirement, sending the result/checklist to a computer.
This judicial exception is not integrated into a practical application. In particular, independent claims 1, 10 and 19 recite the additional elements of “processor”, and/or “memory and/or computer-readable storage media”, “computer device”, “database” and “lemmatization algorithm”. For example, paragraph [0022] of the as-filed specification describes using a general purpose computer. Such a general purpose computer would contain a processor, memory and computer-readable storage media. Paragraph [0023] of the as-filed specification also describes conventional memory and a tangible storage device. Paragraph [0020] of the as-filed specification describes a general or conventional database. Paragraph [0047] of the as-filed specification describes a generic computer model which simplifies and groups words based on their meaning. Independent claims 10 and 19 recite the additional element of “computing device”, which describes a general purpose computer. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Thus, the claims are directed to an abstract idea.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using a computer amounts to no more than a general purpose computer, as described in the specification. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Further, the additional limitations in the claims noted above are directed towards insignificant extra-solution activity. Thus, the claims are not patent eligible.
With respect to claims 2, 11 and 20, the claims relate to wherein the instructions, when executed by the one or more processors, cause the apparatus to transmit the third text by: transmitting, to a second computing device associated with the organization, data comprising instructions regarding compliance with at least one of the plurality of restrictions. This reads on a person viewing the checklist and sending the data/notice regarding compliance to a computer. There are no additional limitations that would make these claims eligible.
With respect to claims 3 and 12, the claims relate to wherein the instructions, when executed by the one or more processors, cause the apparatus to transmit the third text by: transmitting, based on determining that the third text indicates that a configuration of one or more devices is contrary to the plurality of restrictions, an indication that the one or more devices are out of compliance. This reads on a person viewing the checklist and notifying the organization when the current configuration is not in compliance with the requirements. There are no additional limitations that would make these claims eligible.
  Regarding claims 4 and 13, the claims relate to wherein the instructions, when executed by the one or more processors, are configured to normalize the first vector by: removing, based on the semantic analysis, one or more elements of the first vector.  This reads on a person performing semantic analysis, and simplifying the text by removing unnecessary words.  There are no additional limitations that would make this claim eligible.  
Regarding claims 5 and 14, the claims relate to wherein the predetermined list of terms comprises one or more predetermined terms associated with structural elements of a legal document. This reads on a person viewing a list and determining that it contains particular terms for a certain official/formal document, like a specification. There are no additional limitations that would make these claims eligible.
Regarding claims 6 and 15, the claims relate to wherein the instructions, when executed by the one or more processors, cause the apparatus to compare the first elements of the first vector and second elements of the second vector by generating a third vector, wherein each element of the third vector indicates a presence or absence of a different term. This reads on a person comparing two lists and noting the matches or discrepancies in a checklist. There are no additional limitations that would make these claims eligible.
Regarding claims 7 and 16, the claims relate to wherein the instructions, when executed by the one or more processors, cause the apparatus to weight each element in the first vector by weighting each element in the first vector based on an inverse of the frequency of use of each term in the first text, and wherein the instructions, when executed by the one or more processors, cause the apparatus to weight each element in the second vector by weighting each element in the second vector based on an inverse of the second frequency of use of each term in the second text. This reads on a person viewing both lists and determining which terms/keywords are more important based on the inverse frequency of their appearance, e.g., a term that appears less frequently or rarely is determined to be more important. There are no additional limitations that would make these claims eligible.
Regarding claims 8 and 17, the claims relate to wherein the instructions, when executed by the one or more processors, cause the apparatus to generate the first vector by: determining third elements corresponding to terms in the first text; and determining fourth elements corresponding to phrases in the first text, wherein the first vector comprises the third elements and the fourth elements. This reads on a person viewing the list and realizing that some elements of the text/list correspond to a term or keyword and some elements of the text/list correspond to a phrase (multiple words). There are no additional limitations that would make these claims eligible.
Regarding claims 9 and 18, the claims relate to wherein the instructions, when executed by the one or more processors, cause the apparatus to process the first text to generate simplified text by: removing, from one or more first terms of the first text and based on the lemmatization algorithm, one or more characters. This reads on a person using a generic computer program, or their own mind, to simplify text by removing one or more characters. There are no additional limitations that would make these claims eligible.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 5, 7, 10, 14, 16, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over (Falessi, D., Cantone, G., & Canfora, G. (2010, September). A comprehensive characterization of NLP techniques for identifying equivalent requirements. In Proceedings of the 2010 ACM-IEEE international symposium on empirical software engineering and measurement (pp. 1-10)) hereinafter as Falessi, in view of Gupta et al. (US Patent Application Publication No.: US 20210294797 A1) hereinafter as Gupta, and further in view of applicant supplied reference, (Molino, P., Zheng, H., & Wang, Y. C. (2018, July). Cota: Improving the speed and accuracy of customer support through ranking and deep networks. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 586-595)) hereinafter as Molino. 
Regarding claim 1, Falessi discloses: requirements data comprising first text that indicates a plurality of restrictions associated with one or more processes of an organization; ([sect 2.2] Objects to retrieve: the compared text fragments are industrial requirements expressed in natural languages; moreover, such requirements are supposed to be in the same abstraction level. [sect 4.1] Finmeccanica is a large Italian industrial group operating globally in the aerospace, defense, and security sectors, and is one of the world's leading groups in the fields of helicopters and defense electronics.)
determine second text indicating a current configuration of the organization, wherein the current configuration of the organization corresponds to operations of the organization; ([sect 2] In this paper, the term NLP refers to the technique adopted to provide the similarity between two pieces of text.  See [sect 4.1] for more detail regarding Finmeccanica’s system segment specification (SSS), the requirements and the equivalence thereof)
remove, based on comparing the first text with a predetermined list of terms, a portion of the first text; ([sect 3.2] Term Extraction. Simple: the simplest way to preprocess the text is to apply to it a tokenization and a stop-word removal activity. During the tokenization activity, the text is transformed in a series of tokens where capitals, punctuation, and brackets are removed. Afterwards, the token are analyzed and stop words are removed. The stop words are the terms that do not contribute to the semantic of the text; examples include articles, pronouns, etc. The list of stop words needs to be defined before the text preprocessing takes place and it obviously depends from the adopted language.)
generate, based on the simplified text, a first vector corresponding to one or more terms in the simplified text; (See Fig. 1, Term extraction (Simple) and vector space model and similarity metrics and [sect 3.4])
determine a frequency of use of each term, of the one or more terms, in the simplified text; (See Fig. 1, Term Frequency, and [sect 3.3] Term Weighting)
weight, based on the frequency of use of each term in the simplified text, each element in the first vector; (See Fig. 1, Weighing schema - TF-IDF and [sect 3.3] Term Weighting)
normalize, based on semantic analysis of the simplified text, the first vector; ([sect 3.4.1 Vector similarity metrics] Dice: The Dice coefficient normalizes for length by dividing by the total number of non-zero entries. The scale factor 2 gives a measure that ranges from 0.0 to 1.0, with 1.0 indicating identical vectors. In such a way, as compared to Euclidean distance, the Dice distance (1 – SDice) retains sensitivity in more heterogeneous data sets and gives less weight to outliers [24].  Also see Fig. 1 (Latent Semantic Analysis))
remove, based on comparing the second text with the predetermined list of terms, a portion of the second text; ([sect 3.2] Term Extraction. Simple: the simplest way to preprocess the text is to apply to it a tokenization and a stop-word removal activity. During the tokenization activity, the text is transformed in a series of tokens where capitals, punctuation, and brackets are removed. Afterwards, the token are analyzed and stop words are removed. The stop words are the terms that do not contribute to the semantic of the text; examples include articles, pronouns, etc. The list of stop words needs to be defined before the text preprocessing takes place and it obviously depends from the adopted language.)
generate, based on the second simplified text, a second vector corresponding to one or more second terms in the second simplified text; (See Fig. 1, Term extraction (Simple) and vector space model and similarity metrics and [sect 3.4])
determine a second frequency of use of each term, of the one or more terms, in the second simplified text; (See Fig. 1, Term Frequency and [sect 3.3] Term Weighting.)
weight, based on the second frequency of use of each term in the second simplified text, each element in the second vector; (See Fig. 1, Weighing schema - TF-IDF and [sect 3.3] Term Weighting)
normalize, based on semantic analysis of the second simplified text, the second vector; ([sect 3.4.1 Vector similarity metrics] Dice: The Dice coefficient normalizes for length by dividing by the total number of non-zero entries. The scale factor 2 gives a measure that ranges from 0.0 to 1.0, with 1.0 indicating identical vectors. In such a way, as compared to Euclidean distance, the Dice distance (1 – SDice) retains sensitivity in more heterogeneous data sets and gives less weight to outliers [24].  Also see Fig. 1 (Latent Semantic Analysis))
determine, based on comparing first elements of the first vector and second elements of the second vector, a portion of the second vector corresponding to the first vector; ([sect 3.4.2 WordNet similarity metrics] WordNet has its own set of metrics [25]. Such metrics, however, cannot be directly exploited in our work because the WordNet system provides only a unidirectional measure. In order to measure the similarity of text fragments, the unidirectional estimated similarity is transformed into a bidirectional similarity measure as described in Table 3, where: • SWORDNET(x,y) is the estimated similarity between the text fragments x and y.)
Falessi does not explicitly disclose, but Gupta discloses: 1. An apparatus comprising: one or more processors; ([0111] FIG. 5 shows a computer system 500 in accordance with the disclosed embodiments. Computer system 500 includes a processor 502, memory 504, storage 506, and/or other components found in electronic computing devices. Processor 502 may support parallel processing and/or multi-threaded operation with other processors in computer system 500. Computer system 500 may also include input/output (I/O) devices such as a keyboard 508, a mouse 510, and a display 512.)
and memory storing instructions that, when executed by the one or more processors, cause the apparatus to: receive, from at least one external database, ([0116] The methods and processes described in the detailed description section can be embodied as code and/or data, which can be stored in a computer-readable storage medium as described above. When a computer system reads and executes the code and/or data stored on the computer-readable storage medium, the computer system performs the methods and processes embodied as data structures and code and stored within the computer-readable storage medium.)
generate, based on the portion of the second vector corresponding to the first vector, third text; ([0057] Conflation apparatus 208 also, or instead, generates a third set of “mid-confidence” matches between the user records and employee records when first names, last names, titles, locations, positions, and/or companies in the user records and employee records are identical.)
and transmit, to a computing device and based on a quantity of elements of the portion of the second vector satisfying a threshold, the third text. ([0058] When the match score exceeds a threshold, conflation apparatus 208 establishes a match between the user record and employee record. [0066] After the data store(s) return data 230 in response to the query, online service 222 formats data 230 into a response to the request (e.g., according to the specification for the API) and transmits the response to the entity.)
Falessi and Gupta are considered analogous art because they are both in the related art of data conflation. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Falessi with the teachings of Gupta, to incorporate the apparatus comprising one or more processors, and memory storing instructions…, and generate, …third text, and transmit, … satisfying a threshold, the third text, because the combination of the disclosures would enable the conflation of the input datasets from two or more data providers (Gupta, [0014]).
Falessi in view of Gupta does not explicitly disclose, but Molino discloses: process, using a lemmatization algorithm, the non-removed portion of the first text to generate simplified text; ([sect 3.1 NLP Processing] The first step is to analyze text at the word-level and use topic modeling to better understand the meaning of text data. The text is cleaned by removing HTML tags. Next, the message’s sentences are tokenized and stop-words are removed. Then, each word is lemmatized to convert different inflected forms into the same base form.  Also see Fig. 2 which shows a process/step which performs lemmatization.)
process, using the lemmatization algorithm, the non-removed portion of the second text to generate second simplified text; ([sect 3.1 NLP Processing] The first step is to analyze text at the word-level and use topic modeling to better understand the meaning of text data. The text is cleaned by removing HTML tags. Next, the message’s sentences are tokenized and stop-words are removed. Then, each word is lemmatized to convert different inflected forms into the same base form.  Also see Fig. 2 which shows a process/step which performs lemmatization.)
Falessi, Gupta and Molino are considered analogous art because they are all in the related art of data conflation and/or text simplification. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Falessi, in view of Gupta, with the teachings of Molino, to incorporate process, the lemmatization algorithm, … to generate …simplified text, because the combination of the disclosures would reduce problem resolution time while improving customer satisfaction (Molino, conclusion).
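
For illustration only, the Falessi preprocessing and comparison steps mapped above (tokenization, stop-word removal, a term vector, and the Dice coefficient) may be sketched as follows. This is a minimal sketch and not Falessi's implementation: the stop-word list, the function names, and the two example sentences are hypothetical, and lemmatization is deliberately omitted.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the cited passage only states that such a
# list is defined before preprocessing and depends on the language.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "shall", "be", "is", "in"}

def extract_terms(text):
    """Tokenize (lowercase, punctuation removed) and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def term_frequency_vector(terms, vocabulary):
    """One element per vocabulary term, holding that term's raw count."""
    counts = Counter(terms)
    return [counts[t] for t in vocabulary]

def dice_similarity(x, y):
    """Dice coefficient: 2 * shared non-zero entries / total non-zero entries."""
    shared = sum(1 for a, b in zip(x, y) if a and b)
    nonzero = sum(1 for a in x if a) + sum(1 for b in y if b)
    return 2 * shared / nonzero if nonzero else 0.0

# Hypothetical first text (requirement) and second text (configuration).
terms_req = extract_terms("The system shall encrypt the stored data.")
terms_cfg = extract_terms("Stored data is encrypted in the system.")
vocabulary = sorted(set(terms_req) | set(terms_cfg))
v_req = term_frequency_vector(terms_req, vocabulary)
v_cfg = term_frequency_vector(terms_cfg, vocabulary)
similarity = dice_similarity(v_req, v_cfg)
```

In this sketch “encrypt” and “encrypted” are counted as different terms because no lemmatization is applied, which is the step for which Molino is relied upon above.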

Regarding claim 5, Falessi in view of Gupta, and further in view of Molino discloses: The apparatus of claim 1, 
Falessi further discloses: wherein the predetermined list of terms comprises one or more predetermined terms associated with structural elements of a legal document. ([sect 4] The context of this study is the development of systems of systems. In addition to being large, distributed, adaptive and complex, a system of systems is structured into components (i.e. systems) that can work independently of each other, though their cooperation provides functionality that are greater than the sum of their functionalities. The requirements taken into account in this paper can be characterized, according to MIL STD 498, as System Segment Specification (SSS); they specify the requirements for a system or subsystem and the methods to be used to ensure that each requirement has been met.)

Regarding claim 7, Falessi in view of Gupta, and further in view of Molino discloses: The apparatus of claim 1, 
Falessi further discloses: wherein the instructions, when executed by the one or more processors, cause the apparatus to weight each element in the first vector by weighting each element in the first vector based on an inverse of the frequency of use of each term in the first text, and wherein the instructions, when executed by the one or more processors, cause the apparatus to weight each element in the second vector by weighting each element in the second vector based on an inverse of the second frequency of use of each term in the second text. ([sect 3.3 Term Weighting] Inverse Document Frequency (IDF): It assigns a weight depending on the number of given texts that include the term (rated to the total number of texts). Although IDF is strongly correlated with the inverse of TF, the two variables are not completely predictable from one another [22]. The underlying idea for IDF weighting is the observation that the documents related to a given domain share a lot of words; therefore, such frequent words do not provide a lot of semantic value; i.e., they are unable to discriminate among the different documents. For instance, according to IDF, in the automotive domain the term “car” should have a low weight because it adds few information in the documents in which it is used.)
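
For illustration only, the inverse document frequency weighting described in the cited passage may be sketched as follows. The logarithmic form log(N / df) is the common textbook formulation and is an assumption here, since the cited passage does not give an exact formula; the function name and the example documents are hypothetical.

```python
import math

def inverse_document_frequency(documents):
    """Map each term to log(N / df), where N is the number of documents
    and df is the number of documents that contain the term."""
    n = len(documents)
    document_frequency = {}
    for doc in documents:
        for term in set(doc):
            document_frequency[term] = document_frequency.get(term, 0) + 1
    return {t: math.log(n / df) for t, df in document_frequency.items()}

# Hypothetical automotive-domain texts, already reduced to terms.
docs = [["car", "engine"], ["car", "brake"], ["car", "wheel", "brake"]]
idf = inverse_document_frequency(docs)
```

Under this assumed formula, “car” appears in every text and receives a weight of zero, consistent with the passage's statement that such a term should have a low weight.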

Regarding claim 10, Falessi in view of Gupta discloses: A system comprising: a first computing device, and a second computing device; (Gupta, [0080] Those skilled in the art will appreciate that the system of FIG. 2 may be implemented in a variety of ways. First, conflation apparatus 208, online service 222, metadata store 224, and/or data stores 234 may be provided by a single physical machine, multiple computer systems, one or more virtual machines, a grid, one or more databases, one or more filesystems, and/or a cloud computing system. Conflation apparatus 208 and online service 222 may additionally be implemented together and/or separately by one or more hardware and/or software components and/or layers.)
wherein the first computing device is configured to: transmit, to the second computing device, second text indicating a current configuration of an organization, wherein the current configuration corresponds to operations of the organization; (Gupta [0019] In turn, the platform reduces overhead associated with storing, processing, and/or querying sensitive data and/or data associated with multiple data access policies. For example, the platform automatically creates storage accounts, security identities, roles, and/or other components for storing and accessing a conflated dataset based on compliance and/or access control policies of the corresponding data providers. The platform also modifies queries for creating and/or accessing the conflated data in a way that enforces the policies. In contrast, conventional techniques require manual configuration and/or review of roles, accounts, queries, platforms, and/or other components involved in isolating or securing data.  [0080] Those skilled in the art will appreciate that the system of FIG. 2 may be implemented in a variety of ways. First, conflation apparatus 208, online service 222, metadata store 224, and/or data stores 234 may be provided by a single physical machine, multiple computer systems, one or more virtual machines, a grid, one or more databases, one or more filesystems, and/or a cloud computing system. Conflation apparatus 208 and online service 222 may additionally be implemented together and/or separately by one or more hardware and/or software components and/or layers.)
and wherein the second computing device is configured to: receive the second text; (Falessi [sect 2] In this paper, the term NLP refers to the technique adopted to provide the similarity between two pieces of text.) [the computing device is already covered in the Gupta disclosure]
The remaining elements of the claim recite the same elements as claim 1; therefore, the rationale in rejecting claim 1 also applies to claim 10.

Regarding claim 14, although different in scope from claim 5, it recites the elements of the apparatus of claim 5 as a system.  Thus, the analysis in rejecting claim 5 is equally applicable to claim 14.
Regarding claim 16, although different in scope from claim 7, it recites the elements of the apparatus of claim 7 as a system.  Thus, the analysis in rejecting claim 7 is equally applicable to claim 16.

Regarding claim 19, Falessi discloses: A method comprising: receiving, by a first computing device and from at least one external database, requirements data comprising first text that indicates a plurality of restrictions associated with one or more processes of an organization; (requirements data and restrictions associated with processes of an organization are discussed in the Gupta disclosure: Gupta [0019] In turn, the platform reduces overhead associated with storing, processing, and/or querying sensitive data and/or data associated with multiple data access policies. For example, the platform automatically creates storage accounts, security identities, roles, and/or other components for storing and accessing a conflated dataset based on compliance and/or access control policies of the corresponding data providers. The platform also modifies queries for creating and/or accessing the conflated data in a way that enforces the policies. In contrast, conventional techniques require manual configuration and/or review of roles, accounts, queries, platforms, and/or other components involved in isolating or securing data.) [the first and second text are disclosed in the Falessi disclosure; see below.]
determining, by the first computing device (disclosed in Gupta), second text indicating a current configuration of the organization, wherein the current configuration of the organization corresponds to operations of the organization; (Falessi [sect 2] In this paper, the term NLP refers to the technique adopted to provide the similarity between two pieces of text.)
The remaining elements of the claim recite the same elements as claim 1; therefore, the rationale in rejecting claim 1 also applies to claim 19.

Claims 2, 11 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Falessi, in view of Gupta, further in view of Molino, and furthermore in view of (Arora, C., Sabetzadeh, M., Briand, L., & Zimmer, F. (2015). Automated checking of conformance to requirements templates using natural language processing. IEEE transactions on Software Engineering, 41(10), 944-968.) hereinafter as Arora.
Regarding claim 2, Falessi in view of Gupta, and further in view of Molino discloses: The apparatus of claim 1, 
Gupta further discloses: wherein the instructions, when executed by the one or more processors, cause the apparatus to transmit the third text by: transmitting, to a second computing device associated with the organization, ([0080] Those skilled in the art will appreciate that the system of FIG. 2 may be implemented in a variety of ways. First, conflation apparatus 208, online service 222, metadata store 224, and/or data stores 234 may be provided by a single physical machine, multiple computer systems, one or more virtual machines, a grid, one or more databases, one or more filesystems, and/or a cloud computing system. Conflation apparatus 208 and online service 222 may additionally be implemented together and/or separately by one or more hardware and/or software components and/or layers. )
Falessi in view of Gupta, further in view of Molino, does not explicitly disclose, but Arora discloses: data comprising instructions regarding compliance with at least one of the plurality of restrictions. (See Fig. 8 - (conformance checking diagnostics))
Falessi, Gupta, Molino, and Arora are considered analogous art because they are all in the related art of data conflation and/or text simplification. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Falessi, in view of Gupta, further in view of Molino, with the teachings of Arora, to incorporate data comprising instructions regarding compliance…, because the combination of the disclosures would efficiently distinguish requirements that conform to a template (Arora, [sect 5.4]).

Regarding claims 11 and 20, although different in scope from claim 2 and each other, they recite elements of the apparatus of claim 2 as a system and method respectively.  Thus, the analysis in rejecting claim 2 is equally applicable to claims 11 and 20.

Claims 3, 6, 12 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Falessi, in view of Gupta, further in view of Molino, and furthermore in view of Arora.
Regarding claim 3, Falessi in view of Gupta, and further in view of Molino discloses: The apparatus of claim 1, 
Falessi in view of Gupta, and further in view of Molino does not explicitly disclose, but Arora discloses: wherein the instructions, when executed by the one or more processors, cause the apparatus to transmit the third text by: transmitting, based on determining that the third text indicates that a configuration of one or more devices is contrary to the plurality of restrictions, an indication that the one or more devices are out of compliance. (See Fig. 8 - (conformance checking diagnostics))
Falessi, Gupta, Molino, and Arora are considered analogous art because they are all in the related art of data conflation and/or text simplification. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Falessi, in view of Gupta, further in view of Molino, to combine the teaching of Arora, to incorporate wherein the instructions, … an indication that one or more devices are out of compliance, because the combination of the disclosures would efficiently distinguish requirements that conform to a template (Arora, [sect 5.4]).

Regarding claim 6, Falessi in view of Gupta, and further in view of Molino discloses: The apparatus of claim 1, 
Falessi in view of Gupta, and further in view of Molino does not explicitly disclose, but Arora discloses: wherein the instructions, when executed by the one or more processors, cause the apparatus to compare the first elements of the first vector and second elements of the second vector by generating a third vector, wherein each element of the third vector indicates a presence or absence of a different term. ([sect 8] The study further shows that, within the range of alternatives considered, there exist several text chunking solutions with little sensitivity to the presence or absence of a requirements glossary.)
Falessi, Gupta, Molino, and Arora are considered analogous art because they are all in the related art of data conflation and/or text simplification. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Falessi, in view of Gupta, further in view of Molino, to combine the teaching of Arora, to incorporate wherein the instructions, … wherein each element of the third vector indicates a presence or absence of a different term, because the combination of the disclosures would efficiently distinguish requirements that conform to a template (Arora, [sect 5.4]).
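For illustration only, one possible reading of the claim 6 limitation (a third vector in which each element indicates the presence or absence of a different term) may be sketched as follows. The function name presence_vector, the sample vocabulary, and the sample weights are hypothetical; they are not taken from the application, from Arora, or from any other reference of record.

```python
from typing import Dict, List


def presence_vector(vocabulary: List[str], second_weights: Dict[str, float]) -> List[int]:
    """Build a third vector with one element per vocabulary term.

    Assumption: an element is 1 when the term has a nonzero weight in the
    second vector (term present) and 0 otherwise (term absent).
    """
    return [1 if second_weights.get(term, 0.0) > 0 else 0 for term in vocabulary]


# Hypothetical terms of the first (requirements) vector.
first_terms = ["encrypt", "password", "audit", "backup"]
# Hypothetical weighted terms of the second (configuration) vector.
second_weights = {"password": 0.5, "backup": 0.25, "server": 0.25}

third_vector = presence_vector(first_terms, second_weights)
# Terms of the first vector that are absent from the second vector.
missing_terms = [t for t, present in zip(first_terms, third_vector) if present == 0]
```

Under this reading, the elements of the third vector that equal 0 identify the terms of the first text that have no counterpart in the second text.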

Regarding claim 12, although different in scope from claim 3, it recites elements of the apparatus of claim 3 as a system.  Thus, the analysis in rejecting claim 3 is equally applicable to claim 12.

Regarding claim 15, although different in scope from claim 6, it recites elements of the apparatus of claim 6 as a system.  Thus, the analysis in rejecting claim 6 is equally applicable to claim 15.

Claims 4, 8-9, 13 and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Falessi, in view of Gupta, further in view of Molino, and furthermore in view of Bar-on et al. (US Patent Application Publication No: US 20200410997 A1) hereinafter as Bar-on.
Regarding claim 4, Falessi in view of Gupta, and further in view of Molino discloses: The apparatus of claim 1, 
Falessi in view of Gupta, and further in view of Molino does not explicitly disclose, but Bar-on discloses: wherein the instructions, when executed by the one or more processors, are configured to normalize the first vector by: removing, based on the semantic analysis, one or more elements of the first vector. ([0047] In some implementations, the analyzer 114 may include a semantic analysis engine to identify specific phrasing or sentence structure that may include a specific action or type of input that is provided; leverage a text preprocessing analysis engine to tokenize, lemmatize, normalize, or otherwise simplify words of the voice input, to remove certain characters or phrases (e.g., newlines or stop words), and so on...)
Falessi, Gupta, Molino, and Bar-on are considered analogous art because they are all in the related art of data conflation and/or text simplification. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Falessi, in view of Gupta, and further in view of Molino, to combine the teaching of Bar-on, to incorporate wherein the instructions, …: removing, based on the semantic analysis, one or more elements of the first vector, because the combination of the disclosures would efficiently analyze a dataset by simplifying words (Bar-on, [0047]).

Regarding claim 8, Falessi in view of Gupta, and further in view of Molino discloses: The apparatus of claim 1, 
Falessi further discloses: wherein the instructions, when executed by the one or more processors, cause the apparatus to generate the first vector by: determining third elements corresponding to terms in the first text; ([sect 3.2] Term Extraction. Simple: the simplest way to preprocess the text is to apply to it a tokenization and a stop-word removal activity. During the tokenization activity, the text is transformed in a series of tokens where capitals, punctuation, and ...)
Falessi in view of Gupta, and further in view of Molino does not explicitly disclose, but Bar-on discloses: and determining fourth elements corresponding to phrases in the first text, wherein the first vector comprises the third elements and the fourth elements. ([0046] The analyzer 114 and/or the flow interactor 112 may be configured to receive an analyzed dataset (e.g., a text string) that is based on audio received from the voice analyzer 118. The analyzer 114 may be configured to further analyze the dataset received from the voice analyzer 118 to extract key terms and/or phrases. The analyzer 114 may, in some instances, be configured to perform natural language processing in order to remove common terms or other language that may not be unique to the topic being discussed. The analyzer 114 may perform further processing in order to identify one or more of: a topic being discussed, a user being discussed, an activity associated with the topic, or an action to be performed.)
Falessi, Gupta, Molino, and Bar-on are considered analogous art because they are all in the related art of data conflation and/or text simplification. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Falessi, in view of Gupta, and further in view of Molino, to combine the teaching of Bar-on, to incorporate determining fourth elements corresponding to phrases in the first text, …and the fourth elements, because the combination of the disclosures would efficiently analyze a dataset by simplifying words (Bar-on, [0047]).
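For illustration only, the claim 8 limitation (a first vector comprising third elements corresponding to terms and fourth elements corresponding to phrases) may be sketched as follows. Treating a phrase as a pair of adjacent tokens is an assumption made for this sketch; the function name term_and_phrase_counts and the sample tokens are hypothetical and do not appear in the application, Falessi, or Bar-on.

```python
from collections import Counter
from typing import List


def term_and_phrase_counts(tokens: List[str]) -> Counter:
    """Count single terms and two-word phrases in one combined vector.

    Assumption: a "phrase" is modeled as two adjacent tokens (a bigram).
    """
    # Elements corresponding to individual terms.
    terms = Counter(tokens)
    # Elements corresponding to phrases (adjacent token pairs).
    phrases = Counter(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    # The combined vector holds both kinds of elements.
    return terms + phrases


# Hypothetical tokens remaining after stop-word removal and lemmatization.
tokens = ["access", "control", "policy", "access", "control"]
vector = term_and_phrase_counts(tokens)
```

In this sketch the keys without a space play the role of the third elements and the keys containing a space play the role of the fourth elements.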

Regarding claim 9, Falessi in view of Gupta, and further in view of Molino discloses: The apparatus of claim 1, 
Falessi in view of Gupta, and further in view of Molino does not explicitly disclose, but Bar-on discloses: wherein the instructions, when executed by the one or more processors, cause the apparatus to process the first text to generate simplified text by: removing, from one or more first terms of the first text and based on the lemmatization algorithm, one or more characters. ([0047] leverage a text preprocessing analysis engine to tokenize, lemmatize, normalize, or otherwise simplify words of the voice input, to remove certain characters or phrases (e.g., newlines or stop words), and so on in order to generate a bag of words, bag of parts of speech, or any other numerical statistic(s) or metadata (e.g., length of text, number of sentences, number of tokens, lemma types) that may be compared to a database of previously-identified actions or commands.)
Falessi, Gupta, Molino, and Bar-on are considered analogous art because they are all in the related art of data conflation and/or text simplification. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Falessi, in view of Gupta, and further in view of Molino, to combine the teaching of Bar-on, to incorporate wherein the instructions, …: removing, from one or more first terms of the first text and based on the lemmatization algorithm, one or more characters, because the combination of the disclosures would efficiently analyze a dataset by simplifying words (Bar-on, [0047]).

Regarding claim 13, although different in scope from claim 4, it recites elements of the apparatus of claim 4 as a system.  Thus, the analysis in rejecting claim 4 is equally applicable to claim 13.

Regarding claim 17, although different in scope from claim 8, it recites elements of the apparatus of claim 8 as a system.  Thus, the analysis in rejecting claim 8 is equally applicable to claim 17.

Regarding claim 18, although different in scope from claim 9, it recites elements of the apparatus of claim 9 as a system.  Thus, the analysis in rejecting claim 9 is equally applicable to claim 18.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.  (Mhatre, M., Phondekar, D., Kadam, P., Chawathe, A., & Ghag, K. (2017, July). Dimensionality reduction for sentiment analysis using pre-processing techniques. In 2017 International Conference on Computing Methodologies and Communication (ICCMC) (pp. 16-21). IEEE.) hereinafter as Mhatre.  Mhatre teaches a method of text pre-processing techniques in sentiment analysis using tokenization and lemmatization.
Ferrari, A., Spagnolo, G. O., & Dell'Orletta, F. (2013, August). Mining commonalities and variabilities from natural language documents. In Proceedings of the 17th International Software Product Line Conference (pp. 116-120) hereinafter as Ferrari.  Ferrari teaches a method to compare and contrast commonalities and variabilities from brochures to identify similarities or differences between a company's current product/service offerings and competitor offerings.
Kadiyala et al. (US Patent Application Publication No: US 20210089667 A1) hereinafter as Kadiyala.  Kadiyala discloses using natural language processing (NLP) methods and various combinations of algorithms to improve a classification prediction model.  Data cleaning techniques such as tokenization and lemmatization, and weighting techniques such as term frequency-inverse document frequency (TF-IDF), are disclosed in detail.


Any inquiry concerning this communication or earlier communications from the examiner should be directed to Phillip H Lam whose telephone number is (571)272-1721. The examiner can normally be reached 10 AM-6 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at (571) 272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PHILIP H LAM/Examiner, Art Unit 2656
/HUYEN X VO/Primary Examiner, Art Unit 2656