DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment was given in an email from Stephen Dew on 05/11/2021. 
Replace the latest set of claims with the following:

1.	(Currently Amended) A method for classifying documents as public or private, the method comprising: 
accessing a document comprising a plurality of sentences;
generating a plurality of discourse trees from the sentences, wherein each discourse tree corresponds to a sentence of the sentences, comprises elementary discourse units, comprises a rhetorical relationship that relates two elementary discourse units, and comprises a plurality of nodes, each nonterminal node of the nodes of the discourse tree representing one of the rhetorical relationships and each terminal node of the nodes of the discourse tree associated with one of the elementary discourse units;
creating a first communicative discourse tree from a first discourse tree of the discourse trees by matching, in the first discourse tree, each elementary discourse unit and wherein each thematic role describes a relationship between the verb and related words;
creating a second communicative discourse tree from a second discourse tree of the discourse trees by matching, in the second discourse tree, each elementary discourse unit 
combining the first communicative discourse tree and the second communicative discourse tree into a parse thicket that relates the first communicative discourse tree and the second communicative discourse tree and comprises discourse relationships between the sentences represented by the first communicative discourse tree and the sentences represented by the second communicative discourse tree;
applying a machine learning model to the parse thicket 
providing the classification to a user interface.

2.	(Currently Amended) The method of claim 1, wherein applying the machine learning model 

 3.	(Currently Amended) The method of claim 1, wherein the document comprises a plurality of paragraphs that comprise elementary discourse units, the method further comprising:
creating, for each paragraph of the plurality of paragraphs, an additional communicative discourse tree;
combining the additional communicative discourse trees into [[a]] an additional parse thicket;
additional parse thicket; and
in response to determining that a threshold number of paragraphs are private, determining that the document is private.	

4.	(Previously presented) The method of claim 1, further comprising:
determining, from the document, a set of keywords; 
executing a query for the document, wherein the query comprises the set of keywords; and
responsive to receiving a result of the query that indicates that the document is public, updating the classification to public. 
 
5.	(Original) The method of claim 1, further comprising:
selecting, at random, the plurality of sentences from different paragraphs of the document. 
 
6.	(Original) The method of claim 1, further comprising:
responsive to determining that the document is classified as public, permitting a transmission of the document over a data network.
 
7.	(Original) The method of claim 1, further comprising:
responsive to determining that the document is classified as private, preventing a transmission of the document over a data network.
 
8.	(Currently Amended) The method of claim 1, further comprising:
accessing a set of training data comprising a set of training data pairs, wherein each training data pair comprises a training parse thicket corresponding to a plurality of training sentences from a training document and an expected classification and wherein the set of 
training the machine learning model by iteratively:
providing one of the training data pairs to the machine learning model;
receiving, from the machine learning model, a determined classification;
calculating a loss function by calculating a difference between the determined classification and the expected classification; and
adjusting internal parameters of the machine learning model to minimize the loss function. 

9.	(Currently Amended) A method for classifying a document, the method comprising:
accessing an electronic document comprising a plurality of features, a plurality of sentences, and 
recognizing a document feature of the plurality of features;
extracting, from the document, metadata and text;
classifying the document into a category of a plurality of categories by applying a first machine learning model to the metadata, the text, and the document feature;
selecting, based on the determined category, a second machine learning model from a plurality of machine learning models; [[and]]
generating one or more discourse trees from the sentences, wherein each discourse tree corresponds to a sentence of the sentences, comprises elementary discourse units, comprises a rhetorical relationship that relates two elementary discourse units, and comprises a plurality of nodes, each nonterminal node of the nodes of the discourse tree representing one of the rhetorical relationships and each terminal node of the nodes of the discourse tree associated with one of the elementary discourse units;
creating one or more communicative discourse trees from the one or more discourse trees by matching, in the respective discourse tree, each elementary discourse unit that has a verb to a verb signature, wherein each verb signature comprises a verb of the respective elementary discourse unit and a sequence of thematic roles, and wherein each thematic role describes a relationship between the verb and related words;
combining the one or more communicative discourse trees into a parse thicket, wherein the parse thicket relates the one or more communicative discourse trees and comprises discourse relationships between the sentences represented by the one or more communicative discourse trees; and
determining whether the document is public or private by applying the second machine learning model to the parse thicket, wherein the determining is based on discourse relationships.
 
10.	(Currently Amended) The method of claim 9, wherein the plurality of categories comprise one or more of (i) financial, (ii) legal, (iii) engineering, and (iv) health, and wherein each of the plurality of machine learning models corresponds to a respective category. 

11.	(Canceled) 
 
12.	(Previously presented) The method of claim 9, further comprising iteratively training the second machine learning model by:
providing a training data pair from a set of training data to the second machine learning model, wherein each training data pair comprises a training document and an expected document classification;
receiving, from the second machine learning model, a determined classification;
calculating a loss function by calculating a difference between the determined classification and the expected document classification; and
adjusting internal parameters of the second machine learning model to minimize the loss function. 
 
13.	(Previously presented) A system comprising:

a processing device communicatively coupled to the non-transitory computer-readable medium for executing the computer-executable program instructions, wherein executing the computer-executable program instructions configures the processing device to perform operations comprising:
accessing a document comprising a plurality of sentences;
generating a plurality of discourse trees from the sentences, wherein each discourse tree corresponds to a sentence of the sentences, comprises elementary discourse units, comprises a rhetorical relationship that relates two elementary discourse units, and comprises a plurality of nodes, each nonterminal node of nodes of the discourse tree representing one of the rhetorical relationships and each terminal node of the nodes of the discourse tree associated with one of the elementary discourse units;
creating a first communicative discourse tree from a first discourse tree of the discourse trees by matching, in the first discourse tree, each elementary discourse unit and wherein each thematic role describes a relationship between the verb and related words;
creating a second communicative discourse tree from a second discourse tree of the discourse trees by matching, in the second discourse tree, each elementary discourse unit 
combining the first communicative discourse tree and the second communicative discourse tree into a parse thicket, wherein the parse thicket relates the first communicative discourse tree and the second communicative discourse tree and comprises discourse relationships between the sentences represented by the first communicative discourse tree and the sentences represented by the second communicative discourse tree;
applying a machine learning model to the parse thicket 

 
14.	(Currently Amended) The system of claim 13, wherein applying the machine learning model 
 
15.	(Currently Amended) The system of claim 13, wherein the document comprises a plurality of paragraphs comprising 
creating, for each paragraph of the plurality of paragraphs, an additional communicative discourse tree;
combining the additional communicative discourse trees into an additional parse thicket;
determining whether each paragraph is public or private by applying the machine learning model to the additional parse thicket; and
in response to determining that a threshold number of paragraphs are private, determining that the document is private.	
 
16.	(Previously presented) The system of claim 13, executing the computer-executable program instructions configures the processing device to perform operations comprising:
determining, from the document, a set of keywords; 
executing a query for the document, wherein the query comprises the set of keywords; and
responsive to receiving a result of the query that indicates that the document is public, updating the classification to public. 


selecting, at random, the plurality of sentences from different paragraphs of the document. 
 
18.	(Currently Amended) The system of claim 13, wherein executing the computer-executable program instructions configures the processing device to perform operations comprising:
accessing a set of training data comprising a set of training data pairs, wherein each training data pair comprises a parse thicket corresponding to a plurality of training sentences from a training document and an expected classification and wherein the set of training data includes 
training the machine learning model by iteratively:
providing one of the training data pairs to the [[the]] machine learning model;
receiving, from the [[the]] machine learning model, a determined classification;
calculating a loss function by calculating a difference between the determined classification and the expected classification; and
adjusting internal parameters of [[the]] the machine learning model to minimize the loss function. 
 
19.	(Previously presented) The system of claim 13, wherein executing the computer-executable program instructions configures the processing device to perform operations comprising:
(i) responsive to determining that the document is classified as public, permitting a transmission of the document over a data network or (ii) responsive to determining that the document is classified as private, preventing a transmission of the document over a data network.

20.	(Previously presented) The system of claim 13, wherein the machine learning model is a support vector machine using tree-kernel learning.
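
As an illustrative aside that forms no part of the claims or of the examiner's amendment, the tree structure recited in claims 1, 9, and 13 and the paragraph-threshold rule of claims 3 and 15 might be sketched in Python as shown below. Every name here (`Node`, `ParseThicket`, `classify_document`, and the `paragraph_is_private` callable) is hypothetical, since the record does not disclose an implementation. The paragraph classifier is a stand-in for the trained model, not the tree-kernel support vector machine of claim 20, and reading "a threshold number" as "at least `threshold`" is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """One node of a discourse tree (hypothetical layout).

    A nonterminal node carries a rhetorical relation and has children.
    A terminal node carries the text of one elementary discourse unit (EDU).
    """
    relation: Optional[str] = None
    edu: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

    def is_terminal(self) -> bool:
        return not self.children

    def edus(self) -> List[str]:
        """Collect EDU texts from the terminal nodes, left to right."""
        if self.is_terminal():
            return [self.edu] if self.edu is not None else []
        return [text for child in self.children for text in child.edus()]


@dataclass
class ParseThicket:
    """Several trees plus relations that link them (hypothetical layout)."""
    trees: List[Node]
    # Each link is (index of one tree, index of another tree, relation label).
    cross_links: List[Tuple[int, int, str]] = field(default_factory=list)


def classify_document(
    paragraph_thickets: List[ParseThicket],
    paragraph_is_private: Callable[[ParseThicket], bool],
    threshold: int,
) -> str:
    """Label the document private when enough paragraphs are private.

    Assumption: "a threshold number" means at least `threshold` paragraphs.
    """
    private_count = sum(1 for t in paragraph_thickets if paragraph_is_private(t))
    return "private" if private_count >= threshold else "public"
```

In this sketch the per-paragraph decision is injected as a callable so that any trained model could be substituted without changing the threshold logic.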

Allowable Subject Matter
Claims 1-10 and 12-20 are allowed.
The following is an examiner’s statement of reasons for allowance: 
The prior art of record teaches the limitations noted in the previous Office Action. However, the prior art of record, taken either alone or in combination, fails to teach or fairly suggest “applying a machine learning model to the parse thicket to determine a classification, wherein the machine learning model is trained to classify text as public or private based on discourse relationships”, as recited in claim 1, in combination with the remaining features and elements of the claimed invention.
A thorough search of the prior art reveals no reference that uses communicative discourse trees together with a parse thicket in a machine learning model to classify documents as public or private. The closest available prior art references are Mathkour and Hart. Mathkour teaches classifying a document as public or confidential using discourse structure, but it does not use a machine learning model. Hart teaches classifying texts as public or private, but it does not use discourse trees or discourse structure. In the current application, the machine learning model is trained using supervised learning in which discourse or rhetorical relations between communicative units are formed or determined in order to classify a document as public or private. These methods and elements in combination were not used in the prior art. As a result, this claim is allowable.
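
For illustration only, and not as part of the record, the supervised training steps recited in claims 8, 12, and 18 (provide a training pair, receive a determined classification, compute a loss from the difference, adjust internal parameters) might be sketched as follows. This sketch makes several assumptions that the application does not confirm: each training parse thicket has already been reduced to a numeric feature vector, the model is a simple logistic unit rather than a tree-kernel learner, the loss is the squared difference, and the label 1.0 stands for private. The names `predict` and `train` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Return a score in (0, 1); values near 1.0 mean 'private' (assumed coding)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(
    pairs: List[Tuple[Sequence[float], float]],
    epochs: int = 200,
    learning_rate: float = 0.5,
) -> Tuple[List[float], float]:
    """Iteratively fit the model to (features, expected classification) pairs."""
    weights = [0.0] * len(pairs[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, expected in pairs:
            # Provide one training pair and receive the determined classification.
            determined = predict(weights, bias, features)
            # The loss is the squared difference (determined - expected) ** 2;
            # its gradient with respect to the pre-sigmoid score z is:
            grad_z = 2.0 * (determined - expected) * determined * (1.0 - determined)
            # Adjust the internal parameters in the direction that lowers the loss.
            weights = [w - learning_rate * grad_z * x for w, x in zip(weights, features)]
            bias -= learning_rate * grad_z
    return weights, bias
```

A call such as `train([([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)])` returns weights under which the first feature vector scores above 0.5 and the second below it.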
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
An inquiry concerning this communication or earlier communications from the examiner should be directed to QAMAR IQBAL, whose telephone number is 571-272-2563. The examiner can normally be reached M-F, 10 am to 6 pm (EST).
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached at 571-270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 

/Q.I/ 
Examiner 

05/12/2021

/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123