DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Interpretation
As per Claim 2 (and similarly claims 11 and 20): 
“for the one or more multimodal communications” is interpreted as referring to “electronically receive”, not to “the one or more class labels”.
“and the one or more class labels” is interpreted to mean that the training dataset is also generated based on “the one or more class labels”.
As per Claim 5 (and similarly claim 14):
“from the text modality” in line 6 of claim 5 refers to the source from which the one or more features are extracted.
As per Claim 6 (and similarly claim 15): 
“from the audio modality” in line 6 of claim 6 refers to the source from which the one or more features are extracted.
As per Claim 7 (and similarly claim 16): 
“from the video modality” in line 6 of claim 7 refers to the source from which the one or more features are extracted.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8-18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

As per Claim 8 (and similarly claim 17):
“the one or more features extracted from each of the one or more communication modalities associated with the one or more multimodal communications” lacks antecedent basis.  Extracting one or more features from the one or more communication modalities in lines 10-11 of claim 1 does not necessarily extract the one or more features from each of the one or more communication modalities (as opposed to where some features are extracted from some modalities whereas other features are extracted from other modalities).
	
As per Claim 9 (and similarly claim 18):
“the one or more communication modalities associated with the unseen multimodal communication” in lines 2-3 of claim 9 lacks explicit antecedent basis (claim 1 does not specify that the unseen multimodal communication is associated with [or comprises] the one or more communication modalities [claim 3 does]).
“the multimodal metric” in line 3 of claim 9 is ambiguous when there is more than one communication modality (one multimodal metric is determined for each/per communication modality, and so when there is more than one multimodal metric, it is not clear which multimodal metric is the one that “the multimodal metric” in line 3 of claim 9 is supposed to refer to).

As per Claim 10:
“code causing a first apparatus to:” in line 3 of claim 10 seems to require actual execution of the code to infringe. It is not clear whether Applicant meant to claim that the code must actually cause the first apparatus to perform the steps of claim 10 in order to infringe, or that the code, when executed, causes the first apparatus to perform the claimed steps (i.e., where infringement can occur when a computer merely stores the code and does not actually run/execute the code).

Claims 11-18 impose further limits on the first apparatus, not the code stored in the medium.  Dependent claims of CRM independent claims typically further limit the code/instructions stored on the CRM and not the apparatus that executes the code/instructions, and so it is not clear if Applicant meant to claim where the apparatus is further configured to perform the dependent claim steps, or if the code, when executed, further causes the apparatus to perform the dependent claim steps.
Allowable Subject Matter
Claims 1-7, 19, and 20 are allowed.
Claim 10 would be allowable if rewritten or amended to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action.
Claims 8, 9, and 11-18 would be allowable if rewritten to overcome the rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter:  
	As per Claim(s) 1 (and similarly claim[s] 10 and 19, and consequently claim[s] 2-9, 11-18, and 20 which depend on claim[s] 1, 10, and 19), the prior art of record does not teach or suggest the combination of all limitations in claim(s) 1, including (i.e., in combination with the remaining limitations in claim[s] 1) A system for intelligent multimodal classification in a distributed technical environment, the system comprising: at least one non-transitory storage device; and at least one processing device coupled to the at least one non-transitory storage device, wherein the at least one processing device is configured to: electronically retrieve one or more multimodal communications from a data repository, wherein the one or more multimodal communications comprises one or more communication modalities; initiate one or more feature extraction algorithms on the one or more communication modalities associated with the one or more multimodal communications to extract one or more features from the one or more communication modalities associated with the one or more multimodal communications; generate a training dataset based on at least the one or more features extracted from the one or more communication modalities associated with the one or more multimodal communications; initiate one or more machine learning algorithms on the training dataset to generate a first set of parameters; electronically receive an unseen multimodal communication; generate an unseen dataset based on at least the unseen multimodal communication; classify, using the first set of parameters, the unseen multimodal communication into one or more class labels; and initiate an execution of one or more actions on the unseen multimodal communication based on at least classifying the unseen multimodal communication into the one or more class labels (extracting feature[s] from one or more communication modalities associated with one or more multimodal communications from a data repository, generating a training dataset based on the extracted feature[s], generating a first set of parameters by initiating machine learning on the training dataset [generated based on the extracted feature(s)], classifying an unseen multimodal communication into one or more class labels using the first set of parameters, and initiating action[s] on the unseen multimodal communication based on classifying the unseen multimodal communication into the one or more class labels).
2005/0131847 teaches “In inductive inference, which has been used thus far in the learning process, one is given data from which one builds a general model and then applies this model to classify new unseen (test) data” (paragraph 437).  This reference describes building a model from data and then applying the model to classify unseen data, and also describes unseen data as being synonymous with test data.  This reference does not appear to describe where the test data and the given data are multimodal communications or communication modalities.
2015/0294194 teaches “classifying a multimodal test object described according to at least one first and one second modality” (Abstract).
2018/0046721 teaches “extracting a plurality of features from the each of the plurality of modalities of content” (claim 4).  Paragraph 19 describes where multiple modalities can be text, image, audio, or video content.  This reference appears to be directed to searching multimodal content (not classifying an unseen multimodal input).
2017/0084295 teaches “In certain embodiments, a module execution interface 420 communicates extracted features produced by the speech feature extraction module 412 to an analytics module 422. The analytics module 422 may be configured to perform further processing on speech information provided by the speech feature extraction module 412. For example, the analytics module 422 may compute additional features (e.g., longitudinal features) using the features extracted from the speech signal by the speech feature extraction module 412. The analytics module 422 may subsequently provide as output information or data, e.g., raw analytics, for use in fusion 430, for example. The fusion module 430 may combine or algorithmically “fuse” speech features (or resultant analytics produced by the platform 400) with other multimodal data. For instance, speech features may be fused with features extracted from data sources of other modalities, such as visual features extracted from images or video, gesture data, etc. The fused multimodal features may provide a more robust indication of speaker state, in some instances” (paragraph 87).  This reference describes extracting features from data sources of multiple modalities.
2021/0192142 teaches “In another possible design of the embodiment of the disclosure, the processing module 802 is further configured to obtain a multimodal data set which includes multiple multimodal content samples, process the multimodal data set to determine an ontology of the multimodal knowledge graph, mine multimodal knowledge node samples of each of the multimodal content samples in the multimodal data set, establish an association relationship between the multimodal knowledge node samples through knowledge graph representation learning, and construct the multimodal knowledge graph based on the association relationship between the multimodal knowledge nodes and the ontology of the multimodal knowledge graph” (paragraph 10).  This reference describes a multimodal data set which includes multimodal content samples and processing the multimodal dataset to determine an ontology.  This reference does not appear to describe where the ontology is used to classify an unseen multimodal input.
2011/0213737 teaches “Training is further enhanced by exploiting the fact that each training dataset 60 is multi-modal, i.e., contains multiple classes or dimensions for a given feature data sample, e.g., feature data sample 42A. Enhanced training is implemented as follows. Once obtained, a given feature data sample 42A is passed into the feature correlation system 44, which finds exclusive groupings of features within the feature data sample 42A that have either the most characteristics in common or the least characteristics in common. Grouping criteria are generally determined a priori. For example, it may be known that financial data and travel data are commonly linked, or that a first health condition is common to a second health condition. The groupings of data become correlated features” (paragraph 25).  This reference describes where a training dataset is multi-modal.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC YEN whose telephone number is (571)272-4249. The examiner can normally be reached M-F 12:00 PM - 8:30 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL, can be reached on (571)272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





EY 8/10/2022
/ERIC YEN/Primary Examiner, Art Unit 2658