DETAILED ACTION
Introduction
This office action is in response to applicant’s claims filed 11/14/2019. Claims 1-24 are currently pending and have been examined. Applicant’s IDS have been considered. There is no claim to foreign priority.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 7, 13, 19 and 21 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Georgiou et al. (Georgiou, US 2020/0335092).
As per claim 1, Georgiou teaches a computer system comprising: 
a processor operatively coupled to memory (paragraph [0068]-as the processor, memory, and corresponding system hereinafter); 
an artificial intelligence (AI) platform, in communication with the processor, having machine learning (ML) tools to employ deep learning techniques to fused data across modalities (ibid-paragraphs [0027-0029]-his deep fusion, including multimodal input, RNN/LSTM applied across modalities), the tools comprising: 
a first data feed manager operatively coupled to a first data set, the first data set having a first modality in a first data format (ibid, Fig. 2 his timestep textual input, paragraphs [0033-0035]-see his first mode/format textual input); 
a second data feed manager operatively coupled to a second data set, the second data set having a second modality in a second data form, the second modality being different from the first modality (ibid, Fig. 2 paragraphs [0036]-his audio encoder, audio input); 
the first data feed manager to encode the first data set into a first set of vectors (ibid-see Text encoder discussion, the text representation as the first set of vectors, paragraph [0043]); 
the second data feed manager to encode the second data set into a second set of vectors (ibid-his audio encoder, audio encoded representation vectors, paragraph [0043]); 
an analyzer operatively coupled to the first and second data feed managers, the analyzer to leverage an artificial recurrent neural network (RNN) to analyze the encoded first and second data sets, including iteratively and asynchronously fuse encoded features from the first and second data modalities (ibid, paragraphs [0057-0066]), the fusing including combining vectors from the first and second data sets representing correlated temporal behavior (ibid-paragraph [0029, 0057-0066]-his RNN/LSTM-his recurrent neural network fusion of asynchronous data, using LSTM framework); and 
the fused vectors returned as output data (paragraph [0041]-his high-level representation output data, wherein the fused vectors, output representation is further applied to a desired task). 
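For illustration only, the arrangement recited in claim 1 (two separately encoded modalities fused iteratively by a recurrent network) can be sketched as below. This is a minimal sketch under stated assumptions and is not taken from Georgiou or from the application: the names `rnn_step` and `fuse_async`, the fixed weights, and the sample vectors are all hypothetical, and a simple elementwise tanh update stands in for an LSTM cell.

```python
import math

def rnn_step(hidden, vec, w_h=0.5, w_x=0.5):
    """One elementwise tanh recurrent update (toy stand-in for an LSTM cell)."""
    return [math.tanh(w_h * h + w_x * x) for h, x in zip(hidden, vec)]

def fuse_async(text_vecs, series_vecs, dim):
    """Fuse two timestamped vector streams through one shared recurrent state.

    Each stream is a list of (timestamp, vector) pairs. Items are consumed in
    timestamp order, so the streams need not share a sampling rate.
    """
    events = sorted(text_vecs + series_vecs, key=lambda e: e[0])
    hidden = [0.0] * dim
    for _, vec in events:
        hidden = rnn_step(hidden, vec)
    return hidden

# Hypothetical encoded inputs: two text vectors and three time-series vectors.
text_vecs = [(1.0, [0.2, -0.1]), (4.0, [0.5, 0.3])]
series_vecs = [(0.5, [0.1, 0.1]), (1.5, [0.0, 0.2]), (2.5, [-0.3, 0.4])]
fused = fuse_async(text_vecs, series_vecs, dim=2)
```

The returned `fused` list plays the role of the fused vectors returned as output data. A trained model would learn the weights rather than fix them at 0.5.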
As per claim 7, claim 7 sets forth limitations similar to claim 1 and is thus rejected under similar reasons and rationale, wherein the computer program product and medium is deemed to embody the system (and method), such that Georgiou teaches a computer program product (as distinguished in the applicant’s specification, paragraph [0051-0053], from transitory signals) to employ deep learning techniques to fused data across modalities, the computer program product comprising a computer readable storage medium having program code embodied therewith, the program code executable by a processor to (ibid, claim 1, Georgiou paragraph [0068]): 
receive a multi-modal data set, the multi-modal data set including data in different formats from two or more modalities, including a first data set having a first modality and a second data set having a second modality (ibid-see claim 1, corresponding and similar limitation, Fig. 2, first and second data set and modalities discussion); 
separately process the first and second data sets, including encode the first data set into one or more first vectors and encode the second data set into one or more second vectors (ibid-see claim 1, corresponding and similar limitation, Fig. 2, see encoding, vector and representation discussion for each data set); 
analyze the processed multi-modal data set, including iteratively and asynchronously fuse encoded features from the first and second data modalities, the fuse modalities including combining vectors from the first and second data sets representing correlated temporal behavior (ibid-see claim 1, corresponding and similar limitation, analyzer discussion); and 
return the fused vectors as output data (ibid-see claim 1, corresponding and similar limitation).
As per claim 13, claim 13 sets forth limitations similar to claims 1 and 7 and is thus rejected under similar reasons and rationale, wherein the system and computer program product is deemed to embody the method, such that Georgiou teaches a method comprising (paragraph [0006]): 
receiving, by a computing device, a multi-modal data set, the multi-modal data set including data in different formats from two or more modalities, including a first data set having a first modality and a second data set having a second modality (ibid-see claim 7, corresponding and similar limitation); 
separately processing the first and second data sets, including encoding the first data set into one or more first vectors and encoding the second data set into one or more second vectors (ibid); 
analyzing the processed multi-modal data set, including iteratively and asynchronously fusing encoded features from the first and second data modalities, the fusing including combining vectors from the first and second data sets representing correlated temporal behavior (ibid); and 
returning the fused vectors as output data (ibid).
As per claim 19, Georgiou teaches a method comprising: 
receiving, by a computing device, multi-modal data, the multi-modal data including data in different formats from two or more modalities, including at least a first data set and a second data set, the first data set having a first modality and the second data set having a second modality (ibid-see claim 13, corresponding and similar limitation, Fig. 2); 
separately processing the first and second data sets, including encoding the first data set into one or more first vectors and encoding the second data set into one or more second vectors (ibid); 
analyzing the processed multi-modal data, including fusing encoded vectors from the first and second data modalities, the fusing including combining vectors from the first and second data sets representing correlated temporal behavior between performance behavior of data in the modalities included in the multi-modal data (ibid-see claim 7, corresponding and similar limitation, paragraph [0029, 0057-0066]-his RNN/LSTM-his recurrent neural network fusion of asynchronous data, using LSTM framework); and 
returning the fused vectors encoding common behaviors (ibid, paragraphs [0040, 0041]-his output vector, representation to common multimodal space based on correlations).
As per claim 21, Georgiou further teaches the method of claim 19, wherein analyzing the processed multi-modal data set includes employing deep learning techniques to fuse data across the modalities (ibid-see above, deep fusion RNN/LSTM and modalities discussion, paragraphs [0029-0066]).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-6, 8-12, 14-18, 20 and 22-24 are rejected under 35 U.S.C. 103 as being unpatentable over Georgiou as applied to claims 1, 7, 13, 19 and 21 above, and further in view of Sardelich et al. (Sardelich, Multimodal deep learning for short-term stock volatility prediction).
As per claims 2, 8 and 14, Georgiou further teaches the system of claim 1, wherein a first input data from the first data set represents a time[-stamped] textual data feed and a second input data from the second data set represents time [series] data (ibid-his timestep textual data input, and audio data set, respectively, paragraph [0009-0015, 0052-0055]-see his time scale granularities for modal input signals, securities/stock applications, event predictions from multi-media input signals).
Georgiou lacks explicitly teaching that which Sardelich teaches wherein a first input data from the first data set represents a time-stamped textual data feed and a second input data from the second data set represents time series data (pages 18-20-his time-stamped news, and corresponding numerical stock value time series data).
Thus, it would have been obvious to one of ordinary skill in the linguistics art, before the effective filing date of the invention, in view of the teachings of Georgiou and Sardelich, to combine the prior art element of textual data and securities application data as taught by Georgiou with the time-stamped data and time series financial data as taught by Sardelich. All the claimed elements were known in the prior art, and one skilled in the art could have combined the elements as claimed by known methods (computer implemented techniques and algorithms combining processes and steps in natural language processing), with each element performing the same function as it does separately and the combination yielding predictable results. KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). The predictable result would be prediction task specific data, such as stock volatility (ibid-Sardelich, abstract, see also Georgiou-task, event prediction discussion).
As per claims 3, 9 and 15, Georgiou with Sardelich further makes obvious the system of claim 2, wherein the iterative and asynchronous fusing includes the RNN to correlate temporal behavior of the time series data of the second data set with representative and encoded vectors from the first data set (ibid-Georgiou, see his RNN/LSTM discussion, paragraphs [0037-0040, 0057-0066]-as his iterative and asynchronous fusing, based on the independent audio temporal behavior and encoded vectors from the textual data). 
As per claims 4, 10 and 16, Georgiou with Sardelich further makes obvious the system of claim 2, wherein the iterative and asynchronous fusing includes the RNN to filter out one or more representative vectors from the first data set irrelevant to patterns ascertained in the encoded second data set (ibid-paragraphs [0038-0041]-his RNN/LSTM captures the cross-modal correlations in the fusion, and filters out the vectors from the first data set irrelevant to patterns from the second data set, producing the output representation vector).
As per claims 5, 11 and 17, Georgiou with Sardelich further makes obvious the system of claim 2, wherein encoding a text based modality includes the first data feed manager to learn semantic dependencies among words and aggregate the text into a representative vector for each input text document (ibid-Fig. 2, paragraphs [0033-0035]-his text representation of the input, word level and sentence level encoding, attention mechanism meaning/semantic aggregation representation vector for the entire input sequence/document). 
As per claims 6, 12 and 18, Georgiou with Sardelich further makes obvious the system of claim 1, wherein analysis of the encoded first and second data sets further comprises the RNN to explore and inter-relate information from at least two temporal sequences of different sampling frequencies (ibid-paragraphs [0006-0009, 0056]-his varying granularities for the temporal sequences, text/audio/visual/physiological of time-scales frequencies, and corresponding RNN cross-modal analysis of the input streams).
As per claim 20, Georgiou further teaches the method of claim 19, wherein the first modality is textual data and the second modality is time series data (ibid-see claim 2, corresponding and similar limitation, Sardelich, wherein claim 20 sets forth similar limitations as claim 2 and is rejected under similar reasons and rationale), and wherein the fusing further comprises obtaining a probability mass of attention on the textual data for a current state of the time series modality (Georgiou, paragraph [0034]-his attention mechanism on the textual data, time dependent and probability distribution, weight/mass representation on the entire message, as applied to the textual and time series data).
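For illustration only, a "probability mass of attention on the textual data for a current state of the time series modality" can be sketched as a softmax over similarity scores. This is a minimal sketch under stated assumptions, not the mechanism of Georgiou paragraph [0034] or of the application: the names `attention_mass` and `attend`, the dot-product scoring, and the sample vectors are hypothetical.

```python
import math

def attention_mass(state, text_vecs):
    """Return a probability distribution over text vectors for one state.

    Scores are dot products between the time-series state and each text
    vector; a numerically stable softmax turns them into weights summing to 1.
    """
    scores = [sum(s * t for s, t in zip(state, vec)) for vec in text_vecs]
    peak = max(scores)
    exps = [math.exp(sc - peak) for sc in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(state, text_vecs):
    """Return the attention weights and the weighted sum of the text vectors."""
    weights = attention_mass(state, text_vecs)
    dim = len(text_vecs[0])
    context = [sum(w * vec[i] for w, vec in zip(weights, text_vecs))
               for i in range(dim)]
    return weights, context

# Hypothetical current time-series state and two encoded text documents.
state = [1.0, 0.0]
text_vecs = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attend(state, text_vecs)
```

Here the first text vector aligns with the state and so receives the larger share of the mass; the weights always sum to one.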
As per claim 22, Georgiou with Sardelich further makes obvious the method of claim 20, wherein a first input data from the first data set represents a time-stamped textual data feed and a second input data from the second data set represents numerical time series data (ibid-Georgiou, Fig. 2, see claim 2, textual data input, time series, securities/stocks, numerical event prediction and tracking, Sardelich abstract), and wherein fusing the encoded vectors further comprises referencing the numerical time series data against the time-stamped textual data (ibid-see claims 1, 7 and 13, fusion discussion, Georgiou, paragraphs [0050-0066]).
As per claim 23, Georgiou with Sardelich further makes obvious the method of claim 21, wherein fusing the encoded vectors occurs unsupervised (ibid-Georgiou, paragraph [0063-0066]-his unsupervised fusion). 
As per claim 24, Georgiou further teaches the method of claim 19, wherein the multi-modal data includes one or more of medical data, climate data, computer vision data, [financial data], or a combination thereof (paragraph [0052-0054]-his physiological signal inputs, as medical data, from human generated signals, securities applications). Georgiou lacks explicitly teaching that which Sardelich teaches, the multi-modal data includes financial data.
Thus, it would have been obvious to one of ordinary skill in the linguistics art, before the effective filing date of the invention, in view of the teachings of Georgiou and Sardelich, to combine the prior art element of textual data and securities application data as taught by Georgiou with the time-stamped data and time series financial data as taught by Sardelich. All the claimed elements were known in the prior art, and one skilled in the art could have combined the elements as claimed by known methods (computer implemented techniques and algorithms combining processes and steps in natural language processing), with each element performing the same function as it does separately and the combination yielding predictable results. KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). The predictable result would be prediction task specific data, such as stock volatility (ibid-Sardelich, abstract, see also Georgiou-task, event prediction discussion).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure (See PTO-892). 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LAMONT M SPOONER whose telephone number is (571)272-7613. The examiner can normally be reached 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/LAMONT M SPOONER/
Primary Examiner, Art Unit 2657
lms
6/4/2022