DETAILED ACTION
Introduction
This office action is in response to applicant’s amendment filed 9/6/2022. Claims 1-24 are currently pending and have been examined. Applicant’s IDS have been considered. There is no claim to foreign priority.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments, see remarks, filed 9/6/2022, with respect to the rejection(s) of claim(s) 1-24, under one of 35 USC 102 and 35 USC 103, have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Sardelich in view of Alvelda (see rejection below).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-24 are rejected under 35 U.S.C. 103 as being unpatentable over Sardelich et al. (Sardelich, Multimodal deep learning for short-term stock volatility prediction) in view of Alvelda, VII (Alvelda, US 2020/0104641).
As per claim 1, Sardelich teaches a computer system comprising: 
a processor operatively coupled to memory; and 
an artificial intelligence (AI) platform, in communication with the processor, having machine learning (ML) tools to employ deep learning techniques operable to fuse data across modalities (ibid-pages 15-18, his Recurrent Neural Network (RNN), pages 18-21-his “deep multi-modal learning framework”, pages 21-24 his “multimodal” network, across time series data and news information, his FC layer, fusing data across both modalities, via merging, concatenation and joint representation encoding), the tools comprising: 
a first data feed manager operatively coupled to a first data set, the first data set having a first modality in a first data format (ibid, his news source data input); 
a second data feed manager operatively coupled to a second data set, the second data set having a second modality in a second data form, the second modality being different from the first modality (ibid-his price data, as his numerical time series information); 
the first data feed manager configured to encode the first data set into first encodings comprising a first set of vectors (ibid-his embedding layers, wherein the first data set is encoded into a first set of vectors); 
the second data feed manager configured to encode the second data set into second encodings comprising a second set of vectors (ibid-his embeddings for the temporal context and time steps); 
an analyzer operatively coupled to the first and second data feed managers (ibid-his deep learning multimodal neural network architecture and system, Figs. 2-7, see pages 21-23 section 5.3), the analyzer configured to: 
leverage an artificial recurrent neural network (RNN) to analyze the encoded first and second data sets, including iteratively and asynchronously fuse the first and second encodings (ibid-his each iteration of stochastic gradient descent, learning using RNN, which learns a joint representation for all modes), the fusing comprising combining vectors from the first and second sets of vectors representing correlated temporal behavior (ibid-his RNN/LSTM-his recursive neural network fusion of multi-modal data, using LSTM framework, and his temporal sequence as illustrated in his neural network architecture, his BiLSTM embedding with time step information, as his asynchronous LSTM, and corresponding fusion thereof); and 
return the fused vectors as output data (ibid-his output vector from the FC layer).
Sardelich does not explicitly teach, but Alvelda teaches, a processor operatively coupled to memory and an artificial intelligence (AI) platform in communication with the processor (Fig. 7).
Thus, it would have been obvious to one of ordinary skill in the linguistics art, before the effective filing date of the invention, to combine the prior art elements of time-stamped data and time series financial data as taught by Sardelich with the computer processing system as taught by Alvelda. All the claimed elements were known in the prior art, and one skilled in the art could have combined the elements as claimed by known methods (computer implemented techniques and algorithms combining processes and steps in natural language processing), with each element performing the same function as it does separately and the combination yielding predictable results. See KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). The predictable result would be prediction task specific data, such as stock volatility, using hardware/system and/or machine readable medium for implementation thereof (ibid-Sardelich/Alvelda abstract).
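For illustration only, the general kind of pipeline recited in claim 1 (separately encode a text feed and a numerical time series into vectors, then fuse the vectors and return the result) can be sketched as below. This is a toy example and not the architecture of Sardelich, Alvelda, or the claimed invention: the function names `encode_text`, `encode_series`, and `fuse`, the seeded pseudo-random word vectors, and the single tanh layer are all assumptions chosen to keep the sketch short and self-contained.

```python
import math
import random


def encode_text(tokens, dim=4):
    """Toy text encoder (assumption): average of per-word pseudo-random vectors."""
    vecs = []
    for tok in tokens:
        rng = random.Random(tok)  # seeding by the word makes its vector repeatable
        vecs.append([rng.uniform(-1, 1) for _ in range(dim)])
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def encode_series(prices):
    """Toy time-series encoder (assumption): mean and std. dev. of log returns."""
    rets = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / len(rets)
    return [mean, math.sqrt(var)]


def fuse(text_vec, series_vec, weights, bias):
    """Concatenate both encodings and apply one fully connected tanh layer."""
    joint = text_vec + series_vec
    return [
        math.tanh(sum(w * x for w, x in zip(row, joint)) + b)
        for row, b in zip(weights, bias)
    ]


# Hypothetical inputs: one news headline and a short price history.
text_vec = encode_text(["earnings", "beat", "forecast"])
series_vec = encode_series([100.0, 101.0, 99.5, 102.0])

# Untrained, randomly initialised layer: 3 output units over the joint vector.
rng = random.Random(0)
joint_dim = len(text_vec) + len(series_vec)
weights = [[rng.uniform(-0.5, 0.5) for _ in range(joint_dim)] for _ in range(3)]
bias = [0.0, 0.0, 0.0]

fused = fuse(text_vec, series_vec, weights, bias)
print(len(fused))
```

The sketch performs a single, untrained fusion step; it does not attempt the iterative or asynchronous aspects recited in the claim.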
As per claim 7, claim 7 sets forth limitations similar to claim 1 and is thus rejected under similar reasons and rationale, wherein the computer program product and medium is deemed to embody the system (and method), such that Sardelich with Alvelda teaches a computer program product (as distinguished in the applicant’s specification, paragraph [0051-0053], from transitory signals) to employ deep learning techniques to fuse data across modalities, the computer program product comprising a computer readable storage medium having program code embodied therewith, the program code executable by a processor to (ibid, claim 1, Alvelda): receive a multi-modal data set, the multi-modal data set comprising data in different formats from two or more modalities, including a first data set having a first modality and a second data set having a second modality that differs from the first modality (ibid-see claim 1, corresponding and similar limitation, Sardelich, his first and second data set and modalities discussion, his news data feed and time series, temporal based data modes of information); separately process the first and second data sets, including encode the first data set into first encodings comprising one or more first vectors and encode the second data set into second encodings comprising one or more second vectors (ibid-see claim 1, corresponding and similar limitation, see encoding, vector and representation discussion for each data set); analyze the processed multi-modal data set, including iteratively and asynchronously fuse the first and second encodings, the fusion comprising combining the first and second vectors representing correlated temporal behavior (ibid-see claim 1, corresponding and similar limitation, analyzer discussion); and return the fused vectors as output data (ibid-see claim 1, corresponding and similar limitation).
As per claim 13, claim 13 sets forth limitations similar to claims 1 and 7 and is thus rejected under similar reasons and rationale, wherein the system and computer program product is deemed to embody the method, such that Sardelich with Alvelda make obvious a method comprising (ibid-Alvelda, abstract): 
receiving, by a computing device (ibid-Alvelda-see claim 1, system-Fig. 7, as his computing device, processor discussion), a multi-modal data set, the multi-modal data set comprising data in different formats from two or more modalities, including a first data set having a first modality and a second data set having a second modality that differs from the first modality (ibid-see claim 7, corresponding and similar limitation); 
separately processing the first and second data sets, including encoding the first data set into first encodings comprising one or more first vectors and encoding the second data set into encodings comprising one or more second vectors (ibid); 
analyzing the processed multi-modal data set, including iteratively and asynchronously fusing the first and second encodings, the fusing comprising combining first and second vectors representing correlated temporal behavior (ibid); and 
returning the fused vectors as output data (ibid).
As per claim 19, claim 19 sets forth limitations similar to claims 7 and 13, and is rejected under similar reasons and rationale, wherein Sardelich with Alvelda make obvious a method comprising: 
receiving, by a computing device (ibid-see claim 13, corresponding and similar limitation), multi-modal data, the multi-modal data comprising data in different formats from two or more modalities (ibid), including at least a first data set and a second data set, the first data set having a first modality and the second data set having a second modality that differs from the first modality (ibid-see claim 13, corresponding and similar limitation); 
separately processing the first and second data sets, including encoding the first data set into first encodings comprising one or more first vectors and encoding the second data set into second encodings comprising one or more second vectors (ibid); 
analyzing the processed multi-modal data, including fusing the first and second encodings, the fusing comprising combining the first and second vectors representing correlated temporal behavior between performance behavior of data in the modalities included in the multi-modal data (ibid-see claim 7, corresponding and similar limitation); and 
returning the fused vectors encoding common behaviors (ibid, see returned fused vectors discussion, his output vector from FC, as his common multimodal space based on time series and data correlations).
As per claim 21, Georgiou further teaches the method of claim 19, wherein analyzing the processed multi-modal data set includes employing deep learning techniques to fuse data across the modalities (ibid-see above, deep fusion RNN/LSTM and modalities discussion, paragraphs [0029-0066]).
As per claims 2, 8 and 14, Sardelich further makes obvious the system of claim 1, wherein first input data from the first data set represents a time-stamped textual data feed and second input data from the second data set represents time series data (ibid-pages 18-20-his time-stamped news, and corresponding numerical stock value time series data).
As per claims 3, 9 and 15, Sardelich with Alvelda further makes obvious the system of claim 2, wherein the leverage of the RNN to iteratively and asynchronously fuse the first and second encodings comprises the RNN configured to correlate temporal behavior of the time series data of the second data set with representative vectors from the first data set (ibid-Sardelich, see his RNN/LSTM discussion FC and joint representation and merging discussion, as his iterative and asynchronous fusing, based on the independent temporal behavior of the time series data from the second mode of input and vectors from the textual data). 
As per claims 4, 10 and 16, Sardelich with Alvelda further makes obvious the system of claim 2, wherein leverage of the RNN to iteratively and asynchronously fuse the first and second encodings comprises the RNN configured to filter out one or more representative vectors from the first set of vectors irrelevant to patterns ascertained in the second set of vectors (ibid-pages 21-23, Sardelich -his RNN/LSTM captures the cross-modal correlations in the fusion, and filters out the vectors from the first data set irrelevant to patterns from the second data set, producing the output representation vector).
As per claims 5, 11 and 17, Sardelich with Alvelda further makes obvious the system of claim 2, wherein the first data feed manager configured to encode the first data set representing the time-stamped textual data feed comprises the first data manager configured to learn semantic dependencies among words of the time-stamped textual data feed and aggregate the words into a representative vector for each input text document (ibid-Sardelich pages 15-18 and 21-23, his NLP, semantic dependencies via sequence modeling, word order and relations in a sentence, his text representation of the input, encoding thereof, attention mechanism meaning/semantic aggregation representation vector for the entire input sequence/document and corresponding time steps, and final max vector, aggregated for each input text document, see also page 20).
As per claims 6, 12 and 18, Sardelich with Alvelda further makes obvious the system of claim 1, wherein the RNN is further configured to explore and inter-relate information from at least two temporal sequences of different sampling frequencies (ibid- see his temporal sequences, time-scales frequencies, and corresponding RNN cross-modal analysis of the input streams, see page 28-as his different sampling frequencies, evaluation at different frequencies and information correlated).
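For illustration only, one simple way to inter-relate two temporal sequences with different sampling frequencies is to pair each regularly sampled observation with the most recent irregular event at or before it. The sketch below is an assumption made for clarity, not the method of Sardelich or of the claims; the helper `align_latest` and the sample timestamps are hypothetical.

```python
from bisect import bisect_right


def align_latest(series_times, event_times):
    """For each series timestamp, return the index of the latest event at or
    before it, or None if no event has occurred yet. event_times must be sorted."""
    aligned = []
    for t in series_times:
        i = bisect_right(event_times, t) - 1
        aligned.append(i if i >= 0 else None)
    return aligned


# Hypothetical data: prices sampled every 60 seconds, news arriving irregularly.
price_times = [0, 60, 120, 180]
news_times = [45, 50, 170]

aligned = align_latest(price_times, news_times)
print(aligned)  # [None, 1, 1, 2]
```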
As per claim 20, Sardelich further teaches the method of claim 19, wherein the first modality is textual data and the second modality is time series data (ibid-see claim 2, corresponding and similar limitation, Sardelich, wherein claim 20 sets forth similar limitations as claim 2 and is rejected under similar reasons and rationale), and wherein the fusing further comprises obtaining a probability mass of attention on the textual data for a current state of the time series modality (Sardelich, pages 20-23 -his attention mechanism on the textual data, time dependent and probability distribution, weight/mass representation on the entire message, as applied to the textual and time series data).
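For illustration only, a "probability mass of attention on the textual data for a current state of the time series modality" can be pictured as a softmax over similarity scores between a time-series state vector and each text vector. The sketch below assumes dot-product scoring and uses hypothetical names (`attention_weights`, `attend`) and made-up vectors; it is not asserted to be the attention mechanism of Sardelich or of the claimed invention.

```python
import math


def attention_weights(state, text_vectors):
    """Softmax over dot products between the state and each text vector."""
    scores = [sum(s * t for s, t in zip(state, vec)) for vec in text_vectors]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(state, text_vectors):
    """Return the attention weights and the weighted sum of the text vectors."""
    weights = attention_weights(state, text_vectors)
    dim = len(text_vectors[0])
    context = [
        sum(w * vec[i] for w, vec in zip(weights, text_vectors))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical current time-series state and three encoded news items.
state = [1.0, 0.0]
text_vectors = [[2.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]

weights, context = attend(state, text_vectors)
print(weights)
```

The weights are non-negative and sum to one, so they can be read as a probability distribution over the text items, with the largest share going to the item most aligned with the current state.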
As per claim 22, Sardelich with Alvelda further makes obvious the method of claim 20, wherein first input data from the first data set represents a time-stamped textual data feed and second input data from the second data set represents numerical time series data (ibid-Sardelich, see claim 2, textual data input, time series, securities/stocks, numerical event prediction and tracking, Sardelich abstract), and wherein the fusing of the first and second vectors further comprises referencing the numerical time series data against the time-stamped textual data (ibid-see claims 1, 7 and 13, Sardelich, fusion discussion-his joint representation encoder, fusing the first and second vectors from the encoded data streams).
As per claim 23, Sardelich with Alvelda further makes obvious the method of claim 21, wherein the fusing the encoded vectors occurs unsupervised (ibid-page 4-see his “unsupervised” deep learning and fusion discussion). 
As per claim 24, Sardelich further teaches the method of claim 19, wherein the multi-modal data includes one or more of medical data, climate data, computer vision data, financial data, or a combination thereof (ibid-see Sardelich multiple data fields, including financial, his abstract).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure (See PTO-892). 
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LAMONT M SPOONER whose telephone number is (571)272-7613. The examiner can normally be reached 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LAMONT M SPOONER/ Primary Examiner, Art Unit 2657
lms
11/3/2022