DETAILED ACTION
	This communication is in response to the application filed 4/8/2020. Claims 1-20 are pending in the application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claim(s) 1, 5-8, 12-15, and 19-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Rossi et al. (US 10235403).
As per claims 1, 8, 15, Rossi et al. (US 10235403) teaches
a computer-implemented method, comprising: obtaining a Structured Query Language (SQL) query to create a matrix factorization model based on a set of training data (col. 16:12-22: with reference to fig. 14, a matrix factorization system receives one or more datasets to be fused and factorized together into a model. The datasets share at least one dimension (e.g., a user dimension) and are, for example, graphs represented as adjacency matrices or traditional data in the form of rows and columns. The rows are typically user information and the columns are typically features/attributes; col. 17:4-26: consider an online retail dataset stored in a relational database (or any arbitrary data type), and including many columns and rows. If only a user column is considered, this column can be merged with purchases or other parameters to get a matrix for which Collective CCD2 can be applied; col. 20:10-31: the initial datasets are used to select and train a model, as described above, approximating the initial datasets. The initial selection and training can be performed upon receipt of the initial datasets or by a batch computation module periodically performing batch computations. To select and train the model, the recommendation system makes use of, or includes, the matrix factorization system of fig. 14);
generating SQL sub-queries that don't include non-scalable functions; obtaining the set of training data; and generating a matrix factorization model based on the set of training data and the SQL sub-queries that don't include non-scalable functions (col. 23:8-11: divide such requests among the p processing units, and use the smaller model learned from the matrix factorization framework to obtain predictions independently of one another; col. 14:17-22: during analysis, each of the datasets was split into training and testing datasets. Further, whenever possible, cross-validation was employed with the splits for reproducibility. For Dataset3, the training and testing datasets were also combined and used to test scalability; col. 18:40-56: the set of items to compute ratings for can be constrained based on select criteria, such as clusters, neighborhood, friends, similarity, etc. For example, heuristics may be used to only search over q entries, where q is much less than n (e.g., using simple similarity, social network, etc.). As another example, using k-means, the users and products could be clustered into groups… to constrain the set of items is using common neighbors to select a subset of the n products. This improves the scalability since a score is computed for only a fraction of the total n products for each user; col. 22:21-23: providing a real-time matrix factorization system that is fast, scalable, and accurate and that is applicable for a variety of applications such as recommendation). Thus, select statements of subsets of datasets based on select criteria that help obtain predictions independently of one another, find interesting groups or trends, etc. are equivalent to relational/SQL sub-queries with scalable functions – See col. 17:4-10; col. 21:39-45.

As per claims 5, 12, 19, Rossi et al. teaches
wherein generating SQL sub-queries that don't include non-scalable functions comprises: generating the SQL sub-queries such that all functions called in the SQL sub-queries are scalable (col. 23:8-11: divide such requests among the p processing units, and use the smaller model learned from the matrix factorization framework to obtain predictions independently of one another; col. 14:17-22: during analysis, each of the datasets was split into training and testing datasets. Further, whenever possible, cross-validation was employed with the splits for reproducibility. For Dataset3, the training and testing datasets were also combined and used to test scalability; col. 18:40-56: the set of items to compute ratings for can be constrained based on select criteria, such as clusters, neighborhood, friends, similarity, etc. For example, heuristics may be used to only search over q entries, where q is much less than n (e.g., using simple similarity, social network, etc.). As another example, using k-means, the users and products could be clustered into groups… to constrain the set of items is using common neighbors to select a subset of the n products. This improves the scalability since a score is computed for only a fraction of the total n products for each user; col. 21:30-47: the framework is flexible, scalable in streaming environments, efficient/fast, and accurate. The framework can fuse an arbitrary number of edge attributes (matrices) or vertex attributes (vectors), and the computational complexity of the framework is linear in the number of non-zeros in the matrices. Further, the framework is incremental and scalable for the streaming environment; col. 22:21-23: providing a real-time matrix factorization system that is fast, scalable, and accurate and that is applicable for a variety of applications such as recommendation).
Thus, select statements of subsets of datasets based on select criteria that help obtain predictions independently of one another, find interesting groups or trends, etc. are equivalent to relational/SQL sub-queries with scalable functions – See col. 17:4-10; col. 21:39-45.

As per claims 6, 13, 20, Rossi et al. teaches
wherein obtaining a Structured Query Language (SQL) query to create a matrix factorization model based on a set of training data comprises: obtaining a SQL query that specifies a model type, a source of the set of training data, a number of factors, a rating column in the set of training data, a user column in the set of training data, and an item column in the set of training data (col. 3:20-30: a more general approach to matrix factorization for factorizing an arbitrary number of matrices and attributes using CCD2, herein referred to as Collective CCD2. Collective CCD2 fuses two or more datasets together by representing datasets as matrices and factorizing them. For instance, given a user-by-item matrix (e.g., users purchased a product or rated a movie); col. 14:5-17: ratings; col. 20:4-31: the model selection suitably adapts parameters of the model, such as model size d and/or regularization parameters. Advantageously, by recomputing U, V, and Z from scratch, new users (user column), updates/new purchases (item column), new friendships, etc. are incorporated into the model. The initial datasets (source of the training data) are used to select and train a model, as described above, approximating the initial datasets. The initial selection and training can be performed upon receipt of the initial datasets or by a batch computation module periodically performing batch computations. To select and train the model, the recommendation system makes use of, or includes, the matrix factorization system of FIG. 14; col. 22:43-55: providing a method for adjusting the matrix factorization and the resulting model using either user-defined parameters given as input, or learned automatically through cross validation. Providing a system for evaluating and tuning a model for a particular prediction or factorization task).

As per claims 7, 14, Rossi et al. teaches
wherein the training data indicates ratings that users gave to items, and the matrix factorization model provides, for each of the users, predictions of ratings that the user would give to items for which the user did not give a rating (col. 13:25-29: regarding neighborhood based models, after training U and V (e.g., using CCD2), a prediction or recommendation is made using the user-by-user matrix S (e.g., a social network or an interaction/similarity matrix); col. 14:25-31: the trained model was then evaluated using the testing datasets. That is to say, the trained model was used to make predictions and compared to the known results; col. 17:21-51: predicting the rating of a movie for a given user, … forecasting describes the prediction of the product a user will buy at t+1 by treating a user-by-product matrix as a continuous, dynamic system and using the current data at time t with exponentially weighted past data from t-1; col. 18:12-18).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 9, 16 are rejected under 35 U.S.C. 103 as being unpatentable over Rossi et al. (US 10235403) in view of Tamayo et al. (US 20130246319).
As per claims 2, 9, 16, Rossi et al. teaches
wherein generating a matrix factorization model based on the set of training data and the SQL sub-queries that don't include non-scalable functions comprises: defining, based on the training data, a model table; defining, based on the training data, a data model table; and generating the matrix factorization model based on both the model table and the data model table (col. 16:12-22: with reference to fig. 14, a matrix factorization system includes a CCD2/Collective CCD2 module. The CCD2 module receives one or more datasets to be fused and factorized together into a model. The datasets share at least one dimension (e.g., a user dimension) and are, for example, graphs represented as adjacency matrices or traditional data in the form of rows and columns. The rows are typically user information and the columns are typically features/attributes; col. 14:26-31; col. 6:6-19: multiple models are trained, typically simultaneously in parallel, using CCD2 or Collective CCD. The trained models are then combined to get a single model representing the matrix factorization. More specifically, p different bootstrap datasets are generated by dividing a dataset (e.g., A) into the p datasets. For example, two bootstrap datasets can be generated from even and odd numbered rows of the training dataset, respectively. A model is then trained using each of the p bootstrap datasets, and an ensemble is created by combining the models to obtain a more accurate model; col. 22:43-55: providing a method for adjusting the matrix factorization and the resulting model using either user-defined parameters given as input, or learned automatically through cross validation. Providing a system for evaluating and tuning a model for a particular prediction or factorization task. Since data are stored in columns and rows, thus, input/training dataset for a particular model are stored in tables). Even if Rossi does not explicitly disclose model table and data model table,
	Tamayo teaches model table and data model table at para. 78: by projecting datasets into lower-dimensional matrix representations it reduces the noise and emphasizes salient features in the data. Once products, customers, etc. are projected in a suitable space of representation, their relationships can be modeled much more easily and efficiently. As most operations in Projection Mining are matrix operations between tables of data or use data mining models, the paradigm fits very well with the RDBMS environment; para. 81: models are tables that are relatively transparent, sparse and can easily be interpreted. Therefore, the models are tables, and the table(s) of data used for training the models are data model tables. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Rossi et al. and Tamayo in order to effectively predict, forecast or provide future recommendations – See Tamayo, para. 45.

Claims 3-4, 10-11, 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Rossi et al. (US 10235403) in view of Tamayo et al. (US 20130246319) and further in view of Shiebler (US 20190251435).
As per claims 3, 10, 17, Rossi et al. teaches
wherein generating the matrix factorization model based on both the model table and the data model table (col. 6:6-19: multiple models are trained, typically simultaneously in parallel, using CCD2 or Collective CCD. The trained models are then combined to get a single model representing the matrix factorization; col. 10:26-39: collective CCD2 is a general framework for collective matrix factorization based on CCD. The proposed approach fuses multiple data sources into a single factorization. This includes both edge attributes in the form of additional matrices and vertex attributes in the form of vectors. Collective CCD2 uses the additional data sources to learn more accurate weights which appropriately capture the influence between data sources; col. 21:30-47: the framework is flexible, scalable in streaming environments, efficient/fast, and accurate. The framework can fuse an arbitrary number of edge attributes (matrices) or vertex attributes (vectors), and the computational complexity of the framework is linear in the number of non-zeros in the matrices. Further, the framework is incremental and scalable for the streaming environment. Also, see columns and lines cited in claim 2 above in relating to the model table and data model table).  Even if Rossi does not explicitly disclose model table and data model table,
	Tamayo teaches model table and data model table at para. 13, 78: by projecting datasets into lower-dimensional matrix representations it reduces the noise and emphasizes salient features in the data. Once products, customers, etc. are projected in a suitable space of representation, their relationships can be modeled much more easily and efficiently. As most operations in Projection Mining are matrix operations between tables of data or use data mining models, the paradigm fits very well with the RDBMS environment; para. 81: models are tables that are relatively transparent, sparse and can easily be interpreted. Therefore, the models are tables, and the table(s) of data used for training the models are data model tables. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Rossi et al. and Tamayo in order to effectively predict, forecast or provide future recommendations – See Tamayo, para. 45.
 	Rossi and Tamayo do not explicitly teach dot product of the two vectors.
	Shiebler teaches 
	generating the matrix factorization model based on a dot product of the two vectors (para. 16: in large multi-component systems these embeddings can be used as information dense inputs to other machine learning models. The user and item embeddings generated by matrix factorization have another desirable property: they are co-embeddings which lie in the same vector space. Therefore, user-item affinity can be estimated with just a dot product of two embedding vectors, rather than a computationally expensive neural network evaluation. Furthermore, approximate nearest neighbors systems may be used to efficiently match items to users; para. 36). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Rossi, Tamayo and Shiebler in order to effectively predict, forecast or provide future recommendations – See Shiebler, para. 28-31, 76.
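By way of illustration only, the following minimal Python sketch shows what estimating user-item affinity with a dot product of two embedding vectors can look like in general. It is not taken from Shiebler, Rossi, Tamayo, or the application; the function name `dot` and the example vectors are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def dot(u, v):
    """Return the dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


# Hypothetical 3-factor embeddings lying in the same vector space.
user_vec = [0.5, 1.0, -2.0]
item_vec = [2.0, 3.0, 0.5]

# Estimated affinity of this user for this item: 1.0 + 3.0 - 1.0
affinity = dot(user_vec, item_vec)
```

The estimate costs one multiply-add per factor, which is the efficiency point made in the cited paragraph.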

As per claims 4, 11, 18, Rossi teaches 
providing the matrices to a linear solver; obtaining item vectors from the linear solver; generating the matrix factorization model based on the item vectors (col. 10:26-39: collective CCD2 is a general framework for collective matrix factorization based on CCD. The proposed approach fuses multiple data sources into a single factorization. This includes both edge attributes in the form of additional matrices and vertex attributes in the form of vectors. Collective CCD2 uses the additional data sources to learn more accurate weights which appropriately capture the influence between data sources; col. 21:30-47: the framework is flexible, scalable in streaming environments, efficient/fast, and accurate. The framework can fuse an arbitrary number of edge attributes (matrices) or vertex attributes (vectors), and the computational complexity of the framework is linear in the number of non-zeros in the matrices).
Rossi and Tamayo do not explicitly teach determining matrices based on the dot product of the two vectors.
	Shiebler teaches 
determining matrices based on the dot product of the two vectors; providing the matrices to a linear solver; obtaining item vectors from the linear solver; generating the matrix factorization model based on the item vectors (para. 16: in large multi-component systems these embeddings can be used as information dense inputs to other machine learning models. The user and item embeddings generated by matrix factorization have another desirable property: they are co-embeddings which lie in the same vector space. Therefore, user-item affinity can be estimated with just a dot product of two embedding vectors. Furthermore, approximate nearest neighbors systems may be used to efficiently match items to users; para. 18-19: discovering and exploiting structure underlying user-item affinity matrices in multiple domains; para. 25: the model computes a user's auxiliary domain embedding as a linear combination of the embeddings of the items that user has interacted with, weighted by the user's affinity towards those items; para. 36, 77-79). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Rossi, Tamayo and Shiebler in order to effectively predict, forecast or provide future recommendations – See Shiebler, para. 28-31, 76.
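For illustration only, the sequence of building matrices from dot products, passing them to a linear solver, and reading back an item vector can be sketched in Python as a generic regularized least-squares step. This is an assumption made for explanatory purposes, not the method of the application or of any cited reference; the names `solve` and `item_vector`, the `reg` parameter, and the sample data are all hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def item_vector(user_vecs, ratings, reg=0.0):
    """Least-squares item vector for fixed user vectors (hypothetical helper)."""
    k = len(user_vecs[0])
    # Factor-wise columns: cols[j] holds factor j across all users.
    cols = [[u[j] for u in user_vecs] for j in range(k)]
    # Each matrix entry is a dot product; reg is added on the diagonal.
    a = [[dot(cols[i], cols[j]) + (reg if i == j else 0.0) for j in range(k)]
         for i in range(k)]
    b = [dot(cols[i], ratings) for i in range(k)]
    return solve(a, b)  # hand the matrices to the linear solver


# Three hypothetical users with 2-factor vectors, and their ratings of one item.
users = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ratings = [2.0, 3.0, 5.0]
v = item_vector(users, ratings)
```

In this toy case the ratings are exactly reproducible, so the dot product of each user vector with `v` returns that user's rating.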

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Cohen et al. (US 9934134) teaches col. 149:15-17: vector dot product; col. 183:30-39: matrix factorization, model generation.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LINH BLACK whose telephone number is (571)272-4106. The examiner can normally be reached 9AM-5PM EST M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tony Mahmoudi, can be reached on 571-272-4078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LINH BLACK/Examiner, Art Unit 2163

2/13/2022

/TONY MAHMOUDI/Supervisory Patent Examiner, Art Unit 2163