DETAILED ACTION
This action is responsive to the Application filed on 07/01/2020. Claims 1-20 are Cancelled. Claims 21-40 are pending in the case.  Claims 21, 30 and 37 are independent claims. 
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claim(s) 28 is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 
Examiner notes that the specification does not appear to discuss converting labels into binary labels. The machine learning model is trained on items with binary labels; there is no mention of “non-binary” or scalar labels beyond the aside that a loss function suited for non-binary labels can be used (“Finally, we point out that any standard surrogate loss for ranking can be used as the per-step loss ℓ(s, y), including losses that depend on non-binary labels y, such as relevance scores” Section 3.2 of “Seq2Slate: Re-ranking and slate optimization with RNNs”).

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claim(s) 21, 26, 28-30, 35 and 37 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Yin et al., “Ranking Relevance in Yahoo Search”, hereinafter Yin.

Regarding claim 21
Yin teaches, A computer-implemented method to train a machine-learned ranking model for generating a ranking for a set of candidate items, the method comprising: (pg 323 Section 2.1 “The Yahoo search engine can retrieve the most relevant documents from a corpus of billions” Section 3.1 ¶02 “We name our learning algorithm LogisticRank and compare with the leading ranking algorithms: … in the Yahoo search engine. All three methods share the same training data, which is collected through active learning” the search algorithm presents a list of relevant documents using a machine-learned model, LogisticRank.)
obtaining, by one or more computing devices, data descriptive of the machine-learned ranking model, (Section 2.3 ¶03 Table 1 “To assess base relevance, we leverage professional editors’ judgment…We then use Discounted Cumulative Gain (DCG) as the metric to evaluate the search relevance performance… In the following sections, we will report DCG1, DCG3, DCG5” the method also produces performance metrics which are obtained based on a validation candidate set.)
wherein the machine-learned ranking model is configured to receive the set of candidate items and output the ranking of the candidate items in the set of candidate items; (Section 3.1 ¶02 “We name our learning algorithm LogisticRank…training data, which is collected through active learning and editor labels and includes about 2 million query-URL pairs” Section 3.2 “The reranking phase is applied after the results from the core ranking phase have been sorted, aggregated on one machine, and trimmed down to the best tens of results” before the reranking phase, a list of sorted and ranked URLs is output.)
obtaining, by the one or more computing devices, a set of training data, the set of training data comprising a plurality of training pairs, each training pair of the plurality of training pairs comprising a training item and an associated relevance score; (Section 2.3 “To assess base relevance, we leverage professional editors’ judgment. Editors manually judge each query-URL pair, assigning one of five grades: Perfect, Excellent, Good, Fair or Bad” Section 3 “We here introduce a unified method for web search which is based on logistic loss and incorporates the Perfect, Excellent and Good information into the model though scaling the gradient for GBDT.” In order to train the model, the labeled query-URL pairs were received by the system. The relevance score is the label assigned by editors.)
determining, by the one or more computing devices, a simulated interaction for each training item of the set of training data, the simulated interaction indicating whether a simulated user will interact with the training item, (Section 2.3 ¶03 “We sample 2,000 queries from a one year query log as our test queries and evaluate their search results in the Yahoo search engine by capturing the results” Section 3 ¶02 and Table 1 “We here introduce a unified method for web search which is based on logistic loss and incorporates the Perfect, Excellent and Good information into the model though scaling the gradient for GBDT. Online evaluation of the Yahoo search engine shows that this framework decreases the percent of bad URLs by 40%” the authors describe a search engine which takes training data and evaluates it using a machine learning method to determine which documents are relevant with a score. Determining whether a query-document pair has a high score, for example, is a “simulated interaction” between a user of the search engine and the data set. A positive score indicates that a user may interact with the item, while negative scores (bad URLs) are learned such that those items will not be presented to a hypothetical/interactive user. Presenting or not presenting a URL is a determination of whether a simulated user will interact with a training item.)
the simulated interaction based at least in part on the associated relevance score for the training item (Table 1; as stated previously, the machine learning ranking system determines a relevance score DCG1, DCG3, or DCG5 as shown in Table 1.)
and a probability that the simulated user will observe the training item based on a position of the training item within a candidate presentation; (Section 2.3 Table 1 “According to the popularity of the query, the evaluation query set is split into three parts: top, torso and tail. Top queries have very high impressions and, in terms of relevance… Tail and torso queries are relatively hard queries for search relevance” as shown in Table 1, each training item has an associated popularity label (top, torso, tail); these labels correspond to the “popularity” or “probability” of a query-document that a hypothetical/simulated user is likely to interact with. Tail queries, for example, appear at the end of a list of recommended results presented to a hypothetical user in a search engine, while top queries are presented at the beginning of a candidate presentation.)
and training, by the one or more computing devices, the machine-learned ranking model based on the simulated interactions for each training item of the training data. (Section 4.3 Table 3 “The neural network is then trained on a GPU cluster. After training the deep neural network, at runtime, the top level neurons will generate the embeddings of queries and URLs” the architecture described in the art is trained on the query-URL pairs from the candidate set. Also, as shown in Table 3, the performance of a variety of models based on the base system is displayed.)
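For illustration only, and not as a characterization of Yin or of the specification, the claim 21 limitation of a simulated interaction that depends on both a relevance score and a position-dependent observation probability can be sketched as below. The function names, the 1/position^eta observation model, and the parameter eta are assumptions introduced here; neither the claim nor Yin specifies them.

```python
import random


def observation_probability(position: int, eta: float = 1.0) -> float:
    """Probability that a simulated user observes the item at a 1-based position.

    The 1 / position**eta decay is an assumed form chosen for illustration.
    """
    if position < 1:
        raise ValueError("position is 1-based")
    return 1.0 / (position ** eta)


def simulate_interactions(relevance, rng, eta=1.0):
    """Return one 0/1 simulated interaction per item, in presentation order.

    relevance: binary relevance labels (0 or 1), ordered as presented.
    rng: a random.Random instance, passed in so runs are reproducible.
    An item is interacted with only if it is both observed and relevant.
    """
    interactions = []
    for position, label in enumerate(relevance, start=1):
        observed = rng.random() < observation_probability(position, eta)
        interactions.append(1 if (observed and label == 1) else 0)
    return interactions


if __name__ == "__main__":
    rng = random.Random(0)
    print(simulate_interactions([1, 0, 1, 1], rng))
```

Under this sketch, an item in the first position is always observed, and an item with a relevance label of 0 is never interacted with regardless of position.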

Regarding claim 26
Yin teaches claim 21
Yin teaches, wherein the machine-learned ranking model comprises one or more regression trees. (Section 3 ¶03 “Search can be treated as a binary problem. In our experiments, we observe that gradient boosting trees” a GBDT is a regression model composed of regression trees.)

Regarding claim 28
Yin teaches claim 21
Yin teaches, wherein: the associated relevance score of each of the plurality of training pairs comprises a scalar value; and obtaining, by the one or more computing devices, the set of training data further comprises converting, by the one or more computing devices, the scalar value of each of the plurality of training pairs to a binary value based on whether the scalar value is above a scalar threshold. (Section 3.1 ¶01 “we first introduce logistic loss at the core ranking where we aim to reduce bad/fair URLs in top results. We first flat the labels “Perfect”, “Excellent” and “Good” to “Positive” (+1) and “Fair”, “Bad” to “Negative” (-1)… To distinguish Perfect/Excellent/Good, we scale the gradient, (which is also known as the pseudo-response of stage i), in different levels (e.g., 3 for Perfect, 2 for Excellent, and 1 for Good)” Initially the data is labeled “Perfect”, “Excellent”, “Good”, “Fair” and “Bad”. These labels must have been converted to scalar values for processing (e.g., 3, 2 and 1 for the positive labels in the quoted passage), and the labels are then converted to binary values, +1 or -1. Evidence that the labels are treated as numerical values is in Section 2.3, where Gi is the label weight: Section 2.3 ¶02 “For a ranked list of N documents, we use the following variation of DCG
[Equation image media_image1.png: the DCG formula quoted from Yin Section 2.3; not recoverable from the extracted text]
where Gi represents the weight assigned to the label of the document at position i. Higher degrees of relevance correspond to higher values of the weight….In the following sections, we will report DCG1, DCG3, DCG5 with N in {1,3,5}”)
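For illustration only, the label handling discussed for claim 28 can be sketched as below. The grade names, the +1/-1 flattening, and the 3/2/1 gradient scale factors come from the passages of Yin quoted above. The threshold helper mirrors the claim language, and the DCG function assumes the commonly used form in which the gain at position i is divided by log2(i + 1); the equation image is not available here, so that form is an assumption rather than a reproduction of Yin's formula. All function names and the example gain values are hypothetical.

```python
import math

# Grades and flattening as quoted from Yin Sections 2.3 and 3.1.
POSITIVE_GRADES = {"Perfect", "Excellent", "Good"}
NEGATIVE_GRADES = {"Fair", "Bad"}
# Gradient scale factors as quoted from Yin Section 3.1.
GRADIENT_SCALE = {"Perfect": 3, "Excellent": 2, "Good": 1}


def flatten_grade(grade: str) -> int:
    """Map an editorial grade to +1 (Positive) or -1 (Negative)."""
    if grade in POSITIVE_GRADES:
        return 1
    if grade in NEGATIVE_GRADES:
        return -1
    raise ValueError(f"unknown grade: {grade!r}")


def binarize(score: float, threshold: float) -> int:
    """Claim-28-style conversion: 1 if the scalar is above the threshold, else 0."""
    return 1 if score > threshold else 0


def dcg_at_n(gains, n: int) -> float:
    """DCG over the top n gains, assuming the form sum(G_i / log2(i + 1))."""
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains[:n], start=1))


if __name__ == "__main__":
    print(flatten_grade("Excellent"), flatten_grade("Bad"))
    print(binarize(2.0, threshold=1.5))
    print(dcg_at_n([3, 2, 1], n=3))  # hypothetical gain values
```

The sketch keeps the flattening step and the threshold step separate because the claim recites a threshold on a scalar value, while the quoted passage of Yin describes a mapping from named grades.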

Regarding claim 29
Yin teaches claim 21
Yin teaches, wherein determining, by the one or more computing devices, the simulated interaction for each training item comprises: determining, a preliminary ranking for each training item of the set of training data based at least in part on the associated relevance score of each training item; (Section 3.1 ¶02 “We name our learning algorithm LogisticRank…training data, which is collected through active learning and editor labels and includes about 2 million query-URL pairs” Section 3.2 “The reranking phase is applied after the results from the core ranking phase have been sorted, aggregated on one machine, and trimmed down to the best tens of results” An initial ranking is determined based on the labeled data.)
determining, by the one or more computing devices, a simulated interaction for each training item of the set of training data, the simulated interaction indicating whether a simulated user will interact with the training item, (Section 2.3 ¶03 “We sample 2,000 queries from a one year query log as our test queries and evaluate their search results in the Yahoo search engine by capturing the results” Section 3 ¶02 and Table 1 “We here introduce a unified method for web search which is based on logistic loss and incorporates the Perfect, Excellent and Good information into the model though scaling the gradient for GBDT. Online evaluation of the Yahoo search engine shows that this framework decreases the percent of bad URLs by 40%” the authors describe a search engine which takes training data and evaluates it using a machine learning method to determine which documents are relevant with a score. Determining whether a query-document pair has a high score, for example, is a “simulated interaction” between a user of the search engine and the data set. A positive score indicates that a user may interact with the item, while negative scores (bad URLs) are learned such that those items will not be presented to a hypothetical/interactive user. Presenting or not presenting a URL is a determination of whether a simulated user will interact with a training item.)
the simulated interaction based at least in part on the associated relevance score for the training item (Table 1; as stated previously, the machine learning ranking system determines a relevance score DCG1, DCG3, or DCG5 as shown in Table 1.)
and a probability that the simulated user will observe the training item based on a position of the training item within a candidate presentation; (Section 2.3 Table 1 “According to the popularity of the query, the evaluation query set is split into three parts: top, torso and tail. Top queries have very high impressions and, in terms of relevance… Tail and torso queries are relatively hard queries for search relevance” as shown in Table 1, each training item has an associated popularity label (top, torso, tail); these labels correspond to the “popularity” or “probability” of a query-document that a hypothetical/simulated user is likely to interact with. Tail queries, for example, appear at the end of a list of recommended results presented to a hypothetical user in a search engine, while top queries are presented at the beginning of a candidate presentation.)

Regarding claim 30
Yin teaches, A computer system comprising: one or more processors; and one or more non-transitory computer readable media that collectively store: (Section 4.3 ¶04 “The neural network is then trained on a GPU cluster” in order to operate the neural network on a GPU, a computer must have access to computer readable media; pg 323 Section 2.1 “The Yahoo search engine can retrieve the most relevant documents from a corpus of billions” Section 3.1 ¶02 “We name our learning algorithm LogisticRank and compare with the leading ranking algorithms: … in the Yahoo search engine. All three methods share the same training data, which is collected through active learning” the search algorithm presents a list of relevant documents using a machine-learned model, LogisticRank.)
obtaining data descriptive of a machine-learned ranking model, (Section 2.3 ¶03 Table 1 “To assess base relevance, we leverage professional editors’ judgment…We then use Discounted Cumulative Gain (DCG) as the metric to evaluate the search relevance performance… In the following sections, we will report DCG1, DCG3, DCG5” the method also produces performance metrics which are obtained based on a validation candidate set.)
wherein the machine-learned ranking model is configured to receive a set of candidate items and output a ranking of the candidate items in the set of candidate items; (Section 3.1 ¶02 “We name our learning algorithm LogisticRank…training data, which is collected through active learning and editor labels and includes about 2 million query-URL pairs” Section 3.2 “The reranking phase is applied after the results from the core ranking phase have been sorted, aggregated on one machine, and trimmed down to the best tens of results” before the reranking phase, a list of sorted and ranked URLs is output.)
obtaining, by the one or more computing devices, a set of training data, the set of training data comprising a plurality of training pairs, each training pair of the plurality of training pairs comprising a training item and an associated relevance score (Section 2.3 “To assess base relevance, we leverage professional editors’ judgment. Editors manually judge each query-URL pair, assigning one of five grades: Perfect, Excellent, Good, Fair or Bad” Section 3 “We here introduce a unified method for web search which is based on logistic loss and incorporates the Perfect, Excellent and Good information into the model though scaling the gradient for GBDT.” In order to train the model, the labeled query-URL pairs were received by the system. The relevance score is the label assigned by editors.)
determining, by the one or more computing devices, a simulated interaction for each training item of the set of training data, the simulated interaction indicating whether a simulated user will interact with the training item, (Section 2.3 ¶03 “We sample 2,000 queries from a one year query log as our test queries and evaluate their search results in the Yahoo search engine by capturing the results” Section 3 ¶02 and Table 1 “We here introduce a unified method for web search which is based on logistic loss and incorporates the Perfect, Excellent and Good information into the model though scaling the gradient for GBDT. Online evaluation of the Yahoo search engine shows that this framework decreases the percent of bad URLs by 40%” the authors describe a search engine which takes training data and evaluates it using a machine learning method to determine which documents are relevant with a score. Determining whether a query-document pair has a high score, for example, is a “simulated interaction” between a user of the search engine and the data set. A positive score indicates that a user may interact with the item, while negative scores (bad URLs) are learned such that those items will not be presented to a hypothetical/interactive user. Presenting or not presenting a URL is a determination of whether a simulated user will interact with a training item.)
the simulated interaction based at least in part on the associated relevance score for the training item (Table 1; as stated previously, the machine learning ranking system determines a relevance score DCG1, DCG3, or DCG5 as shown in Table 1.)
and a probability that the simulated user will observe the training item based on a position of the training item within a candidate presentation; (Section 2.3 Table 1 “According to the popularity of the query, the evaluation query set is split into three parts: top, torso and tail. Top queries have very high impressions and, in terms of relevance… Tail and torso queries are relatively hard queries for search relevance” as shown in Table 1, each training item has an associated popularity label (top, torso, tail); these labels correspond to the “popularity” or “probability” of a query-document that a hypothetical/simulated user is likely to interact with. Tail queries, for example, appear at the end of a list of recommended results presented to a hypothetical user in a search engine, while top queries are presented at the beginning of a candidate presentation.)
training, by the one or more computing devices, the machine-learned ranking model based on the simulated interactions for each training item of the training data. (Section 4.3 Table 3 “The neural network is then trained on a GPU cluster. After training the deep neural network, at runtime, the top level neurons will generate the embeddings of queries and URLs” the architecture described in the art is trained on the query-URL pairs from the candidate set. Also, as shown in Table 3, the performance of a variety of models based on the base system is displayed.)

Regarding claim 35
Claim 35 is rejected for the reasons set forth in claim 26 in connection with claim 30

Regarding claim 37
Yin teaches, One or more tangible, non-transitory computer readable media storing computer-readable instructions that when executed by one or more processors cause the one or more processors to perform operations, the operations comprising (Section 4.3 ¶04 “The neural network is then trained on a GPU cluster” in order to operate the neural network on a GPU, a computer must have access to computer readable media; pg 323 Section 2.1 “The Yahoo search engine can retrieve the most relevant documents from a corpus of billions” Section 3.1 ¶02 “We name our learning algorithm LogisticRank and compare with the leading ranking algorithms: … in the Yahoo search engine. All three methods share the same training data, which is collected through active learning” the search algorithm presents a list of relevant documents using a machine-learned model, LogisticRank.)
obtaining data descriptive of a machine-learned ranking model (Section 2.3 ¶03 Table 1 “To assess base relevance, we leverage professional editors’ judgment…We then use Discounted Cumulative Gain (DCG) as the metric to evaluate the search relevance performance… In the following sections, we will report DCG1, DCG3, DCG5” the method also produces performance metrics which are obtained based on a validation candidate set.)
wherein the machine-learned ranking model is configured to receive a set of candidate items and output a ranking of the candidate items in the set of candidate items (Section 3.1 ¶02 “We name our learning algorithm LogisticRank…training data, which is collected through active learning and editor labels and includes about 2 million query-URL pairs” Section 3.2 “The reranking phase is applied after the results from the core ranking phase have been sorted, aggregated on one machine, and trimmed down to the best tens of results” before the reranking phase, a list of sorted and ranked URLs is output.)
obtaining, by the one or more computing devices, a set of training data, the set of training data comprising a plurality of training pairs, each training pair of the plurality of training pairs comprising a training item and an associated relevance score (Section 2.3 “To assess base relevance, we leverage professional editors’ judgment. Editors manually judge each query-URL pair, assigning one of five grades: Perfect, Excellent, Good, Fair or Bad” Section 3 “We here introduce a unified method for web search which is based on logistic loss and incorporates the Perfect, Excellent and Good information into the model though scaling the gradient for GBDT.” In order to train the model, the labeled query-URL pairs were received by the system. The relevance score is the label assigned by editors.)
determining, by the one or more computing devices, a simulated interaction for each training item of the set of training data, the simulated interaction indicating whether a simulated user will interact with the training item, (Section 2.3 ¶03 “We sample 2,000 queries from a one year query log as our test queries and evaluate their search results in the Yahoo search engine by capturing the results” Section 3 ¶02 and Table 1 “We here introduce a unified method for web search which is based on logistic loss and incorporates the Perfect, Excellent and Good information into the model though scaling the gradient for GBDT. Online evaluation of the Yahoo search engine shows that this framework decreases the percent of bad URLs by 40%” the authors describe a search engine which takes training data and evaluates it using a machine learning method to determine which documents are relevant with a score. Determining whether a query-document pair has a high score, for example, is a “simulated interaction” between a user of the search engine and the data set. A positive score indicates that a user may interact with the item, while negative scores (bad URLs) are learned such that those items will not be presented to a hypothetical/interactive user. Presenting or not presenting a URL is a determination of whether a simulated user will interact with a training item.)
the simulated interaction based at least in part on the associated relevance score for the training item (Table 1; as stated previously, the machine learning ranking system determines a relevance score DCG1, DCG3, or DCG5 as shown in Table 1.)
and a probability that the simulated user will observe the training item based on a position of the training item within a candidate presentation; (Section 2.3 Table 1 “According to the popularity of the query, the evaluation query set is split into three parts: top, torso and tail. Top queries have very high impressions and, in terms of relevance… Tail and torso queries are relatively hard queries for search relevance” as shown in Table 1, each training item has an associated popularity label (top, torso, tail); these labels correspond to the “popularity” or “probability” of a query-document that a hypothetical/simulated user is likely to interact with. Tail queries, for example, appear at the end of a list of recommended results presented to a hypothetical user in a search engine, while top queries are presented at the beginning of a candidate presentation.)
training, by the one or more computing devices, the machine-learned ranking model based on the simulated interactions for each training item of the training data. (Section 4.3 Table 3 “The neural network is then trained on a GPU cluster. After training the deep neural network, at runtime, the top level neurons will generate the embeddings of queries and URLs” the architecture described in the art is trained on the query-URL pairs from the candidate set. Also, as shown in Table 3, the performance of a variety of models based on the base system is displayed.)

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 22-24 and 38-40 are rejected under 35 U.S.C. § 103 as being unpatentable over Yin in view of Shalom et al., “Beyond Collaborative Filtering: The List Recommendation Problem”, hereinafter Shalom.

Regarding claim 22
Yin teaches claim 21
Yin teaches, wherein the simulated interaction is based at least in part on the associated relevance score for the training item, (pg 323 Section 3.1 “We adopt GBDT (Gradient Boosting Decision Tree) [10] into the framework. Based on the gradient boosting framework, we first introduce logistic loss at the core ranking where we aim to reduce bad/fair URLs in top results. We first flat the labels “Perfect”, “Excellent” and “Good” to “Positive” (+1) and “Fair”, “Bad” to “Negative” (-1)” Section 3.1 ¶02 “We name our learning algorithm LogisticRank… training data, which is collected through active learning and editor labels and includes about 2 million query-URL pairs” the performance of the model, or simulated interaction, is based in part on the relevance-labeled query-URL pairs.) the probability that the simulated user will observe the training item based on the position of the training item within the candidate presentation, (Section 3.2 ¶02-¶03 “The reranking phase is applied after the results from the core ranking phase have been sorted, aggregated on one machine, and trimmed down to the best tens of results… we extract following contextual features for specific existing features – (1) Rank: sorting URLs by the feature value in ascending order to get the ranks of specific URLs” one of the features of the model is the rank, or probability, that a user will select a specific URL.)
Yin does not explicitly teach, [the simulated interaction is based on] a similarity metric representing a degree of similarity between the training item and one or more different training items of the plurality of training pairs.
Shalom, however, when addressing inter-item similarity features for use in a system that recommends a list of diverse and/or similar items, teaches [the simulated interaction is based on] a similarity metric representing a degree of similarity between the training item and one or more different training items of the plurality of training pairs. (Section 5.2 Similarity Diversity Features “CTR depends on interaction effects between predicted item rating, item similarities and list position. We can learn these complex interactions by encoding item-to-item similarity features in xtul. In this paper we used the Jaccard similarity between item i to item j… we found that for our purposes, similarities between adjacent items in the list are sufficient.”)
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the ranking system of Yin to consider the inter-item similarity discussed in the ranking system of Shalom. One would have been motivated to make such a combination because both Yin and Shalom disclose ranking systems trained on a variety of features. In particular, Shalom notes, “inter-item interactions have an effect on the list’s Click-Through Rate (CTR) that is unaccounted for using traditional CF approaches…Our approach accounts for inter-item interactions as well as additional information such as item fatigue, trendiness patterns, contextual information etc… showcases its effectiveness in real-world settings” (Shalom Abstract).

Regarding claim 23
Yin/Shalom teaches claim 22
Further Shalom teaches, wherein the similarity metric is based at least in part on a distance between a feature vector associated with the training item and one or more feature vectors respectively associated with the one or more different training items of the plurality of training pairs. (Section 5.2 Similarity Diversity Features “CTR depends on interaction effects between predicted item rating, item similarities and list position. We can learn these complex interactions by encoding item-to-item similarity features in xtul. In this paper we used the Jaccard similarity between item i to item j… we found that for our purposes, similarities between adjacent items in the list are sufficient.” The Jaccard similarity represents the similarity between two different items across users. The Jaccard coefficient presented here is a measure of distance between vectors, as a person having ordinary skill in the art (PHOSITA) would recognize. In this case the vectors xtul are the feature vectors.)
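For illustration only, the relationship between a Jaccard similarity and a distance can be sketched as below. Representing each item by the set of users who interacted with it is an assumption made here for the example; the quoted passage of Shalom states only that the Jaccard similarity between item i and item j is used. The function names are hypothetical.

```python
def jaccard_similarity(users_i, users_j) -> float:
    """Size of the intersection divided by the size of the union of two sets."""
    a, b = set(users_i), set(users_j)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets have similarity 0.
        return 0.0
    return len(a & b) / len(union)


def jaccard_distance(users_i, users_j) -> float:
    """Jaccard distance, defined as 1 minus the Jaccard similarity."""
    return 1.0 - jaccard_similarity(users_i, users_j)


if __name__ == "__main__":
    item_a = {"u1", "u2", "u3"}  # hypothetical user IDs
    item_b = {"u2", "u3", "u4"}
    print(jaccard_similarity(item_a, item_b), jaccard_distance(item_a, item_b))
```

A similarity of 1 corresponds to a distance of 0 (identical sets), and a similarity of 0 corresponds to a distance of 1 (disjoint sets).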

Regarding claim 24
Yin/Shalom teaches claim 23
Further Shalom teaches, wherein the distance between the feature vector associated with the training item and the one or more feature vectors respectively associated with the one or more different training items of the plurality of training items comprises a distance in a multi-dimensional vector space. (Section 5.2 Similarity/Diversity features “We can learn these complex interactions by encoding item-to-item similarity features in xtul…Hence, the number of features stays linear in K and the next K − 1 features in xtul are:… 
[Equation image media_image2.png: the similarity feature expression quoted from Shalom Section 5.2; not recoverable from the extracted text]
” the similarity vector encoding is the similarity or distance between multiple pairs of users; the set of all users is the multi-dimensional vector space, and the similarity between user 1 and 2, 2 and 3, etc. is a measure of the distance in vector space.)
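For illustration only, the quoted statement that similarities between adjacent items suffice, so that a list of K items contributes K − 1 similarity features, can be sketched as below. The function name, the data layout, and the overlap measure passed in are assumptions made for this example and are not taken from Shalom.

```python
from typing import Callable, Dict, List, Sequence, Set


def adjacent_similarity_features(
    slate: Sequence[str],
    item_sets: Dict[str, Set[str]],
    similarity: Callable[[Set[str], Set[str]], float],
) -> List[float]:
    """Return one similarity value per adjacent pair of items in the list.

    For a list of K items this yields K - 1 features, so the feature count
    grows linearly with K rather than with the number of all item pairs.
    """
    return [
        similarity(item_sets[left], item_sets[right])
        for left, right in zip(slate, slate[1:])
    ]


if __name__ == "__main__":
    # Hypothetical items, each represented by a set of user IDs.
    sets = {"a": {"u1", "u2"}, "b": {"u2", "u3"}, "c": {"u3", "u4"}}
    overlap = lambda x, y: len(x & y) / len(x | y) if (x | y) else 0.0
    print(adjacent_similarity_features(["a", "b", "c"], sets, overlap))
```

Passing the similarity measure in as a parameter keeps the sketch neutral about which measure is used.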

Regarding claim 38
Claim 38 is rejected for the reasons set forth in claim 22 in connection with claim 37
Regarding claim 39
Claim 39 is rejected for the reasons set forth in claim 23 in connection with claim 38
Regarding claim 40
Claim 40 is rejected for the reasons set forth in claim 24 in connection with claim 39


Claim(s) 25 and 31-34 are rejected under 35 U.S.C. § 103 as being unpatentable over Yin/Shalom further in view of Lin et al. US Document ID US 20150088791 A1 hereinafter Lin.

Regarding claim 25
Yin/Shalom teaches claim 22
Yin/Shalom does not explicitly teach, wherein the simulated interaction indicates that the simulated user will not interact with the training item when the similarity metric is lower than a similarity threshold.
Lin, however, when addressing removing items from a supervised machine learning model data set based on a similarity metric, teaches, wherein the simulated interaction indicates that the simulated user will not interact with the training item when the similarity metric is lower than a similarity threshold. (¶0028-0030 “Expectation maximization algorithm 122 is a maximum likelihood algorithm that fits Gaussian mixture model 134 to a set of training data, such as imbalanced training data set 118…Calculated maximum Mahalanobis distance 124 is a threshold Mahalanobis distance score with regard to generated data samples within sample data space 120. Calculated minimum Mahalanobis distance 126 is a Mahalanobis distance calculated by data processing system 100 from each generated data sample within sample data space 120 to a center of a closest Gaussian kernel in Gaussian mixture model 134. A Mahalanobis distance is a metric that provides a relative measure of a data point's distance from a common point, such as, for example, a determined center point within a Gaussian kernel. The Mahalanobis distance identifies a degree of similarity between the generated data samples for minority data class 132 and the recorded data samples associated with majority data class 130…If calculated minimum Mahalanobis distance 126 corresponding to a particular generated data sample is equal to or greater than maximum Mahalanobis distance 124, then data processing system 100 will discard or eliminate that particular data sample because that particular data sample is beyond the maximum threshold distance. In other words, data processing system 100 automatically disregards that particular data sample” Data samples are compared with the other data samples in the training set using the Mahalanobis distance, or similarity metric; when the similarity is lower than a threshold or, equivalently, the distance is greater than a threshold, the data sample is discarded from the set. Thus the user would not be presented with such a sample.)
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the supervised learning system in Yin/Shalom to assess the similarity of items in a training set.  One would have been motivated to make such a combination because both Yin/Shalom and Lin disclose a supervised machine learning system which benefits from training data preprocessed according to inter-item similarity. Lin presents methods for correcting class imbalances in training data sets. This is beneficial because “Data class imbalance creates difficulties for supervised machine learning programs and decreases classifier performance” (Lin ¶0005).
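
For illustration only, and not Lin's implementation: a minimal sketch of the thresholding described in the quoted passage, in which a sample is discarded when its minimum Mahalanobis distance to any kernel center is equal to or greater than a maximum distance. As a simplifying assumption, each Gaussian kernel is given a diagonal covariance (per-dimension variances); the function names, kernels, samples, and threshold are hypothetical.

```python
import math


def mahalanobis_diag(x, mean, variances):
    """Mahalanobis distance from x to mean, assuming a diagonal covariance."""
    return math.sqrt(
        sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, variances))
    )


def filter_samples(samples, kernels, max_distance):
    """Keep samples whose distance to the closest kernel center is below max_distance."""
    kept = []
    for x in samples:
        min_dist = min(mahalanobis_diag(x, mean, var) for mean, var in kernels)
        # Discard when the minimum distance is equal to or greater than the threshold.
        if min_dist < max_distance:
            kept.append(x)
    return kept


# Hypothetical kernels as (mean, per-dimension variances) and hypothetical samples.
kernels = [((0.0, 0.0), (1.0, 1.0)), ((5.0, 5.0), (1.0, 4.0))]
samples = [(0.5, 0.5), (5.0, 7.0), (10.0, 10.0)]

kept = filter_samples(samples, kernels, max_distance=2.0)
```

In this example the third sample lies far from both kernel centers and is the only one discarded.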

Regarding claim 31
Claim 31 is rejected for the reasons set forth in claim 21 in connection with claim 30

Regarding claim 32
Claim 32 is rejected for the reasons set forth in claim 23 in connection with claim 30

Regarding claim 33
Claim 33 is rejected for the reasons set forth in claim 24 in connection with claim 30

Regarding claim 34
Claim 34 is rejected for the reasons set forth in claim 25 in connection with claim 30

Claim(s) 27 and 36 are rejected under 35 U.S.C. § 103 as being unpatentable over Yin further in view of Chapelle et al. “Efficient algorithms for ranking with SVMs” hereinafter Chapelle.

Regarding claim 27
Yin teaches claim 2
Yin does not explicitly teach, wherein the machine-learned ranking model comprises a support vector machine.
Chapelle, however, when addressing the use of ranking algorithms that utilize support vector machines, teaches, wherein the machine-learned ranking model comprises a support vector machine. (pg 202 ¶01 “The RankSVM method forms a ranking model by minimizing a regularized margin based pairwise loss.”)
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the ranking system in Yin to utilize a support vector machine.  One would have been motivated to make such a combination because Yin presents a machine learning system which ranks candidate items, and Chapelle demonstrates the success of using a support vector machine to achieve a similar goal. Chapelle notes that “In this paper we have given fast algorithms for training RankSVM, demonstrated their usefulness in efficiently solving web page ranking… these are useful in the solution of large scale problems arising in practice” (Pg 214 Section 7 ¶01 Chapelle).
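
For illustration only: a sketch of a generic regularized, margin-based pairwise objective of the kind the quoted sentence describes. It assumes a linear scoring function, a plain hinge loss, and a weight C on the loss term; the exact loss and constants used in Chapelle are not reproduced here, and all names and data are hypothetical.

```python
def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pairwise_hinge_objective(w, preference_pairs, c=1.0):
    """0.5 * ||w||^2 plus c times the summed hinge loss over preference pairs.

    Each pair (x_preferred, x_other) asks that the preferred item score
    at least 1.0 higher than the other item.
    """
    regularizer = 0.5 * dot(w, w)
    loss = sum(
        max(0.0, 1.0 - (dot(w, x_pref) - dot(w, x_other)))
        for x_pref, x_other in preference_pairs
    )
    return regularizer + c * loss


# Hypothetical weights and feature vectors.
w = (1.0, 0.0)
pairs = [
    ((2.0, 0.0), (0.0, 0.0)),  # score margin 2.0, no loss
    ((0.5, 1.0), (0.0, 3.0)),  # score margin 0.5, hinge loss 0.5
]

objective = pairwise_hinge_objective(w, pairs, c=1.0)
```

Minimizing such an objective over w trades off a small weight norm against ordering each preferred item above its counterpart by a margin.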

Regarding claim 36
Claim 36 is rejected for the reasons set forth in claim 27 in connection with claim 30


Conclusion
Prior art: Cory Hicks US Document ID US 20090132459 A1 discloses a recommendation engine that filters similar items in the database according to a similarity score; items whose similarity score is greater than a threshold are not recommended together.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached M-F 7:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/J.R.G./
Examiner, Art Unit 2122
/KAKALI CHAKI/Supervisory Patent Examiner, Art Unit 2122