DETAILED ACTION
Status of Claims
This is a final office action on the merits in response to the amendments and arguments filed on 14 July 2022. 
Claims 1-3, 6, 11-13, and 16 were amended. Claims 1-20 are currently pending and have been examined. 

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 112(a)
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. Claims not listed below are rejected for dependency.

Amended Claim 2 recites the non-original limitation “generating embedding vectors via the second ReLU activation function of the multilayer neural network.” The most relevant original disclosure appears to be:
[0045] In various embodiments, the output layer from the second ReLU layer can generate an embedding vector which can represent a combination of the historical performance data for both the dense traffic features and the extracted non-structural meta features. In several embodiments, this embedding vector can be pre-computed offline and retrieved in near-real time (NRT) for use in an online learning service.

The above disclosure describes the output layer generating an embedding vector, and does not contemplate or appear to support the generation of embedding vectors “via” the output layer. Note that “via” is substantially broader than direct generation, as it encompasses situations where the output layer output is used by other elements to train and generate embedding vectors. However, such embodiments are not suggested or supported by this disclosure. The remainder of the original disclosure does not appear to support the identified limitation. Because the claims include a non-original limitation which does not appear to be supported by the original disclosure, one of ordinary skill in the art would not recognize applicant as possessing the claimed invention at the time of filing. Thus the claim is rejected for lack of written description support. Claim 11 is similarly rejected. 

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claims not listed below are rejected for dependency.

Claim 1 recites “wherein an output of the first ReLU activation function is input into a second ReLU activation function.” The scope of this claim would not be clear to one of ordinary skill in the art, rendering the claim indefinite. A plain reading of the limitation appears to suggest that the output of the first ReLU is directly input into a second ReLU without intervening operations. However, this interpretation would appear absurd to one of ordinary skill in the art based on their understanding of the ReLU activation function. The ReLU activation function is defined as ReLU(X) = max(0, X). In other words, for any value of X equal to or below zero, the ReLU will return “0”, and for any value of X greater than zero, the ReLU will return the input value X. Because the output of a ReLU is never negative, applying a second ReLU to that output returns it unchanged for any value of X (i.e., ReLU(ReLU(X)) = ReLU(X)). Thus performing a ReLU on the output of a ReLU produces no change in the value of the information passed through the second ReLU. One of ordinary skill in the art would not know whether the plain but redundant meaning of the limitation is correct, or whether the output is not input directly into the second ReLU but rather is first modified, for example by weights or an additive term, before that modified value is input into the second ReLU. The specification does not appear to resolve this ambiguity. As such, one of ordinary skill in the art would not be able to determine the scope of the claim, rendering the claim indefinite. Claim 11 is similarly rejected.
For the purposes of examination, the second interpretation, where the output of the first ReLU function may be modified before being input into the second ReLU function, will be used. 
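
For illustration only, the redundancy discussed above can be checked numerically. The following is a minimal Python sketch, not part of the record: the sample values and the intervening weight and bias (`w`, `b`) are hypothetical and are not drawn from the claims or the specification.

```python
def relu(x: float) -> float:
    """ReLU(X) = max(0, X)."""
    return max(0.0, x)


# Hypothetical sample inputs covering negative, zero, and positive values.
samples = [-2.5, -1.0, 0.0, 0.5, 3.0]

# First interpretation: the first ReLU output feeds the second ReLU directly.
single = [relu(x) for x in samples]
direct = [relu(relu(x)) for x in samples]

# Second interpretation: the first ReLU output is modified before the
# second ReLU. The weight and bias below are assumptions for illustration.
w, b = -1.5, 1.0
modified = [relu(w * relu(x) + b) for x in samples]

print(direct == single)    # the second ReLU changes nothing
print(modified == single)  # the intervening step changes the values
```

Under the first interpretation the second ReLU is a no-op on every sample, while under the second interpretation the values differ, which is the distinction the rejection draws.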

	Claim 3 recites “training the scoring layer using the inputs to predict a performance metric for the item” where the input is generated by “combining the embedding vectors with real-time online data for the item”, and the embedding vectors are previously generated and stored. The limitation describes training a layer using inputs based on stored data. One of ordinary skill in the art understands layer training to be a time-consuming process done in advance of when inference/prediction outputs are needed. Thus the fact that this training is based on stored information and real-time data would confuse one of ordinary skill in the art. In other words, the structure of the claim looks as if applicant is intending to claim an online inference process based on previously generated data, but the present claims directly claim training which is not associated with online inference at all. One of ordinary skill in the art would not know how to interpret the claim, rendering the claim indefinite. Claim 13 is similarly rejected.

	Claim 3 recites “combining the embedding vectors with real-time online data for the item to generate inputs; and training the scoring layer using the inputs to predict a performance model for the item”. The embeddings referenced by this limitation are recited in claim 2 (from which claim 3 depends) as “generated by the second ReLU activation function”, which is described in claim 1 (from which claim 2 depends) via “an output of the first ReLU activation function is input into a second ReLU activation function of the multilayer neural network to output metrics at a scoring layer”. The limitation of claim 1 indicates that the second ReLU is part of the scoring layer, but the scoring layer is described in claim 3 as using data generated in claim 2 by the second ReLU. So the claims appear to describe a second ReLU layer which both produces embeddings and receives those embeddings as an input. One of ordinary skill in the art would not know how to interpret this limitation, rendering the scope of the claim unclear, and the claim indefinite. Claim 13 is similarly rejected.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 5, 11, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (US 2020/0285937 A1) in view of Krishnamurthy et al. (US 2018/0300609 A1). 

Regarding Claim 1 and 11: Xu discloses a system comprising: one or more processors (See at least [0013]); and one or more non-transitory computer-readable media storing computing instructions (See at least [0013]) configured to run on the one or more processors and perform:
extracting meta features for an item to generate feature embeddings for the item (For example, when the target object is a hotel, the statistical characteristic data may include: a historical consumption price average value of the hotel in the last week, historical consumption price average values of other target objects except the hotel in the last week, etc. When the target object is a KTV, the statistical characteristic data may include: a historical consumption price median of the KTV in the last week, historical consumption price medians of other target objects except the KTV in the last week, etc. Similarly, when the target object is the hotel, the temporal sequence characteristic data may include: a historical consumption price average value of the hotel in the last 24 months. When the target object is the KTV, the temporal sequence characteristic data may include: a historical consumption price average value of the KTV in the last 24 months. See at least [0070]. Also: In FIG. 3, X.sub.1, X.sub.2, . . . , X.sub.n−1 and X.sub.n represent input characteristic data of the sample user. n is a positive integer greater than or equal to 2. A part of the characteristic data is the one or more statistical characteristic data of the sample user, and is represented by specific numerical values. See at least [0050] and Fig. 3). 
extracting, using a recurrent neural network, sequential data from traffic features for the item over a period of time (Step 204, corresponding one or more temporal characteristic data is determined by utilizing the recurrent neural network of the hybrid neural network prediction model on the basis of the one or more temporal sequence characteristic data of the target user. See at least [0078]. Also: The temporal sequence characteristic data of the sample user are processed by utilizing the recurrent neural network, distribution characteristics of historical prices of the sample user in time are learned according to the temporal sequence characteristic data, and the temporal sequence characteristic data are calculated to obtain temporal characteristic data to be transmitted to the traditional neural network. See at least [0053]. Also: The temporal sequence characteristic data include one or more of following sequences: a historical consumption price parameter sequence of the target object within a set time period, a historically browsed price parameter sequence of the target object within the set time period, a historical consumption price parameter sequence of the non-target object within the set time period, and a historically browsed price parameter sequence of the non-target object within the set time period. Parameters may be selected to be an average value, a maximum value, a minimum value, a variance, a median, etc. See at least [0033]). 
inputting the feature embeddings and the sequential data from the traffic features into a multilayer neural network comprising a first rectified linear unit (ReLU) activation function, wherein an output of the first ReLU activation function is input into a second ReLU activation function of the multilayer neural network to output metrics at a scoring layer, and wherein the scoring layer generates one or more performance metrics for the item (Step 102, a consumption capacity of the target user with respect to the target object is determined by utilizing a preset hybrid neural network prediction model on the basis of the one or more statistical characteristic data and the one or more temporal sequence characteristic data. See at least [0035]. Also: The hybrid neural network prediction model may include the recurrent neural network and the traditional neural network. The temporal sequence characteristic data of the sample user are processed by utilizing the recurrent neural network, distribution characteristics of historical prices of the sample user in time are learned according to the temporal sequence characteristic data, and the temporal sequence characteristic data are calculated to obtain temporal characteristic data to be transmitted to the traditional neural network. The traditional neural network may be a fully-connected deep neural network (DNN), and the temporal characteristic data and the statistical characteristic data of the sample user are processed by the traditional neural network. See at least [0053] and Fig. 5.
Also: Generally, the traditional neural network may be divided into an input layer 21, hidden layers 22 and an output layer 23, and then values of the hidden layers H.sub.1, H.sub.2 and H.sub.3 are respectively obtained via formulas (1) to (3): H.sub.1=g(a.sub.1X.sub.1+a.sub.2X.sub.2+a.sub.3M.sub.n−1+a.sub.4M.sub.n) (1), … g represents an activation function of the traditional neural network. See at least [0062]-[0063]. Also: A value of the output layer Z is obtained via a formula (4). Z=g(d.sub.1H.sub.1+d.sub.2H.sub.2+d.sub.3H.sub.3) (4). See at least [0064]. Also: g represents the activation function of the traditional neural network, d.sub.1 represents a weighted value from the hidden layer H.sub.1 to the output layer Z, d.sub.2 represents a weighted value from the hidden layer H.sub.2 to the output layer Z, and d.sub.3 represents a weighted value from the hidden layer H.sub.3 to the output layer Z. See at least [0065] and Fig. 3. Examiner’s note: The limitations describing the second ReLU as “output[ing] metrics at a scoring layer” and “the scoring layer generates one or more performance metrics for the item” are interpreted as describing the ReLU as part of the scoring layer, and producing the performance metrics for the item. Also: Examiner’s note: One of ordinary skill in the art would understand activation functions to include ReLU activation functions).
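
For illustration only, the structure of Xu's formulas (1) and (4) can be sketched as follows. This is a minimal Python sketch under stated assumptions: the activation function g is taken to be ReLU (per the examiner's note, not stated by Xu), every numeric weight and input is hypothetical, and the rows for H2 and H3 merely mirror the form of formula (1) because formulas (2) and (3) are not reproduced above.

```python
def relu(v: float) -> float:
    # Assumed choice for Xu's activation function g.
    return max(0.0, v)


def dense(inputs, weight_rows, g):
    """Apply g to one weighted sum of the inputs per row of weights."""
    return [g(sum(w * x for w, x in zip(row, inputs))) for row in weight_rows]


# Hypothetical inputs standing in for X1, X2 (statistical characteristic
# data) and M(n-1), M(n) (temporal characteristic data from the RNN).
inputs = [0.8, 0.2, 0.5, 0.1]

# Hypothetical weights: one row per hidden unit H1, H2, H3.
hidden_weights = [
    [0.5, -0.2, 0.3, 0.1],
    [-0.4, 0.6, 0.2, -0.1],
    [0.1, 0.1, -0.7, 0.9],
]
# Hypothetical weights d1, d2, d3 from the hidden units to the output Z.
output_weights = [[0.7, -0.3, 0.5]]

hidden = dense(inputs, hidden_weights, relu)   # form of formula (1)
z = dense(hidden, output_weights, relu)[0]     # form of formula (4)
print(hidden, z)
```

In this sketch the second application of g receives a weighted sum of the first layer's outputs rather than those outputs directly, which corresponds to the second interpretation adopted for examination.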

Xu does not disclose reducing, using a multilayer perceptron, a dimension of the feature embeddings to generate a representation vector for the meta features. 
Krishnamurthy teaches reducing, using a multilayer perceptron, a dimension of a feature embedding to generate a representation vector (the method 300 involves encoding the user-session representation vectors 180. The user-session representation encoder module 160 uses a neural network to encode each of the reduced-dimension intermediate sessions 442a-m into one of the user-session representation vectors 252a-m. See at least [0058]. Examiner’s note: one of ordinary skill in the art would understand a “multilayer perceptron” as encompassing a neural network). 
Xu provides a system which uses a hybrid neural network to process sequential data and non-sequential data to generate object predictions, upon which the claimed invention’s dimensional reduction of non-sequential data prior to input into the multilayer neural network can be seen as an improvement. However, Krishnamurthy demonstrates that the prior art already knew of using a neural network to reduce the dimension of non-sequential data. One of ordinary skill in the art could have trivially applied the techniques of Krishnamurthy to the non-sequential characteristic data of Xu. Further, one of ordinary skill in the art would have recognized that such an application of Krishnamurthy would have resulted in an improved system which could use fewer computing resources after the dimensional reduction of the non-sequential data. As such, the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosure of Xu and the teaching of Krishnamurthy.
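
For illustration only, dimensional reduction by a small multilayer perceptron can be sketched as below. This is a minimal Python sketch and does not reproduce the architecture of Krishnamurthy or of the claimed invention: the layer sizes (6 to 4 to 2), the untrained random weights, the ReLU activation, and all function names are assumptions.

```python
import random


def relu(v: float) -> float:
    return max(0.0, v)


def make_layer(rng: random.Random, n_in: int, n_out: int):
    """Build an n_out x n_in weight matrix with untrained random weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def forward(vector, layers):
    """Pass a vector through each fully connected layer in turn."""
    for weights in layers:
        vector = [relu(sum(w * x for w, x in zip(row, vector))) for row in weights]
    return vector


rng = random.Random(0)  # fixed seed so the sketch is repeatable
layers = [make_layer(rng, 6, 4), make_layer(rng, 4, 2)]

# Hypothetical 6-dimensional feature embedding for one item.
feature_embedding = [0.2, 0.9, 0.1, 0.4, 0.7, 0.3]
representation = forward(feature_embedding, layers)

print(len(feature_embedding), "->", len(representation))
```

The downstream network then receives a 2-element vector instead of a 6-element one, which is the source of the reduced computing cost referred to in the rationale above.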

Regarding Claim 5 and 15: Xu in view of Krishnamurthy makes obvious the above inventions. Additionally, Xu discloses wherein extracting the meta features comprises encoding hierarchy information about the item (For example, when the target object is a hotel, the statistical characteristic data may include: a historical consumption price average value of the hotel in the last week, historical consumption price average values of other target objects except the hotel in the last week, etc. When the target object is a KTV, the statistical characteristic data may include: a historical consumption price median of the KTV in the last week, historical consumption price medians of other target objects except the KTV in the last week, etc. Similarly, when the target object is the hotel, the temporal sequence characteristic data may include: a historical consumption price average value of the hotel in the last 24 months. When the target object is the KTV, the temporal sequence characteristic data may include: a historical consumption price average value of the KTV in the last 24 months. See at least [0070]. Also: In FIG. 3, X.sub.1, X.sub.2, . . . , X.sub.n−1 and X.sub.n represent input characteristic data of the sample user. n is a positive integer greater than or equal to 2. A part of the characteristic data is the one or more statistical characteristic data of the sample user, and is represented by specific numerical values. See at least [0050] and Fig. 3).

Claims 2-4 and 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (US 2020/0285937 A1) in view of Krishnamurthy et al. (US 2018/0300609 A1), and further in view of Wu et al. (US 2019/0034994 A1).

Regarding Claim 2 and 12: Xu in view of Krishnamurthy makes obvious the above inventions. Xu does not appear to disclose generating embedding vectors via the second ReLU activation function of the multilayer neural network; and storing the embedding vectors generated by the second ReLU activation function of the multilayer neural network.
	Wu teaches generating embedding vectors via a ReLU activation function of a multilayer neural network; and storing the embedding vectors generated by the ReLU activation function of the multilayer neural network (Because the personalized retrieval model must be quick, in particular embodiments, part of the necessary processing may be performed offline. As an example, because generating product-listing and content-interaction embeddings may be computationally expensive or time consuming, generating the embeddings may be performed in batch operations during off-peak usage hours. In particular embodiments, the social-networking system may generate the product-listing embedding for a particular product listing when the particular product listing is submitted to the marketplace or when the particular product listing is updated. … Although this disclosure describes the offline generation of embeddings in a particular manner, this disclosure contemplates the offline generation of embeddings in any suitable manner. See at least [0044]). 
	Xu and Krishnamurthy suggest a system which produces performance metrics for items based on a hybrid neural network, upon which the claimed invention’s creation and storage of embeddings can be seen as an improvement. However, Wu demonstrates that the prior art already knew of generating embeddings in advance to speed up model performance. One of ordinary skill in the art could have easily applied the techniques of Wu to the system of Xu and Krishnamurthy to generate embeddings in advance for subsequent generation of performance metrics. One of ordinary skill in the art would have recognized that such an application of Wu would have resulted in an improved system which could more quickly generate item performance information. As such, the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosure of Xu and the teaching of Krishnamurthy and Wu.
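
For illustration only, generating embeddings in advance and storing them for later retrieval can be sketched as follows. This is a minimal Python sketch, not Wu's implementation: the in-memory dictionary store, the function names, the item identifier, and the stand-in embedding computation are all hypothetical.

```python
from typing import Dict, List

# Hypothetical in-memory store; a real system might use a database or cache.
embedding_store: Dict[str, List[float]] = {}


def compute_embedding(features: List[float]) -> List[float]:
    """Stand-in for an expensive network pass (assumed: scale, then ReLU)."""
    return [max(0.0, 2.0 * f) for f in features]


def precompute_offline(catalog: Dict[str, List[float]]) -> None:
    """Batch step run ahead of time, e.g. during off-peak hours."""
    for item_id, features in catalog.items():
        embedding_store[item_id] = compute_embedding(features)


def retrieve(item_id: str) -> List[float]:
    """Online step: a lookup only, with no network computation."""
    return embedding_store[item_id]


precompute_offline({"item-1": [0.5, -1.0]})
print(retrieve("item-1"))
```

Separating the batch step from the lookup is what allows the online path to avoid the costly computation.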

Regarding Claim 3 and 13: Xu in view of Krishnamurthy and Wu makes obvious the above inventions. Additionally, Wu teaches retrieving the embedding vectors; combining the embedding vectors with real-time online data to generate inputs; and training the layer using the inputs to predict, wherein the layer comprises a machine learning model (where the social-networking system may receive a request, from a client system of a first user of an online social network, to access a marketplace comprising a plurality of products offered for sale by a second user of the online social network. At step 720, the social-networking system may filter a set of product listings based on a plurality of respective product-listing embeddings and a content-interaction embedding associated with the first user, wherein each of the product listings comprises a description of one of the products in the marketplace. At step 730, the social-networking system may rank each product listing in the filtered set based at least on a product-score representing a likelihood of the first user interacting with the respective product, the product-score being based on interaction information associated with the first user, product information associated with the product, and sparse information associated with the first user. See at least [0060]. Also: In particular embodiments, the social-networking system may filter the set of product listings by searching the product-listing embeddings for product-listings that are similar to the content interaction history of the first user. Various techniques may be used to compare the similarity of the embeddings. In particular embodiments, the social-networking system may search the embedding space of the product-listing embeddings for product-listing embeddings that are within a threshold distance of the content-interaction embedding of the first user, or the projection of the content-interaction embedding of first user onto the product-listing embedding space. 
In the illustrated example of FIG. 6, the threshold distance is depicted as an area 620 in embedding space 600. As an example and not by way of limitation, point 610 may be a point corresponding to the content-interaction embedding of the first user and the points identified as being within area 620 of point 610 may include points 640a corresponding to product-listing embeddings for products relevant to the first user. In particular embodiments, social-networking system may use any suitable technique for computing a distance including an inner product (or “dot product”) of the vector representations of the embeddings, locality-sensitive hashing, hierarchical clustering techniques, ball tree techniques, binary search tree techniques, a space-partitioning data structure for organizing points in a k-dimensional space (e.g., a k-dimensional tree), quantization, any other suitable search algorithm or technique, or any combination thereof. Although this disclosure describes and illustrates particular embodiments of FIG. 6 as being implemented by social-networking system , this disclosure contemplates any suitable embodiments of FIG. 6 occurring on any suitable interface and as being implemented by any suitable platform or system. As an example, and not by way of limitation, particular embodiments of FIG. 6 may be implemented by client system 130 or third-party system 170. More about embedding spaces and training models to map an object to vector representations and embedding spaces can be found in U.S. patent application Ser. No. 14/949,436, filed 23 Nov., 2015, which is incorporated by reference herein. Although this disclosure describes and illustrates embedding spaces in a particular manner, this disclosure contemplates any suitable manner of configured embedding spaces. See at least [0046]). The motivation to combine Xu, Krishnamurthy, and Wu is the same as explained under claim 2 above, and is incorporated herein. 
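
For illustration only, the threshold-distance filtering described in the quoted passage of Wu can be sketched as follows. This is a minimal Python sketch under stated assumptions: Euclidean distance is used as one possible distance measure, and the two-dimensional vectors, listing names, and threshold value are hypothetical.

```python
import math
from typing import Dict, List


def within_threshold(
    query: List[float],
    candidates: Dict[str, List[float]],
    threshold: float,
) -> List[str]:
    """Return the names of candidates within `threshold` of the query."""
    return [
        name
        for name, vector in candidates.items()
        if math.dist(query, vector) <= threshold
    ]


# Hypothetical content-interaction embedding for the first user.
user_embedding = [0.0, 0.0]

# Hypothetical product-listing embeddings.
listing_embeddings = {
    "listing-a": [0.3, 0.4],   # distance 0.5
    "listing-b": [1.0, 1.0],   # distance about 1.41
    "listing-c": [-0.6, 0.0],  # distance 0.6
}

nearby = within_threshold(user_embedding, listing_embeddings, threshold=0.75)
print(nearby)
```

The sketch performs a comparison over stored embeddings at request time; it involves no training step.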

Regarding Claim 4 and 14: Xu in view of Krishnamurthy and Wu makes obvious the above inventions. Additionally, Xu discloses wherein the machine-learning model comprises linear regression (a general machine learning model is adopted, such as a linear regression (LR) model. See at least [0082]). 

Claims 6, 7, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (US 2020/0285937 A1) in view of Krishnamurthy et al. (US 2018/0300609 A1), and further in view of Maldonado et al. (US 2021/0366022 A1). 

Regarding Claim 6 and 16: Xu in view of Krishnamurthy makes obvious the above inventions. Xu does not appear to disclose wherein extracting the meta features comprises using an NLP-based embedding algorithm to embed text features about the item, wherein the text features comprise one or more of a title of the item and a description of the item. However, Maldonado teaches extracting the meta features comprises using an NLP-based embedding algorithm to embed text features about the item, wherein the text features comprise one or more of a title of the item and a description of the item (The word vectorisation component 308 takes a text description 304 of the item and embeds the description as a representative word vector 314. This component 308 is a pre-trained word embedding model. In contrast to the image feature extractor 306, the training of the word vectorization component 308 is domain-specific (i.e. specific to the types of items within the system, such as clothes fashion items). In particular, the pre-trained word embedding model uses a corpus of domain-specific language to learn a vector representation of words based on their respective contexts within sentences. This is based on a domain-specific vocabulary, specific to the class of items under consideration. For example, in the case of fashion items, this could be a fashion-specific vocabulary. See at least [0059] and Fig. 3A). 
Xu and Krishnamurthy suggest a system that generates object performance predictions based on neural network processing of input data, upon which the claimed invention’s use of an NLP embedding algorithm to generate input data can be seen as an improvement. However, Maldonado demonstrates that the prior art already knew of word embeddings (an NLP algorithm) to generate input data. One of ordinary skill in the art could have trivially applied the techniques of Maldonado to the system of Xu and Krishnamurthy. Further, one of ordinary skill in the art would have recognized that such an application would have predictably resulted in an improved system which would generate predictions based on descriptive information. As such, the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosure of Xu and the teaching of Krishnamurthy and Maldonado. 

Regarding Claim 7 and 17: Xu in view of Krishnamurthy makes obvious the above inventions. Xu does not appear to disclose wherein extracting the meta features comprises extracting data from one or more images of the item using a convolutional neural network. However, Maldonado teaches extracting meta features comprising extracting data from one or more images of an item using a convolutional neural network (The item images are input to the image feature extractor component 306. The image feature extractor 306 outputs a feature representation 312 of the item image and may, for example, have a Convolutional Neural Network (CNN) or other neural network architecture. The image feature extractor 306 may, for example, comprise one or more trained CNN layers, which extract semantically rich image features 312 from the item image data 302. See at least [0057] and Fig. 3A). 
Xu and Krishnamurthy suggest a system that generates object performance predictions based on neural network processing of input data, upon which the claimed invention’s use of a convolutional neural network to generate input data can be seen as an improvement. However, Maldonado demonstrates that the prior art already knew of using a CNN on object image data to generate input data. One of ordinary skill in the art could have trivially applied the techniques of Maldonado to the system of Xu and Krishnamurthy. Further, one of ordinary skill in the art would have recognized that such an application would have predictably resulted in an improved system which would generate predictions based on image information. As such, the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosure of Xu and the teaching of Krishnamurthy and Maldonado. 

Claims 8-10 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Xu et al. (US 2020/0285937 A1) in view of Krishnamurthy et al. (US 2018/0300609 A1), and further in view of Cooley et al. (US 2012/0130798 A1).
 
Regarding Claim 8 and 18: Xu in view of Krishnamurthy makes obvious the above inventions. Xu does not appear to disclose wherein the one or more performance metrics further comprise a conversion rate for the item, an order size for the item, and a contributed profit per order for the item. However, Cooley teaches conversion rate, order size and contributed profit per order (Also, as described above, an event is an action considered to be of value to a business, such as conversions, orders or sales, leads, or applications or registrations, and events can be assigned a value, such as revenue, margin, or profit. Therefore, the events-per-view or events-per-click models 116 and the number of events models 152 can predict values such as a number of conversions, a number of conversions per click or per view, a number of orders or sales, a number of orders or sales per click or per view, a number of leads, a number of leads per click or per view, a number of applications or registrations, a number of applications or registrations per click or per view, etc. Similarly, the value-per-event models 122 and the total value models 150 can predict values such as a revenue, a revenue per click or per view, a revenue per conversion, a margin, a margin per click or per view, a margin per conversion, a profit, a profit per click or per view, a profit per conversion, etc. It should be understood that more than one event and/or value model may be used in a particular model sequence and multiple event. See at least [0043]). 
	Xu and Krishnamurthy suggest a system that generates object performance predictions based on neural network processing, which differs from the claimed invention by the substitution of Xu’s pricing metric with conversion rate, order size, and profit per order. However, Cooley demonstrates that the prior art was aware of such metrics. One of ordinary skill in the art could have trivially substituted the metrics of Cooley into the system of Xu and Krishnamurthy to produce a system that generates object performance predictions based on neural network processing. Further, one of ordinary skill in the art would have recognized that such a substitution would have predictably resulted in a system which would use neural networks to predict advertisement performance metrics. As such, the claimed invention would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention in view of the disclosure of Xu and the teaching of Krishnamurthy and Cooley. 
	
Regarding Claims 9 and 19: Xu in view of Krishnamurthy and Cooley makes obvious the above inventions. Additionally, Cooley teaches generating a revenue per click metric based on the conversion rate and the order size (Also, as described above, an event is an action considered to be of value to a business, such as conversions, orders or sales, leads, or applications or registrations, and events can be assigned a value, such as revenue, margin, or profit. Therefore, the events-per-view or events-per-click models 116 and the number of events models 152 can predict values such as a number of conversions, a number of conversions per click or per view, a number of orders or sales, a number of orders or sales per click or per view, a number of leads, a number of leads per click or per view, a number of applications or registrations, a number of applications or registrations per click or per view, etc. Similarly, the value-per-event models 122 and the total value models 150 can predict values such as a revenue, a revenue per click or per view, a revenue per conversion, a margin, a margin per click or per view, a margin per conversion, a profit, a profit per click or per view, a profit per conversion, etc. It should be understood that more than one event and/or value model may be used in a particular model sequence and multiple event. See at least [0043]). The motivation to combine Xu, Krishnamurthy, and Cooley is the same as explained under claim 8 above, and is incorporated herein.

Regarding Claims 10 and 20: Xu in view of Krishnamurthy and Cooley makes obvious the above inventions. Additionally, Cooley teaches generating a contributed profit per click metric based on the conversion rate and the contributed profit per order (Also, as described above, an event is an action considered to be of value to a business, such as conversions, orders or sales, leads, or applications or registrations, and events can be assigned a value, such as revenue, margin, or profit. Therefore, the events-per-view or events-per-click models 116 and the number of events models 152 can predict values such as a number of conversions, a number of conversions per click or per view, a number of orders or sales, a number of orders or sales per click or per view, a number of leads, a number of leads per click or per view, a number of applications or registrations, a number of applications or registrations per click or per view, etc. Similarly, the value-per-event models 122 and the total value models 150 can predict values such as a revenue, a revenue per click or per view, a revenue per conversion, a margin, a margin per click or per view, a margin per conversion, a profit, a profit per click or per view, a profit per conversion, etc. It should be understood that more than one event and/or value model may be used in a particular model sequence and multiple event. See at least [0043]). The motivation to combine Xu, Krishnamurthy, and Cooley is the same as explained under claim 8 above, and is incorporated herein.
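
For illustration only, the per-click metrics recited in claims 9, 10, 19, and 20 can be sketched as simple arithmetic. The sketch assumes that "based on" means a product of the conversion rate and the per-order value; neither the claims nor Cooley is quoted here as stating that formula. The function name and all numeric inputs are hypothetical.

```python
def per_click_metric(conversion_rate: float, value_per_order: float) -> float:
    """Expected value per click, ASSUMING a simple product:
    (orders per click) * (value per order)."""
    if not 0.0 <= conversion_rate <= 1.0:
        raise ValueError("conversion_rate must be between 0 and 1")
    return conversion_rate * value_per_order


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
conversion_rate = 0.05              # 5 orders per 100 clicks
order_size = 40.0                   # average revenue per order
contributed_profit_per_order = 8.0  # average contributed profit per order

# Claims 9 and 19: revenue per click from conversion rate and order size.
revenue_per_click = per_click_metric(conversion_rate, order_size)

# Claims 10 and 20: contributed profit per click from conversion rate
# and contributed profit per order.
profit_per_click = per_click_metric(conversion_rate, contributed_profit_per_order)
```

Under this assumed reading, both per-click metrics share the conversion rate as a common factor and differ only in the per-order value that is multiplied in.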

Response to Arguments
Applicant’s Argument Regarding 112(b) Rejections of claims 1-20: Applicant amends claims 1, 2, 3, 11, 12, and 13.
Examiner’s Response: Applicant's amendments filed 14 July 2022 have been fully considered. The amendments resolve some of the previously raised issues and introduce new ones. The rejections are withdrawn or updated as appropriate. 

Applicant’s Argument Regarding 103 Rejections of claims 1-20: Nowhere does Xu teach or suggest “inputting the representation vector for the meta features and the sequential data from the traffic features into a multi-layer neural network comprising a first rectified linear unit ReLU activation function, wherein an output of the first ReLU activation function is input into a second ReLU activation function of the multilayer neural network to output metrics at a scoring layer, and where the scoring layer generates one or more performance metrics for the item” as required by amended independent claim 1 or the similar limitations of independent claim 11. 
Examiner’s Response: Applicant's arguments filed 14 July 2022 have been fully considered but they are not persuasive. Examiner notes that Xu was not relied on for every aspect of this limitation. One cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). As explained above, Xu does teach inputting the feature embeddings and the sequential data from the traffic features into a multilayer neural network comprising a first rectified linear unit (ReLU) activation function, wherein an output of the first ReLU activation function is input into a second ReLU activation function of the multilayer neural network to output metrics at a scoring layer, and wherein the scoring layer generates one or more performance metrics for the item, and Krishnamurthy teaches a representation vector for the meta features. As such, Applicant’s argument is unpersuasive. Additionally, Examiner notes that Applicant’s argument addresses only the entire, lengthy limitation and does not identify any specific deficiencies based on any specific disclosure of the references. As such, Examiner does not have any indication of what further explanation might be helpful to Applicant. 
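
For illustration only, the structure recited in the disputed limitation (two ReLU activations in sequence feeding a scoring layer) can be sketched as follows. This is a minimal sketch under stated assumptions, not Applicant's implementation and not the disclosure of Xu or Krishnamurthy: the inputs are assumed to be combined by concatenation, the scoring layer is assumed to be a plain linear layer, and every name, weight, and input value is hypothetical.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    """Rectified linear unit: max(0, x) applied element-wise."""
    return [max(0.0, x) for x in v]


def dense(weights: Matrix, bias: Vector, v: Vector) -> Vector:
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def forward(representation: Vector, sequential: Vector,
            params: Dict[str, list]) -> Vector:
    # ASSUMPTION: the two inputs are combined by simple concatenation.
    x = representation + sequential
    h1 = relu(dense(params["W1"], params["b1"], x))   # first ReLU
    h2 = relu(dense(params["W2"], params["b2"], h1))  # second ReLU, fed by the first
    # ASSUMPTION: the scoring layer is linear, one output per metric.
    return dense(params["W3"], params["b3"], h2)


# Hypothetical weights and inputs, chosen only so the result is easy to check.
params = {
    "W1": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], "b1": [0.0, 0.0],
    "W2": [[2.0, 0.0], [-1.0, 0.0]],          "b2": [0.0, 0.0],
    "W3": [[0.5, 1.0]],                       "b3": [0.1],
}
metrics = forward([1.0, -2.0], [0.5], params)
```

In this toy configuration the first ReLU zeroes the negative pre-activation, the second ReLU does the same for its own input, and the scoring layer emits a single metric value.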

Additional Considerations
The prior art made of record and not relied upon that is considered pertinent to applicant’s disclosure can be found in the PTO-892 Notice of References Cited. 
Sharma (Activation Functions in Neural Networks) notes “The ReLU is the most used activation function in the world right now. Since, it is used in almost all the convolutional neural networks or deep learning.” 




Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Bion A Shelden whose telephone number is (571)270-0515. The examiner can normally be reached M-F, 12pm-10pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hajime S Rojas, can be reached at (571)270-5491. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/Bion A Shelden/
Examiner, Art Unit 3681
2022-08-07