Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


	Claims 1, 3-4, 11, 13, and 15-16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Tang (US 20190188561 A1).

Regarding claim 1, Tang teaches a method comprising: 
while processing a set of training data using one or more machine learning techniques: learning an embedding for each attribute value of a first plurality of attribute values of multiple content items ([Tang, 0018] “The online system 130 trains a deep neural network (DNN) using labels representing associations between users and events. In an embodiment, the online system trains two neural networks (e.g., a user neural network 142 and an event neural network 144)”, teaches a first and a second neural network, [Tang, 0050] “The training module 440 trains the DNN 430. In an embodiment, the training module 430 trains the DNN 430 by comparing the result of executing the DNN 430 for a sample input data with the expected label associated with the sample input data to determine a measure of error in the generated result … This process is repeated iteratively until an aggregate metric based on the error is determined to be below certain threshold value. The training module 440 repeats the process of training the DNN 430 through multiple iterations …”, teaches that sample input data is needed to train the DNN, [Tang, 0051] “The neural network module 230 is executed during an online processing when the online system receives events and identifies content items associated with the events for distributing to users … The embeddings (i.e., the user embedding 460 and the event embedding 465) each represent the sample input data at a layer within the neural network”, the user embedding corresponds to the first attribute values, and the sample input is the user embedding. The user embedding is input to the neural network to train the neural network, 
[Tang, 0031] “The action log 220 may be used by the online system 130 to track user actions on the online system 130, ... Users may interact with various objects on the online system 130, and information describing these interactions is stored in the action log 220. Examples of interactions with objects include: commenting on posts, commenting on a page associated with a third party event, sharing links, checking-in to physical locations via a client device 110, accessing content items, and any other suitable interactions. Additional examples of interactions with objects on the online system 130 that are included in the action log 220 include: commenting on a photo album, … expressing a preference for an object (“liking” the object), and engaging in a transaction. Additionally, the action log 220 may record a user's interactions with advertisements on the online system 130 …” teaches the action log contains user interaction information with the event and system, 
[Tang, 0037] “The neural network module 230 trains a DNN to extract embeddings from a user vector (e.g., a vector associated with a user 154) and an event vector (e.g., a vector associated with an event 152). The neural network module 230 is an embodiment of the neural network module 134 which is described above in conjunction with FIG. 1A. An example user vector for a user of the online system 130 includes information indicating that the user is 25 years old, is a member of a 3 concert groups in the online system 130, has expressed an interest in the “Beatles,” has attended 3 “John Lennon” concerts in past 6 months, and lives 25 miles from New York City. An example event vector associated with a “John Lennon” concert includes information indicating that the event is in “New York City” and a target demographic is men and women between the ages of 25 and 50. In various embodiments, the event vector may also include information about interactions by other users associated with the user ... The neural network module 230 is further described below in conjunction with the FIG. 4”, teaches what user and event data are used in the training process), 
learning an embedding for each attribute value of a second plurality of attribute values of multiple entities ([Tang, 0018] “The online system 130 trains a deep neural network (DNN) using labels representing associations between users and events. In an embodiment, the online system trains two neural networks (e.g., a user neural network 142 and an event neural network 144)”, teaches training the first and the second neural networks,
[Tang, 0050] “The training module 440 trains the DNN 430. In an embodiment, the training module 430 trains the DNN 430 by comparing the result of executing the DNN 430 for a sample input data with the expected label associated with the sample input data to determine a measure of error in the generated result ... The training module 440 repeats the process of training the DNN 430 through multiple iterations …”, teaches the sample input data is input to the DNN to train them,
[Tang, 0051] “The neural network module 230 is executed during an online processing when the online system receives events and identifies content items associated with the events for distributing to users … The embeddings (i.e., the user embedding 460 and the event embedding 465) each represent the sample input data at a layer within the neural network”, the event embedding is the second attribute values. The user embedding is input to the neural network to train the neural network), 
learning weights for a set of contextual features ([Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144. The neural network 455 is configured to generate an output associated with a relationship between a user vector and an event vector. In various embodiments, the neural network 455 changes the weights of neural network 142 and neural network 144 based on various learning algorithms. Here, changing the weights of neural network 142 and neural networks 144 comprises adjusting the weights between individual neurons of the hidden layers to reduce a total measure of error between a predicted output and actual output”, 
[Tang, 0043] “… the input vector 310 is a vector comprising content items associated with a user of the online system 130 (e.g., items stored in the user profile store 205, action log 220, and edge store 225). For example, an input vector 310 comprises demographic information (e.g., age group of the user), groups that the user is associated with (e.g., a member of a page associated with the event), a geographic location (e.g., the user is within 10 miles of a geographic location associated with an event) …”, teaches that the contextual features, such as demographic information, age, and location, are input to the neural network); 
in response to receiving a content request: identifying a particular content item that is associated with one or more targeting criteria that are satisfied based on the content request ([Tang, 0016] “The online system 130 is a computer system that includes software and hardware for performing a group of coordinated functions or tasks. The online system 130 includes multiple components for providing a framework for upselling ticketing events to one or more users of the online system 130. Here, the online system 130 may additionally or alternatively be a social networking system. The online system 130 is configured to receive requests from one or more client devices 110 and third party systems 120 and execute computer programs associated with the received requests. As an example, the online system 130 stores content associated with one or more users and content associated with an event in order to provide the event to a user of the one or more users with a threshold probability of attending the event …”, [Tang, 0043] “… the input vector 310 is a vector comprising content items associated with a user of the online system 130 (e.g., items stored in the user profile store 205, action log 220, and edge store 225). For example, an input vector 310 comprises demographic information (e.g., age group of the user), groups that the user is associated with (e.g., a member of a page associated with the event), a geographic location (e.g., the user is within 10 miles of a geographic location associated with an event), a number and/or type of actions performed by the user on a content item associated with an event either on or off the online system 130, or any combination thereof. In other embodiments, the input vector 310 additionally, or alternatively, comprises one or more items associated with an event (e.g., items stored in the event store 210). 
For example, the input vector 310 may include information describing a time and date associated with the event, a number of other users of the online system 130 who have expressed an interest in the event, a number of messages associated with the event, or any combination thereof”); 
identifying a first set of embeddings for the particular content item ([Tang, 0043] “… the input vector 310 is a vector comprising content items associated with a user of the online system 130 (e.g., items stored in the user profile store 205, action log 220, and edge store 225). For example, an input vector 310 comprises demographic information (e.g., age group of the user), groups that the user is associated with (e.g., a member of a page associated with the event), a geographic location (e.g., the user is within 10 miles of a geographic location associated with an event), a number and/or type of actions performed by the user on a content item associated with an event either on or off the online system 130, or any combination thereof. In other embodiments, the input vector 310 additionally, or alternatively, comprises one or more items associated with an event (e.g., items stored in the event store 210). For example, the input vector 310 may include information describing a time and date associated with the event, a number of other users of the online system 130 who have expressed an interest in the event, a number of messages associated with the event, or any combination thereof”, [Tang, 0051] “... The online system provides user and event data to the neural network 142 and neural network 144 to generate a user embedding 460 and an event embedding 465, respectively …”); 
identifying a particular requesting entity that initiated the content request ([Tang, 0016] “The online system 130 is a computer system that includes software and hardware for performing a group of coordinated functions or tasks. The online system 130 includes multiple components for providing a framework for upselling ticketing events to one or more users of the online system 130. Here, the online system 130 may additionally or alternatively be a social networking system. The online system 130 is configured to receive requests from one or more client devices 110 and third party systems 120 and execute computer programs associated with the received requests. As an example, the online system 130 stores content associated with one or more users and content associated with an event in order to provide the event to a user of the one or more users with a threshold probability of attending the event …”, [Tang, 0017] “The online system 130 provides a framework to provide content items describing events to one or more user of client devices 110. A content item associated with an event is also referred to herein as an event content. In FIG. 1A, a user profile 132 is associated with a user of the online system 130 and an event 134 associated with the third-party system 120 …”, teaches that the user (the requesting entity) initiates the request and that the system describes events to one or more users of client devices. The system identifies the user and describes an event that is related to the user); 
identifying a second set of embeddings for the particular requesting entity ([Tang, 0017] “The online system 130 provides a framework to provide content items describing events to one or more user of client devices 110. A content item associated with an event is also referred to herein as an event content. In FIG. 1A, a user profile 132 is associated with a user of the online system 130 and an event 134 associated with the third-party system 120 …”, [Tang, 0036] “… A stored neural network is configured to receive, as an input, an input vector (e.g., a vector associated with an event 152 or a vector associated with a user 154) via an input layer. Here, a received input vector is associated with one of a user of the online system 130 or an event provided by a third-party system (e.g., third-party system 120). Each of the one or more neural networks stored in the neural network store 230 generates an output that is some function of the received input vector …”, [Tang, 0051] “... The online system provides user and event data to the neural network 142 and neural network 144 to generate a user embedding 460 and an event embedding 465, respectively …”); 
identifying a set of feature values for the set of contextual features ([Tang, 0043] “For example, an input vector 310 comprises demographic information (e.g., age group of the user), groups that the user is associated with (e.g., a member of a page associated with the event), a geographic location (e.g., the user is within 10 miles of a geographic location associated with an event)”, corresponds to the contextual features that are input to the neural network,
[Tang, 0044] “The neural network 310 generates as output comprising value, or a score. An output generated by the neural network 310 is, for example, a score indicating a likelihood of the input user attending the input event”, teaches that the system identifies the contextual feature (a score indicating a likelihood of the user attending the event), [Tang, 0043] “… the input vector 310 is a vector comprising content items associated with a user of the online system 130 (e.g., items stored in the user profile store 205, action log 220, and edge store 225) …”); 
selecting the particular content item based on the first set of embeddings, the second set of embeddings, the set of feature values, and the weights ([Tang, 0023] “FIG. 1B is a graphical illustration 102 of the function performed by the user-event mapping module 136, in accordance with an embodiment. As shown in FIG. 1B, square shaped data points represent users and circular data points represent events. In an embodiment, the online system uses a distance between a vector representing a user and a vector representing an event as a measure of likelihood of the user attending the event or the measure of likelihood that the user is interested in the event. Small distances between a user and an event in the latent space 156 indicate high likelihood of the user attending the event and large distance distances between a user and an event in the latent space 156 indicate less likelihood of the user attending the event. Accordingly, the online system determines the measure of likelihood of a user attending an event as a value inversely related to the distance between the data points corresponding to the event and the user in the latent space 156”, [Tang, 0051] “The online system provides user and event data to the neural network 142 and neural network 144 to generate a user embedding 460 and an event embedding 465, respectively. The embeddings (i.e., the user embedding 460 and the event embedding 465) each represent the sample input data at a layer within the neural network. An embedding is represented as a vector having one or more dimensions. A user embedding selection module 445 selects embeddings from a hidden layer of the neural network 142. An event embedding selection module 450 selects embeddings from a hidden layer of the neural network 144. In an embodiment, both the user embedding selection module 445 and the event embedding selection module 450 select embeddings from the last hidden layer of the neural networks 142 and 144, respectively. 
The user embedding selection module 445 and the event embedding selection module 450 both provide the selected embeddings to the feature mapping module 136. As further described above, the feature mapping module 136 is configured to determine a probability of attendance of the event for a user based on the selected embeddings associated with the user”, [Tang, 0043] “In one embodiment, the input vector 310 is a vector comprising content items associated with a user of the online system 130 (e.g., items stored in the user profile store 205, action log 220, and edge store 225). For example, an input vector 310 comprises demographic information (e.g., age group of the user), groups that the user is associated with (e.g., a member of a page associated with the event), a geographic location (e.g., the user is within 10 miles of a geographic location associated with an event), a number and/or type of actions performed by the user on a content item associated with an event either on or off the online system 130, or any combination thereof. In other embodiments, the input vector 310 additionally, or alternatively, comprises one or more items associated with an event (e.g., items stored in the event store 210) …”, [Tang, 0044] “The neural network 310 generates as output comprising value, or a score. An output generated by the neural network 310 is, for example, a score indicating a likelihood of the input user attending the input event”, teaches that the neural network processes the input from the edge (features) and gives a value or score, [Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144. The neural network 455 is configured to generate an output associated with a relationship between a user vector and an event vector. In various embodiments, the neural network 455 changes the weights of neural network 142 and neural network 144 based on various learning algorithms. 
Here, changing the weights of neural network 142 and neural networks 144 comprises adjusting the weights between individual neurons of the hidden layers to reduce a total measure of error between a predicted output and actual output”, all the neural network operations are based on weights); 
wherein the method is performed by one or more computing devices ([Tang, 0016] “The online system 130 is a computer system that includes software and hardware for performing a group of coordinated functions or tasks”).

Regarding claim 15, Tang teaches one or more storage media storing instructions which, when executed by one or more processors ([Tang, 0061] “Embodiments may also relate to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, and/or it may comprise a general-purpose computing device selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a non-transitory, tangible computer readable storage medium, or any type of media suitable for storing electronic instructions, which may be coupled to a computer system bus. Furthermore, any computing systems referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability”).
	Claim 15 is a storage media claim having limitations similar to those of method claim 1. Therefore, it is rejected under the same rationale as claim 1 above.


Regarding claim 3, Tang teaches wherein a first plurality of attributes that correspond to the first plurality of attribute values comprises one or more of a content provider identifier, a content delivery campaign identifier, or a content item identifier ([Tang, 0019] “… To address this technical challenge or cold-start problem in which the model cannot draw inferences yet because it does not have sufficient data, the system uses two models that it can train independently to produce vectors that can be updated at different rates. While the event-based vector is recomputed more regularly as events change frequently, the user vector can be recomputed only periodically as user preferences do not change as quickly”, [Tang, 0054] “FIG. 5 illustrates a process for upselling ticketing events to users, in accordance with an embodiment. The online system (e.g., online system 130) receives 510 content describing an event from a third-party system (e.g., third-party system 120). Content describing the event includes a venue associated with the event, a geographic location associated with the event, a time, and date associated with the event, ticket prices, and any combination thereof. The online system identifies 520 a plurality of users as potential users for receiving content associated with the event. In some embodiments the online system determines the plurality of users based on simple criteria, for example, users in close proximity of the location of the event”, the content describing the event corresponds to the first plurality of attributes).
	Claim 16 is a storage media claim having limitations similar to those of method claim 3. Therefore, it is rejected under the same rationale as claim 3 above.

Regarding claim 4, Tang teaches wherein a second plurality of attributes that correspond to the second plurality of attribute values comprises two or more of an employer identifier, a job title identifier, a skill identifier, or an industry identifier ([Tang, 0027] “Each user of the online system 130 is associated with a user profile ... Examples of information stored in a user profile include biographic, demographic, and other types of descriptive information, such as work experience, educational history, gender, interests, hobbies or preferences, location and the like ... A user profile in the user profile store 205 may also maintain references to one or more previous events attended by the user in the event store 210 and stored in the action log 220. The event store 210 and the action log 220 are both further described below”).

Regarding claim 11, Tang teaches using one or more machine learning techniques to train one or more layers of a neural network based on user interaction data regarding a plurality of content items and a plurality of entities ([Tang, 0018] “The online system 130 trains a deep neural network (DNN) using labels representing associations between users and events. In an embodiment, the online system trains two neural networks (e.g., a user neural network 142 and an event neural network 144)”, 
[Tang, 0041] “… The neural network 310 includes an input layer 320, one or more hidden layers 330a-n, and an output layer 340. Each layer of the neural network 310 (i.e., the input layer 320, the output layer 340, and the hidden layers 330a-n) comprises a set of nodes such that the set of nodes of the input layer 320 are input nodes of the neural network 310, the set of nodes of the output layer 340 are output nodes of the neural network 310, and the set of nodes of each of the hidden layers 330a-n are hidden nodes of the neural network 310 …”); 
after training the neural network: inserting an image from a first content item into the neural network to generate a first embedding for the first content item ([Tang, 0018] “Accordingly, the online system 130 extracts two vectors (i.e., one representing the user profile 142 and one representing the event 134) based on embeddings from the trained DNN”, discloses the system extracts the embedding from the user profile, which also means it receives the user profile (that includes image data) as an input. 
[Tang, 0031] “The action log 220 may be used by the online system 130 to track user actions on the online system 130, as well as actions on third party systems 120 that communicate information to the online system 130 … and information describing these interactions is stored in the action log 220. Additional examples of interactions with objects on the online system 130 that are included in the action log 220 include: commenting on a photo album, … expressing a preference for an object (“liking” the object), and engaging in a transaction”, teaches tracking using the action log, which includes interaction with a photo album, which is an interaction with an image. Information about the interaction with the photo album, which is stored in the action log, contains image data,
[Tang, 0036] “A stored neural network is configured to receive, as an input, an input vector (e.g., a vector associated with an event 152 or a vector associated with a user 154) via an input layer. Here, a received input vector is associated with one of a user of the online system 130 or an event provided by a third-party system (e.g., third-party system 120)”, [Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144”); 
identifying a second content item with which a particular entity interacted ([Tang, 0017] “The online system 130 provides a framework to provide content items describing events to one or more user of client devices 110. A content item associated with an event is also referred to herein as an event content. In FIG. 1A, a user profile 132 is associated with a user of the online system 130 and an event 134 associated with the third-party system 120 …”, discloses that the event is a content item, which corresponds to the second content item,
[Tang, 0047] “The outputs of the neural network 142 and 144 is some function of the relationship between a user associated with the user vector received neural network 142 and the event vector received by neural network 144”, discloses that the system identifies the relationship between the user and the event, and the event corresponds to the particular entity with which the user interacted, 
[Tang, 0031] “The action log 220 may be used by the online system 130 to track user actions on the online system 130, as well as actions on third party systems 120 that communicate information to the online system 130 … Additional examples of interactions with objects on the online system 130 that are included in the action log 220 include: commenting on a photo album, communicating with a user, establishing a connection with an object, joining an event, joining a group, creating an event, authorizing an application, using an application, expressing a preference for an object …, and engaging in a transaction. Additionally, the action log 220 may record a user's interactions with advertisements on the online system 130 as well as with other applications operating on the online system 130”, information about the interaction with the photo album, which is stored in the action log, contains image data);
inserting an image of the second content item into the neural network to generate a second embedding for the particular entity ([Tang, 0031] “The action log 220 may be used by the online system 130 to track user actions on the online system 130, as well as actions on third party systems 120 that communicate information to the online system 130 … and information describing these interactions is stored in the action log 220. Additional examples of interactions with objects on the online system 130 that are included in the action log 220 include: commenting on a photo album, communicating with a user, establishing a connection with an object, joining an event, joining a group, creating an event, authorizing an application, using an application, expressing a preference for an object (“liking” the object), and engaging in a transaction”, teaches tracking using the action log, which includes interaction with a photo album, which is an interaction with an image,
[Tang, 0036] “A stored neural network is configured to receive, as an input, an input vector (e.g., a vector associated with an event 152 or a vector associated with a user 154) via an input layer. Here, a received input vector is associated with one of a user of the online system 130 or an event provided by a third-party system (e.g., third-party system 120)”, teaches receiving the input vector, which includes the image data. The event corresponds to the particular entity); 
in response to receiving a content request that is associated with the particular entity: identifying the first content item that is associated with one or more targeting criteria that are satisfied based on the content request ([Tang, 0016] “The online system 130 is a computer system that includes software and hardware for performing a group of coordinated functions or tasks. The online system 130 includes multiple components for providing a framework for upselling ticketing events to one or more users of the online system 130. Here, the online system 130 may additionally or alternatively be a social networking system. The online system 130 is configured to receive requests from one or more client devices 110 and third party systems 120 and execute computer programs associated with the received requests. As an example, the online system 130 stores content associated with one or more users and content associated with an event in order to provide the event to a user of the one or more users with a threshold probability of attending the event …”, [Tang, 0043] “… the input vector 310 is a vector comprising content items associated with a user of the online system 130 (e.g., items stored in the user profile store 205, action log 220, and edge store 225). For example, an input vector 310 comprises demographic information (e.g., age group of the user), groups that the user is associated with (e.g., a member of a page associated with the event), a geographic location (e.g., the user is within 10 miles of a geographic location associated with an event), a number and/or type of actions performed by the user on a content item associated with an event either on or off the online system 130, or any combination thereof. In other embodiments, the input vector 310 additionally, or alternatively, comprises one or more items associated with an event (e.g., items stored in the event store 210). 
For example, the input vector 310 may include information describing a time and date associated with the event, a number of other users of the online system 130 who have expressed an interest in the event, a number of messages associated with the event, or any combination thereof”); 
based on the first embedding and the second embedding, generating a prediction of whether the particular entity will interact with the first content item ([Tang, 0047] “The outputs of the neural network 142 and 144 is some function of the relationship between a user associated with the user vector received neural network 142 and the event vector received by neural network 144”, teaches that the output of the two networks is a function of the relationship between the event and the user, which is an interaction); 
wherein the method is performed by one or more computing devices ([Tang, 0016] “The online system 130 is a computer system that includes software and hardware for performing a group of coordinated functions or tasks”).

Regarding claim 13, Tang teaches the method of claim 11, further comprising: using the one or more machine learning techniques to learn weights for a plurality of contextual features while training the one or more layers of the neural network ([Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144. The neural network 455 is configured to generate an output associated with a relationship between a user vector and an event vector. In various embodiments, the neural network 455 changes the weights of neural network 142 and neural network 144 based on various learning algorithms. Here, changing the weights of neural network 142 and neural networks 144 comprises adjusting the weights between individual neurons of the hidden layers to reduce a total measure of error between a predicted output and actual output”, [Tang, 0034] “An edge may include various features each representing characteristics of interactions between users, interactions between users and objects, or interactions between objects. For example, features included in an edge describe a rate of interaction between two users, how recently two users have interacted with each other, a rate, or an amount of information retrieved by one user about an object, or numbers and types of comments posted by a user about an object ... hence, an edge may be represented as one or more feature expressions”, teaches that the edges store the features, [Tang, 0043] “In one embodiment, the input vector 310 is a vector comprising content items associated with a user of the online system 130 (e.g., items stored in the user profile store 205, action log 220, and edge store 225). 
For example, an input vector 310 comprises demographic information (e.g., age group of the user), groups that the user is associated with (e.g., a member of a page associated with the event), a geographic location (e.g., the user is within 10 miles of a geographic location associated with an event), a number and/or type of actions performed by the user on a content item associated with an event either on or off the online system 130, or any combination thereof. In other embodiments, the input vector 310 additionally, or alternatively, comprises one or more items associated with an event (e.g., items stored in the event store 210). For example, the input vector 310 may include information describing a time and date associated with the event, a number of other users of the online system 130 who have expressed an interest in the event, a number of messages associated with the event, or any combination thereof”, [Tang, 0044] “The neural network 310 generates as output comprising value, or a score. An output generated by the neural network 310 is, for example, a score indicating a likelihood of the input user attending the input event”, teaches that the neural network processes the input from the edge (features) and gives a value or score); 
in response to receiving the content request: identifying a plurality of feature values for the plurality of contextual features ([Tang, 0016] “The online system 130 is a computer system that includes software and hardware for performing a group of coordinated functions or tasks. The online system 130 includes multiple components for providing a framework for upselling ticketing events to one or more users of the online system 130. Here, the online system 130 may additionally or alternatively be a social networking system. The online system 130 is configured to receive requests from one or more client devices 110 and third party systems 120 and execute computer programs associated with the received requests. As an example, the online system 130 stores content associated with one or more users and content associated with an event in order to provide the event to a user of the one or more users with a threshold probability of attending the event …”, [Tang, 0034] “An edge may include various features each representing characteristics of interactions between users, interactions between users and objects, or interactions between objects. For example, features included in an edge describe a rate of interaction between two users, how recently two users have interacted with each other, a rate, or an amount of information retrieved by one user about an object, or numbers and types of comments posted by a user about an object ... hence, an edge may be represented as one or more feature expressions”, teaches the edges stores the features, [Tang, 0043] “In one embodiment, the input vector 310 is a vector comprising content items associated with a user of the online system 130 (e.g., items stored in the user profile store 205, action log 220, and edge store 225). 
For example, an input vector 310 comprises demographic information (e.g., age group of the user), groups that the user is associated with (e.g., a member of a page associated with the event), a geographic location (e.g., the user is within 10 miles of a geographic location associated with an event), a number and/or type of actions performed by the user on a content item associated with an event either on or off the online system 130, or any combination thereof. In other embodiments, the input vector 310 additionally, or alternatively, comprises one or more items associated with an event (e.g., items stored in the event store 210). For example, the input vector 310 may include information describing a time and date associated with the event, a number of other users of the online system 130 who have expressed an interest in the event, a number of messages associated with the event, or any combination thereof”, [Tang, 0044] “The neural network 310 generates as output comprising value, or a score. An output generated by the neural network 310 is, for example, a score indicating a likelihood of the input user attending the input event”, teaches the neural network process the input from the edge (features) and give value or score); 
wherein generating the prediction is also based on the weights and the plurality of feature values ([Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144. The neural network 455 is configured to generate an output associated with a relationship between a user vector and an event vector. In various embodiments, the neural network 455 changes the weights of neural network 142 and neural network 144 based on various learning algorithms. Here, changing the weights of neural network 142 and neural networks 144 comprises adjusting the weights between individual neurons of the hidden layers to reduce a total measure of error between a predicted output and actual output”, discloses making prediction about relationship based on weights of the neural network).


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

	Claims 2 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Tang (US 20190188561 A1) in view of Wang (US 20190312897 A1).

Regarding claim 2, Tang teaches the method of claim 1.
Tang does not specifically teach wherein the set of contextual features includes two or more of: time of the content request, type of device that initiated the content request, or type of channel through which a content item was presented.
Wang teaches wherein the set of contextual features includes two or more of: time of the content request, type of device that initiated the content request, or type of channel through which a content item was presented ([Wang, 0052] “The content request scoring module 612 may extract a timestamp from the content request 604 to determine a time of the content request 604 as a request feature ... The characteristics of the requestor computing device may comprise a country, a zip code, an IP address, an operating system, a browser, a device type (e.g., smart phone, smart watch, desktop, etc.), an application name of an application that requested access to the website, and/or other information”).
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date, to combine the method of Tang and Wang because the time of the request and the device type are needed to determine the request features ([Wang, 0052]).

Regarding claim 14, Tang teaches the method of claim 13.
Tang does not specifically teach wherein the plurality of contextual features includes two or more of: time of the content request, type of device that initiated the content request, or type of channel through which a content item was presented.
Wang teaches wherein the plurality of contextual features includes two or more of: time of the content request, type of device that initiated the content request, or type of channel through which a content item was presented ([Wang, 0052] “The content request scoring module 612 may extract a timestamp from the content request 604 to determine a time of the content request 604 as a request feature ... The characteristics of the requestor computing device may comprise a country, a zip code, an IP address, an operating system, a browser, a device type (e.g., smart phone, smart watch, desktop, etc.), an application name of an application that requested access to the website, and/or other information”).

Claims 7-8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Tang (US 20190188561 A1) in view of Nanni (Nanni & Lumini, 2012, “A classifier ensemble approach for the missing feature problem”).

Regarding claim 7, Tang teaches the method of claim 1 and determining a particular embedding and including the particular embedding in the first set of embeddings or the second set of embeddings ([Tang, 0018] “Accordingly, the online system 130 extracts two vectors (i.e., one representing the user profile 142 and one representing the event 134) based on embeddings from the trained DNN”, discloses the embedding is generated after training (trained neural network), [Tang, 0036] “A stored neural network is configured to receive, as an input, an input vector (e.g., a vector associated with an event 152 or a vector associated with a user 154) via an input layer. Here, a received input vector is associated with one of a user of the online system 130 or an event provided by a third-party system (e.g., third-party system 120)”, [Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144”, [Tang, 0051] “An event embedding selection module 450 selects embeddings from a hidden layer of the neural network 144”, [Tang, 0027] “A user profile may also store other information provided by the user, for example, images or videos. In certain embodiments, images of users may be tagged with information identifying the online system users displayed in an image, with information identifying the images in which a user is tagged stored in the user profile of the user”, user profile information comprises image data).
However, Tang does not specifically teach further comprising: determining that an embedding for a particular attribute value is missing for the particular content item or the particular requesting entity; in response to determining that an embedding for the particular attribute value is missing for the particular content item or the particular requesting entity: determining a particular embedding, wherein determining the particular embedding comprises (a) generating a random embedding, wherein the particular embedding is the random embedding, or (b) generating the particular embedding based on embeddings of attribute values of the same attribute type as the particular attribute value.
Nanni teaches further comprising: determining that an embedding for a particular attribute value is missing for the particular content item or the particular requesting entity; in response to determining that an embedding for the particular attribute value is missing for the particular content item or the particular requesting entity: determining a particular embedding, wherein determining the particular embedding comprises (a) generating a random embedding, wherein the particular embedding is the random embedding, or (b) generating the particular embedding based on embeddings of attribute values of the same attribute type as the particular attribute value ([Nanni, page 39, 3. Proposed system] “2.2 Let Dt be a subset of all training patterns that have a similarity to the t-th cluster greater than TH (we fix TH = 0.25); 2.3 While Dt contains less than 5 patterns or if there exists a feature missing in all the patterns that belong to Dt , then assign to Dt a random subset of 25% of all the training patterns;”, teaches generating another value when some particular entity is missing). 
It would have been obvious to one of ordinary skill in the art, prior to the effective filing date, to combine the method of Tang and Nanni because an imputation approach based on clustering and a random subspace classifier outperforms several other state-of-the-art approaches ([Nanni, Abstract]).
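For illustration only, the claimed fallback for a missing embedding can be sketched as follows. This is a minimal sketch of the claim 7 limitation as recited, not code from Tang, Nanni, or the application. The function name `resolve_embedding`, the dictionary `table` (embeddings for one attribute type, keyed by attribute value), and the uniform range for the random vector are all hypothetical assumptions.

```python
import random

def resolve_embedding(value, table, dim, rng=None):
    """Return the embedding for `value`, or a fallback if it is missing.

    `table` maps attribute values of ONE attribute type to embeddings
    (lists of floats of length `dim`). All names here are hypothetical.
    """
    if value in table:
        return table[value]
    if table:
        # Option (b): derive the embedding from embeddings of the same
        # attribute type; averaging is an assumed choice of combination.
        n = len(table)
        return [sum(vec[i] for vec in table.values()) / n for i in range(dim)]
    # Option (a): generate a random embedding (assumed range [-1, 1]).
    rng = rng or random.Random()
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]
```

In this sketch the random branch is used only when no same-type embeddings exist; the claim recites the two options as alternatives, so a different ordering would be equally consistent with it.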

Claim 18 is a storage media claim having limitations similar to those of method claim 7. Therefore, it is rejected under the same rationale as claim 7 above.

Regarding claim 8, Tang teaches the method of Claim 7. 
Tang does not specifically teach further comprising: in response to determining that the embedding for the particular attribute value is missing for the particular requesting entity: identifying one or more profiles of users that are similar to the particular requesting entity; identifying, within the one or more profiles, one or more attribute values that are of the same attribute as the particular attribute value; based on the one or more attribute values, identifying the one or more other embeddings; including the particular embedding in the second set of embeddings.
Nanni teaches further comprising: in response to determining that the embedding for the particular attribute value is missing for the particular requesting entity: identifying one or more profiles of users that are similar to the particular requesting entity; identifying, within the one or more profiles, one or more attribute values that are of the same attribute as the particular attribute value; based on the one or more attribute values, identifying the one or more other embeddings; including the particular embedding in the second set of embeddings ([Nanni, page 38, right column, 2. Compared systems] “kNN [20], is a method, where, given an incomplete pattern x, K closest cases that are not missing values (i.e., features with missing values in x) in the features are imputed such that they minimize some distance measure. Once the K nearest neighbors have been found, a replacement value for the missing attribute value must be estimated. One obvious refinement is to weight the contribution of each neighbor according to its distance to x.3 We have tested different values for K (K ∈ {3, 5, 7, 9} and selected the best value (K = 3))”).
	It would have been obvious to one of ordinary skill in the art, prior to the effective filing date, to combine the method of Tang and Nanni in order to estimate a replacement value for a missing attribute value: given an incomplete pattern x, the K closest cases that have no missing values in the relevant features are found by minimizing a distance measure, and the missing value is imputed from them ([Nanni, 2. Compared systems]).
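For illustration only, the kNN imputation described in the quoted passage of Nanni can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Nanni or Tang: missing values are represented as `None`, distance is Euclidean over the features observed in x, and neighbors are weighted by inverse distance. The function name `knn_impute` and the small constant added to the distance are hypothetical.

```python
import math

def knn_impute(x, complete_patterns, k=3):
    """Fill the None entries of x from its k nearest complete patterns."""
    observed = [i for i, v in enumerate(x) if v is not None]
    missing = [i for i, v in enumerate(x) if v is None]

    def distance(p):
        # Euclidean distance over the features that are present in x.
        return math.sqrt(sum((x[i] - p[i]) ** 2 for i in observed))

    neighbors = sorted(complete_patterns, key=distance)[:k]
    # Inverse-distance weights; the epsilon avoids division by zero.
    weights = [1.0 / (distance(p) + 1e-9) for p in neighbors]
    total = sum(weights)

    filled = list(x)
    for i in missing:
        filled[i] = sum(w * p[i] for w, p in zip(weights, neighbors)) / total
    return filled
```

The default K = 3 follows the value the quoted passage reports as best; any other weighting scheme or distance measure would fit the quoted description equally well.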

Claims 9 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Tang (US 20190188561 A1) in view of Kraenzel (US 20190132359 A1).

Regarding claim 9, Tang teaches further comprising: in response to receiving the content request: identifying a plurality of content items, each of which is associated with one or more targeting criteria that are satisfied ([Tang, 0016] “… The online system 130 is configured to receive requests from one or more client devices 110 and third party systems 120 and execute computer programs associated with the received requests. As an example, the online system 130 stores content associated with one or more users and content associated with an event in order to provide the event to a user of the one or more users with a threshold probability of attending the event …”);
for each content item in the plurality of content items: identifying a set of embeddings ([Tang, 0043] “… the input vector 310 is a vector comprising content items associated with a user of the online system 130 (e.g., items stored in the user profile store 205, action log 220, and edge store 225). For example, an input vector 310 comprises demographic information (e.g., age group of the user), groups that the user is associated with (e.g., a member of a page associated with the event), a geographic location (e.g., the user is within 10 miles of a geographic location associated with an event), a number and/or type of actions performed by the user on a content item associated with an event either on or off the online system 130, or any combination thereof. In other embodiments, the input vector 310 additionally, or alternatively, comprises one or more items associated with an event (e.g., items stored in the event store 210). For example, the input vector 310 may include information describing a time and date associated with the event, a number of other users of the online system 130 who have expressed an interest in the event, a number of messages associated with the event, or any combination thereof”, [Tang, 0051] “... The online system provides user and event data to the neural network 142 and neural network 144 to generate a user embedding 460 and an event embedding 465, respectively …”); 
inputting each embedding in the set of embeddings into a first neural network, whose weights were trained while processing the set of training data, to generate certain output ([Tang, 0016] “… The online system 130 is configured to receive requests from one or more client devices 110 and third party systems 120 and execute computer programs associated with the received requests. As an example, the online system 130 stores content associated with one or more users and content associated with an event in order to provide the event to a user of the one or more users with a threshold probability of attending the event …”, 
[Tang, 0036] “A stored neural network is configured to receive, as an input, an input vector (e.g., a vector associated with an event 152 or a vector associated with a user 154) via an input layer. Here, a received input vector is associated with one of a user of the online system 130 or an event provided by a third-party system (e.g., third-party system 120)”, [Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144”, embedding is the input vector); wherein the plurality of content items does not include the particular content item ([Tang, 0037] “… In an embodiment, an attribute of a user or event is represented using a one hot vector. For example, the gender of a user may be represented using a bit vector in which each bi corresponds to a gender value. Similarly the age of a user may be represented by a vector in which the nth element stores 1 if the user has age value N and the remaining elements store 0 …”, Tang teaches the information of the user only has age value, others 0); 
However, Tang does not teach wherein selecting the particular content item comprises selecting the particular content item based on the certain output for each content item in the plurality of content items.
Kraenzel teaches wherein selecting the particular content item comprises selecting the particular content item based on the certain output for each content item in the plurality of content items ([Kraenzel, 0019] “The honeypot configuration engine 142 includes a content selector engine 178 configured to select the honeypot content 134 based on the output 176 generated by the neural network 174 using features extracted from the first categories 168 and the one or more topics 164”).
	Further, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date, to combine the method of Tang and the method of Kraenzel because the item recommendation methods according to the invention incorporate the above described algorithms so as to reduce computational complexity, and thus the runtime of the factorization of a rating matrix.

	Claim 19 is a storage media claim having limitations similar to those of method claim 9. Therefore, it is rejected under the same rationale as claim 9 above.

	Claims 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Tang (US 20190188561 A1) in view of Young (US 9721203 B1).

Regarding claim 10, Tang teaches further comprising: while processing the set of training data:  learning first weights of a first neural network for content items ([Tang, 0018] “The online system 130 trains a deep neural network (DNN) using labels representing associations between users and events. In an embodiment, the online system trains two neural networks (e.g., a user neural network 142 and an event neural network 144). The online system extracts embeddings comprising a vector representation of the user profile 132 from the user neural network 142 and embedding comprising a vector representation of the event 134 from the neural network 144”, teaches training two networks, with user neural network, which corresponds to the first neural network and first weights.
 [Tang, 0051] “The embeddings (i.e., the user embedding 460 and the event embedding 465) each represent the sample input data at a layer within the neural network”, the user embedding is the first attribute values, 
[Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144. The neural network 455 is configured to generate an output associated with a relationship between a user vector and an event vector. In various embodiments, the neural network 455 changes the weights of neural network 142 and neural network 144 based on various learning algorithms. Here, changing the weights of neural network 142 and neural networks 144 comprises adjusting the weights between individual neurons of the hidden layers to reduce a total measure of error between a predicted output and actual output”, teaches the training process includes learning weights. Adjusting weights of the neural network corresponds to learning weights); 
learning second weights for a second neural network for requesting entities ([Tang, 0016] “… The online system 130 is configured to receive requests from one or more client devices 110 and third party systems 120 and execute computer programs associated with the received requests. As an example, the online system 130 stores content associated with one or more users and content associated with an event in order to provide the event to a user of the one or more users with a threshold probability of attending the event …”, teaches requesting entities.
[Tang, 0018] “The online system 130 trains a deep neural network (DNN) using labels representing associations between users and events. In an embodiment, the online system trains two neural networks (e.g., a user neural network 142 and an event neural network 144). The online system extracts embeddings comprising a vector representation of the user profile 132 from the user neural network 142 and embedding comprising a vector representation of the event 134 from the neural network 144”, 
 [Tang, 0051] “The embeddings (i.e., the user embedding 460 and the event embedding 465) each represent the sample input data at a layer within the neural network”, the event embedding is the second attribute values, 
[Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144. The neural network 455 is configured to generate an output associated with a relationship between a user vector and an event vector. In various embodiments, the neural network 455 changes the weights of neural network 142 and neural network 144 based on various learning algorithms. Here, changing the weights of neural network 142 and neural networks 144 comprises adjusting the weights between individual neurons of the hidden layers to reduce a total measure of error between a predicted output and actual output”, teaches the training process includes learning weights); 
in response to receiving the content request: inputting the first set of embeddings into the first neural network to generate a first vector ([Tang, 0016] “… The online system 130 is configured to receive requests from one or more client devices 110 and third party systems 120 and execute computer programs associated with the received requests. As an example, the online system 130 stores content associated with one or more users and content associated with an event in order to provide the event to a user of the one or more users with a threshold probability of attending the event …”, 
[Tang, 0036] “A stored neural network is configured to receive, as an input, an input vector (e.g., a vector associated with an event 152 or a vector associated with a user 154) via an input layer. Here, a received input vector is associated with one of a user of the online system 130 or an event provided by a third-party system (e.g., third-party system 120)”, [Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144”, [Tang, 0051] “An event embedding selection module 450 selects embeddings from a hidden layer of the neural network 144”); 
inputting the second set of embeddings into the second neural network to generate second vector ([Tang, 0036] “A stored neural network is configured to receive, as an input, an input vector (e.g., a vector associated with an event 152 or a vector associated with a user 154) via an input layer. Here, a received input vector is associated with one of a user of the online system 130 or an event provided by a third-party system (e.g., third-party system 120)”, [Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144”, [Tang, 0051] “A user embedding selection module 445 selects embeddings from a hidden layer of the neural network 142”); 
generating a prediction based on applying a set of one or more machine-learned weights to the output vector ([Tang, 0051] “The embeddings (i.e., the user embedding 460 and the event embedding 465) each represent the sample input data at a layer within the neural network”, the event embedding corresponds to the second attribute values, [Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144. The neural network 455 is configured to generate an output associated with a relationship between a user vector and an event vector. In various embodiments, the neural network 455 changes the weights of neural network 142 and neural network 144 based on various learning algorithms. Here, changing the weights of neural network 142 and neural networks 144 comprises adjusting the weights between individual neurons of the hidden layers to reduce a total measure of error between a predicted output and actual output”, teaches that the training process includes applying weights to a user vector and an event vector).
Tang does not specifically teach performing an element wise product operation on the first vector and the second vector to generate an output vector. 
Young teaches performing an element wise product operation on the first vector and the second vector to generate an output vector ([Young, column 2, line 12-24] “(ii) a respective non-zero value at each element position of the masking tensor that corresponds to an element of the first tensor that would have been generated if the second convolutional neural network layer had the stride of the first convolutional neural network layer, and performing element-wise multiplication of a second masking tensor and the modified first tensor, wherein the second masking tensor comprises, at each element position that corresponds to an element of the first tensor that would be generated if the second convolutional neural network layer had the stride of the first convolutional neural network layer, an inverse of the respective non-zero value of the first masking tensor”). 
Further, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date, to combine the method of Tang and the method of performing an element-wise product operation on the first vector and the second vector of Young because the item recommendation methods according to the invention incorporate two embeddings from two different neural networks, and an element-wise product is a common way to combine two different outputs ([Young, column 2, line 12-24]).
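For illustration only, the claimed combination of an element-wise product followed by machine-learned weights can be sketched as follows. This is a minimal sketch, not code from Tang, Young, or the application. The function names `elementwise_product` and `predict`, the optional `bias` term, and the use of a sigmoid to turn the weighted sum into a probability are hypothetical assumptions.

```python
import math

def elementwise_product(first_vector, second_vector):
    """Multiply two equal-length vectors position by position."""
    if len(first_vector) != len(second_vector):
        raise ValueError("vectors must have the same length")
    return [a * b for a, b in zip(first_vector, second_vector)]

def predict(first_vector, second_vector, weights, bias=0.0):
    """Apply learned weights to the element-wise product of two vectors."""
    output_vector = elementwise_product(first_vector, second_vector)
    score = sum(w * o for w, o in zip(weights, output_vector)) + bias
    # Sigmoid (an assumed choice) maps the score to a value in (0, 1).
    return 1.0 / (1.0 + math.exp(-score))
```

Under these assumptions, the first vector would be the output of the content item network, the second vector the output of the requesting entity network, and `weights` the set of machine-learned weights applied to the output vector.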

Claim 20 is a storage media claim having limitations similar to those of method claim 10. Therefore, it is rejected under the same rationale as claim 10 above.

	Claims 5-6, 12, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Tang (US 20190188561 A1) in view of Garlon (US 20180204113 A1).

Regarding claim 5, Tang teaches further comprising: for a particular attribute of the particular requesting entity, identifying a plurality of embeddings; wherein the first set of embeddings includes the single particular embedding and does not include more than one embedding from the plurality of embeddings ([Tang, 0018] “… The online system extracts embeddings comprising a vector representation of the user profile 132 from the user neural network 142 and embedding comprising a vector representation of the event 134 from the neural network 144 …”, [Tang, 0037] “… In an embodiment, an attribute of a user or event is represented using a one hot vector. For example, the gender of a user may be represented using a bit vector in which each bi corresponds to a gender value. Similarly the age of a user may be represented by a vector in which the nth element stores 1 if the user has age value N and the remaining elements store 0 …”).
Tang does not specifically teach combining the plurality of embeddings into a single particular embedding.
Garlon teaches combining the plurality of embeddings into a single particular embedding ([Garlon, 0005, line 8-14] “Initially, a pair of items, including a seed asset and a candidate asset, is received. Each word in the seed and candidate titles, each aspect, and the categories may be embedded into a k-dimensional vector space. The embedding may then be aggregated to construct an n-dimensional vector representing a seed asset and an n-dimensional vector representing a candidate asset”, the pair of items that are embedded into a k-dimensional vector space is the second and third embedding).
Further, it would have been obvious to one of ordinary skill in the art, prior to the effective filing date, to combine the method of Tang and the method of Garlon because the embeddings have to be combined (aggregated) in order to make a prediction from both embeddings using one model.
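For illustration only, combining a plurality of embeddings into a single embedding can be sketched as follows. Garlon states only that the embeddings are "aggregated" and does not name the operation in the quoted passage, so the element-wise mean used here is an assumption, and the function name `mean_pool` is hypothetical.

```python
def mean_pool(embeddings):
    """Combine several equal-length embeddings into one by averaging."""
    if not embeddings:
        raise ValueError("at least one embedding is required")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same length")
    n = len(embeddings)
    # Average each dimension across all input embeddings.
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]
```

For example, a requesting entity with several values of one attribute would contribute one pooled vector of the same dimension rather than several separate vectors.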

Claim 17 is a storage media claim having limitations similar to those of method claim 5. Therefore, it is rejected under the same rationale as claim 5 above.

Regarding claim 6, Tang in view of Garlon teaches wherein: the particular attribute is one of an employer, a job title, or a skill; the plurality of embeddings are based on a plurality of employers, a plurality of job titles, or a plurality of skills ([Tang, 0027] “Each user of the online system 130 is associated with a user profile ... Examples of information stored in a user profile include biographic, demographic, and other types of descriptive information, such as work experience, educational history, gender, interests, hobbies or preferences, location and the like ... A user profile in the user profile store 205 may also maintain references to one or more previous events attended by the user in the event store 210 and stored in the action log 220. The event store 210 and the action log 220 are both further described below”).

Regarding claim 12, Tang teaches further comprising: after training the neural network: identifying a third content item with which the particular entity interacted ([Tang, 0043] “… the input vector 310 is a vector comprising content items associated with a user of the online system 130 (e.g., items stored in the user profile store 205, action log 220, and edge store 225). For example, an input vector 310 comprises demographic information (e.g., age group of the user), groups that the user is associated with (e.g., a member of a page associated with the event), a geographic location (e.g., the user is within 10 miles of a geographic location associated with an event), a number and/or type of actions performed by the user on a content item associated with an event either on or off the online system 130, or any combination thereof. In other embodiments, the input vector 310 additionally, or alternatively, comprises one or more items associated with an event (e.g., items stored in the event store 210). For example, the input vector 310 may include information describing a time and date associated with the event, a number of other users of the online system 130 who have expressed an interest in the event, a number of messages associated with the event, or any combination thereof”, [Tang, 0051] “... The online system provides user and event data to the neural network 142 and neural network 144 to generate a user embedding 460 and an event embedding 465, respectively …”, the ‘third content item’ is not clearly defined in the specification, therefore, interpreted as ‘first content item’ inputted after the first ‘first content item’);
inserting an image of the third content item into the neural network to generate a third embedding for the particular entity ([Tang, 0018] “Accordingly, the online system 130 extracts two vectors (i.e., one representing the user profile 142 and one representing the event 134) based on embeddings from the trained DNN”, discloses the embedding is generated after training (trained neural network), [Tang, 0036] “A stored neural network is configured to receive, as an input, an input vector (e.g., a vector associated with an event 152 or a vector associated with a user 154) via an input layer. Here, a received input vector is associated with one of a user of the online system 130 or an event provided by a third-party system (e.g., third-party system 120)”, [Tang, 0048] “The neural network 455 is configured to receive, as inputs, the output generated by both the neural network 142 and the neural network 144”, [Tang, 0051] “An event embedding selection module 450 selects embeddings from a hidden layer of the neural network 144”, [Tang, 0027] “A user profile may also store other information provided by the user, for example, images or videos. In certain embodiments, images of users may be tagged with information identifying the online system users displayed in an image, with information identifying the images in which a user is tagged stored in the user profile of the user”, user profile information comprises image data); 
wherein generating the prediction is based on the output embedding ([Tang, 0047] “The outputs of the neural network 142 and 144 is some function of the relationship between a user associated with the user vector received neural network 142 and the event vector received by neural network 144”, teaches that the output of the two networks is a function of the relationship between the event and the user, i.e., an interaction).
Tang does not specifically teach performing an aggregation operation that takes, as input, the second embedding and the third embedding, and produces an output embedding. 
Garlon teaches performing an aggregation operation that takes, as input, the second embedding and the third embedding, and produces an output embedding ([Garlon, 0005, lines 8-14] “Initially, a pair of items, including a seed asset and a candidate asset, is received. Each word in the seed and candidate titles, each aspect, and the categories may be embedded into a k-dimensional vector space. The embedding may then be aggregated to construct an n-dimensional vector representing a seed asset and an n-dimensional vector representing a candidate asset”, the pair of items embedded into a k-dimensional vector space corresponds to the second and third embeddings).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Regarding selecting a content item based on the output: US 8676736 B2.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUN KWON whose telephone number is (571)272-2072. The examiner can normally be reached on 7:30 AM - 5:30 PM. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar can be reached on (571)270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/JUN KWON/
Patent Examiner, Art Unit 2127
/ABDULLAH AL KAWSAR/
Supervisory Patent Examiner, Art Unit 2127