Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This is the initial office action that has been issued in response to patent application 16/142,441 filed on 09/26/2018. Claims 1-20, as originally filed, are currently pending and have been considered below. Claims 1, 8, and 15 are independent claims.

Specification
The specification is objected to as failing to provide proper antecedent basis for the claimed subject matter. See 37 CFR 1.75(d)(1) and MPEP § 608.01(o). Correction of the following is required: Claims 15-20 recite “machine-readable storage medium,” but the Specification does not recite “machine-readable storage medium.”

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-7 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claims do not fall within at least one of the four categories of patent eligible subject matter because the claims could be considered a signal per se.
Claim 1 recites “computer-readable medium.” The broadest reasonable interpretation of a claim that recites “computer-readable medium,” in view of the present specification, covers forms of non-transitory tangible media and transitory propagating signals per se in view of the ordinary and customary meaning of computer readable media, particularly when the specification is silent. See MPEP 2111.01. When the broadest reasonable interpretation of a claim covers a signal per se, the claim must be rejected under 35 U.S.C. § 101 as covering non-statutory subject matter. See In re Nuijten, 500 F.3d 1346, 1356-57 (Fed. Cir. 2007) (transitory embodiments are not directed to statutory subject matter) and Interim Examination Instructions for Evaluating Subject Matter Eligibility Under 35 U.S.C. §101, Aug. 24, 2009; p. 2. 1351 Off. Gaz. Pat. Off. 212 (2010). Under broadest reasonable interpretation, “computer-readable medium” recited in claim 1 encompasses a transitory, propagating signal, which is not a process, machine, manufacture, or composition of matter. Nuijten, 500 F.3d at 1357. Therefore, the claim “covers material not found in any of the four statutory categories [and thus] falls outside the plainly expressed scope of § 101.” Id. at 1354. A recommended amendment is to recite “non-transitory computer-readable medium” (emphasis added).
Claims 2-7 are rejected based on the same rationale as discussed above in the rejected claim 1.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 6-8, 10, 13-15, 17, and 20 are rejected under 35 U.S.C. 103 as being obvious over Merhav et al. (US 20180144253 A1; hereinafter “Merhav-1”) in view of Merhav et al. (US 9904871 B2; hereinafter “Merhav-2”).
The applied reference has a common Applicant with the instant application. Based upon the earlier effectively filed date of the reference, it constitutes prior art under 35 U.S.C. 102(a)(2).
This rejection under 35 U.S.C. 103 might be overcome by: (1) a showing under 37 CFR 1.130(a) that the subject matter disclosed in the reference was obtained directly or indirectly from the inventor or a joint inventor of this application and is thus not prior art in accordance with 35 U.S.C. 102(b)(2)(A); (2) a showing under 37 CFR 1.130(b) of a prior public disclosure under 35 U.S.C. 102(b)(2)(B); or (3) a statement pursuant to 35 U.S.C. 102(b)(2)(C) establishing that, not later than the effective filing date of the claimed invention, the subject matter disclosed and the claimed invention were either owned by the same person or subject to an obligation of assignment to the same person or subject to a joint research agreement. See generally MPEP § 717.02.
Regarding Claim 1,
Merhav-1 teaches a system comprising (Merhav-1, FIG. 1 and Para. [0031], “FIG. 1 is a block diagram illustrating a client-server system 100, in accordance with an example embodiment. A networked system 102 provides server-side functionality via a network 104 (e.g., the Internet or a wide area network (WAN)) to one or more clients” teaches a system).
a computer-readable medium having instructions stored thereon, which, when executed by a processor, cause the system to perform operations comprising (Merhav-1, Para. [0090], “The term “machine-readable medium” shall also be taken to include any medium, or combination of multiple media, that is capable of storing instructions (e.g., instructions 1116) for execution by a machine (e.g., machine 1100), such that the instructions, when executed by one or more processors of the machine (e.g., processors 1110), cause the machine to perform any one or more of the methodologies described herein” teaches a machine-readable medium (corresponds to the computer-readable medium) that stores instructions that when executed by one or more processors, cause the machine to perform one or more methodologies described).
obtaining a first set of training data, the first set of training data comprising pairs of job titles and standardized job title identifications (Merhav-1, Para. [0059], “For example, a member profile containing both the particular title and the certain skill could be fed to the machine learning model” teaches a member profile being used as training data. Para. [0065], “FIG. 8 is a flow diagram illustrating a method 800 for creating a deep representation of entities specifically for relating titles and skills. Here, titles and skills are embedded in a such a way that the dot product is maximized when the skill and title tend to co-occur, and minimized otherwise. A member title identification (entity) 802 is used for an embedding step 804, which outputs a title latent representation, specifically a vector 806 for the title” teaches a member title identification (corresponds to the job titles and standardized job title identifications)).
obtaining a second set of training data, the second set of training data comprising pairs of job titles and skills (Merhav-1, Para. [0059], “For example, a member profile containing both the particular title and the certain skill could be fed to the machine learning model” teaches a member profile being used as training data. Para. [0065], “FIG. 8 is a flow diagram illustrating a method 800 for creating a deep representation of entities specifically for relating titles and skills. Here, titles and skills are embedded in a such a way that the dot product is maximized when the skill and title tend to co-occur, and minimized otherwise… Likewise, a skill identification (entity) 808 is used for an embedding step 810, which outputs a skill latent representation for the skill, specifically a vector 812 for the skill” teaches a skill identification (corresponds to the skills)).
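The dot-product relationship quoted from Merhav-1, Para. [0065] can be illustrated with a minimal sketch. The vectors, the entity names, and the helper dot() below are hypothetical values chosen for illustration and do not appear in Merhav-1; the sketch only shows that a co-occurring title/skill pair yields a larger dot product than an unrelated pair.

```python
# Illustrative only: embedding values and entity names are hypothetical,
# not taken from Merhav-1.

def dot(u, v):
    """Return the dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

# Hypothetical latent vectors for one title and two skills.
title_vec = {"software engineer": [0.9, 0.1, 0.0]}
skill_vec = {
    "python": [0.8, 0.2, 0.1],   # assumed to co-occur with the title
    "welding": [0.0, 0.1, 0.9],  # assumed not to co-occur with the title
}

co_occurring = dot(title_vec["software engineer"], skill_vec["python"])
unrelated = dot(title_vec["software engineer"], skill_vec["welding"])
```

Under these assumed values, co_occurring exceeds unrelated, which is the property the quoted embedding objective is described as optimizing for.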
… output a prediction score indicating a likelihood that an input candidate job title matches an input job title identification (Merhav-1, Para. [0050], “The deep representation of entities may be a single representation for many different types of entities. For example, whereas in the prior art each entity type would be mapped into a hierarchy of entities just of that entity type, the deep representation of entities in the present disclosure allows for multiple entity types to be mapped into the same data structure, thus permitting enriched analyses and predictions based on the relationships between entities of different entity types” teaches a prediction based on the relationship of an entity (corresponds to the input candidate job title) to a hierarchy of entities (corresponds to the input job title identification). Para. [0051], “The machine learning component 330 may utilize machine learning processes to arrive at a prediction model 400 used to provide a confidence score for a particular prediction… In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction” teaches a prediction model within the machine learning component, being fed one or more entities (corresponds to the plurality of candidate job title identifications), and that outputs a confidence score for a particular prediction (corresponds to a prediction score indicating a likelihood). FIG. 9 and Para. [0066], “FIG. 9 is a flow diagram illustrating a method 900 for creating a deep representation of entities specifically for skill-set-based title prediction. This model optimizes for an application known as title disambiguation. Specifically, some members may submit very broad titles (e.g., manager), and it can be useful to predict a more exact title identification given the set of skills the member has” teaches the matching of the member’s title (corresponds to the input candidate job title) with a title identification (corresponds to the input job title identification)).
… feeding a first candidate job title and a plurality of candidate job title identifications into the prediction model, producing a prediction score for each pairing of the first candidate job title and a candidate job title identification (Merhav-1, Para. [0059], “For example, a member profile containing both the particular title and the certain skill could be fed to the machine learning model, the machine learning model having been trained based on the objective function output, and if the machine learning model accurately predicts that the member profile should contain the certain skill, then the optimization test passes. If not, it fails” teaches the machine learning model being fed the member profile containing the particular title (corresponds to the first candidate job title). Para. [0051], “In an example embodiment, one or more machine learning algorithms are used to aid in optimizing embedding used in the deep representation of entities. The machine learning component 330 may utilize machine learning processes to arrive at a prediction model 400 used to provide a confidence score for a particular prediction… In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction” teaches a prediction model within the machine learning component, being fed one or more entities (corresponds to the plurality of candidate job title identifications), and that outputs a confidence score for a particular prediction (corresponds to the prediction score). FIG. 9 and Para. [0066], “FIG. 9 is a flow diagram illustrating a method 900 for creating a deep representation of entities specifically for skill-set-based title prediction. This model optimizes for an application known as title disambiguation. Specifically, some members may submit very broad titles (e.g., manager), and it can be useful to predict a more exact title identification given the set of skills the member has” teaches the matching of the member’s title (corresponds to the candidate job title) with a title identification (corresponds to the job title identification)).
saving a mapping between the first candidate job title and a candidate job title identification from the plurality of candidate job title identifications having a highest prediction score (Merhav-1, Para. [0049], “It should be noted that an entity as described herein is a specific instance of standardized data in the social network. Typically these entities will include pieces of data supplied in a member profile that is capable of being standardized. Common entities in social networking profiles include titles, industries, locations, skills, likes, dislikes, schools attended, etc.” teaches entities including member profile data. Para. [0065], “A member title identification (entity) 802 is used for an embedding step 804” teaches a member title identification being an entity (corresponds to candidate job title identifications). Para. [0050], “The deep representation of entities may be a single representation for many different types of entities. For example, whereas in the prior art each entity type would be mapped into a hierarchy of entities just of that entity type, the deep representation of entities in the present disclosure allows for multiple entity types to be mapped into the same data structure, thus permitting enriched analyses and predictions based on the relationships between entities of different entity types” teaches an entity (corresponds to the candidate job title) being mapped to a hierarchy of entities (corresponds to the candidate job title identification from the plurality of candidate job title identifications). Para. [0051], “In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction” teaches a confidence score (corresponds to the prediction score) that indicates the confidence level in the corresponding potential prediction (corresponds to the highest prediction score)).
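The scoring and saving steps mapped above can be sketched as follows, under stated assumptions. The taxonomy entries, the function names, and the token-overlap scoring function are hypothetical stand-ins; neither the claimed prediction model nor prediction model 400 of Merhav-1 is reproduced here.

```python
# Illustrative only: TAXONOMY, overlap_score, and best_mapping are hypothetical.

# Hypothetical standardized title identifications and their labels.
TAXONOMY = {
    101: "software engineer",
    202: "sales manager",
    303: "engineering manager",
}

def overlap_score(title, title_id):
    """Stand-in score: Jaccard overlap between title tokens and label tokens."""
    a = set(title.lower().split())
    b = set(TAXONOMY[title_id].lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0

def score_pairings(title, candidate_ids, score_fn):
    """Produce one score per (title, candidate identification) pairing."""
    return {cid: score_fn(title, cid) for cid in candidate_ids}

def best_mapping(title, candidate_ids, score_fn):
    """Return a mapping from the title to the highest-scoring identification."""
    scores = score_pairings(title, candidate_ids, score_fn)
    best_id = max(scores, key=scores.get)
    return {title: best_id}

saved = {}
saved.update(best_mapping("senior software engineer", list(TAXONOMY), overlap_score))
```

With these assumed entries, the raw title "senior software engineer" is saved against identification 101, since that pairing has the highest stand-in score.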
Merhav-1 does not appear to explicitly teach feeding the first set of training data into a deep convolutional neural network (DCNN) designed to train a prediction model to output a prediction score; feeding the second set of training data into the DCNN in order to retrain the prediction model.
However, Merhav-2 teaches feeding the first set of training data into a deep convolutional neural network (DCNN) designed to train a prediction model to output a prediction score (Merhav-2, Col. 14 Lines 34-43, “Social networks often have very abundant information that can be used to aid in the training of the DCNN 304, as not only image information is available but also various pieces of information about the subject of the images is also available, such as job title, experience level, skills, age, and so forth. This information can be quite useful in aiding of labelling training images with professionalism scores or categorizations, so that a human does not need to label each image from scratch” teaches a DCNN that is being fed and trained with various pieces of information such as job title (corresponds to the first set of training data). Col. 7 Lines 12-15, “After the training, the model may be applied to new input images to produce a useful prediction of the professionalism levels of the new input images” teaches the DCNN being trained to produce a useful prediction (corresponds to the training of a prediction model). Col. 3 Lines 1-3, “In an example embodiment, a Deep Convolutional Neural Network (DCNN) is used to generate professionalism scores for digital images” teaches the DCNN outputting a professionalism score (corresponds to the prediction score)).
… feeding the second set of training data into the DCNN in order to retrain the prediction model (Merhav-2, Col. 7 Lines 12-15, “After the training, the model may be applied to new input images to produce a useful prediction of the professionalism levels of the new input images” teaches the DCNN being trained to produce a useful prediction (corresponds to the training of a prediction model). Col. 8 Lines 12-16, “It should be noted that the filters used in the convolutional layers 404A, 404B may be activated in a first iteration of the DCNN 400 and refined prior to each additional iteration, based on actions taken in other layers in the previous iteration” teaches iterations of training (corresponds to retraining) of the DCNN. Col. 14 Lines 34-43, “Social networks often have very abundant information that can be used to aid in the training of the DCNN 304, as not only image information is available but also various pieces of information about the subject of the images is also available, such as job title, experience level, skills, age, and so forth. This information can be quite useful in aiding of labelling training images with professionalism scores or categorizations, so that a human does not need to label each image from scratch” teaches a DCNN that is being fed and trained with various pieces of information such as job title and skills (corresponds to the second set of training data)).
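The train-then-retrain sequence discussed above can be sketched with a small stand-in model. The class TinyScorer, the feature vectors, and the labels are hypothetical; a simple logistic model is used in place of a DCNN, so the sketch illustrates only that a second training pass continues from the weights learned in the first.

```python
import math

class TinyScorer:
    """Logistic model over a feature vector; a stand-in for a DCNN, not a DCNN."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        """Return a score in (0, 1) for feature vector x."""
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, examples, epochs=200):
        """Gradient steps starting from the current weights (no reset)."""
        for _ in range(epochs):
            for x, y in examples:
                err = self.predict(x) - y
                self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= self.lr * err

# Hypothetical feature vectors with binary match labels.
first_set = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]   # stands in for title / title-ID pairs
second_set = [([1.0, 1.0], 1), ([0.0, 0.0], 0)]  # stands in for title / skill pairs

model = TinyScorer(n_features=2)
model.fit(first_set)                 # initial training
weights_after_first = list(model.w)
model.fit(second_set)                # retraining on the second set
```

Because fit() never reinitializes the weights, the second call adjusts the same model rather than training a new one.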
Merhav-1 and Merhav-2 are analogous art because they are from the same field of endeavor and the same problem-solving area. Namely, they pertain to the field of “neural networks”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Merhav-1 with Merhav-2, with the motivation of feeding the first set of training data into a deep convolutional neural network (DCNN) designed to train a prediction model to output a prediction score; feeding the second set of training data into the DCNN in order to retrain the prediction model. “In a sense, the output of one DCNN is used as a label for input to train a different neural net. Using this technique, it is possible for the second neural net to express learning rules such as “usually the object of interest is around the middle of the image” or “the object of interest should never be with a very small width or height.” This improves performance over a process that scores rectangles without spatial context” (Merhav-2, Col. 12 Lines 35-42). The proposed teaching is beneficial in that it improves performance over a process that scores rectangles without spatial context.
Regarding Claim 3,
Merhav-1 in view of Merhav-2 teaches the system of claim 1, 
Merhav-1 further teaches wherein the first set of training data is obtained from a taxonomy of title identifications having a stored mapping between the title identifications and titles (Merhav-1, FIG. 3 and Para. [0047], “an entity retrieval component 300 retrieves entities from a database 305. This may include, for example, important existing taxonomies. The entities, once extracted, are passed to a deep representation formation component 310, which acts to form a deep representation of the entities, as will be described in more detail later. This may include utilizing a machine learning component 330. Once formed, the deep representation of entities 335 may be stored in database 340. In some example embodiments, database 305 and database 340 are a single database” teaches retrieving entities (corresponds to the first set of training data) from existing taxonomies of entities (corresponds to the title identification) that are stored in a database). 
Regarding Claim 6,
Merhav-1 in view of Merhav-2 teaches the system of claim 1, 
Merhav-1 further teaches wherein the first set of training data is obtained from a grouping of titles similar in characters to other titles (Merhav-1, Para. [0020]-[0021], “In an example embodiment, a machine learning algorithm is utilized to optimize embeddings. An embedding is a value assigned to an entity based on an objective. Specifically, all entities are initially assigned a random embedding. Then, progressively, these embeddings are modified to maximize a stated objective. The objective will differ based on the problem being analyzed. Such an approach allows difficult standardization tasks to be analyzed, such as clustering similar entities, synonym identification and retrieval tasks, seniority relations, relevance relations, identifying analogies, outlier identification, and prediction tasks such as predicting job transitions applications, or string searches” teaches a machine learning algorithm that groups entities (corresponds to the first set of training data) with similarities (corresponds to the grouping of titles similar in characters to other titles)).
Regarding Claim 7,
Merhav-1 in view of Merhav-2 teaches the system of claim 1, 
Merhav-1 further teaches wherein the second set of training data is obtained from member profiles of members of an online service (Merhav-1, Para. [0049], “It should be noted that an entity as described herein is a specific instance of standardized data in the social network. Typically these entities will include pieces of data supplied in a member profile that is capable of being standardized. Common entities in social networking profiles include titles, industries, locations, skills, likes, dislikes, schools attended, etc. Certain types of data are less likely to be capable of being standardized, such as names, publications, etc.” teaches obtaining the entities (corresponds to the second set of training data) from a member’s profile in the social network (corresponds to the online service)).
Regarding Claim 8,
Merhav-1 teaches a computer-implemented method, comprising (Merhav-1, FIG. 5 and Para. [0055], “FIG. 5 is a flow diagram illustrating a method 500 for creating a deep embedded representation of social network entities in accordance with an example embodiment” teaches a method).
obtaining a first set of training data, the first set of training data comprising pairs of job titles and standardized job title identifications (Merhav-1, Para. [0059], “For example, a member profile containing both the particular title and the certain skill could be fed to the machine learning model” teaches a member profile being used as training data. Para. [0065], “FIG. 8 is a flow diagram illustrating a method 800 for creating a deep representation of entities specifically for relating titles and skills. Here, titles and skills are embedded in a such a way that the dot product is maximized when the skill and title tend to co-occur, and minimized otherwise. A member title identification (entity) 802 is used for an embedding step 804, which outputs a title latent representation, specifically a vector 806 for the title” teaches a member title identification (corresponds to the job titles and standardized job title identifications)).
obtaining a second set of training data, the second set of training data comprising pairs of job titles and skills (Merhav-1, Para. [0059], “For example, a member profile containing both the particular title and the certain skill could be fed to the machine learning model” teaches a member profile being used as training data. Para. [0065], “FIG. 8 is a flow diagram illustrating a method 800 for creating a deep representation of entities specifically for relating titles and skills. Here, titles and skills are embedded in a such a way that the dot product is maximized when the skill and title tend to co-occur, and minimized otherwise… Likewise, a skill identification (entity) 808 is used for an embedding step 810, which outputs a skill latent representation for the skill, specifically a vector 812 for the skill” teaches a skill identification (corresponds to the skills)).
… output a prediction score indicating a likelihood that an input candidate job title matches an input job title identification (Merhav-1, Para. [0050], “The deep representation of entities may be a single representation for many different types of entities. For example, whereas in the prior art each entity type would be mapped into a hierarchy of entities just of that entity type, the deep representation of entities in the present disclosure allows for multiple entity types to be mapped into the same data structure, thus permitting enriched analyses and predictions based on the relationships between entities of different entity types” teaches a prediction based on the relationship of an entity (corresponds to the input candidate job title) to a hierarchy of entities (corresponds to the input job title identification). Para. [0051], “The machine learning component 330 may utilize machine learning processes to arrive at a prediction model 400 used to provide a confidence score for a particular prediction… In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction” teaches a prediction model within the machine learning component, being fed one or more entities (corresponds to the plurality of candidate job title identifications), and that outputs a confidence score for a particular prediction (corresponds to a prediction score indicating a likelihood). FIG. 9 and Para. [0066], “FIG. 9 is a flow diagram illustrating a method 900 for creating a deep representation of entities specifically for skill-set-based title prediction. This model optimizes for an application known as title disambiguation. Specifically, some members may submit very broad titles (e.g., manager), and it can be useful to predict a more exact title identification given the set of skills the member has” teaches the matching of the member’s title (corresponds to the input candidate job title) with a title identification (corresponds to the input job title identification)).
… feeding a first candidate job title and a plurality of candidate job title identifications into the prediction model, producing a prediction score for each pairing of the first candidate job title and a candidate job title identification (Merhav-1, Para. [0059], “For example, a member profile containing both the particular title and the certain skill could be fed to the machine learning model, the machine learning model having been trained based on the objective function output, and if the machine learning model accurately predicts that the member profile should contain the certain skill, then the optimization test passes. If not, it fails” teaches the machine learning model being fed the member profile containing the particular title (corresponds to the first candidate job title). Para. [0051], “In an example embodiment, one or more machine learning algorithms are used to aid in optimizing embedding used in the deep representation of entities. The machine learning component 330 may utilize machine learning processes to arrive at a prediction model 400 used to provide a confidence score for a particular prediction… In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction” teaches a prediction model within the machine learning component, being fed one or more entities (corresponds to the plurality of candidate job title identifications), and that outputs a confidence score for a particular prediction (corresponds to the prediction score). FIG. 9 and Para. [0066], “FIG. 9 is a flow diagram illustrating a method 900 for creating a deep representation of entities specifically for skill-set-based title prediction. This model optimizes for an application known as title disambiguation. Specifically, some members may submit very broad titles (e.g., manager), and it can be useful to predict a more exact title identification given the set of skills the member has” teaches the matching of the member’s title (corresponds to the candidate job title) with a title identification (corresponds to the job title identification)).
saving a mapping between the first candidate job title and a candidate job title identification from the plurality of candidate job title identifications having a highest prediction score (Merhav-1, Para. [0049], “It should be noted that an entity as described herein is a specific instance of standardized data in the social network. Typically these entities will include pieces of data supplied in a member profile that is capable of being standardized. Common entities in social networking profiles include titles, industries, locations, skills, likes, dislikes, schools attended, etc.” teaches entities including member profile data. Para. [0065], “A member title identification (entity) 802 is used for an embedding step 804” teaches a member title identification being an entity (corresponds to candidate job title identifications). Para. [0050], “The deep representation of entities may be a single representation for many different types of entities. For example, whereas in the prior art each entity type would be mapped into a hierarchy of entities just of that entity type, the deep representation of entities in the present disclosure allows for multiple entity types to be mapped into the same data structure, thus permitting enriched analyses and predictions based on the relationships between entities of different entity types” teaches an entity (corresponds to the candidate job title) being mapped to a hierarchy of entities (corresponds to the candidate job title identification from the plurality of candidate job title identifications). Para. [0051], “In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction” teaches a confidence score (corresponds to the prediction score) that indicates the confidence level in the corresponding potential prediction (corresponds to the highest prediction score)).
Merhav-1 does not appear to explicitly teach feeding the first set of training data into a deep convolutional neural network (DCNN) designed to train a prediction model to output a prediction score; feeding the second set of training data into the DCNN in order to retrain the prediction model.
However, Merhav-2 teaches feeding the first set of training data into a deep convolutional neural network (DCNN) designed to train a prediction model to output a prediction score (Merhav-2, Col. 14 Lines 34-43, “Social networks often have very abundant information that can be used to aid in the training of the DCNN 304, as not only image information is available but also various pieces of information about the subject of the images is also available, such as job title, experience level, skills, age, and so forth. This information can be quite useful in aiding of labelling training images with professionalism scores or categorizations, so that a human does not need to label each image from scratch” teaches a DCNN that is being fed and trained with various pieces of information such as job title (corresponds to the first set of training data). Col. 7 Lines 12-15, “After the training, the model may be applied to new input images to produce a useful prediction of the professionalism levels of the new input images” teaches the DCNN being trained to produce a useful prediction (corresponds to the training of a prediction model). Col. 3 Lines 1-3, “In an example embodiment, a Deep Convolutional Neural Network (DCNN) is used to generate professionalism scores for digital images” teaches the DCNN outputting a professionalism score (corresponds to the prediction score)).
… feeding the second set of training data into the DCNN in order to retrain the prediction model (Merhav-2, Col. 7 Lines 12-15, “After the training, the model may be applied to new input images to produce a useful prediction of the professionalism levels of the new input images” teaches the DCNN being trained to produce a useful prediction (corresponds to the training of a prediction model). Col. 8 Lines 12-16, “It should be noted that the filters used in the convolutional layers 404A, 404B may be activated in a first iteration of the DCNN 400 and refined prior to each additional iteration, based on actions taken in other layers in the previous iteration” teaches iterations of training (corresponds to retraining) of the DCNN. Col. 14 Lines 34-43, “Social networks often have very abundant information that can be used to aid in the training of the DCNN 304, as not only image information is available but also various pieces of information about the subject of the images is also available, such as job title, experience level, skills, age, and so forth. This information can be quite useful in aiding of labelling training images with professionalism scores or categorizations, so that a human does not need to label each image from scratch” teaches a DCNN that is being fed and trained with various pieces of information such as job title and skills (corresponds to the second set of training data)).
Merhav-1 and Merhav-2 are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “neural networks”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Merhav-1 with Merhav-2, with the motivation of feeding the first set of training data into a deep convolutional neural network (DCNN) designed to train a prediction model to output a prediction score, and feeding the second set of training data into the DCNN in order to retrain the prediction model. “In a sense, the output of one DCNN is used as a label for input to train a different neural net. Using this technique, it is possible for the second neural net to express learning rules such as “usually the object of interest is around the middle of the image” or “the object of interest should never be with a very small width or height.” This improves performance over a process that scores rectangles without spatial context” (Merhav-2, Col. 12 Lines 35-42). The proposed teaching is beneficial in that it improves performance over a process that scores rectangles without spatial context.
Regarding Claim 10,
Merhav-1 in view of Merhav-2 teaches the method of claim 8, 
Merhav-1 further teaches wherein the first set of training data is obtained from a taxonomy of title identifications having a stored mapping between the title identifications and titles (Merhav-1, FIG. 3 and Para. [0047], “an entity retrieval component 300 retrieves entities from a database 305. This may include, for example, important existing taxonomies. The entities, once extracted, are passed to a deep representation formation component 310, which acts to form a deep representation of the entities, as will be described in more detail later. This may include utilizing a machine learning component 330. Once formed, the deep representation of entities 335 may be stored in database 340. In some example embodiments, database 305 and database 340 are a single database” teaches retrieving entities (corresponds to the first set of training data) from existing taxonomies of entities (corresponds to the title identification) that are stored in a database).
Regarding Claim 13,
Merhav-1 in view of Merhav-2 teaches the method of claim 8, 
Merhav-1 further teaches wherein the first set of training data is obtained from a grouping of titles similar in characters to other titles (Merhav-1, Para. [0020]-[0021], “In an example embodiment, a machine learning algorithm is utilized to optimize embeddings. An embedding is a value assigned to an entity based on an objective. Specifically, all entities are initially assigned a random embedding. Then, progressively, these embeddings are modified to maximize a stated objective. The objective will differ based on the problem being analyzed. Such an approach allows difficult standardization tasks to be analyzed, such as clustering similar entities, synonym identification and retrieval tasks, seniority relations, relevance relations, identifying analogies, outlier identification, and prediction tasks such as predicting job transitions applications, or string searches” teaches a machine learning algorithm that groups entities (corresponds to the first set of training data) with similarities (corresponds to the grouping of titles similar in characters to other titles)).
Regarding Claim 14,
Merhav-1 in view of Merhav-2 teaches the method of claim 8, 
Merhav-1 further teaches wherein the second set of training data is obtained from member profiles of members of an online service (Merhav-1, Para. [0049], “It should be noted that an entity as described herein is a specific instance of standardized data in the social network. Typically these entities will include pieces of data supplied in a member profile that is capable of being standardized. Common entities in social networking profiles include titles, industries, locations, skills, likes, dislikes, schools attended, etc. Certain types of data are less likely to be capable of being standardized, such as names, publications, etc.” teaches obtaining the entities (corresponds to the second set of training data) from member’s profile in the social network (corresponds to the online service)).
Regarding Claim 15,
Merhav et al. teaches a non-transitory machine-readable storage medium comprising instructions, which when implemented by one or more machines, cause the one or more machines to perform operations comprising (Merhav et al., Para. [0090], “The term “machine-readable medium” shall also be taken to include any medium, or combination of multiple media, that is capable of storing instructions (e.g., instructions 1116) for execution by a machine (e.g., machine 1100), such that the instructions, when executed by one or more processors of the machine (e.g., processors 1110), cause the machine to perform any one or more of the methodologies described herein” teaches a machine-readable medium that stores instructions that when executed by one or more processors, cause the machine to perform one or more methodologies described).
obtaining a first set of training data, the first set of training data comprising pairs of job titles and standardized job title identifications (Merhav-1, Para. [0059], “For example, a member profile containing both the particular title and the certain skill could be fed to the machine learning model” teaches a member profile being used as training data. Para. [0065], “FIG. 8 is a flow diagram illustrating a method 800 for creating a deep representation of entities specifically for relating titles and skills. Here, titles and skills are embedded in a such a way that the dot product is maximized when the skill and title tend to co-occur, and minimized otherwise. A member title identification (entity) 802 is used for an embedding step 804, which outputs a title latent representation, specifically a vector 806 for the title” teaches a member title identification (corresponds to the job titles and standardized job title identifications)).
obtaining a second set of training data, the second set of training data comprising pairs of job titles and skills (Merhav-1, Para. [0059], “For example, a member profile containing both the particular title and the certain skill could be fed to the machine learning model” teaches a member profile being used as training data. Para. [0065], “FIG. 8 is a flow diagram illustrating a method 800 for creating a deep representation of entities specifically for relating titles and skills. Here, titles and skills are embedded in a such a way that the dot product is maximized when the skill and title tend to co-occur, and minimized otherwise… Likewise, a skill identification (entity) 808 is used for an embedding step 810, which outputs a skill latent representation for the skill, specifically a vector 812 for the skill” teaches a skill identification (corresponds to the skills)).
… output a prediction score indicating a likelihood that an input candidate job title matches an input job title identification (Merhav-1, Para. [0050], “The deep representation of entities may be a single representation for many different types of entities. For example, whereas in the prior art each entity type would be mapped into a hierarchy of entities just of that entity type, the deep representation of entities in the present disclosure allows for multiple entity types to be mapped into the same data structure, thus permitting enriched analyses and predictions based on the relationships between entities of different entity types” teaches a prediction based on the relationship of an entity (corresponds to the input candidate job title) to a hierarchy of entities (corresponds to the input job title identification). Para. [0051], “The machine learning component 330 may utilize machine learning processes to arrive at a prediction model 400 used to provide a confidence score for a particular prediction… In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction” teaches a prediction model within the machine learning component, being fed one or more entities (corresponds to the plurality of candidate job title identifications), and that outputs a confidence score for a particular prediction (corresponds to a prediction score indicating a likelihood). FIG. 9 and Para. [0066], “FIG. 9 is a flow diagram illustrating a method 900 for creating a deep representation of entities specifically for skill-set-based title prediction. This model optimizes for an application known as title disambiguation. Specifically, some members may submit very broad titles (e.g., manager), and it can be useful to predict a more exact title identification given the set of skills the member has” teaches the matching of the member’s title (corresponds to the input candidate job title) with a title identification (corresponds to the input job title identification)).
… feeding a first candidate job title and a plurality of candidate job title identifications into the prediction model, producing a prediction score for each pairing of the first candidate job title and a candidate job title identification (Merhav-1, Para. [0059], “For example, a member profile containing both the particular title and the certain skill could be fed to the machine learning model, the machine learning model having been trained based on the objective function output, and if the machine learning model accurately predicts that the member profile should contain the certain skill, then the optimization test passes. If not, it fails” teaches the machine learning model being fed the member profile containing the particular title (corresponds to the first candidate job title). Para. [0051], “In an example embodiment, one or more machine learning algorithms are used to aid in optimizing embedding used in the deep representation of entities. The machine learning component 330 may utilize machine learning processes to arrive at a prediction model 400 used to provide a confidence score for a particular prediction… In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction” teaches a prediction model within the machine learning component, being fed one or more entities (corresponds to the plurality of candidate job title identifications), and that outputs a confidence score for a particular prediction (corresponds to the prediction score). FIG. 9 and Para. [0066], “FIG. 9 is a flow diagram illustrating a method 900 for creating a deep representation of entities specifically for skill-set-based title prediction. This model optimizes for an application known as title disambiguation. Specifically, some members may submit very broad titles (e.g., manager), and it can be useful to predict a more exact title identification given the set of skills the member has” teaches the matching of the member’s title (corresponds to the candidate job title) with a title identification (corresponds to the job title identification)).
saving a mapping between the first candidate job title and a candidate job title identification from the plurality of candidate job title identifications having a highest prediction score (Merhav-1, Para. [0049], “It should be noted that an entity as described herein is a specific instance of standardized data in the social network. Typically these entities will include pieces of data supplied in a member profile that is capable of being standardized. Common entities in social networking profiles include titles, industries, locations, skills, likes, dislikes, schools attended, etc.” teaches entities including member profile data. Para. [0065], “A member title identification (entity) 802 is used for an embedding step 804” teaches member identification being an entity (corresponds to candidate job title identifications). Para. [0050], “The deep representation of entities may be a single representation for many different types of entities. For example, whereas in the prior art each entity type would be mapped into a hierarchy of entities just of that entity type, the deep representation of entities in the present disclosure allows for multiple entity types to be mapped into the same data structure, thus permitting enriched analyses and predictions based on the relationships between entities of different entity types” teaches an entity (corresponds to the candidate job title) being mapped to a hierarchy of entities (corresponds to the candidate job title identification from the plurality of candidate job title identifications). Para. [0051], “In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction” teaches a confidence score (corresponds to the prediction score) that indicates the confidence level in the corresponding potential prediction (corresponds to the highest prediction score)).
Merhav-1 does not appear to explicitly teach feeding the first set of training data into a deep convolutional neural network (DCNN) designed to train a prediction model to output a prediction score; feeding the second set of training data into the DCNN in order to retrain the prediction model.
However, Merhav-2 teaches feeding the first set of training data into a deep convolutional neural network (DCNN) designed to train a prediction model to output a prediction score (Merhav-2, Col. 14 Lines 34-43, “Social networks often have very abundant information that can be used to aid in the training of the DCNN 304, as not only image information is available but also various pieces of information about the subject of the images is also available, such as job title, experience level, skills, age, and so forth. This information can be quite useful in aiding of labelling training images with professionalism scores or categorizations, so that a human does not need to label each image from scratch” teaches a DCNN that is being fed and trained with various pieces of information such as job title (corresponds to the first set of training data). Col. 7 Lines 12-15, “After the training, the model may be applied to new input images to produce a useful prediction of the professionalism levels of the new input images” teaches the DCNN being trained to produce a useful prediction (corresponds to the training of a prediction model). Col. 3 Lines 1-3, “In an example embodiment, a Deep Convolutional Neural Network (DCNN) is used to generate professionalism scores for digital images” teaches the DCNN outputting a professionalism score (corresponds to the prediction score)).
… feeding the second set of training data into the DCNN in order to retrain the prediction model (Merhav-2, Col. 7 Lines 12-15, “After the training, the model may be applied to new input images to produce a useful prediction of the professionalism levels of the new input images” teaches the DCNN being trained to produce a useful prediction (corresponds to the training of a prediction model). Col. 8 Lines 12-16, “It should be noted that the filters used in the convolutional layers 404A, 404B may be activated in a first iteration of the DCNN 400 and refined prior to each additional iteration, based on actions taken in other layers in the previous iteration” teaches iterations of training (corresponds to retraining) of the DCNN. Col. 14 Lines 34-43, “Social networks often have very abundant information that can be used to aid in the training of the DCNN 304, as not only image information is available but also various pieces of information about the subject of the images is also available, such as job title, experience level, skills, age, and so forth. This information can be quite useful in aiding of labelling training images with professionalism scores or categorizations, so that a human does not need to label each image from scratch” teaches a DCNN that is being fed and trained with various pieces of information such as job title and skills (corresponds to the second set of training data)).
Merhav-1 and Merhav-2 are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “neural networks”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Merhav-1 with Merhav-2, with the motivation of feeding the first set of training data into a deep convolutional neural network (DCNN) designed to train a prediction model to output a prediction score, and feeding the second set of training data into the DCNN in order to retrain the prediction model. “In a sense, the output of one DCNN is used as a label for input to train a different neural net. Using this technique, it is possible for the second neural net to express learning rules such as “usually the object of interest is around the middle of the image” or “the object of interest should never be with a very small width or height.” This improves performance over a process that scores rectangles without spatial context” (Merhav-2, Col. 12 Lines 35-42). The proposed teaching is beneficial in that it improves performance over a process that scores rectangles without spatial context.
Regarding Claim 17,
Merhav-1 in view of Merhav-2 teaches the non-transitory machine-readable storage medium of claim 15, 
Merhav-1 further teaches wherein the first set of training data is obtained from a taxonomy of title identifications having a stored mapping between the title identifications and titles (Merhav-1, FIG. 3 and Para. [0047], “an entity retrieval component 300 retrieves entities from a database 305. This may include, for example, important existing taxonomies. The entities, once extracted, are passed to a deep representation formation component 310, which acts to form a deep representation of the entities, as will be described in more detail later. This may include utilizing a machine learning component 330. Once formed, the deep representation of entities 335 may be stored in database 340. In some example embodiments, database 305 and database 340 are a single database” teaches retrieving entities (corresponds to the first set of training data) from existing taxonomies of entities (corresponds to the title identification) that are stored in a database).
Regarding Claim 20,
Merhav-1 in view of Merhav-2 teaches the non-transitory machine-readable storage medium of claim 15, 
Merhav-1 further teaches wherein the first set of training data is obtained from a grouping of titles similar in characters to other titles (Merhav-1, Para. [0020]-[0021], “In an example embodiment, a machine learning algorithm is utilized to optimize embeddings. An embedding is a value assigned to an entity based on an objective. Specifically, all entities are initially assigned a random embedding. Then, progressively, these embeddings are modified to maximize a stated objective. The objective will differ based on the problem being analyzed. Such an approach allows difficult standardization tasks to be analyzed, such as clustering similar entities, synonym identification and retrieval tasks, seniority relations, relevance relations, identifying analogies, outlier identification, and prediction tasks such as predicting job transitions applications, or string searches” teaches a machine learning algorithm that groups entities (corresponds to the first set of training data) with similarities (corresponds to the grouping of titles similar in characters to other titles)).
Claims 4, 11, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Merhav-1 in view of Merhav-2 in further view of Ritter et al. (“Data-Driven Response Generation in Social Media”).
The applied reference has a common Applicant with the instant application. Based upon the earlier effectively filed date of the reference, it constitutes prior art under 35 U.S.C. 102(a)(2).
This rejection under 35 U.S.C. 103 might be overcome by: (1) a showing under 37 CFR 1.130(a) that the subject matter disclosed in the reference was obtained directly or indirectly from the inventor or a joint inventor of this application and is thus not prior art in accordance with 35 U.S.C.102(b)(2)(A); (2) a showing under 37 CFR 1.130(b) of a prior public disclosure under 35 U.S.C. 102(b)(2)(B); or (3) a statement pursuant to 35 U.S.C. 102(b)(2)(C) establishing that, not later than the effective filing date of the claimed invention, the subject matter disclosed and the claimed invention were either owned by the same person or subject to an obligation of assignment to the same person or subject to a joint research agreement. See generally MPEP § 717.02.
Regarding Claim 4,
Merhav-1 in view of Merhav-2 teaches the system of claim 1, 
Merhav-1 further teaches wherein the first set of training data is obtained from member profiles of members of an online service (Merhav-1, Para. [0049], “It should be noted that an entity as described herein is a specific instance of standardized data in the social network. Typically these entities will include pieces of data supplied in a member profile that is capable of being standardized. Common entities in social networking profiles include titles, industries, locations, skills, likes, dislikes, schools attended, etc. Certain types of data are less likely to be capable of being standardized, such as names, publications, etc.” teaches obtaining the entities (corresponds to the first set of training data) from member’s profile in the social network (corresponds to the online service)).
Merhav-1 in view of Merhav-2 does not appear to teach the member profiles each being written in at least two languages.  
However, Ritter et al. teaches the member profiles each being written in at least two languages (Ritter et al., Section 3 Pg. 3, “Twitter conversations don’t occur in real-time as in IRC; rather as in email, users typically take turns responding to each other. Twitter’s 140 character limit, however, keeps conversations chat-like” teaches users on Twitter (corresponds to the member profiles that users utilize) posting statuses and responding to one another in conversations. Abstract Pg. 1, “We present a data-driven approach to generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation” and Section 1 Pg. 2, “We apply SMT to this problem, treating Twitter as our parallel corpus, with status posts as our source language and their responses as our target language” teaches generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation (corresponds to the at least two languages)).
Merhav-1, Merhav-2, and Ritter et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “machine learning”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Merhav-1 and Merhav-2 with Ritter et al., with the motivation of the member profiles each being written in at least two languages. “We show that SMT outperforms IR on this task, and its output is preferred over actual human responses in 15% of cases. As far as we are aware, this is the first work to investigate the use of phrase-based SMT to directly translate a linguistic stimulus into an appropriate response” (Ritter et al., Abstract). The proposed teaching is beneficial in that SMT outperforms IR and its output is preferred over actual human responses in 15% of cases.
Regarding Claim 11,
Merhav-1 in view of Merhav-2 teaches the method of claim 8, 
Merhav-1 further teaches wherein the first set of training data is obtained from member profiles of members of an online service (Merhav-1, Para. [0049], “It should be noted that an entity as described herein is a specific instance of standardized data in the social network. Typically these entities will include pieces of data supplied in a member profile that is capable of being standardized. Common entities in social networking profiles include titles, industries, locations, skills, likes, dislikes, schools attended, etc. Certain types of data are less likely to be capable of being standardized, such as names, publications, etc.” teaches obtaining the entities (corresponds to the first set of training data) from member’s profile in the social network (corresponds to the online service)).
Merhav-1 in view of Merhav-2 does not appear to teach the member profiles each being written in at least two languages.  
However, Ritter et al. teaches the member profiles each being written in at least two languages (Ritter et al., Section 3 Pg. 3, “Twitter conversations don’t occur in real-time as in IRC; rather as in email, users typically take turns responding to each other. Twitter’s 140 character limit, however, keeps conversations chat-like” teaches users on Twitter (corresponds to the member profiles that users utilize) posting statuses and responding to one another in conversations. Abstract Pg. 1, “We present a data-driven approach to generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation” and Section 1 Pg. 2, “We apply SMT to this problem, treating Twitter as our parallel corpus, with status posts as our source language and their responses as our target language” teaches generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation (corresponds to the at least two languages)).
Merhav-1, Merhav-2, and Ritter et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “machine learning”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Merhav-1 and Merhav-2 with Ritter et al., with the motivation of the member profiles each being written in at least two languages. “We show that SMT outperforms IR on this task, and its output is preferred over actual human responses in 15% of cases. As far as we are aware, this is the first work to investigate the use of phrase-based SMT to directly translate a linguistic stimulus into an appropriate response” (Ritter et al., Abstract). The proposed teaching is beneficial in that SMT outperforms IR and its output is preferred over actual human responses in 15% of cases.
Regarding Claim 18,
Merhav-1 in view of Merhav-2 teaches the non-transitory machine-readable storage medium of claim 15, 
Merhav-1 further teaches wherein the first set of training data is obtained from member profiles of members of an online service (Merhav-1, Para. [0049], “It should be noted that an entity as described herein is a specific instance of standardized data in the social network. Typically these entities will include pieces of data supplied in a member profile that is capable of being standardized. Common entities in social networking profiles include titles, industries, locations, skills, likes, dislikes, schools attended, etc. Certain types of data are less likely to be capable of being standardized, such as names, publications, etc.” teaches obtaining the entities (corresponds to the first set of training data) from member’s profile in the social network (corresponds to the online service)).
Merhav-1 in view of Merhav-2 does not appear to teach the member profiles each being written in at least two languages.  
However, Ritter et al. teaches the member profiles each being written in at least two languages (Ritter et al., Section 3 Pg. 3, “Twitter conversations don’t occur in real-time as in IRC; rather as in email, users typically take turns responding to each other. Twitter’s 140 character limit, however, keeps conversations chat-like” teaches users on Twitter (corresponds to the member profiles that users utilize) posting statuses and responding to one another in conversations. Abstract Pg. 1, “We present a data-driven approach to generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation” and Section 1 Pg. 2, “We apply SMT to this problem, treating Twitter as our parallel corpus, with status posts as our source language and their responses as our target language” teaches generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation (corresponds to the at least two languages)).
Merhav-1, Merhav-2, and Ritter et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “machine learning”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Merhav-1 and Merhav-2 with Ritter et al., with the motivation of the member profiles each being written in at least two languages. “We show that SMT outperforms IR on this task, and its output is preferred over actual human responses in 15% of cases. As far as we are aware, this is the first work to investigate the use of phrase-based SMT to directly translate a linguistic stimulus into an appropriate response” (Ritter et al., Abstract). The proposed teaching is beneficial in that SMT outperforms IR and its output is preferred over actual human responses in 15% of cases.
Claims 5, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Merhav-1 in view of Merhav-2 in further view of Jersin et al. (US 10832219 B2).
The applied reference has a common Applicant with the instant application. Based upon the earlier effectively filed date of the reference, it constitutes prior art under 35 U.S.C. 102(a)(2).
This rejection under 35 U.S.C. 103 might be overcome by: (1) a showing under 37 CFR 1.130(a) that the subject matter disclosed in the reference was obtained directly or indirectly from the inventor or a joint inventor of this application and is thus not prior art in accordance with 35 U.S.C. 102(b)(2)(A); (2) a showing under 37 CFR 1.130(b) of a prior public disclosure under 35 U.S.C. 102(b)(2)(B); or (3) a statement pursuant to 35 U.S.C. 102(b)(2)(C) establishing that, not later than the effective filing date of the claimed invention, the subject matter disclosed and the claimed invention were either owned by the same person or subject to an obligation of assignment to the same person or subject to a joint research agreement. See generally MPEP § 717.02.
Regarding Claim 5,
Merhav-1 in view of Merhav-2 teaches the system of claim 1.
Merhav-1 in view of Merhav-2 does not appear to explicitly teach wherein the first set of training data is obtained from machine-translated titles.
However, Jersin et al. teaches wherein the first set of training data is obtained from machine-translated titles (Jersin et al., Col. 1 Lines 44-51, “A key challenge in a search for candidates (e.g., talent search) is to translate the criteria of a hiring position into a search query that leads to desired candidates. To fulfill this goal, the searcher typically needs to understand which skills are typically required for the position (e.g., job title), what are the alternatives, which companies are likely to have such candidates, which schools the candidates are most likely to graduate from, etc.” teaches translating the criteria of a hiring position (corresponds to the machine-translated titles) to determine desired candidates (corresponds to the first set of training data)).
Merhav-1, Merhav-2, and Jersin et al. are analogous art because they are from the same field of endeavor and the same problem-solving area. Namely, they pertain to the field of “neural network”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Merhav-1 and Merhav-2 with Jersin et al., with the motivation of obtaining the first set of training data from machine-translated titles. “That is, embodiments improve the efficiency of suggesting titles to a user by filtering out titles that are semantically identical or duplicative” (Jersin et al., Col. 5 Lines 7-9). The proposed teaching is beneficial in that it improves the efficiency of suggesting titles to a user.
Regarding Claim 12,
Merhav-1 in view of Merhav-2 teaches the method of claim 8.
Merhav-1 in view of Merhav-2 does not appear to explicitly teach wherein the first set of training data is obtained from machine-translated titles.
However, Jersin et al. teaches wherein the first set of training data is obtained from machine-translated titles (Jersin et al., Col. 1 Lines 44-51, “A key challenge in a search for candidates (e.g., talent search) is to translate the criteria of a hiring position into a search query that leads to desired candidates. To fulfill this goal, the searcher typically needs to understand which skills are typically required for the position (e.g., job title), what are the alternatives, which companies are likely to have such candidates, which schools the candidates are most likely to graduate from, etc.” teaches translating the criteria of a hiring position (corresponds to the machine-translated titles) to determine desired candidates (corresponds to the first set of training data)).
Merhav-1, Merhav-2, and Jersin et al. are analogous art because they are from the same field of endeavor and the same problem-solving area. Namely, they pertain to the field of “neural network”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Merhav-1 and Merhav-2 with Jersin et al., with the motivation of obtaining the first set of training data from machine-translated titles. “That is, embodiments improve the efficiency of suggesting titles to a user by filtering out titles that are semantically identical or duplicative” (Jersin et al., Col. 5 Lines 7-9). The proposed teaching is beneficial in that it improves the efficiency of suggesting titles to a user.
Regarding Claim 19,
Merhav-1 in view of Merhav-2 teaches the non-transitory machine-readable storage medium of claim 15.
Merhav-1 in view of Merhav-2 does not appear to explicitly teach wherein the first set of training data is obtained from machine-translated titles.
However, Jersin et al. teaches wherein the first set of training data is obtained from machine-translated titles (Jersin et al., Col. 1 Lines 44-51, “A key challenge in a search for candidates (e.g., talent search) is to translate the criteria of a hiring position into a search query that leads to desired candidates. To fulfill this goal, the searcher typically needs to understand which skills are typically required for the position (e.g., job title), what are the alternatives, which companies are likely to have such candidates, which schools the candidates are most likely to graduate from, etc.” teaches translating the criteria of a hiring position (corresponds to the machine-translated titles) to determine desired candidates (corresponds to the first set of training data)).
Merhav-1, Merhav-2, and Jersin et al. are analogous art because they are from the same field of endeavor and the same problem-solving area. Namely, they pertain to the field of “neural network”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Merhav-1 and Merhav-2 with Jersin et al., with the motivation of obtaining the first set of training data from machine-translated titles. “That is, embodiments improve the efficiency of suggesting titles to a user by filtering out titles that are semantically identical or duplicative” (Jersin et al., Col. 5 Lines 7-9). The proposed teaching is beneficial in that it improves the efficiency of suggesting titles to a user.

Allowable Subject Matter
Claims 9 and 16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Henry T Nguyen whose telephone number is (571) 272-8860. The examiner can normally be reached Monday-Friday 8:00am-5:30pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HENRY TRONG NGUYEN/
Examiner, Art Unit 2125

/KAMRAN AFSHAR/
Supervisory Patent Examiner, Art Unit 2125