Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Notice to Applicant
Claims 1-20 have been examined in this application. This communication is the first action on the merits. No Information Disclosure Statement (IDS) has been filed to date.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Claims 1-20 are directed to providing a universal occupational taxonomy.
Claim 1 and Claim 19 recite a method for providing a universal occupational taxonomy and Claim 10 recites a system for providing a universal occupational taxonomy, which include establishing one or more levels of granularity of a plurality of jobs; assembling data for each job of the plurality of jobs; training one or more vector representations, wherein a vector representation is trained for each job of the plurality of jobs; reevaluating the one or more levels of granularity based on the training; clustering the plurality of jobs into one or more clusters based on the one or more vector representations; naming the one or more resulting clusters a representative title; classifying the one or more jobs to the one or more clusters; and outputting an occupational taxonomy (Claim 1 and Claim 10); and establishing one or more levels of granularity of a plurality of jobs; assembling data for each job of the plurality of jobs; training one or more vector representations, wherein a vector representation is trained for each job of the plurality of jobs; reevaluating the one or more levels of granularity based on the training; clustering the plurality of jobs into one or more clusters based on the one or more vector representations; classifying the one or more jobs to the one or more clusters; and outputting an occupational taxonomy at least one of to a display and as a data set (Claim 19). As drafted, this is, under its broadest reasonable interpretation, within the abstract idea grouping of “Methods of Organizing Human Activity” – managing personal behavior. The recitations of “processor”, “memory”, and “system” provide nothing in the claim elements to preclude the steps from falling within “Methods of Organizing Human Activity” – managing personal behavior. Accordingly, the claims recite an abstract idea.
This judicial exception is not integrated into a practical application. The claims primarily recite the additional element of using computer components to perform each step. The “processor”, “memory”, and “system” are recited at a high level of generality, such that they amount to no more than mere instructions to apply the exception using a computer component. See MPEP 2106.05(f). Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claims also fail to recite any improvements to another technology or technical field, improvements to the functioning of the computer itself, use of a particular machine, effecting a transformation or reduction of a particular article to a different state or thing, and/or an additional element that applies or uses the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception. See 84 Fed. Reg. 55. In particular, there is a lack of improvement to a computer or technical field in taxonomy.
The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, when considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of “processor”, “memory”, and “system” are insufficient to amount to significantly more. (See MPEP 2106.05(f) – Mere Instructions to Apply an Exception – “Thus, for example, claims that amount to nothing more than an instruction to apply the abstract idea using a generic computer do not render an abstract idea eligible.” Alice Corp., 134 S. Ct. at 2358). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept.
The claim fails to recite any improvements to another technology or technical field, improvements to the functioning of the computer itself, use of a particular machine, effecting a transformation or reduction of a particular article to a different state or thing, adding unconventional steps that confine the claim to a particular useful application, and/or meaningful limitations beyond generally linking the use of an abstract idea to a particular environment. See 84 Fed. Reg. 55. Viewed individually or as a whole, these additional claim element(s) do not provide meaningful limitation(s) to transform the abstract idea into a patent eligible application of the abstract idea such that the claim(s) amounts to significantly more than the abstract idea itself. With regard to receiving data and step 2B, see MPEP 2106.05(d) – Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).
Examiner concludes that the additional elements in combination fail to amount to significantly more than the abstract idea based on findings that each element merely performs the same function(s) in combination as each element performs separately. The claim is not patent eligible. Thus, taken alone, the additional elements do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually.
Dependent Claims 2-9, 11-18 and 20 recite establishing a most granular level and setting a first threshold for counts for company-title pairs; if the first threshold is not met, establishing a higher level and setting a second threshold for industry-title pairs; and if the second threshold is not met, establishing one or more additional higher levels and corresponding thresholds until a threshold is met; collecting text of descriptions of the plurality of jobs from a plurality of profiles; and concatenating all the collected profile data into one text document for each job; training the one or more vector representations is based on the text document for each job; implementing a neural network based on at least one of a distributed bag of words (DBOW) implementation, a skip-gram implementation, a Distributed Memory (DM) model, a Bidirectional Encoder Representations from Transformers (BERT) model and a Universal Language Model Fine-tuning (ULMFiT) model; reevaluating the one or more levels of granularity based on the training further comprises: combining a first given vector representation and a second given vector representation into a single vector representation when a statistical insignificance is identified between the first given vector representation and the second given vector representation; naming the one or more resulting clusters, the representative title comprises: selecting, for each cluster, a title based on a distance from centroid (X), a size (Y), and a binary variable which indicates whether this title exists without an industry (Z); clustering the plurality of jobs into one or more clusters based on the one or more vector representations comprises: implementing one or more agglomerative models; and creating a hierarchical taxonomy, wherein one or more granular categories are each mapped to one broader category in a many-to-one mapping; outputting an occupational taxonomy comprises: outputting the occupational taxonomy at least one of to a display and as a data set; and naming the one or more resulting clusters a representative title, thereby further narrowing the abstract idea. These recited limitations in the dependent claims do not amount to significantly more than the above-identified judicial exceptions in Claims 1, 10 and 19. Regarding Claims 2-9, 11-12, 14-18 and 20 and the additional element of “processor”, see MPEP 2106.05(d) – Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information). Regarding Claim 9 and Claim 18 and the additional element of “display”, see MPEP 2106.05(h) – field of use.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Merhav, US Publication No. 20180293485A1, [hereinafter Merhav], in view of Wang et al., "DeepCarotene - Job Title Classification with Multi-stream Convolutional Neural Network," 2019 IEEE International Conference on Big Data (Big Data), 2019, pp. 1953-1961, [hereinafter Wang].
Regarding Claim 1,  
Merhav teaches
A method for providing a universal occupational taxonomy comprising: establishing, by a processor, one or more levels of granularity of a plurality of jobs; (Merhav Par. 21-“As described briefly above, an issue arises in that titles may be ambiguous. While one solution could be to predict the member's job title based on his or her skill set, the taxonomy of job titles has far too many titles that are indistinguishable from the perspective of a model that is based on member skills. For example, “programmer” versus “software engineer” may have some nuance that makes them not exactly the same, but seeking to predict at this granularity would produce arbitrary results. Additionally, the title taxonomy may have numerous synonymous titles or titles that have seniority tokens that are not usually obtainable from user skill sets. Examples include “software engineer,” “software developer,” and “programmer” as well as those including modifiers like junior and senior, but any model trained to distinguish between these based on skill sets would produce arbitrary results.”; Par. 70-71)
assembling, by the processor, data for each job of the plurality of jobs (Merhav Par. 32-“ As shown in FIG. 2, the data layer may include several databases, such as a profile database 218 for storing profile data, including both member profile data and profile data for various organizations (e.g., companies, schools, etc.). Consistent with some embodiments, when a person initially registers to become a member of the social networking service, the person will be prompted to provide some personal information, such as his or her name, age (e.g., birth date), gender, interests, contact information, home town, address, spouse's and/or family members' names, educational background (e.g., schools, majors, matriculation and/or graduation dates, etc.), employment history, skills, professional organizations, and so on. This information is stored, for example, in the profile database 218. Similarly, when a representative of an organization initially registers the organization with the social networking service, the representative may be prompted to provide certain information about the organization. This information may be stored, for example, in the profile database 218, or another database (not shown)...”);
 training, by the processor, one or more vector representations, wherein a vector representation is trained for each job of the plurality of jobs (Merhav Par. 43-“ In an example embodiment, one or more machine learning algorithms are used to aid in optimizing embedding used in the deep representation of entities. FIG. 4 is a block diagram illustrating a machine learning component 330 in more detail, in accordance with an example embodiment. The machine learning component 330 may utilize machine learning processes to arrive at a prediction model 400 used to provide a confidence score for a particular prediction. The exact prediction may vary based on the objective being selected. The machine learning component 330 may comprise a training component 402 and a confidence scoring component 404. The training component 402 feeds training data 406 comprising, for example, member profile data and member activity data into a feature extractor 408 that extracts one or more features 410 of the information. The training data 406 may also include output of an objective function 411 performed on embeddings 412 corresponding to the training data 406 (from, for example, the deep representation of entities, as will be described in more detail below). A machine learning algorithm 413 produces the prediction model 400 using the extracted features 410 and the output of the objective function 411. In some example embodiments, this involves the machine learning algorithm 413 learning weights to apply in the prediction model 400. In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction.”; Par. 45; Par. 47);
reevaluating, by the processor, the one or more levels of granularity based on the training  (Merhav Par. 44-“ It should be noted that the prediction model 400 may be periodically updated via additional training and/or user feedback 420. The user feedback 420 may be either feedback from members performing searches or from administrators. The user feedback 420 may include an indication about how successful the prediction model 400 is in providing accurate confidence scores.”); 
clustering, by the processor, the plurality of jobs into one or more clusters based on the one or more vector representations (Merhav Par. 62-63-“In an example embodiment, the above machine-learning algorithm using skill-set-based title prediction can be modified by first cutting down the list of potential outputs. This may be accomplished by clustering the title entities based on occupations. Then Knn clustering may be performed on the remaining embedded title space in the title-skill latent space, with distances measured by the cosine distance measure. Since the titles have been embedded into latent space, using their co-occurrences with skills, each title becomes a point in the latent space. These points can then be clustered and each cluster referred to as an occupational cluster. As each cluster is a set of several points (occupations), each cluster can be represented by one of these points, which is actually the point in the occupational cluster that has the most members. For example, the “software” cluster may have occupations such as “software engineer” and “software architect.” If “software engineer” is the occupation with the most members in this occupational cluster, then it is considered to be the cluster representative.”); 
naming, by the processor, the one or more resulting clusters a representative title  (Merhav Par. 62-63-“ As each cluster is a set of several points (occupations), each cluster can be represented by one of these points, which is actually the point in the occupational cluster that has the most members. For example, the “software” cluster may have occupations such as “software engineer” and “software architect.” If “software engineer” is the occupation with the most members in this occupational cluster, then it is considered to be the cluster representative.”); 
and outputting, by the processor, an occupational taxonomy (Merhav Par. 64-65-“At operation 1006, it is determined if the role is ambiguous. Ambiguity may be identified by determining whether the role can be mapped to two or more different title entities in a taxonomy of the social networking service. In some example embodiments, this social networking service taxonomy may be a unified taxonomy, as described above, where latent representation embedding is used to form a taxonomy including all skills and titles (and potentially other entities) into a single latent space. In other example embodiments, this social networking service taxonomy is a title taxonomy.”; see Fig. 10; Par. 66-“If it is determined at operation 1006 that the role is not ambiguous, then at operation 1008 the role is output as the occupation of the member. If, however, it is determined that the role is ambiguous, then at operation 1010 one or more skills for the member are extracted from the member profile. At operation 1012, a closest occupation is determined by matching the skills in the latent space to find a cluster representative having the closest set of skills to the skills for the member. The occupation cluster having the cluster representative with the closest set of skills is determined to have the closest occupation.”).
Merhav teaches clustering and classifying techniques and the feature is expounded upon by Wang:

classifying, by the processor, the one or more jobs to the one or more clusters (Wang Sec. IV.B-“The training data are pairs of job posting document and the correspondence normalized job title. Based on this data set, it applies clustering technique to create a taxonomy discovery component. This taxonomy defines the fine-grained categories for each SOC major. Then it learns a two stage hierarchical k-Nearest-Neighbors model to classify the input text to the most appropriate job titles, respectively in root and leaf levels of the taxonomy. Given a query job posting document, the coarse-grained classifier assigns the title to one of the 23 SOC majors. Then the fine-grained classifier is supposed to find the similar training data samples (neighbors) and classify the query according to the normalized titles of those retrieved neighbors.”; Sec. IV.A-“The domain specific problem of job title classification usually suffers from limited size of training data [21] or using simple clustering method to estimate the labels [2].”);
Merhav and Wang are directed to job classification/clustering analysis. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have improved upon the data analysis of Merhav, as taught by Wang, by utilizing further classification analysis with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to modify the teachings of Merhav in order to improve job title classification (Wang Abstract).
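For illustration only, the cluster-representative selection and closest-occupation matching quoted from Merhav Par. 62-63 and Par. 66 can be sketched as follows. This is a minimal sketch under stated assumptions, not code from either reference: the function names, the two-dimensional embeddings, the cluster labels, and the member counts are all hypothetical.

```python
import math

def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def cluster_representatives(cluster_of, member_count):
    """For each cluster, pick the title held by the most members."""
    best = {}
    for title, cluster in cluster_of.items():
        if cluster not in best or member_count[title] > member_count[best[cluster]]:
            best[cluster] = title
    return best

def closest_occupation(skill_vector, representatives, embedding):
    """Return the representative title nearest to the member's skill vector."""
    return min(
        representatives.values(),
        key=lambda title: cosine_distance(skill_vector, embedding[title]),
    )

# Hypothetical data: embeddings, cluster assignments, and member counts.
embedding = {
    "software engineer": [0.9, 0.1],
    "software architect": [0.8, 0.2],
    "nurse": [0.1, 0.9],
    "registered nurse": [0.2, 0.8],
}
cluster_of = {
    "software engineer": "software",
    "software architect": "software",
    "nurse": "nursing",
    "registered nurse": "nursing",
}
member_count = {
    "software engineer": 500,
    "software architect": 120,
    "nurse": 300,
    "registered nurse": 450,
}

reps = cluster_representatives(cluster_of, member_count)
inferred = closest_occupation([0.7, 0.3], reps, embedding)
```

Here the representative of each cluster is simply the title with the largest member count, and an ambiguous role is resolved to whichever representative lies closest in cosine distance.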
Regarding Claim 2 and Claim 11, Merhav in view of Wang teach The method as in claim 1, wherein establishing the one or more levels of granularity comprises:... and The system as in claim 10, wherein, when establishing the one or more levels of granularity, the processor is further configured to:...
Merhav teaches clustering and classifying techniques and the feature is expounded upon by Wang:
establishing, by the processor, a most granular level and setting a first threshold for counts for company-title pairs (Wang Sec. IV.A-“In our previous in-house job title classifier, we use a hierarchical system called Carotene [34]. The training data are pairs of job posting document and the correspondence normalized job title. Based on this data set, it applies clustering technique to create a taxonomy discovery component. This taxonomy defines the fine-grained categories for each SOC major. Then it learns a two stage hierarchical k-Nearest-Neighbors model to classify the input text to the most appropriate job titles, respectively in root and leaf levels of the taxonomy. Given a query job posting document, the coarse-grained classifier assigns the title to one of the 23 SOC majors. Then the fine-grained classifier is supposed to find the similar training data samples (neighbors) and classify the query according to the normalized titles of those retrieved neighbors.”);
if the first threshold is not met, establishing, by the processor, a higher level and setting a second threshold for industry-title pairs (Wang Sec. II & Fig. 2-“Our model has three components - charCNN for job title (in red) [level 1], wordCNN for job title (in blue) [level 2] and wordCNN for job descriptions (in green). The output of the last convolution layer in charCNN is used as the character level feature while wordCNN takes the concatenation of all max-pooling output as word level features. The two wordCNN for job title and job descriptions are independent. All these three networks are learned simultaneously during training process”);
 and if the second threshold is not met, establishing, by the processor, one or more additional higher levels and corresponding thresholds until a threshold is met. (Wang Sec. III-“ We used the labels generated by the previous state-of-the-art Carotene system as our training data. Each sample is labeled with the predicted carotene code and the confidence score. However, the classification results of Carotene are only 50.1% accurate on average. To deal with this noisy labeled data set, we evaluate two alternative noise-aware loss functions, applying confidence score on softmax score before computing loss and on loss directly.”);
Merhav and Wang are directed to job classification/clustering analysis. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have improved upon the data analysis of Merhav, as taught by Wang, by utilizing further classification analysis with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to modify the teachings of Merhav in order to improve job title classification (Wang Abstract).
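For illustration only, the threshold cascade recited in Claims 2 and 11 (a most granular company-title level, then an industry-title level, then additional higher levels until a threshold is met) can be sketched as follows. This is a minimal sketch under stated assumptions: the level names, the count values, the threshold values, and the reading of "met" as "count greater than or equal to threshold" are hypothetical and are not taken from the claims or the cited references.

```python
def choose_granularity(counts_by_level, thresholds):
    """Return the first (most granular) level whose count meets its threshold.

    `thresholds` is an ordered list of (level, minimum_count) pairs,
    most granular level first. Returns None if no threshold is met.
    """
    for level, minimum_count in thresholds:
        if counts_by_level.get(level, 0) >= minimum_count:
            return level
    return None

# Hypothetical thresholds, ordered from most granular to least granular.
thresholds = [
    ("company-title", 100),
    ("industry-title", 50),
    ("title", 10),
]

# Hypothetical observation counts for one job at each level.
counts = {"company-title": 12, "industry-title": 64, "title": 900}

level = choose_granularity(counts, thresholds)
```

With these assumed numbers the company-title count falls short of its threshold, so the job is represented at the industry-title level.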

Regarding Claim 3 and Claim 12, Merhav in view of Wang teach The method as in claim 1, wherein assembling the data for each job comprises:... and The system as in claim 10, wherein, when assembling the data for each job, the processor is further configured to:...
collecting, by the processor, text of descriptions of the plurality of jobs from a plurality of profiles (Merhav Par. 32-“ As shown in FIG. 2, the data layer may include several databases, such as a profile database 218 for storing profile data, including both member profile data and profile data for various organizations (e.g., companies, schools, etc.). Consistent with some embodiments, when a person initially registers to become a member of the social networking service, the person will be prompted to provide some personal information, such as his or her name, age (e.g., birth date), gender, interests, contact information, home town, address, spouse's and/or family members' names, educational background (e.g., schools, majors, matriculation and/or graduation dates, etc.), employment history, skills, professional organizations, and so on. This information is stored, for example, in the profile database 218. Similarly, when a representative of an organization initially registers the organization with the social networking service, the representative may be prompted to provide certain information about the organization. This information may be stored, for example, in the profile database 218, or another database (not shown). In some embodiments, the profile data may be processed (e.g., in the background or offline) to generate various derived profile data. For example, if a member has provided information about various job titles that the member has held with the same organization or different organizations, and for how long, this information can be used to infer or derive a member profile attribute indicating the member's overall seniority level, or seniority level within a particular organization. In some embodiments, importing or otherwise accessing data from one or more externally hosted data sources may enrich profile data for both members and organizations. 
For instance, with organizations in particular, financial data may be imported from one or more external data sources and made part of an organization's profile. This importation of organization data and enrichment of the data will be described in more detail later in this document.”); 
Merhav teaches clustering and classifying techniques and the feature is expounded upon by Wang:
and concatenating, by the processor, all the collected profile data into one text document for each job. (Wang Sec. III-“We jointly train all of the above three CNN models to extract features on characters, short terms, and paragraphs. We use “charCNN” for job title matching on character level and produce character features Vc. Meanwhile, “wordCNN” is learned for job titles and job descriptions to produce word features Vw and Vs respectively. All the three streams are learned simultaneously in our end-to-end job title classification system. The final feature to represent our input job title and description is obtained by concatenating features from all three streams Vf=concat(Vc, Vs, Vw).”);
Merhav and Wang are directed to job classification/clustering analysis. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have improved upon the data analysis of Merhav, as taught by Wang, by utilizing further classification analysis with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to modify the teachings of Merhav in order to improve job title classification (Wang Abstract).
Regarding Claim 4 and Claim 13, Merhav in view of Wang teach The method as in claim 3,... and The system as in claim 12,...
Merhav teaches clustering and classifying techniques and the feature is expounded upon by Wang:
wherein training, by the processor, the one or more vector representations is based on the text document for each job. (Wang Sec. IV.B-“In our previous in-house job title classifier, we use a hierarchical system called Carotene [34]. The training data are pairs of job posting document and the correspondence normalized job title. Based on this data set, it applies clustering technique to create a taxonomy discovery component. This taxonomy defines the fine-grained categories for each SOC major. Then it learns a two stage hierarchical k-Nearest-Neighbors model to classify the input text to the most appropriate job titles, respectively in root and leaf levels of the taxonomy. Given a query job posting document, the coarse-grained classifier assigns the title to one of the 23 SOC majors. Then the fine-grained classifier is supposed to find the similar training data samples (neighbors) and classify the query according to the normalized titles of those retrieved neighbors.”)
Merhav and Wang are directed to job classification/clustering analysis. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have improved upon the data analysis of Merhav, as taught by Wang, by utilizing further classification analysis with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to modify the teachings of Merhav in order to improve job title classification (Wang Abstract).
Regarding Claim 5 and Claim 14, Merhav in view of Wang teach The method as in claim 1 wherein training, by the processor, the one or more vector representations for each job comprises:... and The system as in claim 10, wherein, when training the one or more vector representations for each job, the processor is further configured to:...
Merhav teaches clustering and classifying techniques and the feature is expounded upon by Wang:
implementing, by the processor, a neural network based on at least one of a distributed bag of words (DBOW) implementation, a skip-gram implementation, a Distributed Memory (DM) model, a Bidirectional Encoder Representations from Transformers (BERT) model and a Universal Language Model Fine-tuning (ULMFiT) model. (Wang Sec. IV.B-“To represent the input terms in vectors, the baseline model conducts experiments by fine-tuning 3 different models, Word2Vec [19], Doc2Vec [16] and Bag-of-Words [22], [28]. With the same testing data set, in Figure 3, we compare DeepCarotene with all 3 baseline models. The one with Word2Vec representation has the best performance and will be compared with DeepCarotene in details in this work.” [Word2Vec = skip-gram])
Merhav and Wang are directed to job classification/clustering analysis. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have improved upon the data analysis of Merhav, as taught by Wang, by utilizing further classification analysis with a reasonable expectation of success of arriving at the claimed invention. One of ordinary skill in the art would have been motivated to modify the teachings of Merhav in order to improve job title classification (Wang Abstract).
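For illustration only, the skip-gram technique noted above in connection with Word2Vec trains on (center word, context word) pairs drawn from a sliding window over the text. The following minimal sketch shows only that pair-generation step, under stated assumptions: the function name, the whitespace tokenization, and the window size are hypothetical, and no neural network training from Wang or from the application is reproduced here.

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) pairs within `window` positions of each token."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a token is not its own context
                pairs.append((center, tokens[j]))
    return pairs

# Hypothetical job-title text, tokenized on whitespace.
tokens = "senior software engineer".split()
pairs = skipgram_pairs(tokens, window=1)
```

A skip-gram model would then be trained to predict each context word from its center word, so that titles appearing in similar contexts receive nearby vectors.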

Regarding Claim 6 and Claim 15, Merhav in view of Wang teach The method as in claim 1 wherein reevaluating, by the processor, the one or more levels of granularity based on the training further comprises:... and The system as in claim 10, wherein, when reevaluating the one or more levels of granularity based on the training, the processor is further configured to:...

combining a first given vector representation and a second given vector representation into a single vector representation when a statistical insignificance is identified between the first given vector representation and the second given vector representation. (Merhav Par. 49-50-“ At operation 506, one or more objective functions are applied to each of at least one combination of two or more vectors. The objective function is selected based on the prediction that is attempting to be solved. For example, if the selected prediction that is attempting to be solved is whether a member of the social network having a particular title also has a certain skill, then the objective function may be a dot product function between vectors for title entities and vectors for skills entities. The result of the application of the objective function is an objective function output. At operation 508, an optimization test is applied to each of the outputs of the objective function. The purpose of the optimization test is to determine whether the embeddings have been optimized. This determination may be based on whether a machine learning model would, if fed at least one of the vectors in the combination, be accurate in the selected prediction. For example, if, as above, the prediction that is attempting to be solved is whether a member of the social network having a particular title also has a certain skill, then the result of operation 506 (the dot product of the title entity and skill entity) should be at a maximum (e.g., 1.0) for members with the title also having the skill. Thus, a machine learning model is run using one of the entities represented by the vectors in the combination to determine if it would accurately predict the presence of the other vector(s).”)
Regarding Claim 7 and Claim 16, Merhav in view of Wang teach The method as in claim 1, wherein naming, by the processor, the one or more resulting clusters, the representative title comprises:... and The system as in claim 10, wherein, when naming the one or more resulting clusters, the representative title, the processor is further configured to:...
selecting, for each cluster, a title based on a distance from centroid (X), a size (Y), and a binary variable which indicates whether this title exists without an industry (Z). (Merhav Abstract-“ In an example embodiment, a deep representation data structure is formed having first vectors representing titles from a social network data structure and second vectors representing skills from a social network data structure represented as coordinates in the deep representation data structure. One or more objective functions are applied to at least one combination of first vectors and second vectors in the deep representation data structure, causing an objective function output for each of the at least one combination. One or more coordinates for the vectors in the combination are clustered into occupation clusters, wherein each coordinate in each of the occupation clusters shares an occupation. A cluster representative is identified for each of the occupation clusters. Then, the cluster representative can be used to infer an occupation for a first member profile having a role that is ambiguous.”; Par. 20-“ In an example embodiment, a system is provided whereby latent vector representations for entities in a taxonomy are learned through machine learning techniques. Ultimately, every skill, title, or other standardized entity may be mapped to a vector representation, where “distance” is a well-defined quantity (e.g., Euclidean) and “relation” well defined as well (e.g., the subtraction of the two vectors).”; Par. 57-62)
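For illustration only, a selection based on the three recited factors (X, Y, Z) can be sketched as below. Neither the claim language nor the cited passages state how the factors are combined, so the weighted sum, the weights, and all names here are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    distance_from_centroid: float   # X
    size: int                       # Y (e.g., number of members or postings)
    exists_without_industry: bool   # Z


def score(c, w_x=1.0, w_y=0.001, w_z=0.5):
    """Hypothetical score: closer, larger, industry-free titles rank higher."""
    bonus = w_z if c.exists_without_industry else 0.0
    return -w_x * c.distance_from_centroid + w_y * c.size + bonus


def pick_representative(candidates, **weights):
    """Return the candidate with the highest score."""
    if not candidates:
        raise ValueError("cluster has no candidate titles")
    return max(candidates, key=lambda c: score(c, **weights))


cluster = [
    Candidate("software engineer", 0.10, 500, True),
    Candidate("software architect", 0.30, 120, True),
    Candidate("fintech software engineer", 0.05, 40, False),
]
print(pick_representative(cluster).title)
```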
Regarding Claim 8 and Claim 17, Merhav in view of Wang teach The method as in claim 1, wherein clustering, by the processor, the plurality of jobs into one or more clusters based on the one or more vector representations comprises:... and The system as in claim 10, wherein, when clustering the plurality of jobs into one or more clusters based on the one or more vector representations, the processor is further configured to:...
implementing one or more agglomerative models; and creating a hierarchical taxonomy, wherein one or more granular categories are each mapped to one broader category in a many-to-one mapping. (Merhav Par. 21-“ As described briefly above, an issue arises in that titles may be ambiguous. While one solution could be to predict the member's job title based on his or her skill set, the taxonomy of job titles has far too many titles that are indistinguishable from the perspective of a model that is based on member skills. For example, “programmer” versus “software engineer” may have some nuance that makes them not exactly the same, but seeking to predict at this granularity would produce arbitrary results. Additionally, the title taxonomy may have numerous synonymous titles or titles that have seniority tokens that are not usually obtainable from user skill sets. Examples include “software engineer,” “software developer,” and “programmer” as well as those including modifiers like junior and senior, but any model trained to distinguish between these based on skill sets would produce arbitrary results.”; Par. 65-67-“ At operation 1006, it is determined if the role is ambiguous. Ambiguity may be identified by determining whether the role can be mapped to two or more different title entities in a taxonomy of the social networking service. In some example embodiments, this social networking service taxonomy may be a unified taxonomy, as described above, where latent representation embedding is used to form a taxonomy including all skills and titles (and potentially other entities) into a single latent space. In other example embodiments, this social networking service taxonomy is a title taxonomy....At operation 1014, a specialty for the cluster representative with the closest set of skills is identified. This may be performed by referencing a static mapping with a combination of roles and specialties on one side of the mapping and occupations on the other side of the mapping. In other words, the static mapping contains a list of role-specialty combinations and corresponding occupations.”)
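For illustration only, the recited agglomerative clustering with a many-to-one mapping from granular to broader categories can be sketched as below. This is a generic single-linkage sketch and is not drawn from Merhav or from the application; the function names, the Euclidean distance, and the toy vectors are assumptions.

```python
import math


def agglomerate(vectors, n_clusters):
    """Single-linkage agglomeration: merge the two closest clusters until
    only n_clusters remain. `vectors` maps a job title to its vector."""
    clusters = [[name] for name in sorted(vectors)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def many_to_one(fine, broad):
    """Map each granular cluster index to the one broad cluster containing it."""
    mapping = {}
    for fi, fc in enumerate(fine):
        for bi, bc in enumerate(broad):
            if set(fc) <= set(bc):
                mapping[fi] = bi
                break
    return mapping


jobs = {
    "software engineer": (0.0, 0.0),
    "programmer": (0.1, 0.0),
    "software architect": (1.0, 0.0),
    "nurse": (10.0, 0.0),
    "physician": (10.5, 0.0),
}
fine = agglomerate(jobs, 3)    # granular level
broad = agglomerate(jobs, 2)   # broader level
print(many_to_one(fine, broad))
```

Because the merge order is deterministic, the broader level is reached by continuing the same sequence of merges, so every granular cluster falls inside exactly one broader cluster.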
Regarding Claim 9 and Claim 18, Merhav in view of Wang teach The method as in claim 1, wherein outputting, by the processor, an occupational taxonomy comprises:... and The system as in claim 10, wherein, when outputting the occupational taxonomy, the processor is further configured to:...
outputting the occupational taxonomy at least one of to a display and as a data set. (Merhav Par. 91-“ The output components 1252 may include visual components (e.g., a display such as a plasma display panel (PDP), a light emitting diode (LED) display, a liquid crystal display (LCD), a projector, or a cathode ray tube (CRT)), acoustic components (e.g., speakers), haptic components (e.g., a vibratory motor, resistance mechanisms), other signal generators, and so forth.”; Par. 59-61-“ A member title identification (entity) 902 is used for an embedding step 904, which outputs a title latent representation, specifically a vector 906, for the title. Likewise, multiple skill identifications (entities) 908A-908C are used for an embedding step 920, which outputs a skill latent representation of each skill, specifically vectors 912. Vectors 906 and 912 are then fed to an objective function 914, herein being a dot product operation plus a bias (e.g., addition or subtraction of a constant). Notably, however, vectors 912 may first be passed through a max pooling step 913.”)

Regarding Claim 10,  
Merhav teaches
A system for providing a universal occupational taxonomy comprising: a memory; a processor; and one or more code sets stored in the memory and executing in the processor which, when executed, configure the processor to: establish one or more levels of granularity of a plurality of jobs; (Merhav Par. 21-“ As described briefly above, an issue arises in that titles may be ambiguous. While one solution could be to predict the member's job title based on his or her skill set, the taxonomy of job titles has far too many titles that are indistinguishable from the perspective of a model that is based on member skills. For example, “programmer” versus “software engineer” may have some nuance that makes them not exactly the same, but seeking to predict at this granularity would produce arbitrary results. Additionally, the title taxonomy may have numerous synonymous titles or titles that have seniority tokens that are not usually obtainable from user skill sets. Examples include “software engineer,” “software developer,” and “programmer” as well as those including modifiers like junior and senior, but any model trained to distinguish between these based on skill sets would produce arbitrary results.”; Par. 70-71)
assemble data for each job of the plurality of jobs (Merhav Par. 32-“ As shown in FIG. 2, the data layer may include several databases, such as a profile database 218 for storing profile data, including both member profile data and profile data for various organizations (e.g., companies, schools, etc.). Consistent with some embodiments, when a person initially registers to become a member of the social networking service, the person will be prompted to provide some personal information, such as his or her name, age (e.g., birth date), gender, interests, contact information, home town, address, spouse's and/or family members' names, educational background (e.g., schools, majors, matriculation and/or graduation dates, etc.), employment history, skills, professional organizations, and so on. This information is stored, for example, in the profile database 218. Similarly, when a representative of an organization initially registers the organization with the social networking service, the representative may be prompted to provide certain information about the organization. This information may be stored, for example, in the profile database 218, or another database (not shown)...”);
 train one or more vector representations, wherein a vector representation is trained for each job of the plurality of jobs (Merhav Par. 43-“ In an example embodiment, one or more machine learning algorithms are used to aid in optimizing embedding used in the deep representation of entities. FIG. 4 is a block diagram illustrating a machine learning component 330 in more detail, in accordance with an example embodiment. The machine learning component 330 may utilize machine learning processes to arrive at a prediction model 400 used to provide a confidence score for a particular prediction. The exact prediction may vary based on the objective being selected. The machine learning component 330 may comprise a training component 402 and a confidence scoring component 404. The training component 402 feeds training data 406 comprising, for example, member profile data and member activity data into a feature extractor 408 that extracts one or more features 410 of the information. The training data 406 may also include output of an objective function 411 performed on embeddings 412 corresponding to the training data 406 (from, for example, the deep representation of entities, as will be described in more detail below). A machine learning algorithm 413 produces the prediction model 400 using the extracted features 410 and the output of the objective function 411. In some example embodiments, this involves the machine learning algorithm 413 learning weights to apply in the prediction model 400. In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction.”; Par. 45; Par. 47);
reevaluate the one or more levels of granularity based on the training  (Merhav Par. 44-“ It should be noted that the prediction model 400 may be periodically updated via additional training and/or user feedback 420. The user feedback 420 may be either feedback from members performing searches or from administrators. The user feedback 420 may include an indication about how successful the prediction model 400 is in providing accurate confidence scores.”); 
cluster the plurality of jobs into one or more clusters based on the one or more vector representations (Merhav Par. 62-63-“In an example embodiment, the above machine-learning algorithm using skill-set-based title prediction can be modified by first cutting down the list of potential outputs. This may be accomplished by clustering the title entities based on occupations. Then Knn clustering may be performed on the remaining embedded title space in the title-skill latent space, with distances measured by the cosine distance measure. Since the titles have been embedded into latent space, using their co-occurrences with skills, each title becomes a point in the latent space. These points can then be clustered and each cluster referred to as an occupational cluster. As each cluster is a set of several points (occupations), each cluster can be represented by one of these points, which is actually the point in the occupational cluster that has the most members. For example, the “software” cluster may have occupations such as “software engineer” and “software architect.” If “software engineer” is the occupation with the most members in this occupational cluster, then it is considered to be the cluster representative.”); 
name the one or more resulting clusters a representative title  (Merhav Par. 62-63-“ As each cluster is a set of several points (occupations), each cluster can be represented by one of these points, which is actually the point in the occupational cluster that has the most members. For example, the “software” cluster may have occupations such as “software engineer” and “software architect.” If “software engineer” is the occupation with the most members in this occupational cluster, then it is considered to be the cluster representative.”); 
and output an occupational taxonomy (Merhav Par. 64-65-“ At operation 1006, it is determined if the role is ambiguous. Ambiguity may be identified by determining whether the role can be mapped to two or more different title entities in a taxonomy of the social networking service. In some example embodiments, this social networking service taxonomy may be a unified taxonomy, as described above, where latent representation embedding is used to form a taxonomy including all skills and titles (and potentially other entities) into a single latent space. In other example embodiments, this social networking service taxonomy is a title taxonomy. See Fig. 10; Par. 66-“ If it is determined at operation 1006 that the role is not ambiguous, then at operation 1008 the role is output as the occupation of the member. If, however, it is determined that the role is ambiguous, then at operation 1010 one or more skills for the member are extracted from the member profile. At operation 1012, a closest occupation is determined by matching the skills in the latent space to find a cluster representative having the closest set of skills to the skills for the member. The occupation cluster having the cluster representative with the closest set of skills is determined to have the closest occupation.”).
Merhav teaches clustering and classifying techniques and the feature is expounded upon by Wang:

classify the one or more jobs to the one or more clusters (Wang Sec IV.B- “The training data are pairs of job posting document and the correspondence normalized job title. Based on this data set, it applies clustering technique to create a taxonomy discovery component. This taxonomy defines the fine-grained categories for each SOC major. Then it learns a two stage hierarchical k-Nearest-Neighbors model to classify the input text to the most appropriate job titles, respectively in root and leaf levels of the taxonomy. Given a query job posting document, the coarse-grained classifier assigns the title to one of the 23 SOC majors. Then the fine-grained classifier is supposed to find the similar training data samples (neighbors) and classify the query according to the normalized titles of those retrieved neighbors.”; Sec IV.A- “The domain specific problem of job title classification usually suffers from limited size of training data [21] or using simple clustering method to estimate the labels [2].”); 
Merhav and Wang are both directed to job classification/clustering analysis. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have improved upon the data analysis of Merhav, as taught by Wang, by utilizing further classification analysis, with a reasonable expectation of success in arriving at the claimed invention. One of ordinary skill in the art would have been motivated to modify the teachings of Merhav in order to improve job title classification (Wang Abstract).
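For illustration only, the two stage hierarchical k-Nearest-Neighbors classification described in the quoted Wang passage can be sketched as below. This is a minimal sketch under stated assumptions: Wang operates on job posting text, whereas this sketch assumes each posting has already been converted to a numeric vector, and the function names, the Euclidean distance, and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_label(query, samples, k):
    """Majority label among the k samples nearest to the query.
    `samples` is a list of (vector, label) pairs."""
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def two_stage_classify(query, training, k=3):
    """`training` is a list of (vector, major, normalized_title) triples."""
    # Stage 1 (coarse): assign the query to a major category.
    major = knn_label(query, [(v, m) for v, m, _ in training], k)
    # Stage 2 (fine): search only the neighbors within that major.
    within = [(v, t) for v, m, t in training if m == major]
    title = knn_label(query, within, min(k, len(within)))
    return major, title


training = [
    ((0.0, 0.0), "Computer", "software engineer"),
    ((0.1, 0.0), "Computer", "software engineer"),
    ((0.0, 0.2), "Computer", "data analyst"),
    ((5.0, 5.0), "Healthcare", "registered nurse"),
    ((5.1, 5.0), "Healthcare", "registered nurse"),
    ((5.0, 5.3), "Healthcare", "physician"),
]
print(two_stage_classify((0.05, 0.0), training))
```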
Regarding Claim 19,  
Merhav teaches
A method for providing a universal occupational taxonomy comprising: establishing, by a processor, one or more levels of granularity of a plurality of jobs; (Merhav Par. 21-“ As described briefly above, an issue arises in that titles may be ambiguous. While one solution could be to predict the member's job title based on his or her skill set, the taxonomy of job titles has far too many titles that are indistinguishable from the perspective of a model that is based on member skills. For example, “programmer” versus “software engineer” may have some nuance that makes them not exactly the same, but seeking to predict at this granularity would produce arbitrary results. Additionally, the title taxonomy may have numerous synonymous titles or titles that have seniority tokens that are not usually obtainable from user skill sets. Examples include “software engineer,” “software developer,” and “programmer” as well as those including modifiers like junior and senior, but any model trained to distinguish between these based on skill sets would produce arbitrary results.”; Par. 70-71)
assembling, by the processor, data for each job of the plurality of jobs (Merhav Par. 32-“ As shown in FIG. 2, the data layer may include several databases, such as a profile database 218 for storing profile data, including both member profile data and profile data for various organizations (e.g., companies, schools, etc.). Consistent with some embodiments, when a person initially registers to become a member of the social networking service, the person will be prompted to provide some personal information, such as his or her name, age (e.g., birth date), gender, interests, contact information, home town, address, spouse's and/or family members' names, educational background (e.g., schools, majors, matriculation and/or graduation dates, etc.), employment history, skills, professional organizations, and so on. This information is stored, for example, in the profile database 218. Similarly, when a representative of an organization initially registers the organization with the social networking service, the representative may be prompted to provide certain information about the organization. This information may be stored, for example, in the profile database 218, or another database (not shown)...”);
 training, by the processor, one or more vector representations, wherein a vector representation is trained for each job of the plurality of jobs (Merhav Par. 43-“ In an example embodiment, one or more machine learning algorithms are used to aid in optimizing embedding used in the deep representation of entities. FIG. 4 is a block diagram illustrating a machine learning component 330 in more detail, in accordance with an example embodiment. The machine learning component 330 may utilize machine learning processes to arrive at a prediction model 400 used to provide a confidence score for a particular prediction. The exact prediction may vary based on the objective being selected. The machine learning component 330 may comprise a training component 402 and a confidence scoring component 404. The training component 402 feeds training data 406 comprising, for example, member profile data and member activity data into a feature extractor 408 that extracts one or more features 410 of the information. The training data 406 may also include output of an objective function 411 performed on embeddings 412 corresponding to the training data 406 (from, for example, the deep representation of entities, as will be described in more detail below). A machine learning algorithm 413 produces the prediction model 400 using the extracted features 410 and the output of the objective function 411. In some example embodiments, this involves the machine learning algorithm 413 learning weights to apply in the prediction model 400. In the confidence scoring component 404, one or more entities 414, as well as one or more outputs of objective function(s) 416 performed on embeddings 418 corresponding to the one or more entities 414, may be fed to the prediction model 400, which outputs a confidence score for each of one or more potential predictions, indicating a confidence level in the corresponding potential prediction.”; Par. 45; Par. 47);
reevaluating, by the processor, the one or more levels of granularity based on the training  (Merhav Par. 44-“ It should be noted that the prediction model 400 may be periodically updated via additional training and/or user feedback 420. The user feedback 420 may be either feedback from members performing searches or from administrators. The user feedback 420 may include an indication about how successful the prediction model 400 is in providing accurate confidence scores.”); 
clustering, by the processor, the plurality of jobs into one or more clusters based on the one or more vector representations (Merhav Par. 62-63-“In an example embodiment, the above machine-learning algorithm using skill-set-based title prediction can be modified by first cutting down the list of potential outputs. This may be accomplished by clustering the title entities based on occupations. Then Knn clustering may be performed on the remaining embedded title space in the title-skill latent space, with distances measured by the cosine distance measure. Since the titles have been embedded into latent space, using their co-occurrences with skills, each title becomes a point in the latent space. These points can then be clustered and each cluster referred to as an occupational cluster. As each cluster is a set of several points (occupations), each cluster can be represented by one of these points, which is actually the point in the occupational cluster that has the most members. For example, the “software” cluster may have occupations such as “software engineer” and “software architect.” If “software engineer” is the occupation with the most members in this occupational cluster, then it is considered to be the cluster representative.”); 
naming, by the processor, the one or more resulting clusters a representative title  (Merhav Par. 62-63-“ As each cluster is a set of several points (occupations), each cluster can be represented by one of these points, which is actually the point in the occupational cluster that has the most members. For example, the “software” cluster may have occupations such as “software engineer” and “software architect.” If “software engineer” is the occupation with the most members in this occupational cluster, then it is considered to be the cluster representative.”); 
and outputting, by the processor, an occupational taxonomy at least one of to a display and as a data set. (Merhav Par. 64-65-“ At operation 1006, it is determined if the role is ambiguous. Ambiguity may be identified by determining whether the role can be mapped to two or more different title entities in a taxonomy of the social networking service. In some example embodiments, this social networking service taxonomy may be a unified taxonomy, as described above, where latent representation embedding is used to form a taxonomy including all skills and titles (and potentially other entities) into a single latent space. In other example embodiments, this social networking service taxonomy is a title taxonomy. See Fig. 10; Par. 66-“ If it is determined at operation 1006 that the role is not ambiguous, then at operation 1008 the role is output as the occupation of the member. If, however, it is determined that the role is ambiguous, then at operation 1010 one or more skills for the member are extracted from the member profile. At operation 1012, a closest occupation is determined by matching the skills in the latent space to find a cluster representative having the closest set of skills to the skills for the member. The occupation cluster having the cluster representative with the closest set of skills is determined to have the closest occupation.”; Par. 91-“ The output components 1252 may include visual components (e.g., a display such as a plasma display panel (PDP), a light emitting diode (LED) display, a liquid crystal display (LCD), a projector, or a cathode ray tube (CRT)), acoustic components (e.g., speakers), haptic components (e.g., a vibratory motor, resistance mechanisms), other signal generators, and so forth.”; Par. 43-“ The training data 406 may also include output of an objective function 411 performed on embeddings 412 corresponding to the training data 406 (from, for example, the deep representation of entities, as will be described in more detail below)”;  Par. 57; Par. 66).
Merhav teaches clustering and classifying techniques and the feature is expounded upon by Wang:

classifying, by the processor, the one or more jobs to the one or more clusters (Wang Sec IV.B- “The training data are pairs of job posting document and the correspondence normalized job title. Based on this data set, it applies clustering technique to create a taxonomy discovery component. This taxonomy defines the fine-grained categories for each SOC major. Then it learns a two stage hierarchical k-Nearest-Neighbors model to classify the input text to the most appropriate job titles, respectively in root and leaf levels of the taxonomy. Given a query job posting document, the coarse-grained classifier assigns the title to one of the 23 SOC majors. Then the fine-grained classifier is supposed to find the similar training data samples (neighbors) and classify the query according to the normalized titles of those retrieved neighbors.”; Sec IV.A- “The domain specific problem of job title classification usually suffers from limited size of training data [21] or using simple clustering method to estimate the labels [2].”); 
Merhav and Wang are both directed to job classification/clustering analysis. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have improved upon the data analysis of Merhav, as taught by Wang, by utilizing further classification analysis, with a reasonable expectation of success in arriving at the claimed invention. One of ordinary skill in the art would have been motivated to modify the teachings of Merhav in order to improve job title classification (Wang Abstract).
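For illustration only, the cluster representative described in the Merhav passages cited above for the clustering and naming limitations (the occupation with the most members) and the closest-representative matching by cosine distance can be sketched as below. The function names, member counts, and vectors are hypothetical and are not taken from Merhav.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def cluster_representative(cluster, member_counts):
    """The occupation in the cluster that has the most members."""
    return max(cluster, key=lambda title: member_counts[title])


def closest_representative(query_vec, representatives, vectors):
    """The representative whose vector is nearest to the query vector."""
    return min(representatives,
               key=lambda r: cosine_distance(query_vec, vectors[r]))


clusters = {
    "software": ["software engineer", "software architect"],
    "health": ["nurse", "physician"],
}
member_counts = {"software engineer": 900, "software architect": 200,
                 "nurse": 700, "physician": 300}
vectors = {"software engineer": (1.0, 0.1), "nurse": (0.1, 1.0)}

reps = [cluster_representative(c, member_counts) for c in clusters.values()]
print(reps)
print(closest_representative((0.9, 0.2), reps, vectors))
```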
Regarding Claim 20,  
The method as in claim 19, further comprising naming, by the processor, the one or more resulting clusters a representative title. (Merhav Par.65-66-“ At operation 1006, it is determined if the role is ambiguous. Ambiguity may be identified by determining whether the role can be mapped to two or more different title entities in a taxonomy of the social networking service. In some example embodiments, this social networking service taxonomy may be a unified taxonomy, as described above, where latent representation embedding is used to form a taxonomy including all skills and titles (and potentially other entities) into a single latent space. In other example embodiments, this social networking service taxonomy is a title taxonomy. If it is determined at operation 1006 that the role is not ambiguous, then at operation 1008 the role is output as the occupation of the member.”)
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: US Publication No. 20180144253A1 to Merhav et al.- Abstract-“ In an example embodiment, for each of a plurality of different titles in a social network structure, the title is mapped into a first vector having n coordinates, while skills are mapped into a second vector having n coordinates. The first and second vectors are stored in a deep representation data structure. One or more objective functions are applied to at least one combination of two or more of the vectors in the deep representation data structure. Then, an optimization test on each of the at least one combination is performed using a corresponding objective function output for each of the at least one combination of two or more of the vectors, and, for any combination that did not pass the optimization test, one or more coordinates for the vectors in the combination are altered so that the vectors in the combination become closer together within an n-dimensional space.”

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Chesiree Walton, whose telephone number is (571) 272-5219.  The examiner can normally be reached from Monday to Friday between 8 AM and 5 PM.  If any attempt to reach the examiner by telephone is unsuccessful, the examiner’s supervisor, Patricia Munson, can be reached at (571) 270-5396.  The fax telephone numbers for this group are either (571) 273-8300 or (703) 872-9326 (for official communications including After Final communications labeled “Box AF”).
	Another resource that is available to applicants is the Patent Application Information Retrieval (PAIR) system. Information regarding the status of an application can be obtained from the PAIR system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, please feel free to contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
	Applicants are invited to contact the Office to schedule an in-person interview to discuss and resolve the issues set forth in this Office Action.  Although an interview is not required, the Office believes that an interview can be of use to resolve any issues related to a patent application in an efficient and prompt manner.
Sincerely,
/CHESIREE A WALTON/ Examiner, Art Unit 3624