Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed with respect to amended claims 1-20 on 8/9/2021, directed to the applied prior art, have been fully considered but they are not persuasive.

(A)	In re pages 9-10, applicant states that the claims are directed to, and involve, first training a machine learning model to predict or identify entities in portions of text extracted from a first set of job postings, as described by applicant:

“Accordingly, claim 1 involves first training a machine learning model to predict or identify entities in portions of text extracted from a first set of job postings, and then, processing with the trained model portions of text that have been extracted from a second set of job postings to identify or tag entities, by type, within the portions of the text. 

By way of example, the text that is input to the model may specify a job title (e.g., “Looking for a software engineer with ten years of experience 1


Upon processing the text, the model outputs a tag (e.g., the named entity type) for the relevant portion of text — e.g., “software engineer.” 

Consistent with Applicants’ claimed invention, the machine learning model is capable of identifying and tagging a wide variety of named entity types associated with job postings.

In contrast, the Bhadouria reference describes a classifier that is trained to predict an industry for a job posting, based on the entire text of the job posting. 

As such, the Bhadouria reference is not identifying an entity type associated with specific text within the job description, but is instead, identifying or determining the named entity itself, which is an industry. 

For example, based on an analysis of all text in a job posting, the classifier of Bhadouria may output an industry (e.g., “legal” or “consulting services”), for a job posting. 

This is substantially different from what Applicants have described and claimed. At least for these reasons, Applicants respectfully request reconsideration, withdrawal of the rejection, and allowance of the claims.”


	In response, the examiner respectfully agrees that the system of Bhadouria predicts an industry; however, the predicted industries are plural and are seen as plural Industries or plural Types.
	The examiner also agrees that applicant's disclosure includes Types such as, as stated above, “software engineer,” and that, consistent with Applicants’ claimed invention, the machine learning model is capable of identifying and tagging a wide variety of named entity types associated with job postings.

In response to applicant's argument that the references fail to show certain features of applicant’s invention, it is noted that the features upon which applicant relies (i.e., software engineer) are not recited in the rejected claim(s).  Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims.  See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

	Rather, in accordance with applicant's specification, the scope of entity types includes the following.
SEE US 2020/0311112 at 0014, stating that the entities can be any of:
A machine learning model is trained to predict a named entity (e.g., job title, company, industry, function, location, seniority, skill, etc.), associated with a keyword in a job description and/or other text in a job posting based on a label for the entity that is generated from structured data for the job posting and/or human input.

Therefore, the entities include Industry (industries being Types), along with associated job titles, which are understood as INDUSTRIES or Types in applicant’s specification as well as in the applied prior art.

Bhadouria appears to disclose at least Entity Types, being Labels added based on text portions found in job postings (Fig. 4), understood as performing feature extraction and machine learning to produce the classification model, applied with user feedback and threshold determiner 420.
SEE 0056-: the Entity Types include Industries.
SEE the raw Industries, used to determine similar and/or derived industry groups.

SEE the feature index IDF, with the determining based on similar industry groups.

An example: job postings containing the same normalized company name/identification.

Note the TF-IDF (term frequency and inverse document frequency) vectors, directed to industries as well as industry Groups (or Types).

[0056] FIG. 7 is a flow diagram illustrating a method 700 of classifying industries for candidate job postings using a raw industry classification model 400, in accordance with an example embodiment. At operation 702, the raw industry classification model 400 is fetched. At operation 704, the feature index IDF and similar industry group files are loaded. At operation 706, one or more job postings corresponding to a single company (e.g., containing the same normalized company name/identification) are obtained. At operation 708, TF-IDF vector calculation is performed for each of the terms in one or more job postings corresponding to the single company. At operation 710, the final TF-IDF vector is used to perform logistic regression classification, which uses the learned coefficients to output the top k (k=3) predictions with their prediction scores. At operation 712, the job-posting post-processing component 312 may compute two other derived industry groups from the raw predictions. The first is the top k+ similar industries. This is computed by combining a list of the top k industries from the previous operation as well as every industry similar to each of the top k industries (as identified in an industry similarity table). The second is top k dissimilar industry, which comprises picking the top k industries so that no 2 industries in the set are similar. This helps give breadth to the predicted industries to be assigned to each job posting, with the recognition that is better to have industries that perhaps are not actually relevant to the job posting listed in the job posting than to not have an industry that is relevant to the job posting, as in the latter case a search by a member on the industry would yield no results, whereas in the former the member could always filter out results he or she views as unrelated.
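
For illustration only, the derived industry groups of operation 712 can be sketched as follows. This is a minimal sketch and not code from Bhadouria: the function names, the example scores, the similarity table contents, and the greedy selection used for the dissimilar group are all assumptions made for the example.

```python
from typing import Dict, List, Set


def top_k(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k highest-scoring industries, best first (ties broken by name)."""
    return sorted(scores, key=lambda name: (-scores[name], name))[:k]


def top_k_plus_similar(scores: Dict[str, float],
                       similar: Dict[str, Set[str]],
                       k: int = 3) -> List[str]:
    """Top k industries plus every industry listed as similar to any of them."""
    result = top_k(scores, k)
    for industry in list(result):
        for other in sorted(similar.get(industry, set())):
            if other not in result:
                result.append(other)
    return result


def top_k_dissimilar(scores: Dict[str, float],
                     similar: Dict[str, Set[str]],
                     k: int = 3) -> List[str]:
    """Pick up to k industries by descending score so that no two are similar.

    Greedy selection is an assumption; the quoted paragraph only states the
    constraint that no 2 industries in the set are similar.
    """
    picked: List[str] = []
    for industry in top_k(scores, len(scores)):
        clashes = any(
            industry in similar.get(p, set()) or p in similar.get(industry, set())
            for p in picked
        )
        if not clashes:
            picked.append(industry)
        if len(picked) == k:
            break
    return picked


# Hypothetical prediction scores and similarity table.
scores = {"health care": 0.62, "hospitals": 0.20, "software": 0.10,
          "insurance": 0.05, "legal": 0.03}
similar = {"health care": {"hospitals"}, "software": {"internet"}}

raw = top_k(scores)
plus_similar = top_k_plus_similar(scores, similar)
dissimilar = top_k_dissimilar(scores, similar)
```

With these hypothetical inputs, the raw group holds the three best-scoring industries, the second group additionally pulls in "internet" from the similarity table, and the third group skips "hospitals" because it is listed as similar to "health care".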

As to Types (0016), for job postings including Software Engineers and/or an IT engineer, Bhadouria identifies the industry as HEALTH CARE, as one of the types of industries, and therefore also appears to be, as argued, adapted to or capable of identifying and tagging a wide variety of named entity types associated with job postings.

SEE training a model with selected features (0051).
Note, based on the above, Bhadouria also predicts the Industries (or Types), as claimed (0037, 0040), after training the raw industry classification model.
[0037] It should be noted that the raw industry classification model 400 may be periodically updated via additional training and/or user feedback. The user feedback may be either feedback from members performing searches, or from companies corresponding to the job postings. The feedback may include an indication about how successful the raw industry classification model 400 is in predicting an industry for job postings based upon information in the job postings. 

[0040] There are various technical problems encountered in attempting to automatically assign an industry to a job posting. Among these technical problems are that the job description in a job posting may provide a very noisy signal for industry classification of a company, the training data may be heavily skewed if there are more jobs in few industries, a company may post lots of jobs in other domains or industries than their primary industry, and poor accuracy for predicting a single perfect industry for a company. 

	Therefore, the arguments are not deemed persuasive, since Bhadouria appears to perform as claimed, by identifying entity types (Types of Industries) associated with specific text within the job description and predicting the entity types, being industries, including Software and IT Engineers as well as health care types, in view of the fact that a job posting can comprise noisy information leading to false positives (in industry tagging of job posts).
SEE (0016)
[0016] In an example embodiment, an automated solution for identifying industries relating to a company from a job posting using a machine learning algorithm is provided. The solution comprises a two phase process. In the first phase, offline training of a machine learning model is conducted by training a multi-class logistic regression model with labeled training data to generate a model with feature vectors and model coefficients. In the second phase, online classification of companies identified in previously unseen job postings is performed by using the trained model to associate one or more industries with the identified company. This process is able to handle a wide variety of job posting input to provide reliable results even when job posting information is noisy and could otherwise lead to false positives. For example, while many different types of companies may post job postings for software engineers, that does not make those companies software companies. An example would be a health care company looking to hire an IT engineer. The machine learning algorithm is able to correctly identify the industry as "health care" as opposed to "software" despite the fact that the job itself, and thus many of the keywords contained in the job posting, pertain(s) to software.
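
For illustration only, the following sketch shows how TF-IDF weighting, which Bhadouria applies at operation 708 of method 700, discounts terms that appear in postings across many companies. It is not code from Bhadouria: the toy postings, the function names, and the IDF variant log(N/df) are assumptions, and the sketch covers only the feature weighting, not the logistic regression model.

```python
import math
from collections import Counter
from typing import Dict, List


def inverse_document_frequency(docs: List[List[str]]) -> Dict[str, float]:
    """IDF per term, using the plain log(N / document_frequency) variant (an assumption)."""
    n = len(docs)
    document_frequency = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in document_frequency.items()}


def tf_idf(doc: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """Relative term frequency in the document multiplied by the term's IDF."""
    counts = Counter(doc)
    return {term: (count / len(doc)) * idf.get(term, 0.0)
            for term, count in counts.items()}


# Hypothetical postings from three different companies; all mention software engineers.
postings = [
    "software engineer for patient records at our clinic".split(),
    "software engineer for trading platform at our bank".split(),
    "software engineer for search ranking at our startup".split(),
]

idf = inverse_document_frequency(postings)
weights = tf_idf(postings[0], idf)

# Terms that keep a non-zero weight in the first (health care) posting.
informative = sorted(term for term, weight in weights.items() if weight > 0)
```

In this toy corpus, "software" and "engineer" occur in every posting and receive a weight of zero, while the terms specific to the first company keep a positive weight.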


	The examiner welcomes applicant to request an interview to discuss potentially distinguishing subject matter, in an effort to enhance compact prosecution as well as clarity of the record.




Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-18 and 20 are rejected under 35 U.S.C. 102 as being anticipated by Bhadouria et al. (US 2017/0300862).
	Regarding claim 1, Bhadouria is deemed to disclose, as claimed, a method supported by structures, the method comprising:
obtaining labels (Fig. 4, Feature Extraction 408 to Machine Learning 412) for entities found in portions of text in a first set of job postings (SEE Fig. 4, input of Sample Job Postings 406), each label indicating an entity type for a portion of text;
applying, by the one or more computer systems, the machine learning model (412), to portions of text from a second set of job postings (414, w/feature extractor 416), to generate predictions of additional entity types (see 0037, 0040) in the additional portions of text in the second set of job postings
[0040] There are various technical problems encountered in attempting to automatically assign an industry to a job posting. Among these technical problems are that the job description in a job posting may provide a very noisy signal for industry classification of a company, the training data may be heavily skewed if there are more jobs in few industries, a company may post lots of jobs in other domains or industries than their primary industry, and poor accuracy for predicting a single perfect industry for a company.

Also see detail at 0056-, with scores used to derive Industry Groups.

o	creating, based on the predictions, an index (see 0056, 0057-, “score exceeding a threshold”), comprising mappings of the additional entities (see Predicted Industries), to subsets of the second set of job postings in which the additional entities are found.

SEE 0057, the predicted Industries (which are Types) being added to the job postings for the company.
[0057] At operation 714, one or more of the predicted industries are selected to be added to the job postings for the company. These are selected from among the industries identified in operation 710 and/or operation 712. At operation 716, it is determined if at least one of the selected predicted industries has a prediction score exceeding an accuracy threshold. If so, then at operation 718, the one or more predicted industries are added to the job postings for the company. If not, then the process ends. 
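
For illustration only, the threshold test of operations 716 and 718 can be sketched as follows. This is a minimal sketch and not code from Bhadouria: the function name, the example scores, and the threshold values are assumptions.

```python
from typing import Dict, List


def industries_to_add(selected: Dict[str, float], accuracy_threshold: float) -> List[str]:
    """Return the selected industries if at least one score exceeds the threshold.

    Mirrors the quoted paragraph: when no selected industry clears the
    threshold, nothing is added and the process ends.
    """
    if any(score > accuracy_threshold for score in selected.values()):
        return list(selected)
    return []


# Hypothetical selected predictions with their prediction scores.
selected = {"health care": 0.62, "software": 0.10}

added = industries_to_add(selected, accuracy_threshold=0.5)
skipped = industries_to_add(selected, accuracy_threshold=0.7)
```

Note that, as the paragraph is worded, all selected industries are added once any one of them clears the threshold; the sketch follows that reading.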

[0058] In an example embodiment, the prediction scores may be calculated by a validation response handler. This validation response handler computes the following statistics: a. Top performing metric per industry: raw top 3, top 3+similar or top k dissimilar b. Prediction accuracy per employer group c. Average prediction metric accuracy d. Total number of industries tagged e. Total number of industries tagged for companies with low company standardizer score. f. Total jobs tagged by industry g. Total jobs skipped due to low accuracy h. Company and job Industry overrides by Analyst/Quality Assurance (QA) 


	Regarding claim 2, Bhadouria as applied is deemed to further meet as claimed, comprising:
O	obtaining one or more entities extracted from parameters of a job search (see Fig. 4, User feedback or search, behavior, 0003, 0006, 0028, 0032, 0037 and “Indexing”); matching the one or more entities to one or more of the mappings in the index; and adding one or more job postings 
SEE 0006 & 0032

	Regarding claim 3 of claim 2, Bhadouria as applied is deemed to further meet as claimed, wherein obtaining the one or more entities extracted from the parameters of the job search comprises: matching a parameter of the job search to a standardized value for an entity
SEE 0034, post-processing including standardization of company name and location

Regarding claim 4 of claim 1, Bhadouria as applied is deemed to further meet as claimed, wherein obtaining the labels for the entities found in the portions of text in the first set of job postings comprises: 
O	obtaining the labels from fields (an area), in structured data for the first set of job postings
SEE Job Postings include at least Fields (0002) in structured data (see postings and fields, 0044, 0059 and Fig. 4)
SEE 0059, Text Portion (or a Field) in structured data (Form or Format)

“The one or more features may include, for example, a filtered list of terms in a textual portion of the sample computerized job posting.”

inputting the portions of text and the labels as training data for the machine learning model comprises: 
O	locating the entities in the portions of text within the first set of job postings and
training the machine learning model to predict entity types corresponding with the labels based on the portions of text containing the corresponding entities in the first set of job postings
SEE 0057-, 0056, 0040 and 0037

	Regarding claim 6, Bhadouria as applied is deemed to further meet as claimed, wherein applying the machine learning model to the second set of job postings to generate the predictions of the additional entities in the additional portions of text in the second set of job postings comprises:
O	dividing a job posting into one or more portions of text;
O	extracting one or more text windows comprising a keyword from a portion in the one or more portions of text; and applying the machine learning model to the one or more text windows to generate a prediction of an entity type for the keyword.
SEE 0059

SEE 0059 and 0055
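
For illustration only, the claimed steps of dividing a job posting into portions of text and extracting text windows comprising a keyword can be sketched as follows. This is a minimal sketch of the claim language, not code from either reference: the splitting rule, the window radius, and the function names are assumptions, and the example sentence borrows the job-title text quoted in applicant's remarks.

```python
import re
from typing import List


def split_into_portions(posting: str) -> List[str]:
    """Divide a posting into portions at line breaks and sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+|\n+", posting)
    return [part.strip() for part in parts if part and part.strip()]


def text_windows(portion: str, keyword: str, radius: int = 2) -> List[str]:
    """Return windows of `radius` tokens on each side of every keyword occurrence."""
    tokens = portion.split()
    normalized = [token.strip(".,;:!?").lower() for token in tokens]
    target = keyword.lower().split()
    windows: List[str] = []
    for i in range(len(tokens) - len(target) + 1):
        if normalized[i:i + len(target)] == target:
            start = max(0, i - radius)
            end = min(len(tokens), i + len(target) + radius)
            windows.append(" ".join(tokens[start:end]))
    return windows


posting = ("We are hiring. Looking for a software engineer with ten years "
           "of experience.\nApply today!")

portions = split_into_portions(posting)
windows = text_windows(portions[1], "software engineer", radius=2)
```

Each extracted window would then be passed to the trained model to predict an entity type for the keyword; that model call is outside the scope of this sketch.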

Regarding claim 8 of claim 6, Bhadouria as applied is deemed to further meet as claimed wherein applying the machine learning model to the one or more text windows to generate the prediction of the entity type for the keyword comprises: segmenting words in the one or more text windows into sub-words; and inputting the sub-words into the machine learning model to generate the prediction of the entity type for the keyword
SEE 0059, w/parser to filtered list of terms

Regarding claim 9 of claim 6, Bhadouria as applied is deemed to further meet as claimed, wherein creating the index comprising the mappings of the additional entities to the subsets of the second set of job postings in which the additional entities are found comprises:
mapping a representation of the entity and the keyword to one or more job postings in the second set of job postings for which the prediction of the entity type for the keyword was generated
SEE 0055, 0032

Regarding claim 10 of claim 6, Bhadouria as applied is deemed to further meet as claimed wherein each portion in the one or more portions comprises at least one of: a sentence; a paragraph and a line
SEE 0049, abstract, Textual Data or a Line (or a Word), (in portion), w/features (w/Chi square estimator)



Regarding claim 11 of claim 1, Bhadouria as applied is deemed to further meet as claimed wherein the entity types and the additional entity types comprise at least one of:
O	a company (abstract, 0002-);
O	a title (0016);
O	a type of employment (by Industry);
O	an industry (Abstract);
O	a function (see 0016, Health Care);
O	a seniority;

O	a location (0003); and
O	also, an irrelevant entity type (or Types)

Claims 12-20 are deemed analyzed and discussed with respect to claims 1-11, above; the Scores are also based on the fixed-size windows of text extracted from the job postings, as claimed.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Bhadouria in view of Martin (US 2019/0325863). 

SEE Martin at 0065, fastText, with words broken into n-grams.

Therefore, the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious, before the effective filing date of the claimed invention, to a person having ordinary skill in the art, who would modify Bhadouria with the teachings of Martin to apply a fastText model as the machine learning model, having the advantage of breaking words into several n-grams instead of feeding individual words into the neural network, as taught by Martin at 0065.
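
For illustration only, breaking a word into n-grams can be sketched as follows. This is a minimal sketch and not code from Martin: it assumes character-level n-grams with "<" and ">" boundary markers and a 3-to-6 length range, as commonly used with fastText models, and the function name is hypothetical.

```python
from typing import List


def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> List[str]:
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    grams: List[str] = []
    for n in range(n_min, n_max + 1):
        # Slide a window of length n across the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


trigrams = char_ngrams("java", n_min=3, n_max=3)
```

Because related words share many of these sub-word units, a model fed n-grams can relate a previously unseen word to words seen in training, which is the advantage over feeding whole words.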
 
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 




Contact Information
Any inquiry concerning this communication or earlier communications should be directed to the examiner of record Vincent F. Boccio whose telephone number is (571) 272-7373. 

The examiner can normally be reached Monday-Thursday, 8:30 AM to 5:00 PM.

The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Boris Gorney, can be reached at (571) 270-5626.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR.

Status information for unpublished applications is available through Private PAIR only.

For more information about the PAIR system:

"http://portal.uspto.gov/external/portal/pair" 

Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/VINCENT F BOCCIO/Primary Examiner, Art Unit 2158