DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Claims 1 and 9 were amended. Claims 1, 3-9, and 11-16 are pending.
The rejection under 35 USC 103 is maintained. See response to arguments.

Response to Arguments
Applicant's arguments filed 12/14/2021 have been fully considered but they are not persuasive.

Applicant argues, see pages 7-8, that Vivas fails to teach “each of a plurality n of the predefined features is represented by an axis in an n-dimensional space” because the Office Action cites Vivas, saying “Any subset of the features may be represented using axes in a vector space either by taking the numerical value or by using one-hot encoding for the categorical features”. Applicant argues that this is insufficient to show inherency.
Examiner respectfully disagrees that the rejection is improper for this reason. It is important to consider what the limitation “wherein each of a plurality n of the predefined features is represented by an axis in an n-dimensional space” requires. It does not require that the computer store an n-dimensional vector space. It does not require that the computer store a mapping such as a lookup table that takes a feature to an axis of a vector space. Neither is this a positively recited step of representing each of the features as an axis in an n-dimensional space. A vector space is an abstract mathematical construct. The limitation requires that it be possible to consider the features as axes in some n-dimensional space. Based on the specification (see, e.g., [0025]), the broadest reasonable interpretation of the limitation includes a vector whose components represent feature values. For example, (31.4 centigrade, 188 centimeters, 76.5 kilograms, 13-wide) is a vector whose components represent temperature, height, weight and shoe size. This is completely analogous to the interpretation taken in the Office Action that a 
 
Applicant further argues, see especially pages 8-9, that Lu fails to teach “axes of the n-dimensional vector” or any vector structure at all. Examiner respectfully disagrees that the rejection is improper for that reason. This argument considers Lu in isolation. In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references.  See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). Vivas and Tsyganskiy clearly teach vectors storing feature data. Lu is relied upon to teach determining missing data using other data and location information (which is part of the data taught by Vivas, see [0020]). 

Applicant further argues, see page 10, that Lu is non-analogous art. In response to applicant's argument that Lu is non-analogous art, it has been held that a prior art reference must either be in the field of applicant’s endeavor or, if not, then be reasonably pertinent to the particular problem with which the applicant was concerned, in order to be relied upon as a basis for rejection of the claimed invention. See In re Oetiker, 977 F.2d 1443, 24 USPQ2d 1443 (Fed. Cir. 1992). In this case, Applicant is concerned with determining missing data including “mapping a value associated with a first location feature represented by one or more existing axes of the n-dimensional vector in the n-dimensional space onto a new axis corresponding to a missing second location feature using an orthogonality relationship”. That is, the claim includes a recitation of determining missing data based on location data. This is precisely what Lu is

The rejection under 35 USC 103 is maintained.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 7-9, and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over “Vivas” (US 2016/0092840 A1), in view of “Tsyganskiy” (US 2004/0107243 A1), further in view of “Lu” (An adaptive inverse-distance weighting spatial interpolation technique), and further in view of “Traupman” (US 2014/0358826 A1).

Regarding claim 1, Vivas teaches
A method of machine learning for generating a predictive model of a response characteristic  ([0020] describes creating a model that predicts the performance of a premium job listing. An outline of the method is shown in figure 3.)
based on historical data elements, the method comprising:  ([0023], second sentence describes using data associated with job listings that were "upsold" and determining the actual number of applicants. That is, the data used includes historical job listing data for which past performance is known.)
using a processor:  ([0042] indicates that any of the modules may be performed using a processor.)
receiving non-uniform or non-structured historical data elements and historical values for the response characteristic related to uses of the historical data elements in web pages;    ([0015] describes ingesting job posting data from third-party job sites. That is, the data is ingested from a variety of sources, so this data is understood to be non-uniform. A person skilled in the art would also recognize that data obtained by “crawl[ing]” the web will have no predetermined structure (i.e. is unstructured). This data is in addition to data obtained by the site itself as described in [0014]. This step is shown in figure 3, step 310. [0023], second sentence describes using the "actual number of applicants", which is a response characteristic. [0004] indicates that the job postings which are potentially being upsold are hosted on a website. [0027] indicates that premium includes various options such as sharing and editing, which are uses of the data elements (job postings) on the website.)
…values of a plurality of predefined features representing properties of the historical data elements, … and wherein each of a plurality n of the predefined features is represented by an axis in an n-dimensional space;  ([0018] describes using job listing data to predict the success of an upsold job listing. The job listing data includes, e.g., job features. [0055] indicates that "job features" could include country, region, job title, job functions. [0057] describes some other features associated with the job posting via the company information. [0059] describes data associated with members and their activity. Any subset of the features may be represented using axes in a vector space either by taking the numeric value of a feature or by using one-hot encoding for the categorical features.)
training a predictive model…; and predicting a value of the response characteristic for a new data element using the trained predictive model.   ([0020] describes creating (i.e., training) a model that predicts the performance of a premium job listing. Since the social network system creates a model, it is understood to correspond to a model generator. Figure 3, step 320 shows predicting a value based on accessed data. This step is further described in [0063]. The "value" is described in [0064].)
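For illustration only, the following sketch shows how a numeric feature and a categorical feature could each be placed on axes of a vector space, with one-hot encoding used for the categorical feature. It is not taken from Vivas or from the specification; the category list `REGIONS`, the helper `to_vector`, and the sample values are hypothetical.

```python
# Hypothetical feature layout: one numeric feature plus one categorical
# feature expanded by one-hot encoding. None of these names come from Vivas.
REGIONS = ["EMEA", "APAC", "AMER"]  # assumed category values


def to_vector(numeric_value, region):
    """Return a vector with one axis for the numeric feature and one per category."""
    one_hot = [1.0 if region == r else 0.0 for r in REGIONS]
    return [float(numeric_value)] + one_hot


vec = to_vector(12, "APAC")
print(vec)  # [12.0, 0.0, 1.0, 0.0]
```

In this sketch n = 4: the numeric feature occupies the first axis and each possible region occupies one of the remaining three.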
	Vivas does not appear to explicitly teach
extracting from the historical data elements, a plurality of key-value pairs defining… wherein each of the plurality of key-value pairs is represented by a two-dimensional vector comprising a first element that is a key of the feature and a second element that is the value of the feature, 
	…mapping the extracted plurality of key-value pairs for each historical data element onto an n-dimensional vector in the n-dimensional space so as to map each of the plurality of key-value pairs from the two-dimensional vector into the higher n-dimensional vector, wherein each n-dimensional vector represents a plurality of feature values for a single historical data element, and a plurality of n-dimensional vectors represents the feature values for a plurality of historical data elements with uniform and structured data, said mapping including mapping a value associated with a first location feature represented by one or more existing axes of the n-dimensional vector in the n-dimensional space onto a new axis corresponding to a missing second location feature using an orthogonality relationship between the new axis and the one or more existing axes to generate a value for the missing second location feature in the new axis associated with the response characteristic, wherein the orthogonality relationship is inversely related to a distance measure between the new axis and the one or more existing axes of the n-dimensional vector;
	…using the plurality of n-dimensional vectors, wherein the trained predictive model is a support vector machine (SVM) or neural network
	However, Tsyganskiy—directed to analogous art—teaches
	extracting from the historical data elements, a plurality of key-value pairs defining… wherein each of the plurality of key-value pairs is represented by a two-dimensional vector comprising a first element that is a key of the feature and a second element that is the value of the feature, (Abstract describes a data structure for analyzing user session data, including extracting field names and field values, resulting in name-value pairs. This is further described at [0033-0035]. In particular, [0034] provides an example of a name-value pair represented as a two-dimensional element including a first component (u_name) being a key and a second component (John Smith) being a value.)
	…mapping the extracted plurality of key-value pairs for each historical data element onto an n-dimensional vector in the n-dimensional space so as to map each of the plurality of key-value pairs from the two-dimensional vector into the higher n-dimensional vector, wherein each n-dimensional vector represents a plurality of feature values for a single historical data element, and a plurality of n-dimensional vectors represents the feature values for a plurality of historical data elements with uniform and structured data, ([0035-0040] indicate that the data for a session (i.e., historical data element) may be consolidated in a vector including information related to those sessions. That is, each vector (necessarily having a dimension) represents a plurality of feature values for a single historical data element and multiple vectors represent multiple sessions. The representation as a vector is a representation with uniform and structured data.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify Vivas to use the data extraction and data structure techniques taught by Tsyganskiy described above because this allows for analysis of multiple aspects of the data, the same data may be used for different analysis, there is no need for multiple structures of the collected data, and the data can be easily and quickly changed into a structure appropriate for a particular analysis as described by Tsyganskiy at [0011].
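	For illustration only, the mapping of extracted key-value pairs onto a fixed-length vector can be sketched as follows. This is a minimal sketch under stated assumptions, not the data structure of Tsyganskiy or of the claimed invention; the axis names in `AXES`, the helper `pairs_to_vector`, and the sample record are hypothetical.

```python
# Hypothetical axis order; each key names one axis of the n-dimensional space.
AXES = ["country", "region", "job_title"]


def pairs_to_vector(pairs):
    """Map (key, value) pairs onto a fixed-length vector, one slot per axis.

    Keys that are absent from the record leave their slot as None.
    """
    vector = [None] * len(AXES)
    for key, value in pairs:
        if key in AXES:
            vector[AXES.index(key)] = value
    return vector


record = [("job_title", "Engineer"), ("country", "US")]
print(pairs_to_vector(record))  # ['US', None, 'Engineer']
```

Records that arrive with different keys in different orders all come out with the same length and the same slot order, which is the sense in which the vector form is uniform and structured. The empty `region` slot is the kind of missing value addressed by the Lu reference.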
	The combination of Vivas and Tsyganskiy does not appear to explicitly teach
	said mapping including mapping a value associated with a first location feature represented by one or more existing axes of the n-dimensional vector in the n-dimensional space onto a new axis corresponding to a missing second location feature using an orthogonality relationship between the new axis and the one or more existing axes to generate a value for the missing second location feature in the new axis associated with the response characteristic, wherein the orthogonality relationship is inversely related to a distance measure between the new axis and the one or more existing axes of the n-dimensional vector;
	…using the plurality of n-dimensional vectors, wherein the trained predictive model is a support vector machine (SVM) or neural network
	However, Lu—directed to analogous art—teaches
	said mapping including mapping a value associated with a first location feature represented by one or more existing axes of the n-dimensional vector in the n-dimensional space onto a new axis corresponding to a missing second location feature using an orthogonality relationship between the new axis and the one or more existing axes to generate a value for the missing second location feature in the new axis associated with the response characteristic, wherein the orthogonality relationship is inversely related to a distance measure between the new axis and the one or more existing axes of the n-dimensional vector; (Abstract describes using inverse-distance weighting (IDW) to perform spatial interpolation. Introduction, first paragraph indicates that IDW may be used to predict an unknown attribute value at a certain location. An overview of spatial interpolation, including inverse distance weighting, is provided in section 2. In particular, Equations (1)-(3) provide the basic formula for performing inverse distance weighting. The value y^(S_0) corresponds to the missing feature/response characteristic where S_0 is the location (i.e., second location) which is missing the feature. The “axis” is the axis corresponding to the feature value at the location S_0. This is determined using feature values at other locations y(S_i) (i.e., one or more existing axes). The weights lambda_i correspond to the orthogonality relationships. Equation (2) indicates that the weights lambda_i are inversely correlated with the distance. Here d_0i indicates the distance from location 0 to location i and alpha is a parameter which may be larger than 1 (see right hand column on page 1045). In the combination with Vivas/Tsyganskiy, it is understood that Vivas is modified to apply this technique to the vectors (including location information) described above.)
	It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Vivas and Tsyganskiy to use inverse-distance weighting (IDW) to account for missing values as taught by Lu because IDW is relatively fast and easy to compute and straightforward to interpret as described by Lu in the abstract.
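	For illustration only, inverse-distance weighting of the kind described above can be sketched as follows. The sketch assumes the standard IDW form, in which each weight is proportional to the distance d_0i raised to the power -alpha and the weights are normalized to sum to one. The function name `idw_estimate`, the coordinates, and the sample values are hypothetical and are not taken from Lu.

```python
import math


def idw_estimate(target, samples, alpha=2.0):
    """Estimate the value at `target` from (location, value) samples.

    Each sample's weight is d ** -alpha, where d is the Euclidean distance
    from the target to the sample; weights are normalized to sum to one.
    """
    weighted = []
    for location, value in samples:
        d = math.dist(target, location)
        if d == 0.0:
            return value  # target coincides with a known sample
        weighted.append((d ** -alpha, value))
    total = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total


# Two known locations at distances 1 and 2 from the target.
samples = [((1.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
print(idw_estimate((0.0, 0.0), samples))  # 12.0
```

With alpha = 2 the unnormalized weights are 1 and 0.25, so the nearer sample dominates and the estimate (12.0) lies closer to its value of 10 than to 20.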
	The combination of Vivas, Tsyganskiy and Lu does not appear to explicitly teach
…using the plurality of n-dimensional vectors, wherein the trained predictive model is a support vector machine (SVM) or neural network
However, Traupman—directed to analogous art—teaches 
…using the plurality of n-dimensional vectors, wherein the trained predictive model is a support vector machine (SVM) or neural network (Abstract describes techniques for predicting a response to content by assembling raw data into feature vectors and performing modeling based on the feature vectors and a prediction model. This may also be seen in Figure 4. [0024, 0033] indicates that this may be applied in the context of employment seekers. [0047] indicates that the prediction modeling may be performed using a support vector machine or a neural network.)
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the combination of Vivas, Tsyganskiy and Lu to use a support vector machine or neural network as taught by Traupman in addition to the prediction techniques taught by Vivas described above because these techniques are well understood by those of ordinary skill in the art as described at [0047] of Traupman and the use of additional techniques may improve the accuracy of the system and make it more robust.

Regarding claim 9, this claim recites a system for performing the method of claim 1. Claim 9 is rejected with the same rationale as claim 1 in view of Vivas teaching an embodiment as a system in Figure 1 and at [0033-0036].

Regarding claims 7 and 15, the rejections of claims 1 and 9 are incorporated herein. Furthermore, Vivas teaches
wherein the historical data elements comprise historical job postings  ([0023], second sentence describes using data associated with job listings that were "upsold" and determining the actual number of applicants. That is, the data used includes historical job listing data for which past performance is known.)
and the new data element comprises a new job posting.  ([0019], last sentence describes making the predictions for the success of a job posting in real-time.)

Regarding claims 8 and 16, the rejections of claims 1 and 9 are incorporated herein. Furthermore, Vivas teaches
wherein the response characteristic is selected from the group consisting of 
a number of clicks;  ([0018], last sentence describes the module determining "the number of applicants, views, and impressions". [0072] indicates that a "view" could be defined as an event where a user clicks on a link.)
a number of times that a web page is shared, saved or viewed;  ([0018], last sentence describes the module determining "the number of applicants, views, and impressions")
and a number of times that a user clicks on a specific button, icon or image on a web page. ([0018], last sentence describes the module determining "the number of applicants, views, and impressions". [0072] indicates that a "view" could be defined as an event where a user clicks on a link. Links include not only text, but also graphical links.)

Claims 3-4 and 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Vivas, Tsyganskiy, Lu and Traupman in view of Devries et al. (US 2017/0249445 A1), hereinafter “Devries”.

Regarding claims 3 and 11, the rejections of claims 1 and 9 are incorporated herein.
The combination of Vivas, Tsyganskiy, Lu and Traupman does not appear to explicitly teach, but Devries teaches
partitioning the plurality of vectors into a training set and a validating set,  (Figure 15, step 236 shows dividing the dataset into a training set and a validation set. This is further described in [0104])
using the training set to generate the predictive model  (Figure 15, step 240 shows fitting the machine learning model to the training set. This is further described in [0104])
and the validating set to validate the predictive model  (Figure 15, step 242 shows using the validation dataset to calculate a validation error. This is further described in [0104])
by computing an error based on the difference between the historical value of the response characteristic for each of the historical data elements represented by the plurality of n-dimensional vectors in the validating set and a predicted value of the response characteristic for the historical data element generated by the predictive model by training the predictive model using each of the plurality of n-dimensional vectors in the validating set.  ([0104], about halfway down, describes computing the error as, for example, mean absolute error, which is the average of the absolute values of the differences between the values predicted by the trained model and the measured values. Figure 15, step 242 shows using the validation dataset to calculate a validation error. [0104], about halfway down, also indicates that the "regression model is applied to the validation dataset". That is, the features are put into the model to generate the predictions.)
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have modified the combination described with respect to claim 1 to include model validation and updating as taught by Devries. In [0013], Devries tells us that the retraining is to allow the model to “be adapted to a new user”. That is, the validation and retraining step allow the model to adapt to a new situation. This is a specific instance of what in the machine learning community is often called “concept drift”, which describes the situation in which the process that is being modeled changes. An artisan in possession of the teaching of Vivas, Tsyganskiy, Lu and Traupman would face the same problem in which the data on which a current model was trained becomes stale, out-of-date, or not representative of a change in the process being modeled. The use of validation and retraining on more up-to-date data is a straightforward way to address this “concept drift” problem.  
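For illustration only, partitioning vectors into a training set and a validating set and computing a mean absolute validation error can be sketched as follows. The stand-in model, which always predicts the mean of the training targets, is an assumption chosen to keep the sketch short; it is not the regression model of Devries. The names `split`, `fit_mean_model`, and `mean_absolute_error` and all data values are hypothetical.

```python
def split(vectors, targets, train_fraction=0.75):
    """Partition the data into a training part and a validating part."""
    cut = int(len(vectors) * train_fraction)
    return (vectors[:cut], targets[:cut]), (vectors[cut:], targets[cut:])


def fit_mean_model(train_targets):
    """Stand-in 'model' that always predicts the training mean."""
    mean = sum(train_targets) / len(train_targets)
    return lambda vector: mean


def mean_absolute_error(model, vectors, targets):
    """Average absolute difference between predicted and historical values."""
    errors = [abs(model(v) - t) for v, t in zip(vectors, targets)]
    return sum(errors) / len(errors)


vectors = [[1.0], [2.0], [3.0], [4.0]]
targets = [10.0, 20.0, 30.0, 50.0]
(train_x, train_y), (valid_x, valid_y) = split(vectors, targets)
model = fit_mean_model(train_y)
print(mean_absolute_error(model, valid_x, valid_y))  # 30.0
```

The model is fitted on the first three records only; the held-out fourth record supplies the historical value (50.0) against which the prediction (20.0) is compared.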

Regarding claims 4 and 12, the rejections of claims 3 and 11 are incorporated herein.
The combination of Vivas, Tsyganskiy, Lu and Traupman does not appear to explicitly teach, but Devries teaches
receiving a new plurality of historical data elements that are represented by a new plurality of n-dimensional vectors…and retraining the predictive model using the new plurality of vectors.   ([0113], second and third sentences describe adapting the model to the user when the validation error is found unsatisfactory. Examiner regards this as teaching the threshold to be whatever threshold the modeler holds as being "unsatisfactory". Choosing a number for the cutoff becomes necessary as soon as the method/system is implemented. [0113] goes on to describe incorporating the data from the new user into the training set. The mapping of data into n-dimensional vectors as a data pre-processing step was described above regarding the rejection of claim 1. It is understood that this same process would be applied to any new data. [0113] then describes retraining the model using the new/updated training set.)
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have performed this combination for the reasons given above with respect to claim 3.

Claims 5 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Vivas, Tsyganskiy, Lu, and Traupman in view of Fradkin et al. (Fradkin, Dmitriy; and Muchnik, Ilya; "Support Vector Machines for Classification" in Abello, J.; and Cormode, G. (Eds); Discrete Methods in Epidemiology, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, volume 70, pp. 13–20, 2006).

Regarding claims 5 and 13, the rejections of claims 1 and 9 are incorporated herein. Furthermore, Vivas teaches
to predict the value of the response characteristic for the new data element.  (Figure 3, step 320 shows predicting a value based on accessed data. This step is further described in [0063]. The "value" is described in [0064].)
The combination of Vivas and Tsyganskiy does not appear to explicitly teach, but Traupman teaches
wherein the predictive model is a support vector model (SVM) and (Abstract describes techniques for predicting a response to content by assembling raw data into feature vectors and performing modeling based on the feature vectors and a prediction model. This may also be seen in 
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have performed this combination for the reasons given above with respect to claim 1. 
While Traupman teaches using an SVM, Traupman does not explicitly teach that the SVM operates based on a set of coefficients, however Fradkin teaches
wherein predicting values comprises using a set of coefficients of the SVM  (Page 1, Introduction generally describes using support vector machines as models for performing prediction. Page 8, equations 3.3 and 3.4 show the optimization problem that is solved by a support vector machine trainer to find the coefficients alpha used by the classifier.)
It would have been obvious to a person having ordinary skill in the art before the time of the effective filing date of the claimed invention to have modified the combination of Vivas, Tsyganskiy, Lu and Traupman to use the implementation details of support vector machines as taught by Fradkin. As described on page 1 of Fradkin, “Support Vector Machines (SVM) recently became one of the most popular classification methods” and “Support Vector Machines…produces classifiers with theoretical guarantees of good performance”. That is, SVMs are popular and perform well. Since Traupman teaches using an SVM, a person of ordinary skill in the art would consider how an SVM is implemented and would consider a reference such as Fradkin for implementation details.
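For illustration only, prediction with a trained support vector machine using a set of coefficients can be sketched as follows. The sketch assumes the usual dual form with a linear kernel, in which the score is the sum over support vectors of alpha_i times y_i times the inner product with the input, plus a bias. The coefficient values here are made up for the example and are not the result of solving the optimization problem shown in Fradkin; the names `dot` and `svm_predict` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors (linear kernel)."""
    return sum(x * y for x, y in zip(a, b))


def svm_predict(x, support_vectors, labels, alphas, bias):
    """Classify x as +1 or -1 from the dual-form SVM score."""
    score = bias + sum(
        alpha * label * dot(sv, x)
        for alpha, label, sv in zip(alphas, labels, support_vectors)
    )
    return 1 if score >= 0 else -1


# Hypothetical trained parameters: two support vectors, one per class.
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]
alphas = [0.25, 0.25]
bias = 0.0

print(svm_predict((2.0, 0.0), support_vectors, labels, alphas, bias))    # 1
print(svm_predict((-3.0, -1.0), support_vectors, labels, alphas, bias))  # -1
```

Once the coefficients are fixed by training, prediction for a new data element reduces to evaluating this weighted sum.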

	Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Vivas, Tsyganskiy, Lu and Traupman in view of “Friedman” (Elements of Statistical Learning – Chapter 11).

Regarding claims 6 and 14, the rejections of claims 1 and 9 are incorporated herein. Furthermore, Vivas teaches
to predict the value of the response characteristic for the new data element.  (Figure 3, step 320 shows predicting a value based on accessed data. This step is further described in [0063]. The "value" is described in [0064].)
The combination of Vivas, Tsyganskiy, Lu, and Traupman does not appear to explicitly teach, but Friedman teaches
wherein the predictive model is a neural network model and wherein predicting values comprises using a set of weights of the neural network model (Page 389, section 11.1 indicates that neural networks are powerful methods with widespread application in many fields. Section 11.3 describes the operation of neural networks. Section 11.4 describes training a neural network, including determining the weight parameters.)
It would have been obvious before the effective filing date of the claimed invention to one of ordinary skill in the art to which the invention pertains to modify the teaching of Vivas, Tsyganskiy, Lu and Traupman to use a neural network as taught by Friedman because Traupman suggests using neural networks and a person of ordinary skill in the art would be motivated to consider background references describing the functioning of neural networks such as Friedman. 
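For illustration only, prediction with a neural network using a set of weights can be sketched as follows. The sketch assumes a single hidden layer with sigmoid activations and a linear output unit. The weight values are made up for the example rather than obtained by training, and the names `sigmoid` and `forward` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Compute the network output for input vector x."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical weights: two inputs, two hidden units, one output.
hidden_weights = [[1.0, 1.0], [1.0, -1.0]]
hidden_biases = [0.0, 0.0]
output_weights = [2.0, 4.0]
output_bias = 1.0

print(forward([0.0, 0.0], hidden_weights, hidden_biases, output_weights, output_bias))  # 4.0
```

For the zero input both hidden units output 0.5, so the prediction is 2(0.5) + 4(0.5) + 1 = 4.0. Training, as described in section 11.4 of Friedman, is what would determine the weight values used here.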

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen, can be reached at (571) 272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/M.A.V./Examiner, Art Unit 2121                                                                                                                                                                                                        




/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121