DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This Office Action has been issued in response to Applicant’s Communication of application S/N 17/154,378 filed on August 17, 2021. Claims 1-17 are currently pending with the application. 

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

	Claims 14-17 recite “the user level” at page 6, line 28, and page 7, lines 3, 5, and 7. There is insufficient antecedent basis for this limitation in claim 1.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-17 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
With respect to claim 1, the claim recites a computer-implemented method for modeling heterogeneous feature sets by a computerized information system, the method comprising: generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records and each categorical embedding vector comprises a variable number of variably sized elements; calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding personalization feature; converting each query-record based attention score to a corresponding probability distribution; scaling each categorical embedding vector based on the corresponding probability distribution; creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions; combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector; and generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user.
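For illustration only, the sequence of steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's disclosed implementation: the function names, the dot-product scoring rule, the linear ranking score, and the assumption that every categorical embedding has already been brought to a common dimension are all hypothetical choices made for the example.

```python
import math


def softmax(scores):
    """Convert raw attention scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def personalization_vector(embeddings, query_record):
    """Pool any number of per-feature embeddings into one fixed-size vector."""
    # Assumed scoring rule: dot product with the query-record vector.
    scores = [dot(e, query_record) for e in embeddings]
    weights = softmax(scores)
    scaled = [[w * x for x in e] for w, e in zip(weights, embeddings)]
    # Elementwise sum: the output length equals the embedding dimension,
    # regardless of how many features were supplied.
    return [sum(column) for column in zip(*scaled)]


def rank_records(records, ranker_weights):
    """Order record ids by a linear score over the concatenated feature vector."""
    scored = []
    for record_id, embeddings, query_record, non_personal in records:
        # Concatenate the personalization and non-personalization vectors.
        latent = personalization_vector(embeddings, query_record) + non_personal
        scored.append((dot(latent, ranker_weights), record_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record_id for _, record_id in scored]
```

In this sketch a record with two personalization features and a record with one both yield a latent vector of the same length, which is what allows a single ranking function to score them.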
The limitations directed towards generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records and each categorical embedding vector comprises a variable number of variably sized elements, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user, recite a process that, under its broadest reasonable interpretation, covers performance of these limitations in the mind but for the recitation of generic computer components. The limitation directed towards calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding personalization feature, recites a process that, under its broadest reasonable interpretation, covers a mathematical operation or an act of calculating using mathematical methods. If a claim limitation, under its broadest reasonable interpretation, recites a mathematical calculation, then the claim falls within the “Mathematical Concepts” grouping of abstract ideas.
That is, other than reciting a computer-implemented method for modeling heterogeneous feature sets by a computerized information system, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector, nothing in the claim precludes these steps from practically being performed in the mind and/or by a human with pen and paper and a mathematical operation or an act of calculating using mathematical methods.
For example, but for the limitations stating “a computer-implemented method for modeling heterogeneous feature sets by a computerized information system, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”, the mention of generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records and each categorical embedding vector comprises a variable number of variably sized elements, calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding personalization feature, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user, in the context of this claim, encompasses a user mentally generating vectors using mental calculations, conversions, probability measurements (scaling), and ranking data used in a search. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. 
Accordingly, the claim recites an abstract idea.
The judicial exception is not integrated into a practical application by additional elements. In particular, the claim recites “a computer-implemented method for modeling heterogeneous feature sets by a computerized information system, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”. The computer-implemented method and the computerized information system are recited at a high level of generality (i.e., as a generic computer performing a generic computer function of comparing and merging) such that they amount to no more than mere instructions to apply the exception. Creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector, are interpreted by the examiner to be mere data gathering, which the courts have found to be insignificant extra-solution activity. These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use in conjunction with the abstract idea.
This claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements, a computer-implemented method and a computerized information system, are recited at a high level of generality and merely apply the exception using generic components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The additional elements, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector, are interpreted to be well-understood, routine, and conventional activity (Storing and retrieving information in memory, Versata (see MPEP 2106.05(d))). To further elaborate, the additional elements, “a computer-implemented method for modeling heterogeneous feature sets by a computerized information system, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”, do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use. Claim 1 is not patent eligible.

With respect to claim 5, the claim recites a non-transitory machine-readable storage medium that provides instructions that, if executed by a processor, are configurable to cause said processor to perform operations comprising: generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records and each categorical embedding vector comprises a variable number of variably sized elements; calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding personalization feature; converting each query-record based attention score to a corresponding probability distribution; scaling each categorical embedding vector based on the corresponding probability distribution; creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions; combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector; and generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user.
The limitations directed towards generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records and each categorical embedding vector comprises a variable number of variably sized elements, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user, recite a process that, under its broadest reasonable interpretation, covers performance of these limitations in the mind but for the recitation of generic computer components. The limitation directed towards calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding personalization feature, recites a process that, under its broadest reasonable interpretation, covers a mathematical operation or an act of calculating using mathematical methods. If a claim limitation, under its broadest reasonable interpretation, recites a mathematical calculation, then the claim falls within the “Mathematical Concepts” grouping of abstract ideas.
That is, other than reciting a non-transitory machine-readable storage medium that provides instructions that, if executed by a processor, are configurable to cause said processor to perform operations, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector, nothing in the claim precludes these steps from practically being performed in the mind and/or by a human with pen and paper and a mathematical operation or an act of calculating using mathematical methods.
For example, but for the limitations stating “a non-transitory machine-readable storage medium that provides instructions that, if executed by a processor, are configurable to cause said processor to perform operations, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”, the mention of generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records and each categorical embedding vector comprises a variable number of variably sized elements, calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding personalization feature, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user, in the context of this claim, encompasses a user mentally generating vectors using mental calculations, conversions, probability measurements (scaling), and ranking data used in a search. 
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, and recites a mathematical calculation, then it falls within the “Mental Processes” and “Mathematical Concepts” groupings of abstract ideas. Accordingly, the claim recites an abstract idea.
The judicial exception is not integrated into a practical application by additional elements. In particular, the claim recites “a non-transitory machine-readable storage medium that provides instructions that, if executed by a processor, are configurable to cause said processor to perform operations, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”. The non-transitory machine-readable storage medium that provides instructions that, if executed by a processor, are configurable to cause said processor to perform operations is recited at a high level of generality (i.e., as a generic computer performing a generic computer function of comparing and merging) such that it amounts to no more than mere instructions to apply the exception. Creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector, are interpreted by the examiner to be mere data gathering, which the courts have found to be insignificant extra-solution activity. These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use in conjunction with the abstract idea.
This claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element, a non-transitory machine-readable storage medium that provides instructions that, if executed by a processor, are configurable to cause said processor to perform operations, is recited at a high level of generality and merely applies the exception using generic components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The additional elements, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector, are interpreted to be well-understood, routine, and conventional activity (Storing and retrieving information in memory, Versata (see MPEP 2106.05(d))). To further elaborate, the additional elements, “a non-transitory machine-readable storage medium that provides instructions that, if executed by a processor, are configurable to cause said processor to perform operations, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”, do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use. Claim 5 is not patent eligible.

With respect to claim 9, the claim recites an apparatus comprising: a processor; a non-transitory machine-readable storage medium that provides instructions that, if executed by the processor, are configurable to cause the apparatus to perform operations comprising: generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records and each categorical embedding vector comprises a variable number of variably sized elements; calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding personalization feature; converting each query-record based attention score to a corresponding probability distribution; scaling each categorical embedding vector based on the corresponding probability distribution; creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions; combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector; and generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user.
The limitations directed towards generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records and each categorical embedding vector comprises a variable number of variably sized elements, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user, recite a process that, under its broadest reasonable interpretation, covers performance of these limitations in the mind but for the recitation of generic computer components. The limitation directed towards calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding personalization feature, recites a process that, under its broadest reasonable interpretation, covers a mathematical operation or an act of calculating using mathematical methods. If a claim limitation, under its broadest reasonable interpretation, recites a mathematical calculation, then the claim falls within the “Mathematical Concepts” grouping of abstract ideas.
That is, other than reciting an apparatus comprising: a processor; a non-transitory machine-readable storage medium that provides instructions that, if executed by the processor, are configurable to cause the apparatus to perform operations, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector, nothing in the claim precludes these steps from practically being performed in the mind and/or by a human with pen and paper and a mathematical operation or an act of calculating using mathematical methods.
For example, but for the limitations stating “an apparatus comprising: a processor; a non-transitory machine-readable storage medium that provides instructions that, if executed by the processor, are configurable to cause the apparatus to perform operations, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”, the mention of generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records and each categorical embedding vector comprises a variable number of variably sized elements, calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding personalization feature, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user, in the context of this claim, encompasses a user mentally generating vectors using mental calculations, conversions, probability measurements (scaling), and ranking data used for a search. 
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, and recites a mathematical calculation, then it falls within the “Mental Processes” and “Mathematical Concepts” groupings of abstract ideas. Accordingly, the claim recites an abstract idea.
The judicial exception is not integrated into a practical application by additional elements. In particular, the claim recites “an apparatus comprising: a processor; a non-transitory machine-readable storage medium that provides instructions that, if executed by the processor, are configurable to cause the apparatus to perform operations, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”. The apparatus comprising a processor and a non-transitory machine-readable storage medium that provides instructions that, if executed by the processor, are configurable to cause the apparatus to perform operations is recited at a high level of generality (i.e., as a generic computer performing a generic computer function of comparing and merging) such that it amounts to no more than mere instructions to apply the exception. Creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector, are interpreted by the examiner to be mere data gathering, which the courts have found to be insignificant extra-solution activity. These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use in conjunction with the abstract idea.
This claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element, an apparatus comprising: a processor; a non-transitory machine-readable storage medium that provides instructions that, if executed by the processor, are configurable to cause the apparatus to perform operations, is recited at a high level of generality and merely applies the exception using generic components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The additional elements, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector, are interpreted to be well-understood, routine, and conventional activity (Storing and retrieving information in memory, Versata (see MPEP 2106.05(d))). To further elaborate, the additional elements, “an apparatus comprising: a processor; a non-transitory machine-readable storage medium that provides instructions that, if executed by the processor, are configurable to cause the apparatus to perform operations, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”, do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use. Claim 9 is not patent eligible.

With respect to claim 13, the claim recites a computer-implemented method for modeling heterogeneous feature sets by a computerized information system, the method comprising: generating a similarity factor for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records; calculating a personality feature weight for each of the plurality of personalization features; converting each personality feature weight to a corresponding probability distribution; scaling each similarity factor based on the corresponding probability distribution; generating a most recently used affinity value for each of the plurality of personalization features; and generating a ranking function based on the most recently used affinity values, wherein the ranking function reorders the list of input records for the search created by the user.
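For illustration only, the steps recited in claim 13 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's disclosed implementation: the function names are hypothetical, the similarity factors are taken as given inputs, and summing the per-feature affinity values into a single ranking score is an assumed choice that the claim itself does not specify.

```python
import math


def softmax(weights):
    """Convert per-feature weights into a probability distribution."""
    peak = max(weights)  # subtract the max for numerical stability
    exps = [math.exp(w - peak) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def mru_affinity_values(similarities, feature_weights):
    """Scale each feature's similarity factor by its probability."""
    probabilities = softmax(feature_weights)
    return [p * s for p, s in zip(probabilities, similarities)]


def rank_by_affinity(candidates, feature_weights):
    """Order record ids by total affinity, highest first.

    candidates maps a record id to its list of per-feature similarity factors.
    """
    totals = {
        record_id: sum(mru_affinity_values(sims, feature_weights))
        for record_id, sims in candidates.items()
    }
    return sorted(totals, key=totals.get, reverse=True)
```

With equal feature weights every feature contributes equally to the ordering; raising one weight shifts the ordering toward records that are similar on that feature.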
The limitations directed towards generating a similarity factor for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records, converting each personality feature weight to a corresponding probability distribution, scaling each similarity factor based on the corresponding probability distribution, generating a most recently used affinity value for each of the plurality of personalization features, and generating a ranking function based on the most recently used affinity values, wherein the ranking function reorders the list of input records for the search created by the user, recite a process that, under its broadest reasonable interpretation, covers performance of these limitations in the mind but for the recitation of generic computer components. The limitation directed towards calculating a personality feature weight for each of the plurality of personalization features recites a process that, under its broadest reasonable interpretation, covers a mathematical operation or an act of calculating using mathematical methods. If a claim limitation, under its broadest reasonable interpretation, recites a mathematical calculation, then the claim falls within the “Mathematical Concepts” grouping of abstract ideas. That is, other than reciting a computer-implemented method for modeling heterogeneous feature sets by a computerized information system, nothing in the claim precludes these steps from practically being performed in the mind and/or by a human with pen and paper and a mathematical operation or an act of calculating using mathematical methods.
For example, but for the limitations stating “a computer-implemented method for modeling heterogeneous feature sets by a computerized information system”, the mention of generating a similarity factor for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records, converting each personality feature weight to a corresponding probability distribution, scaling each similarity factor based on the corresponding probability distribution, generating a most recently used affinity value for each of the plurality of personalization features, and generating a ranking function based on the most recently used affinity values, wherein the ranking function reorders the list of input records for the search created by the user, in the context of this claim, encompasses a user mentally generating a ranking function based on affinity values using mental calculations, conversions, probability measurements (scaling), and ranking data used for a search. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components and a mathematical calculation, then it falls within the “Mental Processes” and “Mathematical Concepts” groupings of abstract ideas. Accordingly, the claim recites an abstract idea.
The judicial exception is not integrated into a practical application by additional elements. In particular, the claim recites “a computer-implemented method for modeling heterogeneous feature sets by a computerized information system”. A computer-implemented method and a computerized information system are recited at a high level of generality (i.e., as a generic computer performing a generic computer function of comparing and merging) such that they amount to no more than mere instructions to apply the exception. These elements do not integrate the abstract idea into a practical application because they do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use in conjunction with the abstract idea.
This claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements, a computer-implemented method and a computerized information system, are recited at a high level of generality to apply the exception using generic components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. To further elaborate, the additional elements, “a computer-implemented method and a computerized information system”, do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use. Claim 13 is not patent eligible.

With respect to claims 2, 6, and 10, the limitations are directed towards the computer-implemented method of claim 1, wherein calculating a query-record based attention score for each of the plurality of personalization features comprises: multiplying the fixed-dimensional non-personalization feature vector by a weight matrix to produce a weighted fixed-dimensional non-personalization feature vector; and for each of the categorical embedding vectors: calculating a dot product of the weighted fixed-dimensional non-personalization feature vector and the categorical embedding vector to produce the corresponding query-record based attention score. These elements are directed to a mathematical operation or an act of calculating using mathematical methods. Claims 2, 6, and 10 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.
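
For illustration only, the calculation recited in claims 2, 6, and 10 can be sketched as follows. This is a minimal sketch and not the Applicant's implementation: the function names, the list-of-lists matrix representation, the sample values, and the assumption that every categorical embedding vector has the same length as the weighted vector are all hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_scores(weight_matrix, non_personalization_vector, embeddings):
    # Step 1: weight the fixed-dimensional non-personalization feature vector.
    weighted = matvec(weight_matrix, non_personalization_vector)
    # Step 2: one dot product per categorical embedding vector gives one score each.
    return [dot(weighted, embedding) for embedding in embeddings]


# Hypothetical sample values, chosen only to make the arithmetic easy to follow.
W = [[1.0, 0.0], [0.0, 2.0]]
q = [1.0, 1.0]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = attention_scores(W, q, embeddings)
```

With these sample values the weighted vector is [1.0, 2.0], so each score is simply the dot product of that vector with the corresponding embedding.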

With respect to claims 3, 7, and 11, the limitations are directed towards the computer-implemented method of claim 1, wherein the ranking function is selected from the group consisting of: pointwise; pairwise; groupwise; and set-wise. These elements further elaborate the abstract idea, and a person can, mentally and/or with pen and paper, select from the group consisting of pointwise; pairwise; groupwise; and set-wise. Therefore, claims 3, 7, and 11 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.

With respect to claims 4, 8, and 12, the limitations are directed towards the computer-implemented method of claim 1, wherein converting each query-record based attention score to a corresponding probability distribution comprises using a softmax activation function. These elements further elaborate the abstract idea and merely confine the claims to a particular technological environment or field of use. Therefore, claims 4, 8, and 12 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.
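
For illustration only, a softmax activation function of the kind named in claims 4, 8, and 12 can be sketched as follows. This is the standard textbook definition, not a form taken from the application; the function name and the max-subtraction step (a common numerical-stability measure) are assumptions.

```python
import math


def softmax(scores):
    """Convert a list of real-valued scores into probabilities that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical attention scores; a larger score receives a larger probability.
probs = softmax([1.0, 2.0, 3.0])
```

The output preserves the ordering of the input scores, and equal scores receive equal probabilities.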

With respect to claims 14-17, the limitations are directed towards the ranking function reordering the list of input records in a predetermined order of relevance at the user level. These elements further elaborate the abstract idea, and a person can, mentally and/or with pen and paper, reorder the list of input records in a predetermined order of relevance at the user level. Therefore, claims 14-17 do not recite additional limitations which tie the abstract idea into a practical application and do not amount to significantly more than the identified judicial exception.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 5, and 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kang et al. (U.S. Publication No.: US 20200408566 A1) hereinafter Kang, in view of Meek et al. (U.S. Publication No.: US 20090006285 A1) hereinafter Meek, in view of Zhang et al. (U.S. Publication No.: US 20150186938 A1) hereinafter Zhang, in view of Polak et al. (U.S. Publication No.: US 20180032845 A1) hereinafter Polak, and further in view of Berkman et al. (U.S. Publication No.: US 20110066613 A1) hereinafter Berkman.
As to claim 1:
Kang discloses:
A computer-implemented method for modeling heterogeneous feature sets by a computerized information system [Paragraph 0046 teaches system creates a generic anomaly detection and classification machine learning model based on a general training dataset, deploys the model in a cloud server, and creates a copy of the model for each individual building/equipment/device of a user. The system detects and classifies anomalies from real-time sensor data based off of the model. In an embodiment, the system continuously updates the model based on a user's feedback about the detection and classification.], the method comprising:
generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Paragraph 0091 teaches classifier 1120 is obtained in step 1130 by selecting the most recently used classifier from either pre-trained classifier database 314 or a building/equipment/device specific classifier database 316.  Note: User-defined label (personalization features) vectors (categorical embedding vector) that are determined from labels stored in a database and are used to create most recently used classifiers (a most recently used affinity) are interpreted to read on the claimed generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity.], wherein each categorical embedding vector comprises a variable number of variably sized elements [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. 
Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Paragraph 0091 teaches classifier 1120 is obtained in step 1130 by selecting the most recently used classifier from either pre-trained classifier database 314 or a building/equipment/device specific classifier database 316. Figure 14: 1410 and Paragraph 0090 teach that an array of feature vectors is determined from the extracted features. FIG. 14 shows an example of an array 1410 of feature vectors. The feature vectors in the array are made up of processed data representing the relative values of features which are time-related, weather-related, and/or energy-related. Note: The labeled feature vectors that appear to have a variable number of features, as indicated by the mention of currently 17 features, and a variable number (size) of elements associated with each time-related, weather-related, or energy-related feature are interpreted to read on the claimed each categorical embedding vector comprises a variable number of variably sized elements.];
personalization feature [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: The user-defined labels are interpreted to read on the claimed personalization feature.]
personalization feature vector [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: Labels that are user-defined associated with vectors as label vectors are interpreted to read on the claimed personalization feature vector.]
non-personalization feature vector [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: Making a selection from labels instead of providing other types of user input to identify one or more user-defined labels is interpreted to be making a selection of labels that are not user-defined and are associated with vectors. Labels that are not user-defined that are associated with vectors as label vectors are interpreted to read on the claimed  non-personalization feature vector.] 

Kang discloses some of the limitations as set forth in claim 1 but does not appear to expressly disclose a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records, calculating a query-record based attention score for each of the plurality of features, the query-record based attention score indicating a weight of the corresponding feature, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, combining the fixed-dimensional feature vector and a fixed-dimensional feature vector to produce a fixed-dimensional query-record latent space feature vector, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user.
Meek discloses:
a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records [Paragraph 0003 teaches one mechanism for categorizing media content, such as pictures or video clips, is the use of metadata tags. Paragraph 0052 teaches system 500 can also utilize a list of user preferences stored within a storage component 506 to customize a generation, weighting, presentation, and/or association of tags with an input item 504. Paragraph 0053 teaches preferences can indicate whether to use a MRU tag list or user-specified tag list to generate tags, and what priority to give such criteria (e.g., as discussed at FIG. 2, supra). Paragraph 0054 teaches preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s). Paragraph 0075 teaches at 1106, classifiers and user preferences can be employed to relevance rank the tags. At 1108, the tags can be presented to a device user in order of relevance rank. Note: The classifiers and user preferences for searching recipients (user selects to create a search) and ranking customized tags (plurality of personalization features) (custom fields of input records) based on relevance, where the tags are associated with an MRU (most recently used) list (affinity), are interpreted to read on the claimed generating a similarity factor for each of a plurality of personalization features corresponding to a most recently used affinity.]
wherein the ranking function reorders the list of input records for the search created by the user [Paragraph 0006 teaches tags can be relevance ranked as a function of relatedness to determined criteria, including item content, tags most recently used by a sender and/or recipient, user-defined preferences, one or more user profiles, or the like. Tags can be sorted as a function of relevance ranking, and automatically attributed to one or more items of communication, or presented to a recipient for selection. Paragraph 0054 teaches preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s). Paragraph 0061 teaches component 606 can employ a probabilistic-based or statistical-based approach in connection with choosing between potential tags associated with an input item, optionally auto-assigning tags to input items 604, or offering proposed tags as a function of relevance rank. Note: Ranking/sorting (reordering) tags associated with an input item (list of input records) that are user-defined to search recipients (for the search) reads on the claimed wherein the ranking function reorders the list of input records for the search created by the user.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, by incorporating classifiers and user preferences for searching recipients and ranking customized tags based on relevance, wherein the tags are associated with an MRU, as taught by Meek (see Paragraph 0003, 0052, 0053, 0054, and 0075) because both applications are directed to data processing; incorporating classifiers and user preferences for searching recipients and ranking customized tags based on relevance, wherein the tags are associated with an MRU can greatly improve the speed and accuracy with which tags can be generated and associated with items of communication (see Meek Paragraph 0006).

Kang and Meek disclose some of the limitations as set forth in claim 1 but do not appear to expressly disclose calculating a query-record based attention score for each of the plurality of features, the query-record based attention score indicating a weight of the corresponding feature, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, combining the fixed-dimensional feature vector and a fixed-dimensional feature vector to produce a fixed-dimensional query-record latent space feature vector, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector.
Zhang discloses:
calculating a query-record based attention score for each of the plurality of features, the query-record based attention score indicating a weight of the corresponding feature [Paragraph 0043 teaches machine learning scorer component 224 may be embodied as a machine learning model configured for determining the weights for each term, word, or feature in the reference/query and target/advertisement document, and in an embodiment for determining the probability that a target document (such as an ad) is relevant to the reference (query). Paragraph 0044 teaches the machine learning scorer is based on a logistic regression model such as an L1-regularized logistic regression model. Equation(1) and Equation (2). Logistic regression models the probability distribution of the class label R given a feature vector (q,a) as in Equation (2), where w is the parameters of the logistic regression model; y.sub.i represents the label of the document, for query document i; and P is the probability calculated based on current w. In particular, in an embodiment, w represents the weights, and is a dot-product which can be used in the WAND operator. Note: The cited determining the weights for each term, word, or feature in the reference/query and target/advertisement document is interpreted to read on the claimed calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding feature because the cited weight corresponds to features associated with a query (query-based attention), therefore the cited weight is interpreted  to also include the claimed calculated query-record based attention score.]; 
converting each query-record based attention score to a corresponding probability distribution [Paragraph 0043 teaches machine learning scorer component 224 may be embodied as a machine learning model configured for determining the weights for each term, word, or feature in the reference/query and target/advertisement document, and in an embodiment for determining the probability that a target document (such as an ad) is relevant to the reference (query). Paragraph 0044 teaches the machine learning scorer is based on a logistic regression model such as an L1-regularized logistic regression model. Equation(1) and Equation (2). Logistic regression models the probability distribution of the class label R given a feature vector (q,a) as in Equation (2), where w is the parameters of the logistic regression model; y.sub.i represents the label of the document, for query document i; and P is the probability calculated based on current w. In particular, in an embodiment, w represents the weights, and is a dot-product which can be used in the WAND operator. Note: The examiner interprets logistic regression modeling the probability distribution of the class label R given a feature vector (q,a) as in part of Equation (2), where the weight (query-record based attention score) is the parameters of the logistic regression model and P is the probability (probability distribution). The examiner interprets the weight to be converted to a probability distribution because the probability associated with the logistic regression model modeling the probability distribution is based on the weight (query-record based attention score).];
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang and Meek, by incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution, as taught by Zhang (see Paragraph 0043 and 0044), because the three applications are directed to vector processing; incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution provides improved accuracy of the model (see Zhang Paragraph 0045).
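
For illustration only, the general relationship relied on above, in which a logistic regression model turns a dot product of weights and features into a probability, can be sketched as follows. Zhang's Equation (1) and Equation (2) are not reproduced in this action, so the standard logistic (sigmoid) form is assumed here; the function name and sample values are hypothetical.

```python
import math


def relevance_probability(weights, features):
    """Standard logistic model: P = 1 / (1 + exp(-w . x))."""
    # Dot product of the learned weights w and the feature vector x.
    z = sum(w * x for w, x in zip(weights, features))
    # The sigmoid maps any real-valued z to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights and (query, document) feature values.
p_neutral = relevance_probability([0.0, 0.0], [1.0, 2.0])
p_positive = relevance_probability([0.5, 0.5], [1.0, 2.0])
```

Under this assumed form, all-zero weights give a probability of 0.5, and increasing the dot product increases the probability.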

Kang, Meek, and Zhang disclose some of the limitations as set forth in claim 1 but do not appear to expressly disclose scaling each categorical embedding vector based on the corresponding probability distribution, creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, combining the fixed-dimensional feature vector and a fixed-dimensional feature vector to produce a fixed-dimensional query-record latent space feature vector, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector.
Polak discloses:
scaling each categorical embedding vector based on the corresponding probability distribution [Paragraph 0019 teaches the class probability includes one or more of a plurality of concept probability distribution values each indicating a classification probability score. Paragraph 0094 teaches the calculated class probability may include a class probability score record that may be a vector of class probability distributions of real valued scores. The vector of class probability distributions may be interpreted as probability of a respective concept to be present (exist) in the presented content, e.g. the modalities data 310. Paragraph 0099 teaches each of the score vectors holds the class probability distributions for the respective concept of the respective modality identified in the scene 302. Some of the modalities may produce several score vectors for a single concept. For example, a score vector may be calculated and assigned to a respective visual concept detected in each of several frames extracted from the scene 302. As another example, a first score vector may be calculated and assigned to one or more text concepts extracted from the scene 302 through OCR tools while a second score vector may be calculated and assigned to one or more text objects extracted from the scene 302 using speech-to-text tools. Note: A calculated class probability score that is a vector (categorical embedding vector) including probability distributions of scores (corresponding probability distribution) is interpreted to read on the claimed scaling each categorical embedding vector based on the corresponding probability distribution because the vector representing the class probability depends on the probability distributions (corresponding probability distribution) and therefore would be scaled or adjusted based on the probability distributions.
For example, in the context of the cited prior art, a first calculated score vector (class probability score record that may be a vector) and a second calculated score vector (class probability score record that may be a vector) indicate a plurality of calculated vectors that are based on probability distributions of each respective concept to be present (exist) in the presented content, e.g. the modalities data, therefore, for the probability distribution of each concept there is an adjusted or different class probability score record that may be a vector (categorical embedding vector). In other words, the cited second calculated score vector is a different or changed vector from the cited first calculated score vector.]  
creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions [Paragraph 0094 teaches the calculated class probability may include a class probability score record that may be a vector of class probability distributions of real valued scores. The vector of class probability distributions may be interpreted as probability of a respective concept to be present (exist) in the presented content, e.g. the modalities data 310. Paragraph 0100 teaches the aggregation aims to aggregate the one or more score vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality. Note: The one in-modality class probability vector of the same dimensionality (a fixed-dimensional personalization feature vector) that is created by aggregating the one or more score vectors of a certain dimensionality (scaled categorical embedding vectors based on the probability distributions) is interpreted to read on the claimed creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions because, in the context of the cited prior art, one or more score vectors of a certain dimensionality reasonably includes scaled vectors and one in-modality class probability vector of the same dimensionality is interpreted to be created based on certain dimensionality vectors.]; 
combining the fixed-dimensional feature vector and a fixed-dimensional vector to produce a fixed-dimensional query-record latent space feature vector [Paragraph 0044 teaches the staged classification approach employs several methods and techniques for accurately categorizing semantically the video stream content for media search, media monitoring and/or media targeting applications. Paragraph 0094 teaches the calculated class probability may include a class probability score record that may be a vector of class probability distributions of real valued scores. The vector of class probability distributions may be interpreted as probability of a respective concept to be present (exist) in the presented content, e.g. the modalities data 310. Paragraph 0100 teaches the aggregation aims to aggregate the one or more score vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality. Note: Aggregating the one or more score vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality is interpreted to also read on the claimed combining the fixed-dimensional feature vector and a fixed-dimensional vector to produce a fixed-dimensional query-record latent space feature vector because aggregating (combining) vectors (fixed-dimensional feature vector and a fixed-dimensional vector) results in (produces) one in-modality class probability vector of the same dimensionality (a fixed-dimensional query-record latent space feature vector). To further elaborate, in-modality or state is interpreted to be the claimed latent space and the cited one or same dimensionality is the claimed fixed dimensional feature vector, the cited classification associated with the vectors are interpreted to be the claimed features, the media search (query) associated with a classification approach to accurately categorize content is interpreted to include the claimed query-record. 
Furthermore, the claimed fixed-dimensional query-record latent space feature vector and fixed-dimensional feature vector are both fixed-dimensional feature vectors based on combining or aggregating vectors. Therefore, the cited one in-modality class probability vector of the same dimensionality produced by combining and aggregating vectors associated with a search (query record) reads on the claimed fixed-dimensional query-record latent space feature vector.]; and
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, and Zhang, by incorporating aggregating the one or more vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality, as taught by Polak (see Paragraph 0044, 0094, and 0100), because the four applications are directed to vector processing; incorporating aggregating the one or more vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality significantly improves the precision of categorization (see Polak Paragraph 0050).
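
For illustration only, the claimed scaling and aggregating steps can be sketched as a probability-weighted sum of vectors. This sketch illustrates the claim language, not the aggregation described in Polak; the function name, the use of a weighted sum as the aggregation, the sample values, and the assumption that all embedding vectors share one length are hypothetical.

```python
def scale_and_aggregate(embeddings, probabilities):
    """Scale each vector by its probability, then sum into one fixed-length vector."""
    dim = len(embeddings[0])
    aggregated = [0.0] * dim
    for embedding, p in zip(embeddings, probabilities):
        for i, value in enumerate(embedding):
            # Scaling (p * value) and aggregating (+=) happen in one pass.
            aggregated[i] += p * value
    return aggregated


# Hypothetical values: two 2-element vectors with probabilities summing to 1.
result = scale_and_aggregate([[1.0, 0.0], [0.0, 1.0]], [0.25, 0.75])
```

The output length depends only on the vector dimension and not on how many vectors are aggregated, which is one way a variable number of inputs can yield a fixed-dimensional result.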

Kang, Meek, Zhang, and Polak disclose most of the limitations as set forth in claim 1 but do not appear to expressly disclose generating a ranking function based on the fixed-dimensional query-record latent space feature vector.
Berkman discloses:
generating a ranking function based on the fixed-dimensional query-record latent space feature vector [Paragraph 0116 teaches for calculating the ranking function, the ranking function calculator 120 represents each content object as a vector in a, possibly high-dimensional, space of content-characterizing features selected by the ranking function calculator 120, to characterize content… the content-characterizing features are denoted by F, and the space of content-characterizing features is thus of dimensionality |F|. Paragraph 0122 teaches the ranking function calculator 120 calculates the ranking function 350 based on the user's interactions 360 with the content objects 35, as tracked by the interaction tracker 110. Note: Calculating the ranking function based on a vector representation of content objects associated with high dimensional space of content-characterizing features (latent space feature) is interpreted to read on the claimed generating a ranking function based on the fixed-dimensional query-record latent space feature vector because the vector is associated with a dimension set (fixed) to |F| and based on the cited vector the ranking function is calculated (generating a ranking function).]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, and Polak, by incorporating calculating the ranking function based on a vector representation of content objects associated with high dimensional space of content-characterizing features (latent space feature), as taught by Berkman (see Paragraphs 0116 and 0122), because all five applications are directed to vector processing; incorporating calculating the ranking function based on a vector representation of content objects associated with high dimensional space of content-characterizing features (latent space feature) reduces the burden put on the user (see Berkman Paragraph 0011).
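For illustration only, the cited arrangement of representing each content object as a vector in a feature space of dimensionality |F| and calculating a ranking function from tracked user interactions may be sketched as follows. The function names, the choice of averaging the interacted vectors to obtain weights, and the dot-product score are assumptions made for this sketch; they are not taken from Berkman, whose actual computation is not reproduced in this action.

```python
from typing import Callable, Dict, List, Sequence


def make_ranking_function(
    interacted: Sequence[Sequence[float]],
) -> Callable[[Sequence[float]], float]:
    """Build a scoring function from vectors of objects the user interacted with.

    Assumption: the weight vector is the element-wise mean of the interacted
    vectors. This is a hypothetical stand-in, not the cited reference's method.
    """
    if not interacted:
        raise ValueError("at least one interacted vector is required")
    dim = len(interacted[0])  # the fixed dimensionality |F|
    if any(len(v) != dim for v in interacted):
        raise ValueError("all vectors must share the same dimensionality")
    weights = [sum(v[i] for v in interacted) / len(interacted) for i in range(dim)]

    def score(vector: Sequence[float]) -> float:
        if len(vector) != dim:
            raise ValueError("vector has the wrong dimensionality")
        return sum(w * x for w, x in zip(weights, vector))

    return score


def reorder(
    records: Dict[str, Sequence[float]],
    score: Callable[[Sequence[float]], float],
) -> List[str]:
    """Return record names sorted from highest to lowest score."""
    return sorted(records, key=lambda name: score(records[name]), reverse=True)
```

Under these assumptions, a record whose vector resembles the vectors the user previously interacted with is placed earlier in the reordered list.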

As to claim 5:
Kang discloses:
A non-transitory machine-readable storage medium that provides instructions that, if executed by a processor, are configurable to cause said processor to perform operations comprising: 
generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Paragraph 0091 teaches classifier 1120 is obtained in step 1130 by selecting the most recently used classifier from either pre-trained classifier database 314 or a building/equipment/device specific classifier database 316.  Note: User-defined label (personalization features) vectors (categorical embedding vector) that are determined from labels stored in a database and are used to create most recently used classifiers (a most recently used affinity) are interpreted to read on the claimed generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity.], wherein each categorical embedding vector comprises a variable number of variably sized elements [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. 
Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Paragraph 0091 teaches classifier 1120 is obtained in step 1130 by selecting the most recently used classifier from either pre-trained classifier database 314 or a building/equipment/device specific classifier database 316. Figure 14: 1410 and Paragraph 0090 teach that an array of feature vectors is determined from the extracted features. FIG. 14 shows an example of an array 1410 of feature vectors. The feature vectors in the array are made up of processed data representing the relative values of features which are time-related, weather-related, and/or energy-related. Note: The labeled feature vectors, which appear to have a variable number of features, as indicated by the mention of currently 17 features, and a variable number (size) of elements associated with each time-related, weather-related, or energy-related feature, are interpreted to read on the claimed each categorical embedding vector comprises a variable number of variably sized elements.];
personalization feature [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: The user-defined labels are interpreted to read on the claimed personalization feature.]
personalization feature vector [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: Labels that are user-defined associated with vectors as label vectors are interpreted to read on the claimed personalization feature vector.]
non-personalization feature vector [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: Making a selection from labels instead of providing other types of user input to identify one or more user-defined labels is interpreted to be making a selection of labels that are not user-defined and are associated with vectors. Labels that are not user-defined that are associated with vectors as label vectors are interpreted to read on the claimed  non-personalization feature vector.] 

Kang discloses some of the limitations as set forth in claim 5 but does not appear to expressly disclose a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records, calculating a query-record based attention score for each of the plurality of features, the query-record based attention score indicating a weight of the corresponding feature, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, combining the fixed-dimensional feature vector and a fixed-dimensional feature vector to produce a fixed-dimensional query-record latent space feature vector, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user.
Meek discloses:
a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records [Paragraph 0003 teaches one mechanism for categorizing media content, such as pictures or video clips, is the use of metadata tags. Paragraph 0052 teaches system 500 can also utilize a list of user preferences stored within a storage component 506 to customize a generation, weighting, presentation, and/or association of tags with an input item 504. Paragraph 0053 teaches preferences can indicate whether to use a MRU tag list or user-specified tag list to generate tags, and what priority to give such criteria (e.g., as discussed at FIG. 2, supra). Paragraph 0054 teaches preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s). Paragraph 0075 teaches at 1106, classifiers and user preferences can be employed to relevance rank the tags. At 1108, the tags can be presented to a device user in order of relevance rank. Note: The classifiers and user preferences for searching recipients (user selects to create a search) and ranking customized tags (plurality of personalization features) (custom fields of input records) based on relevance, where the tags are associated with an MRU (most recently used) list (affinity), are interpreted to read on the claimed a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records.]
wherein the ranking function reorders the list of input records for the search created by the user [Paragraph 0006 teaches tags can be relevance ranked as a function of relatedness to determined criteria, including item content, tags most recently used by a sender and/or recipient, user-defined preferences, one or more user profiles, or the like. Tags can be sorted as a function of relevance ranking, and automatically attributed to one or more items of communication, or presented to a recipient for selection. Paragraph 0054 teaches preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s). Paragraph 0061 teaches component 606 can employ a probabilistic-based or statistical-based approach in connection with choosing between potential tags associated with an input item, optionally auto-assigning tags to input items 604, or offering proposed tags as a function of relevance rank. Note: Ranking/sorting (reordering) tags associated with an input item (list of input records) that are user-defined to search recipients (for the search) reads on the claimed wherein the ranking function reorders the list of input records for the search created by the user.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, by incorporating classifiers and user preferences for searching recipients and ranking customized tags based on relevance, wherein the tags are associated with an MRU, as taught by Meek (see Paragraphs 0003, 0052, 0053, 0054, and 0075), because both applications are directed to data processing; incorporating classifiers and user preferences for searching recipients and ranking customized tags based on relevance, wherein the tags are associated with an MRU, can greatly improve the speed and accuracy with which tags can be generated and associated with items of communication (see Meek Paragraph 0006).

Kang and Meek disclose some of the limitations as set forth in claim 5 but do not appear to expressly disclose calculating a query-record based attention score for each of the plurality of features, the query-record based attention score indicating a weight of the corresponding feature, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, combining the fixed-dimensional feature vector and a fixed-dimensional feature vector to produce a fixed-dimensional query-record latent space feature vector, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector.
Zhang discloses:
calculating a query-record based attention score for each of the plurality of features, the query-record based attention score indicating a weight of the corresponding feature [Paragraph 0043 teaches machine learning scorer component 224 may be embodied as a machine learning model configured for determining the weights for each term, word, or feature in the reference/query and target/advertisement document, and in an embodiment for determining the probability that a target document (such as an ad) is relevant to the reference (query). Paragraph 0044 teaches the machine learning scorer is based on a logistic regression model such as an L1-regularized logistic regression model. Equation(1) and Equation (2). Logistic regression models the probability distribution of the class label R given a feature vector (q,a) as in Equation (2), where w is the parameters of the logistic regression model; y.sub.i represents the label of the document, for query document i; and P is the probability calculated based on current w. In particular, in an embodiment, w represents the weights, and is a dot-product which can be used in the WAND operator. Note: The cited determining the weights for each term, word, or feature in the reference/query and target/advertisement document is interpreted to read on the claimed calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding feature because the cited weight corresponds to features associated with a query (query-based attention), therefore the cited weight is interpreted  to also include the claimed calculated query-record based attention score.]; 
converting each query-record based attention score to a corresponding probability distribution [Paragraph 0043 teaches machine learning scorer component 224 may be embodied as a machine learning model configured for determining the weights for each term, word, or feature in the reference/query and target/advertisement document, and in an embodiment for determining the probability that a target document (such as an ad) is relevant to the reference (query). Paragraph 0044 teaches the machine learning scorer is based on a logistic regression model such as an L1-regularized logistic regression model. Equation(1) and Equation (2). Logistic regression models the probability distribution of the class label R given a feature vector (q,a) as in Equation (2), where w is the parameters of the logistic regression model; y.sub.i represents the label of the document, for query document i; and P is the probability calculated based on current w. In particular, in an embodiment, w represents the weights, and is a dot-product which can be used in the WAND operator. Note: The examiner interprets the logistic regression modeling the probability distribution of the class label R given a feature vector (q,a), as in Equation (2), such that the weight w (query-record based attention score) is the parameter of the logistic regression model and P is the probability (probability distribution). The examiner interprets the weight to be converted to a probability distribution because the probability associated with the logistic regression model modeling the probability distribution is based on the weight (query-record based attention score).];
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang and Meek, by incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution, as taught by Zhang (see Paragraphs 0043 and 0044), because all three applications are directed to vector processing; incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution provides improved accuracy of the model (see Zhang Paragraph 0045).
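For illustration only, the relationship relied upon above between the weights w and the probability P may be sketched with the standard logistic regression form. Equation (2) of Zhang is not reproduced in this action, so the textbook sigmoid of the dot product of w and the feature vector is assumed here; the function names are hypothetical.

```python
import math
from typing import Dict, Sequence


def relevance_probability(weights: Sequence[float], features: Sequence[float]) -> float:
    """Return P(R = 1 | features) under a standard logistic regression model.

    Assumption: P is the sigmoid of the dot product of weights and features.
    """
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    z = sum(w * x for w, x in zip(weights, features))
    # Numerically stable sigmoid that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def class_label_distribution(
    weights: Sequence[float], features: Sequence[float]
) -> Dict[int, float]:
    """Return the two-outcome distribution over the class label R."""
    p = relevance_probability(weights, features)
    return {1: p, 0: 1.0 - p}
```

In this sketch the probability depends on the weights only through the dot product, which is the sense in which the weight is said above to be converted to a probability distribution.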

Kang, Meek, and Zhang disclose some of the limitations as set forth in claim 5 but do not appear to expressly disclose scaling each categorical embedding vector based on the corresponding probability distribution, creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, combining the fixed-dimensional feature vector and a fixed-dimensional feature vector to produce a fixed-dimensional query-record latent space feature vector, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector.
Polak discloses:
scaling each categorical embedding vector based on the corresponding probability distribution [Paragraph 0019 teaches the class probability includes one or more of a plurality of concept probability distribution values each indicating a classification probability score. Paragraph 0094 teaches the calculated class probability may include a class probability score record that may be a vector of class probability distributions of real valued scores. The vector of class probability distributions may be interpreted as probability of a respective concept to be present (exist) in the presented content, e.g. the modalities data 310. Paragraph 0099 teaches each of the score vectors holds the class probability distributions for the respective concept of the respective modality identified in the scene 302. Some of the modalities may produce several score vectors for a single concept. For example, a score vector may be calculated and assigned to a respective visual concept detected in each of several frames extracted from the scene 302. As another example, a first score vector may be calculated and assigned to one or more text concepts extracted from the scene 302 through OCR tools while a second score vector may be calculated and assigned to one or more text objects extracted from the scene 302 using speech-to-text tools. Note: A calculated class probability score that is a vector (categorical embedding vector) including a probability distributions of scores (corresponding probability distribution) is interpreted to read on the claimed scaling each categorical embedding vector based on the corresponding probability distribution because the vector representing the class probability depends on the probability distributions (corresponding probability distribution) and therefore would be scaled or adjusted based on the probability distributions. 
For example, in the context of the cited prior art, a first calculated score vector (class probability score record that may be a vector) and a second calculated score vector (class probability score record that may be a vector) indicate a plurality of calculated vectors that are based on probability distributions of each respective concept to be present (exist) in the presented content, e.g. the modalities data, therefore, for the probability distribution of each concept there is an adjusted or different class probability score record that may be a vector (categorical embedding vector). In other words, the cited second calculated score vector is a different or changed vector from the cited first calculated score vector.]  
creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions [Paragraph 0094 teaches the calculated class probability may include a class probability score record that may be a vector of class probability distributions of real valued scores. The vector of class probability distributions may be interpreted as probability of a respective concept to be present (exist) in the presented content, e.g. the modalities data 310. Paragraph 0100 teaches the aggregation aims to aggregate the one or more score vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality. Note: The one in-modality class probability vector of the same dimensionality (a fixed-dimensional personalization feature vector) that is created by aggregating the one or more score vectors of a certain dimensionality (scaled categorical embedding vectors based on the probability distributions) is interpreted to read on the claimed creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions because, in the context of the cited prior art, one or more score vectors of a certain dimensionality reasonably includes scaled vectors and one in-modality class probability vector of the same dimensionality is interpreted to be created based on certain dimensionality vectors.]; 
combining the fixed-dimensional feature vector and a fixed-dimensional vector to produce a fixed-dimensional query-record latent space feature vector [Paragraph 0044 teaches the staged classification approach employs several methods and techniques for accurately categorizing semantically the video stream content for media search, media monitoring and/or media targeting applications. Paragraph 0094 teaches the calculated class probability may include a class probability score record that may be a vector of class probability distributions of real valued scores. The vector of class probability distributions may be interpreted as probability of a respective concept to be present (exist) in the presented content, e.g. the modalities data 310. Paragraph 0100 teaches the aggregation aims to aggregate the one or more score vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality. Note: Aggregating the one or more score vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality is interpreted to also read on the claimed combining the fixed-dimensional feature vector and a fixed-dimensional vector to produce a fixed-dimensional query-record latent space feature vector because aggregating (combining) vectors (fixed-dimensional feature vector and a fixed-dimensional vector) results in (produces) one in-modality class probability vector of the same dimensionality (a fixed-dimensional query-record latent space feature vector). To further elaborate, in-modality or state is interpreted to be the claimed latent space and the cited one or same dimensionality is the claimed fixed dimensional feature vector, the cited classification associated with the vectors are interpreted to be the claimed features, the media search (query) associated with a classification approach to accurately categorize content is interpreted to include the claimed query-record. 
Furthermore, the claimed fixed-dimensional query-record latent space feature vector and fixed-dimensional feature vector are both fixed-dimensional feature vectors based on combining or aggregating vectors. Therefore the cited one in-modality class probability vector of the same dimensionality produced by combining and aggregating vectors associated with a search (query record) reads on the claimed fixed-dimensional query-record latent space feature vector.]; and 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, and Zhang, by incorporating aggregating the one or more vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality, as taught by Polak (see Paragraphs 0044, 0094, and 0100), because all four applications are directed to vector processing; incorporating aggregating the one or more vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality significantly improves the precision of categorization (see Polak Paragraph 0050).
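For illustration only, aggregating one or more score vectors of a certain dimensionality to one vector of the same dimensionality may be sketched as follows. The aggregation rule is not stated in the passages cited in this action, so an element-wise mean is assumed; the function name is hypothetical.

```python
from typing import List, Sequence


def aggregate_score_vectors(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Aggregate score vectors into one vector of the same dimensionality.

    Assumption: the aggregate is the element-wise mean of the inputs.
    """
    if not vectors:
        raise ValueError("at least one score vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all score vectors must share the same dimensionality")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
```

Whatever the number of input vectors, the output length equals the shared input length, which corresponds to the same-dimensionality property relied upon above.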

Kang, Meek, Zhang, and Polak disclose most of the limitations as set forth in claim 5 but do not appear to expressly disclose generating a ranking function based on the fixed-dimensional query-record latent space feature vector.
Berkman discloses:
generating a ranking function based on the fixed-dimensional query-record latent space feature vector [Paragraph 0116 teaches for calculating the ranking function, the ranking function calculator 120 represents each content object as a vector in a, possibly high-dimensional, space of content-characterizing features selected by the ranking function calculator 120, to characterize content… the content-characterizing features are denoted by F, and the space of content-characterizing features is thus of dimensionality |F|. Paragraph 0122 teaches the ranking function calculator 120 calculates the ranking function 350 based on the user's interactions 360 with the content objects 35, as tracked by the interaction tracker 110. Note: Calculating the ranking function based on a vector representation of content objects associated with high dimensional space of content-characterizing features (latent space feature) is interpreted to read on the claimed generating a ranking function based on the fixed-dimensional query-record latent space feature vector because the vector is associated with a dimension set (fixed) to |F| and based on the cited vector the ranking function is calculated (generating a ranking function).]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, and Polak, by incorporating calculating the ranking function based on a vector representation of content objects associated with high dimensional space of content-characterizing features (latent space feature), as taught by Berkman (see Paragraphs 0116 and 0122), because all five applications are directed to vector processing; incorporating calculating the ranking function based on a vector representation of content objects associated with high dimensional space of content-characterizing features (latent space feature) reduces the burden put on the user (see Berkman Paragraph 0011).

As to claim 9:
Kang discloses:
An apparatus comprising: a processor; a non-transitory machine-readable storage medium that provides instructions that, if executed by the processor, are configurable to cause the apparatus to perform operations comprising: 
generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Paragraph 0091 teaches classifier 1120 is obtained in step 1130 by selecting the most recently used classifier from either pre-trained classifier database 314 or a building/equipment/device specific classifier database 316.  Note: User-defined label (personalization features) vectors (categorical embedding vector) that are determined from labels stored in a database and are used to create most recently used classifiers (a most recently used affinity) are interpreted to read on the claimed generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity.], wherein each categorical embedding vector comprises a variable number of variably sized elements [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. 
Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Paragraph 0091 teaches classifier 1120 is obtained in step 1130 by selecting the most recently used classifier from either pre-trained classifier database 314 or a building/equipment/device specific classifier database 316. Figure 14: 1410 and Paragraph 0090 teach that an array of feature vectors is determined from the extracted features. FIG. 14 shows an example of an array 1410 of feature vectors. The feature vectors in the array are made up of processed data representing the relative values of features which are time-related, weather-related, and/or energy-related. Note: The labeled feature vectors, which appear to have a variable number of features, as indicated by the mention of currently 17 features, and a variable number (size) of elements associated with each time-related, weather-related, or energy-related feature, are interpreted to read on the claimed each categorical embedding vector comprises a variable number of variably sized elements.];
personalization feature [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: The user-defined labels are interpreted to read on the claimed personalization feature.]
personalization feature vector [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: Labels that are user-defined associated with vectors as label vectors are interpreted to read on the claimed personalization feature vector.]
non-personalization feature vector [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: Making a selection from labels instead of providing other types of user input to identify one or more user-defined labels is interpreted to be making a selection of labels that are not user-defined and are associated with vectors. Labels that are not user-defined that are associated with vectors as label vectors are interpreted to read on the claimed  non-personalization feature vector.] 

Kang discloses some of the limitations as set forth in claim 9 but does not appear to expressly disclose a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records, calculating a query-record based attention score for each of the plurality of features, the query-record based attention score indicating a weight of the corresponding feature, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, combining the fixed-dimensional feature vector and a fixed-dimensional feature vector to produce a fixed-dimensional query-record latent space feature vector, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user.
Meek discloses:
a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records [Paragraph 0003 teaches one mechanism for categorizing media content, such as pictures or video clips, is the use of metadata tags. Paragraph 0052 teaches system 500 can also utilize a list of user preferences stored within a storage component 506 to customize a generation, weighting, presentation, and/or association of tags with an input item 504. Paragraph 0053 teaches preferences can indicate whether to use a MRU tag list or user-specified tag list to generate tags, and what priority to give such criteria (e.g., as discussed at FIG. 2, supra). Paragraph 0054 teaches preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s). Paragraph 0075 teaches at 1106, classifiers and user preferences can be employed to relevance rank the tags. At 1108, the tags can be presented to a device user in order of relevance rank. Note: The classifiers and user preferences for searching recipients (user selects to create a search) and ranking customized tags (plurality of personalization features) (custom fields of input records) based on relevance, where the tags are associated with an MRU (most recently used) list (affinity), are interpreted to read on the claimed generating a similarity factor for each of a plurality of personalization features corresponding to a most recently used affinity.]
wherein the ranking function reorders the list of input records for the search created by the user [Paragraph 0006 teaches tags can be relevance ranked as a function of relatedness to determined criteria, including item content, tags most recently used by a sender and/or recipient, user-defined preferences, one or more user profiles, or the like. Tags can be sorted as a function of relevance ranking, and automatically attributed to one or more items of communication, or presented to a recipient for selection. Paragraph 0054 teaches preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s). Paragraph 0061 teaches component 606 can employ a probabilistic-based or statistical-based approach in connection with choosing between potential tags associated with an input item, optionally auto-assigning tags to input items 604, or offering proposed tags as a function of relevance rank. Note: Ranking/sorting (reordering) tags associated with an input item (list of input records) that are user-defined to search recipients (for the search) reads on the claimed wherein the ranking function reorders the list of input records for the search created by the user.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, by incorporating classifiers and user preferences for searching recipients and ranking customized tags based on relevance, wherein the tags are associated with an MRU, as taught by Meek (see Paragraph 0003, 0052, 0053, 0054, and 0075) because both applications are directed to data processing; incorporating classifiers and user preferences for searching recipients and ranking customized tags based on relevance, wherein the tags are associated with an MRU can greatly improve the speed and accuracy with which tags can be generated and associated with items of communication (see Meek Paragraph 0006).

Kang and Meek disclose some of the limitations as set forth in claim 9 but do not appear to expressly disclose calculating a query-record based attention score for each of the plurality of features, the query-record based attention score indicating a weight of the corresponding feature, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, combining the fixed-dimensional feature vector and a fixed-dimensional feature vector to produce a fixed-dimensional query-record latent space feature vector, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector.
Zhang discloses:
calculating a query-record based attention score for each of the plurality of features, the query-record based attention score indicating a weight of the corresponding feature [Paragraph 0043 teaches machine learning scorer component 224 may be embodied as a machine learning model configured for determining the weights for each term, word, or feature in the reference/query and target/advertisement document, and in an embodiment for determining the probability that a target document (such as an ad) is relevant to the reference (query). Paragraph 0044 teaches the machine learning scorer is based on a logistic regression model such as an L1-regularized logistic regression model. Equation(1) and Equation (2). Logistic regression models the probability distribution of the class label R given a feature vector (q,a) as in Equation (2), where w is the parameters of the logistic regression model; y.sub.i represents the label of the document, for query document i; and P is the probability calculated based on current w. In particular, in an embodiment, w represents the weights, and is a dot-product which can be used in the WAND operator. Note: The cited determining the weights for each term, word, or feature in the reference/query and target/advertisement document is interpreted to read on the claimed calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding feature because the cited weight corresponds to features associated with a query (query-based attention), therefore the cited weight is interpreted  to also include the claimed calculated query-record based attention score.]; 
converting each query-record based attention score to a corresponding probability distribution [Paragraph 0043 teaches machine learning scorer component 224 may be embodied as a machine learning model configured for determining the weights for each term, word, or feature in the reference/query and target/advertisement document, and in an embodiment for determining the probability that a target document (such as an ad) is relevant to the reference (query). Paragraph 0044 teaches the machine learning scorer is based on a logistic regression model such as an L1-regularized logistic regression model. Equation(1) and Equation (2). Logistic regression models the probability distribution of the class label R given a feature vector (q,a) as in Equation (2), where w is the parameters of the logistic regression model; y.sub.i represents the label of the document, for query document i; and P is the probability calculated based on current w. In particular, in an embodiment, w represents the weights, and is a dot-product which can be used in the WAND operator. Note: The examiner interprets logistic regression modeling the probability distribution of the class label R given a feature vector (q,a) as in part of Equation (2), where the weight (query-record based attention score) is the parameters of the logistic regression model and P is the probability (probability distribution). The examiner interprets the weight to be converted to a probability distribution because the probability associated with the logistic regression model modeling the probability distribution is based on the weight (query-record based attention score).];
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang and Meek, by incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution, as taught by Zhang (see Paragraph 0043 and 0044), because the three applications are directed to vector processing; incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution provides improved accuracy of the model (see Zhang Paragraph 0045).
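For illustration only, the relationship relied upon above between a weight vector w and a probability P can be sketched with the standard logistic regression form. Zhang's Equations (1) and (2) are not reproduced in this action, so the formula below is the textbook sigmoid of a dot product and may differ from Zhang's exact expression; the function name and the numeric values are hypothetical.

```python
import math

def relevance_probability(weights, features):
    """Textbook logistic regression: P = 1 / (1 + exp(-(w . x)))."""
    # Dot product of the learned weights and the query/document feature vector.
    score = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical weights and features, chosen only to make the arithmetic easy.
w = [0.5, -0.25, 1.0]
x = [1.0, 2.0, 0.0]
p = relevance_probability(w, x)  # w . x = 0.0, so p = 0.5
```

The sketch shows only that the probability is a function of the weights; it does not by itself establish that the weights are normalized into a distribution over features.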

Kang, Meek, and Zhang disclose some of the limitations as set forth in claim 9 but do not appear to expressly disclose scaling each categorical embedding vector based on the corresponding probability distribution, creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, combining the fixed-dimensional feature vector and a fixed-dimensional feature vector to produce a fixed-dimensional query-record latent space feature vector, and generating a ranking function based on the fixed-dimensional query-record latent space feature vector.
Polak discloses:
scaling each categorical embedding vector based on the corresponding probability distribution [Paragraph 0019 teaches the class probability includes one or more of a plurality of concept probability distribution values each indicating a classification probability score. Paragraph 0094 teaches the calculated class probability may include a class probability score record that may be a vector of class probability distributions of real valued scores. The vector of class probability distributions may be interpreted as probability of a respective concept to be present (exist) in the presented content, e.g. the modalities data 310. Paragraph 0099 teaches each of the score vectors holds the class probability distributions for the respective concept of the respective modality identified in the scene 302. Some of the modalities may produce several score vectors for a single concept. For example, a score vector may be calculated and assigned to a respective visual concept detected in each of several frames extracted from the scene 302. As another example, a first score vector may be calculated and assigned to one or more text concepts extracted from the scene 302 through OCR tools while a second score vector may be calculated and assigned to one or more text objects extracted from the scene 302 using speech-to-text tools. Note: A calculated class probability score that is a vector (categorical embedding vector) including probability distributions of scores (corresponding probability distribution) is interpreted to read on the claimed scaling each categorical embedding vector based on the corresponding probability distribution because the vector representing the class probability depends on the probability distributions (corresponding probability distribution) and therefore would be scaled or adjusted based on the probability distributions.
For example, in the context of the cited prior art, a first calculated score vector (class probability score record that may be a vector) and a second calculated score vector (class probability score record that may be a vector) indicate a plurality of calculated vectors that are based on probability distributions of each respective concept to be present (exist) in the presented content, e.g. the modalities data, therefore, for the probability distribution of each concept there is an adjusted or different class probability score record that may be a vector (categorical embedding vector). In other words, the cited second calculated score vector is a different or changed vector from the cited first calculated score vector.]  
creating a fixed-dimensional feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions [Paragraph 0094 teaches the calculated class probability may include a class probability score record that may be a vector of class probability distributions of real valued scores. The vector of class probability distributions may be interpreted as probability of a respective concept to be present (exist) in the presented content, e.g. the modalities data 310. Paragraph 0100 teaches the aggregation aims to aggregate the one or more score vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality. Note: The one in-modality class probability vector of the same dimensionality (a fixed-dimensional personalization feature vector) that is created by aggregating the one or more score vectors of a certain dimensionality (scaled categorical embedding vectors based on the probability distributions) is interpreted to read on the claimed creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions because, in the context of the cited prior art, one or more score vectors of a certain dimensionality reasonably includes scaled vectors and one in-modality class probability vector of the same dimensionality is interpreted to be created based on certain dimensionality vectors.]; 
combining the fixed-dimensional feature vector and a fixed-dimensional vector to produce a fixed-dimensional query-record latent space feature vector [Paragraph 0044 teaches the staged classification approach employs several methods and techniques for accurately categorizing semantically the video stream content for media search, media monitoring and/or media targeting applications. Paragraph 0094 teaches the calculated class probability may include a class probability score record that may be a vector of class probability distributions of real valued scores. The vector of class probability distributions may be interpreted as probability of a respective concept to be present (exist) in the presented content, e.g. the modalities data 310. Paragraph 0100 teaches the aggregation aims to aggregate the one or more score vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality. Note: Aggregating the one or more score vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality is interpreted to also read on the claimed combining the fixed-dimensional feature vector and a fixed-dimensional vector to produce a fixed-dimensional query-record latent space feature vector because aggregating (combining) vectors (fixed-dimensional feature vector and a fixed-dimensional vector) results in (produces) one in-modality class probability vector of the same dimensionality (a fixed-dimensional query-record latent space feature vector). To further elaborate, in-modality or state is interpreted to be the claimed latent space; the cited one or same dimensionality is interpreted to be the claimed fixed-dimensional feature vector; the cited classification associated with the vectors is interpreted to be the claimed features; and the media search (query) associated with a classification approach to accurately categorize content is interpreted to include the claimed query-record.
Furthermore, the claimed fixed-dimensional query-record latent space feature vector and fixed-dimensional feature vector are both fixed-dimensional feature vectors based on combining or aggregating vectors. Therefore, the cited one in-modality class probability vector of the same dimensionality produced by combining and aggregating vectors associated with a search (query record) reads on the claimed fixed-dimensional query-record latent space feature vector.]; and
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, and Zhang, by incorporating aggregating the one or more vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality, as taught by Polak (see Paragraph 0044, 0094, and 0100), because the four applications are directed to vector processing; incorporating aggregating the one or more vectors of a certain dimensionality to one in-modality class probability vector of the same dimensionality significantly improves the precision of categorization (see Polak Paragraph 0050).
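For illustration only, the aggregation described in Polak Paragraph 0100 (several score vectors of a certain dimensionality reduced to one vector of the same dimensionality) can be sketched as follows. The cited paragraph, as quoted in this action, does not state which aggregation operator is used; the element-wise mean below is an assumption, and the function name and sample values are hypothetical.

```python
def aggregate_score_vectors(score_vectors):
    """Reduce N score vectors of dimension D to one vector of dimension D.

    Element-wise mean is an assumed operator, used only for illustration.
    """
    if not score_vectors:
        raise ValueError("at least one score vector is required")
    dim = len(score_vectors[0])
    if any(len(v) != dim for v in score_vectors):
        raise ValueError("all score vectors must share one dimensionality")
    count = len(score_vectors)
    return [sum(v[i] for v in score_vectors) / count for i in range(dim)]

# Hypothetical per-frame class probability vectors for a single modality.
frames = [[0.2, 0.8], [0.4, 0.6], [0.6, 0.4]]
aggregated = aggregate_score_vectors(frames)  # approximately [0.4, 0.6]
```

The output length equals the input dimensionality regardless of how many score vectors are supplied, which is the property relied upon in the mapping above.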

Kang, Meek, Zhang, and Polak disclose most of the limitations as set forth in claim 9 but do not appear to expressly disclose generating a ranking function based on the fixed-dimensional query-record latent space feature vector.
Berkman discloses:
generating a ranking function based on the fixed-dimensional query-record latent space feature vector [Paragraph 0116 teaches for calculating the ranking function, the ranking function calculator 120 represents each content object as a vector in a, possibly high-dimensional, space of content-characterizing features selected by the ranking function calculator 120, to characterize content… the content-characterizing features are denoted by F, and the space of content-characterizing features is thus of dimensionality |F|. Paragraph 0122 teaches the ranking function calculator 120 calculates the ranking function 350 based on the user's interactions 360 with the content objects 35, as tracked by the interaction tracker 110. Note: Calculating the ranking function based on a vector representation of content objects associated with high dimensional space of content-characterizing features (latent space feature) is interpreted to read on the claimed generating a ranking function based on the fixed-dimensional query-record latent space feature vector because the vector is associated with a dimension set (fixed) to |F| and based on the cited vector the ranking function is calculated (generating a ranking function).]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, and Polak, by incorporating calculating the ranking function based on a vector representation of content objects associated with high dimensional space of content-characterizing features (latent space feature), as taught by Berkman (see Paragraph 0116 and 0122), because the five applications are directed to vector processing; incorporating calculating the ranking function based on a vector representation of content objects associated with high dimensional space of content-characterizing features (latent space feature) reduces the burden put on the user (see Berkman Paragraph 0011).
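For illustration only, a ranking function over content objects represented as vectors of dimensionality |F| can be sketched as below. Berkman, as quoted in this action, derives the ranking function from tracked user interactions; the linear scorer with fixed weights used here is an assumption standing in for that calculation, and all names and values are hypothetical.

```python
def make_ranking_function(weights):
    """Return a function that orders objects by a linear score over |F| features."""
    def rank(objects):
        # objects maps an object identifier to its |F|-dimensional feature vector.
        def score(vector):
            return sum(w * x for w, x in zip(weights, vector))
        return sorted(objects, key=lambda name: score(objects[name]), reverse=True)
    return rank

# Hypothetical weights (|F| = 2) and content objects.
rank = make_ranking_function([1.0, 0.5])
order = rank({"a": [0.1, 0.2], "b": [0.9, 0.0], "c": [0.3, 0.8]})
# Scores: a = 0.2, b = 0.9, c = 0.7, so the reordered list is ["b", "c", "a"].
```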

Claim(s) 2, 6, and 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kang et al. (U.S. Publication No.: US 20200408566 A1) hereinafter Kang, in view of Meek et al. (U.S. Publication No.: US 20090006285 A1) hereinafter Meek, in view of Zhang et al. (U.S. Publication No.: US 20150186938 A1) hereinafter Zhang, in view of Polak et al. (U.S. Publication No.: US 20180032845 A1) hereinafter Polak, in view of Berkman et al. (U.S. Publication No.: US 20110066613 A1) hereinafter Berkman, in view of Ispir et al. (U.S. Publication No.: 20220039735 A1) hereinafter Ispir, and further in view of Al Hasan et al. (U.S. Publication No.: US 20200175015 A1) hereinafter Al Hasan.
As to claim 2:
Kang discloses:
non-personalization feature vector [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: Making a selection from labels instead of providing other types of user input to identify one or more user-defined labels is interpreted to be making a selection of labels that are not user-defined and are associated with vectors. Labels that are not user-defined that are associated with vectors as label vectors are interpreted to read on the claimed  non-personalization feature vector.]

Kang, Meek, Zhang, Polak, and Berkman disclose all of the limitations as set forth in claim 1 and some of the limitations of claim 2 but do not appear to expressly disclose the computer-implemented method of claim 1, wherein calculating a query-record based attention score for each of the plurality of personalization features comprises: multiplying the fixed-dimensional vector by a weight matrix to produce a weighted fixed-dimensional vector, weighted fixed-dimensional vector, and for each of the categorical embedding vectors: calculating a dot product of the vector and the categorical embedding vector to produce the corresponding query-record based attention score.
Ispir discloses:
The computer-implemented method of claim 1, wherein calculating a query-record based attention score for each of the plurality of personalization features comprises: multiplying the fixed-dimensional vector by a weight matrix to produce a weighted fixed-dimensional vector [Paragraph 0027 teaches the embedding process 104 converts the raw EEG trial data 102 into a vector of fixed length. Paragraph 0033 teaches attention vectors can then be concatenated and multiplied by an additional weight matrix to yield a single attention vector for each received embedding. Note: Multiplying concatenated attention vectors that are of fixed length (fixed-dimensional vectors) by a weight matrix to yield a single attention vector that is of fixed length (weighted fixed-dimensional vector) is interpreted to read on the claimed multiplying the fixed-dimensional vector by a weight matrix to produce a weighted fixed-dimensional vector.]; and
weighted fixed-dimensional vector [Paragraph 0027 teaches the embedding process 104 converts the raw EEG trial data 102 into a vector of fixed length. Paragraph 0033 teaches attention vectors can then be concatenated and multiplied by an additional weight matrix to yield a single attention vector for each received embedding. Note: Multiplying concatenated attention vectors that are of fixed length (fixed-dimensional vectors) by a weight matrix to yield a single attention vector that is of fixed length (weighted fixed-dimensional vector) is interpreted to read on the claimed multiplying the fixed-dimensional vector by a weight matrix to produce a weighted fixed-dimensional vector.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, and Berkman, by incorporating multiplying concatenated attention vectors (vectors) by a weight matrix to yield a single attention vector (weighted vector), as taught by Ispir (see Paragraph 0027 and 0033), because the six applications are directed to vector processing; incorporating multiplying concatenated attention vectors (vectors) by a weight matrix to yield a single attention vector (weighted vector) improves the efficiency of training the model and may improve the generalizability of the model (see Ispir Paragraph 0030).

Kang, Meek, Zhang, Polak, Berkman, and Ispir disclose all of the limitations as set forth in claim 1 and most of claim 2 but do not appear to expressly disclose for each of the categorical embedding vectors: calculating a dot product of the vector and the categorical embedding vector to produce the corresponding query-record based attention score.
Al Hasan discloses:
for each of the categorical embedding vectors: calculating a dot product of the vector and the categorical embedding vector to produce the corresponding query-record based attention score [Paragraph 0080 teaches referring to FIG. 6, in one embodiment, is a high-level representation of a CRF classifier 600 to generate a word sequence using document words labeled sequence identifiers, to generate a response to the query… the input to the CRF classifier 600 is the concatenated features M from the attend layer 540 of the hierarchical self-attention mechanism 500. Paragraph 0082 teaches linear layer generates a scalar score for each of the possible B, I, O labels, for each word embedding vector. For example, given a label, there is an associated trainable vector, and the method uses the vector to dot-product with the word embedding from 600. Note: For each word embedding label vector (categorical embedding vectors), the linear layer generating a score (query-record based attention score) based on query input (query-record) associated with a self-attention mechanism (attention) and further based on vectors including word embedding label vector is interpreted to read on the claimed for each of the categorical embedding vectors: calculating a dot product of the vector and the categorical embedding vector to produce the corresponding query-record based attention score.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, Berkman, and Ispir, by incorporating for each word embedding label vector, the linear layer generating a score based on query input associated with a self-attention mechanism and further based on vectors including word embedding label vector, as taught by Al Hasan (see Paragraph 0080 and 0082), because the seven applications are directed to vector processing; incorporating for each word embedding label vector, the linear layer generating a score based on query input associated with a self-attention mechanism and further based on vectors including word embedding label vector provides significant technological improvement over existing computerized systems used to automatically generate responses or answers to a query (see Al Hasan Paragraph 0103).
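For illustration only, one possible reading of the sequence of operations recited in claims 1 and 2 (weight-matrix multiplication, per-embedding dot product, conversion to a probability distribution, scaling, aggregation, and combination) is sketched below. This is not the applicant's implementation and is not taken from any cited reference. It assumes that every categorical embedding vector has already been brought to a common dimension, that softmax is used for the conversion to a probability distribution, and that concatenation is used for the combining step; all names and values are hypothetical.

```python
import math

def matvec(matrix, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Assumed conversion of attention scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(non_pers, weight_matrix, embeddings):
    # Claim 2: weight the fixed-dimensional non-personalization vector ...
    weighted = matvec(weight_matrix, non_pers)
    # ... then take its dot product with each categorical embedding vector.
    scores = [dot(weighted, e) for e in embeddings]
    probs = softmax(scores)
    # Scale each embedding by its probability and sum into one fixed-size vector.
    dim = len(embeddings[0])
    pooled = [sum(p * e[i] for p, e in zip(probs, embeddings)) for i in range(dim)]
    # Assumed combination step: concatenate with the non-personalization vector.
    return scores, probs, pooled + list(non_pers)

non_pers = [1.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # any number of embeddings
scores, probs, latent = attend(non_pers, identity, embeddings)
```

The length of `latent` depends only on the embedding dimension and the length of `non_pers`, not on how many embeddings are supplied.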

As to claim 6:
Kang discloses:
non-personalization feature vector [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: Making a selection from labels instead of providing other types of user input to identify one or more user-defined labels is interpreted to be making a selection of labels that are not user-defined and are associated with vectors. Labels that are not user-defined that are associated with vectors as label vectors are interpreted to read on the claimed  non-personalization feature vector.]

Ispir discloses:
The non-transitory machine-readable storage medium of claim 5, wherein calculating a query-record based attention score for each of the plurality of personalization features comprises: multiplying the fixed-dimensional vector by a weight matrix to produce a weighted fixed-dimensional vector [Paragraph 0027 teaches the embedding process 104 converts the raw EEG trial data 102 into a vector of fixed length. Paragraph 0033 teaches attention vectors can then be concatenated and multiplied by an additional weight matrix to yield a single attention vector for each received embedding. Note: Multiplying concatenated attention vectors that are of fixed length (fixed-dimensional vectors) by a weight matrix to yield a single attention vector that is of fixed length (weighted fixed-dimensional vector) is interpreted to read on the claimed multiplying the fixed-dimensional vector by a weight matrix to produce a weighted fixed-dimensional vector.]; and
weighted fixed-dimensional vector [Paragraph 0027 teaches the embedding process 104 converts the raw EEG trial data 102 into a vector of fixed length. Paragraph 0033 teaches attention vectors can then be concatenated and multiplied by an additional weight matrix to yield a single attention vector for each received embedding. Note: Multiplying concatenated attention vectors that are of fixed length (fixed-dimensional vectors) by a weight matrix to yield a single attention vector that is of fixed length (weighted fixed-dimensional vector) is interpreted to read on the claimed multiplying the fixed-dimensional vector by a weight matrix to produce a weighted fixed-dimensional vector.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, and Berkman, by incorporating multiplying concatenated attention vectors (vectors) by a weight matrix to yield a single attention vector (weighted vector), as taught by Ispir (see Paragraph 0027 and 0033), because the six applications are directed to vector processing; incorporating multiplying concatenated attention vectors (vectors) by a weight matrix to yield a single attention vector (weighted vector) improves the efficiency of training the model and may improve the generalizability of the model (see Ispir Paragraph 0030).

Kang, Meek, Zhang, Polak, Berkman, and Ispir discloses all of the limitations as set forth in claim 5 and most of claim 6 but does not appear to expressly disclose for each of the categorical embedding vectors: calculating a dot product of the vector and the categorical embedding vector to produce the corresponding query-record based attention score.
Al Hasan discloses:
for each of the categorical embedding vectors: calculating a dot product of the vector and the categorical embedding vector to produce the corresponding query-record based attention score [Paragraph 0080 teaches referring to FIG. 6, in one embodiment, is a high-level representation of a CRF classifier 600 to generate a word sequence using document words labeled sequence identifiers, to generate a response to the query… the input to the CRF classifier 600 is the concatenated features M from the attend layer 540 of the hierarchical self-attention mechanism 500. Paragraph 0082 teaches linear layer generates a scalar score for each of the possible B, I, O labels, for each word embedding vector. For example, given a label, there is an associated trainable vector, and the method uses the vector to dot-product with the word embedding from 600. Note: For each word embedding label vector (categorical embedding vectors), the linear layer generating a score (query-record based attention score) based on query input (query-record) associated with a self-attention mechanism (attention) and further based on vectors including word embedding label vector is interpreted to read on the claimed for each of the categorical embedding vectors: calculating a dot product of the vector and the categorical embedding vector to produce the corresponding query-record based attention score.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, Berkman, and Ispir, by incorporating for each word embedding label vector, the linear layer generating a score based on query input associated with a self-attention mechanism and further based on vectors including word embedding label vector, as taught by Al Hasan (see Paragraph 0080 and 0082), because the seven applications are directed to vector processing; incorporating for each word embedding label vector, the linear layer generating a score based on query input associated with a self-attention mechanism and further based on vectors including word embedding label vector provides significant technological improvement over existing computerized systems used to automatically generate responses or answers to a query (see Al Hasan Paragraph 0103).
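
For illustration only, the scoring limitation discussed above (multiplying the fixed-dimensional vector by a weight matrix, then taking a dot product of the result with each categorical embedding vector) can be sketched as follows. This is a minimal sketch and is not taken from the application or from any cited reference. The function names, the use of plain Python lists, and the assumption that every embedding has the same length as the weighted vector are hypothetical.

```python
from typing import List, Sequence


def matvec(weight_matrix: Sequence[Sequence[float]],
           vector: Sequence[float]) -> List[float]:
    """Multiply a weight matrix (given as a list of rows) by a vector."""
    for row in weight_matrix:
        if len(row) != len(vector):
            raise ValueError("each matrix row must match the vector length")
    return [sum(w * x for w, x in zip(row, vector)) for row in weight_matrix]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    return sum(x * y for x, y in zip(a, b))


def attention_scores(fixed_vector: Sequence[float],
                     weight_matrix: Sequence[Sequence[float]],
                     embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Return one scalar score per embedding.

    Assumption (hypothetical): each embedding has already been brought to
    the same dimension as the weighted vector.
    """
    weighted = matvec(weight_matrix, fixed_vector)  # weighted fixed-dimensional vector
    return [dot(weighted, e) for e in embeddings]   # one dot product per embedding
```

With `fixed_vector = [1.0, 2.0]`, `weight_matrix = [[2.0, 0.0], [0.0, 1.0]]`, and three embeddings `[1.0, 0.0]`, `[0.0, 1.0]`, `[1.0, 1.0]`, the weighted vector is `[2.0, 2.0]` and the scores are `[2.0, 2.0, 4.0]`.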

As to claim 10:
Kang discloses:
non-personalization feature vector [Paragraph 0084 teaches user-defined labels (or categories) may also be incorporated. In step 908, an engineer reviews default anomalies and determines labels. The engineer may make a selection or provide other types of user input to identify one or more user-defined labels for anomalies the engineer wishes to address in building management. These labels are stored in a database 910. Next, in step 912 label vectors are determined from the labels stored in database 910.... a pre-trainer classifier may be created that takes into account features learned through automated processing of feature vectors and through labels learned through automated processing of label vectors corresponding to user-defined labels. Note: Making a selection from labels instead of providing other types of user input to identify one or more user-defined labels is interpreted to be making a selection of labels that are not user-defined and are associated with vectors. Labels that are not user-defined that are associated with vectors as label vectors are interpreted to read on the claimed  non-personalization feature vector.]

Ispir discloses:
The apparatus of claim 9, wherein calculating a query-record based attention score for each of the plurality of personalization features comprises: multiplying the fixed-dimensional vector by a weight matrix to produce a weighted fixed-dimensional vector [Paragraph 0027 teaches the embedding process 104 converts the raw EEG trial data 102 into a vector of fixed length. Paragraph 0033 teaches attention vectors can then be concatenated and multiplied by an additional weight matrix to yield a single attention vector for each received embedding. Note: Multiplying concatenated attention vectors that are of fixed length (fixed-dimensional vectors) by a weight matrix to yield a single attention vector that is of fixed length (weighted fixed-dimensional vector) is interpreted to read on the claimed multiplying the fixed-dimensional vector by a weight matrix to produce a weighted fixed-dimensional vector.]; and
weighted fixed-dimensional vector [Paragraph 0027 teaches the embedding process 104 converts the raw EEG trial data 102 into a vector of fixed length. Paragraph 0033 teaches attention vectors can then be concatenated and multiplied by an additional weight matrix to yield a single attention vector for each received embedding. Note: Multiplying concatenated attention vectors that are of fixed length (fixed-dimensional vectors) by a weight matrix to yield a single attention vector that is of fixed length (weighted fixed-dimensional vector) is interpreted to read on the claimed multiplying the fixed-dimensional vector by a weight matrix to produce a weighted fixed-dimensional vector.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, and Berkman, by incorporating multiplying concatenated attention vectors (vectors) by a weight matrix to yield a single attention vector (weighted vector), as taught by Ispir (see Paragraph 0027 and 0033), because the six applications are directed to vector processing; incorporating multiplying concatenated attention vectors (vectors) by a weight matrix to yield a single attention vector (weighted vector) improves the efficiency of training the model and may improve the generalizability of the model (see Ispir Paragraph 0030).

Kang, Meek, Zhang, Polak, Berkman, and Ispir discloses all of the limitations as set forth in claim 9 and most of claim 10 but does not appear to expressly disclose for each of the categorical embedding vectors: calculating a dot product of the vector and the categorical embedding vector to produce the corresponding query-record based attention score.
Al Hasan discloses:
for each of the categorical embedding vectors: calculating a dot product of the vector and the categorical embedding vector to produce the corresponding query-record based attention score [Paragraph 0080 teaches referring to FIG. 6, in one embodiment, is a high-level representation of a CRF classifier 600 to generate a word sequence using document words labeled sequence identifiers, to generate a response to the query… the input to the CRF classifier 600 is the concatenated features M from the attend layer 540 of the hierarchical self-attention mechanism 500. Paragraph 0082 teaches linear layer generates a scalar score for each of the possible B, I, O labels, for each word embedding vector. For example, given a label, there is an associated trainable vector, and the method uses the vector to dot-product with the word embedding from 600. Note: For each word embedding label vector (categorical embedding vectors), the linear layer generating a score (query-record based attention score) based on query input (query-record) associated with a self-attention mechanism (attention) and further based on vectors including word embedding label vector is interpreted to read on the claimed for each of the categorical embedding vectors: calculating a dot product of the vector and the categorical embedding vector to produce the corresponding query-record based attention score.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, Berkman, and Ispir, by incorporating for each word embedding label vector, the linear layer generating a score based on query input associated with a self-attention mechanism and further based on vectors including word embedding label vector, as taught by Al Hasan (see Paragraph 0080 and 0082), because the seven applications are directed to vector processing; incorporating for each word embedding label vector, the linear layer generating a score based on query input associated with a self-attention mechanism and further based on vectors including word embedding label vector provides significant technological improvement over existing computerized systems used to automatically generate responses or answers to a query (see Al Hasan Paragraph 0103).

Claim(s) 3, 7, and 11 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kang et al. (U.S. Publication No.: US 20200408566 A1) hereinafter Kang, in view of Meek et al. (U.S. Publication No.: US 20090006285 A1) hereinafter Meek, in view of Zhang et al. (U.S. Publication No.: US 20150186938 A1) hereinafter Zhang, in view of Polak et al. (U.S. Publication No.: US 20180032845 A1) hereinafter Polak, in view of Berkman et al. (U.S. Publication No.: US 20110066613 A1) hereinafter Berkman, and further in view of Ustimenko et al. (U.S. Publication No.: US 20210319359 A1) hereinafter Ustimenko.
As to claim 3:
Kang, Meek, Zhang, Polak, and Berkman discloses all of the limitations as set forth in claim 1 but does not appear to expressly disclose the computer-implemented method of claim 1, wherein the ranking function is selected from the group consisting of: pointwise; pairwise; groupwise; and set-wise.
Ustimenko discloses:
The computer-implemented method of claim 1, wherein the ranking function is selected from the group consisting of: pointwise; pairwise; groupwise; and set-wise [Paragraph 0102 teaches different approaches may be used to optimize the ranking quality of algorithms that are used for ranking documents based on estimated relevance and which typically fall within one of the following: point-wise approach, pair-wise approach, and list-wise approach. Note: Selecting ranking algorithms approaches such as point-wise approach, pair-wise approach, and list-wise approach (groupwise and set-wise) reads on the claimed ranking function is selected from the group consisting of pointwise, pairwise, groupwise, and set-wise.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, and Berkman, by incorporating selecting ranking algorithms approaches such as point-wise approach, pair-wise approach, and list-wise approach (groupwise and set-wise), as taught by Ustimenko (see Paragraph 0102), because the six applications are directed to vector processing; incorporating selecting ranking algorithms approaches such as point-wise approach, pair-wise approach, and list-wise approach (groupwise and set-wise) provides a desired level of accuracy (see Ustimenko Paragraph 0137).
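
For illustration only, the three approaches named in the cited passage (point-wise, pair-wise, and list-wise) can be contrasted by the loss each would minimize when a ranking function is trained. The following is a minimal sketch under stated assumptions and is not drawn from Ustimenko or from the application: the function names are hypothetical, squared error stands in for the point-wise case, a hinge loss for the pair-wise case, and a softmax cross-entropy (a ListNet-style top-one loss) for the list-wise case.

```python
import math
from typing import List, Sequence


def pointwise_loss(scores: Sequence[float], labels: Sequence[float]) -> float:
    """Point-wise: each record is scored against its own label (mean squared error)."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_loss(scores: Sequence[float], labels: Sequence[float],
                  margin: float = 1.0) -> float:
    """Pair-wise: penalize pairs where the more relevant record does not
    outscore the less relevant one by at least `margin` (hinge loss)."""
    n = len(scores)
    losses = [max(0.0, margin - (scores[i] - scores[j]))
              for i in range(n) for j in range(n) if labels[i] > labels[j]]
    return sum(losses) / len(losses) if losses else 0.0


def _softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of values."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(scores: Sequence[float], labels: Sequence[float]) -> float:
    """List-wise: compare the whole score list to the whole label list
    (cross-entropy between softmax(labels) and softmax(scores))."""
    target = _softmax(labels)
    predicted = _softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))
```

In this sketch, scores that order the records the same way as the labels yield a lower list-wise loss than scores that reverse the order.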

As to claim 7:
Kang, Meek, Zhang, Polak, and Berkman discloses all of the limitations as set forth in claim 5 but does not appear to expressly disclose the non-transitory machine-readable storage medium of claim 5, wherein the ranking function is selected from the group consisting of pointwise; pairwise; groupwise; and set-wise.
Ustimenko discloses:
The non-transitory machine-readable storage medium of claim 5, wherein the ranking function is selected from the group consisting of pointwise; pairwise; groupwise; and set-wise [Paragraph 0102 teaches different approaches may be used to optimize the ranking quality of algorithms that are used for ranking documents based on estimated relevance and which typically fall within one of the following: point-wise approach, pair-wise approach, and list-wise approach. Note: Selecting ranking algorithms approaches such as point-wise approach, pair-wise approach, and list-wise approach (groupwise and set-wise) reads on the claimed ranking function is selected from the group consisting of pointwise, pairwise, groupwise, and set-wise.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, and Berkman, by incorporating selecting ranking algorithms approaches such as point-wise approach, pair-wise approach, and list-wise approach (groupwise and set-wise), as taught by Ustimenko (see Paragraph 0102), because the six applications are directed to vector processing; incorporating selecting ranking algorithms approaches such as point-wise approach, pair-wise approach, and list-wise approach (groupwise and set-wise) provides a desired level of accuracy (see Ustimenko Paragraph 0137).

As to claim 11:
Kang, Meek, Zhang, Polak, and Berkman discloses all of the limitations as set forth in claim 9 but does not appear to expressly disclose the apparatus of claim 9, wherein the ranking function is selected from the group consisting of pointwise; pairwise; groupwise; and set-wise.
Ustimenko discloses:
The apparatus of claim 9, wherein the ranking function is selected from the group consisting of pointwise; pairwise; groupwise; and set-wise [Paragraph 0102 teaches different approaches may be used to optimize the ranking quality of algorithms that are used for ranking documents based on estimated relevance and which typically fall within one of the following: point-wise approach, pair-wise approach, and list-wise approach. Note: Selecting ranking algorithms approaches such as point-wise approach, pair-wise approach, and list-wise approach (groupwise and set-wise) reads on the claimed ranking function is selected from the group consisting of pointwise, pairwise, groupwise, and set-wise.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, and Berkman, by incorporating selecting ranking algorithms approaches such as point-wise approach, pair-wise approach, and list-wise approach (groupwise and set-wise), as taught by Ustimenko (see Paragraph 0102), because the six applications are directed to vector processing; incorporating selecting ranking algorithms approaches such as point-wise approach, pair-wise approach, and list-wise approach (groupwise and set-wise) provides a desired level of accuracy (see Ustimenko Paragraph 0137).

Claim(s) 4, 8, and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kang et al. (U.S. Publication No.: US 20200408566 A1) hereinafter Kang, in view of Meek et al. (U.S. Publication No.: US 20090006285 A1) hereinafter Meek, in view of Zhang et al. (U.S. Publication No.: US 20150186938 A1) hereinafter Zhang, in view of Polak et al. (U.S. Publication No.: US 20180032845 A1) hereinafter Polak, in view of Berkman et al. (U.S. Publication No.: US 20110066613 A1) hereinafter Berkman, and further in view of Creed et al. (WO 2019186198 A1) hereinafter Creed.
As to claim 4:
Kang, Meek, Zhang, Polak, and Berkman discloses all of the limitations as set forth in claim 1.
Zhang also discloses:
query-record based attention score [Paragraph 0043 teaches machine learning scorer component 224 may be embodied as a machine learning model configured for determining the weights for each term, word, or feature in the reference/query and target/advertisement document, and in an embodiment for determining the probability that a target document (such as an ad) is relevant to the reference (query). Paragraph 0044 teaches the machine learning scorer is based on a logistic regression model such as an L1-regularized logistic regression model. Equation(1) and Equation (2). Logistic regression models the probability distribution of the class label R given a feature vector (q,a) as in Equation (2), where w is the parameters of the logistic regression model; y.sub.i represents the label of the document, for query document i; and P is the probability calculated based on current w. In particular, in an embodiment, w represents the weights, and is a dot-product which can be used in the WAND operator. Note: The cited determining the weights for each term, word, or feature in the reference/query and target/advertisement document is interpreted to read on the claimed calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding feature because the cited weight corresponds to features associated with a query (query-based attention), therefore the cited weight is interpreted  to also include the claimed calculated query-record based attention score.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, by incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution, as taught by Zhang (see Paragraph 0043 and 0044), because both applications are directed to vector processing; incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution provides improved accuracy of the model (see Zhang Paragraph 0045).

Kang, Meek, Zhang, Polak, and Berkman discloses all of the limitations as set forth in claim 1 but does not appear to expressly disclose the computer-implemented method of claim 1, wherein converting each score to a corresponding probability distribution comprises using a softmax activation function.
Creed discloses:
The computer-implemented method of claim 1, wherein converting each score to a corresponding probability distribution comprises using a softmax activation function [Paragraph 0031 teaches preferably, calculating the attention function further comprises calculating an attention function based on one or more from the group of: a SOFTMAX attention function, wherein each attention weight, $a_i^n$, is calculated based on $a_i^n = e^{s_i^n} / \sum_j e^{s_j^n}$. Paragraph 00126 teaches where $L(\cdot)$ is the attention function that maps a score vector 114 to a probability distribution $A^n = \{a_i > 0, \sum_i a_i = 1\}$. Note: Mapping scores to a probability distribution using a preferable softmax function is interpreted to read on the claimed converting each score to a corresponding probability distribution comprises using a softmax activation function because the claimed converting is interpreted to be mapping from a score to a probability distribution. The examiner further interprets the claimed softmax activation function to be a softmax function.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, and Berkman, by incorporating mapping scores to a probability distribution using a preferable softmax function, as taught by Creed (see Paragraph 0031 and 00126), because the six applications are directed to vector processing; incorporating mapping scores to a probability distribution using a preferable softmax function improves the input dataset to a ML model and/or classifier (see Creed Paragraph 0088).
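
For illustration only, the conversion of attention scores to a probability distribution by a softmax function, followed by the scaling and aggregation recited in claim 1, can be sketched as follows. This is a minimal sketch and is not taken from the application or from Creed. The function names are hypothetical, the max-subtraction step is a common numerical-stability measure added here, and the embeddings are assumed to share one fixed length.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Map raw scores to positive weights that sum to 1."""
    if not scores:
        raise ValueError("scores must be non-empty")
    m = max(scores)                            # subtract the max to avoid overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(weights: Sequence[float],
                 embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Scale each embedding by its weight and add the results element-wise,
    yielding one fixed-dimensional vector (equal-length embeddings assumed)."""
    if len(weights) != len(embeddings):
        raise ValueError("need exactly one weight per embedding")
    dim = len(embeddings[0])
    out = [0.0] * dim
    for w, emb in zip(weights, embeddings):
        if len(emb) != dim:
            raise ValueError("embeddings must share one dimension")
        for k, value in enumerate(emb):
            out[k] += w * value
    return out
```

Two equal scores receive equal weights of 0.5 each, so `weighted_sum(softmax([0.0, 0.0]), [[2.0, 0.0], [0.0, 2.0]])` evaluates to `[1.0, 1.0]`.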

As to claim 8:
Kang, Meek, Zhang, Polak, and Berkman discloses all of the limitations as set forth in claim 5.
Zhang also discloses:
query-record based attention score [Paragraph 0043 teaches machine learning scorer component 224 may be embodied as a machine learning model configured for determining the weights for each term, word, or feature in the reference/query and target/advertisement document, and in an embodiment for determining the probability that a target document (such as an ad) is relevant to the reference (query). Paragraph 0044 teaches the machine learning scorer is based on a logistic regression model such as an L1-regularized logistic regression model. Equation(1) and Equation (2). Logistic regression models the probability distribution of the class label R given a feature vector (q,a) as in Equation (2), where w is the parameters of the logistic regression model; y.sub.i represents the label of the document, for query document i; and P is the probability calculated based on current w. In particular, in an embodiment, w represents the weights, and is a dot-product which can be used in the WAND operator. Note: The cited determining the weights for each term, word, or feature in the reference/query and target/advertisement document is interpreted to read on the claimed calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding feature because the cited weight corresponds to features associated with a query (query-based attention), therefore the cited weight is interpreted  to also include the claimed calculated query-record based attention score.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang and Meek, by incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution, as taught by Zhang (see Paragraph 0043 and 0044), because the three applications are directed to vector processing; incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution provides improved accuracy of the model (see Zhang Paragraph 0045).

Kang, Meek, Zhang, Polak, and Berkman discloses all of the limitations as set forth in claim 5 but does not appear to expressly disclose the non-transitory machine-readable storage medium of claim 5, wherein converting each score to a corresponding probability distribution comprises using a softmax activation function.
Creed discloses:
The non-transitory machine-readable storage medium of claim 5, wherein converting each score to a corresponding probability distribution comprises using a softmax activation function [Paragraph 0031 teaches preferably, calculating the attention function further comprises calculating an attention function based on one or more from the group of: a SOFTMAX attention function, wherein each attention weight, $a_i^n$, is calculated based on $a_i^n = e^{s_i^n} / \sum_j e^{s_j^n}$. Paragraph 00126 teaches where $L(\cdot)$ is the attention function that maps a score vector 114 to a probability distribution $A^n = \{a_i > 0, \sum_i a_i = 1\}$. Note: Mapping scores to a probability distribution using a preferable softmax function is interpreted to read on the claimed converting each score to a corresponding probability distribution comprises using a softmax activation function because the claimed converting is interpreted to be mapping from a score to a probability distribution. The examiner further interprets the claimed softmax activation function to be a softmax function.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, and Berkman, by incorporating mapping scores to a probability distribution using a preferable softmax function, as taught by Creed (see Paragraph 0031 and 00126), because the six applications are directed to vector processing; incorporating mapping scores to a probability distribution using a preferable softmax function improves the input dataset to a ML model and/or classifier (see Creed Paragraph 0088).

As to claim 12:
Kang, Meek, Zhang, Polak, and Berkman discloses all of the limitations as set forth in claim 9.
Zhang also discloses:
query-record based attention score [Paragraph 0043 teaches machine learning scorer component 224 may be embodied as a machine learning model configured for determining the weights for each term, word, or feature in the reference/query and target/advertisement document, and in an embodiment for determining the probability that a target document (such as an ad) is relevant to the reference (query). Paragraph 0044 teaches the machine learning scorer is based on a logistic regression model such as an L1-regularized logistic regression model. Equation(1) and Equation (2). Logistic regression models the probability distribution of the class label R given a feature vector (q,a) as in Equation (2), where w is the parameters of the logistic regression model; y.sub.i represents the label of the document, for query document i; and P is the probability calculated based on current w. In particular, in an embodiment, w represents the weights, and is a dot-product which can be used in the WAND operator. Note: The cited determining the weights for each term, word, or feature in the reference/query and target/advertisement document is interpreted to read on the claimed calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding feature because the cited weight corresponds to features associated with a query (query-based attention), therefore the cited weight is interpreted  to also include the claimed calculated query-record based attention score.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang and Meek, by incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution, as taught by Zhang (see Paragraph 0043 and 0044), because the three applications are directed to vector processing; incorporating determining weights for each term, word, or feature in the reference/query and target/advertisement document and logistic regression modeling the probability distribution provides improved accuracy of the model (see Zhang Paragraph 0045).

Kang, Meek, Zhang, Polak, and Berkman discloses all of the limitations as set forth in claim 9 but does not appear to expressly disclose the apparatus of claim 9, wherein converting each score to a corresponding probability distribution comprises using a softmax activation function.
Creed discloses:
The apparatus of claim 9, wherein converting each score to a corresponding probability distribution comprises using a softmax activation function [Paragraph 0031 teaches preferably, calculating the attention function further comprises calculating an attention function based on one or more from the group of: a SOFTMAX attention function, wherein each attention weight, $a_i^n$, is calculated based on $a_i^n = e^{s_i^n} / \sum_j e^{s_j^n}$. Paragraph 00126 teaches where $L(\cdot)$ is the attention function that maps a score vector 114 to a probability distribution $A^n = \{a_i > 0, \sum_i a_i = 1\}$. Note: Mapping scores to a probability distribution using a preferable softmax function is interpreted to read on the claimed converting each score to a corresponding probability distribution comprises using a softmax activation function because the claimed converting is interpreted to be mapping from a score to a probability distribution. The examiner further interprets the claimed softmax activation function to be a softmax function.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teaching of the cited references and modify the invention as taught by Kang, Meek, Zhang, Polak, and Berkman, by incorporating mapping scores to a probability distribution using a preferable softmax function, as taught by Creed (see Paragraph 0031 and 00126), because the six applications are directed to vector processing; incorporating mapping scores to a probability distribution using a preferable softmax function improves the input dataset to a ML model and/or classifier (see Creed Paragraph 0088).

Claim(s) 13 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Meek et al. (U.S. Publication No.: US 20090006285 A1) hereinafter Meek, in view of Roy (U.S. Publication No.: US 20080147575 A1) hereinafter Roy, and further in view of Alam et al. (U.S. Publication No.: US 20190139403 A1) hereinafter Alam.
As to claim 13:
Meek discloses:
A computer-implemented method for modeling heterogeneous feature sets by a computerized information system, the method comprising: 
generating a similarity factor for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records [Paragraph 0003 teaches one mechanism for categorizing media content, such as pictures or video clips, is the use of metadata tags. Paragraph 0052 teaches system 500 can also utilize a list of user preferences stored within a storage component 506 to customize a generation, weighting, presentation, and/or association of tags with an input item 504. Paragraph 0053 teaches preferences can indicate whether to use a MRU tag list or user-specified tag list to generate tags, and what priority to give such criteria (e.g., as discussed at FIG. 2, supra). Paragraph 0054 teaches preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s). Paragraph 0075 teaches at 1106, classifiers and user preferences can be employed to relevance rank the tags. At 1108, the tags can be presented to a device user in order of relevance rank. Note: The classifiers and user preferences for searching recipients (user selects to create a search) ranking customized tags (plurality of personalization features) (custom fields of input records) based relevance (similarity factor), where the tags are associated with an MRU (most recently used) list (affinity) is interpreted to read on the claimed generating a similarity factor for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records.]; 
calculating a personality feature weight for each of the plurality of personalization features [Paragraph 0040 teaches filtering component 206 can utilize tags contained within a most recently used tag list (e.g., either a simple, generic list provided by a recipient, or a sender-specific list) to generate and weight one or more tags for an input item of communication 204 Note: The examiner interprets the claimed personality feature weight to be a personalization feature weight. Therefore, generating and weighting one or more tags (personalization features) is interpreted to read on the claimed calculating a personality feature weight for each of the plurality of personalization features.]; 
generating a most recently used affinity value for each of the plurality of personalization features [Paragraph 0006 teaches tags can be relevance ranked as a function of relatedness to determined criteria, including item content, tags most recently used by a sender and/or recipient. Paragraph 0052 teaches system 500 can also utilize a list of user preferences stored within a storage component 506 to customize a generation, weighting, presentation, and/or association of tags with an input item 504. Paragraph 0053 teaches preferences can indicate whether to use a MRU tag list or user-specified tag list to generate tags, and what priority to give such criteria (e.g., as discussed at FIG. 2, supra). Note: Generated most recently used tags that can be customized are interpreted to read on the claimed generating a most recently used affinity value for each of the plurality of personalization features because personalization features are interpreted to be custom tags that indicate a category and tags that are most recently used are interpreted to be affinity values.]
generating a ranking function based on the most recently used affinity values [Paragraph 0006 teaches tags can be relevance ranked as a function of relatedness to determined criteria, including item content, tags most recently used by a sender and/or recipient, user-defined preferences, one or more user profiles, or the like. Tags can be sorted as a function of relevance ranking, and automatically attributed to one or more items of communication, or presented to a recipient for selection. Paragraph 0061 teaches component 606 can employ a probabilistic-based or statistical-based approach in connection with choosing between potential tags associated with an input item, optionally auto-assigning tags to input items 604, or offering proposed tags as a function of relevance rank. Note: Ranking as a function of relatedness based on tags (most recently used affinity values) is interpreted to read on the claimed generating a ranking function based on the most recently used affinity values.]
wherein the ranking function reorders the list of input records for the search created by the user [Paragraph 0006 teaches tags can be relevance ranked as a function of relatedness to determined criteria, including item content, tags most recently used by a sender and/or recipient, user-defined preferences, one or more user profiles, or the like. Tags can be sorted as a function of relevance ranking, and automatically attributed to one or more items of communication, or presented to a recipient for selection. Paragraph 0054 teaches preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s). Paragraph 0061 teaches component 606 can employ a probabilistic-based or statistical-based approach in connection with choosing between potential tags associated with an input item, optionally auto-assigning tags to input items 604, or offering proposed tags as a function of relevance rank. Note: Ranking/sorting (reordering) tags associated with an input item (list of input records) that are user-defined to search recipients (for the search) reads on the claimed wherein the ranking function reorders the list of input records for the search created by the user.]

Meek discloses most of the limitations as set forth in claim 13 but does not appear to expressly disclose converting each weight to a corresponding probability distribution and scaling each similarity factor based on the corresponding probability distribution.
Roy discloses: 
converting each weight to a corresponding probability distribution [Paragraph 0040 teaches a probability distribution may be described by μ_i and σ_i for i = 1 ... m for weights w_i for one or more features f_i in a first labeled content item x_1. Note: The cited probability distribution for weights w_i is interpreted to read on the claimed converting each weight to a corresponding probability distribution because the probability distribution can describe the weights; therefore, the weights have been converted.];
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references and modify the invention as taught by Meek by incorporating a probability distribution for weights w_i, as taught by Roy (see Paragraph 0040), because both applications are directed to data processing; incorporating a probability distribution for weights w_i provides a fast and accurate method/algorithm for classifying the unlabeled content items (see Roy Paragraph 0049).
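For illustration only, the limitation "converting each weight to a corresponding probability distribution" can be sketched with a softmax, one common way of mapping raw weights to values that are non-negative and sum to one. Neither the quoted claim language nor the cited Roy paragraph specifies this mapping; the softmax, the function name `weights_to_distribution`, and the sample weights are assumptions made for the sketch.

```python
import math


def weights_to_distribution(weights):
    """Map raw feature weights to a probability distribution.

    Assumption: a softmax is used here purely for illustration; the
    claim language does not name a specific conversion.
    """
    if not weights:
        raise ValueError("at least one weight is required")
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(weights)
    exps = [math.exp(w - shift) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical weights for three personalization features.
    for p in weights_to_distribution([1.0, 2.0, 3.0]):
        print(round(p, 4))
```

In this sketch a larger weight always receives a larger share of the distribution, and equal weights yield a uniform distribution.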
 
Meek and Roy disclose most of the limitations as set forth in claim 13 but do not appear to expressly disclose scaling each similarity factor based on the corresponding probability distribution.
Alam discloses:
scaling each similarity factor based on the corresponding probability distribution [Paragraph 0035 teaches the weights may be computed empirically by adjusting weighted kernel similarity on the Mahalanobis distance (e.g., quantifying how many standard deviations away a point P is from the mean of a distribution D) or the Bhattacharyya distance (e.g., measuring the similarity of two probability distributions) between individual perceptual distributions comparing with a shallow artificial neural network to conduct the voting. Note: Adjusting weighted kernel similarity based on similarity of associated probability distribution is interpreted to read on the claimed scaling each similarity factor based on the corresponding probability distribution.]; and 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the cited references and modify the invention as taught by Meek and Roy by incorporating adjusting weighted kernel similarity based on similarity of associated probability distributions, as taught by Alam (see Paragraph 0035), because the three applications are directed to data processing; incorporating adjusting weighted kernel similarity based on similarity of associated probability distributions reduces latency and/or improves safety (see Alam Paragraph 0061).
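As a minimal sketch of the two distance measures named in the cited Alam paragraph, the following computes a Mahalanobis distance in the one-dimensional case (the number of standard deviations between a point and a distribution's mean) and a Bhattacharyya distance between two discrete probability distributions. The function names and sample values are hypothetical, and the sketch does not represent how Alam computes its weights.

```python
import math


def mahalanobis_1d(point, mean, std_dev):
    """Standard deviations separating `point` from `mean` (1-D case only)."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return abs(point - mean) / std_dev


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions.

    `p` and `q` are probabilities over the same outcomes; a distance of 0
    means the distributions are identical, and larger means less similar.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    if coefficient == 0.0:
        return math.inf  # no overlap between the two distributions
    # Clamp to 1.0 so rounding error cannot produce a negative distance.
    return -math.log(min(coefficient, 1.0))


if __name__ == "__main__":
    print(mahalanobis_1d(7.0, 5.0, 2.0))
    print(bhattacharyya_distance([0.9, 0.1], [0.1, 0.9]))
```

A weight derived from either measure would shrink as the distance grows; how such a weight is applied to a similarity factor is left open here because the cited passage does not specify it.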

As to claim 17:
Meek discloses:
The computer-implemented method of claim 13, wherein the ranking function reorders the list of input records in a predetermined order of relevance at the user level [Paragraph 0006 teaches tags can be relevance ranked as a function of relatedness to determined criteria, including item content, tags most recently used by a sender and/or recipient, user-defined preferences, one or more user profiles, or the like. Tags can be sorted as a function of relevance ranking, and automatically attributed to one or more items of communication, or presented to a recipient for selection. Paragraph 0054 teaches preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s). Paragraph 0061 teaches component 606 can employ a probabilistic-based or statistical-based approach in connection with choosing between potential tags associated with an input item, optionally auto-assigning tags to input items 604, or offering proposed tags as a function of relevance rank. Note: Ranking/sorting (reordering) tags associated with an input item (list of input records) that are user-defined (user-level) to search recipients (for the search) based on relatedness to determined criteria (predetermined order of relevance) reads on the claimed the ranking function reorders the list of input records in a predetermined order of relevance at the user level.]

Response to Arguments
Applicant presents the following arguments in the August 17, 2022 remarks, pages 8-11:
“Applicant respectfully disagrees that the claims are directed to an abstract idea at least because they include a combination of elements that define a practical application that amounts to significantly more than the identified judicial exception (abstract idea)... Applicant respectfully submits even if the claims are considered to recite a judicial exception, which Applicant does not concede, the recited features are integrated into a practical application of that exception… claims do impose a meaningful limit on the judicial exception and thus are directed to statutory subject matter under the Office's guidance to application of § 101. ”

Examiner respectfully presents the following response to Applicant’s remarks:
Applicant’s arguments regarding claims 1, 5, 9, and 13 have been fully considered but they are not persuasive. Regarding independent claim 1, but for the limitations stating “a computer-implemented method for modeling heterogeneous feature sets by a computerized information system, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”, the mention of generating a categorical embedding vector for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records and each categorical embedding vector comprises a variable number of variably sized elements, calculating a query-record based attention score for each of the plurality of personalization features, the query-record based attention score indicating a weight of the corresponding personalization feature, converting each query-record based attention score to a corresponding probability distribution, scaling each categorical embedding vector based on the corresponding probability distribution, generating a ranking function based on the fixed-dimensional query-record latent space feature vector, wherein the ranking function reorders the list of input records for the search created by the user, in the context of this claim, encompasses a user mentally generating vectors using mental calculations, conversions, probability measurements (scaling), and ranking data used in a search. 
Therefore, the examiner maintains that if a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Claims 1, 5, and 9 do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As for claim 1, with respect to integration of the abstract idea into a practical application, the additional elements, a computer-implemented method and a computerized information system, are recited at a high level of generality to apply the exception using generic components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. The additional elements, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector, are interpreted to be well understood, routine, and conventional activity (Storing and retrieving information in memory, Versata (see MPEP 2106.05(d))). To further elaborate, the examiner maintains the additional elements, “a computer-implemented method for modeling heterogeneous feature sets by a computerized information system, creating a fixed-dimensional personalization feature vector by aggregating the scaled categorical embedding vectors based on the probability distributions, and combining the fixed-dimensional personalization feature vector and a fixed-dimensional non-personalization feature vector to produce a fixed-dimensional query-record latent space feature vector”, do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use. Therefore, in view of the applicant’s arguments, the examiner maintains claim 1 is not patent eligible.
As for claim 13, but for the limitations stating “a computer-implemented method for modeling heterogeneous feature sets by a computerized information system”, the mention of generating a similarity factor for each of a plurality of personalization features corresponding to a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records, converting each personality feature weight to a corresponding probability distribution, scaling each similarity factor based on the corresponding probability distribution, generating a most recently used affinity value for each of the plurality of personalization features, and generating a ranking function based on the most recently used affinity values, wherein the ranking function reorders the list of input records for the search created by the user, in the context of this claim, encompasses a user mentally generating a ranking function based on affinity values using mental calculations, conversions, probability measurements (scaling), and ranking data used for a search. Therefore, the examiner maintains that if a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components and a mathematical calculation, then it falls within the “Mental Processes” and “Mathematical Concepts” groupings of abstract ideas. Accordingly, the claim recites an abstract idea.
Claim 13 does not include additional elements that are sufficient to amount to significantly more than the judicial exception. For example, with respect to integration of the abstract idea into a practical application, the additional elements, a computer-implemented method and a computerized information system, are recited at a high level of generality to apply the exception using generic components. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. To further elaborate, the examiner maintains the additional elements, “a computer-implemented method and a computerized information system”, do not impose a meaningful limit on the judicial exception and merely confine the claim to a particular technological environment or field of use. Therefore, in view of the applicant’s arguments, the examiner maintains claim 13 is not patent eligible.

Applicant presents the following arguments in the 8/17/2022 remarks page 13:
“Kang, Zhang, Polak, and Berkman alone or in combination, do not teach or suggest or disclose the cited claim elements of independent claims 1, 5, and 9… Meek does not teach or suggest or disclose anything akin to the additional claim element added by this response, "a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records ... the ranking function reorders the list of input records for the search created by the user" as recited in independent claim 13”

Applicant’s arguments have been fully considered but they are not persuasive. The Examiner respectfully disagrees with the applicant’s arguments regarding the recitation in claims 1, 5, 9, and 13 of "a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records ... the ranking function reorders the list of input records for the search created by the user" (see Meek Paragraphs 0003, 0006, 0052, 0053, 0054, 0061, and 0075). As for “a most recently used affinity for a list of input records from which a user selects to create a search, wherein the personalization features correspond to custom fields of the input records”, one mechanism for categorizing media content, such as pictures or video clips, is the use of metadata tags (see Paragraph 0003). System 500 can also utilize a list of user preferences stored within a storage component 506 to customize a generation, weighting, presentation, and/or association of tags with an input item 504 (see Paragraph 0052). Preferences can indicate whether to use a MRU tag list or user-specified tag list to generate tags, and what priority to give such criteria (e.g., as discussed at FIG. 2, supra) (see Paragraph 0053). Preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s) (see Paragraph 0054). At 1106, classifiers and user preferences can be employed to relevance rank the tags. At 1108, the tags can be presented to a device user in order of relevance rank (see Paragraph 0075). 
The examiner maintains that the classifiers and user preferences for searching recipients (user selects to create a search) ranking customized tags (plurality of personalization features) (custom fields of input records) based on relevance (similarity factor), where the tags are associated with an MRU (most recently used) list (affinity), are interpreted to read on the claimed generating a similarity factor for each of a plurality of personalization features corresponding to a most recently used affinity.
As for “the ranking function reorders the list of input records for the search created by the user”, tags can be relevance ranked as a function of relatedness to determined criteria, including item content, tags most recently used by a sender and/or recipient, user-defined preferences, one or more user profiles, or the like. Tags can be sorted as a function of relevance ranking, and automatically attributed to one or more items of communication, or presented to a recipient for selection (see Paragraph 0006). Preferences can instruct tagging assistant 502 to search recipient(s) of the item to see if any stored tags (e.g., contained within list management component 208) are associated with the recipient(s) (see Paragraph 0054). Component 606 can employ a probabilistic-based or statistical-based approach in connection with choosing between potential tags associated with an input item, optionally auto-assigning tags to input items 604, or offering proposed tags as a function of relevance rank (see Paragraph 0061). Ranking/sorting (reordering) tags associated with an input item (list of input records) that are user-defined to search recipients (for the search) reads on the claimed wherein the ranking function reorders the list of input records for the search created by the user. Further clarification through amendments to the claim language may aid in differentiating from the current prior art citations.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EARL ELIAS whose telephone number is (571)272-9762. The examiner can normally be reached Monday - Friday (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at 571-272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/EARL ELIAS/Examiner, Art Unit 2169
/USMAAN SAEED/Supervisory Patent Examiner, Art Unit 2169