DETAILED ACTION
Notice of Pre-AIA or AIA Status
       The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
Claims 1, 7, 9 – 11, 17 and 19 – 20 have been amended and are hereby entered.
Claims 1-20 are pending and have been examined. 
This action is made FINAL.
Response to Arguments
Applicant's arguments filed June 27, 2022 have been fully considered but they are not persuasive. 
Regarding the applicant's arguments on patent eligibility under 35 U.S.C. § 101, on pages 10-13: the applicant argues that independent claims 1 and 11 and their dependent claims recite additional elements that integrate the judicial exception into a practical application (Step 2A, Prong 2) as an improvement in the functioning of a computer, or an improvement to other technology or technical field, and that these pending claims recite an inventive concept by adding specific limitations that are not well-understood, routine, or conventional activity in the field (Step 2B). First, regarding the Step 2A, Prong 2 analysis on page 10, the applicant states that “The specific, concrete approach in these additional elements provides a combination of steps that improves the technical field of quality assessment of attribute values and uses the ideas in meaningful way that is not generally linking to a technological environment” and presents the limitation steps of amended claims 1 and 11 to argue that they recite an “improvement to other technology and recite use of the ideas in a meaningful way beyond generally linking to a particular technological environment” by reciting “many significant additional elements above and beyond what the Office Action identifies as the abstract idea”. This argument is not persuasive. 
The limitation steps merely recite the use of a computer to build “models” that can determine “relevancy” and “interpret” the titles of item catalogs by “retrieving” their “target attribute values” based on their “target attribute names”, and to “assess” their “relevancy” and “precision” through “scores”, resulting in a “weight” that is used to select a “winning attribute value” for the target item’s name. These steps lack significant or unconventional limitations that could be considered an improvement to computer technology and its functions.  
In addition, the specification sections cited by the applicant (¶38 and ¶116-118) on pages 12 and 13 do not “provide sufficient details such that one of ordinary skill in the art would recognize the claimed invention as providing an improvement” for Step 2A, Prong 2. It is true that “the specification need not explicitly set forth the improvement, but it must describe the invention such that the improvement would be apparent to one of ordinary skill in the art”. Conversely, a specification that sets forth an improvement only “in a conclusory manner (i.e., a bare assertion of an improvement without the detail necessary to be apparent to a person of ordinary skill in the art)” is not enough to determine that the claim improves technology (See MPEP 2106.05(a)). Thus, for this alleged invention, and taking into consideration the rest of the additional elements, there are no additional elements, individually or in combination (one or more processors; one or more non-transitory computer-readable media storing computing instructions (claim 1); a relevancy model; a title interpreter model (claims 1 and 11); term frequency-inverse document frequency (TF-IDF), a natural language processing matching and a conflict solver (claims 5, 8, 15 and 18)), that could amount to significantly more than the judicial exception itself or be considered an improvement. 
Nor is there an “inventive concept” in the limitations, as the claims do not add specific limitations other than what is well-understood, routine, conventional activity in the field, or unconventional steps that confine the claim to a particular useful application (Step 2B), that could be considered an improvement. Therefore, these pending claims amount to nothing more than an instruction to apply the abstract idea using a generic computer and/or the use of a computer's ordinary capacity for software programs or other tasks (e.g., to receive, store, or transmit data) (refer to MPEP 2106.05(f)(2)), which does not render an abstract idea eligible. Thus, the examiner respectfully disagrees and maintains the 35 U.S.C. § 101 rejection for these pending claims. 
Regarding the applicant's arguments against the rejections under 35 U.S.C. §§ 102 and 103 of independent claims 1 and 11 and their dependent claims 2-10 and 12-20, on pages 13 – 20: the applicant’s arguments regarding the amended limitation steps in the pending claims are not persuasive and are considered moot in view of the new grounds of rejection. Please refer to the Claim Rejections - 35 USC § 103 section for further details. 

Claim Rejections - 35 USC § 101
       35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1 - 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. First, claim 1 is used as representative of independent claims 1 and 11. Step 1: the claimed invention falls under the statutory categories of a machine and a process. However, Step 2A Prong 1: the abstract idea is defined by the elements of: 
building …based on items in an item catalog; 
building a title interpreter model based on titles of the items in the item catalog; 
retrieving target attribute values that are associated with a target attribute name of a target item in the item catalog, …; 
generating a respective relevancy score for each one of the target attribute values…; 
generating a respective precision score for the each one of the target attribute values…; 
determining a respective weight for the each one of the target attribute values …; and 
selecting a winning attribute value for the target attribute name of the target item from among the target attribute values, based on the respective weights for the target attribute values.

These limitations describe a system and a method for identifying, filtering and categorizing item or merchandise information to accurately browse relevant items inside an online shop platform and make better purchasing decisions. Thus, these limitations are directed to the abstract idea of a certain method of organizing human activity in the form of engaging in commercial or legal interactions, by organizing and cataloguing merchandise by accuracy and relevancy based on review confidence values and scores in order to manage a business and avoid return costs, angry customers, and/or decreased brand loyalty, which result in revenue losses. As disclosed in the specification, this invention can provide a technology-based solution that automatically detects data accuracy issues by flagging them for further review. 

Step 2A Prong 2: The judicial exception is not integrated into a practical application because, considering the claims as a whole, the additional element(s) of one or more processors; one or more non-transitory computer-readable media storing computing instructions; a relevancy model; and a title interpreter model, individually and in combination, are merely used as a tool to perform the abstract idea (refer to MPEP 2106.05(f)) and are not functionally related to the claimed process other than as a recitation of intended use or field of use (MPEP 2106.05(h)) appended to an abstract idea, which does not integrate a judicial exception into a practical application or provide significantly more. This indicates that the claim has not integrated the abstract idea into a practical application, and the claim is therefore found to be directed to the abstract idea identified by the examiner.

Dependent claims 5/15 and 8/18 recite the additional element(s) of a term frequency-inverse document frequency (TF-IDF) and a natural language processing matching and a conflict solver, respectively, which are merely used as a tool to perform the abstract idea. Thus, they amount to no more than mere instructions to apply the exception using a generic computer component (MPEP 2106.05(f)) and/or to the use of a computer in its ordinary capacity for economic or other tasks (e.g., to receive, store, or transmit data), or simply add a general-purpose computer or computer components (refer to MPEP 2106.05(f)(2)). These additional element(s) do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. These claims are directed to an abstract idea.

Step 2B: Claims 1, 2 and 20 recite the additional elements: one or more processors; one or more non-transitory computer-readable media storing computing instructions (claim 1); a relevancy model; a title interpreter model (claims 1 and 11); term frequency-inverse document frequency (TF-IDF), a natural language processing matching and a conflict solver (claims 5, 8, 15 and 18). These are not sufficient to amount to significantly more than the judicial exception. That is, there are no additional element(s) claimed in the dependent claims that could amount to significantly more than the judicial exception; rather, they further recite the abstract idea. As indicated in Step 2A Prong 2, the additional element(s) in the claims merely use a generic computer device and/or computing technologies to perform an abstract idea, which does not constitute a practical application and only amounts to a mere instruction to practice the invention. Thus, this does not render the claims eligible (refer to MPEP 2106.05(f) and 2106.05(h)). The rationale set forth for the second prong of the eligibility test above also applies and has been re-evaluated in Step 2B, and is therefore a sufficient basis for rejecting the claims as not patent eligible, consistent with the 2019 PEG.

Dependent claims 2 - 20 cover or fall under the same abstract idea of a method of organizing human activity. They describe additional limitation steps as follows:
Claims 2 – 8 and 12 – 18: further describe the abstract idea of the method for filtering and categorizing target items, and its scores and values, in which a first and a second dictionary are used to filter items in a catalog based on attribute values (item’s name, type and quantity), confidence, relevancy and semantic centroid scores (with a weighted average of GloVe word embeddings) and taxonomy data, and to extract them using natural language processing based on their similarity and respective scores. Thus, these claims are directed to the abstract idea group of “engaging in commercial or legal interactions” by organizing and cataloguing merchandise by accuracy and relevancy.
Claims 9 – 10 and 19 – 20: further describe the abstract idea of the method for filtering and categorizing target items and the attribute values, in which a third dictionary is built from the first dictionary based on a nesting level for each attribute value, and the titles are tokenized to determine and sort the item.

Therefore, the additional elements mentioned above are nothing more than descriptive language about the elements that define the abstract idea, and these claims remain rejected under 35 U.S.C. 101 as well. 

Claim Rejections - 35 USC § 103
       In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
        The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

     Claims 1-5, 8-10, 11-15 and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Pyati (U.S. Pub No. 20190311301 A1) in view of Cassidy (U.S. Patent No. 7107226 B1) in further view of Rubenczyk (U.S. Pub No. 20030217052 A1). 
Regarding claims 1 and 11: 
A system comprising: (claim 1)
Pyati teaches:
one or more processors; and (“Computing system 800 may include processors 804, memory/storage 806, and I/O components 818, which may be configured to communicate with each other such as via bus 802.” ¶0171; Fig 8 (800 and 804)) 
one or more non-transitory computer-readable media storing computing instructions configured to run on the one or more processors and perform: (“computing system 800 can read instructions 810 from a computer-readable medium (e.g., a computer-readable storage medium) and perform any one or more of the methodologies discussed herein.” ¶0169; Fig 8 (800, 804 and 810)) Examiner note: Also, refer to ¶0173 which clarifies that “The term “computer-readable medium” may include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store instructions 810.”
building a relevancy model based on items in an item catalog; (“User account feature extractor 110 can also extract features from the values of attributes of a user account that may be a relevant to a target objective for an item listing. For example, the target objective may relate to satisfying a sales criterion (e.g., maximizing a selling price or the selling probability of an item). User account feature extractor 110 can query user accounts database 132 to retrieve features of a user account (e.g., a user account associated with a user creating a particular item listing, a user account of multiple users sharing demographic traits or behavior with the listing user, etc.).” ¶0036; Fig 1 (104 and 110); Fig 4A (410)) Examiner note: Under the Broadest Reasonable Interpretation (BRI), building a model based on catalog items has been interpreted as the role of the “User account feature extractor 110” in later building an “item listing” when it queries the database to “retrieve features of a user account”. For more information regarding relevancy measurements, refer to ¶0020 and ¶0040.
building a title interpreter model based on titles of the items in the item catalog; (“text feature extractor 106 may identify features of an item listing based on its metadata, such as the number of characters or the number words in the listing's title, description, or other text… text feature extractor 106 may be capable of identifying the tense (e.g., past, present, future, etc.), perspective (e.g., first, second, or third person), conciseness (e.g., multiple vs. a plurality of), tone, and other writing styles or properties of text apart from its meaning (e.g., semantic qualities), importance (e.g., lexical qualities), or structure (e.g., syntactic qualities).” ¶0031; Fig 1 (104 and 106); Fig 3C (314, 316 and 318); Fig 4A (410)) Examiner note: Under BRI, the step of building a title interpreter model based on catalog items has been interpreted as the “text feature extractor 106”, which can identify features of “an item listing based on… number of characters or the number words in the listing's title”. Also, refer to ¶0032, in which an “Item specific feature extractor 108” is used to identify and process “the values of item specifics (e.g., manufacturer, model, color, etc.) from one form to another form more suitable for analyzing the item listings for a particular context”, which are being interpreted as attribute values or “the brand, the color, the size, and/or other suitable attribute names”, as stated in ¶0057 of the applicant’s specification. Finally, refer to ¶0135 for more details of the system building a machine learning model while receiving “additional attribute values for the attributes of the new item listing (e.g., title, description, photos/videos, item specifics, etc.)” and to ¶0103 for the title builder and recommendations. 
determining a respective weight for the each one of the target attribute values based on the respective relevancy score for the each one of the target attribute values and the respective precision score for the each one of the target attribute values; and (“Weights can be user-specified or automatically obtained, such as via silhouette scores. A silhouette score is a measure of how similar an object is to its own cluster or class compared to other clusters or other classes, which can range from −1 to 1, where a high value indicates that an item listing is well matched to its own cluster or class and badly matched to neighboring clusters or classes. If most item listings have a high silhouette score, then the clustering or classification maybe accurate. If many item listing have a low or negative silhouette score, then the clustering or classification may have too many or too few clusters or classes. The silhouette score can be calculated with any similarity or distance metric, such as the Euclidean distance or the Manhattan distance.” ¶0045; Fig 1 (118); Equation 2) Examiner note: Under BRI, the respective weight based on target attribute values’ relevancy and precision scores has been interpreted as the “Weights” that are “user-specified or automatically obtained, such as via silhouette scores”. Also, refer to ¶0041 to learn more about early and late fusion as implemented in the “feature vector builder 118”.
selecting a winning attribute value for the target attribute name of the target item from among the target attribute values, based on the respective weights for the target attribute values. (“Machine learning modeler 120 can determine the parameters and functions (e.g., a machine learning model) that may be optimal for achieving a target objective. This can include identifying a suitable machine learning algorithm. In some embodiments, a user may select the machine learning algorithm(s) as part of generating the new item listing, or the selected machine learning algorithm(s) may have previously been chosen as a user preference. In addition or alternatively, the system can select the machine learning algorithm(s) based on historical analyses (including a/b testing) indicating that a particular machine learning algorithm provides an especially accurate prediction for the outcome of an item listing or that the algorithm is especially successful in achieving the target objective” ¶0048; Fig 1 (120 and 122)) Examiner note: Under BRI, selecting the winning attribute value for the target attribute name of the target item has been interpreted as the ability of the “Machine learning modeler 120” to “determine the parameters and functions…that may be optimal for achieving a target objective” while selecting a suitable algorithm based on “historical analyses” for an accurate prediction. Also, refer to ¶0067 for how the use of “bin classifications” can help with the selection and probability of classifying attributes of item listings based on parameters or targets, and to ¶0074 for new item listings under the “Machine learning model evaluator 122”.
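
For illustration only, the silhouette score described in the passage of Pyati quoted above for the weight limitation (¶0045) can be sketched as follows. This is a minimal sketch using the standard textbook definition of the silhouette coefficient; it is not code from Pyati or from the applicant's specification, and the function names (euclidean, manhattan, silhouette_scores) and the sample points are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Manhattan distance between two equal-length numeric tuples."""
    return sum(abs(a - b) for a, b in zip(p, q))


def silhouette_scores(points, labels, distance=euclidean):
    """Return one silhouette score per point, each in the range -1 to 1.

    Hypothetical helper: a high score means the point is well matched to
    its own cluster and badly matched to neighboring clusters.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette scores need at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(distance(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(distance(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Two well-separated groups of hypothetical item-listing feature vectors.
sample_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_scores = silhouette_scores(sample_points, [0, 0, 1, 1])
bad_scores = silhouette_scores(sample_points, [0, 1, 0, 1])
```

With the grouping [0, 0, 1, 1] every score is close to 1, and with the mismatched grouping [0, 1, 0, 1] every score is negative, which matches the quoted description that low or negative scores suggest a poor clustering. Passing distance=manhattan swaps in the other metric named in the passage.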

Pyati does not explicitly teach the following amended limitations. However, Rubenczyk is prior art directed to “a search engine and, more particularly, but not exclusively to a search engine for use in conjunction with databases including networked databases and information stores”, which concentrates on data items that can be classified as attributes or values in a hierarchy (see ¶0002, abstract and ¶0406). Thus, Rubenczyk teaches: 
generating a respective relevancy score for each one of the target attribute values using the relevancy model to assess a relevancy of the each one of the target attribute values for the target item to the target attribute name of the target item; (“A matchmaker 28 then has the task of searching the data store (possibly making use of various indices), which may include one or more separate databases, to find the items that match components of the formal request. A ranker 30 provides a numerical value to describe the overall level of match between the query and each data item, i.e. it assesses the relevance of data-items to the query. This relevance rank is affected by the quality of match of components of the formal request, the confidence in variant readings of the query, and the confidence measures of data classification (if available) attached to the items by the Indexer.” ¶0420; Fig 2 (30 and 28) and Figs 3 – 4) Examiner note: Under BRI and in light of the applicant's specification at ¶0100, the relevancy score for a target attribute value has been interpreted as the “relevance rank”, which assesses “data-items” and their “match components”, which are considered the attribute names and values inputted in a query and requested by the user, as stated in ¶0419 and ¶0598. Also, refer to ¶0421 and ¶0428 for more information about the “ranker” and to ¶00445-46 for the “matchmaker”. 
generating a respective precision score for the each one of the target attribute values based on the title interpreter model to assess an accuracy of the each one of the target attribute values for the target item compared against a title of the target item; (“The interpreter analyzes, interprets and enhances the request and reformulates it as a formal request. A formal request is a request that conforms to a model description of the database items. A formal request is able to provide measures of confidence for possible variant readings of that request. In order to make up the formal request and also in order to provide for variants, the interpreter 22 makes use of a general knowledge base 24, which includes dictionaries and thesauri on one hand, and domain-specific semantic data 26 garnered from items in the data store. The domain specific data may be enhanced using machine learning unit 18, from the behaviors of previous users who have submitted similar queries, as noted above. In addition, the interpreter parses the request as a series of nouns and adjectives, and attempts to determine which terms in the query refer to which known classes (in the classification scheme), taking into account that some class-values are considered as attributes for other class-values.” ¶0419; “The numerical value can then be thresholded to decide whether to add the data item to a result space or not. Also the retrieved data items within the results space can be ordered in decreasing relevancy according to the scores computed by the ranker. Thus, in the above example, item “plain red cotton shirt with long sleeves” would be added to the results space with a high degree of confidence, as would “plain red nylon shirt with long sleeves”. 
An item “patterned cotton shirt with long sleeves” might be added to the results with a lower degree of confidence and an item “plain tee-shirt with collar” with an even lower degree of confidence.” ¶0421; Fig 2 (30 and 28) and Figs 3 – 4) Examiner note: Under BRI and in light of the applicant's specification at ¶0057, the precision score for a target attribute value has been interpreted as the “numerical value” or “relevance score” that is “thresholded to decide whether to add the data item to a result space or not” to determine the order, from highest to lowest degree of confidence. Also, refer to ¶0520-523 and ¶0533 for details of the formula, which includes a precision variable to determine “average past success of M [multiple candidates]” in order to classify “products (data items)”.

It would have been obvious to one of ordinary skill in the art before the earliest effective filing date of the claimed invention to have provided Pyati with the ability to generate relevancy and precision scores using a relevancy model and an interpreter model, respectively, to assess relevancy and accuracy in the attribute values, names and titles of the target items, as taught by Rubenczyk, because it would be “obvious to try” to determine a comparison score that can measure the relevancy or likelihood of two similar terms while also providing confidence, with minimum error, that those scored terms are correct and satisfy the user when searching for them to complete a product purchase in an online shop. Also, Rubenczyk acknowledges that “Often, items, that is potential objects of a search, that are represented in a database or data store or Information Storehouse (IS) component of an IR [Information Retrieval] system, are in the form of free-text documents, The documents can be very short (just one line, as in the name of a product in an e-vendor site), of medium length (a few lines, as in a news item) or quite long (a few pages, as in financial reports, scientific articles, or encyclopedic entries). Still, it should be strongly emphasized that the textual medium, though definitively the most common one today, is by no means the only applicable medium for database items. The IS can consist of items that are pictures, videos, sound excerpts, electronically transcribed music sheets, or any other resource that contains information. The query may then consist of describing parts or features of the required pictures (colors, shapes, etc.) or sounds, a short musical or rhythmic pattern, and the like.” (Rubenczyk; ¶0009).

Finally, neither Pyati nor Rubenczyk explicitly teaches the retrieval of target attribute values associated with a target item and its attribute name, where those values are specifically received from multiple suppliers of the specific target item. However, Cassidy is analogous prior art related to “a searchable database comprising a multiplicity of tables including an attributes table and a values table for a multiplicity of target search items constructed and arranged so that selection of values for one or more target search item attributes yields an attribute-value construct specifying a particular one of said target search items and precluding an indeterminate search result” (See Col. 3, lines 19-29 and abstract). Thus, Cassidy teaches:
retrieving target attribute values that are associated with a target attribute name of a target item in the item catalog, the target attribute values having been received from multiple sources comprising multiple different suppliers of the target item; (“the shopping cart employed in the instant invention, is of an aggregatable/disaggregatable character, meaning that the order for products is assembled by a user across the full spectrum of the database, and thus includes a multiplicity of vendors, manufacturers, products, etc. This important aspect of the applicants invention permits the system proprietor to assemble at a single Web site or other cyberspace location, an extensive collection of products from a variety of manufacturers and suppliers. This broad spectrum capability of the applicants' invention therefore permits a user to assemble an order which may involve very different and numerous products deriving from numerous independent sources, based on attribute-value chains which enable comparison shopping and selection according to the user's unique needs and requirements (by the user's selection of appropriate values for each of the selected attributes).” Col. 8, lines 56 – 67 and Col. 9 lines 1 – 6; Fig 1 (108); Fig 4A (412)) Examiner note: Under BRI and in light of the applicant's specification at ¶0098-99, the retrieval of target attribute values received from multiple sources, such as suppliers of the target item, has been interpreted as the user's ability to input the item description, including the “attribute-value chains”, for that specific product. Also, the user can be a “vendor” or a “potential supplier” (see Col. 10, 65 – 67 and Col. 11, 1 – 6). Finally, refer to Col. 16, 65 – 67 and Col. 17, 1 – 54 for details about the data entry module and to Col. 21, 4 – 67 and Col. 17, 1 – 54 for specific details of the database information of “manufacturers” and “attributes”.

It would have been obvious to one of ordinary skill in the art before the earliest effective filing date of the claimed invention to have provided Pyati, as modified by Rubenczyk, with the ability to retrieve the target item’s attribute names by receiving attribute values from multiple sources comprising different suppliers of the target item, as taught by Cassidy, because it would be obvious to try to aggregate information from as many sources as possible, and specifically from suppliers, in order to receive the most accurate description of the items from those who manufactured them. Also, Cassidy acknowledges that “Further, there is a need for information for the making of a purchase decision based on comparison of the purveyed products, as to their features, such as overall price, unit price, volume discounts, quality, and/or source. Furthermore, there is a need for enabling consumers to view such information in a readily assimilated format, such as a grid or matrix format that may be proprietary to the shopping site.” (Cassidy; Col. 1, 40 – 47).

Regarding claims 2 and 12: 
The combination of Pyati, Rubenczyk and Cassidy, as shown in the rejection above, discloses the limitations of claims 1 and 11.
Pyati further teaches:
building a first dictionary comprising a respective confidence score for each respective filtered attribute value that is associated with each respective attribute name that is associated with each product type of the items in the item catalog; and (“Weights can be user-specified or automatically obtained, such as via silhouette scores. A silhouette score is a measure of how similar an object is to its own cluster or class compared to other clusters or other classes, which can range from −1 to 1, where a high value indicates that an item listing is well matched to its own cluster or class and badly matched to neighboring clusters or classes. If most item listings have a high silhouette score, then the clustering or classification maybe accurate. If many item listing have a low or negative silhouette score, then the clustering or classification may have too many or too few clusters or classes. The silhouette score can be calculated with any similarity or distance metric, such as the Euclidean distance or the Manhattan distance. Percentiling or binning maps the value of each position of a similarity vector to percentiles or bins to account for different similarity distributions. That is, similarity vectors are sorted and bins of the sorted vectors are created according to a particular probability distribution (e.g., normal or Gaussian, Poisson, Weibull, etc.). 
For example, the probability P, for a probability density function f, that an item listing X belongs to a cluster or class associated with a percentile/bin over interval a and b” ¶0045; Fig 1 (118 and 120); Equation 2) Examiner note: Under BRI and in light of the applicant's specification, specifically ¶0044-45, the construction of the first dictionary has been interpreted as the generation of “Percentiling or binning maps” based on the “clusters or classes” that represent the “intersections, unions, or complements of feature vectors” or “item listings” (refer to ¶0046), and the confidence score has been interpreted as the “silhouette scores”. Also, refer to ¶0041 for early and late fusion as implemented in the “feature vector builder 118” and to ¶0061 for more about clusters.
building a second dictionary comprising a respective semantic centroid score for the each respective attribute name that is associated with the each product type of the items in the item catalog. (“In other embodiments, machine learning modeler 120 may utilize unsupervised learning methods for determining how item listings relate to one another and the attributes of the item listings that form clusters, such as by binning clusters and determining the size of each bin/cluster using a distribution curve (e.g., bell curve, exponential curve, Poisson curve, or other curve discussed in the present disclosure)… Examples of unsupervised learning techniques include clustering, principle component analysis (PCA), and other methods discussed throughout the present disclosure…” ¶0061; “Clustering methods can include k-means clustering, hierarchical clustering, density-based clustering,…. In k-means clustering, a number of n data points are partitioned into k clusters such that each point belongs to a cluster with the nearest mean. The algorithm proceeds by alternating steps, assignment and update. During assignment, each point is assigned to a cluster whose mean yields the least within-cluster sum of squares (WCSS) (e.g., the nearest mean). During update, the new means is calculated to be the centroids of the points in the new clusters. Convergence is achieved when the assignments no longer change.” ¶0062; Fig 1 (120)) Examiner note: Under BRI and in light to the applicant’s specifications, specifically ¶0044-45, the construction of the second dictionary based on a semantic centroid score has been interpreted as the implementation of “k-means clustering” in which each “point belongs to a cluster with the nearest mean” and when updating “new means is calculated to be the centroids of the points in the new clusters” with no more assignments or change, which can be interpreted as the “semantic centroid”. 
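For illustration only, the alternating assignment and update steps described in the quoted ¶0062 can be sketched as follows. This is a minimal sketch in Python and is not code from Pyati or from the applicant’s specification; the function name kmeans, the sample points, and the initial means are hypothetical and chosen only to make the example self-contained.

```python
from math import dist

def kmeans(points, means, max_iter=100):
    """Alternate assignment and update until assignments stop changing."""
    assignment = None
    for _ in range(max_iter):
        # Assignment: each point goes to the cluster with the nearest mean.
        new_assignment = [
            min(range(len(means)), key=lambda k: dist(p, means[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # convergence: assignments no longer change
        assignment = new_assignment
        # Update: each mean becomes the centroid of its assigned points.
        for k in range(len(means)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old mean if a cluster is empty
                means[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, means

# Hypothetical data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
assignment, centroids = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

In this sketch the returned means are the centroids of the final clusters, which is the quantity the Examiner note above relies on.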

Regarding claims 3 and 13: 
The combination of Pyati, Rubenczyk and Cassidy, as shown in the rejection above, discloses the limitations of claims 2 and 12.
Pyati further teaches:
filtering out attribute values such that, for each excluded attribute value associated with an attribute name and a product type, a quantity of items associated with the excluded attribute value is fewer than a predetermined threshold; and (“In some embodiments, optimizer can “pre-load” or “pre-cache” previous item listings that may be relevant to a new item listing. This can involve loading previous item listings into memory prior to when the system will process the listings. For example, application layer 104 can include an interface that allows users to initiate construction of a new item listing by selecting one or more categories for the new listing (e.g., electronics) and/or identifying a specific type of item (e.g., make and model) associated with the new listing. Upon selection of the category or categories and/or specific item type, optimizer 116 may retrieve a predetermined number of item listings (e.g., all item listings, a number of items specified by the system, or a number of items specified by the user, etc.) that have ended from sale or expiration by a predetermined date (e.g., past 3 months up to the current date and time, a date specified by the system, or a date specified by the user, etc.) from item listings database 136. In some situations, there may be a minimum or maximum number of item listings, and the system and/or the user can broaden (or narrow) the category or date range to satisfy the minimum (or maximum) threshold. 
Optimizer 116 may fetch the item listings from item listings database 136, load or cache the item listings in in-memory resources 138 (e.g., cache, memory, memcache, etc.), and maintain the item listings in in-memory resources 138 at least until the new item listing is completed.” ¶0040; Fig 1 (116)) Examiner note: Under BRI, the filtering-out step has been interpreted as the reduction, or narrowing, of the items in an item listing based on attribute values that fall below a predetermined threshold, in which the “system and/or the user can broaden (or narrow) the category or date range to satisfy the minimum (or maximum) threshold”. 
generating the respective confidence score for the each respective filtered attribute value that is associated with the each respective attribute name that is associated with the each product type of the items in the item catalog. (“Delta engine 124 can identify differences between a new item listing based on the actual attribute values the user has entered for it and one or more altered item listings with one or more attribute values substitutes. In some embodiments, delta engine 124 can provide a recommendation to alter the attribute value for any change that gets the new item listing closer to the target objective or for a change that represents an improvement of the new item listing by a predetermined amount selected by the user and/or the system (e.g., at least $2.00 more for a selling price, at least a 5% increase in selling probability, etc.). In addition or alternatively, the suggestion can take into account a confidence level associated with the evaluation of the altered item listing exceeding a predetermined amount, the confidence level associated with the evaluation of the altered item listing exceeding the confidence level associated with the evaluation of the item listing as actually entered by the user by a predetermined amount, or other confidence level.” ¶0040; Fig 1 (124)) Examiner note: Under BRI, the confidence score generation has been interpreted as the “confidence level” that is considered when suggesting an “alteration” or an “improvement” to an “item listing”. Also, refer to ¶0061, in which the “machine learning modeler 120 may utilize unsupervised learning methods” such as “binning clusters” to also filter item listings.

Regarding claims 4 and 14: 
The combination of Pyati, Rubenczyk and Cassidy, as shown in the rejection above, discloses the limitations of claims 3 and 13.
Pyati further teaches:
adding taxonomy data to the first dictionary; and (“Path Distance Similarity measures the semantic similarity of a pair of words based on the shortest path that connects them in the is-a-kind-of (e.g., hypernym/hyponym) taxonomy. Variations of Path Distance Similarity normalize the shortest path value using the depths of the pair of words in the taxonomy (e.g., Wu & Palmer semantic similarity) or the maximum depth of the taxonomy (e.g., Leacock and Chodorow).” ¶0022; Fig 1 (106)) Examiner note: Under BRI, the ability to add taxonomy data in the first dictionary has been interpreted as the “semantic feature” that is included in the “text feature extractor 106” for analysis as “Path Distance Similarity”. 
assigning a high confidence score for each attribute value added by the taxonomy data (“Delta engine 124 can also compute text-based similarity metrics, such as LCS, Path Distance Similarity, Lexical Chains, Overlapping Glosses, Vector Pairs, HAL, LSA, LDA, ESA, PMI-IR, Normalized Google Distance, DISCO, variations of one or more of these text-based similarity measures, or other text-based similarity measures discussed in the present disclosure.” ¶0081; Fig 1 (124)) Examiner note: Under BRI, the high confidence score assignment has been interpreted as the capability of the “Delta engine 124” to suggest “altering” or “improving” item listings, as stated in ¶0040, by using a “confidence level” in combination with the taxonomy of “Path Distance Similarity”. 

Regarding claims 5 and 15: 
The combination of Pyati, Rubenczyk and Cassidy, as shown in the rejection above, discloses the limitations of claims 3 and 13.
Pyati further teaches:
using term frequency-inverse document frequency (TF-IDF) to determine the respective confidence score for the each respective filtered attribute value. (“Lexical features generally identify the relationships between words and phrases of text. An example of a lexical feature is the term frequency-inverse document frequency (tf-idf) of a word or phrase. The tf-idf score measures the relevance of the word or phrase in a collection or corpus based on how often the word or phrase appears in a segment of text and how often the word or phrase appears over the entirety of the collection or corpus of text. Other examples of lexical features include the part of speech of a word or phrase, the probability that certain words and phrases repeat in the same segment (e.g., there may be a low probability that “don't” appears twice in the same sentence), or the pairwise or sequential probability of words and phrases (e.g., the probability a pair of words or a sequence of words occurring one after another in a sentence, paragraph, or other unit of text).” ¶0020; Fig 1 (106)) Examiner note: Under BRI, the use of TF-IDF to determine the respective confidence score for the each respective filtered attribute value has been interpreted as the “tf-idf score”, which measures how often the word or phrase appears in a segment of text, meaning that the confidence is higher when the probability of the word or phrase appearing is higher.
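For illustration only, the tf-idf measure described in the quoted ¶0020 can be sketched as follows. This is a minimal sketch in Python, not code from Pyati or the applicant; the function name tf_idf, the three-segment corpus, and the plain log(N / df) weighting are assumptions made for the example, and other smoothing conventions exist.

```python
import math
from collections import Counter

def tf_idf(term, segment, corpus):
    """tf-idf of `term` in one tokenized segment relative to a tokenized corpus."""
    # Term frequency: share of the segment's tokens that equal the term.
    tf = Counter(segment)[term] / len(segment)
    # Document frequency: number of segments in the corpus containing the term.
    containing = sum(1 for doc in corpus if term in doc)
    # Assumption: unsmoothed log(N / df) inverse document frequency.
    idf = math.log(len(corpus) / containing) if containing else 0.0
    return tf * idf

# Hypothetical corpus of tokenized item titles.
corpus = [
    ["red", "cotton", "shirt"],
    ["blue", "cotton", "shirt"],
    ["red", "leather", "shoe"],
]
score_common = tf_idf("shirt", corpus[0], corpus)    # term occurs in 2 of 3 segments
score_rare = tf_idf("leather", corpus[2], corpus)    # term occurs in 1 of 3 segments
```

Under these assumptions, a term that occurs in fewer segments of the corpus receives the higher score for the same within-segment frequency.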

Regarding claims 8 and 18: 
The combination of Pyati, Rubenczyk and Cassidy, as shown in the rejection above, discloses the limitations of claims 2 and 12.
Pyati further teaches:
wherein building the title interpreter model further comprises, for a respective title of the titles associated with a respective item in the item catalog: extracting title attribute values from the respective title using a natural language processing matching and a conflict solver, each of the title attribute values being associated with a respective title attribute name; (“Text feature extractor 106 can segment text and identify the features and corresponding feature values for each unit of text… After parsing and separating text into discrete units for analysis, text feature extractor 106 can identify features of each unit. A feature is generally a quality of an object that can define the object in part and may be used to compare the similarities or differences between objects. Some examples of text features include lexical features, semantic features, syntactic features, and metadata-based features.” ¶0019; Fig 1 (106); Fig 3C (314 and 318)) Examiner note: Under BRI and in light of the applicant’s specifications in ¶0095, the extracting step based on NLP matching and a conflict solver has been interpreted as the “parsing and separating text” of the “Text feature extractor 106”, which can also involve data such as the title (refer to ¶0013 and ¶0031) and identify features of each unit or “text features”, including “similarity or relatedness” through “semantic features” which are being interpreted as similar to the ability that the conflict solver executes to determine word’s matching and sorting based on importance, confidence and token length. 
determining a respective score for the respective title attribute name associated with the each of the title attribute values (“Title section 314 also includes title recommendation 318, a user interface element that provides a recommendation on how the user may improve the title for the item listing (e.g., by adding the term “Internet-enabled”) and a quantitative measure (e.g., 5%) associated with following through on the recommendation (e.g., increasing the probability of selling the item). In some embodiments, title recommendation 318 can display information received from a machine learning system for evaluating an item listing with respect to a target objective (e.g., maximizing selling price, maximizing selling probability, maximizing number of views of the item listing, etc.) from information input for the item listing, and information substituting user-inputted values. In some embodiments, title recommendation 338 can be dynamically updated as the user enters additional information for the new item listing.” ¶0103; Fig 1 (106); Fig 3C (314 and 318)) Examiner note: Under BRI and in light of the applicant’s specifications in ¶0095, the determining step of a title attribute name’s score based on the title attribute value has been interpreted as the combination of the “semantic features” which can measure “similarities between words and phrases”, as stated in ¶0021, and  the “title recommendation 318” which can provide recommendations, on “improving” a title and provide a “quantitative measure”, based on the evaluation and association of a “machine learning system” with respect to a “target objective”.

Regarding claims 9 and 19: 
The combination of Pyati, Rubenczyk and Cassidy, as shown in the rejection above, discloses the limitations of claims 8 and 18.
Pyati further teaches:
building a third dictionary based on the first dictionary by exchanging a nesting level of the each respective filtered attribute value with a nesting level of the each respective attribute name; (“Hierarchical clustering methods sort data into a hierarchical structure (e.g., tree, weighted graph, etc.) based on a similarity measure. Hierarchical clustering can be categorized as divisive or agglomerate. Divisive hierarchical clustering involves splitting or decomposing “central” nodes of the hierarchical structure where the measure of “centrality” can be based on “degree” centrality, (e.g., a node having the most number of edges incident on the node or the most number of edges to and/or from the node), “betweenness” centrality (e.g., a node operating the most number of times as a bridge along the shortest path between two nodes), “closeness” centrality (e.g., a node having the minimum average length of the shortest path between the node and all other nodes of the graph), among others (e.g., Eigenvector centrality, percolation centrality, cross-clique centrality, Freeman centrality, etc.)” ¶0063; Fig 1 (120)) Examiner note: Under BRI and in light to the applicant’s specifications in ¶0061 and ¶0113, the third dictionary construction based on the exchange of nesting levels of attribute values with respective attribute name nesting levels has been interpreted as the method of “hierarchical clustering” to sort and categorize data, which is used as an unsupervised learning method to determine “how item listings relate to one another and the attributes of the item listings that form clusters”, as stated in ¶0061. 
Therefore, the nesting level or “hierarchy” of the attribute values for each respective attribute name and the method of “hierarchical clustering” can provide the structure of a “graph” or map of the “nodes or words” and “edges or relationships” (refer to ¶0021) that relates the attribute name for a particular attribute value and product type to the associated confidence score (refer to ¶0061 of the applicant’s specification).
tokenizing the respective title into n-grams; (“In other embodiments, text feature extractor 106 may additionally or alternatively calculate other text-based features, such as character-based features or term-based features. Character-based features determine the similarity of a pair of strings or the extent to which they share similar character sequences. Examples of character-based features include Longest Common Substring (LCS), Damerau-Levenshtein, Jaro, Needleman-Wunsch, Smith-Waterman, and N-gram, among others…N-grams measure similarity using the n-grams (e.g., a subsequence of n items of a sequence of text) from each character or word in the two strings. Distance is computed by dividing the number of similar n-grams by the maximal number of n-grams.” ¶0029; Fig 1 (106)) Examiner note: Under BRI and in light of the applicant’s specification in ¶0062-63, the ability to tokenize or “segment text into smaller units of words or phrases” (refer to ¶0019), such as tokenizing the title into n-grams, has been interpreted as the calculation of the “character-based feature” of the “N-grams” method to determine the “similarity of a pair of strings”, which may include the title. Also, refer to ¶0030-31 for more information regarding the “text feature extractor 106” and the identification of “features of an item listing based on its metadata”, such as the title.
determining matches, for the n-grams, in the third dictionary for a product type associated with the respective item; and (“Item specific feature extractor 108 can comprise a number of adapters for translating, transforming, or otherwise processing the values of item specifics (e.g., manufacturer, model, color, etc.) from one form to another form more suitable for analyzing the item listings for a particular context.” ¶0032; Fig 1 (108)) Examiner note: Under BRI and in light of the applicant’s specification in ¶0064, the match determination for each n-gram has been interpreted as the “adapters for translating, transforming, or otherwise processing the values of item specifics (e.g., manufacturer, model, color, etc.)” to obtain suitable “item listings” in a “particular context”. Also, refer to ¶0033-34 for more examples of such n-grams being matched according to a product type or “specific feature”.
sorting the matches based at least in part on an importance measure of attribute values in the matches and determining which of the n-grams to use as the attribute values for the respective title attribute name (“User account feature extractor 110 can include processes for acquiring user preferences for putting together item listings, such as preferences for a target objective of dynamically generated machine learning models for optimizing creation of the listings. Target objectives can include selling an item for a specific price, price range, or maximum price; selling the item with a specific degree of certainty, a range of possibilities, or a maximum probability; selling the item by a specific date, range of dates, or the earliest date; selling the item while incurring a specific fee amount, a range of fee amounts, or a minimum fee amount; achieving a specific number of impressions, range of numbers of impressions, or maximum number of impressions for the item listing, or a combination of these objectives or other target objectives.” ¶0035; Fig 1 (110 and 112)) Examiner note: Under BRI, the step of sorting the matches by an importance measure of the attribute values and determining which n-grams to use as the attribute values for their respective attribute names or categories has been interpreted as the ability of the “User account feature extractor 110” to process data by “putting together item listings,…[and]…optimizing creation of the listings” based on “target objectives”, which are being interpreted as the “attribute names” or categories. Also, refer to ¶0036-38 for more examples, including an “Image feature extractor 112” for image data. 
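For illustration only, the claim 9 and 19 steps as recited (exchanging nesting levels, tokenizing a title into n-grams, determining matches, and sorting the matches) can be sketched as follows. This is a minimal sketch in Python under assumed data structures; it is not the applicant’s implementation and not code from Pyati. The names invert and word_ngrams, the sample dictionary, and the use of confidence followed by token length as the importance measure are all hypothetical.

```python
def invert(first_dictionary):
    """Swap the attribute-name and attribute-value nesting levels."""
    third = {}
    for product_type, names in first_dictionary.items():
        for name, values in names.items():
            for value, score in values.items():
                third.setdefault(product_type, {}).setdefault(value, {})[name] = score
    return third

def word_ngrams(title, max_n=2):
    """Tokenize a title into lowercase word n-grams of length 1..max_n."""
    tokens = title.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]

# Hypothetical first dictionary: product type -> attribute name -> value -> confidence.
first = {"shirt": {"color": {"navy blue": 0.9, "red": 0.8}, "material": {"cotton": 0.7}}}
third = invert(first)  # product type -> value -> attribute name -> confidence

# Look up every n-gram of the title among the values known for the product type.
matches = [
    (gram, name, score)
    for gram in word_ngrams("Navy Blue Cotton Shirt")
    for name, score in third["shirt"].get(gram, {}).items()
]
# Assumption: sort by confidence, then token length, as a stand-in importance measure.
matches.sort(key=lambda m: (m[2], len(m[0].split())), reverse=True)
```

With the value as the outer key, each n-gram is resolved with a single lookup per product type, which is one plausible reason for exchanging the nesting levels.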

Regarding claims 10 and 20: 
The combination of Pyati, Rubenczyk and Cassidy, as shown in the rejection above, discloses the limitations of claims 8 and 18.
Pyati further teaches:
wherein generating the respective precision score for the each one of the target attribute values further comprises, for a target attribute value of the target attribute values: determining if a match exists between (a) the target attribute value and (b) a title attribute value of the title attribute values that is associated with the respective title attribute name that matches the target attribute name; (“Feature vector builder 118 can determine a representation for the extracted features of each item listing. In a process sometimes referred to as fusion, feature vector builder 118 can combine the features of an item listing into a data object (e.g., feature vector) that can operate as a sample or data point of machine learning modeler 120. Feature vector builder 118 can implement early fusion or late fusion. In early fusion, feature values may be concatenated to form a vector, array, list, matrix, or similar data structure…Features may be derived from various domains (e.g., quantitative, semantical, lexical, syntactic, image, user, etc.), and a feature vector in an early fusion system can combine disparate feature types or domains. Early fusion may be effective for a feature set of similar features or features within the same domain (e.g., semantic features of a title of an item listing and a description of the item listing).” ¶0041; Fig 1 (118)) Examiner note: Under BRI, the generation of the precision score derived from the determination of an existent match between target attribute value and the title attribute value of other associated title attribute values that match the target attribute value has been interpreted as the “early fusion” which is the process that the “feature vector builder 118” executes. 
The “features”, each being “a quality of an object” (refer to ¶0019), have been interpreted as the target attribute value, while the “domain” would correspond to the title of the attribute values that are most likely to match the target attribute name (refer to ¶0021-28 to learn more about semantic features). 
when the match exists: using the respective score of the respective title attribute name as the respective precision score for the target attribute value; and (“In still other embodiments, late fusion can involve various set operations. For example, late fusion may define clusters or classes as the intersections, unions, or complements of feature vectors. That is, if a feature vector determined by a first machine learning system includes the values {1, 2, 3, 4} and a second feature vector determined by a second machine learning system includes the values {3, 4, 5}, then the intersection operation may result in clusters or classes {1}, {2}, {3, 4}, and {5}. On the other hand, applying the union operation to the first and second feature vectors may yield a single cluster or class {1, 2, 3, 4, 5}.” ¶0046; Fig 1 (118)) Examiner note: Under BRI, the use of the respective score of the respective title attribute name as the respective precision score when a match exists has been interpreted as the “late fusion” that the “feature vector builder 118” executes, using the calculated “similarity S” from the “similarity vectors” of the first machine learning system that executed the “early fusion” (refer to ¶0042). 
when the match does not exist: using a complement of the respective score of the respective title attribute name as the respective precision score for the respective target attribute value. (“feature vector builder 118 can also handle missing data or process data to handle outlier values. Feature vector builder 118 may delete missing or outlier values or replace missing or outlier values with mean, median, or mode values in some situations. In other situations, feature vector builder 118 may predict missing or outlier values using machine learning techniques, such as MMSE, MAP estimation, MLE, PCA, or other techniques discussed in the present disclosure.” ¶0043; Fig 1 (118)) Examiner note: Under BRI, the use of a complement of the respective score of the respective title attribute name as the respective precision score has been interpreted as the ability of the “feature vector builder 118” to replace the missing or outlier values with “mean, median, or mode values”, which in turn would be derived from previous “similarity S” values within the “similarity vectors” that are contained in the “feature vectors”. 
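For illustration only, the match and no-match branches recited in claims 10 and 20 can be sketched as follows. This is a minimal sketch in Python, not the applicant’s implementation; the function name precision_score, the case-insensitive comparison, and the reading of “complement” as one minus a score in the range 0 to 1 are assumptions.

```python
def precision_score(target_value, title_value, title_name_score):
    """Return the title attribute name's score on a match, else its complement."""
    # Assumption: a match is a case-insensitive string equality.
    if title_value is not None and target_value.lower() == title_value.lower():
        return title_name_score
    # Assumption: "complement" means 1 - score for a score in [0, 1].
    return 1.0 - title_name_score

on_match = precision_score("Red", "red", 0.75)       # title agrees with target value
on_mismatch = precision_score("Red", "blue", 0.75)   # title disagrees with target value
```

Under these assumptions, a confident title extraction that disagrees with the target attribute value lowers the precision score of that value.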

Claims 6-7 and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Pyati (U.S. Pub No. 20190311301 A1) in view of Cassidy (U.S. Patent No. 7107226 B1) in further view of Rubenczyk (U.S. Pub No. 20030217052 A1) and Krishnamurthy (U.S. Pub No. 20180068371 A1).
Regarding claims 6 and 16: 
The combination of Pyati, Rubenczyk and Cassidy, as shown in the rejection above, discloses the limitations of claims 2 and 12.
Pyati further teaches:
determining the respective semantic centroid score for the each respective attribute name based on a weighted average of glove word embeddings for the respective filtered attributes values that are associated with the each respective attribute name. (“Lexical Chains measure semantic relatedness by identifying lexical chains associating two concepts, and classifying relatedness of a pair of words, such as “extra-strong,” “strong,” and “medium-strong.” Overlapping glosses measure semantic relatedness using the “glosses” (e.g., brief definition or concept of a synset) of two synsets, and quantifies relatedness as the sum of the squares of the overlap lengths. Vector pairs measure semantic relatedness using co-occurrence matrices for words in the glosses from a particular corpus and represent each gloss as a vector of the average of the co-occurrence matrices.” ¶0023; Fig 1 (118 and 124)) Examiner note: Under BRI and in light of the applicant’s specifications in ¶0055, the semantic centroid has been interpreted as the measurement of “semantic relatedness” by “identifying lexical chains associating two concepts, and classifying relatedness of a pair of words” in a vector that uses “co-occurrence matrices for words… and represent each gloss as a vector of the average of the co-occurrence matrices”. Also refer to ¶0081.

Pyati does not explicitly teach the use of glove word embeddings or “global vectors for word representation”. The second prior art reference, Krishnamurthy, is analogous art that is based on historical data for items of products or services (also referred to simply as “items” herein) and encompasses product recommendations made through a computing device to a user’s shopping service system by utilizing word embedding models to determine vector space representations of items and using them to determine item similarities for recommendations. Thus, Krishnamurthy teaches: “Any suitable word embedding model 212, such as the GloVe (Global Vectors for word representation) model or the Word2Vec models, may be used to calculate the item vectors 304 for the items in the sessions from the word embedment 302. The GloVe and Word2Vec models are described in detail below. Depending on the word embedding model 212 used, multiple permutations of sentences may be used as the word embedment 302 and input into the word embedding model 212.” ¶0033; “GloVe is a global log-bilinear regression model that is configured to use a word-word co-occurrence matrix along with local context window methods to generate word embeddings in a low dimensional space. As discussed above, embedding of words is used by the computing device to produce item vector representations of the associated items. The item vectors are usable by an item recommendation system of a computing device to create an item similarity matrix through comparison of the item vectors, arithmetic that is based on the item vectors, and so on. Although GloVe is described in terms of using a word-word co-occurrence matrix where items are considered as words and items in sessions as sentences, an item-item co-occurrence matrix can also be used to generate item embeddings in the low dimensional space directly. 
For example, the GloVe model is able to make use of a conventional item-item co-occurrence matrix to produce more accurate item-item similarities through vector representation without necessarily converting items and sessions to words and sentences” ¶0034, Fig 2 (212); Fig 3 (304)

It would have been obvious to one of ordinary skill in the art before the earliest effective filing date of the claimed invention to have provided Pyati with the ability of using and combining the “glove word embeddings” to later determine the weighted average and determine a semantic centroid score for the each respective attribute name, as taught by Krishnamurthy because “The GloVe model has several advantages that make it suitable for the task of creating item vector representations. One such advantage is that the GloVe model efficiently leverages statistical information by training only on the nonzero elements in an item-item co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. Further, the representation deduction process in GloVe treats the distance between words to determine their relative similarity. Intuitively, this makes sense since items that are interacted with consecutively in a session are likely to be more similar than items that are separated by a larger number of items within a session.” (Krishnamurthy; ¶0034).
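For illustration only, a weighted average of word embeddings of the kind recited in claims 6 and 16, together with a cosine similarity of the kind recited in claims 7 and 17, can be sketched as follows. This is a minimal sketch in Python and is not code from Pyati, Krishnamurthy, or the applicant. The two-dimensional vectors are made-up stand-ins for pretrained GloVe embeddings, and the function names and the use of confidence scores as weights are assumptions.

```python
from math import sqrt

def weighted_centroid(vectors, weights):
    """Weighted average of equal-length embedding vectors."""
    total = sum(weights)
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(len(vectors[0]))
    ]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Hypothetical 2-dimensional embeddings for two values of one attribute name.
embeddings = {"red": [1.0, 0.0], "crimson": [0.0, 1.0]}
# Assumption: the weights are the confidence scores of the attribute values.
confidence = {"red": 3.0, "crimson": 1.0}

values = list(embeddings)
centroid = weighted_centroid(
    [embeddings[v] for v in values], [confidence[v] for v in values]
)
# Compare an unseen value's (hypothetical) embedding against the centroid.
similarity = cosine_similarity(centroid, [1.0, 0.0])
```

In this sketch the centroid is pulled toward the value with the larger weight, so an unseen value whose embedding lies near that value receives a cosine similarity close to 1.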

Regarding claims 7 and 17: 
The combination of Pyati, Rubenczyk and Cassidy, as shown in the rejection above, discloses the limitations of claims 2 and 12.
Pyati further teaches:
determining if the target attribute value is included in the first dictionary; (“In still other embodiments, late fusion can involve various set operations. For example, late fusion may define clusters or classes as the intersections, unions, or complements of feature vectors. That is, if a feature vector determined by a first machine learning system includes the values {1, 2, 3, 4} and a second feature vector determined by a second machine learning system includes the values {3, 4, 5}, then the intersection operation may result in clusters or classes {1}, {2}, {3, 4}, and {5}. On the other hand, applying the union operation to the first and second feature vectors may yield a single cluster or class {1, 2, 3, 4, 5}.” ¶0046 “Machine learning modeler 120 can determine the parameters and functions (e.g., a machine learning model) that may be optimal for achieving a target objective. This can include identifying a suitable machine learning algorithm. In some embodiments, a user may select the machine learning algorithm(s) as part of generating the new item listing, or the selected machine learning algorithm(s) may have previously been chosen as a user preference.” ¶0048; Fig 1 (118; and 120); Equation 2) Examiner note: Under BRI and in light to the applicant specifications, specifically ¶0100, the first dictionary has been interpreted as the generation of “Percentiling or binning maps” that are based on the “clusters or classes” (refer to ¶0061) that represents the “intersections, unions, or complements of feature vectors” or “item listings” (refer to ¶0045). As for the determination step of the target attribute, this has been interpreted as the combination of the abilities between “feature vector builder 118” using “late fusion” (refer to ¶0041) and the determination of “optimal parameters and functions” done by the “Machine learning modeler 120” to obtain a “target objective” to include in a “new item listing” stated above. 
Finally, note that the “feature vectors” are “data objects” that combine the “features”, or attribute values, of an “item listing” or item catalogue (refer to ¶0041).
when the target attribute value is included in the first dictionary: using the respective confidence score for the respective filtered attribute value in the first dictionary that matches the target attribute value as the respective relevancy score for the target attribute value; and (“Weights can be user-specified or automatically obtained, such as via silhouette scores. A silhouette score is a measure of how similar an object is to its own cluster or class compared to other clusters or other classes, which can range from −1 to 1, where a high value indicates that an item listing is well matched to its own cluster or class and badly matched to neighboring clusters or classes. If most item listings have a high silhouette score, then the clustering or classification maybe accurate. If many item listing have a low or negative silhouette score, then the clustering or classification may have too many or too few clusters or classes. The silhouette score can be calculated with any similarity or distance metric, such as the Euclidean distance or the Manhattan distance. Percentiling or binning maps the value of each position of a similarity vector to percentiles or bins to account for different similarity distributions. That is, similarity vectors are sorted and bins of the sorted vectors are created according to a particular probability distribution (e.g., normal or Gaussian, Poisson, Weibull, etc.). 
For example, the probability P, for a probability density function f, that an item listing X belongs to a cluster or class associated with a percentile/bin over interval a and b” ¶0045; Fig 1 (118; and 120); Equation 2) Examiner note: Under BRI and in light of applicant's specification, specifically ¶0044-45, the construction of the first dictionary has been interpreted as the generation of “Percentiling or binning maps” that are based on the “clusters or classes” that represent the “intersections, unions, or complements of feature vectors” or “item listings” (refer to ¶0046), and the confidence score has been interpreted as the “silhouette scores”. Also, refer to ¶0041 for early and late fusion as implemented in the “feature vector builder 118” and to ¶0061 for clusters.
when the target attribute value is not included in the first dictionary: determining the respective relevance score for the target attribute value based on a cosine similarity measure between (a) the respective semantic centroid score for an attribute name associated with the target attribute value in the second dictionary (“In some embodiments, feature vector builder 118 can also handle missing data or process data to handle outlier values. Feature vector builder 118 may delete missing or outlier values or replace missing or outlier values with mean, median, or mode values in some situations. In other situations, feature vector builder 118 may predict missing or outlier values using machine learning techniques, such as MMSE, MAP estimation, MLE, PCA, or other techniques discussed in the present disclosure” ¶0047; Fig 1 (118; and 120)) Examiner note: Under BRI, the step of determining the relevance score for each target attribute value based on a cosine similarity measure between the semantic centroid for an attribute name and its related target attribute value has been interpreted as the combination of the capabilities of the “machine learning modeler 120” and its techniques such as “k-means clustering” (refer to ¶0062), the “Text feature extractor 106” when using its “Corpus-based semantic features” (refer to ¶0024) such as “Explicit Semantic Analysis (ESA)” (refer to ¶0027), and the “semantic features,” in which “Lexical Chains measure semantic relatedness,” which has been interpreted as the semantic centroid (refer to ¶0023).

Pyati does not explicitly teach the following limitation. However, Krishnamurthy teaches:
and (b) a glove word embedding of the target attribute value. (“Any suitable word embedding model 212, such as the GloVe (Global Vectors for word representation) model or the Word2Vec models, may be used to calculate the item vectors 304 for the items in the sessions from the word embedment 302. The GloVe and Word2Vec models are described in detail below. Depending on the word embedding model 212 used, multiple permutations of sentences may be used as the word embedment 302 and input into the word embedding model 212.” ¶0033; “GloVe is a global log-bilinear regression model that is configured to use a word-word co-occurrence matrix along with local context window methods to generate word embeddings in a low dimensional space. As discussed above, embedding of words is used by the computing device to produce item vector representations of the associated items. The item vectors are usable by an item recommendation system of a computing device to create an item similarity matrix through comparison of the item vectors, arithmetic that is based on the item vectors, and so on. Although GloVe is described in terms of using a word-word co-occurrence matrix where items are considered as words and items in sessions as sentences, an item-item co-occurrence matrix can also be used to generate item embeddings in the low dimensional space directly. For example, the GloVe model is able to make use of a conventional item-item co-occurrence matrix to produce more accurate item-item similarities through vector representation without necessarily converting items and sessions to words and sentences” ¶0034; Fig 2 (212); Fig 3 (304)) Examiner note: Under BRI and in light of applicant's specification at ¶0101-102, the use of a glove word embedding of the target attribute value as part of a cosine similarity measure has been interpreted as the use of a “GloVe model” in the “word embedding model 212” to “produce more accurate item-item similarities through vector representation”.

It would have been obvious to one of ordinary skill in the art before the earliest effective filing date of the claimed invention to have provided Pyati with the ability to use and combine the “glove word embeddings” to determine a relevance score for the target attribute value based on a cosine similarity measurement, as taught by Krishnamurthy, because “The GloVe model has several advantages that make it suitable for the task of creating item vector representations. One such advantage is that the GloVe model efficiently leverages statistical information by training only on the nonzero elements in an item-item co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. Further, the representation deduction process in GloVe treats the distance between words to determine their relative similarity. Intuitively, this makes sense since items that are interacted with consecutively in a session are likely to be more similar than items that are separated by a larger number of items within a session.” (Krishnamurthy; ¶0034).
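
The conditional scoring recited in the limitations above (a lookup in the first dictionary, with a cosine-similarity fallback when the value is absent) can be sketched as follows. This is a minimal illustration under stated assumptions and is not the implementation of the application, Pyati, or Krishnamurthy: the function names, the layout of both dictionaries, and the toy two-dimensional vectors are hypothetical, and actual GloVe embeddings would come from a pretrained model rather than a hand-written table.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def relevancy_score(value, attribute_name, first_dictionary, second_dictionary, embeddings):
    """Hypothetical scoring rule mirroring the two claim branches.

    first_dictionary:  attribute value -> confidence score (assumed layout)
    second_dictionary: attribute name  -> semantic centroid vector (assumed layout)
    embeddings:        attribute value -> word embedding vector (stand-in for GloVe)
    """
    # Branch 1: the value is in the first dictionary, so reuse its confidence score.
    if value in first_dictionary:
        return first_dictionary[value]
    # Branch 2: otherwise compare the attribute name's centroid with the value's embedding.
    centroid = second_dictionary[attribute_name]
    return cosine_similarity(centroid, embeddings[value])


# Toy data, invented for illustration only.
first_dictionary = {"red": 0.9}
second_dictionary = {"color": [1.0, 0.0]}
embeddings = {"crimson": [1.0, 0.0], "wooden": [0.0, 1.0]}

score_known = relevancy_score("red", "color", first_dictionary, second_dictionary, embeddings)
score_close = relevancy_score("crimson", "color", first_dictionary, second_dictionary, embeddings)
score_far = relevancy_score("wooden", "color", first_dictionary, second_dictionary, embeddings)
```

In this toy data, a value found in the first dictionary receives its stored confidence score, while an unseen value is scored by how closely its embedding aligns with the centroid for the attribute name.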

Conclusion
  The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Ahmed (U.S. Pub No. 20180075511 A1) is pertinent because it is “A system for extracting attributes can analyze text from data sources, extract n-grams from the text as candidate attribute and service/product pairs, prompt a human operator to rate the suitability of the candidate attribute and service/product pairs, and, based on the ratings, add the candidate attribute and service/product pairs to an attribute dictionary.”
Dubey (U.S. Pub No. 20170161613 A1) is pertinent because it “relate[s] to systems and/or methods for improving the integrity and consistency of data imported from Big Data and/or other data sources. More particularly, certain example embodiments described herein relate to techniques for managing “bad” or “imperfect” data being imported into a database system by automatically classifying and enriching data records, e.g., using self-learning models that help fit such data to given taxonomies and/or the like, in order to provide meaningful outputs”
Jadhav (U.S. Pub No. 20180174220 A1) is pertinent because it is related to “a method [that] includes receiving a plurality of candidate offers that are likely associated with a product being offered for sale. Each candidate offer is associated with a common set of attributes, wherein at least one of the attributes in the common set uniquely identifies the product being offered for sale.”
Pobbathi (U.S. Pub No. 20140172652 A1) is pertinent because it is “A system and method is described for large-scale, automated classification of products. The system and method receives information about products, wherein such information includes one or more text metadata fields associated with each product, receives a set of categories, and automatically selects one or more categories from the set of categories to which each product belongs based upon at least one of the one or more text metadata fields associated with each product.”

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action.
 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Ivonnemary Rivera Gonzalez whose telephone number is (571)272-6158. The examiner can normally be reached Mon - Fri 9:00AM - 5:30PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Nathan Uber can be reached on (571) 270-3923. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/IVONNEMARY RIVERA GONZALEZ/Examiner, Art Unit 3687                                                                                                                                                                                                        
/NATHAN C UBER/Supervisory Patent Examiner, Art Unit 3687