DETAILED ACTION
Remarks
The instant application, having Application No. 17/304,170, was filed on June 15, 2021. After a thorough search and examination of the present application, and in light of the prior art made of record, claims 1-30 are allowed. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continuation Statement
This patent application is a continuation of U.S. Application No. 16/288,059, filed February 27, 2019, now U.S. Patent No. 11,042,594.

Terminal Disclaimer
The terminal disclaimer filed on August 31, 2022, disclaiming the terminal portion of any patent granted on this application which would extend beyond the expiration date of U.S. Patent No. 11,042,594, has been reviewed and is accepted. The terminal disclaimer has been recorded.

EXAMINER’S AMENDMENT
An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
          Authorization for this examiner’s amendment was given in a telephone interview with Attorney, Mr. David N. Weiss (Reg. No. 41,371) on August 30, 2022.

Please amend the claims as follows:

(Original) A computer system, the computer system comprising:
one or more processing devices;
a network interface;
non-transitory memory that stores instructions that when executed by the one or more processing devices are configured to cause the computer system to perform operations comprising:
crawling, using the network interface, a website to identify a plurality of product pages, the product pages comprising base product pages comprising data about a product;
applying an unsupervised content extraction model to the product pages to identify a first set of data patterns for extracting product attributes and to distinguish product attributes from non-product information;
identifying a plurality of interface elements, comprising menus and/or buttons, on the product pages;
applying an automated process to systematically activate the plurality of interface elements on the product pages, comprising menus and/or buttons, to generate respective product page variations;
generating differences between the product page variations, generated by systematically activating the plurality of interface elements on the product pages, comprising menus and/or buttons, and the base product pages and analyzing the generated differences to identify a plurality of raw product attribute values for each of the plurality of product attributes;
normalizing the plurality of product attributes and normalizing the plurality of raw product attribute values, identified by analyzing the generated differences between the product page variations and the base product page, to a master list of attributes by identifying corresponding standardized attribute values, to create consistency across multiple websites;
storing in a searchable database one or more product identifiers, product attributes, and product attribute values, identified by analyzing the generated differences between the product page variations and the base product page, and
enabling a user to search for and review product variations using the searchable database.
 (Currently amended) The computer [[-]] system of claim 1, the operations further comprising: 
crawling the website 
clustering the extracted URLs based on similarity to generate a plurality of URL clusters; 
generating a plurality of URL templates based on the URL clusters; 
tagging a first set of URL templates as product page URLs; and
prioritizing the product page URLs while crawling the website to identify the plurality of product pages.
 (Currently amended) The computer [[-]] system of claim 1, the operations further comprising: 
crawling the website to identify the plurality of product pages by using a reinforcement learning algorithm, where the reinforcement learning algorithm is rewarded for crawling a product page and penalized for visiting a non-product page.
 (Currently amended) The computer system 
 (Currently amended) The computer system 
(Currently amended)  The computer [[-]] system of claim 1, the operations further comprising: 
extracting a plurality of HTML elements from a web page and extracting features from each of the HTML elements; 
inputting the extracted features to the unsupervised content extraction model, wherein the unsupervised content extraction model is configured to select one of the plurality of HTML elements as a product attribute.
(Currently amended)  The computer [[-]] system of claim 1, wherein the first set of data patterns comprises regular expressions.
(Currently amended)  The computer [[-]] system of claim 1, the operations further comprising: 
providing a master list of product attributes and attribute values and normalizing the product attributes and attributes values to the master list.
 (Currently amended) The computer [[-]] system of claim 1, wherein the searchable database comprises a graph, the graph comprising nodes representing products and edges representing common attributes between connected nodes.
 (Currently amended) The computer [[-]] system of claim 1, the operations comprising:
training an embedding generation model to generate tensors for each product in the searchable database, where [[the]] a distance between tensors represents [[the]] a similarity of the products in at least one dimension.
(Currently amended) A computer-implemented method for extracting content from a web page comprising: 
crawling a website to identify a plurality of product pages, the product pages comprising base product pages comprising data about a product; 
applying an unsupervised content extraction model to the product pages to identify a first set of data patterns for extracting product attributes and to distinguish product attributes from non-product information; 
filtering the first set of data patterns for extracting product attributes; 
identifying a plurality of interface elements, comprising menus and/or buttons, on the product pages; 
applying an automated process to systematically activate the plurality of interface elements on the product pages, comprising menus and/or buttons, to generate respective product page variations; 
generating differences between the product page variations, generated by systematically activating the plurality of interface elements on the product pages, comprising menus and/or buttons, and the base product pages and analyzing, using the first set of data patterns, the generated differences to identify a plurality of product attribute values for each of the plurality of product attributes; 
normalizing the plurality of product attributes and normalizing a plurality of raw product attribute values, identified by analyzing the generated differences between the product page variations and the base product page, to a master list of attributes by identifying corresponding standardized attribute values, to create consistency across multiple websites; 
storing in a searchable database one or more product identifiers, product attributes, and product attribute values, identified by analyzing the generated differences between the product page variations and the base product page, and
enabling a user to search for and review product variations using the searchable database.
 (Currently amended) The computer-implemented method of claim 11, further comprising: 
crawling the website 
clustering the extracted URLs based on similarity to generate a plurality of URL clusters; 
generating a plurality of URL templates based on the URL clusters; 
tagging a first set of URL templates as product page URLs; 
prioritizing the product page URLs while crawling the website to identify the plurality of product pages.
 (Original) The computer-implemented method of claim 11, further comprising: 
crawling the website to identify the plurality of product pages by using a reinforcement learning algorithm, where the reinforcement learning algorithm is rewarded for crawling a product page and penalized for visiting a non-product page.
 (Original) The computer-implemented method of claim 11, wherein the unsupervised content extraction model comprises a machine learning model.
 (Original) The computer-implemented method of claim 11, wherein the unsupervised content extraction model is trained to accept as input a plurality of HTML elements and select one of the HTML elements as a product attribute.
 (Original) The computer-implemented method of claim 11, further comprising: 
extracting a plurality of HTML elements from a web page and extracting features from each of the HTML elements; 
inputting the extracted features to the unsupervised content extraction model, wherein the unsupervised content extraction model is configured to select one of the plurality of HTML elements as a product attribute.
 (Original) The computer-implemented method of claim 11, wherein the first set of data patterns for extracting product attributes comprises regular expressions.
 (Original) The computer-implemented method of claim 11, further comprising: 
providing a master list of product attributes and attribute values and normalizing the product attributes and attributes values to the master list.
 (Original) The computer-implemented method of claim 11, wherein the searchable database comprises a graph, the graph comprising nodes representing products and edges representing common attributes between connected nodes.
 (Currently amended) The computer-implemented method of claim 11, further comprising:
training an embedding generation model to generate tensors for each product in the searchable database, where [[the]] a distance between tensors represents [[the]] a similarity of the products in at least one dimension.
 (Currently amended) A non-transitory computer-readable medium comprising instructions that when executed by a computer device 
crawling a website to identify a plurality of product pages, the product pages comprising base product pages comprising data about a product; 
applying an unsupervised content extraction model to the product pages to generate a first set of patterns for extracting product attributes; 
filtering the first set of patterns for extracting product attributes; 
identifying a plurality of interface elements, comprising menus and/or buttons, on the product pages; 
applying an automated process to systematically activate the plurality of interface elements on the product pages, comprising menus and/or buttons, to generate product page variations; 
generating differences between the product page variations, generated by systematically activating the plurality of interface elements on the product pages, comprising menus and/or buttons, and the base product pages and analyzing the generated differences to identify a plurality of product attribute values for each of the plurality of product attributes; 
normalizing the plurality of product attributes and normalizing a plurality of raw product attribute values, identified by analyzing the generated differences between the product page variations and the base product page, to a master list of attributes by identifying corresponding standardized attribute values, to create consistency across multiple websites; 
storing in a searchable database one or more product identifiers, product attributes, and attribute values, identified by analyzing the generated differences between the product page variations and the base product page; and
enabling a user to search for and review product variations using the searchable database.
 (Currently amended) The non-transitory computer-readable medium of claim 21, the operations further comprising: 
crawling the website 
clustering the extracted URLs based on similarity to generate a plurality of URL clusters; 
generating a plurality of URL templates based on the URL clusters; 
tagging a first set of URL templates as product page URLs; 
prioritizing the product page URLs while crawling the website to identify the plurality of product pages.
 The non-transitory computer-readable medium of claim 21, the operations further comprising: 
crawling the website to identify the plurality of product pages by using a reinforcement learning algorithm, where the reinforcement learning algorithm is rewarded for crawling a product page and penalized for visiting a non-product page.
 The non-transitory computer-readable medium of claim 21, wherein the unsupervised content extraction model is a machine learning model.
 The non-transitory computer-readable medium of claim 21, wherein the unsupervised content extraction model is trained to accept as input a plurality of HTML elements and select one of the HTML elements as a product attribute.
 The non-transitory computer-readable medium of claim 21, the operations further comprising: 
extracting a plurality of HTML elements from a web page and extracting features from each of the HTML elements; 
inputting the extracted features to the unsupervised content extraction model, wherein the unsupervised content extraction model is configured to select one of the plurality of HTML elements as a product attribute.
 The non-transitory computer-readable medium of claim 21, wherein the first set of patterns for extracting product attributes comprises regular expressions.
 The non-transitory computer-readable medium of claim 21, the operations further comprising: 
providing a master list of product attributes and attribute values and normalizing the product attributes and attributes values to the master list.
 The non-transitory computer-readable medium of claim 21, wherein the searchable database comprises a graph, the graph comprising nodes representing products and edges representing common attributes between connected nodes.
 (Currently amended) The non-transitory computer-readable medium of claim 21, the operations further comprising: 
training an embedding generation model to generate tensors for each product in the searchable database, where [[the]] distance between tensors represents [[the]] a similarity of the products in at least one dimension.
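
For illustration only, the differencing and normalizing steps recited in the independent claims above can be sketched as follows. This is a minimal sketch under stated assumptions and is not the applicant's implementation: pages are modeled as lists of text lines, and the names MASTER_LIST, PATTERNS, added_lines, raw_values and normalize are hypothetical. The regular expressions stand in for the data patterns that the claims attribute to the unsupervised content extraction model.

```python
import re
from difflib import ndiff

# Hypothetical master list: attribute -> {raw value (lowercase) -> standardized value}.
MASTER_LIST = {
    "color": {"navy": "Blue", "crimson": "Red"},
    "size": {"sm": "S", "small": "S", "lg": "L", "large": "L"},
}

# Hypothetical data patterns; assumed here, not produced by any model.
PATTERNS = {
    "color": re.compile(r"^Color:\s*(.+)$", re.IGNORECASE),
    "size": re.compile(r"^Size:\s*(.+)$", re.IGNORECASE),
}


def added_lines(base_lines, variation_lines):
    """Return the lines present in a page variation but not in the base page."""
    return [d[2:] for d in ndiff(base_lines, variation_lines) if d.startswith("+ ")]


def raw_values(base_lines, variations):
    """Apply the data patterns to each difference to collect raw attribute values."""
    found = {}
    for variation in variations:
        for line in added_lines(base_lines, variation):
            for attribute, pattern in PATTERNS.items():
                match = pattern.match(line)
                if match:
                    found.setdefault(attribute, []).append(match.group(1).strip())
    return found


def normalize(found):
    """Map raw values to standardized values; unknown values pass through unchanged."""
    return {
        attribute: [MASTER_LIST.get(attribute, {}).get(v.lower(), v) for v in values]
        for attribute, values in found.items()
    }
```

In this sketch, each variation would be the page text captured after one menu or button is activated; only the lines that differ from the base page are examined, so unchanged page content never reaches the pattern-matching step.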


Examiner’s Statement of Reasons for Allowance
Claims 1-30 are allowed over the prior art made of record.
The following is an Examiner’s Statement of Reasons for the indication of allowable subject matter: Claims 1-30 are allowable over the prior art of record because the Examiner found that no cited prior art reference, taken in its entirety, teaches the claimed subject matter, and found no motivation in the prior art to combine any of the cited references.
The prior art of record is in the same field of invention.
Prior art of record Bentley (US Patent Publication No. 2014/0114758) discloses systems and methods for generating customized electronic advertisements. A request for an advertisement is received. Viewer data is received and analyzed to determine current viewer features, characteristics, attributes, and/or interest(s). Product data can be extracted from publicly accessible electronic data generated by an ad source organization. The product data can be compared to the current viewer interest(s) to determine which product of the plurality of products most closely aligns with the current interests of the viewer to select a product to be advertised. A customized advertisement can be generated specifically for the viewer using at least a portion of the extracted product data for the product to be advertised.
Prior art of record Sachdev et al. (US Patent Publication No. 2018/0013720 A1) discloses a method that includes receiving a plurality of uniform resource identifiers (URI's) associated with a particular domain. Each of the URI's identifies a content page comprising one or more signature elements. The method further includes, for each URI in the plurality of URI's, successively testing the URI to identify a core of the URI and any unnecessary elements of the URI. The core of the URI is sufficient to retrieve a version of the content page including all of its signature elements. The method additionally includes, for each URI in the plurality of URI's, updating a set of rules based on the identified core and the identified unnecessary elements. The set of rules establishes a normalized version of the URI.
Prior art of record Bandaru et al. (US Patent Publication No. 2009/0119268 A1) discloses a method and system for crawling multiple websites containing one or more web pages having information relevant to a particular domain of interest, such as details about local restaurants, extracting content from such websites, such as hours, location and phone number as well as reviews, review dates and other business specific information, and associating the extracted content with a specific business entity.
Prior art of record Lamba et al. (US Patent Publication No. 2014/0149105 A1) discloses systems and methods for extracting products referenced in a document. A document is analyzed to identify a product type that is referenced in the document. Attributes are extracted from the document. A set of candidate products are identified corresponding to the extracted attributes. A score is calculated for the candidate products and the products are further selected or filtered based on the score, whitelist rules, and blacklist rules in order to identify one or more inferred products referenced by the document. The whitelist and blacklist rules may take as inputs a domain, a user identifier, and keywords included in the document. A set of sufficient attributes may be identified for each product type. Selection of a candidate product may be based at least in part on the document including all of the attributes in the set of sufficient attributes.
In contrast to Applicant’s claim 1, the cited references alone or in combination fail to suggest or to teach “crawling a website to identify a plurality of product pages, the product pages comprising base product pages comprising data about a product; applying an unsupervised content extraction model to the product pages to identify a first set of data patterns for extracting product attributes and to distinguish product attributes from non-product information; filtering the first set of data patterns for extracting product attributes; identifying a plurality of interface elements, comprising menus and/or buttons, on the product pages; applying an automated process to systematically activate the plurality of interface elements on the product pages, comprising menus and/or buttons, to generate respective product page variations; generating differences between the product page variations, generated by systematically activating the plurality of interface elements on the product pages, comprising menus and/or buttons, and the base product pages and analyzing, using the first set of data patterns, the generated differences to identify a plurality of product attribute values for each of the plurality of product attributes; normalizing the plurality of product attributes and normalizing a plurality of raw product attribute values, identified by analyzing the generated differences between the product page variations and the base product page, to a master list of attributes by identifying corresponding standardized attribute values, to create consistency across multiple websites; storing in a searchable database one or more product identifiers, product attributes, and product attribute values, identified by analyzing the generated differences between the product page variations and the base product page, and enabling a user to search for and review product variations using the searchable database.”

Independent claims 11 and 21 are similar to independent claim 1 and are therefore allowable for the same reasons as claim 1. The dependent claims, being further limiting to the independent claims, definite, and enabled by the specification, are also allowed.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”


Conclusion
The prior art made of record, listed on form PTO-892, and not relied upon, if any, is considered pertinent to applicant’s disclosure.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HASANUL MOBIN, whose telephone number is (571) 270-1289. The examiner can normally be reached Monday through Friday, 9:30 AM to 6:00 PM EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya, can be reached at 571-272-4034. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/HASANUL MOBIN/
Primary Examiner, Art Unit 2168