DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings
The drawings filed 10/15/2019 were accepted.


Claim Objections
Claims 2, 6, 8, 12, 14 and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.


Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitations are: the first and second modules in claims 3, 9, and 15. 
Claims 3, 9, and 15: “a first module to determine term frequency”
Claims 3, 9, and 15: “a second module in parallel with the first module to identify language”
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-5, 7, 9-11, 13, and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over Chang (US 20150106078 A1; filed 10/15/2013) in view of Look (US 20170024436 A1; filed 6/3/2016) and Ahern et al. (US 20180336279 A1; filed 4/29/2016).

With regard to claim 1, Chang discloses a method for operating a self-orchestrated system for extraction, analysis, and presentation of entity data, the method comprising: extracting a web page to a content store comprising object-based storage including web page content (Chang, abstract: “A contextual analysis engine systematically extracts, analyzes and organizes digital content stored in an electronic file such as a webpage. Content can be extracted using a text extraction module”), web page metadata and a globally unique identifier (Chang, paragraph 25: “The term content also includes information that is not specifically intended for display, and therefore also encompasses items such as software, executable instructions, scripts, hyperlinks, addresses, pointers, metadata, and formatting…”); extracting the web page metadata from the object-based storage (Chang, paragraph 16: “FIG. 9 illustrates a modular extraction rule that can be used to extract metadata from a webpage in accordance with an embodiment of the present invention.”); inputting the web page metadata to a queue (Chang, paragraph 71: “orchestration manager 112 a can be configured to receive incoming content analysis requests, place such requests into a load balancing queue for dispatching and processing as appropriate, and pass such requests to text extraction module 120 and/or text analytics module 140.”); pulling web page content from the content store (Chang, abstract: “A contextual analysis engine systematically extracts, analyzes and organizes digital content stored in an electronic file such as a webpage.”)… parsing the web page content… and the web page metadata from the queue to generate extracted content (Chang, paragraph 92: “Feature extraction system 122 can be configured to extract plain text and metadata from incoming content such as webpages, word processing documents, PDF files and other types of content containing formatted and/or unformatted text.”) and positions of extracted content; 
passing the web page metadata, the extracted content, and the positions of extracted content to an advanced analysis function decider (AAF Decider) for analysis to generate relevance between the terms (Chang, paragraph 87: “The subsequent foundCandidate node 533 b includes one or more sub-nodes 533 b′ which provide information regarding particular features identified in the corpus of extracted text… The specific data provided within sub-nodes 533 b′ depends on the particular text analytics tool identified in the @id descriptor, but in general, may include data characterizing… feature location within the analyzed content based on original word position (provided within, for example, a collapsible “offset” label 544)… and feature relevancy score (see, for example, “score” label 548).” NOTE: “AAF Decider” appears to be a term coined in the specification; however, it has no precise definition, since the specification provides only examples); and streaming the relevance between the terms and the positions of extracted content to a JSON file batch (Chang, paragraph 71: “the output format to be used to return contextual analysis data 59. Example output formats include the JavaScript Object Notation for linked Data (JSON-LD) as standardized by W3C and HTML.”).
However, Chang does not disclose receiving RegEx from a model parameter store; parsing the… content using RegEx… JSON file batch for flattening.
Ahern teaches receiving RegEx from a model parameter store; parsing the… content using RegEx (Ahern, paragraph 48: “the running of regular expressions containing relevant lexicons against the html of a page, stylesheets, JavaScript, cookie names and values, and HTTP header names and values”; paragraph 52: “A lexicon containing the known used phrases indicating the presence of a cart is applied to the html. When we say “applied”, this means checking for the existence of a word, phrase or other combination of characters using a regular expression”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Chang and Ahern such that Chang’s language processing includes the use of regular expressions to parse and extract data from the web pages. This would have enabled the invention to identify and classify portions of the website (Ahern, paragraph 53: “The above allows the algorithm to quickly identify if a site has any evidence of a cart. A positive text match would result in the algorithm proceeding to the next phase”; paragraph 47: “It is fundamental to the algorithm to be able to access the html code of the web pages comprising the website, as regular expressions containing various lexicons pertaining to ecommerce stores are run as part of the process to determine a result.”). 
Look teaches JSON file batch for flattening (Look, paragraph 76: “The presentation service 114 then receives and translates (at step 655) the JSON representation of the associated content to a client representation of the associated content that is understandable by the client interface 132. For example, the presentation service 114 may "flatten" the JSON representation by removing information…”).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have combined Chang and Look such that the JSON representation of the analyzed website is flattened. This would provide a “simplified and translated version of the original JSON representation” (Look, paragraph 76) by removing information from the original JSON representation that is not understandable by the interface.


With regard to claim 3, which depends on claim 1, Chang discloses operating the AAF Decider to determine operative modules; activating a first module to determine term frequency on the web page metadata and the web page content (Chang, paragraph 21: “relevant topic data generated by a topic categorizer in accordance with an embodiment of the present invention, the relevant topic data including a topic, a relevancy score and a frequency count.” Paragraph 87: “The specific data provided within sub-nodes 533 b′ depends on the particular text analytics tool identified in the @id descriptor, but in general, may include data characterizing the extracted features, such as by specifying… feature frequency within the analyzed content (see, for example, “termfreq” label 542)”); activating a second module in parallel with the first module to identify language used in the web page content (Chang, paragraph 86: “The analytical data included within such foundCandidate nodes can be described in terms of “tags”, which refer to particular features or concepts identified in the corpus of extracted text.”); and passing positions of terms and relevance between the terms to the JSON file for flattening (Chang, paragraph 87: “feature location within the analyzed content based on original word position (provided within, for example, a collapsible “offset” label 544)… and feature relevancy score (see, for example, “score” label 548).” Chang, paragraph 71: “the output format to be used to return contextual analysis data 59…”).

With regard to claim 4, which depends on claim 1, Chang discloses wherein the web page content is at least one PDF document from the web page (Chang, paragraph 92: “Feature extraction system 122 can be configured to extract plain text and metadata from incoming content such as webpages, word processing documents, PDF files and other types of content containing formatted and/or unformatted text.”).

With regard to claim 5, which depends on claim 1, Chang discloses inputting a universal resource locator (URL) to a scraper queue; invoking a scraper on the URL; scraping data from the URL to identify other URLs, web page content, and web page metadata; pushing the identified other URLs to the scraper queue (Chang, paragraph 17: “a modular extraction rule that can be used to extract elements such as titles, text, metadata and page links from a webpage in accordance with an embodiment of the present invention”); storing the web page content to the object-based storage; and storing the web page metadata to a metadata store (Chang, paragraph 80: “Output interface 116 is configured to receive relevant topic data 58 generated by text analytics module 140, generate formatted metadata representations of such relevant topic data 58, and communicate the resulting formatted contextual analysis data 59 to one or more of the content administration tools 34 and/or sentiment and behavioral analysis services 200. Such formatted metadata representations can be generated by a reporting and visualization module 116 a, and may be provided with a data structure conforming to a preconfigured output format that is defined and stored in output format repository 118”).



Claims 13 and 15-17 recite limitations substantially similar to those of claims 1 and 3-5, respectively, and are thus rejected under the same rationale.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Shilman (US 20090319342 A1): Teaches parsing web pages with regex parsers.
Stevens (US 10430111 B2): Teaches a scraper for extracting information from websites and determining customer interaction metrics.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRODERICK C ANDERSON, whose telephone number is (313) 446-6566. The examiner can normally be reached Monday-Friday, 9-5 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/B.C.A/ Examiner, Art Unit 2178
/STEPHEN S HONG/ Supervisory Patent Examiner, Art Unit 2178