Notice of Pre-AIA or AIA Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

 DETAILED ACTION
2.	Claims 1-15 are present for examination.

Information Disclosure Statement
3.	The information disclosure statement (IDS) filed on 11/30/2021 has been considered by the examiner.

Claim Objections
4.	Claims 2-13 are objected to because of the following informalities:  
Regarding claims 2-13, each of these claims recites the phrase “A method…” in line 1.  However, this phrase should be changed to --The method--.  
Appropriate correction is required.

Claim Rejections - 35 USC § 102
5.	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
6.	The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

7.	Claims 1, 9-11, and 14-15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by U.S. 2015/0161259 (hereinafter Bejerasco).

	Regarding claims 1, 14 and 15, Bejerasco discloses a method performed by an information processing apparatus for classifying subject matter of content in a webpage comprising:
receiving a webpage; extracting content from the webpage; identifying keywords from the extracted content ([0025 and 0034]; fig. 4 as shown below; “In 200, a detection unit 8 detects a listing of web content elements provided by a web search engine on the display 13.  The web search engine searches information on the World Wide Web and presents the search results.  The search results may relate to web pages, images, information and other types of files retrieved by the web search engine”), wherein the keywords are also contained in a taxonomy stored by the information processing apparatus that associates the keywords with categories of subject matter ([0028-0029]; “…However, it is possible that this kind of content and/or these web pages are categorized according to specified predetermined, allowed categories…”); 

[Figure: fig. 4 of Bejerasco (media_image1.png)]

assigning an importance score to the keywords identified from the extracted content; calculating context scores associated with categories or subcategories of subject matter within the taxonomy based on the importance scores of the identified keywords; and classifying the content as being associated with one or more category or subcategory of subject matter based on the context scores ([0034-0035]; “In 308, the isolated web content elements are calculated and analyzed. In an embodiment, the calculation and analysis are based on keywords that are isolated from the title, link and/or summary text of each result of the web search engine results page. The keywords may also be used to apply weights depending on the web content element in which a keyword match is detected…” and “The calculations may be simplistic or may utilize complex algorithms that may involve any data mining and classification techniques or both, depending on the implementation…”).  Bejerasco additionally discloses an information processing apparatus for classifying subject matter of content in a webpage, wherein the information processing apparatus comprises at least one processor and at least one memory; and a non-transitory computer-readable storage medium (fig. 1).

Regarding claim 9, Bejerasco discloses a method further comprising the step of generating the taxonomy, wherein the step of generating the taxonomy comprises a step of automatically extracting keywords from one or more data sources ([0006-0007]; the content categorizations).
Regarding claim 10, Bejerasco discloses a method further comprising a step of displaying keywords extracted from one or more data sources to a user to enable selection of keywords to be added to the taxonomy ([0030, 0068 and 0073]).

Regarding claim 11, Bejerasco discloses a method further comprising automatically populating a taxonomy with keywords and categories or subcategories based on keywords extracted from the one or more data sources ([0030-0031 and 0068]).

Claim Rejections - 35 USC § 103
8.	In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
9.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

10.	Claims 2-3 are rejected under 35 U.S.C. 103 as being unpatentable over Bejerasco in view of U.S. 10,079,876 (hereinafter Chung).

Regarding claim 2, Bejerasco discloses a method wherein the content in the webpage includes text ([0029]).  The reference does not explicitly disclose the method further comprises a step of normalising the text.  However, such feature is well known in the art as disclosed by Chung (col. 19, lns. 3-28) and it would have been obvious for one with ordinary skill in the art to utilize the feature of Chung in the system of Bejerasco in view of the desire to enhance the web page processing system by utilizing the specific scheme resulting in improving the efficiency of categorizing the web content process.  

Regarding claim 3, Bejerasco discloses a method wherein the step of extracting content from the webpage comprises extracting text content from the webpage using one of a plurality of methods for extracting content from the webpage ([0035]).  The reference does not explicitly disclose the method for extracting text content from the webpage being selected from the plurality of methods for extracting content using a machine learning model based on a type of webpage structure.  However, such feature is well known in the art as disclosed by Chung (col. 7, lns. 46-65) and it would have been obvious for one with ordinary skill in the art to utilize the feature of Chung in the system of Bejerasco in view of the desire to enhance the web page categorizing system by utilizing the specific engine process resulting in improving the efficiency of searching the web content process.  

11.	Claims 4-8 are rejected under 35 U.S.C. 103 as being unpatentable over Bejerasco in view of U.S. 2008/0065602 (hereinafter Cragun).

Regarding claim 4, Bejerasco discloses a method wherein the step of identifying keywords using the extracted content comprises extracting keywords ([0025 and 0034]).  The reference does not explicitly disclose the method wherein the page includes video data and the method further comprises separating audio data and visual data from the video and the step of identifying keywords comprises extracting keywords from the audio data using speech recognition and extracting keywords from the visual data using image recognition.  However, such features are well known in the art as disclosed by Cragun ([0054-0055]; utilizing an audio tag and a video tag wherein the video includes both video images and audio sounds) and it would have been obvious for one with ordinary skill in the art to utilize the teachings of Cragun in the system of Bejerasco in view of the desire to enhance the content searching system by utilizing the specific type of information resulting in improving the efficiency of the content extracting scheme.

Regarding claim 5, Bejerasco in view of Cragun discloses a method wherein the step of calculating context scores comprises calculating context scores for the webpage based on importance scores [i.e., importance scores] of keywords, which keywords include keywords extracted from the audio data and keywords extracted from the visual data (Bejerasco: [0034]) and (Cragun: [0024 and 0056]).  Therefore, the limitations of claim 5 are rejected in the analysis of claim 4, and the claim is rejected on that basis.

Regarding claim 6, Bejerasco in view of Cragun discloses a method wherein the context scores are calculated from a weighted combination of importance scores of keywords associated with the visual data and the audio data (Bejerasco: [0025]) and (Cragun: [0058-0059]).  Therefore, the limitations of claim 6 are rejected in the analysis of claim 4, and the claim is rejected on that basis.

Regarding claim 7, Bejerasco in view of Cragun discloses a method wherein keyword importance scores are calculated based on confidence scores [i.e., highest match scores] from at least one of a speech recognition program that is used to extract keywords from the audio data and an image recognition program that is used to extract keywords from the visual data (Bejerasco: [0025]) and (Cragun: [0075-0075]).  Therefore, the limitations of claim 7 are rejected in the analysis of claim 4, and the claim is rejected on that basis.

Regarding claim 8, while Bejerasco discloses a method of extracting keywords ([0034]), the reference does not explicitly disclose the method wherein the webpage includes both video content and text content and wherein instances of keywords extracted from both the video content and the text content are analysed in combination to generate an importance score for the keywords.  However, such features are well known in the art as disclosed by Cragun ([0024 and 0059]) and it would have been obvious for one with ordinary skill in the art to utilize the teachings of Cragun in the system of Bejerasco in view of the desire to enhance the webpage content system by utilizing the weight manipulation scheme resulting in improving the efficiency of the content categorizing scheme.

11.	Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Bejerasco in view of Cragun, and further in view of Chung.

Regarding claim 12, Bejerasco in view of Cragun does not explicitly disclose a method further comprising a step of expanding the content of a data source using a generative transformer, wherein automatically extracting keywords from the one or more data sources comprises automatically extracting keywords from the expanded content of the data source.  However, such feature is well known in the art as disclosed by Chung (col. 7, lns. 46-65; col. 17, lns. 35-67) and it would have been obvious for one with ordinary skill in the art to utilize the feature of Chung in the system of Bejerasco in view of the desire to enhance the web page categorizing system by utilizing the expanding scheme resulting in improving the efficiency of searching the web content process.  

12.	Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Bejerasco in view of U.S. 2017/0161774 (hereinafter Gorsline).

Regarding claim 13, Bejerasco discloses a method further comprising an indication of the subject matter classification ([0043 and 0062]).  The reference does not explicitly disclose the method sending a signal to at least one of a demand-side platform and a supply-side platform within a system for automated placement of digital content within webpages.  However, such feature is well known in the art as disclosed by Gorsline ([0061, 0098 and 0104-0105]) and it would have been obvious for one with ordinary skill in the art to utilize the teachings of Gorsline in the system of Bejerasco in view of the desire to enhance the content processing system by utilizing the specific processing environment resulting in improving the efficiency of the content categorizing scheme.


Conclusion
13.	Any inquiry concerning this communication or earlier communications from the examiner should be directed to MONICA M PYO whose telephone number is (571)272-8192. The examiner can normally be reached Monday-Friday 8am-4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, APU MOFIZ, can be reached at 571-272-4080. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/MONICA M PYO/Primary Examiner, Art Unit 2161