Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1 – 16 are pending.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 08 April 2020 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner; however, Foreign Patent Documents Cite No. 1 was not considered because the reference was not provided with a translation.
The information disclosure statement (IDS) submitted on 12 November 2020 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner; however, Foreign Patent Documents Cite Nos. 1 – 11 were not considered because the references were not provided with translations.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1, 2, 4 – 6, 8, 12 and 14 – 16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by U.S. Patent Application Publication No. 2003/0221162 to Mandayam Andampillai Sridhar (hereinafter referred to as Sridhar).

As to claim 1, Sridhar discloses obtaining webpage source data corresponding to a target webpage (receiving input webpage data, see Sridhar: Para. 0126 – 0128);
identifying at least one key value block from the webpage source data, wherein the key value block comprises at least one key value pair (key value pair obtained from the input webpage, see Sridhar: Para. 0126 – 0128); 
identifying body values corresponding to the at least one key value block from the webpage source data (associated data values are identified, see Sridhar: Para. 0126 – 0128, and data values including body values, see Sridhar: Para. 0151 – 0153 and codes); and 
generating entity relationship data corresponding to the target webpage according to the key value blocks and the body values corresponding to the key value blocks (data values are associated with appropriate nodes within the UDM-based data structure, see Sridhar: Para. 0124 – 0128).

As to claim 2, Sridhar discloses:
performing data analysis on the webpage source data by using a basis analysis tool, to obtain at least one basis key value pair for adding into a collection of key value pairs (the UDM is traversed from the root node associated with a first key value pair until all leaf nodes are traversed and unique keys are associated with each data value, see Sridhar: Para. 0126 – 0128);
performing key value pair extension on the basis key value pairs, to obtain at least one extension key value pair for adding into the collection of key value pairs (leaf nodes are traversed recursively until all leaf node unique keys are associated with values from key value pairs from the input webpage, see Sridhar: Para. 0126 – 0128); and 
combining the key value pairs included in the collection of key value pairs, to obtain at least one key value block (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node, see Sridhar: Para. 0126 – 0128, and combining sub-keys representing nodes, see Sridhar: Para. 0132).





As to claim 4, Sridhar discloses wherein combining the key value pairs included in the collection of key value pairs, to obtain at least one key value block comprises: 
positioning page locations of the key value pairs of the collection of key value pairs in the target webpage (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node at the location of the unique key identifier, see Sridhar: Para. 0126 – 0128); and 
combining at least two key value pairs having continuous page locations into the same key value block (combining sub-keys representing nodes, see Sridhar: Para. 0132).
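
For illustration only, the grouping step recited in claim 4 can be sketched as follows. This is a minimal sketch under stated assumptions, not a characterization of the claimed method or of Sridhar: the integer page position, the `Pair` tuple layout, and the function name `group_contiguous` are hypothetical, and "continuous page locations" is assumed to mean consecutive position indices.

```python
from typing import List, Tuple

# Hypothetical representation: (page_position, key_name, key_value),
# where page_position is an integer index of the pair within the page.
Pair = Tuple[int, str, str]


def group_contiguous(pairs: List[Pair]) -> List[List[Pair]]:
    """Group pairs whose page positions are consecutive into one block."""
    blocks: List[List[Pair]] = []
    for pair in sorted(pairs, key=lambda p: p[0]):
        # Extend the current block when this pair directly follows the
        # last pair already placed in it; otherwise start a new block.
        if blocks and pair[0] == blocks[-1][-1][0] + 1:
            blocks[-1].append(pair)
        else:
            blocks.append([pair])
    return blocks


# Assumed sample data: two adjacent pairs and one isolated pair.
pairs = [
    (3, "Born", "1970"),
    (4, "Nationality", "French"),
    (9, "Website", "example.org"),
]
blocks = group_contiguous(pairs)
```

With the sample data above, the pairs at positions 3 and 4 fall into one block and the pair at position 9 forms its own block.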

As to claim 5, Sridhar discloses after combining the key value pairs included in the collection of key value pairs, to obtain at least one key value block, further comprising: 
filtering the key value pairs included in the at least one key value block according to a key value pair filtering rule (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node, see Sridhar: Para. 0126 – 0128, searching is filtering); and 
filtering the at least one key value block according to a key value block filtering rule (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node, see Sridhar: Para. 0126 – 0128, recursively traversing the nodes of the structure and searching for key value pairs for a unique key of the nodes is a rule).

As to claim 6, Sridhar discloses wherein identifying body values corresponding to the at least one key value block from the webpage source data comprises: 
when a target key value block currently processed is a main key value block, and the webpage source data comprises an entity page node satisfying a first label (determining the instances of a webpage for correct mapping into the structure, see Sridhar: Para. 0120 – 0128); and 
when the target webpage is an entity page, determining text data corresponding to the entity page node as the body value of the target key value block, wherein the main key value block is the key value block having the most key value pairs in the at least one key value block corresponding to the webpage source data (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node, see Sridhar: Para. 0126 – 0128).


As to claim 8, Sridhar discloses wherein identifying body values corresponding to the at least one key value block from the webpage source data comprises: 
matching a key name of the key value pair in a target key value block currently processed with a preset whitelist (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node at the location of the unique key identifier, see Sridhar: Para. 0126 – 0128, the unique key is part of the UDM structure leaf nodes (i.e. a whitelist of unique keys)); 
when determining that the key name of the key value pair in the target key value block matches the preset whitelist, obtaining a key value corresponding to the key name as the body value of the target key value block (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node at the location of the unique key identifier, see Sridhar: Para. 0126 – 0128).
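
For illustration only, the whitelist matching recited in claim 8 can be sketched as follows. This is a minimal sketch under stated assumptions: the dictionary representation of a key value block, the example whitelist entries, and the function name `body_value_from_whitelist` are hypothetical and are not taken from the claims or from Sridhar.

```python
from typing import Dict, Optional, Set


def body_value_from_whitelist(
    block: Dict[str, str], whitelist: Set[str]
) -> Optional[str]:
    """Return the key value of the first key name found in the whitelist.

    Returns None when no key name in the block matches the whitelist.
    """
    for key_name, key_value in block.items():
        if key_name in whitelist:
            return key_value
    return None


# Assumed sample data: a block in which the "Name" key carries the body value.
whitelist = {"Name", "Title"}
block = {"Born": "1970", "Name": "Jane Doe", "Nationality": "French"}
body_value = body_value_from_whitelist(block, whitelist)
```

Here the key name "Name" matches the preset whitelist, so its key value is taken as the body value of the block.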

As to claim 12, Sridhar discloses after identifying the body values corresponding to the at least one key value block from the webpage source data, further comprising: 
filtering the at least one key value block according to the body values corresponding to the at least one key value block by using at least one statistical check template, and/or by using at least one rule check template (generating a new page based on a preconfigured template, see Sridhar: Para. 0114, and key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node at the location of the unique key identifier, see Sridhar: Para. 0126 – 0128).






As to claim 14, Sridhar discloses wherein generating the entity relationship data corresponding to the target webpage according to the key value blocks and the body values corresponding to the key value blocks comprises: 
combining respective key value pairs in the key value block with the body value corresponding to the key value block, to construct triplet data (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node, see Sridhar: Para. 0126 – 0128, and combining sub-keys representing nodes, see Sridhar: Para. 0132); and 
generating the entity relationship data by using a key name in the triplet data as a subject-object relationship value, and using a key value corresponding to the key name as an object value (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node at the location of the unique key identifier, see Sridhar: Para. 0126 – 0128).
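
For illustration only, the triplet construction recited in claim 14 can be sketched as follows. This is a minimal sketch under stated assumptions: it assumes the body value serves as the subject of each triplet, which the claim language quoted above does not state expressly, and the names `Triplet` and `build_triplets` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical triplet layout: (subject, relationship, object).
Triplet = Tuple[str, str, str]


def build_triplets(body_value: str, block: Dict[str, str]) -> List[Triplet]:
    """Combine each key value pair in a block with the block's body value.

    The key name is used as the relationship value and the key value as
    the object value; the body value is assumed to be the subject.
    """
    return [(body_value, key_name, key_value) for key_name, key_value in block.items()]


# Assumed sample data.
triplets = build_triplets("Jane Doe", {"Born": "1970", "Nationality": "French"})
```

Each resulting triplet pairs the body value with one key name and its key value, for example ("Jane Doe", "Born", "1970").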

Claim 15 is rejected using similar rationale to the rejection of claim 1 above.
Claim 16 is rejected using similar rationale to the rejection of claim 1 above.






Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 3, 7 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Sridhar as modified by U.S. Patent No. 10,289,658 issued to Daocheng Chen et al. (hereinafter referred to as Chen).
As to claim 3, Sridhar discloses wherein performing key value pair extension on the basis key value pairs, to obtain at least one extension key value pair for adding into the collection of key value pairs, comprises: 
obtaining a basis node matching the basis key value pair from the webpage source data, obtaining an extension node, and obtaining text data corresponding to the extension node as the extension key value pair; and/or 
obtaining a basis label of a basis node matching the basis key value pair from the webpage source data, determining at least one extension label according to the basis label, searching for an extension node matching the extension label in the webpage source data, and obtaining text data corresponding to the extension node as the extension key value pair (the UDM is traversed recursively from the root node associated with a first key value pair until all leaf nodes are traversed and unique keys are associated with each data value, see Sridhar: Para. 0126 – 0128).
However, Sridhar does not explicitly disclose obtaining a basis xpath of a basis node matching the basis key value pair from the webpage source data, obtaining an extension node having a xpath same as the basis xpath, and obtaining text data corresponding to the extension node as the extension key value pair; and/or obtaining a basis html label of a basis node matching the basis key value pair from the webpage source data, determining at least one extension html label according to the basis html label, searching for an extension node matching the extension html label in the webpage source data, and obtaining text data corresponding to the extension node as the extension key value pair.
Chen teaches obtaining a basis xpath of a basis node matching the basis key value pair from the webpage source data, obtaining an extension node having a xpath same as the basis xpath, and obtaining text data corresponding to the extension node as the extension key value pair; and/or 
obtaining a basis html label of a basis node matching the basis key value pair from the webpage source data, determining at least one extension html label according to the basis html label, searching for an extension node matching the extension html label in the webpage source data, and obtaining text data corresponding to the extension node as the extension key value pair (obtaining HTML and URL for a webpage by applying the CSS styles according to the HTML hierarchy to web page components, see Chen: Col. 3 line 29 – Col. 4 line 14, and web render engine module may then parse, evaluate and execute the DOM tree to obtain the full HTML for the web content, pattern analysis module reads all the HTML content sent from the web render engine module and parses it as a DOM tree, see Chen: Col. 6 line 31 – Col. 7 line 34).
Chen and Sridhar are analogous due to their disclosure of importing, analyzing and matching web page data to a hierarchical structure based on matching webpage data.



As to claim 7, Sridhar discloses wherein identifying body values corresponding to the at least one key value block from the webpage source data comprises: 
searching for a strongly styled node satisfying a second label condition forwards in the webpage source data, according to a page location of a target key value block currently processed in the target webpage (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node at the location of the unique key identifier, see Sridhar: Para. 0126 – 0128); 
when the strongly styled node is found, determining text data corresponding to the strongly styled node as the body value of the target key value block (a data value is stored within the leaf node at the location of the unique key identifier, see Sridhar: Para. 0126 – 0128).
However, Sridhar does not explicitly disclose that a xpath of the strongly styled node is inconsistent with a xpath corresponding to the target key value block.
Chen teaches a xpath of the strongly styled node is inconsistent with a xpath corresponding to the target key value block, determining text data corresponding to the strongly styled node as the body value of the target key value block (fuzzy matching is used when the whole XPath is not matched to the key-value pair to identify popular values for recommendation, see Chen: Col. 12 lines 4 – 58).
Chen and Sridhar are analogous due to their disclosure of importing, analyzing and matching web page data to a hierarchical structure based on matching webpage data.
Therefore, it would have been obvious to modify Sridhar’s use of key value pair analysis with regard to a UDM tree structure with Chen’s use of XPath for matching web data to a hierarchical structure in order to improve web page design.

As to claim 11, Sridhar discloses wherein identifying body values corresponding to the at least one key value block from the webpage source data comprises: 
determining a target site corresponding to the target webpage (generating a new page, see Sridhar: Para. 0114); 
obtaining at least one prestored candidate template corresponding to the target site, and identifying the body value corresponding to a key value block currently processed by using the candidate template (generating a new page based on a preconfigured template, see Sridhar: Para. 0114, and using the UDM structure for generating a new webpage, see Sridhar: Para. 0131 – 0133), 
wherein the candidate template of the target site is generated according to an identifying result of performing key value pair identification on a plurality of webpages of the target site (key value pairs associated with a unique key for a leaf node are searched and a data value is stored within the leaf node at the location of the unique key identifier, see Sridhar: Para. 0126 – 0128 and 0131 – 0133).
However, Sridhar does not explicitly disclose determining a target site corresponding to the target webpage according to a URL of the target webpage.


Chen teaches determining a target site corresponding to the target webpage according to a URL of the target webpage (obtaining HTML and URL for a webpage by applying the CSS styles according to the HTML hierarchy to web page components, see Chen: Col. 3 line 29 – Col. 4 line 14).
Chen and Sridhar are analogous due to their disclosure of importing, analyzing and matching web page data to a hierarchical structure based on matching webpage data.
Therefore, it would have been obvious to modify Sridhar’s use of key value pair analysis with regard to a UDM tree structure with Chen’s use of HTML labeling for matching web data to a hierarchical structure in order to improve web page design.


Allowable Subject Matter
Claims 9 – 10 and 13 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.





Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK E HERSHLEY whose telephone number is (571)270-7774. The examiner can normally be reached M-Th: 9am-7pm; F: 2pm-10pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashish Thomas, can be reached at 571-272-0631. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MARK E HERSHLEY/Primary Examiner, Art Unit 2164