United States Patent and Trademark Office

Commissioner for Patents
United States Patent and Trademark Office
P.O. Box 1450
Alexandria, VA 22313-1450
www.uspto.gov










BEFORE THE PATENT TRIAL AND APPEAL BOARD


Application Number: 15/156,119
Filing Date: 16 May 2016
Appellant(s): Timothy F. Jones



__________________
Dominic M. Kotab
For Appellant


EXAMINER’S ANSWER





This is in response to the appeal brief filed 07/28/2021.
(1) Grounds of Rejection to be Reviewed on Appeal
Every ground of rejection set forth in the Office action dated 03/15/2021 from which the appeal is taken is being maintained by the examiner except for the grounds of rejection (if any) listed under the subheading “WITHDRAWN REJECTIONS.”  New grounds of rejection (if any) are provided under the subheading “NEW GROUNDS OF REJECTION.”
	
(2) Response to Argument
● 	Rejection under 35 U.S.C. § 101
Appellant argued: In the present application, the claims require “for each of [a] plurality of identified URLs, conditionally processing the identified URL, by the processor, based on data associated with the identified URL” (see independent Claim 11 - emphasis added), “saving extracted data to an associated digest in [a] selected bucket,” and “updating metadata associated with [an] associated digest in [a] selected bucket” (see Claim 14 - emphasis added), which does not “cove[r] performance of the limitation in the mind,” or “encompas[s] the user evaluating... identified data based on associated data,” as alleged by the Examiner, and clearly does not fall within the aforementioned groupings of abstract ideas.





More specifically, by conditionally processing identified URLs based on associated data, the processing of such URLs may be optimized, and unnecessary processing may be avoided, which may improve a performance of hardware performing such processing. More specifically, URLs may be crawled in an efficient manner (e.g., a more efficient manner than a priority queue, etc.), which may result in more efficient processing, less power usage, and increased performance by the computing device (e.g., see Paragraph [0046] in appellant’s Specification).
Additionally, appellant specifically claims the efficient selection of URLs to be reviewed, utilizing a hash table, and the conditional processing of the URLs, based on associated data. Such language constitutes an improvement to the technical field of URL processing, and also includes a specific limitation other than what is well-understood, routine and conventional in the field of URL processing, and is therefore significantly more than an abstract idea.
In response to the argument, the examiner respectfully submits that:
The limitation of “selecting one of a plurality of buckets within a hash table to be reviewed”, as recited, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. The term “selecting” in the context of this claim encompasses a user paying attention to one of multiple pieces of information in front of him.
The limitation of “identifying a plurality of uniform resource locators (URLs) stored within the selected bucket of the hash table”, as recited, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. The term “identifying” in the context of this claim encompasses a user looking at multiple pieces of information in a certain area in front of him.
The limitation “for each of the plurality of identified URLs, conditionally processing the identified URL based on data associated with the identified URL”, as recited, is a process that, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components. That is, nothing in the claim element precludes the step from practically being performed in the mind. The term “processing” in the context of this claim encompasses a user evaluating identified data based on associated data, or a user selectively considering one piece of information among other pieces of information.
While conditional processing of particular data sets may be beneficial for the processing of those data sets, it has no effect on the functioning of the computing device itself, contrary to appellant's argument. Processing data based on associated data does not improve the performance of the hardware performing such processing. Accordingly, there is no practical application of improving the functioning of a computer.
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. 
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.

●	Rejection under 35 U.S.C. § 103-Claims 11-13, 15-19, 21 and 24	
Claim 11 
Appellant argued: (1) selecting a segment from a sequence of segments (that is not a hash table), as in Zhu, and separately disclosing the general concept of a hash table with buckets (but not the selection of a bucket), as in Wong, does not teach the specific selection of “one of a plurality of buckets within a hash table to be reviewed” (emphasis added), as specifically claimed by appellant;
(2) disclosing a sequence of segments, where each segment includes a plurality of URLs, as in Zhu, and generally disclosing a hash table that contains buckets, as in Wong-645, fails to teach “identifying, by the processor, a plurality of uniform resource locators (URLs) stored within the selected bucket of the hash table” (emphasis added), as claimed by appellant.
In response to the argument, the examiner respectfully submits that:
As shown in the specification at page 12, paragraph [0037], … the hash table may include a data structure storing a plurality of data … In another embodiment, the portion of the hash table may include a bucket within the hash table. For example, a function (e.g., a hash function, etc.) may be used to divide an index of the hash table into a plurality of buckets, where each bucket represents a range of stored data. Reading claim 11 as written and in view of this disclosure, the combined teachings of Zhu and Wong-645 disclose the limitations of claim 11.
Zhu discloses at col. 6, lines 38-46: URL scheduler 202 determines which URLs will be crawled in each epoch, and stores that information in data structure 100. Controller 201 selects a segment 112 from base layer 102 for crawling. The selected segment 112 is referred to herein as the "active segment" (selecting a segment 112 from base layer 102 for crawling is equivalent to “selecting one of a plurality of buckets within a table to be reviewed”; a segment 112 is equivalent to a “bucket”, and the base layer 102 is equivalent to “a table”).
Zhu discloses at col. 5, lines 31-37, a data structure for storing URLs: Referring to FIG. 1, a three-layer data structure 100 is illustrated. Base layer 102 of data structure 100 comprises a sequence of segments 112. In one embodiment, each segment 112 comprises more than two hundred million uniform resource locations (URLs). Together, segments 112 represent a substantial percentage of the addressable URLs in the entire Internet (each segment 112 comprising more than two hundred million URLs is equivalent to “identifying a plurality of uniform resource locators (URLs) stored within the selected bucket”).

Wong-645 discloses “a plurality of buckets within a hash table” at [0050] and FIG. 7: hash table 700 is comprised of multiple buckets such as, for example, URL buckets 710, 720 and 730. Each URL bucket contains entries as described in connection with FIG. 6.
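For illustration only, the three steps at issue in claim 11 (selecting a bucket within a hash table, identifying the URLs stored in that bucket, and conditionally processing each URL based on associated data) can be sketched as follows. This is a minimal sketch under stated assumptions. It is not taken from the application, Zhu, or Wong-645: the bucket count, the SHA-256 hash, the `crawled` flag used as the associated data, and every function name are hypothetical.

```python
import hashlib

NUM_BUCKETS = 4  # hypothetical bucket count, chosen only for the example


def bucket_index(url: str) -> int:
    """Map a URL to a bucket index with a stable hash (assumed hash function)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


def build_table(urls_with_data):
    """Build the hash table as a list of buckets; each bucket maps URL -> data."""
    table = [dict() for _ in range(NUM_BUCKETS)]
    for url, data in urls_with_data:
        table[bucket_index(url)][url] = data
    return table


def review_bucket(table, selected, process):
    """Select one bucket, identify its URLs, and conditionally process each."""
    bucket = table[selected]            # selecting one of the buckets
    identified = list(bucket)           # identifying the URLs stored in it
    processed = []
    for url in identified:
        if not bucket[url]["crawled"]:  # condition on associated data (assumed)
            process(url)
            bucket[url]["crawled"] = True
            processed.append(url)
    return processed
```

In this sketch the caller decides which bucket index to review; URLs whose associated data marks them as already crawled are skipped, which is one possible reading of "conditionally processing".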



Claims 12-13, 15-19, 21 and 24
Appellant does not specifically argue these claims.

●	Rejection under 35 U.S.C. § 103-Claim 14
Claim 14 depends from claim 11 and is rejected on the same rationale as applied to claim 11, in combination with Kataria et al. (U.S. 2011/0093533).

●	Rejection under 35 U.S.C. § 103-Claim 22
Claim 22 depends from claim 11 and is rejected on the same rationale as applied to claim 11, in combination with Yu et al. (U.S. 2013/0046584).

●	Rejection under 35 U.S.C. § 103-Claim 23
Claim 23 depends from claim 11 and is rejected on the same rationale as applied to claim 11, in combination with Wong et al. (U.S. 2008/0104113).

●	Rejection under 35 U.S.C. § 103-Claim 20
Appellant argued: (1) selecting a segment from a sequence of segments (that is not a hash table), as in Zhu, and separately disclosing the general concept of a hash table with buckets (but not the selection of a bucket), as in Wong, does not teach the specific selection of “one of a plurality of buckets within a hash table to be reviewed” (emphasis added), as specifically claimed by appellant;
(2) disclosing a sequence of segments, where each segment includes a plurality of URLs, as in Zhu, and generally disclosing a hash table that contains buckets, as in Wong-645, fails to teach “identifying, by the processor, a plurality of uniform resource locators (URLs) stored within the selected bucket of the hash table” (emphasis added), as claimed by appellant;
(3) the Examiner fails to specifically reject appellant’s claimed “for each of the plurality of identified URLs, processing the identified URL in response to determining that an overall score for the identified URL meets or exceeds a threshold score, where the overall score for the identified URL is determined utilizing a page score for the identified URL and global information associated with all URLs within the hash table” (emphasis added).
In response to the argument, the examiner respectfully submits that:
As shown in the specification at page 12, paragraph [0037], … the hash table may include a data structure storing a plurality of data … In another embodiment, the portion of the hash table may include a bucket within the hash table. For example, a function (e.g., a hash function, etc.) may be used to divide an index of the hash table into a plurality of buckets, where each bucket represents a range of stored data. Reading claim 20 as written and in view of this disclosure, the combined teachings of Zhu, Wong-645, Wong-113 and Papadimitriou disclose the limitations of claim 20.
Zhu discloses at col. 6, lines 38-46: URL scheduler 202 determines which URLs will be crawled in each epoch, and stores that information in data structure 100. Controller 201 selects a segment 112 from base layer 102 for crawling. The selected segment 112 is referred to herein as the "active segment" (selecting a segment 112 from base layer 102 for crawling is equivalent to “selecting one of a plurality of buckets within a table to be reviewed”; a segment 112 is equivalent to a “bucket”, and the base layer 102 is equivalent to “a table”).
Zhu discloses at col. 5, lines 31-37, a data structure for storing URLs: Referring to FIG. 1, a three-layer data structure 100 is illustrated. Base layer 102 of data structure 100 comprises a sequence of segments 112. In one embodiment, each segment 112 comprises more than two hundred million uniform resource locations (URLs). Together, segments 112 represent a substantial percentage of the addressable URLs in the entire Internet (each segment 112 comprising more than two hundred million URLs is equivalent to “identifying a plurality of uniform resource locators (URLs) stored within the selected bucket”).

Wong-645 discloses “a plurality of buckets within a hash table” at [0050] and FIG. 7: hash table 700 is comprised of multiple buckets such as, for example, URL buckets 710, 720 and 730. Each URL bucket contains entries as described in connection with FIG. 6.



Appellant provides no detailed arguments regarding the limitation “for each of the plurality of identified URLs, processing the identified URL in response to determining that an overall score for the identified URL meets or exceeds a threshold score, where the overall score for the identified URL is determined utilizing a page score for the identified URL and global information associated with all URLs within the hash table”. As the Office action dated 03/15/2021 indicated:
Zhu discloses “for each of the plurality of identified URLs, processing the identified URL” at col. 5, lines 38-46: periodically (e.g., daily) one of the segments 112 is deployed for crawling purposes, … Daily crawl layer 104 comprises the URLs that are to be crawled more frequently than the URLs in segments 112.
Zhu discloses “a page score for the identified URL” at col. 7, line 54 to col. 8, line 3: a crawl score is computed for each URL in active segment 112, daily layer 104, and real-time layer 106.
Wong-645 discloses “hash table” at [0050] and FIG. 7: hash table 700 is comprised of multiple buckets.
Wong-113 discloses “where the overall score for the identified URL is determined utilizing … global information associated with all URLs” at [0090]: an overall score can be generated using any combination of the metrics … the overall score is calculated in response to the domain density score, the anchor text score, the URL string score, and the category need score, and the overall score is also influenced by the link proximity score.
Papadimitriou discloses “determining that an overall score for the identified URL meets or exceeds a threshold score” at [0083] a metric or score associated with a dominance of a single URL is calculated… Values of the URL dominance score that are less than 0.5 may indicate a relatively low level of URL dominance, while values of the URL dominance score that are greater than 0.5 may indicate a relatively high level of URL dominance.
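For illustration only, the scoring limitation of claim 20 (an overall score determined from a page score and global information, compared against a threshold score) can be sketched as follows. This is a minimal sketch under stated assumptions. The weighting formula is invented for the example and does not come from the application or from Zhu, Wong-113 or Papadimitriou, and the names `GlobalInfo`, `overall_score` and `should_process` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GlobalInfo:
    """Hypothetical global information covering all URLs in the hash table."""
    pages_crawled: int
    pages_not_crawled: int


def overall_score(page_score: float, info: GlobalInfo) -> float:
    """Combine a page score with global information (assumed weighting)."""
    total = info.pages_crawled + info.pages_not_crawled
    backlog = info.pages_not_crawled / total if total else 0.0
    # Assumption: a larger uncrawled backlog boosts every URL's overall score.
    return page_score * (1.0 + backlog)


def should_process(page_score: float, info: GlobalInfo, threshold: float) -> bool:
    """Process the URL only when its overall score meets or exceeds the threshold."""
    return overall_score(page_score, info) >= threshold
```

With half of the pages still uncrawled, a page score of 0.5 yields an overall score of 0.75 under this assumed weighting, so the URL would be processed at a threshold of 0.75 and skipped at a threshold of 0.8.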

●	Rejection under 35 U.S.C. § 103-Claim 25	
Appellant argued: disclosing a time taken to download a web page, as well as resolving URLs that are not represented on a local DNS database using conventional DNS resources, as in Zhu, generally disclosing a hash table that contains buckets, as in Wong-645, and also disclosing a total number of processed web pages from a domain, and generating different scores for URLs for web pages that have not been downloaded by a crawler, as in Wong-113, fails to teach a technique “wherein... the global information includes a time since a last crawl, a total number of pages crawled within the hash table, a total number of pages not crawled within the hash table, and a percentage of a global crawl goal that is not currently met” (emphasis added), as claimed by appellant.
In response to the argument, the examiner respectfully submits that: 
Zhu discloses at col. 14, lines 18-43: "Time taken to download" 534 provides an indication of how long it took a robot 208 to download the web page associated with the corresponding URL in the last crawl (equivalent to the global information including “a time since a last crawl”).
Zhu further discloses at col. 12, lines 27-55: the use of a local DNS resolution database 250 enables a high percentage of the system's DNS resolution operations to be handled locally, at very high speed. Only those URLs that are not represented on local DNS database 250 (e.g., because they have not been previously crawled) are resolved using conventional DNS resources of the Internet (equivalent to the global information including “a percentage of a global crawl goal that is not currently met”).
Wong-113 discloses at [0107]: the number of processed pages represents a total number of web pages from the domain processed by the web crawling system (equivalent to the global information including “a total number of web pages crawled”).
Wong-113 further discloses at [0003] URLs that identify web pages that have not yet been downloaded by the web crawler (equivalent to the global information including “a total number of web pages not crawled”).
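As a reading aid, the four items of global information recited in claim 25 can be pictured as one record computed over every bucket of the hash table. The following is a minimal sketch under stated assumptions. The record layout, the `crawled` flag, the `crawl_goal` parameter and the function name `global_info` are hypothetical and are not drawn from the application or the cited references.

```python
from dataclasses import dataclass


@dataclass
class CrawlStats:
    """Hypothetical record holding the four recited items of global information."""
    seconds_since_last_crawl: float
    pages_crawled: int
    pages_not_crawled: int
    percent_goal_unmet: float


def global_info(table, last_crawl_time, crawl_goal, now):
    """Compute table-wide statistics; `table` is a list of {url: data} buckets."""
    crawled = sum(
        1 for bucket in table for data in bucket.values() if data["crawled"]
    )
    total = sum(len(bucket) for bucket in table)
    # Assumption: the crawl goal is a target count of crawled pages.
    unmet = max(crawl_goal - crawled, 0)
    percent_unmet = 100.0 * unmet / crawl_goal if crawl_goal else 0.0
    return CrawlStats(now - last_crawl_time, crawled, total - crawled, percent_unmet)
```

Each field is a single value describing the table as a whole, as distinct from a per-URL measurement such as the time taken to download one page.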

●	Rejection under 35 U.S.C. § 103-Claim 26	
Appellant argued: Papadimitriou does not teach “comparing the overall score to a threshold score, where the threshold score is an average overall score for all URLs within the hash table, and processing the URL in response to determining that the overall score exceeds the threshold score” (emphasis added), as claimed by appellant.
More specifically, Papadimitriou only teaches a single URL dominance score that indicates a level of URL dominance, which does not teach an “overall score” or a “threshold score,” much less a comparison of the two scores, or “processing the URL in response to determining that the overall score exceeds the threshold score” (emphasis added), as specifically claimed by appellant.
In response to the argument, the examiner respectfully submits that: 
Papadimitriou discloses at [0083]: for a given topic associated with the query cluster of extended seed query string 418, a metric or score associated with a dominance of a single URL is calculated. In one example, percentile scores of D=0.9 (e.g., high rank score) and C=0.6 (e.g., average click score) may be determined for steps 802 and 804, and a domain importance score of 0.73 may be determined for step 806, indicating a relatively high level of URL dominance. Values of the URL dominance score that are less than 0.5 may indicate a relatively low level of URL dominance, while values of the URL dominance score that are greater than 0.5 may indicate a relatively high level of URL dominance (equivalent to “the threshold score is an average overall score … processing the URL in response to determining that the overall score exceeds the threshold score”).
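For illustration only, the comparison recited in claim 26 (a threshold score equal to the average overall score of all URLs within the hash table, with a URL processed when its overall score exceeds that threshold) can be sketched as follows. This is a minimal sketch under stated assumptions. The table layout (a list of buckets mapping each URL to a precomputed overall score) and the function name `urls_above_average` are hypothetical and do not come from the application or from Papadimitriou.

```python
from statistics import mean


def urls_above_average(table):
    """Return the URLs whose overall score exceeds the table-wide average.

    `table` is a list of buckets; each bucket maps a URL to its overall
    score (an assumed representation).
    """
    scores = {url: score for bucket in table for url, score in bucket.items()}
    if not scores:
        return []
    threshold = mean(scores.values())  # threshold = average overall score
    return sorted(url for url, score in scores.items() if score > threshold)
```

In this sketch the threshold is recomputed from the scores currently stored in the table, so it moves as scores change, unlike a fixed cut-off value.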

For the above reasons, it is believed that the rejections should be sustained.
Respectfully submitted,

/HUAWEN A PENG/
Primary Examiner, Art Unit 2157
October 13, 2021

Conferees:

/James Trujillo/
Supervisory Patent Examiner, Art Unit 2157

/BORIS GORNEY/
Supervisory Patent Examiner, Art Unit 2147

Requirement to pay appeal forwarding fee.  In order to avoid dismissal of the instant appeal in any application or ex parte reexamination proceeding, 37 CFR 41.45 requires payment of an appeal forwarding fee within the time permitted by 37 CFR 41.45(a), unless appellant had timely paid the fee for filing a brief required by 37 CFR 41.20(b) in effect on March 18, 2013.