DETAILED ACTION
Claims 1-20 are pending in this action.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Objections
Claim 1 is objected to for grammar:
Claim 1 requires, in part, “a repository configurable to cause store the collected seeds.”

For purposes of examination, the examiner interprets the limitation as requiring “a repository configurable to store the collected seeds.”

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 1 is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement.  The claim contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.

Claim 1 and all claims dependent therefrom are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 1 requires, in part: adding the additional company information to each of collected seeds to enrich that collected seed to generate an enriched company seed. (emphasis added)

First addressing the 112 2nd rejection for indefiniteness, the bolded limitations refer to a broader limitation followed by a narrower limitation in the same claim.  Specifically, it is unclear whether each collected seed must be enriched, or whether only a single seed of the collective set of collected seeds must be enriched.

Furthermore, regarding the 112 1st rejection for lack of written description support, the addition of the additional company information to each of the collected seeds requires 1



Claim 8 and all claims dependent therefrom (but esp. claim 9) are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 8 provides the limitation of “the additional company information” without an initial recitation of “an additional company information.”  Thus, the claim lacks antecedent basis.  Although the following “wherein” clause further specifies information similar to that obtained in the processing step, that “wherein” clause is open ended (“comprising”).  It is therefore unclear whether antecedent basis is formed from the collection of data from the extractions or whether additional company information is merely inclusive of such elements.

Ex parte Ionescu, 222 USPQ 537 [note: find pincite] (Bd. App. 1984).

Claim 9 is especially noted because it states “additional company information” without any indefinite article.  It is therefore unclear whether this was intended to refer to the “additional company information” first mentioned in claim 8, or whether this was an introduction of an entirely new limitation.  Indefiniteness piled on top of indefiniteness leaves the claim too unclear to support a prior art rejection, which at this point would require the examiner to speculate as to the meaning of the phrase in claim 9.  See In re Steele, 305 F.2d 859, 862 (CCPA 1962).  This special note should not be taken to mean that claim 9 is the only other dependent claim with this issue.


Claim 15 and all claims dependent therefrom are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement.  The claim contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 15 requires “processor-executable instructions encoded on a non-transient processor readable media.”
The written description is replete with the phrase “non-transitory processor readable media,” where the phrase “non-transitory” is the magic phrase specifically indicated in Subject Matter 

Because claims must be rejected without speculation as to what was intended, and the interpretation must be confined to what is actually claimed, the prior art rejection below addresses the mobility (or lack thereof) of the medium.  Steele at 292-93.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claim 8 is rejected under 35 U.S.C. 102(a)(1) and (a)(2) as being anticipated by Stern et al. (US 6,983,282), hereinafter Stern, with incorporated features from 60/221,750, hereinafter ‘750.
With respect to claim 8, Stern discloses a method performed by a seed enricher module for automatically enriching collected seeds, the method comprising: 
receiving the collected seeds, wherein each of the collected seeds comprises: original seed data that includes a plurality of attributes each having a type and an associated value (Col 6 lines 45-
processing each website that is associated with each collected seed, via a web crawler of the seed enricher module (Fig. 2 shows Crawler), by: crawling a home webpage for the company associated with that collected seed (Col 6 lines 63-66, crawler tries to load a home page) to verify, based on similarity between company name and website name, that a website associated with that home page belongs to that company (Col 4 lines 40-52, esp. line 42, information that organizations typically publish on websites they own is their name.  Col 7 lines 40-42: “the invention must maintain and grow a comprehensive database of domain URLs with additional information about each domain.”  The information includes “Domain URL name of owner as identified from the web site (“Organization name”).  Thus, once the crawler first encounters the URL, it is logged as information into the same database that the crawler may use to load a home page in later crawls.  Per Col 8 lines 40-46, and Col 9 lines 5-15, esp. lines 13-14, organization name is used to determine a signature for websites to help find duplicate sites.); and 
when verification is successful: processing other webpages on the website to fetch information (Col 9 lines 30-37, if links on the home page are external, then exploring them is delayed for later.  If internal, they are logged in a queue for visitation) using different extractor algorithms, wherein each extractor algorithm is designed to fetch a specific attribute for that company that corresponds to either missing seed data for that collected seed or other instances of the original 

enriching each collected seed by adding the additional company information to the original seed data for each collected seed to generate an enriched company seed (Col 7 lines 40-53 attests that the database is grown with “additional information about each domain” including owner, site classification (“type”), visiting frequency, last visitation date, whether the last visit was successful or timed-out, domain size, number of data found in last visit), 
wherein the additional company information added to each collected seed comprises: the missing seed data and the other instances of the original seed data that were fetched by the web crawler (as cited above), wherein each enriched company seed comprises: values for each attribute from the original seed data prior to enrichment (e.g. outcome of last visit), one or more websites that are associated with that enriched company seed (e.g. Domain URL), and additional values for attributes that have been extracted from the one or more websites (e.g. Size of domain); and 

validating the missing seed data and the other instances of the original seed data fetched by the web crawler by comparing the missing seed data and the other instances of the original seed data fetched by the web crawler to the original seed data (‘750, Section 1, use of Bayesian networks to determine content owner name.  Col 7 line 53-Col 8 line 7, “content owner” and “organization information” are used interchangeably; in this paragraph, “ABC Corporation” is the exemplary . 

Claim(s) 15-20 is/are rejected under 35 U.S.C. 102(a)(1),(2) as being clearly anticipated by Aakolk et al. (US 2015/0293969 A1) hereinafter Aakolk
Claim 15 is a system claim, where the elements of the system are:
(1) At least one hardware-based processor and at least one memory (per Superguide v. DirecTV Enters., 358 F.3d 870, 885 (Fed. Cir. 2004) (use of conjunctive in list imparts distributive property to phrase “at least one of”)).
(2) the memory comprises processor-executable instructions encoded on “non-transient” processor-readable media.
(3) when executed by the processor, the instructions are configurable to cause various computing functions to occur.

The rest of the independent claim, and each dependent claim therefrom, are directed to the computing functions.

Under the Broadest Reasonable Interpretation of this claim, and under the principles of inherency, any computer system with an Internet connection and an interpreter that executes script code discloses the necessary elements to reject the claim.  This is because interpreters (as opposed to compilers) have the capability to modify the code during execution.

configurable to perform any computable function.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Stern as applied to claim 14, in view of Flyer et al., DFS vs. BFS in web crawler design [closed] (https://stackoverflow.com/questions/20579169/dfs-vs-bfs-in-web-crawler-design), hereinafter Flyer.
With respect to claim 14, Stern does not teach the webcrawler of the seed enricher uses a BFS traversal method to fetch information using the different extractor algorithms.

Flyer teaches a webcrawler using a BFS traversal method (stating to question 1 “In most situations, I will use BFS algorithm to implement a spider.”).  Thus, the combination teaches the 

Stern and Flyer are both directed to web data processing via crawling.  It would have been obvious to those of ordinary skill in the art at the time of filing to combine the teachings of the references for two reasons.  (1) It is obvious to try: there are two general traversal techniques, BFS and DFS, and both ultimately work, so one of ordinary skill could simply select one based on the specific use case the programmer is attempting to solve.  (2) Flyer expressly motivates the use of BFS by noting that the most valuable information on web pages does not have a deep link depth, so it is often more efficient to use a BFS algorithm to obtain the more useful information sooner.
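
For illustration only, the breadth-first traversal discussed above can be sketched as follows.  This is a minimal sketch and is not code from Stern or Flyer; the function name bfs_crawl, the max_depth parameter, and the in-memory link map SITE (which stands in for fetching and parsing real pages) are all hypothetical.  It is offered only to show why a BFS crawler reaches pages with a shallow link depth before deeper ones.

```python
from collections import deque


def bfs_crawl(start, links, max_depth):
    """Visit pages breadth-first; return (page, link depth) pairs in visit order.

    `links` is a hypothetical in-memory map from a page to the pages it
    links to, used here in place of real HTTP fetching and link extraction.
    """
    visited = {start}
    order = []
    queue = deque([(start, 0)])  # FIFO queue is what makes the traversal BFS
    while queue:
        page, depth = queue.popleft()
        order.append((page, depth))
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in visited:
                visited.add(target)
                queue.append((target, depth + 1))
    return order


# Hypothetical site structure, for illustration only.
SITE = {
    "/": ["/about", "/products"],
    "/about": ["/about/team"],
    "/products": ["/products/a"],
    "/about/team": [],
    "/products/a": ["/products/a/specs"],
}

if __name__ == "__main__":
    for page, depth in bfs_crawl("/", SITE, max_depth=2):
        print(depth, page)
```

In this sketch every page at link depth 1 is visited before any page at depth 2, and the depth limit stops the crawl before the depth-3 page.  Replacing popleft() with pop() would turn the queue into a stack and yield a depth-first order instead.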

Remarks
All portions of all references cited in the course of prosecution of this application, in this or any previous office action, are hereby employed in support of the current rejections for clarity and to preserve their viability as evidence upon any future appeal.

“Validation” is linked to Bayesian networks because they are machine learning models.  This tracks [0096], which cites Figs. 5A-B as an exemplary implementation of step 230 (during which, the paragraph states, the validation occurs).  Fig. 5A element 516 shows that a “pre-trained random forest machine learning model” is run on the extracted features; thus, it can be shown that machine learning is used to validate.
Beauregard are all present, have nigh-identical limitations, and only one claim has anomalous wording, the examiner would feel confident in finding that the other claims demonstrate the intended scope despite the typo.  Here, applicants have two independent ostensible-system claims (1, 15).  Also, while the first system claim and the method claim have some overlap, there are also some differences in steps (e.g., the fetching step of system claim 1 does not have an analogue in method claim 8).

That said, the examiner also provides the following citations for various claimed elements so that applicants have an opportunity to understand the state of the art.

Pipelining.  Stern, Col 14 lines 19-25.  This is for the “crawler” while claim 2 discusses the “enrichment tasks” but the tasks being discussed as being pipelined with respect to the fig. 2 flow chart are those that tie into the “enrichment” process per the art-claim mapping.

Fetching: fetching additional company information for each of the collected seeds from a plurality of different web-based sources (Col 8 lines 26-29, one embodiment for “duplicate” websites is to ignore the situation and crawl the new site.); and 

APIs and normalization: Guo et al. (US 2017/0091270 A1) [0055]-[0058], normalization and clustering of data.  [0055] shows that the ingestion platform uses APIs, which may link to 3rd party servers, thus constituting a 3rd party API.  Additionally, the examiner notes that as of 

Organizational profiles v. Organizational information: Examiner notes that “company profile” is only in the preamble of claim 1 as an intended use of the system.  It is of particular note that Stern’s database stores organization name (owner’s name).  Stern does not specify how the database records are keyed.  Stern also does not preclude any field from serving as a key.  This is all currently moot because the “company profile” does not breathe life, meaning and vitality into the claim, but if it were found to do so by future amendment, it might require a single-reference 103 rejection instead.

Claim 1, system limitations:
a plurality of independent seed source services (Fig. 1 element 11, crawler) each being configurable to cause crawling of web pages to collect seeds from different web-based sources (Col 6 lines 49-51, a number of Crawlers that crawl in parallel in different domains); 
a repository (Fig. 1 element 14) configurable to [[cause]] store the collected seeds (Col 7 lines 41-43, database grows with new domains); and 
a seed enricher module (Whatever unspecified program code of Fig. 1 element 11, that implements the Fig. 2 elements performing data normalization) that, when executed by a hardware-based processing system, is configurable to cause: 

The functional limitations generally track the mapping for claim 8 (except, of course, for the indefinite limitation.)


Guo	US 2017/0091270 A1
	Shows normalization of data.
	Also, abstract shows building organization profiles based on internet-accessible data.  

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASON G LIAO whose telephone number is (571)270-3775.  The examiner can normally be reached on M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle, can be reached on 571-272-4241.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.







/JASON G LIAO/
Primary Examiner, Art Unit 2156
12 Feb 21

1 Of note: Because the originally filed claims are part of the disclosure, if applicants desire to capture the immobility of the medium and avoid the rejection, they have the option of amending the written description to set forth what the claim establishes here.