DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
In the manner set forth in MPEP 609.05(b), the Examiner has considered all of the references submitted as part of the Information Disclosure Statement(s), but has not found any to be particularly relevant.  If Applicant is aware of pertinent material in the references, Applicant should so state in a response to this Office action.
Applicant is reminded that MPEP 2004 states:
“It is desirable to avoid the submission of long lists of documents if it can be avoided. Eliminate clearly irrelevant and marginally pertinent cumulative information. If a long list is submitted, highlight those documents which have been specifically brought to applicant’s attention and/or are known to be of most significance. See Penn Yan Boats, Inc. v. Sea Lark Boats, Inc., 359 F. Supp. 948, 175 USPQ 260 (S.D. Fla. 1972), aff’d, 479 F.2d 1338, 178 USPQ 577 (5th Cir. 1973), cert. denied, 414 U.S. 874 (1974). But cf. Molins PLC v. Textron Inc., 48 F.3d 1172, 33 USPQ2d 1823 (Fed. Cir. 1995).”

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-19 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Sam Wood et al. (hereinafter Wood), “Automated Industry Classification with Deep Learning,” IEEE 2017.
Re claim 1, Wood discloses a method of generating industry classifications, the method comprising: constructing a training dataset using a database, wherein the database includes an index tying a plurality of companies with a set of attributes (See section III); representing the plurality of companies using sparse feature vectors; training a classifier on mini batches of data from the dataset (See section III); extracting relevant keywords or keyword phrases for one or more of the plurality of companies from one or more of webpages, titles, meta descriptions, meta keywords, hyperlinks, and visible text (See section III); determining a vocabulary of keywords and keyword phrases relevant in describing the services and product offerings of one or more of the plurality of companies (See section III); generating categories for sparse features of the dataset (See section III); generating labels for the plurality of companies in the dataset (See section III); and training a neural-network model using the labeled dataset. (See section III)

Re claim 2, Wood discloses performing natural language processing on the one or more of the webpages, titles, meta descriptions, meta keywords, hyperlinks, and visible text to extract relevant keywords. (See section III)

Re claim 3, Wood discloses web crawling internet data to acquire the one or more of the webpages, titles, meta descriptions, meta keywords, hyperlinks, and visible text. (See section III)

Re claim 4, Wood discloses determining a frequency at which each keyword or keyword phrase appears within the one or more of the webpages, titles, meta descriptions, meta keywords, hyperlinks, and visible text. (See section III)

Re claim 5, Wood discloses performing a bag-of-words analysis on the keyword or keyword phrases to determine the frequency of each keyword or keyword phrase. (See section III)
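
For illustration only, and not as a characterization of Wood's implementation, the bag-of-words frequency determination recited in claims 4 and 5 can be sketched as follows. The function name, the tokenization rule, and the sample text are hypothetical. The sketch counts single-word keywords only; keyword phrases would additionally require counting n-grams.

```python
import re
from collections import Counter

def keyword_frequencies(text, vocabulary):
    """Count how often each vocabulary keyword occurs in the text.

    Word order is ignored (bag of words); only occurrence counts are kept.
    """
    # Hypothetical tokenization: lowercase alphanumeric runs.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)
    # Counter returns 0 for keywords that never appear.
    return {keyword: counts[keyword] for keyword in vocabulary}

# Hypothetical visible text collected for one company.
sample_text = "Cloud hosting and cloud storage for small business"
freqs = keyword_frequencies(sample_text, ["cloud", "storage", "retail"])
```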

Re claim 6, Wood discloses wherein the training dataset includes a weight for each class based on a ratio of a total number of training examples to a number of training examples for that class. (See section IV)

Re claim 7, Wood discloses wherein the weight causes the training examples of one class to be up-weighted. (See section IV)

Re claim 8, Wood discloses wherein the weight causes the training examples of one class to be down-weighted. (See section IV)
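
For illustration only, the class weighting recited in claims 6 through 8 (a ratio of the total number of training examples to the number of training examples for a class) can be sketched as follows. The function name and the sample labels are hypothetical, and the sketch is not asserted to be the computation used in Wood.

```python
from collections import Counter

def class_weights(labels):
    """Weight each class by (total examples) / (examples in that class)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / n for cls, n in counts.items()}

# Hypothetical, imbalanced set of industry labels.
labels = ["software"] * 6 + ["retail"] * 3 + ["mining"] * 1
weights = class_weights(labels)
```

Under this ratio, a class with few training examples receives a larger weight than a class with many, so examples of the rarer class are weighted up relative to examples of the more common class.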

Re claim 9, Wood discloses wherein the weight is also based on verified and unverified training examples. (See section IV)

Re claim 10, Wood discloses up-weighting the verified training examples. (See section IV)

Re claim 11, Wood discloses down-weighting the unverified training examples. (See section IV)

Re claim 12, Wood discloses updating parameters of the neural-network model with an Adam optimizer with Nesterov momentum. (See section IV)

Re claim 13, Wood discloses inputting a second dataset associated with a company into the neural-network model; and causing the neural-network model to output a classification code associated with the company based on the second dataset. (See section III-V)

Re claim 14, Wood discloses wherein the neural-network model generates a confidence score for the classification code based on the second dataset. (See section V)

Re claim 15, Wood discloses thresholding the confidence score to remove the classification code when the confidence score is below a given value. (See section V)
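
For illustration only, the thresholding recited in claim 15 can be sketched as follows. The function name, the classification codes, the confidence scores, and the threshold value are all hypothetical and are not taken from Wood.

```python
def filter_by_confidence(predictions, threshold):
    """Remove classification codes whose confidence score is below the threshold."""
    return [(code, score) for code, score in predictions if score >= threshold]

# Hypothetical (classification code, confidence score) pairs for one company.
predictions = [("5112", 0.91), ("4431", 0.42), ("1000", 0.75)]
kept = filter_by_confidence(predictions, threshold=0.75)
```

A score equal to the threshold is retained in this sketch, because the claim recites removal only when the score is below the given value.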

Re claim 16, Wood discloses generating a term frequency-inverse document frequency (tf-idf) value, the tf-idf value being a ratio of a frequency with which the keywords and keyword phrases occur in a document or collected information about the one or more companies to a frequency with which the keywords and keyword phrases occur in a plurality of documents or collected information about the one or more companies. (See section III)
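
For illustration only, a tf-idf value of the kind recited in claim 16 can be sketched as follows. The sketch uses the conventional textbook formulation (term frequency multiplied by the logarithm of the inverse document frequency), which is an assumption; it is not asserted to be the exact ratio recited in the claim or the computation used in Wood. The function name and the sample corpus are hypothetical.

```python
import math

def tf_idf(term, doc_tokens, corpus):
    """Conventional tf-idf: in-document frequency scaled by corpus rarity."""
    # Term frequency: share of the document's tokens that are the term.
    tf = doc_tokens.count(term) / len(doc_tokens)
    # Document frequency: number of documents containing the term.
    df = sum(1 for doc in corpus if term in doc)
    # Inverse document frequency; 0.0 when the term appears nowhere.
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

# Hypothetical tokenized text, one list per company.
corpus = [
    ["cloud", "hosting"],
    ["retail", "store"],
    ["cloud", "retail"],
    ["mining", "ore"],
]
rare_score = tf_idf("hosting", corpus[0], corpus)
common_score = tf_idf("cloud", corpus[0], corpus)
```

In this sketch, a keyword that appears in fewer documents ("hosting") scores higher than one that appears in more documents ("cloud"), even though both occur once in the first document.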

Re claim 17, Wood discloses wherein one of the sparse features is the tf-idf value calculated from a pooled set of texts across all sources for the one or more companies. (See section III)

Re claim 18, Wood discloses wherein one of the sparse features is the tf-idf value calculated from homepage title, meta description, meta keywords, or combinations thereof associated with the one or more companies. (See section III)

Re claim 19, Wood discloses wherein one of the sparse features is the tf-idf value calculated from a homepage hyperlink associated with the one or more companies. (See section III)

Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEON FLORES whose telephone number is (571)270-1201. The examiner can normally be reached M-F 7am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at 571-272-8143. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/LEON FLORES/
Primary Examiner, Art Unit 2661
May 2, 2022