DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

EXAMINER’S AMENDMENT

An examiner’s amendment to the record appears below. Should the changes and/or additions be unacceptable to applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this examiner’s amendment for application S/N 14700683 was given via e-mail followed by an examiner interview with Mr. Barry Goldsmith (Registration No. 39,690) on 2/4/2021.

Claims 1, 11 and 16 are amended. Claims 2, 12 and 17 are canceled. Claims 21, 22 and 23 are new. The application has been amended as follows:

Claim 1 (Amended):
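A non-transitory computer-readable medium having instructions stored thereon that, when executed by a processor, cause the processor to automatically populate fields in a data store with extracted attribute values, the extracting comprising:
receiving data comprising unstructured text from the data store, the unstructured text comprising a plurality of words and forming a description of a product, wherein each word comprises one or more characters;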
tokenizing the unstructured text, one character of unstructured text at a time, into character-level tokens using a character-based model, wherein each character-level token corresponds to only one character of the received unstructured text;
annotating each of the character-level tokens with an individual attribute label or a background noise label using a character-based conditional random field (CRF), wherein the attribute label is determined based at least in part on features from a word that a character-level token originates from within the unstructured text, the background noise label corresponding to text that is not part of a first attribute value for any attribute, wherein each attribute label corresponds to a fixed number of possible corresponding attribute values and each attribute label corresponds to a portion of the description of the product;
annotating a first word of the plurality of words with at least two different attribute labels, wherein the first word comprises a plurality of characters; 
grouping the character-level tokens into one or more text segments based on the attribute labels, wherein a set of character-level tokens that are annotated with an identical attribute label are grouped into a text segment, and wherein the one or more text segments define one or more attribute values and the first word is assigned at least two different attribute values based on the at least two different attribute labels;
normalizing at least one attribute value of the one or more attribute values by providing pairwise entity linking comprising pairing an attribute value of the one or more attribute values with one or more target attribute values, selecting a target attribute value that has a highest probability of matching the attribute value, and replacing the attribute value with the selected target attribute value; and
storing the one or more attribute labels and the one or more attribute values within the data store.
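
For illustration only, the tokenizing and grouping steps recited above can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the label names (SIZE, UNIT, COLOR), the "O" noise label, and the function names are hypothetical, the labels are written by hand in place of the output of a character-based CRF, and only consecutive runs of identically labeled characters are grouped.

```python
from itertools import groupby

NOISE = "O"  # hypothetical name for the background noise label


def tokenize_chars(text):
    """Split text into character-level tokens, one token per character."""
    return list(text)


def group_segments(tokens, labels):
    """Group consecutive tokens that carry an identical attribute label.

    Returns (label, value) pairs; runs labeled as background noise are dropped.
    """
    if len(tokens) != len(labels):
        raise ValueError("each token needs exactly one label")
    segments = []
    for label, run in groupby(zip(tokens, labels), key=lambda pair: pair[1]):
        if label == NOISE:
            continue
        segments.append((label, "".join(ch for ch, _ in run)))
    return segments


text = "16GB red"
tokens = tokenize_chars(text)
# Hand-written labels standing in for CRF output (assumption).
labels = ["SIZE", "SIZE", "UNIT", "UNIT", "O", "COLOR", "COLOR", "COLOR"]
segments = group_segments(tokens, labels)
```

In this sketch the single word "16GB" carries two different attribute labels, so it yields two attribute values ("16" and "GB"), which a word-level tokenizer could not separate.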

Claim 2, Canceled.

Claim 11 (Amended):
A computer-implemented method for automatically populating fields in a data store with extracted attribute values, the computer-implemented method comprising:
receiving data comprising unstructured text from the data store, the unstructured text comprising a plurality of words and forming a description of a product, wherein each word comprises one or more characters;
tokenizing the unstructured text, one character of unstructured text at a time, into character-level tokens using a character-based model, wherein each character-level token corresponds to only one character of the received unstructured text;
annotating each of the character-level tokens with an individual attribute label or a background noise label using a character-based conditional random field (CRF), wherein the attribute label is determined based at least in part on features from a word that a character-level token originates from within the unstructured text, the background noise label corresponding to text that is not part of a first attribute value for any attribute, wherein each attribute label corresponds to a fixed number of possible corresponding attribute values and each attribute label corresponds to a portion of the description of the product;
annotating a first word of the plurality of words with at least two different attribute labels, wherein the first word comprises a plurality of characters; 
grouping the character-level tokens into one or more text segments based on the attribute labels, wherein a set of character-level tokens that are annotated with an identical attribute label are grouped into a text segment, and wherein the one or more text segments define one or more attribute values and the first word is assigned at least two different attribute values based on the at least two different attribute labels;
normalizing at least one attribute value of the one or more attribute values by providing pairwise entity linking comprising pairing an attribute value of the one or more attribute values with one or more target attribute values, selecting a target attribute value that has a highest probability of matching the attribute value, and replacing the attribute value with the selected target attribute value; and
storing the one or more attribute labels and the one or more attribute values within the data store.

Claim 12, Canceled.

Claim 16 (Amended):
A system for automatically populating fields in a data store with extracted attribute values, the system comprising:
a non-transitory computer-readable medium having instructions stored thereon; and

receiving data comprising unstructured text from the data store, the unstructured text comprising a plurality of words and forming a description of a product, wherein each word comprises one or more characters;
tokenizing the unstructured text, one character of unstructured text at a time, into character-level tokens using a character-based model, wherein each character-level token corresponds to only one character of the received unstructured text;
annotating each of the character-level tokens with an individual attribute label or a background noise label using a character-based conditional random field (CRF), wherein the attribute label is determined based at least in part on features from a word that a character-level token originates from within the unstructured text, the background noise label corresponding to text that is not part of an attribute value for any attribute, wherein a first attribute label corresponds to a fixed number of possible corresponding attribute values and each attribute label corresponds to a portion of the description of the product;
annotating a first word of the plurality of words with at least two different attribute labels, wherein the first word comprises a plurality of characters; 
grouping the character-level tokens into one or more text segments based on the attribute labels, wherein a set of character-level tokens that are annotated with an identical attribute label are grouped into a text segment, and wherein the one or more text segments define one or more attribute values and the first word is assigned at least two different attribute values based on the at least two different attribute labels;
normalizing at least one attribute value of the one or more attribute values by providing pairwise entity linking comprising pairing an attribute value of the one or more attribute values with one or more target attribute values, selecting a target attribute value that has a highest probability of matching the attribute value, and replacing the attribute value with the selected target attribute value; and
storing the one or more attribute labels and the one or more attribute values within the data store.

Claim 17, Canceled.

Claim 21 (New): 
The method of claim 11, further comprising:
receiving one or more pre-defined attribute values;
annotating one or more characters of the unstructured text with one or more attribute labels by matching the one or more pre-defined attribute values with one or more text segments of the unstructured text; and
replacing at least one attribute label that is annotated for at least one character with at least one new attribute label in response to a user interaction.
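
For illustration only, the matching and relabeling steps of this claim can be sketched as follows. This is a hedged sketch, not the applicant's implementation: the function names, the "O" noise label, the label names, and the use of exact, case-sensitive substring matching are all assumptions, and the user interaction is reduced to a plain function call.

```python
def annotate_by_dictionary(text, predefined, noise="O"):
    """Label each character by exact matching against pre-defined values.

    `predefined` maps a hypothetical attribute label to its known values.
    Characters that match no value keep the noise label.
    """
    labels = [noise] * len(text)
    for label, values in predefined.items():
        for value in values:
            if not value:
                continue  # an empty value would match everywhere
            start = text.find(value)
            while start != -1:
                for i in range(start, start + len(value)):
                    labels[i] = label
                start = text.find(value, start + 1)
    return labels


def relabel(labels, start, end, new_label):
    """Return a copy of labels with positions start..end-1 set to new_label.

    Stands in for a correction made through a user interaction.
    """
    if not 0 <= start <= end <= len(labels):
        raise ValueError("span is outside the annotated text")
    updated = list(labels)
    updated[start:end] = [new_label] * (end - start)
    return updated


text = "red 16GB"
auto = annotate_by_dictionary(text, {"COLOR": ["red"], "UNIT": ["GB"]})
# A reviewer marks the characters "16" (positions 4 and 5) as SIZE.
corrected = relabel(auto, 4, 6, "SIZE")
```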

Claim 22 (New): 
The system of claim 16, further comprising:
receiving one or more pre-defined attribute values;
annotating one or more characters of the unstructured text with one or more attribute labels by matching the one or more pre-defined attribute values with one or more text segments of the unstructured text; and
replacing at least one attribute label that is annotated for at least one character with at least one new attribute label in response to a user interaction.

Claim 23 (New): 
The method of claim 11, wherein annotations of the character-level tokens are represented as an annotation string comprising a visualization of the attribute labels, further comprising displaying the annotation string with the unstructured text. 

Allowable Subject Matter

Claims 1, 3-11, 13-16 and 18-23, submitted on August 17, 2020 and further amended on 2/8/2021 by examiner’s amendment, are allowed.

Claim 1 is allowed. The following is a statement of reasons for the indication of allowable subject matter:  
The prior art of record neither anticipates nor renders obvious the combination of claimed elements recited in independent claim 1.
Risvik, Zhao, Wasson and Newsted teach extraction of attributes by tokenizing unstructured text, but the prior art of record does not specifically suggest the combination of “normalizing at least one attribute value of the one or more attribute values by providing pairwise entity linking comprising pairing an attribute value of the one or more attribute values with one or more target attribute values, selecting a target attribute value that has a highest probability of matching the attribute value, and replacing the attribute value with the selected target attribute value” with all the other limitations recited in independent claim 1.
These features, together with the other limitations of the independent claim, are novel and non-obvious over the prior art of record; therefore, claim 1 is allowed.
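
For illustration only, the quoted normalizing limitation can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names and target values are hypothetical, and a string-similarity ratio from Python's standard difflib module stands in for the matching probability, which the record does not define.

```python
from difflib import SequenceMatcher


def match_score(value, target):
    """Similarity in [0, 1], used here as a stand-in for a match probability."""
    return SequenceMatcher(None, value.lower(), target.lower()).ratio()


def normalize(value, targets):
    """Pair value with every target and return the best-scoring target.

    With no targets to pair against, the value is returned unchanged.
    """
    if not targets:
        return value
    return max(targets, key=lambda target: match_score(value, target))


# Hypothetical target attribute values for a storage-unit attribute.
targets = ["gigabyte", "megabyte", "terabyte"]
normalized = normalize("gigabytes", targets)
```

Each extracted value is paired with every target value, the highest-scoring pair is kept, and the extracted value is replaced by that target, so spelling variants collapse to a single stored form.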

Claim 11 is allowed. The following is a statement of reasons for the indication of allowable subject matter:  
The prior art of record neither anticipates nor renders obvious the combination of claimed elements recited in independent claim 11.
Risvik, Zhao, Wasson and Newsted teach extraction of attributes by tokenizing unstructured text, but the prior art of record does not specifically suggest the combination of “normalizing at least one attribute value of the one or more attribute values by providing pairwise entity linking comprising pairing an attribute value of the one or more attribute values with one or more target attribute values, selecting a target attribute value that has a highest probability of matching the attribute value, and replacing the attribute value with the selected target attribute value” with all the other limitations recited in independent claim 11.
These features, together with the other limitations of the independent claim, are novel and non-obvious over the prior art of record; therefore, claim 11 is allowed.

Claim 16 is allowed. The following is a statement of reasons for the indication of allowable subject matter:  
The prior art of record neither anticipates nor renders obvious the combination of claimed elements recited in independent claim 16.
Risvik, Zhao, Wasson and Newsted teach extraction of attributes by tokenizing unstructured text, but the prior art of record does not specifically suggest the combination of “normalizing at least one attribute value of the one or more attribute values by providing pairwise entity linking comprising pairing an attribute value of the one or more attribute values with one or more target attribute values, selecting a target attribute value that has a highest probability of matching the attribute value, and replacing the attribute value with the selected target attribute value” with all the other limitations recited in independent claim 16.
These features, together with the other limitations of the independent claim, are novel and non-obvious over the prior art of record; therefore, claim 16 is allowed.

Dependent claims 3-10, 13-15 and 18-23, being definite, enabled by the specification, and further limiting of the independent claims, are also allowed.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance”.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Abdullah Daud, whose telephone number is 469-295-9283. The examiner can normally be reached from 7:30 am to 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ashish Thomas, can be reached at 571-272-0631. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/A. D./
/ASHISH THOMAS/Supervisory Patent Examiner, Art Unit 2164