Notice of Pre-AIA  or AIA  Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

Continued Examination under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on January 27, 2021 has been entered.

Response to Amendment
Applicant’s response to the last office action, filed January 27, 2021 has been entered and made of record. Claims 1-5, 8-9, 11-12, 17, and 21 have been amended; and claim 6 has been cancelled. Claims 1-5, and 7-21 are pending in this application.
In view of Applicant’s amendment, the objection to the specification has been withdrawn.
In view of Applicant’s amendment, the rejection of claims 1-5 and 7-21 under 35 U.S.C. 112(a) has been withdrawn.



Response to Arguments
Applicant’s arguments with respect to claims 1-5, 7-20 have been considered but are moot in view of the new ground(s) of rejection: US-PGPUB 2020/0410678, (based on Prov. Appl. 62/582,092, filed on November 6, 2017).

Specification
The specification is objected to as failing to provide proper antecedent basis for the claimed subject matter. See 37 CFR 1.75(d) (1) and MPEP 608.01(o). Correction of the following is required:
a.	The specification is objected to because it lacks support for the limitation “identifying whether a sequence of words in a sentence belonging to the attribute is interrupted by at least one word not belonging to the attribute”, recited in claim 9.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claim 21 is rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. Claim 21 recites the following limitation: “identifying whether a sequence of words in a sentence belonging to the attribute is interrupted by at least one word not belonging to the attribute”. For the purpose of prior art consideration, the claims in question will be construed as best understood.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 17-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. Claim 17 recites the following limitation: “One or more computer storage devices having computer-executable instructions stored thereon for enhancing product descriptions, which, on execution by a computer, cause the computer to perform operations”. It should be noted that the one or more computer storage devices are merely memories, so the limitation “One or more computer storage devices” is equivalent to “One or more computer storage medium (or media) devices”, and the above limitation is construed as: “One or more computer storage medium (or media) devices having computer-executable instructions stored thereon…”. Further, the specification (see Para. 0078, Applicant’s US-PGPUB 20190311210) recites the following: “Although the computer storage medium (the memory 722) is shown within the computing apparatus 718, it will be appreciated by a person skilled in the art, that the storage may be distributed or located remotely and accessed via a network or other communication link (e.g. using a communication interface 723)”. As shown in Para. 0078 of the specification, the one or more computer storage devices are not limited to physical devices (ROM, RAM, etc.) and include a signal or carrier wave; a “signal”, “carrier wave”, or “transmission medium” is deemed non-statutory.
As a remedy, the Examiner suggests amending claim 17 to recite, for example: “One or more non-transitory computer storage devices having computer-executable instructions stored thereon for enhancing product descriptions, which, on execution by a computer, cause the computer to perform operations”, to be consistent with the guidelines for Subject Matter Eligibility of Computer Readable Media, 1351 OG 212, Feb. 23, 2010.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 5, 7, 12, and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Gokturk et al (US-PGPUB 2012/0117072) in view of Paradise et al, (US-PGPUB 2012/0158482); and further in view of Song et al, (US-PGPUB 2020/0410678, “based on its US Prov. Appl. 62/582,092, filed on November 6, 2017”); and further in view of Kanani et al (US-PGPUB 2016/0321358).

In regards to claim 1, Gokturk discloses a method for enhancing product descriptions, (see at least: Par. 0047), the method comprising:
obtaining at least a first product image associated with the product, the first product image including textual information about the product, (Par. 0022-0023, receives product data (including image data and text data) for a product from the information processing system 100. The text data associated with a product may include the product title or name, text descriptions of the product, text reviews of the product, etc., [i.e., textual information about the product]);
processing the first product image using a machine learning system for attribute tagging to determine one or more attributes associated with the product, (Par. 0023, selecting attributes from the vocabulary 106 that are applicable to that product, using machine-learning, regular expressions, and other techniques. For example, the selected attributes are used to label, [i.e., tag], the products stored in the product data store, [i.e., processing the image using a machine-learning system for attribute tagging to determine one or more attributes associated with the product]. Further, Par. 0038 discloses that the attribute label is associated with the product, [i.e., one or more attributes associated with the product]); and 
storing the one or more attributes associated with the product in a database used to generate an online description of the product, (Par. 0023, the attributes selected by the attribute selection module 103 may be used to label the products stored in the product data store 101, or they may be provided to external services such as search services or retail services, so that such services can label products in their own data stores. Further, Para 0022, discloses that the product data store 101 may be hosted by one or more remote systems, for e.g. on the Internet-connected servers of web retailers, [i.e., online], and may store both image data and text data, such that the text data associated with a product may include the product title or name, text descriptions of the product, text reviews of the product, [which results in generating an online description of the product via the web retailers]. Further, the Abstract, and Par. 0038, disclose that the attribute label is associated with the product, [i.e., the one or more attributes associated with the product]).



Gokturk does not expressly disclose executing a text recognition module to recognize text in the first product image, wherein executing the text recognition module includes executing an attention layer that emphasizes an area of the first product image to receive greater visual attention; and that the textual information being represented by at least one relevant portion of the first product image; using an end-to-end automated machine learning system, (an ML system), wherein the attribute tagging includes assigning a token to each word of recognized text from the textual information, the token being either a starting token of an individual attribute, a continuing token of the individual attribute, or a disregard token if the word is unrelated to the individual attribute, and wherein the individual attribute is at least one of a term sequence or a character sequence.
	However, Paradise discloses that the image acquisition component 115 may further include optical character recognition capabilities, such that product names, brands, serial numbers, product numbers, or other text-based information may be read, digitally represented, [i.e., the textual information being represented by at least one relevant portion of the first product image, and further executing a text recognition module to recognize text in the first product image using optical character recognition], (see at least: Fig. 1, Par. 0033).
Gokturk and Paradise are combinable because they are both concerned with text recognition. Therefore, it would have been obvious to a person of ordinary skill in the art to modify Gokturk to use the image acquisition component 115 including optical character recognition capabilities, as taught by Paradise, in order to capture an image of all
The combined teaching of Gokturk and Paradise as a whole does not expressly disclose wherein executing the text recognition module includes executing an attention layer that emphasizes an area of the first product image to receive greater visual attention; processing the recognized text in the first product using an end-to-end automated machine learning system, (an ML system), wherein the attribute tagging includes assigning a token to each word of recognized text from the textual information, the token being either a starting token of an individual attribute, a continuing token of the individual attribute, or a disregard token if the word is unrelated to the individual attribute, and wherein the individual attribute is at least one of a term sequence or a character sequence.
Song et al discloses wherein executing the text recognition module includes executing an attention layer that emphasizes an area of the first product image to receive greater visual attention, (see at least: Par. 0056, the diagnostic report generating system 100 may determine attention weights. In some embodiments, attention weights may be implemented as numerical values used to quantify the contribution of each image feature of the image in the decision of outputting a specific word in the generated report. For example, an attention weight of a higher value indicates that the corresponding image feature is more important, [i.e., receive greater visual attention based on weighted values of the first product image], (equivalent to Page 5, 1st paragraph, lines 1-7, US Prov. Appl. 62/582,092). Further, Par. 0064 discloses that the attention layer 408 may be constructed by weight matrices that assign different weights to the image features in different regions of medical image 402, (equivalent to Fig. 3, Page 5, 1st paragraph, last following sentence, of US Prov. Appl. 62/582,092: “It is also contemplated that image/region selection can be an automated solution”, [i.e., step 308 of Fig. 3, “the attention layer”]), [i.e., executing an attention layer that emphasizes an area of the first product image to receive greater visual attention]). Song et al further discloses processing the recognized text in the first product using an end-to-end automated machine learning system, (an ML system), (Par. 0023, 0062, the end-to-end diagnosis report generation model 400 may take one or more pre-processed images, e.g., a medical image 402, as input and output the description of the medical image (e.g., a text-based description) together with attention weights for the input image(s), [i.e., processing the recognized text in the first product using an end-to-end automated machine learning system], (equivalent to Page 3, first paragraph, and Page 5, under the section “details of diagnosis report process”, lines 1-3, US Prov. Appl. 62/582,092)).
Gokturk, Paradise, and Song et al are combinable because they are all concerned with text recognition. Using Song’s attention-weight-based text recognition method in Gokturk’s product description method represents a simple substitution of one well-known element, (Gokturk’s machine learning), with another well-known element, (Song’s end-to-end machine learning), to yield predictable results. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Gokturk and Paradise by making this substitution, in order to significantly reduce the amount of time medical professionals need to spend on each patient and help improve the efficiency in diagnosis of diseases, (Song, Par. rd paragraph, last two sentences, US Prov.  Appl. 62/582,092).
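For illustration only, the attention-weight concept relied upon above (a higher weight marks an image region as more important) can be sketched as follows. This is a minimal sketch under stated assumptions, not Song's implementation: the softmax normalization, the function names `softmax` and `attend`, and the sample numbers are all hypothetical and do not appear in the cited reference.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Normalize raw relevance scores into weights that sum to 1 (assumed scheme)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features: List[List[float]],
           scores: List[float]) -> Tuple[List[float], List[float]]:
    """Weight each image-region feature vector by its attention weight.

    Returns the attention weights and the weighted sum of region features.
    Regions with larger weights contribute more to the result.
    """
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical example: three image regions, the second one scored highest.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend(features, scores=[0.1, 2.0, 0.3])
```

In this sketch the second region receives the largest weight, so its features dominate the weighted sum, which is the sense in which a region is "emphasized".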
	However, the combined teaching of Gokturk, Paradise, and Song et al as a whole does not expressly disclose wherein the attribute tagging includes assigning a token to each word of recognized text from the textual information, the token being either a starting token of an individual attribute, a continuing token of the individual attribute, or a disregard token if the word is unrelated to the individual attribute, and wherein the individual attribute is at least one of a term sequence or a character sequence.
Kanani discloses that the attribute tagging includes assigning a token to each word of recognized text from the textual information, wherein the individual attribute is at least one of a term sequence or a character sequence (see at least: Par. 0014, identify an original word that a character originates from within the unstructured text, and use the features of the original word as evidence of an attribute label assigned to the character-based token; and especially Fig. 4, and Par. 0036, which disclose assigning brand name attribute labels, 430A, 430B,….430G, to word-based tokens 420A, 420B,…420G, respectively, [i.e., attribute tagging includes assigning a token to each word of recognized text from the textual information]. Further, Par. 0014 also discloses that the sequence tagging can be character-based sequence tagging, [i.e., the individual attribute is at least one of a term sequence or a character sequence]). Kanani further discloses that the token is either a starting token of an individual attribute, a continuing token of the individual attribute, or a disregard token if the word is unrelated to the individual attribute, (see at least: Par. 0034, using a "BIO representation" to annotate attribute values in text, in which each token label is prefixed with a "B" to indicate the beginning of an entity name, an "I" inside of an entity name, or an "O" to indicate a background token, [that is, the token being either a token label indicating the beginning of an entity name, “starting token of an individual attribute”, or a token label indicating inside of an entity name, “continuing token of the individual attribute”]).
	Gokturk, Paradise, Song et al, and Kanani are combinable because they are all concerned with character sequence recognition. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Gokturk, Paradise, and Song et al to include the character-based attribute value extraction system, as taught by Kanani, in order to extract attribute values from product descriptions found in retail systems, where the unstructured text explicitly, or implicitly, includes the attribute values, (Kanani, Par. 0014).
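For illustration only, the "BIO representation" discussed above (a "B" label for the starting token of an attribute, an "I" label for a continuing token, and an "O" label for a disregarded token) can be sketched as follows. This is a minimal sketch and not Kanani's implementation: the function names `bio_tags` and `decode_spans`, the `BRAND` label, and the sample product text are hypothetical.

```python
from typing import List, Tuple


def bio_tags(words: List[str], attribute_words: List[str], label: str) -> List[str]:
    """Tag the first occurrence of attribute_words with B-/I- labels; all else is O."""
    tags = ["O"] * len(words)
    n = len(attribute_words)
    for start in range(len(words) - n + 1):
        if n and words[start:start + n] == attribute_words:
            tags[start] = f"B-{label}"          # starting token
            for k in range(start + 1, start + n):
                tags[k] = f"I-{label}"          # continuing tokens
            break
    return tags


def decode_spans(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Recover (label, text) attribute values from a BIO-tagged word sequence."""
    spans: List[Tuple[str, str]] = []
    label, current = None, []
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((label, " ".join(current)))
            label, current = tag[2:], [word]
        elif tag.startswith("I-") and label == tag[2:]:
            current.append(word)
        else:  # an O tag (or a stray I- tag) closes any open span
            if current:
                spans.append((label, " ".join(current)))
            label, current = None, []
    if current:
        spans.append((label, " ".join(current)))
    return spans


# Hypothetical recognized text from a product image.
words = ["Acme", "Foods", "organic", "granola", "12", "oz"]
tags = bio_tags(words, ["Acme", "Foods"], "BRAND")
spans = decode_spans(words, tags)
```

Here `tags` is `["B-BRAND", "I-BRAND", "O", "O", "O", "O"]` and `spans` is `[("BRAND", "Acme Foods")]`, showing how a term sequence is marked and then recovered as a single attribute value.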

The following prior art of record also shows end-to-end machine learning text recognition:
Wang et al (“End-to-End Text Recognition with Convolutional Neural Networks”, IEEE 2012), (see at least: Abstract, and Fig. 1)
Jaderberg et al (“Reading Text in the Wild with Convolutional Neural Networks”, Springer, Int. J. Comput. Vis (2016) 116:1–20), (see at least: Abstract)
Baoguang et al (“An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 39, No. 11, November 2017), (see at least: Abstract, and Page 2298, right-hand col., 4th paragraph, integrating DCNN and RNN in an end-to-end manner).

The following prior art of record also shows the “first product image being at least one of a raster graphic or a bitmap graphic, and that the textual information being represented by at least one relevant portion of the first product image”:
US-PGPUB 2018/0189620, (see at least: Par. 0078, 0153-0154)
US-PGPUB 2015/0363625, (see at least: Par. 0069-0072)
US-PGPUB 2014/0156346, (see at least: Par. 0037)

Regarding claim 12, claim 12 recites substantially similar limitations as set forth in claim 1. As such, claim 12 is rejected for at least similar rationale.
The Examiner further acknowledges the following additional limitation(s): “a system”. However, Gokturk et al discloses a “system”, (see at least: Par. 0006, “a computer system”).

Regarding claim 17, claim 17 recites substantially similar limitations as set forth in claim 1. As such, claim 17 is rejected for at least similar rationale.
The Examiner further acknowledges the following additional limitation(s): “computer-readable medium having stored therein instructions which, when executed by a processor”. However, Gokturk et al discloses these features, (see at least: Par. 0006, “a non-transitory computer-readable storage medium storing executable code”, and Par. 0026, “processor 202”).


Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Gokturk, Paradise, Song et al, and Kanani, as applied to claim 1 above; and further in view of Tian et al, (“Detecting Text in Natural Image with Connectionist Text Proposal Network”, September 2016), (provided by Applicant).
The combined teaching of Gokturk, Paradise, Song et al, and Kanani as a whole discloses the limitations of claim 1.
	Furthermore, Song et al discloses wherein the ML system combines a plurality of machine learning techniques in a sequence including at least a first machine learning technique including a CNN, a second machine learning technique including an RNN, and a third machine learning technique including an attribute sequence recognition system, wherein an output of at least one of the plurality of machine learning techniques in the sequence is an input for a next machine learning technique in the sequence, (Song et al, Par. 0023, the end-to-end deep learning model background process may be configured to combine an image processing convolutional neural network (CNN), a natural language processing recurrent neural network (RNN), and an attention process, (an attribute sequence recognition system), [in which the output of at least one of the plurality of machine learning techniques in the sequence is implicitly an input for a next machine learning technique in the sequence]. Note that the attention layer corresponds to the attribute sequence recognition system, “equivalent to Page 3, 1st paragraph, US Prov.  Appl. 62/582,092”). 
The combined teaching of Gokturk, Paradise, Song et al, and Kanani as a whole does not expressly disclose a connectionist text proposal network (CTPN).
However, Tian et al discloses that the CTPN is essentially a fully convolutional network that allows an input image of arbitrary size. It detects a text line by densely sliding a small window in the convolutional feature maps, and outputs a sequence of fine-scale proposals, (Tian et al, 1st paragraph).
Gokturk, Paradise, Song et al, Kanani, and Tian are combinable because they are all concerned with text recognition. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Gokturk, Paradise, Song et al, and Kanani by substituting Song’s RNN with the connectionist text proposal network (CTPN), as taught by Tian et al, in order to detect text lines and accurately localize text lines in natural images (Tian, Abstract).

Claims 3-4, 7-9, and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Gokturk, Paradise, Song et al, Kanani, and Tian et al, as applied to claim 2 above; and further in view of King et al (US-PGPUB 2006/0294094)

In regards to claim 3, the combined teaching of Gokturk, Paradise, Song et al, Kanani, and Tian as a whole discloses the limitations of claim 2.
	Furthermore, Gokturk discloses performing a text detection to locate at least a first text portion in the first product image, (Par. 0022); and performing a text recognition, (see at least: Par. 0043-0044).
st paragraph, US Prov.  Appl. 62/582,092”, (higher weight value means higher importance), [i.e., the region or area of the first product image attention weight of a higher value to receive greater visual attention]).
The combined teaching of Gokturk, Paradise, Song et al, Kanani, and Tian as a whole does not expressly disclose performing the text recognition to determine a text sequence in the first text portion; and determining the one or more attributes regarding the product based, at least in part, on the text sequence.
However, King et al discloses performing a text detection to locate at least a first text portion in the first product image, (see at least: Par. 0060, 0110-0114); performing a text recognition to determine a text sequence in the first text portion, (see at least: Abstract, par. 0053, 0060). King further discloses that the facility receives a text sequence captured by a user, and identifies in the received text sequence a reference to a distinguished product, (see at least: Abstract), such that when the user captures text, the system determines whether the captured text contains a reference to a product that is available for purchase, [i.e., the reference includes at least one attribute regarding the product based, at least in part, on the text sequence captured by a user], (see Para 0510). 


In regards to claim 4, the combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, and King et al as a whole discloses the limitations of claim 3.
Furthermore, Gokturk et al discloses wherein the text detection is performed by a text detection module of the machine learning system, (Gokturk, Par. 0043-0044).
On the other hand, King et al discloses wherein the text recognition is performed by a text recognition module of the machine learning system, and wherein the one or more attributes are determined by an attribute tagging module of the machine learning system, (King, see at least: Par. 0515, “neural network”, and Par. 0053, 0060, 0510).

In regards to claim 7, the combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, and King et al as a whole discloses the limitations of claim 4. 
Furthermore, Gokturk et al discloses training the attribute tagging module to identify words associated with each of a plurality of product attributes, (Gokturk, Par. 0035, in another embodiment, the applicable category may be determined based on the application of text processing techniques such as regular expressions, text classifiers etc. to the text data included in the product data. For example, based on the word "shoe" appearing in the text data, the system may select the "shoe toe type" category as being applicable to the product, [i.e., the word “shoe" is associated with the product attribute “shoe toe type”]. Further, Par. 0019, an information processing system 100 configured to select attributes for a product based on image data and text data associated with the product).

In regards to claim 8, the combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, and King et al as a whole discloses the limitations of claim 4. 
Furthermore, Tian et al discloses wherein the text detection module implements the CTPN, (see at least: Abstract, a novel Connectionist Text Proposal Network (CTPN) that accurately localizes text lines in natural image); and wherein the text recognition module implements the CNN, (Page 7, 2nd paragraph, a text recognition implementing convolutional neural network).
On the other hand, Song et al discloses wherein the attribute tagging module implements the attribute sequence recognition system, (Par. 0023, an attention process, corresponds to an attribute sequence recognition system).

In regards to claim 9, the combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, and King et al as a whole discloses the limitations of claim 8. 
Furthermore, Song et al discloses wherein the attention layer emphasizes the area of the first product image to receive greater visual attention based on weighted values of the first product image, (Song et al, see at least: Par. 0056, the diagnostic report generating system 100 may determine attention weights; for example, an attention weight of a higher value indicates that the corresponding image feature is more important, [i.e., receive greater visual attention based on weighted values of the first product image], (equivalent to Page 5, 1st paragraph, lines 1-7, US Prov.  Appl. 62/582,092); and Par. 0064 discloses that the attention layer 408 may be constructed by weight matrices that assign different weights to the image features in different regions of medical image 402, (equivalent to Fig. 3, Page 5, 1st paragraph, last following sentence, of US Prov.  Appl. 62/582,092: “It is also contemplated that image/region selection can be an automated solution”, [see step 308 of Fig. 3, “the attention layer”]), [i.e., executing an attention layer that emphasizes an area of the first product image to implicitly receive greater visual attention]).

In regards to claim 11, the combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, and King et al as a whole discloses the limitations of claim 8. 
Furthermore, Song et al discloses wherein the attribute sequence recognition system further includes a second attention layer, (Song et al, Par. 0064, the attention layer 408 may be constructed by weight matrices that assign different weights to the image features in different regions of medical image 402, (equivalent to Fig. 3, Page 5, 1st paragraph, last following sentence, of US Prov.  Appl. 62/582,092: “It is also contemplated that image/region selection can be an automated solution”, [see step 308 of Fig. 3, “the attention layer”]), [Note that the second attention layer corresponds to the st paragraph, US Prov.  Appl. 62/582,092”).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Gokturk, Paradise, Song et al, Kanani, Tian et al, and King et al, as applied to claim 8 above; and further in view of Zheng et al (CN107122416).
The combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, and King et al as a whole discloses the limitations of claim 8. 
The combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, and King et al as a whole does not expressly disclose wherein the attribute sequence recognition system includes a bidirectional long short-term memory layer with a conditional random field layer.
Zheng discloses wherein the attribute sequence recognition system includes a bidirectional long short-term memory layer with a conditional random field layer, (see at least: Paragraphs 0019, 0024, 0054).
Gokturk, Paradise, Song et al, Kanani, Tian et al, King et al, and Zheng are combinable because they are all concerned with text recognition. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian et al, and King et al to include the bidirectional long short-term memory layer with a conditional random field layer, as taught by Zheng, in order to obtain context information characterizing each word (Zheng, Abstract).

Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Gokturk, Paradise, Song et al, Kanani, Tian et al, and King et al, as applied to claim 4 above; and further in view of Hsiao et al, (US-PGPUB 2016/0110794); and further in view of Ling et al, (“Topic Detection from Microblogs Using T-LDA and Perplexity”, 2017 24th Asia-Pacific Software Engineering Conference Workshops, IEEE, Pages 71-77).
The combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, and King et al as a whole discloses the limitations of claim 4.
Furthermore, Gokturk et al discloses generating synthetic images that include product-related vocabulary, the product-related vocabulary including product titles, product descriptions, (Gokturk, see at least: Par. 0021, the vocabulary 106 is conceptually the corpus from which the product attributes are selected. The vocabulary 106 includes one or more categories, with each category comprising a plurality of attributes. For example, if the product is a shoe, two attributes that might apply to it may be "high heel" and "square toe". Further, Par. 0023 discloses that the text data associated with a product may include the product title or name, text descriptions of the product, text reviews of the product, etc., [i.e., the product-related vocabulary including product titles, product descriptions]); and training the text recognition module using the synthetic images, (Gokturk, Par. 0044, training text classifiers used by the text classification-based generator, [i.e., text recognition]. The text data from the labeled training set is used to train 503 one or more machine-learned text classifiers for each category in the vocabulary 106).
The combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, and King et al as a whole does not expressly disclose a sample unigram or sample bigram; and wherein training the text recognition module optimizes and reduces perplexity of the text.
Hsiao et al discloses sample unigram or sample bigram, (see at least: Par. 0060, unigram language model may be estimated by treating the image as a sample from an underlying multinomial word distribution and using a maximum likelihood estimator).
Gokturk, Paradise, Song et al, Kanani, Tian, King et al, and Hsiao et al are combinable because they are all concerned with text recognition. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, and King et al, to estimate the unigram language model, as taught by Hsiao et al, with Gokturk’s vocabulary 106, in order to express the degree of similarity between two images using a relevance value determined for the two images using each image's estimated language model, (Hsiao, Par. 0061).
The combined teaching of Gokturk, Paradise, Song et al, Kanani, Tian, King et al, and Hsiao et al as a whole does not expressly disclose wherein training the text recognition module optimizes and reduces perplexity of the text.
Ling et al discloses using a perplexity-K curve to decide the optimal K-value, [i.e., implicitly optimizing perplexity of the text], (Page 73, left-hand column, third paragraph); and selecting a smaller value of perplexity using equation (4), [i.e., implicitly reducing perplexity of the text], (Page 73, right-hand column, Fig. 2, 1st paragraph and 2nd paragraph under section (B)).
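For illustration only, the notion of selecting a K-value by perplexity can be sketched as follows. The sketch assumes the standard definition of perplexity (the exponential of the negative mean log-probability), which may differ in form from equation (4) of Ling et al; the function names and the perplexity values per K are hypothetical.

```python
import math


def perplexity(probabilities):
    """Standard perplexity: exp(-(1/N) * sum(log p_i)) over N word probabilities."""
    n = len(probabilities)
    log_sum = sum(math.log(p) for p in probabilities)
    return math.exp(-log_sum / n)


def pick_lowest_perplexity(perplexity_by_k):
    """Return the K whose measured perplexity is smallest."""
    return min(perplexity_by_k, key=perplexity_by_k.get)


# A model assigning probability 0.25 to each of four words has perplexity 4.
uniform_perplexity = perplexity([0.25, 0.25, 0.25, 0.25])

# Hypothetical points on a perplexity-K curve (K -> perplexity).
curve = {5: 120.0, 10: 95.5, 15: 101.2}
best_k = pick_lowest_perplexity(curve)
```

A lower perplexity indicates that the model assigns higher probability to the observed text, which is why the smallest value on the curve is selected in this simplified sketch.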

Claims 13-14, and 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Gokturk, Paradise, Song et al, and Kanani, as applied to claims 12 and 17 above; and further in view of King et al (US-PGPUB 2006/0294094).

In regards to claim 13, the combined teaching of Gokturk, Paradise, Song et al, and Kanani as a whole discloses the limitations of claim 12.
Furthermore, Gokturk discloses performing a text detection to locate at least a first text portion in the first product image, (Gokturk, Par. 0022); and performing a text recognition, (Gokturk, see at least: Par. 0043-0044). Gokturk et al further discloses that, in another embodiment, the applicable category may be determined based on the application of text processing techniques such as regular expressions, text classifiers, etc. to the text data included in the product data. For example, based on the word "shoe" appearing in the text data, the system may select the "shoe toe type" category, [i.e., text recognition].
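For illustration only, category selection by regular expression of the kind described above can be sketched as follows. The pattern table, the function name, and the sample text are assumptions made for this sketch and do not reproduce any implementation disclosed in Gokturk.

```python
import re

# Hypothetical mapping from a category name to a pattern that triggers it.
CATEGORY_PATTERNS = {
    "shoe toe type": re.compile(r"\bshoes?\b", re.IGNORECASE),
}


def applicable_categories(text):
    """Return the categories whose pattern appears in the product text."""
    return [
        category
        for category, pattern in CATEGORY_PATTERNS.items()
        if pattern.search(text)
    ]


selected = applicable_categories("Leather dress shoe with square toe")
not_selected = applicable_categories("Cotton crew-neck shirt")
```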
The combined teaching of Gokturk, Paradise, Song et al, and Kanani as a whole does not expressly disclose performing the text recognition to determine a text sequence in the first text portion; and determining the one or more attributes regarding the product based, at least in part, on the text sequence.
However, King et al discloses performing a text detection to locate at least a first text portion in the first product image, (see at least: Par. 0060, 0110-0114); and performing a text recognition to determine a text sequence in the first text portion, (see at least: Abstract, Par. 0053, 0060). King further discloses that the facility receives a text sequence captured by a user, and identifies in the received text sequence a reference to a distinguished product, (see at least: Abstract), such that when the user captures text, the system determines whether the captured text contains a reference to a product that is available for purchase, [i.e., the reference includes at least one attribute regarding the product based, at least in part, on the text sequence captured by a user], (see Par. 0510).
Gokturk, Paradise, Song et al, Kanani, and King et al are combinable because they are all concerned with text recognition. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Gokturk, Paradise, Song et al, and Kanani, to identify in the received text sequence a reference, as taught by King et al, in order to determine whether the captured text contains a reference to a product that is available for purchase based at least in part on the text sequence captured by a user, (King et al, Par. 0510).

In regards to claim 14, the combined teaching of Gokturk, Paradise, Song et al, Kanani, and King et al as a whole discloses the limitations of claim 13.
Furthermore, Gokturk et al discloses wherein the text detection is performed by a text detection module of the machine learning system, (Gokturk, Par. 0043-0044).
On the other hand, King et al discloses wherein the text recognition is performed by a text recognition module of the machine learning system, and wherein the one or more attributes are determined by an attribute tagging module of the machine learning system, (King, see at least: Par. 0515, “neural network”, and Par. 0053, 0060, 0510).

Regarding claim 18, claim 18 recites substantially similar limitations as set forth in claim 13. As such, claim 18 is rejected for at least similar rationale.

Regarding claim 19, claim 19 recites substantially similar limitations as set forth in claim 14. As such, claim 19 is rejected for at least similar rationale.

Claims 15-16, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Gokturk, Paradise, Song et al, Kanani, and King et al, as applied to claims 14 and 19 above; and further in view of Zheng et al (CN107122416).

In regards to claim 15, the combined teaching of Gokturk, Paradise, Song et al, Kanani, and King et al as a whole discloses the limitations of claim 14.
The combined teaching of Gokturk, Paradise, Song et al, Kanani, and King et al as a whole does not expressly disclose wherein the text detection module includes a 
However, Zheng et al discloses wherein the text detection module includes a connectionist text proposal network, wherein the text recognition module includes a convolutional neural network, and wherein the attribute tagging module includes an attribute sequence recognition system, the attribute sequence recognition system including a bidirectional long short-term memory layer with a conditional random field layer, (Paragraphs 0024, 0054).
Gokturk, Paradise, Song et al, Kanani, King et al, and Zheng are combinable because they are all concerned with text recognition. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Gokturk, Paradise, Song et al, Kanani, and King et al, to include the bidirectional long short-term memory layer with a conditional random field layer, as taught by Zheng, in order to obtain context information characterizing each word, (Zheng, Abstract).

In regards to claim 16, the combined teaching of Gokturk, Paradise, Song et al, Kanani, King et al, and Zheng as a whole discloses the limitations of claim 15.
Furthermore, Zheng et al discloses wherein the text recognition module further includes a first attention layer, and wherein the attribute sequence recognition system further includes a second attention layer, (Zheng, see at least: Paragraphs 0024, 0054).

Regarding claim 20, claim 20 recites substantially similar limitations as set forth in claim 15. As such, claim 20 is rejected for at least similar rationale.

Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Gokturk, Paradise, Song et al, and Kanani, as applied to claim 1 above; and further in view of Hu et al (US-PGPUB 2010/0185569).
The combined teaching of Gokturk, Paradise, Song et al, and Kanani as a whole discloses the limitations of claim 1.
The combined teaching of Gokturk, Paradise, Song et al, and Kanani as a whole does not expressly disclose identifying whether a sequence of words in a sentence belonging to the attribute is interrupted by at least one word not belonging to the attribute.
However, Hu et al discloses training data 114 (i.e., a grouping of sentences, a customer review, or any text as mentioned above), with each sentence extracted from training data 114 having at least one defined attribute associated therewith that is assigned by human labelers, (see at least: Par. 0014). Each sentence of training data 114 may have more than one attribute associated therewith, [i.e., each sentence of training data 114 may have at least one word within the sentence with a different attribute or not belonging to the same attribute]. Hu et al further discloses using a binary classifier of the AI algorithm module 106, which is trained for each defined attribute of the sentence, and which enables detecting sentences of training data 114 that may have dependent and independent attributes, [i.e., AI algorithm module 106 identifies whether a sequence 
Gokturk, Paradise, Baoguang, Kanani, and Hu et al are combinable because they are all concerned with character sequence recognition. Therefore, it would have been obvious to a person of ordinary skill in the art to modify the combined teaching of Gokturk, Paradise, Baoguang, and Kanani, to include the steps 402 and 404, as taught by Hu et al, in order to identify attributes of any text block, (Hu et al, Par. 0005).
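For illustration only, the limitation at issue in claim 21 (identifying whether a sequence of words belonging to an attribute is interrupted by at least one word not belonging to the attribute) can be sketched as follows. The per-word label representation, the label names, and the function name are assumptions made for this sketch; they are not drawn from Hu et al or from the application.

```python
def is_interrupted(labels, attribute):
    """Return True if the words tagged with `attribute` do not form one contiguous run.

    `labels` holds one (hypothetical) attribute label per word of a sentence.
    """
    positions = [i for i, label in enumerate(labels) if label == attribute]
    if len(positions) < 2:
        # Zero or one tagged word cannot be interrupted.
        return False
    # A contiguous run spans exactly as many positions as it has words.
    span_length = positions[-1] - positions[0] + 1
    return span_length != len(positions)


# Hypothetical labels for "red and white leather shoe".
split_labels = ["COLOR", "O", "COLOR", "MATERIAL", "O"]
contiguous_labels = ["COLOR", "COLOR", "O", "MATERIAL", "O"]

split_result = is_interrupted(split_labels, "COLOR")
contiguous_result = is_interrupted(contiguous_labels, "COLOR")
```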

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMARA ABDI whose telephone number is (571)270-1670.  The examiner can normally be reached on 9:00am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at (571)272-7332.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information 




/AMARA ABDI/
Primary Examiner, Art Unit 2668
02/16/2021