Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 11, 12 and 15 are cancelled. Claims 1-10, 13-14 and 16-27 are pending.
The following rejections are withdrawn in view of applicant’s amendments:
Claims 1-3, 22 and 23 rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018).
Claim 4 rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Grams (US Application: US 2017/0220859, published: Aug. 3, 2017, filed: Jan. 29, 2016).
Claims 6 and 7 rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Grams (US Application: US 2017/0220859, published: Aug. 3, 2017, filed: Jan. 29, 2016) in view of Prasad et al (US Application: US 2020/0311410, published: Oct. 1, 2020, filed: Mar. 5, 2020, foreign priority: Mar. 28, 2019) and further in view of Jannssen (US Application: US 2014/0064618, published: Mar. 6, 2014, filed: Aug. 29, 2012).
Claim 8 rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Jannssen (US Application: US 2014/0064618, published: Mar. 6, 2014, filed: Aug. 29, 2012).
Claims 9 and 10 rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Ruzon et al (US Patent: 7970213, issued: Jun. 28, 2011, filed: May 21, 2007).
Claim 17 rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) and further in view of Alam (US Patent: 6336124, issued: Jan. 1, 2002, filed: Jul. 7, 1999).
Claim 18 rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Ng (US Patent: 6405175, issued: Jun. 11, 2002, filed: Jul. 27, 1999).
Claims 11, 19, 20, 21 and 24-27 rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) and further in view of Meunier (US Application: US 2006/0271847, published: Nov. 30, 2006, filed: May 26, 2005).
Claim 12 rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Meunier (US Application: US 2006/0271847, published: Nov. 30, 2006, filed: May 26, 2005) and further in view of Cooperman (US Patent: 5784487, issued: Jul. 21, 1998, filed: May 23, 1996).

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 06/14/2022 has been entered.
 
Allowable Subject Matter
Claims 5-7, 13, 14 and 16 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3 and 20-27 are rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Stolin (US Patent: 6175844, issued: Jan. 16, 2001, filed: Dec. 30, 1997).

With regards to claim 1, Dejean et al teaches one or more non-transitory computer readable media comprising instructions which, when executed by one or more hardware processors, causes performance of operations comprising: 

identifying a plurality of phrases rendered in an electronic document (Fig 12, column 7, lines 47-54: a plurality of phrases/words/labels are identified within the document); 

identifying a first phrase, of the plurality of phrases rendered in the electronic document, that comprises a digit (column 7, lines 32-42: “To identify and tag values, any string with at least one digit is tagged as a value”); 

determining (a) that a second phrase, of the plurality of phrases rendered in the electronic document, does not include any digit (Fig 10, column 7, lines 33-44, column 8, lines 13-33: Dejean et al explains: “a list of predefined terms provided by lexical resources 810 is used in order to identify and tag label candidates”. Dejean also explains that the second phrase includes a label such as ‘Bill To Code’, and that a first phrase can be a value phrase that includes at least a digit);

responsive at least to determining (a) …, assigning both the first phrase and the second phrase to a same group in a plurality of groups (column 7, lines 33-47, column 9, lines 44-50, tables 2 and 4: the second phrase, paired with the first phrase, is stored/extracted);

storing, transmitting or presenting information identifying members of each group of the plurality of groups (column 9, lines 44-50, table 4: the second phrase, paired with the first phrase, is stored/extracted in the form of Table 4. Each pairing is a group, and each group among the plurality of groups includes an identifying string/label value).
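For illustration only, the digit test discussed above can be sketched as follows. This is a minimal sketch and not the implementation of Dejean et al or of the claimed invention: the function names and tag strings are hypothetical, and the lexical-resource lookup that Dejean et al describes for label candidates is not modeled.

```python
def contains_digit(phrase: str) -> bool:
    """Return True if the phrase has at least one digit character."""
    return any(ch.isdigit() for ch in phrase)


def tag_phrase(phrase: str) -> str:
    # Hypothetical tag names. A string with at least one digit is tagged as a
    # value; anything else is only a label candidate in this sketch.
    return "value" if contains_digit(phrase) else "label_candidate"


def pair_by_digit_test(first: str, second: str):
    """Group two phrases only when the first has a digit and the second has none."""
    if contains_digit(first) and not contains_digit(second):
        return {"label": second, "value": first}
    return None
```

Under these assumptions, pairing the value phrase "4471" with the label phrase "Bill To Code" yields one group, while two digit-free phrases yield no group.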

However Dejean et al does not expressly teach determining (b) that a first set of x and y coordinates associated with the first phrase as rendered within the electronic document is within a threshold Cartesian distance of a second set of x and y coordinates associated with the second phrase as rendered within the electronic document; … responsive at least to determining (a) and (b), assigning both the first phrase and the second phrase to a same group …; determining an order for the groups in the plurality of groups based at least on a cartesian distance between pairs of groups in the plurality of groups; … based on the order determined for the plurality of groups; and storing, transmitting, or presenting the groups in the plurality of groups based on the order determined for the groups.

Yet Duta teaches within a threshold distance of the first phrase; responsive at least to determining (a) and (b), assigning both the first phrase and the second phrase to a same group (paragraphs 0043 and 0044: the first and second phrase can be checked to determine if they are within a threshold predefined distance of each other to be paired as a key and value phrase. The second phrase can be ‘Amount:’ (that ends in a symbol of ‘:’) which is a left neighbor of the first phrase having a value of $100.00).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified Dejean et al’s ability to store each group/pair of Label (second phrase) and value (first phrase) in a plurality of pairs/groupings, such that the first and second phrases are checked to be within a predefined threshold distance of each other to be grouped/paired, as taught by Duta. The combination would have allowed Dejean et al to have optimally extracted key value pairs even when there are different types of forms (Duta, paragraph 0001).

However the combination does not expressly teach determining (b) that a first set of x and y coordinates associated with the first phrase as rendered within the electronic document is within a threshold Cartesian distance of a second set of x and y coordinates associated with the second phrase as rendered within the electronic document; … determining an order for the groups in the plurality of groups based at least on a cartesian distance between pairs of groups in the plurality of groups; … based on the order determined for the plurality of groups; and storing, transmitting, or presenting the groups in the plurality of groups based on the order determined for the groups.

Yet Foncubierta Rodriguez et al teaches determining (b) that a first set of x and y coordinates associated with the first phrase as rendered within the electronic document is within a threshold Cartesian distance of a second set of x and y coordinates associated with the second phrase as rendered within the electronic document (paragraphs 0050 and 0057: two phrases that are encompassed within boxes are also within a cartesian x,y coordinate system, and the borders of the boxes (which, as disclosed and known, include corners) of each of the two phrases are compared to determine whether to group them together as ‘corresponding’).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified Dejean and Duta’s ability to group together two phrases based upon the distance criteria/threshold between bounding boxes, such that the distance check would have measured the closest distance between boxes (that have corners) as part of the criteria to group the phrases together, as taught by Foncubierta Rodriguez et al. The combination would have allowed Dejean and Duta to have been able to extract information from forms even if the forms happen to change (Foncubierta Rodriguez et al, paragraph 0002). 
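For illustration only, a closest-corner distance check of the kind discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the claims or from any cited reference: the box layout (x_min, y_min, x_max, y_max), the function names, and the threshold value are all hypothetical.

```python
from itertools import product
from math import hypot

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in page units.


def corners(box):
    """Return the four corner points of an axis-aligned bounding box."""
    x0, y0, x1, y1 = box
    return [(x0, y0), (x0, y1), (x1, y0), (x1, y1)]


def closest_corner_distance(box_a, box_b):
    """Smallest Cartesian distance between any corner of box_a and any corner of box_b."""
    return min(
        hypot(ax - bx, ay - by)
        for (ax, ay), (bx, by) in product(corners(box_a), corners(box_b))
    )


def within_threshold(box_a, box_b, threshold):
    """True when the closest corners of the two boxes are within the threshold."""
    return closest_corner_distance(box_a, box_b) <= threshold
```

For example, with a label box (0, 0, 50, 10) and a value box (53, 0, 90, 10), the closest corners are (50, 0) and (53, 0), giving a distance of 3.0, so a hypothetical threshold of 5.0 would be satisfied.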

However the combination does not expressly teach … determining an order for the groups in the plurality of groups based at least on a cartesian distance between pairs of groups in the plurality of groups; … based on the order determined for the plurality of groups; and storing, transmitting, or presenting the groups in the plurality of groups based on the order determined for the groups.

Yet Stolin teaches … determining an order for the groups in the plurality of groups based at least on a cartesian distance between pairs of groups in the plurality of groups; … based on the order determined for the plurality of groups; and storing, transmitting, or presenting the groups in the plurality of groups based on the order determined for the groups (Fig. 3, column 3, lines 1-52: based upon calculated distance between blocks (where each block can include an already paired combining of text), an order for the blocks can be determined and saved/represented via a directed graph).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified Dejean, Duta and Foncubierta Rodriguez et al’s ability to determine a cartesian distance between groups of paired text and assign the phrases to a same group based upon a threshold, such that order can be represented/saved based upon at least the distance between pairs of the groups, as taught by Stolin. The combination would have allowed Dejean, Duta and Foncubierta Rodriguez et al to have “grouped text into proper reading order” (Stolin, column 1, lines 28-30).
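For illustration only, one way an order could be derived from Cartesian distances between groups is sketched below. The greedy nearest-neighbor chain is an assumption chosen for brevity; it is not the directed-graph method of Stolin and not the claimed method. The function name, the use of a single (x, y) anchor point per group, and the top-left starting rule are likewise hypothetical.

```python
from math import hypot


def order_groups(anchors):
    """Order groups by repeatedly stepping to the nearest unvisited group.

    anchors: dict mapping a group name to a hypothetical (x, y) anchor point,
    with y increasing down the page.
    """
    if not anchors:
        return []
    remaining = dict(anchors)
    # Assumed starting rule: top-most group, ties broken by left-most.
    current = min(remaining, key=lambda n: (remaining[n][1], remaining[n][0]))
    order = [current]
    cx, cy = remaining.pop(current)
    while remaining:
        nearest = min(
            remaining,
            key=lambda n: hypot(remaining[n][0] - cx, remaining[n][1] - cy),
        )
        order.append(nearest)
        cx, cy = remaining.pop(nearest)
    return order
```

The resulting list could then be stored, transmitted, or presented in that order; other ordering rules based on the same pairwise distances are equally possible.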

With regards to claim 2, which depends on claim 1, Dejean et al and Duta teach wherein the assigning operation is further responsive to determining that the second phrase ends in an association symbol, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

With regards to claim 3, which depends on claim 2, Dejean et al and Duta teach wherein the association symbol comprises a colon, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

With regards to claim 20, Dejean teaches a method comprising:

identifying a plurality of phrases rendered in an electronic document, wherein text corresponding to each phrase, in the plurality of phrases, are on a same text line in the first electronic document (Fig 12, column 7, lines 32-54: a plurality of phrases/words/labels are identified within the document such that a label phrase can be on the same line as value phrase among a plurality of phrases in the document);
 
identifying a first phrase, of the plurality of phrases rendered in the electronic document, that comprises a digit (column 7, lines 32-42: “To identify and tag values, any string with at least one digit is tagged as a value”); 

determining (a) that a second phrase, of the plurality of phrases rendered in the electronic document, does not include any digit; … (Fig 10, column 7, lines 33-44, column 8, lines 13-33: Dejean et al explains: “a list of predefined terms provided by lexical resources 810 is used in order to identify and tag label candidates”. Dejean also explains that the second phrase includes a label such as ‘Bill To Code’, while the first phrase can be a value phrase that includes at least a digit);

responsive at least to determining (a) …, assigning both the first phrase and the second phrase to a same group in a plurality of groups (column 7, lines 33-47, column 9, lines 44-50, tables 2 and 4: the second phrase, paired with the first phrase, is stored/extracted);

storing, transmitting, or presenting information identifying members of each group of the plurality of groups (column 9, lines 44-50, table 4: the second phrase, paired with the first phrase, is stored/extracted in the form of Table 4. Each pairing is a group, and each group among the plurality of groups includes an identifying string/label value);

… , wherein the method is executed by at least one device including a hardware processor (Fig 13: a hardware computing processor is implemented).

However Dejean et al does not expressly teach determining (b) that a first set of x and y coordinates associated with the first phrase as rendered within the electronic document is within a threshold Cartesian distance of a second set of x and y coordinates associated with the second phrase as rendered within the electronic document; … responsive at least to determining (a) and (b), assigning both the first phrase and the second phrase to a same group …; determining an order for the groups in the plurality of groups based at least on cartesian distances between pairs of groups in the plurality of groups; storing or presenting the groups, in the plurality of groups, based on the order determined for the groups.

Yet Duta teaches within a threshold distance of the first phrase; responsive at least to determining (a) and (b), assigning both the first phrase and the second phrase to a same group (paragraphs 0043 and 0044: the first and second phrase can be checked to determine if they are within a threshold predefined distance of each other to be paired as a key and value phrase. The second phrase can be ‘Amount:’ (that ends in a symbol of ‘:’) which is a left neighbor of the first phrase having a value of $100.00).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified Dejean et al’s ability to store each group/pair of Label (second phrase) and value (first phrase) in a plurality of pairs/groupings, such that the first and second phrases are checked to be within a predefined threshold distance of each other to be grouped/paired, as taught by Duta. The combination would have allowed Dejean et al to have optimally extracted key value pairs even when there are different types of forms (Duta, paragraph 0001).

However, the combination does not expressly teach determining (b) that a first set of x and y coordinates associated with the first phrase as rendered within the electronic document is within a threshold Cartesian distance of a second set of x and y coordinates associated with the second phrase as rendered within the electronic document; … determining an order for the groups in the plurality of groups based at least on cartesian distances between pairs of groups in the plurality of groups; storing or presenting the groups, in the plurality of groups, based on the order determined for the groups.

Yet Foncubierta Rodriguez et al teaches determining (b) that a first set of x and y coordinates associated with the first phrase as rendered within the electronic document is within a threshold Cartesian distance of a second set of x and y coordinates associated with the second phrase as rendered within the electronic document (paragraphs 0050 and 0057: two phrases that are encompassed within boxes are also within a cartesian x,y coordinate system, and the borders of the boxes (which, as disclosed and known, include corners) of each of the two phrases are compared to calculate the closest distance between them in order to group them together as ‘corresponding’).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified Dejean and Duta’s ability to group together two phrases based upon the distance criteria/threshold between bounding boxes, such that the distance check would have measured the closest distance between boxes (that have corners) as part of the criteria to group the phrases together, as taught by Foncubierta Rodriguez et al. The combination would have allowed Dejean and Duta to have been able to extract information from forms even if the forms happen to change (Foncubierta Rodriguez et al, paragraph 0002). 

However the combination does not expressly teach determining an order for the groups in the plurality of groups based at least on cartesian distances between pairs of groups in the plurality of groups; storing or presenting the groups, in the plurality of groups, based on the order determined for the groups.

Yet Stolin teaches determining an order for the groups in the plurality of groups based at least on cartesian distances between pairs of groups in the plurality of groups; storing or presenting the groups, in the plurality of groups, based on the order determined for the groups (Fig. 3, column 3, lines 1-52: based upon calculated distance between blocks (where each block can include an already paired combining of text), an order for the blocks can be determined and saved/represented via a directed graph).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified the combination of Dejean, Duta and Foncubierta Rodriguez et al’s ability to store identified group data gleaned from an unstructured document (Image), such that the stored group data would further include and be based on the order of the identified/determined groups, as taught by Stolin. The combination would have allowed Dejean, Duta, and Foncubierta Rodriguez et al to have recovered document content and logical structure of electronic documents that are scanned without resulting in fidelity of document representation (Meunier, paragraph 0003).

With regards to claim 21, the combination of Dejean, Duta, Foncubierta Rodriguez et al and Stolin teaches a system comprising: at least one device including a hardware processor; the system being configured to perform operations comprising: identifying a plurality of phrases rendered in an electronic document, wherein text corresponding to each phrase, in the plurality of phrases, are on a same text line in the first electronic document; identifying a first phrase, of the plurality of phrases rendered in the electronic document, that comprises a digit; determining (a) that a second phrase, of the plurality of phrases rendered in the electronic document, does not include any digit; determining (b) that a first set of x and y coordinates associated with the first phrase as rendered within the electronic document is within a threshold Cartesian distance of a second set of x and y coordinates associated with the second phrase as rendered within the electronic document; responsive at least to determining (a) and (b), assigning both the first phrase and the second phrase to a same group in a plurality of groups; storing information identifying members of each group of the plurality of groups; determining an order for the groups in the plurality of groups; storing, transmitting, or presenting the groups, in the plurality of groups, based on the order determined for the groups, as similarly explained in the rejection of claim 20, and is rejected under similar rationale.

With regards to claim 22, which depends on claim 1, Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin teach wherein determining (b) comprises: identifying a first bounding box, within the electronic document, corresponding to the first phrase such that the first phrase is within the first bounding box; determining the first set of x and y coordinates associated with the first phrase based on the first bounding box; identifying a second bounding box, within the electronic document, corresponding to the second phrase such that the second phrase is within the second bounding box; and determining the second set of x and y coordinates associated with the second phrase based on the second bounding box, as similarly explained in the rejection for claim 1, and is rejected under similar rationale.

With regards to claim 23, which depends on claim 22, Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin teach wherein the first set of x and y coordinates correspond to a corner of the first bounding box that is closest to the second bounding box, and wherein the second set of x and y coordinates correspond to a corner of the second bounding box that is closest to the first bounding box, as similarly explained in the rejection for claim 1, and is rejected under similar rationale.

With regards to claim 24, which depends on claim 20, the combination of Dejean, Duta, Foncubierta Rodriguez et al and Stolin teaches wherein determining (b) comprises: identifying a first bounding box, within the electronic document, corresponding to the first phrase such that the first phrase is within the first bounding box; determining the first set of x and y coordinates associated with the first phrase based on the first bounding box; identifying a second bounding box, within the electronic document, corresponding to the second phrase such that the second phrase is within the second bounding box; and determining the second set of x and y coordinates associated with the second phrase based on the second bounding box, as similarly explained in the rejection for claim 20, and is rejected under similar rationale.

With regards to claim 25, which depends on claim 24, the combination of Dejean, Duta, Foncubierta Rodriguez et al and Stolin teaches wherein the first set of x and y coordinates correspond to a corner of the first bounding box that is closest to the second bounding box, and wherein the second set of x and y coordinates correspond to a corner of the second bounding box that is closest to the first bounding box, as similarly explained in the rejection for claim 20, and is rejected under similar rationale.

With regards to claim 26, which depends on claim 21, the combination of Dejean, Duta, Foncubierta Rodriguez et al and Stolin teaches wherein determining (b) comprises: identifying a first bounding box, within the electronic document, corresponding to the first phrase such that the first phrase is within the first bounding box; determining the first set of x and y coordinates associated with the first phrase based on the first bounding box; identifying a second bounding box, within the electronic document, corresponding to the second phrase such that the second phrase is within the second bounding box; and determining the second set of x and y coordinates associated with the second phrase based on the second bounding box, as similarly explained in the rejection for claim 22, and is rejected under similar rationale.

With regards to claim 27, which depends on claim 26, the combination of Dejean, Duta, Foncubierta Rodriguez et al and Stolin teaches wherein the first set of x and y coordinates correspond to a corner of the first bounding box that is closest to the second bounding box, and wherein the second set of x and y coordinates correspond to a corner of the second bounding box that is closest to the first bounding box, as similarly explained in the rejection for claim 21, and is rejected under similar rationale.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Stolin (US Patent: 6175844, issued: Jan. 16, 2001, filed: Dec. 30, 1997) in view of Grams (US Application: US 2017/0220859, published: Aug. 3, 2017, filed: Jan. 29, 2016).

With regards to claim 4, which depends on claim 1, the combination of Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin teaches wherein the assigning operation is further responsive to determining that the second phrase, as similarly explained in the rejection for claim 1, and is rejected under similar rationale.

However the combination of Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin does not expressly teach … wherein the assigning operation is further responsive to determining that the second phrase is one of:
a left-neighbor of the first phrase such that the first phrase and second phrase (a) include overlapping ranges of corresponding vertical positions in the electronic document and (b) the second phrase's horizontal position is to the left of the first phrase's horizontal position in the electronic document; 
an above-neighbor of the first phrase such that the first phrase and second phrase include overlapping ranges of corresponding horizontal positions in the electronic document.

Yet Grams teaches … a left-neighbor of the first phrase such that the first phrase and second phrase (a) include overlapping ranges of corresponding vertical positions in the electronic document and (b) the second phrase's horizontal position is to the left of the first phrase's horizontal position in the electronic document (Fig 7C: subitem phrases (first phrases) are determined to be associated with an ‘An Example List’ phrase (second phrase) that is higher in hierarchy in an above and to the left orientation. The first phrase lies within the larger bounding box of the second phrase, and the two thus have common overlapping vertical positions);

an above-neighbor of the first phrase such that the first phrase and second phrase include overlapping ranges of corresponding horizontal positions in the electronic document (Fig 7C: subitem phrases (first phrases) are determined to be associated with an ‘An Example List’ phrase (second phrase) that is higher in hierarchy in an above and to the left orientation. The second phrase has a larger bounding area than the first subitem(s), and since the first subitem is contained within the larger bounding area, there is at least a common subset of overlapping horizontal positions).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin’s ability to perform assigning operations with respect to first and second phrases, such that overlapping vertical and horizontal positions can be recognized/identified based upon above and to the left relative positioning of phrases, as taught by Grams. The combination would have “reliably recognized characters while also preserving structure” (Grams, paragraph 0002).
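For illustration only, the left-neighbor and above-neighbor relationships recited above can be sketched with interval-overlap tests. This is a minimal sketch under stated assumptions and is not the method of Grams or of the claimed invention: the box layout (x_min, y_min, x_max, y_max), the function names, and the convention that y increases down the page are hypothetical, and the extra "second phrase is higher on the page" test in the above-neighbor check is an assumption drawn from the term "above-neighbor" rather than from the quoted limitation.

```python
def ranges_overlap(a_lo, a_hi, b_lo, b_hi):
    """True when the closed intervals [a_lo, a_hi] and [b_lo, b_hi] intersect."""
    return a_lo <= b_hi and b_lo <= a_hi


def is_left_neighbor(second, first):
    """Second phrase shares vertical positions with the first and starts to its left."""
    vertical_overlap = ranges_overlap(second[1], second[3], first[1], first[3])
    return vertical_overlap and second[0] < first[0]


def is_above_neighbor(second, first):
    """Second phrase shares horizontal positions with the first and sits higher on the page."""
    horizontal_overlap = ranges_overlap(second[0], second[2], first[0], first[2])
    # Assumption: y grows downward, so a smaller y_min means higher on the page.
    return horizontal_overlap and second[1] < first[1]
```

For example, a label box (0, 0, 50, 10) is a left-neighbor of a value box (60, 2, 100, 12) because their vertical ranges overlap and the label starts further left, but it is not an above-neighbor because their horizontal ranges do not overlap.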

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Stolin (US Patent: 6175844, issued: Jan. 16, 2001, filed: Dec. 30, 1997) in view of Jannssen (US Application: US 2014/0064618, published: Mar. 6, 2014, filed: Aug. 29, 2012).

With regards to claim 8, which depends on claim 1, the combination of Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin teaches wherein the assigning operation … the first phrase … the second phrase in the electronic document, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.
However the combination does not expressly teach … is further responsive to determining that no other phrases are located between the first phrase and the second phrase in the electronic document.

Yet Jannssen teaches … is further responsive to determining that no other phrases are located between the first phrase and the second phrase in the electronic document (paragraph 0037: a key phrase is associated with another phrase entity based upon a specific positional relationship between them and upon there being no intervening text between the key phrase and the entity (thus closest due to no intervening phrase within the defined positional relationship)).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin’s ability to perform an assignment operation with respect to positioning between phrases, such that an additional check is performed to ensure the association is due to the second phrase (having an associated colon symbol) being the closest to the first phrase (due to no intervening text/phrases between the first and second), as taught by Jannssen. The combination would have allowed Dejean et al, Duta and Foncubierta Rodriguez et al to have efficiently captured and stored information even when the format of the documents is not known (Jannssen, paragraph 0004).

Claims 9 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018), in view of Stolin (US Patent: 6175844, published: Jan. 16, 2001, filed: Dec. 30, 1997) and in view of Ruzon et al (US Patent: 7970213, issued: Jun. 28, 2011, filed: May 21, 2007).

With regards to claim 9. The medium of claim 1, Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin teach wherein the operations further comprise: … the pair of tokens, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

However, Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin do not expressly teach responsive to determining that horizontal spacing between a pair of tokens is less than a threshold value: assigning both tokens in the pair of tokens to a same third phrase.

Yet Ruzon et al teaches responsive to determining that horizontal spacing between a pair of tokens is less than a threshold value: assigning both tokens in the pair of tokens to a same third phrase (column 4, lines 1-7: “If the width of the space between the words is less than the predefined separation threshold, then the first and the second words will be concatenated to form a single word "KRUPSWAFFLE"”).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin’s ability to process and identify a plurality of phrases, such that two tokens could have been assigned to one same phrase/word, as taught by Ruzon et al. The combination would have allowed Dejean et al and Duta to have “improved text recognition in captured images” (Ruzon et al, column 1, lines 57-59).
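
For illustration only, the spacing test discussed for claim 9 (tokens whose horizontal spacing is less than a threshold value are assigned to the same phrase) may be sketched as below. The tuple layout, the function name, and the sample coordinates are assumptions made for this sketch and are not taken from Ruzon et al or from the claims.

```python
def group_tokens(tokens, threshold):
    """Group tokens on one text line into phrases.

    tokens: list of (text, x_left, x_right) tuples sorted left to right
            (a hypothetical layout chosen for this sketch).
    threshold: maximum horizontal gap for two tokens to share a phrase.
    Returns a list of phrases, each a list of token texts.
    """
    phrases = []
    prev_right = None
    for text, x_left, x_right in tokens:
        gap_is_small = prev_right is not None and (x_left - prev_right) < threshold
        if gap_is_small:
            phrases[-1].append(text)   # same phrase as the previous token
        else:
            phrases.append([text])     # start a new phrase
        prev_right = x_right
    return phrases
```

With hypothetical tokens ("KRUPS", 0, 50), ("WAFFLE", 53, 110) and ("IRON", 150, 190) and a threshold of 10, the first two tokens fall into one phrase and the third starts another.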

With regards to claim 10. The medium of claim 9, the combination of Dejean et al, Duta, Foncubierta Rodriguez et al, Stolin and Ruzon et al teaches wherein the threshold value, as similarly explained in the rejection for claim 9, and is rejected under similar rationale.

However, the combination, as explained in the rejection of claim 9, does not teach … wherein the threshold value is computed by multiplying the average character width of characters in the third phrase with a horizontal tolerance factor.

Yet Ruzon et al teaches wherein the threshold value is computed by multiplying the average character width of characters in … [a] phrase with a horizontal tolerance factor (column 5, lines 1-4: “a predefined distance of 50% of the average horizontal distance of 7” is computed for phrase characters).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified the combination of Dejean et al, Duta, Foncubierta Rodriguez et al, Stolin and Ruzon et al’s ability to determine a threshold value for generating a target phrase (third phrase), such that the threshold value is computed based upon average character width of characters of the target phrase, as also taught by Ruzon et al. The combination would have allowed Dejean et al, Duta, Foncubierta Rodriguez et al, Stolin and Ruzon et al to have “improved text recognition in captured images” (Ruzon et al, column 1, lines 57-59).
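
For illustration only, the computation discussed for claim 10 (a threshold value obtained by multiplying an average character width by a horizontal tolerance factor) may be sketched as below. The function and parameter names are assumptions made for this sketch; the sample values merely echo the 50% and 7 figures in the quoted passage and do not represent the reference's implementation.

```python
def spacing_threshold(char_widths, tolerance_factor):
    """Return average character width multiplied by a tolerance factor.

    char_widths: widths of the characters in the phrase (hypothetical input).
    tolerance_factor: horizontal tolerance factor, e.g. 0.5 for 50%.
    """
    if not char_widths:
        raise ValueError("at least one character width is required")
    average_width = sum(char_widths) / len(char_widths)
    return average_width * tolerance_factor
```

For example, character widths of 6, 7 and 8 average to 7, so a tolerance factor of 0.5 gives a threshold of 3.5.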

Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Stolin (US Patent: 6175844, published: Jan. 16, 2001, filed: Dec. 30, 1997) and further in view of Alam (US Patent: 6336124, issued: Jan. 1, 2002, filed: Jul. 7, 1999).

With regards to claim 17. The medium of claim 1, the combination of Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin teaches wherein the threshold distance, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

However the combination does not expressly teach wherein the threshold distance is based on an average character height of characters in the first phrase. 

Yet Alam teaches the threshold distance is based on an average character height of characters in the first phrase (column 8, lines 25-32: “If the inter-word spacing or distance in the Y direction is greater than a threshold of, for example, 10% of the average character height, then the inter-word spacing parameter in the Y direction is not met and the word is determined not to be in the current line. The average character height may be determined from the words in the current line or from all the words in the document, for example”).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin’s ability to identify a threshold distance, such that the threshold distance could have been based upon average character height for content/phrases/words, as taught by Alam. The combination would have allowed Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin to have implemented an accurate and efficient way to convert a document from one format to another format (Alam, column 1, lines 50-54).
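
For illustration only, a threshold distance based on average character height, as discussed for claim 17, may be sketched as below. The 10% fraction follows the example given in the quoted passage of Alam; the function name, the parameters, and the use of a single y coordinate per word are assumptions made for this sketch.

```python
def in_current_line(word_y, line_y, char_heights, fraction=0.10):
    """Return True if a word's vertical offset from the current line does
    not exceed a threshold derived from the average character height.

    word_y, line_y: hypothetical baseline y coordinates.
    char_heights: heights of the characters used to compute the average.
    fraction: share of the average height used as the threshold.
    """
    if not char_heights:
        raise ValueError("at least one character height is required")
    average_height = sum(char_heights) / len(char_heights)
    threshold = fraction * average_height
    return abs(word_y - line_y) <= threshold
```

With an average character height of 10, the threshold is 1, so a word offset by 0.5 is treated as part of the current line while a word offset by 2 is not.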

Claim 18 is rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Stolin (US Patent: 6175844, published: Jan. 16, 2001, filed: Dec. 30, 1997) in view of Ng (US Patent: 6405175, issued: Jun. 11, 2002, filed: Jul. 27, 1999).

With regards to claim 18. The medium of claim 1, the combination of Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin teaches the plurality of phrases …, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

However the combination does not expressly teach wherein identifying a third phrase, of the plurality of phrases, comprises: detecting an indication of currency in the electronic document; identifying a numerical value closest to the indication of currency; assigning the indication of currency and the numerical value closest to the indication of currency symbol to the same third phrase.

Yet Ng teaches detecting an indication of currency in the electronic document; identifying a numerical value closest to the indication of currency; assigning the indication of currency and the numerical value closest to the indication of currency symbol to the same third phrase (Fig 4, column 7, lines 9-15: “Coordinates 65 are used by the database refresher to locate data for a parameter within a newly-fetched web page. Coordinate 65 can be an x,y coordinate, or a text string of text that precedes the data, such as a dollar sign "$" before a price. A bounding box may be defined over the parameter value on the web page.”).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified the combination of Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin’s ability to identify a plurality of phrases from a document, such that a third phrase is identified by detecting an indication of currency and associated numerical value, as taught by Ng. The combination would have allowed Dejean and Duta to have implemented an easier way to search price information from a variety of online stores (Ng, column 3, lines 8-11).  
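
For illustration only, the steps discussed for claim 18 (detecting an indication of currency, identifying the closest numerical value, and assigning both to the same phrase) may be sketched as below. The set of currency symbols, the numeric pattern, and the (text, x_center) token layout are assumptions made for this sketch and are not taken from Ng or from the claims.

```python
import re

CURRENCY_SYMBOLS = {"$", "€", "£"}           # hypothetical set of indications
NUMBER = re.compile(r"^\d[\d,]*(\.\d+)?$")   # hypothetical numeric pattern


def currency_phrase(tokens):
    """Pair the first currency symbol with the closest numerical value.

    tokens: list of (text, x_center) tuples (hypothetical layout).
    Returns (symbol, value) as one phrase, or None if either is missing.
    """
    symbols = [t for t in tokens if t[0] in CURRENCY_SYMBOLS]
    numbers = [t for t in tokens if NUMBER.match(t[0])]
    if not symbols or not numbers:
        return None
    symbol = symbols[0]
    # Closest numerical value by horizontal distance to the symbol.
    nearest = min(numbers, key=lambda n: abs(n[1] - symbol[1]))
    return (symbol[0], nearest[0])
```

With hypothetical tokens ("Qty", 0), ("3", 40), ("$", 200) and ("19.99", 215), the dollar sign is paired with "19.99" rather than "3" because "19.99" is horizontally closer.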

Claim 19 is rejected under 35 U.S.C. 103 as being unpatentable over Dejean et al (US Patent: 9613267, issued: Apr. 4, 2017, filed: Sep. 3, 2014) in view of Duta (US Application: US 2020/0117944, published: Apr. 16, 2020, filed: Oct. 10, 2018) in view of Foncubierta Rodriguez et al (US Application: US 2020/0050845, published: Feb. 13, 2020, filed: Aug. 13, 2018) in view of Stolin (US Patent: 6175844, published: Jan. 16, 2001, filed: Dec. 30, 1997) and further in view of Meunier (US Application: US 2006/0271847, published: Nov. 30, 2006, filed: May 26, 2005).

With regards to claim 19. The medium of claim 1, wherein the combination of Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin teaches storing the groups, as similarly explained in the rejection of claim 1, and is rejected under similar rationale.

However the combination does not expressly teach storing the groups comprises writing information describing the groups in a second document formatted using a markup language.

Yet Meunier teaches storing the groups comprises writing information describing the groups in a second document formatted using a markup language (paragraph 0025: “The output of the order computation module 210 is a structured document 212 which defines the logical structure (e.g., logical reading and viewing order) of the unstructured document 208” … such as XML).

It would have been obvious to one of ordinary skill in the art before the effective filing of the invention to have modified the combination of Dejean et al, Duta, Foncubierta Rodriguez et al and Stolin’s ability to store identified group data gleaned from an unstructured document (image), such that the stored group data would further include and be based on order of the identified/determined groups, as taught by Meunier. The combination would have recovered document content and logical structure of electronic documents that are scanned without resulting in fidelity of document representation (Meunier, paragraph 0003).
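
For illustration only, writing information describing the groups in a second document formatted using a markup language, as discussed for claim 19, may be sketched as below using XML. The element names (groups, group, key, value) and the (key phrase, value phrase) pair layout are assumptions made for this sketch and are not taken from Meunier or from the claims.

```python
import xml.etree.ElementTree as ET


def groups_to_xml(groups):
    """Serialize groups of phrases to an XML string, preserving list order.

    groups: list of (key_phrase, value_phrase) pairs (hypothetical layout).
    """
    root = ET.Element("groups")
    for key, value in groups:
        group = ET.SubElement(root, "group")
        ET.SubElement(group, "key").text = key
        ET.SubElement(group, "value").text = value
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")
```

Because the elements are appended in list order, the order of the identified groups is carried into the resulting markup document.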

Response to Arguments
Applicant's arguments filed 06/14/2022 have been fully considered but they are not persuasive.
With regards to the amended limitations within claim 1, the applicant argues prior art references Meunier and Cooperman. However, Dejean, Duta and Foncubierta Rodriguez et al are now combined with Stolin to teach the newly amended limitations. Thus, the arguments regarding Meunier and Cooperman for claim 1 are no longer relevant and the examiner respectfully directs the applicant’s attention to the new grounds of rejection for claim 1 above for a full explanation as to how the limitations are taught.
As a note, the examiner’s response to applicant’s arguments with regards to Meunier in the last advisory action is repeated below for convenience:
The applicant argues with respect to Meunier that “determining a reading order for a document based on identifying ‘cuts’ or spaces in the document is not a grouping in phrases in an electronic document based on spatial relationships and content characteristics of the phrases”. However, this argument is not persuasive, since the grouping of phrases determined in claim 1 has already been explained/taught, and the prior art rejection of claim 1 teaches Cartesian-based spatial distance/gap analysis to group a pair of phrases/objects (see at least how Rodriguez was applied to the combination of Dejean and Duta). Meunier was only introduced in the 103 rejection to show that distances between the objects are further considered to determine ordering (not to incorporate the x-y technique, but to acknowledge and identify distances between objects).
The applicant argues including Meunier to select a reading order for a document would render Meunier inadequate for its intended purpose. More specifically, it appears the applicant is targeting Meunier’s usage of ‘cuts’, with Cooperman. However, the examiner respectfully points out, as explained earlier above, that the combination that incorporates Meunier in the office action is not to incorporate the usage of cuts, but rather to include Meunier’s recognition of objects and spaces identified between the objects to determine order.
With regards to claims 4-10 and 17-19, the applicant argues they are allowable for reasons presented by the applicant for claim 1. The examiner first notes that claims 5-7, 13, 14 and 16 have been objected to for allowable subject matter above. With regards to the remaining claims of concern in the applicant’s arguments, the arguments are not persuasive since claim 1 has been shown/explained to be rejected above.
With regards to claims 11, 19-21, the applicant argues they are allowable for reasons presented by the applicant for claim 1. However, this argument is not persuasive since claim 1 has been shown/explained to be rejected above.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WILSON W TSUI whose telephone number is (571)272-7596. The examiner can normally be reached Monday - Friday, 9 am - 6 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen Hong can be reached on 571-272-4124. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/WILSON W TSUI/Primary Examiner, Art Unit 2178