Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This action is in response to the applicant's communication filed on 09/02/2019. By virtue of this communication, claims 1-18 filed on 09/02/2019 are currently pending in the instant application.
                                                    
                                                
Information Disclosure Statement
The information disclosure statement (IDS), form PTO-1449, filed on 09/02/2019 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosed therein was considered by the examiner.
 
Drawings
The drawings received on 09/02/2019 have been reviewed by the Examiner and are acceptable.
 
Priority
Acknowledgment is made of applicant's claim for foreign priority under 35 U.S.C. 119(a)-(d). 



Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
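(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.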
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “an extraction unit”, “aggregation unit”, “assigning unit”, and “output unit” in claims 1-17.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.



Claim Objections
Claims 3-4 are objected to because of the following informalities: the limitation “becomes smaller into the group” contains a typographical error. The Examiner suggests amending the limitation to be similar to the language of specification ¶[0072] and ¶[0089]. Appropriate correction is required.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
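3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.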

Claims 1-4 and 7-18 are rejected under 35 U.S.C. 103 as being unpatentable over Kato (US 8,139,870) in view of Konno et al. (US 2008/0187221).
As per claim 1, an information processing apparatus comprising: “an extraction unit that extracts character strings obtained by converting characters in an image to character codes;” (Refer to Kato, figure 4, block 3; column 4, lines 8-11 disclose that the OCR section 3 performs OCR on the image of the text area in the preprocessed image to thereby obtain text information from the text area. The OCR processing recognizes each character contained in the text area. It is also possible to locate the position of each character in the document image 100. Examiner notes that the OCR processing inherently discloses converting characters in images to character codes for searching and copying.)
 “an aggregation unit that aggregates a plurality of first character strings that are included in the character strings extracted by the extraction unit and each of which represents an item into a group by using information items regarding positions of the plurality of first character strings in the image;”(Refer to Kato figure 3, col 3, line 5-15, discloses to automatically extract an information item from a document. It is not necessary that all information items contained in the form be extracted, and only necessary to extract items necessary for a desired purpose. The extraction item information contains, for each extraction target information item, information concerning the relative position of an item value to a corresponding item name. For example, the extraction item information illustrated in FIG. 3 indicates that a 
 “an assigning unit that assigns candidates for a second character string that are included in the character strings extracted by the extraction unit and that correspond to the group aggregated by the aggregation unit to the group;”(Refer to Kato figure 5, col 3, line 18-26 discloses the electronic filing device performs a known optical character recognition process on the document image and retrieves extraction target item names from the character recognition results. The electronic filing device then retrieves the relative position of the value corresponding to each retrieved item name from the extraction item information, and extracts, from the character recognition results, a character string located at an appropriate relative position to the item name as the value for the item. )
“and an output unit that outputs the candidates for the second character string assigned to the group by the assigning unit.”(Refer to Kato figure 5. Refer to col 6, line 14-28 discloses the item value extraction section 62 retrieves the information of a relative position of an item value corresponding to the received item name from the 
Kato does not explicitly disclose the following, which would have been obvious in view of Konno, from a similar field of endeavor: “the character strings extracted by the extraction unit and each of which represents an item that a user desires to obtain” (Konno ¶ [0025] discloses that a method for marking the attribute name may include circling it with a pen. In this exemplary embodiment, the attribute name is marked using a marker. ¶ [0026]: the attribute name and the attribute value acquired from the form are specified by the user marking the attribute name of acquisition object. ¶ [0027]: If the scanner 3 reads the template created by the user, the read image input part 21 inputs read image data from the scanner 3 (step 101).)
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine Konno's document processing technique with Kato's technique to provide the known and expected uses and benefits of Konno's technique over the image processing and optical character recognition technique of Kato. The proposed combination would have constituted a mere arrangement of old elements with each performing its known function, the combination yielding no more than one would expect from such an arrangement.
Therefore, it would have been obvious to a person of ordinary skill in the art to incorporate Konno into Kato in order to efficiently perform character recognition with an OCR using a designated keyword. (Refer to Konno paragraph [0005].)
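
For illustration only, the relative-position lookup that Kato is cited for in the rejection of claim 1 (an item name, a stored direction in which its value is expected, and retrieval of the string found in that direction) can be sketched as follows. This is a minimal sketch and not Kato's implementation: the Token type, the EXTRACTION_ITEMS table, the pixel tolerance, and the sample coordinates and second date are hypothetical assumptions; only the item names and the directions "right" and "lower right" come from the cited passage.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    x: float  # left edge of the recognized string
    y: float  # top edge, increasing downward


# Hypothetical extraction item information: item name -> direction of its value.
EXTRACTION_ITEMS = {"drafted date": "right", "acquired date": "lower right"}


def relative_direction(name, cand, tol=5.0):
    """Classify where cand lies relative to name (tol is an assumed line tolerance)."""
    dx, dy = cand.x - name.x, cand.y - name.y
    horiz = "" if abs(dx) <= tol else ("right" if dx > 0 else "left")
    vert = "" if abs(dy) <= tol else ("lower" if dy > 0 else "upper")
    if horiz and vert:
        return f"{vert} {horiz}"
    if horiz:
        return horiz
    if vert:
        return "below" if vert == "lower" else "above"
    return "same"


def extract_value(tokens, item_name):
    """Return the nearest string lying in the direction stored for item_name."""
    wanted = EXTRACTION_ITEMS[item_name]
    name_tok = next((t for t in tokens if t.text.lower() == item_name), None)
    if name_tok is None:
        return None
    candidates = [
        t for t in tokens
        if t is not name_tok and relative_direction(name_tok, t) == wanted
    ]
    if not candidates:
        return None
    nearest = min(
        candidates,
        key=lambda t: (t.x - name_tok.x) ** 2 + (t.y - name_tok.y) ** 2,
    )
    return nearest.text


tokens = [
    Token("Drafted date", 10, 10),
    Token("5/25/2004", 120, 10),
    Token("Acquired date", 10, 50),
    Token("6/1/2004", 120, 70),
]
print(extract_value(tokens, "drafted date"))
print(extract_value(tokens, "acquired date"))
```

The sketch picks the nearest candidate in the stored direction; other tie-breaking rules would fit the cited passage equally well.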

Claim 8 has been analyzed and is rejected for the reasons indicated for claim 1 above. Additionally, the rationale and motivation to combine the Kato and Konno references presented in the rejection of claim 1 apply to this claim.

As per claim 2, in view of claim 1, Kato as modified by Konno discloses “wherein the aggregation unit aggregates the first character strings into the group in such a manner that each of the first character strings is included in the group.” (Refer to Kato figure 3, column 3, line 5-15, discloses to automatically extract an information item from a document. It is not necessary that all information items contained in the form be extracted, and only necessary to extract items necessary for a desired purpose. The extraction item information contains, for each extraction target information item, information concerning the relative position of an item value to a corresponding item name. For example, the extraction item information illustrated in FIG. 3 indicates that a value of an item indicated by the item name "drafted date" is located closely to the "right" of the item name, and that a value of an item indicated by the item name "acquired date" is located closely to the "lower right" of the item name.  further line 18-26 discloses the electronic filing device performs a known optical character recognition 

As per claim 3, in view of claim 1, Kato as modified by Konno discloses, “wherein the aggregation unit aggregates the first character strings that are selected in such a manner that a size of a virtual rectangle surrounding and including the first character strings on the image becomes smaller into the group.” (Konno, Fig. 7, step 211, 212, 214, 215, 217, 21, ¶s [0040]-[0042] discloses the character string extracted from the read form and the attribute name variation pattern generated instead of the registered attribute name are collated (step 212). If the character string matched with the attribute name variation pattern exists among the character strings extracted from the read form (Y at step 213), the applicable attribute name is detected from the read form, whereby the process following the step 205 is 

As per claim 4, in view of claim 2, Kato as modified by Konno discloses “wherein the aggregation unit aggregates the first character strings that are selected in such a manner that a size of a virtual rectangle surrounding and including the first character strings on the image becomes smaller into the group.” (Refer to Konno, Fig. 7, step 211, 212, 214, 215, 217, 21, ¶s [0040]-[0042] discloses the character string extracted from the read form and the attribute name variation pattern generated instead of the registered attribute name are collated (step 212). If the character string matched with the attribute name variation pattern exists among the character strings extracted from the read form (Y at step 213), the applicable attribute name is detected from the read form, whereby the process following the step 205 is performed. This is effective in the case where the attribute name is changed owing to the revision of the form. On the other hand, if the character string matched with the attribute name variation pattern does not exist among the character strings extracted from the read form  (N at step 213), the character string existing near the area specified by the positional information of the registered attribute name on the read form is extracted (step 214). Namely, the extraction area of the character string on the read form is slightly expanded. In this manner, the character string extracted from the read form and the registered attribute name or the registered attribute name and attribute 

Regarding Claim 7, in view of claim 1, Kato as modified by Konno discloses “wherein the assigning unit assigns the candidates for the second character string to the group in accordance with a degree of association between each of the candidates and a representative character string that is set to the group and that represents the first character strings.” (Refer to Kato, Col. 5, lines 38-65 discloses It is possible that plural character strings may match plural different variations might be found from the character recognition results. In such a case, the string with highest similarity to the item name of an extraction target (which can be calculated from the similarity of words composing an item name and words composing the variation) may be selected from among the variations found from the character recognition results. Furthermore, col. 8, lines 15-25 discloses If there are plural text areas that are in close proximity to the item name, it is possible to calculate a numerical value indicating the closeness of matching of the character strings included in each text area to the feature of the extraction target item value and to select the character string that most closely matches as an item value.)
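
For illustration only, the selection of the string "with highest similarity to the item name" cited above can be sketched as follows. This is a hedged sketch: Kato describes a similarity computed from the words composing the item name and the variation, whereas the sketch substitutes the character-level ratio from Python's standard difflib module, and the candidate strings are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for a word-based measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(item_name, ocr_strings):
    """Return the recognized string most similar to the target item name."""
    return max(ocr_strings, key=lambda s: similarity(item_name, s))


candidates = ["date of drafting", "drafted date:", "acquired date"]
print(best_match("drafted date", candidates))
```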

Regarding Claim 8, in view of claim 2, Kato as modified by Konno discloses “wherein the assigning unit assigns the candidates for the second character string to the group in accordance with a degree of association between each of the candidates and a representative character string that is set to the group and that represents the first character strings.”  (Refer to Kato, Col. 5, lines 38-65 discloses It is possible that plural character strings may match plural different variations might be found from the character recognition results. In such a case, the string with highest similarity to the item name of an extraction target (which can be calculated from the similarity of words composing an item name and words composing the variation) may be selected from among the variations found from the character recognition results. 

Regarding Claim 9, in view of claim 3, Kato as modified by Konno discloses “wherein the assigning unit assigns the candidates for the second character string to the group in accordance with a degree of association between each of the candidates and a representative character string that is set to the group and that represents the first character strings.”  (Refer to Kato, Col. 5, lines 38-65 discloses It is possible that plural character strings may match plural different variations might be found from the character recognition results. In such a case, the string with highest similarity to the item name of an extraction target (which can be calculated from the similarity of words composing an item name and words composing the variation) may be selected from among the variations found from the character recognition results. Furthermore, col. 8, lines 15-25 discloses If there are plural text areas that are in close proximity to the item name, it is possible to calculate a numerical value indicating the closeness of matching of the character strings included in each text area to the feature of the extraction target item value and to select the character string that most closely matches as an item value.)

wherein the assigning unit assigns the candidates for the second character string to the group in accordance with a degree of association between each of the candidates and a representative character string that is set to the group and that represents the first character strings.” (Refer to Kato, Col. 5, lines 38-65 discloses It is possible that plural character strings may match plural different variations might be found from the character recognition results. In such a case, the string with highest similarity to the item name of an extraction target (which can be calculated from the similarity of words composing an item name and words composing the variation) may be selected from among the variations found from the character recognition results. Furthermore, col. 8, lines 15-25 discloses If there are plural text areas that are in close proximity to the item name, it is possible to calculate a numerical value indicating the closeness of matching of the character strings included in each text area to the feature of the extraction target item value and to select the character string that most closely matches as an item value.)

Regarding Claim 11, in view of claim 5, Kato as modified by Konno discloses “wherein the assigning unit assigns the candidates for the second character string to the group in accordance with a degree of association between each of the candidates and a representative character string that is set to the group and that represents the first character strings.” (Refer to Kato, Col. 5, lines 38-65 discloses It is possible that plural character strings may match plural different variations might be found from the character recognition results. In such a case, the string with 

Regarding Claim 12, in view of claim 6, Kato as modified by Konno discloses “wherein the assigning unit assigns the candidates for the second character string to the group in accordance with a degree of association between each of the candidates and a representative character string that is set to the group and that represents the first character strings.” (Refer to Kato, Col. 5, lines 38-65 discloses It is possible that plural character strings may match plural different variations might be found from the character recognition results. In such a case, the string with highest similarity to the item name of an extraction target (which can be calculated from the similarity of words composing an item name and words composing the variation) may be selected from among the variations found from the character recognition results. Furthermore, col. 8, lines 15-25 discloses If there are plural text areas that are in close proximity to the item name, it is possible to calculate a numerical value indicating the closeness of matching of the character strings included in each text area to the feature 

Regarding Claim 13, in view of claim 7, Kato as modified by Konno discloses “wherein the degree of association between each of the candidates for the second character string and the representative character string is represented by at least one evaluation item.” (Refer to Kato, Col. 5, lines 38-65 discloses It is possible that plural character strings may match plural different variations might be found from the character recognition results. In such a case, the string with highest similarity to the item name of an extraction target (which can be calculated from the similarity of words composing an item name and words composing the variation) may be selected from among the variations found from the character recognition results. Furthermore, col. 6, lines 39-45 discloses that, if an item name and an item value are not divided by a ruled line such as "acquired date", a character string within a given threshold distance from the area of the item name may be determined to be a character string in "close proximity", for example. The threshold distance may be variable according to the character size of the item name so that the range of "close proximity" is made wider when the character of an item name is larger. Col. 8, lines 15-25 discloses If there are plural text areas that are in close proximity to the item name, it is possible to calculate a numerical value indicating the closeness of matching of the character strings included in each text area to the feature of the extraction target item value and to select the character string that most closely matches as an item value.)

wherein a distance between each of the candidates for the second character string and the representative character string is included in a plurality of the evaluation items, and wherein the degree of association between each of the candidates for the second character string and the representative character string is set in such a manner that the degree of association becomes higher as a distance between the candidate for the second character string and the representative character string decreases.” ( Refer to Kato, col. 6, lines 39-45 discloses , if an item name and an item value are not divided by a ruled line such as "acquired date", a character string within a given threshold distance from the area of the item name may be determined to be a character string in "close proximity", for example. The threshold distance may be variable according to the character size of the item name so that the range of "close proximity" is made wider when the character of an item name is larger, col. 8, lines 15-25 discloses If there are plural text areas that are in close proximity to the item name, it is possible to calculate a numerical value indicating the closeness of matching of the character strings included in each text area to the feature of the extraction target item value and to select the character string that most closely matches as an item value. Furthermore, col. 8, lines 25-30 discloses the character string can be in close proximity to an extraction target item name in various directions such as above, below, left, right, upper right, lower right, upper left and lower left. An item value for an item name will have a greater likelihood of appearing in certain positions depending on the language or notation system used for the description of a document.)
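
For illustration only, a degree of association that rises as the distance to the item name falls, with a "close proximity" threshold that widens with character size, can be sketched as follows. The linear falloff, the scale factor of 3, and the function name are assumptions made for the sketch; neither the claim nor the cited passage specifies a formula.

```python
import math


def association_by_distance(name_center, cand_center, char_height, scale=3.0):
    """Score in [0, 1]; higher when the candidate is closer to the item name.

    The proximity threshold grows with the character height of the item name
    (scale is an assumed multiplier). Candidates beyond the threshold score 0.
    """
    threshold = scale * char_height
    distance = math.dist(name_center, cand_center)
    if distance > threshold:
        return 0.0
    return 1.0 - distance / threshold


near = association_by_distance((0, 0), (3, 4), char_height=10)
far = association_by_distance((0, 0), (12, 16), char_height=10)
print(near, far)
```

With these hypothetical inputs, the nearer candidate receives the higher score, and a larger char_height admits candidates that a smaller one would exclude.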

Regarding Claim 15, in view of claim 13, Kato as modified by Konno discloses “wherein a direction of each of the candidates for the second character string as seen from the representative character string is included in the plurality of evaluation items, and wherein the degree of association between each of the candidates for the second character string and the representative character string is set in such a manner that the degree of association becomes higher as the direction of the candidate for the second character string as seen from the representative character string becomes closer to a direction that is set beforehand as a direction in which a character string corresponding to the representative character string is positioned.” (Refer to Kato, Col. 8, lines 25-35 discloses The character string can be in close proximity to an extraction target item name in various directions such as above, below, left, right, upper right, lower right, upper left and lower left. An item value for an item name will have a greater likelihood of appearing in certain positions depending on the language or notation system used for the description of a document. Therefore, in the direction where the item value is likely to appear in the language or notation style in use, it is possible to search for a character string having the feature of an item value from the adjacent character strings farther than in a random direction.)
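
For illustration only, a degree of association that increases as the candidate's direction approaches a preset direction can be sketched with a cosine measure. The EXPECTED table, the mapping of the cosine to the range 0 to 1, and the coordinates are assumptions for the sketch; image coordinates are taken with y increasing downward.

```python
import math

# Hypothetical preset directions as (dx, dy) vectors, y increasing downward.
EXPECTED = {"right": (1.0, 0.0), "below": (0.0, 1.0), "lower right": (1.0, 1.0)}


def direction_score(name_pos, cand_pos, expected="right"):
    """Return 1.0 when the candidate lies exactly in the expected direction,
    0.0 when it lies in the opposite direction, and values in between otherwise."""
    ex, ey = EXPECTED[expected]
    dx, dy = cand_pos[0] - name_pos[0], cand_pos[1] - name_pos[1]
    norm = math.hypot(dx, dy) * math.hypot(ex, ey)
    if norm == 0:
        return 0.0  # candidate coincides with the item name
    cosine = (dx * ex + dy * ey) / norm
    return (cosine + 1.0) / 2.0


print(direction_score((0, 0), (10, 0), "right"))
print(direction_score((0, 0), (0, 10), "right"))
print(direction_score((0, 0), (-10, 0), "right"))
```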

Regarding Claim 16, in view of claim 13, Kato as modified by Konno discloses “wherein data types represented by the candidates for the second character string are included in the plurality of evaluation items, and wherein the degree of association between each of the candidates for the second character string and the representative character string is set in such a manner that the degree of association is high when the data type represented by one of the candidates for the second character string is a data type that is set beforehand as a possible data type for an item that includes the representative character string.” (Refer to Kato, Col. 9, lines 55-60 discloses for example, if a character string that is relevant to a value of date such as "5/25/2004" is found, a character string that contains a word or phrase similar to the date is searched for in the vicinity of the character string. Consequently, for example, if a character string containing the word "date" belonging to a semantic category of data is found, such as, for example, "date of drafting" or the like, that character string is recognized as a candidate for an item name.)
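
For illustration only, an evaluation item based on data type (the degree of association is high when a candidate matches a type set beforehand for the item) can be sketched as follows. The month/day/year format is inferred from the "5/25/2004" example in the cited passage; the "number" type, the binary scoring, and the function names are assumptions for the sketch.

```python
from datetime import datetime


def looks_like_date(text):
    """True if text parses as month/day/year, e.g. 5/25/2004."""
    try:
        datetime.strptime(text.strip(), "%m/%d/%Y")
        return True
    except ValueError:
        return False


def looks_like_number(text):
    """True for plain numerals with optional thousands commas and one decimal point."""
    return text.replace(",", "").replace(".", "", 1).isdigit()


# Hypothetical table of type checks set beforehand per item.
TYPE_CHECKS = {"date": looks_like_date, "number": looks_like_number}


def type_score(candidate, expected_type):
    """Return 1.0 if the candidate has the expected data type, else 0.0."""
    return 1.0 if TYPE_CHECKS[expected_type](candidate) else 0.0


print(type_score("5/25/2004", "date"))
print(type_score("date of drafting", "date"))
```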

Regarding Claim 17, in view of claim 7, Kato as modified by Konno discloses “wherein, when the first character strings in the group are included in the same line, the assigning unit assigns the first character string that is located in a rightward direction in the group to the representative character string, wherein, when the first character strings in the group are distributed in a plurality of lines, the assigning unit assigns the first character string that is located in a downward direction in the group to the representative character string, wherein, when the first character strings in the group are distributed in a plurality of lines, and two or more of the plurality of first character strings are included in the line including the first character string that is located in the downward direction in the group, the assigning unit assigns the first character string that is located in the rightward direction in the line including the first character string that is located in the downward direction in the group to the representative character string.” (Refer to Kato, Col. 8, lines 25-35 discloses The character string can be in close proximity to an extraction target item name in various directions such as above, below, left, right, upper right, lower right, upper left and lower left. An item value for an item name will have a greater likelihood of appearing in certain positions depending on the language or notation system used for the description of a document. Therefore, in the direction where the item value is likely to appear in the language or notation style in use, it is possible to search for a character string having the feature of an item value from the adjacent character strings farther than in a random direction.)
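
For illustration only, the three-case rule recited in claim 17 for choosing the representative character string (rightmost on a single line; lowest across several lines; rightmost on the lowest line when that line holds two or more strings) can be sketched as a single procedure. The tuple layout, the line tolerance, and the sample strings are hypothetical; the sketch reflects the claim language rather than anything disclosed in Kato.

```python
def representative(strings, line_tol=5.0):
    """Pick the representative string from (text, x, y) tuples, y increasing downward.

    Strings whose y values differ by at most line_tol are treated as one line.
    The lowest line is found first, then the rightmost string on that line,
    which covers all three cases of the claimed rule.
    """
    bottom_y = max(y for _, _, y in strings)
    bottom_line = [s for s in strings if abs(s[2] - bottom_y) <= line_tol]
    return max(bottom_line, key=lambda s: s[1])[0]


same_line = [("Name", 10, 10), ("Address", 80, 10)]
two_lines = [("Name", 10, 10), ("Phone", 10, 40)]
shared_bottom = [("Name", 10, 10), ("Phone", 10, 40), ("Fax", 90, 40)]
print(representative(same_line))
print(representative(two_lines))
print(representative(shared_bottom))
```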


Claims 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over Kato (US 8,139,870) in view of Konno et al. (US 2008/0187221), further in view of Ghimire (US 2012/0078979).

As per claim 5, in view of claim 3, “wherein the aggregation unit aggregates the first character strings in each line, and when each of the first character strings is not included in the aggregated first character strings, the aggregation unit performs aggregation the first character strings into a group on a paragraph-by-paragraph basis in such a manner that each of the first character strings is included in the group.”(Refer to Ghimire Figure 11 and  14, ¶[0011] discloses the system will automatically bring the most relevant paragraph to the user's view with most 

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine Ghimire's technique of searching in a document with the technique of Kato as modified by Konno to provide the known and expected uses and benefits of Ghimire's technique over the image processing and optical character recognition technique of Kato as modified by Konno. The proposed combination would have constituted a mere arrangement of old elements with each performing its known function, the combination yielding no more than one would expect from such an arrangement. Therefore, it would have been obvious to a person of ordinary skill in the art to incorporate Ghimire into Kato as modified by Konno in order to enhance the quality and thoroughness of a researcher for analyzing a large number of larger sized documents/references. (Refer to Ghimire paragraph [0004].)

As per claim 6, in view of claim 4, “wherein the aggregation unit aggregates the first character strings in each line, and when each of the first character strings is not included in the aggregated first character strings, the aggregation unit performs aggregation the first character strings into a group on a paragraph-by-paragraph basis in such a manner that each of the first character strings is included in the group.” (Refer to Ghimire, Figures 11 and 14; ¶[0011] discloses the system will automatically bring the most relevant paragraph to the user's view with most relevant keywords highlighted with an automatically selected color or a user selected color scheme. Further, ¶[0022] discloses that in response to a user selection of a paragraph (or a user selection of a portion of a text) the program code of the inventive system can locate most relevant paragraph (to the selected portion of the text). The GUI also allows to automatically sending the detected keywords in the highlighting box. Further see ¶[0024].)

Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to combine Ghimire's technique of searching in a document with the technique of Kato as modified by Konno to provide the known and expected uses and benefits of Ghimire's technique over the image processing and optical character recognition technique of Kato as modified by Konno. The proposed combination would have constituted a mere arrangement of old elements with each performing its known function, the combination yielding no more than one would expect from such an arrangement. Therefore, it would have been obvious to a person of ordinary skill in the art to incorporate Ghimire into Kato as modified by Konno in order to enhance the quality and thoroughness of a researcher for analyzing a large number of larger sized documents/references. (Refer to Ghimire paragraph [0004].)



						Contact

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at (571) 272-8243. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/SHAGHAYEGH AZIMA/Examiner, Art Unit 2661