DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1, 4, 6, 7, 9-11, 14, 16, 17, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Pribble et al., US 2020/0125881 A1 (Pribble), Grangetto et al., US 2020/0014937 A1 (Grangetto), and further in view of Bhatt et al., US 2022/0122347 A1 (Bhatt).
Regarding claim 1, Pribble teaches a computer-implemented method for document image detection (a method for detecting a document in an image) (Abstract), comprising: 
identifying one or more connected components (identifying a set of edges in the image) (Fig. 1B, item 125; [0016]); 
for each connected component (for each of the detected set of edges) ([0017]) identifying a corresponding minimum bounding polygon (identifying a bounding rectangle; wherein the rectangle can be of minimum area that bounds the document) (Fig. 1C, item 130; [0017]); 
creating one or more image dividing lines based on the minimum bounding polygons (generating a set of edge candidate lines based on the user device using the bounding rectangle) (Fig. 1C, item 145; [0019]); and 
defining boundaries of one or more objects of interest based on at least a subset of the image dividing lines (removing lines and the edge candidate lines that remain may be more likely to represent an edge of the document than the lines that were removed from the set of lines for failing one or more tests) ([0019-0023]).  
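For illustration only, the terms "connected component" and "minimum bounding polygon" mapped above can be sketched on a small binary mask. This is a minimal sketch and is not a characterization of Pribble's disclosure or of the claimed method: the 4-connectivity rule, the axis-aligned rectangle, and the names connected_components, bounding_box, and MASK are assumptions made for the example.

```python
from collections import deque

def connected_components(mask):
    """Group pixels equal to 1 into components using 4-connectivity (an assumption)."""
    rows = len(mask)
    cols = len(mask[0]) if mask else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search from each unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(component)
    return components

def bounding_box(component):
    """Smallest axis-aligned rectangle (top, left, bottom, right) containing the component."""
    ys = [y for y, _ in component]
    xs = [x for _, x in component]
    return (min(ys), min(xs), max(ys), max(xs))

# Hypothetical 4x5 mask with two separate foreground regions.
MASK = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]

boxes = [bounding_box(comp) for comp in connected_components(MASK)]
print(boxes)
```

An axis-aligned box is the simplest bounding polygon; a rotated rectangle of minimum area would require a different computation.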
However, Pribble does not explicitly teach “producing, using a neural network, a superpixel segmentation map of an input image,” or “generating a superpixel binary mask by associating each superpixel of the superpixel segmentation map with a class of a predetermined set of classes”.
Grangetto teaches a method for encoding the borders of pixel regions of an image (Abstract), including: producing a superpixel segmentation map of an input image (segmentation for partitioning the preprocessed digital image into a predefined number of regions; wherein these regions are also called superpixels and are determined using some segmentation methods) ([0125]); generating a superpixel binary mask (indicating which pixels belong to which superpixel based on setting the pixels to either “1” or “0”) ([0128-0129]) by associating each superpixel of the superpixel segmentation map with a class of a predetermined set of classes (setting the border pixels of each superpixel to “1” and all other pixels to “0”) ([0129]); and identifying one or more connected components in the superpixel binary mask (identifying connected pixels forming borders of the superpixels) (Figs. 1(b) and 3(b); [0129]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pribble to use superpixel segmentation and generate a superpixel binary mask since it can efficiently represent the contour of a segmented image (Grangetto; [0059]).
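As a hedged illustration of the binary mask described in the Grangetto mapping above (border pixels set to “1”, all other pixels set to “0”), the following sketch derives such a mask from a superpixel label map. The border rule used here (a pixel is a border pixel if any 4-neighbor carries a different label) and the names border_mask and LABELS are assumptions for the example, not Grangetto's disclosed procedure.

```python
def border_mask(labels):
    """Return a mask with 1 at superpixel border pixels and 0 elsewhere.

    Assumed rule: a pixel is a border pixel when at least one of its
    4-neighbors belongs to a different superpixel label.
    """
    rows, cols = len(labels), len(labels[0])
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and labels[nr][nc] != labels[r][c]:
                    mask[r][c] = 1
                    break
    return mask

# Hypothetical label map with two superpixels (labels 0 and 1).
LABELS = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

for row in border_mask(LABELS):
    print(row)
```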
However, neither Pribble nor Grangetto explicitly teaches producing, “using a neural network”, a superpixel segmentation map of an input image.
Bhatt teaches detecting one or more ROIs (regions of interest) in a received image ([0009]); and producing, using a neural network (using a Convolutional Neural Network (CNN)) ([0007] and [0011]), a superpixel segmentation map of an input image (using the CNN to obtain a segmentation map having predicted labels based on spatial continuity of pixels comprised within each of the ROIs and obtain superpixels from the received image using a superpixel generating algorithm) ([0005-0007]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior art references to include using a neural network since it provides a time-efficient solution (Bhatt; Abstract and [0037]).

Regarding claim 4, Pribble teaches further comprising: cropping each region of interest of the one or more regions of interest to produce a corresponding document image (cropping the region of interest, i.e. the document, to remove a non-document portion of the image and generate a perspective-corrected image of the document) (Fig. 1E; [0030] and [0035]).  

Regarding claim 6, Bhatt teaches wherein the neural network is trained using augmented images (training a neural network using images under different conditions; wherein the images are annotated) ([0022] and [0050]).  

Regarding claim 7, Pribble teaches wherein identifying the minimum bounding polygon (identifying a bounding rectangle; wherein the rectangle can be of minimum area that bounds the document) (Fig. 1C, item 130; [0017]) further comprises: 
generating a plurality of candidate lines for the minimum bounding polygon (generating a set of edge candidate lines based on the user device using the bounding rectangle; wherein the rectangle can be of minimum area that bounds the document) (Fig. 1C, item 145; [0017] and [0019]); 
computing a value of a quality metric for a set of regions of interest that are defined using the plurality of candidate lines (determining the quality of the boundary for the document based on the plurality of candidate lines using a threshold angle, a distance, and/or a minimum distance between the midpoint of a detected line and a midpoint intersection line; and removing the lines that fail, i.e., are not of good quality) ([0019-0023]).  
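For illustration, a candidate-line test based on a threshold angle and a midpoint distance, as paraphrased in the mapping above, might be sketched as follows. The direction of each test (keep lines roughly parallel to a reference line whose midpoint is at least a minimum distance from it), the threshold values, and all function names are hypothetical assumptions; the sketch does not reproduce the specific tests of Pribble.

```python
import math

def angle_between(line_a, line_b):
    """Smallest angle in degrees (0 to 90) between two undirected segments."""
    (ax1, ay1), (ax2, ay2) = line_a
    (bx1, by1), (bx2, by2) = line_b
    ta = math.atan2(ay2 - ay1, ax2 - ax1)
    tb = math.atan2(by2 - by1, bx2 - bx1)
    diff = abs(math.degrees(ta - tb)) % 180.0
    return min(diff, 180.0 - diff)

def midpoint(line):
    (x1, y1), (x2, y2) = line
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def point_to_line_distance(point, line):
    """Perpendicular distance from a point to the infinite line through a segment."""
    (x1, y1), (x2, y2) = line
    px, py = point
    length = math.hypot(x2 - x1, y2 - y1)
    return abs((x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)) / length

def filter_candidates(lines, reference, max_angle_deg, min_distance):
    """Keep lines that pass both assumed tests; lines failing either test are removed."""
    kept = []
    for line in lines:
        parallel_enough = angle_between(line, reference) <= max_angle_deg
        far_enough = point_to_line_distance(midpoint(line), reference) >= min_distance
        if parallel_enough and far_enough:
            kept.append(line)
    return kept

# Hypothetical horizontal reference line and three detected lines.
REFERENCE = ((0.0, 5.0), (10.0, 5.0))
LINE_A = ((0.0, 0.0), (10.0, 0.5))   # nearly parallel, far from the reference
LINE_B = ((0.0, 0.0), (5.0, 10.0))   # steep angle
LINE_C = ((0.0, 4.5), (10.0, 4.5))   # parallel but too close to the reference

kept = filter_candidates([LINE_A, LINE_B, LINE_C], REFERENCE, 10.0, 2.0)
print(kept)
```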

Regarding claim 9, Pribble teaches wherein generating the plurality of candidate lines for the minimum bounding polygon (generating a set of edge candidate lines based on the user device using the bounding rectangle; wherein the rectangle can be of minimum area that bounds the document) (Fig. 1C, item 145; [0017] and [0019]) further comprises: utilizing, as a candidate boundary of the bounding polygon, a line traversing a center (utilizing as part of the candidate boundary of the bounding rectangle, midpoint intersection lines) (Fig. 1C, item 140; [0018]).  
However, Pribble does not explicitly teach “a superpixel binary mask”.
Grangetto teaches a method for encoding the borders of pixel regions of an image (Abstract), including: producing a superpixel segmentation map of an input image (segmentation for partitioning the preprocessed digital image into a predefined number of regions; wherein these regions are also called superpixels and are determined using some segmentation methods) ([0125]); generating a superpixel binary mask (indicating which pixels belong to which superpixel based on setting the pixels to either “1” or “0”) ([0128-0129]) by associating each superpixel of the superpixel segmentation map with a class of a predetermined set of classes (setting the border pixels of each superpixel to “1” and all other pixels to “0”) ([0129]); and identifying one or more connected components in the superpixel binary mask (identifying connected pixels forming borders of the superpixels) (Figs. 1(b) and 3(b); [0129]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pribble to use superpixel segmentation and generate a superpixel binary mask since it can efficiently represent the contour of a segmented image (Grangetto; [0059]).

Regarding claim 10, Pribble teaches wherein computing a value of a quality metric for the set of regions of interest (determining the quality of the boundary for the document based on the plurality of candidate lines using a threshold angle, a distance, and/or a minimum distance between the midpoint of a detected line and a midpoint intersection line; and removing the lines that fail, i.e., are not of good quality) ([0019-0023]). Grangetto teaches a method for encoding the borders of pixel regions of an image (Abstract). 
However, neither Pribble nor Grangetto explicitly teaches “applying, to the set of regions of interest, a trainable classifier”.
Bhatt teaches detecting one or more ROIs (regions of interest) in a received image ([0009]); and applying, to the set of regions of interest, a trainable classifier (applying to the one or more regions of interest as localized bounding boxes a trainable model, such as a convolutional neural network (CNN)) ([0005-0007], [0014], [0022], and [0050]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior art references to include using a neural network since it provides a time-efficient solution (Bhatt; Abstract and [0037]).

Regarding claim 11, see the rejection made to claim 1, as well as prior art Pribble for a system (device that can include an image capture component) ([0004]), comprising: a memory (one or more memories) ([0004]); a processor (one or more processors) ([0004]), coupled to the memory (the one or more processors communicatively coupled to the one or more memories) ([0004]), for they teach all the limitations within this claim.
Regarding claim 14, see the rejection made to claim 4, as well as prior art Pribble for a system (device that can include an image capture component) ([0004]), comprising: a memory (one or more memories) ([0004]); a processor (one or more processors) ([0004]), coupled to the memory (the one or more processors communicatively coupled to the one or more memories) ([0004]), for they teach all the limitations within this claim.
Regarding claim 16, see the rejection made to claim 6, as well as prior art Pribble for a system (device that can include an image capture component) ([0004]), comprising: a memory (one or more memories) ([0004]); a processor (one or more processors) ([0004]), coupled to the memory (the one or more processors communicatively coupled to the one or more memories) ([0004]), for they teach all the limitations within this claim.
Regarding claim 17, see the rejection made to claim 7, as well as prior art Pribble for a system (device that can include an image capture component) ([0004]), comprising: a memory (one or more memories) ([0004]); a processor (one or more processors) ([0004]), coupled to the memory (the one or more processors communicatively coupled to the one or more memories) ([0004]), for they teach all the limitations within this claim.

Regarding claim 19, see the rejection made to claim 1, as well as prior art Pribble for a non-transitory computer-readable storage medium (a non-transitory computer-readable medium) ([0005]) comprising executable instructions (storing one or more instructions) ([0005]) that, when executed by a computer system (executed by one or more processors of a device) ([0005]), for they teach all the limitations within this claim.
Regarding claim 20, see the rejection made to claim 7, as well as prior art Pribble for a non-transitory computer-readable storage medium (a non-transitory computer-readable medium) ([0005]) comprising executable instructions (storing one or more instructions) ([0005]) that, when executed by a computer system (executed by one or more processors of a device) ([0005]), for they teach all the limitations within this claim.

Claim(s) 2, 3, 12, and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Pribble et al., US 2020/0125881 A1 (Pribble), Grangetto et al., US 2020/0014937 A1 (Grangetto), Bhatt et al., US 2022/0122347 A1 (Bhatt), and further in view of Krauth et al., US 2021/0019883 A1 (Krauth).
Regarding claim 2, Pribble teaches a computer-implemented method for document image detection (a method for detecting a document in an image) (Abstract). Grangetto teaches a method for encoding the borders of pixel regions of an image (Abstract). Bhatt teaches detecting one or more ROIs (regions of interest) in a received image ([0009]); and using a neural network (using a Convolutional Neural Network (CNN)) ([0007] and [0011]).
However, none of them explicitly teaches “wherein the neural network comprises: a downscale block; a context block; and a final classification block”.
Krauth teaches automatically detecting objects in an image by means of digital image processing ([0008]); detecting by means of a convolutional neural network ([0118]); and wherein the neural network (segmentation convolutional neural network SEG-CNN) ([0229]) comprises: a downscale block (downscaling) ([0229-0232]); a context block (generating probability maps) ([0232]); and a final classification block (final output classes) ([0207] and [0241-0242]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior art references to include the specific blocks for classification in the neural network since the structure of the convolutional neural network gives rise to the advantage of detecting a plurality of pattern types simultaneously (Krauth; [0211]).

Regarding claim 3, Krauth teaches wherein the neural network further comprises a rectifier activation function (wherein the neural network further comprises activation maps) ([0231-0232]) (wherein the activation is preferably in the form of ReLU activation) ([0266]).  
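For reference, a rectifier (ReLU) activation function in its standard form returns the input when the input is positive and zero otherwise. The following is a minimal sketch of that general definition, not of Krauth's network; the names relu and SAMPLE are chosen for the example.

```python
def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)

# Hypothetical pre-activation values.
SAMPLE = [-2.0, 0.0, 3.5]
activated = [relu(v) for v in SAMPLE]
print(activated)
```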

Regarding claim 12, see the rejection made to claim 2, as well as prior art Pribble for a system (device that can include an image capture component) ([0004]), comprising: a memory (one or more memories) ([0004]); a processor (one or more processors) ([0004]), coupled to the memory (the one or more processors communicatively coupled to the one or more memories) ([0004]), for they teach all the limitations within this claim.
Regarding claim 13, see the rejection made to claim 3, as well as prior art Pribble for a system (device that can include an image capture component) ([0004]), comprising: a memory (one or more memories) ([0004]); a processor (one or more processors) ([0004]), coupled to the memory (the one or more processors communicatively coupled to the one or more memories) ([0004]), for they teach all the limitations within this claim.

Claim(s) 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Pribble et al., US 2020/0125881 A1 (Pribble), Grangetto et al., US 2020/0014937 A1 (Grangetto), Bhatt et al., US 2022/0122347 A1 (Bhatt), and further in view of Nepomniachtchi et al., US 10,685,223 B2 (Nepomniachtchi).
Regarding claim 5, Pribble teaches a computer-implemented method for document image detection (a method for detecting a document in an image) (Abstract). Grangetto teaches a method for encoding the borders of pixel regions of an image (Abstract). Bhatt teaches detecting one or more ROIs (regions of interest) in a received image ([0009]).
However, none of them explicitly teaches “determining whether two or more regions of interest belong to a single multi-part document”.
Nepomniachtchi teaches systems and methods for processing and extracting content from a captured image (Abstract); and determining whether two or more regions of interest belong to a single multi-part document (performing a test to determine whether images of a front and back of a check are actually images of the same document) (col. 31, lines 18-20).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior art references to include determining whether two or more regions of interest are part of a single document since it provides more accurate extraction of important content from an image (Nepomniachtchi; col. 5, lines 62-64 and col. 46, lines 48-51).

Regarding claim 15, see the rejection made to claim 5, as well as prior art Pribble for a system (device that can include an image capture component) ([0004]), comprising: a memory (one or more memories) ([0004]); a processor (one or more processors) ([0004]), coupled to the memory (the one or more processors communicatively coupled to the one or more memories) ([0004]), for they teach all the limitations within this claim.

Claim(s) 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Pribble et al., US 2020/0125881 A1 (Pribble), Grangetto et al., US 2020/0014937 A1 (Grangetto), Bhatt et al., US 2022/0122347 A1 (Bhatt), and further in view of Alexeev et al., US 2019/0294641 A1 (Alexeev).
Regarding claim 8, Pribble teaches wherein generating the plurality of candidate lines for the minimum bounding polygon (generating a set of edge candidate lines based on the user device using the bounding rectangle; wherein the rectangle can be of minimum area that bounds the document) (Fig. 1C, item 145; [0017] and [0019]) further comprises: responsive to determining that a first number of pixels in a first line exceeds, by at least a predetermined threshold (threshold size of height and width) ([0073]), a second number of pixels in a second line which is adjacent to the first line, utilizing the second line as a candidate boundary of the bounding polygon (utilizing a line that has to be within a threshold size; so if a line is above the threshold a new line would be selected) ([0073]) (this way the system can generate a bounding rectangle that is smaller than the original bounding rectangle) ([0072] and [0088]), wherein the first line is provided by one of: a row or a column (wherein the line is either a row/width or a column/height) ([0073]).  
However, Pribble does not explicitly teach “a superpixel binary mask”.
Grangetto teaches a method for encoding the borders of pixel regions of an image (Abstract), including: producing a superpixel segmentation map of an input image (segmentation for partitioning the preprocessed digital image into a predefined number of regions; wherein these regions are also called superpixels and are determined using some segmentation methods) ([0125]); generating a superpixel binary mask (indicating which pixels belong to which superpixel based on setting the pixels to either “1” or “0”) ([0128-0129]) by associating each superpixel of the superpixel segmentation map with a class of a predetermined set of classes (setting the border pixels of each superpixel to “1” and all other pixels to “0”) ([0129]); and identifying one or more connected components in the superpixel binary mask (identifying connected pixels forming borders of the superpixels) (Figs. 1(b) and 3(b); [0129]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Pribble to use superpixel segmentation and generate a superpixel binary mask since it can efficiently represent the contour of a segmented image (Grangetto; [0059]).
Pribble teaches a computer-implemented method for document image detection (a method for detecting a document in an image) (Abstract). Grangetto teaches a method for encoding the borders of pixel regions of an image (Abstract). Bhatt teaches detecting one or more ROIs (regions of interest) in a received image ([0009]).
However, none of them explicitly teaches determining a “number of pixels” in the lines.
Alexeev teaches identifying first and second sets of elements within one or more images (Abstract); responsive to determining that a first number of pixels (counting black pixels of the bounding box) ([0119]) in a first line exceeds (greater than the threshold) ([0119]), by at least a predetermined threshold (maximum length threshold; including maximum height threshold) ([0119]), a second number of pixels in a second line which is adjacent to the first line (shrinking the area of each bounding box by one pixel in each cardinal direction) ([0119]), utilizing the second line as a candidate boundary of the bounding polygon (using the second line that has a number of pixels less than the threshold) ([0119]), wherein the first line is provided by one of: a row or a column (wherein the line is a row/width or column/height of the bounding box) ([0119]).  
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of prior art references to include using a pixel count threshold for shrinking the bounding box since it increases the accuracy of detecting the object being bounded by the box (Alexeev; [0228]).
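For illustration, the adjacent-line comparison recited in the claim language quoted above can be sketched for the rows of a binary mask. This is one possible reading of that language and is not Alexeev's bounding-box shrinking procedure; the function names, the example mask, and the threshold value are hypothetical assumptions.

```python
def row_counts(mask):
    """Number of foreground (1) pixels in each row of a binary mask."""
    return [sum(row) for row in mask]

def candidate_boundary_rows(mask, threshold):
    """Return indices of rows usable as candidate boundaries.

    Assumed reading: when a row's pixel count exceeds the count of an
    adjacent row by at least `threshold`, the adjacent row is a candidate.
    """
    counts = row_counts(mask)
    candidates = set()
    for first in range(len(counts)):
        for second in (first - 1, first + 1):
            if 0 <= second < len(counts):
                if counts[first] - counts[second] >= threshold:
                    candidates.add(second)
    return sorted(candidates)

# Hypothetical mask: a dense region in rows 1-2 with sparse neighbors.
EXAMPLE_MASK = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

rows = candidate_boundary_rows(EXAMPLE_MASK, threshold=3)
print(rows)
```

The same comparison applies to columns by transposing the mask, for example with list(zip(*EXAMPLE_MASK)).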

Regarding claim 18, see the rejection made to claim 8, as well as prior art Pribble for a system (device that can include an image capture component) ([0004]), comprising: a memory (one or more memories) ([0004]); a processor (one or more processors) ([0004]), coupled to the memory (the one or more processors communicatively coupled to the one or more memories) ([0004]), for they teach all the limitations within this claim.

Contact
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL J VANCHY JR whose telephone number is (571)270-1193. The examiner can normally be reached Monday - Friday 9am - 5pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell, can be reached at (571) 270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MICHAEL J VANCHY JR/
Primary Examiner, Art Unit 2666
Michael.Vanchy@uspto.gov