DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on October 12, 2021 has been entered.

Claims 1-6, 8-13, and 15-20 are pending in this case.  Independent claims 1, 8 and 15 have been newly amended.  Claims 7 and 14 were previously cancelled.  No claims have been newly added or cancelled.  This action is made Non-Final.

Claim Objections

Claims 1, 8, and 15 are objected to because of the following informalities:
The claims recite, “…automatically identifying one or more semantic labels that are associated with the image object…” but should recite, “…automatically identifying one or more semantic labels that are associated with the first image object.”
Appropriate correction is required.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 8-13, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Shen et al. (US 2020/0019799) in view of Choi et al. (US 2020/0311956) and Colgate et al. (US 2019/0271559).

As to claim 8, Shen et al. disclose a system for performing image-object detection (Figure 1 illustrates network environment 100 comprising various systems communicatively coupled via a network 120 including autonomous control system 110, client device 116, model training system 130, and annotation system 140) comprising: one or more processors ([0077] notes systems may include a single processor or may be architectures employing multiple processor designs); and a computer-readable medium comprising instructions stored therein, which when executed by the processors ([0077] and [0078] note a computer program may be stored in a non-transitory medium, or any type of media suitable for storing electronic instructions to be executed by one or more processors), cause the processors to perform operations comprising: receiving, from a first data set, an image comprising a first image object (Figure 2 illustrates input data {x0, x1…xn} as a video sequence of training images including images 210, 212, and 214 illustrating a vehicle (as image object) on a highway, where [0057] notes objects may include pedestrians, trees, vehicles, stop signs, and the like); processing the image to identify a pixel region associated with the first image object ([0040] and [0041] note the annotation model generates a set of annotations {y0, y1…yn} that indicate the location (e.g. region of interest) of bounding boxes around the vehicle by combining forward states and backwards states of each image, where it is understood the “pixel region” is the location (region of interest) in which the bounding box will be placed); placing a first bounding box around the first image object based on the identified pixel region (Figure 2 illustrates bounding boxes around the vehicle in the images, e.g. bounding boxes 220, 222, and 224 of images 210, 212, and 214, respectively, [0042]; Figure 3A further illustrates bounding box predictions using annotation model to detect a vehicle in the corresponding region of interest, [0045]); receiving a user input comprising an indication of whether the first bounding box is accurately placed around the first image object (Figure 3B illustrates automatically suggesting annotations based on human interaction on a client device, where [0046] notes the human operator may interact with pointer 354 at a location on the image, which corresponds to a boundary of bounding box 354, and the annotation system 140 displays the bounding box on the interface and a menu 358 indicating whether the operator would like to “accept” the annotation (e.g. accurately placed around the vehicle) or whether the operator would like to pursue “other” options); and one or more labels that are associated with the image object ([0032] notes model training system 130 trains machine-learned computer models using training data, where portions of the training data, such as various objects of interest, are annotated with labels).
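For illustration only, the relationship relied upon above between an identified pixel region, a bounding box placed around it, and an operator's “accept” or “other” response may be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Shen et al.: the names bounding_box_from_region and review_box, the (row, col) pixel representation, and the inclusive box format are hypothetical.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int]           # (row, col); hypothetical representation
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), inclusive


def bounding_box_from_region(region: Iterable[Pixel]) -> Box:
    """Return the tightest axis-aligned box enclosing a pixel region."""
    pixels = list(region)
    if not pixels:
        raise ValueError("pixel region is empty")
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return (min(rows), min(cols), max(rows), max(cols))


def review_box(operator_choice: str) -> bool:
    """Interpret a hypothetical menu choice.

    "accept" indicates the box is accurately placed; "other" indicates
    the operator wants a different option, such as redrawing the box.
    """
    if operator_choice not in ("accept", "other"):
        raise ValueError(f"unknown choice: {operator_choice!r}")
    return operator_choice == "accept"
```

In this sketch the user input is reduced to a single menu choice; a fuller treatment would also carry the redrawn box when “other” is selected.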

As noted above, Shen et al. disclose multiple systems communicatively coupled via a network implemented as network environment 100 as illustrated in Figure 1, and further disclose any computer system may include at least one processor and media for storing instructions.  Although Shen et al. describe multiple systems contributing to the operations performed above, the multiple systems may be considered to collectively represent one system, thus yielding predictable results, without changing the scope of the invention.  Additionally, as noted above, Shen et al. disclose “receiving a user input comprising an indication of whether the first bounding box is accurately placed around the first image object,” but do not specifically disclose “wherein the user input indicates a centroid location of the pixel region associated with the first image object.”  Lastly, Shen et al. disclose one or more labels that are associated with the image object, but do not specifically disclose “automatically identifying one or more semantic labels that are associated with the image object.”

Choi et al. disclose wherein the user input indicates a centroid location (e.g. Figures 33A and/or 33B, point P9) of the pixel region (e.g. Figures 33A and/or 33B, 3D bounding boxes including points P1 thru P8) associated with the first image object (e.g. Figure 33A, vessel 102, Figure 33B, vessel 102a) ([0233] notes Figures 33A and 33B illustrate 3D bounding boxes for vessels, e.g. 102, including points P1 to P8 that are coordinates of corners of the 3D bounding box, and point P9 is the 3D coordinate of the centroid of the 3D bounding box, [0237] notes the eight corners, e.g. points P1 to P8, of the bounding box and its centroid location, e.g. point P9, may be provided by a human annotator); and automatically identifying one or more labels that are associated with the image object (Figures 6, 7, 8A, 8B, 9A thru 9F and associated text illustrate and note objects identified with bounding boxes and assigned to object configuration categories, where [0086] notes machine learning model trained to identify objects in an image, a human operator may evaluate the images and draw polygons around each object present in the image, and each polygon may then be labeled with a class, [0087] notes the labeled training data may then be used to train a machine learning model to identify boundaries of, and possibly classify, objects detected in images, [0105] notes an object configuration classifier model 800 for assigning clusters, e.g. objects, to specific categories, where [0112] notes the object configuration classifier model 800 outputs a single mutually exclusive label that indicates an object configuration category to which a cluster is determined to belong, [0118] and [0119] note the object configuration classifier may be trained using a manual or automated approach).
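For illustration only, the geometric relationship between the eight corner points (P1 thru P8) and the centroid (P9) of a 3D bounding box may be sketched as follows. This is a minimal sketch and not Choi et al.'s method, in which the corners and centroid may be provided by a human annotator; it assumes a box-shaped region, for which the centroid equals the arithmetic mean of the eight corners, and the name centroid_of_corners is hypothetical.

```python
from typing import Sequence, Tuple

Point3D = Tuple[float, float, float]


def centroid_of_corners(corners: Sequence[Point3D]) -> Point3D:
    """Return the mean of the eight corner coordinates of a 3D box.

    Assumption: the corners describe a box (parallelepiped), so the
    mean of the corners coincides with the centroid of the box.
    """
    if len(corners) != 8:
        raise ValueError("a 3D bounding box has exactly eight corners")
    n = len(corners)
    x = sum(p[0] for p in corners) / n
    y = sum(p[1] for p in corners) / n
    z = sum(p[2] for p in corners) / n
    return (x, y, z)


# Example: corners of a unit cube, standing in for points P1 thru P8.
unit_cube = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0),
    (0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 1.0, 1.0),
]
p9 = centroid_of_corners(unit_cube)
```

A user-indicated centroid could then be compared against such a computed point, although no such comparison is described in the cited passages.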

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Shen et al.’s system and method of detecting objects utilizing bounding boxes with Choi et al.’s method of generating bounding boxes including receiving a user’s input indicating a centroid location to more accurately locate objects and define bounding boxes within images.

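Shen et al. modified with Choi et al. do not specifically disclose “automatically identifying one or more semantic labels that are associated with the image object.”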

Colgate et al. disclose automatically identifying one or more semantic labels that are associated with the image object ([0093] notes a semantic labeling module creates and stores metadata describing semantic labels for three-dimensional objects in an OMap, the metadata describes real-world objects with the semantic labels, the semantic labeling module determining semantic labels for the objects from the OMap using stored metadata describing objects and matching objects from the OMap with the stored metadata and associated objects and storing the semantic labels in relation to the associated objects, the semantic labeling module may determine labels for various objects using machine learning, e.g. convolutional neural network to receive a set of 3D points corresponding to an object and predict one or more labels for the input object, Figure 20 and associated text, e.g. [0129] thru [0137], further notes performing semantic label based filtering of objects).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify the method of performing object detection and identifying labels of Shen et al., as modified with Choi et al., with Colgate et al.’s method of identifying one or more semantic labels associated with objects in order to provide a description and/or classification that accurately depicts objects within an image.



As to claims 2, 9, and 16, Shen et al. modified with Choi et al. and Colgate et al. disclose the indication provided by the user input is configured to verify an accurate location of the first bounding box (Shen, Figure 3B illustrates and [0046] notes the annotation system 140 displays bounding box 322 on the interface and a menu 358 indicating whether the operator would like to “accept” the annotation (e.g. accurate location of the bounding box) or whether the operator would like to pursue “other” options, such as re-drawing the annotation, where Figure 3A illustrates various bounding boxes (e.g. bounding boxes 320, 322, 324, and 326) with bounding boxes 324 and 326 as examples of inaccurate locations of bounding boxes, and bounding box 322 as an example of an accurate location of a bounding box around the vehicle, thus may be accepted (verified) as an accurate location in Figure 3B).

As to claims 3, 10, and 17, Shen et al. modified with Choi et al. and Colgate et al. disclose the indication provided by the user input is configured to verify an accurate size of the first bounding box (Shen, Figure 3B illustrates and [0046] notes the annotation system 140 displays bounding box 322 on the interface and a menu 358 indicating whether the operator would like to “accept” the annotation (e.g. accurate size of the bounding box) or whether the operator would like to pursue “other” options, such as re-drawing the annotation, where Figure 3A illustrates various bounding boxes (e.g. bounding boxes 320, 322, 324, and 326) with bounding box 320 as an example of an inaccurate size of a bounding box (e.g. although bounding box encompasses vehicle, it is too large), and bounding box 322 as an example of an accurate size of a bounding box around the vehicle, thus may be accepted (verified) as an accurate size in Figure 3B).

As to claims 4, 11, and 18, Shen et al. modified with Choi et al. and Colgate et al. disclose the indication provided by the user input is configured to verify an inaccurate placement of the bounding box around the first image object, and wherein the user input is further configured to modify placement of the first bounding box to produce an accurate placement of the first bounding box around the first image object (Shen, Figure 3B illustrates and [0046] notes the annotation system 140 displays bounding box 322 on the interface and a menu 358 indicating whether the operator would like to “accept” the annotation (e.g. accurate location/placement of the bounding box) or whether the operator would like to pursue “other” options, such as re-drawing the annotation (e.g. modify the placement of the bounding box), where Figure 3A illustrates various bounding boxes (e.g. bounding boxes 320, 322, 324, and 326) with bounding boxes 324 and 326 as examples of inaccurate locations/placements of bounding boxes, and bounding box 322 as an example of an accurate location/placement of a bounding box around the vehicle, where it is understood that if the bounding box displayed in Figure 3B were one of bounding boxes 324 or 326 corresponding to an inaccurate location/placement, the operator may select “other” to redraw (modify) the bounding box to provide a more accurate location/placement of the bounding box).

As to claims 5, 12, and 19, Shen et al. modified with Choi et al. and Colgate et al. disclose the processors are further configured to perform operations comprising: processing the image to identify a pixel region associated with a second image object (Shen, Figure 4A illustrates prediction of annotations within an image containing multiple objects, e.g. overlapping vehicles, where [0049] notes the annotation system 140 generates a set of bounding box predictions using an annotation model configured to detect vehicles in an image); placing a second bounding box around the second image object (Shen, Figure 4A illustrates bounding boxes 420, 422, 424, and 428, where bounding box 420 is an annotation for occluded back vehicle, while bounding box 422 is an annotation for the front vehicle, and bounding box 428 is an annotation for both overlapping vehicles); and receiving a user second input comprising an indication of whether the second bounding box is accurately placed around the second image object (Shen, Figure 4B illustrates and [0050] notes receiving input from the human operator for a bounding box 422 that identifies the front vehicle, and based on the input, the annotation system 140 determines that the original annotation 428 contains multiple objects, discards the original annotation 428, and displays an annotation for the occluded back vehicle on the interface, as well as a menu 458 indicating whether the operator would like to “accept” the annotation (e.g. accurately placed around the vehicle) or whether the operator would like to pursue “other” options).  Please NOTE: Figures 4A and 4B are noted above for explanation purposes as they depict an image comprising multiple objects, where Figures 3A and 3B illustrate an image comprising only a single object.  Therefore, it is understood the image of Figures 3A and 3B may comprise multiple objects, where the operations performed would be similar to that described in Figures 4A and 4B.
Additionally, it is also understood if the image of Figures 3A and 3B comprised multiple objects, bounding boxes for each object may be generated in a manner as outlined in claim 8 (by repeating operations for each object). 

Shen, [0040] notes the video sequence of training images input to annotation system 140, and the annotation model generates a set of annotations that indicate the locations (e.g. regions of interest) of bounding boxes around the vehicle, where [0032] notes model training system 130 trains machine learned computer models, [0033] thru [0039] note the annotation system 140 provides annotated data to the model training system 130 and further trains a bi-directional annotation model that generates annotations for an image sequence; further modified with Choi, [0086] and [0087] note a machine learning model to identify boundaries and classify objects detected in images).

Response to Arguments

Applicant's arguments filed October 12, 2021 have been fully considered but they are not persuasive.  Applicant amends independent claims 1, 8, and 15 to similarly recite, “…wherein the user input indicates a centroid location of the pixel region associated with the first image object…”  Applicant argues on pages 9-11 of the Amendment filed that the prior art of record, e.g. Shen et al. and Colgate et al., does not teach or suggest the newly amended limitation of the claims.  However, in light of the amendment, Shen et al. is modified with newly found reference Choi et al. (US 2020/0311956) and Colgate et al. (previously of record) for teaching the limitations of the claims.  Please see the rejection and notes regarding the claims above.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACINTA M CRAWFORD whose telephone number is (571)270-1539. The examiner can normally be reached 9:00 a.m. to 5:00 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached at (571)272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/JACINTA M CRAWFORD/Primary Examiner, Art Unit 2612