DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments (12/7/20 Remarks: page 7, lines 8-9) with respect to the rejection of claim 5 under 35 USC §112 have been fully considered and are persuasive. The rejection of claim 5 under 35 USC §112 has been obviated by the claim’s cancellation.
Applicant’s arguments (12/7/20 Remarks: page 7, line 10 – page 8, line 22) with respect to the rejection of claims 1-4 & 6-12 under 35 USC §102 and the rejection of claim 13 under 35 USC §103 have been fully considered and are persuasive in view of the presently amended claim language, which describes the image processing functions as integrated into the recited camera device. Therefore, the rejections have been withdrawn. However, upon further consideration, a new ground of rejection is made in view of Fukada (US 20160044558) and Planche (US 20200294201).
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-4 & 6-13 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The phrase “more similar to the visual characteristics of the synthetic-domain images” (claim 1, lines 12-13) is unclear as to what standard defines a given image as meeting this criterion.
Claim Rejections - 35 USC § 103
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-4 & 6-12, insofar as they are understood, are rejected under 35 U.S.C. 103 as being unpatentable over Hoffman (“CyCADA: Cycle-Consistent Adversarial Domain Adaptation”, cited in 5/21/19 Information Disclosure Statement) in view of Fukada (US 20160044558) and Planche (US 20200294201).
Re claim 1, Hoffman discloses
Claim 1: A method for training a machine-learning model to convert real-domain images to synthetic-appearing images, wherein the machine-learning model is associated with a mounted (see below) camera device (Hoffman section 2, paragraph bridging pages 3 & 4, “drive cam”, see below re “mounted” camera) at a location (Hoffman Introduction, “real-world imagery” (which is necessarily obtained by an imaging device at a particular location)), the location associated with a scene type (Hoffman Abstract, road scenes), the method comprising:
receiving a first set of training images associated with the scene type, wherein the first set of training images includes real-domain images (Hoffman Abstract, real world domain images; Hoffman section 4.2, real-world CityScapes images);
generating a second set of training images associated with the scene type, wherein the second set of training images includes synthetic-domain images (Hoffman section 4.2, Grand Theft Auto (i.e. video game) images);
training the machine-learning model, using the first and second sets of training images, to generate respective synthetic-appearing output images based on respective sample real-domain images, wherein visual characteristics of the respective synthetic-appearing output images are more similar to (see below) visual characteristics of the synthetic-domain images than to visual characteristics of the real-domain images (Hoffman section 4.2, training of model using real-world CityScapes and synthetic GTA images; Hoffman section 4.2.2, converted images adjust visual characteristics (e.g. color) to compensate for differences between CityScapes and GTA images; Hoffman Figure 6, generation of translation images between GTA and CityScapes with translation in both directions (i.e. including translation of images from real-world CityScapes images to synthetic GTA images); Hoffman section 4.2.2, “We visualize the results of image-space adaptation between GTA5 and Cityscapes in Figure 6. The most obvious difference between the original images and the adapted images is the saturation levels the GTA5 imagery is much more vivid than the Cityscapes imagery, so adaptation adjusts the colors to compensate.”, describing an adjustment of a visual characteristic in the conversion between real-domain and synthetic-domain images so as to compensate for the difference, see below re “more similar” element); and
subsequent to training the machine-learning model using the first and second sets of training images (Hoffman Figure 6, conversion of images using trained model), providing a trained version of the machine-learning model to the mounted (see below) camera device (Hoffman section 2, paragraph bridging pages 3 & 4, “drive cam”, see below re “mounted” camera), so as to enable the mounted (see below) camera device to convert images obtained by an image sensor of the mounted camera to corresponding synthetic-appearing images (see below).
Re the recitations annotated “see below” above, Hoffman discloses a “drive cam” (Hoffman section 2, paragraph bridging pages 3 & 4).
Hoffman, however, does not specifically disclose the use of a mounted camera having image processing functionality, as opposed to (e.g.) a handheld driving camera:
…the machine-learning model is associated with a mounted camera device…
mounted camera device so as to enable the mounted camera device to convert images obtained by an image sensor of the mounted camera to corresponding synthetic-appearing images
The use of a vehicle-mounted camera device which includes an image processor enabling it to perform image processing operations is known in the art, as disclosed for example by Fukada (Fukada paragraphs 0028-0029, mounted vehicle camera having an integrated image processor).
Hoffman and Fukada are combinable because they are from the same field of vehicular cameras (i.e. drive cams).
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to implement the “drive cam” mentioned but not described in detail by Hoffman in the form of the mounted vehicle camera of Fukada.
The suggestion/motivation for doing so would have been to implement the “drive cam” element of Hoffman in a form which does not require a driver or passenger to hold a device by hand.
In addition to the specific teachings of the cited Prior Art, Examiner notes that the making of a known device in one-piece integrated form (e.g. mounting a device to a vehicle), in cases where this feature does not produce a new and unexpected result, has been judicially recognized as an expedient obvious to one of ordinary skill in the art. In re Larson, 340 F.2d 965, 968, 144 USPQ 347, 349 (CCPA 1965).
Re the “more similar” element annotated “see below” above, Hoffman teaches (Hoffman section 4.2.2) that the most obvious difference between the original real-domain images and the GTA synthetic-domain images is in the saturation levels, i.e. the latter are much more vivid than the former. Hoffman further teaches that adaptation of the former to the latter adjusts the colors to compensate for this difference.
Hoffman does not specifically teach that offsetting a difference between images results in images becoming more similar to synthetic images in visual characteristics.
Planche teaches (Planche paragraph 0035, “A first generator network is trained to take as input a real depth scan and to return an image that resembles a synthetic image”) generation of an image that is more similar to a synthetic image in visual characteristics.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to use a conversion of real-domain images to synthetic-domain images in such a way as to produce images more similar to synthetic-domain images.
The suggestion/motivation for doing so would have been to produce output synthetic-domain images which resembled the desired type of images (synthetic-domain images).
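For illustration only, the kind of color compensation described in Hoffman section 4.2.2 can be sketched as a saturation rescaling that moves the mean saturation of a real-domain image toward that of a set of synthetic-domain pixels. This is a hand-written stand-in under stated assumptions, not the learned generator of Hoffman or Planche; the function names and the use of mean HSV saturation as a similarity measure are hypothetical, and nothing in the record adopts that measure as a standard for the “more similar” element.

```python
import colorsys


def mean_saturation(pixels):
    """Mean HSV saturation of a list of (r, g, b) floats in [0, 1]."""
    return sum(colorsys.rgb_to_hsv(*p)[1] for p in pixels) / len(pixels)


def match_saturation(pixels, target_mean):
    """Rescale each pixel's saturation so the mean approaches target_mean.

    Hypothetical helper: a fixed gain stands in for a learned translation.
    """
    current = mean_saturation(pixels)
    if current == 0:
        return list(pixels)  # fully gray image: there is no saturation to scale
    gain = target_mean / current
    adapted = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        # Clamp so the rescaled saturation stays a valid HSV value.
        adapted.append(colorsys.hsv_to_rgb(h, min(1.0, s * gain), v))
    return adapted


# Made-up sample values: muted "real-domain" pixels, vivid "synthetic-domain" pixels.
real = [(0.5, 0.4, 0.4), (0.4, 0.5, 0.45), (0.45, 0.45, 0.6)]
synthetic = [(0.9, 0.2, 0.2), (0.1, 0.8, 0.3), (0.2, 0.2, 0.9)]

target = mean_saturation(synthetic)
adapted = match_saturation(real, target)
```

Under this one hypothetical measure, the adapted pixels are closer to the synthetic-domain set than the original real-domain pixels are, which is the sense in which compensating for a difference makes an image resemble the target domain more closely.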

Applying the teachings combined in accordance with the above rationale to claims 2-4 & 6-12:
Claim 2: The method of claim 1 (see above), wherein the real-domain images from the first set of training images are not paired with the synthetic-domain images from the second set of training images (Hoffman Abstract, “Recent work has shown that generative adversarial networks combined with cycle-consistency constraints are surprisingly effective at mapping images between domains, even without the use of aligned image pairs. We propose a novel discriminatively-trained Cycle-Consistent Adversarial Domain Adaptation model. CyCADA adapts representations at both the pixel-level and feature-level, enforces cycle-consistency while leveraging a task loss, and does not require aligned pairs” (i.e. both the background art known to Hoffman and the method described by Hoffman include the practice of the method without alignment of any image pairs)).
Claim 3: The method of claim 1 (see above), wherein the machine-learning model is a cycle-consistent generative adversarial network (Hoffman Abstract, cycle-consistent adversarial domain model).
Claim 4: The method of claim 1 (see above), wherein the scene type is indoor scene, outdoor scene (Hoffman Figures 4-6, outdoor urban scene), urban scene (Hoffman Figures 4-6, outdoor urban scene), rural scene, night scene, day scene, or (Note: This is a recitation in the alternative, satisfied by a teaching of any one option) a particular view of the location.
Claim 6: The method of claim 1 (see above), wherein the visual characteristics include a distribution of textures or a distribution of colors (Hoffman section 4.2.2, color visual characteristic).
Claim 7: The method of claim 1 (see above), wherein the first and second sets of training images both depict a similar distribution of object structures (Hoffman Figures 4-6, CityScapes and GTA outdoor urban scene having similar distribution of object structures).
Claim 8: A method for using a machine-learning model to identify objects depicted in real-domain sample images, wherein the machine learning model includes an object-recognition component and a real-to-synthetic-image component, and wherein the machine-learning model is associated with a mounted camera device (Hoffman Introduction, “real-world imagery” (which is necessarily obtained by an imaging device at a particular location), Hoffman section 2, paragraph bridging pages 3 & 4, “drive cam”, Fukada paragraphs 0028-0029 mounted vehicle camera), the method comprising:
generating, by one or more image sensors of the mounted camera device, one or more real-domain sample images, the one or more real-domain sample images depicting a view of the mounted camera device (Hoffman Abstract, real world domain images; Hoffman section 4.2, real-world CityScapes images, Hoffman section 2, paragraph bridging pages 3 & 4, “drive cam”, Fukada paragraphs 0028-0029 mounted vehicle camera);
at the mounted camera device (Hoffman section 2, paragraph bridging pages 3 & 4, “drive cam”, Fukada paragraphs 0028-0029 mounted vehicle camera), generating (Fukada paragraphs 0028-0029, integration of image processing function into camera device), by the real-to-synthetic-image component, respective synthetic-appearing sample images based on the respective real-domain sample images (Hoffman Figure 6, generation of translation images between Grand Theft Auto (i.e. video game) and CityScapes with translation in both directions);
at the mounted camera device (Hoffman section 2, paragraph bridging pages 3 & 4, “drive cam”, Fukada paragraphs 0028-0029 mounted vehicle camera), identifying (Fukada paragraphs 0028-0029, integration of image processing function into camera device), by the object-recognition component, objects depicted in the synthetic-appearing sample images (Hoffman Abstract and section 1, recognition of image objects), wherein the object-recognition component was trained using a set of synthetic-domain image data (Hoffman section 4.2, training of model using real-world CityScapes and synthetic GTA images); and
providing a report concerning the depicted objects based on the identification (Hoffman section 6.1.1, recognition results report).
Claim 9: The method of claim 8 (see above), wherein visual characteristics of the synthetic-appearing sample images are similar to the visual characteristics associated with the set of synthetic-domain image data (Hoffman section 4.2.2, converted images adjust visual characteristics (e.g. color) to compensate for differences between CityScapes and GTA images; Hoffman Figure 6, generation of translation images between GTA and CityScapes with translation in both directions (i.e. including translation of images from real-world CityScapes images to synthetic GTA images); Hoffman section 4.2.2, “We visualize the results of image-space adaptation between GTA5 and Cityscapes in Figure 6. The most obvious difference between the original images and the adapted images is the saturation levels the GTA5 imagery is much more vivid than the Cityscapes imagery, so adaptation adjusts the colors to compensate.”, describing an adjustment of a visual characteristic in the conversion between real-domain and synthetic-domain images; Planche paragraph 0035, “A first generator network is trained to take as input a real depth scan and to return an image that resembles a synthetic image”).
Claim 10: The method of claim 8 (see above), wherein the object-recognition component is a convolutional neural network (Hoffman section 1, first paragraph, neural network; section 3, final paragraph, convolutional network).
Claim 11: The method of claim 8 (see above), wherein the real-to-synthetic-image component is a generative network of a cycle-consistent adversarial network (Hoffman Abstract, cycle-consistent adversarial domain model).
Claim 12: The method of claim 8 (see above), wherein the mounted camera device (Hoffman section 2, paragraph bridging pages 3 & 4, “drive cam”, Fukada paragraphs 0028-0029 mounted vehicle camera) is associated with a location (Hoffman Introduction, “real-world imagery” (which is necessarily obtained by an imaging device at a particular location)), the location associated with a scene type (Hoffman Abstract, road scenes), and the set of synthetic-domain image data includes objects and lighting conditions that are expected to be present at the location (Hoffman section 4.2, SYNTHIA video sequences rendered for various environments and lighting conditions).
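For illustration only, the cycle-consistency constraint referenced in the mappings of claims 3 and 11 can be sketched as it is commonly formulated for cycle-consistent adversarial networks: an image translated to the other domain and back should match the original, and the mismatch is penalized with an L1 distance. The toy translators below are hypothetical stand-ins for trained generator networks, and the sketch omits the adversarial and task losses that Hoffman also describes.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equal-length pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(real_to_synth, synth_to_real, real_batch, synth_batch):
    """Sum of the round-trip reconstruction errors in both directions.

    real_to_synth and synth_to_real are callables mapping a pixel list to a
    pixel list (hypothetical placeholders for the two generators).
    """
    forward = sum(
        l1_distance(synth_to_real(real_to_synth(x)), x) for x in real_batch
    ) / len(real_batch)
    backward = sum(
        l1_distance(real_to_synth(synth_to_real(y)), y) for y in synth_batch
    ) / len(synth_batch)
    return forward + backward


# Toy translators: doubling and halving are exact inverses of each other.
def to_synth(image):
    return [2 * p for p in image]


def to_real(image):
    return [p / 2 for p in image]


def to_real_biased(image):
    # A deliberately imperfect inverse, to show a nonzero penalty.
    return [p / 2 + 0.1 for p in image]


real_batch = [[0.1, 0.2]]
synth_batch = [[0.4, 0.8]]
consistent = cycle_consistency_loss(to_synth, to_real, real_batch, synth_batch)
inconsistent = cycle_consistency_loss(to_synth, to_real_biased, real_batch, synth_batch)
```

Because the constraint compares each image only with its own round-trip reconstruction, it does not require aligned real/synthetic image pairs, which is consistent with the unpaired training noted for claim 2.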
Claim 13, insofar as it is understood, is rejected under 35 U.S.C. 103 as being unpatentable over Hoffman (“CyCADA: Cycle-Consistent Adversarial Domain Adaptation”, cited in 5/21/19 Information Disclosure Statement) in view of Fukada and Planche as applied to claim 8 above, and further in view of Gaidon (“Virtual Worlds as Proxy for Multi-Object Tracking Analysis”, cited in 5/21/19 Information Disclosure Statement).
Re claim 13, Hoffman in view of Fukada and Planche teaches the invention of claim 8, as described above.
Claim 13: The method of claim 8 (see above)…
Hoffman in view of Fukada and Planche does not disclose the recited arrangement of generating synthetic image data from a scene specification outline and seed value.
Claim 13: …wherein the set of synthetic-domain image data used to train the object-recognition component was deterministically generated in accordance with a scene specification outline and a seed value, wherein the scene specification outline specifies a range of scenes, and wherein each of the scenes comprises one or more objects and a camera model.
Gaidon discloses (Gaidon sections 3.2-3.3, particularly section 3.2, final paragraph and section 3.3, second paragraph, generation of synthetic video having a range of parameters such as camera position (generating a different scene) and lighting conditions) generating synthetic image data from a scene specification outline (any element of which is readable upon the (not further specified) “seed value”).
Hoffman in view of Fukada and Planche, and Gaidon, are combinable because they are from the same field of synthetic image processing.
Before the effective filing date of the invention, it would have been obvious to a person of ordinary skill in the art to generate the Hoffman scenes from various parameters such as camera position and lighting as taught by Gaidon.
The suggestion/motivation for doing so would have been to enable the generation of synthetic images in a manner customized to enable study of single factors or conduct “what if” analysis, as described by Gaidon (Gaidon section 3.3, first paragraph).
Therefore, it would have been obvious to combine Hoffman in view of Fukada and Planche with Gaidon to obtain the invention as specified in claim 13.
Claim 13: The method of claim 8 (see above), wherein the set of synthetic-domain image data used to train the object-recognition component was deterministically generated in accordance with a scene specification outline and a seed value (Gaidon sections 3.2-3.3, generation of synthetic video having a range of parameters such as camera position (generating a different scene) and lighting conditions), wherein the scene specification outline specifies a range of scenes (Gaidon section 3.3, generating scenes with changed components, inherently indicating that at least two scenes are generated), and wherein each of the scenes comprises one or more objects and a camera model (Gaidon Figure 3, scene having a plurality of objects with camera modeling data such as camera angle).
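For illustration only, one way that scene data could be deterministically generated from an outline of parameter ranges together with a seed value is sketched below. This is neither Gaidon's procedure nor Applicant's disclosed implementation; the outline fields, value ranges, and function name are hypothetical, and the sketch shows only that a seeded pseudo-random generator makes the selection of scenes reproducible.

```python
import random

# Hypothetical outline: each entry gives the range from which scenes are drawn.
SCENE_SPEC_OUTLINE = {
    "object_types": ["car", "pedestrian", "cyclist"],
    "object_count": (1, 4),            # inclusive bounds, at least one object
    "camera_height_m": (1.0, 3.0),
    "camera_pitch_deg": (-15.0, 0.0),
    "lighting": ["day", "dusk", "night"],
}


def generate_scenes(spec, seed, n_scenes):
    """Draw n_scenes scene descriptions from spec, reproducibly for a given seed."""
    rng = random.Random(seed)  # private generator: same seed, same sequence
    scenes = []
    for _ in range(n_scenes):
        count = rng.randint(*spec["object_count"])
        scenes.append({
            "objects": [rng.choice(spec["object_types"]) for _ in range(count)],
            "camera": {
                "height_m": rng.uniform(*spec["camera_height_m"]),
                "pitch_deg": rng.uniform(*spec["camera_pitch_deg"]),
            },
            "lighting": rng.choice(spec["lighting"]),
        })
    return scenes


first_run = generate_scenes(SCENE_SPEC_OUTLINE, seed=42, n_scenes=5)
second_run = generate_scenes(SCENE_SPEC_OUTLINE, seed=42, n_scenes=5)
```

In this sketch, repeating the call with the same outline and seed yields identical scene descriptions, each containing one or more objects and a set of camera parameters.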
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Li, Dundar, Matsumura, Zhang, and Peng disclose further examples of camera devices and synthetic-domain type conversion.
Any inquiry concerning the contents of this communication or earlier communications from the examiner should be directed to Stephen M. Brinich at 571-272-7430 (voice) or 571-273-7430 (fax).
Any inquiry relating to the status of this application, entry of papers into this application, or any other inquiries of a general nature concerning application processing should be directed to the Tech Center 2600 Customer Service center at 571-272-2600 or to the USPTO Contact Center at 800-786-9199 or 571-272-1000.
The examiner can normally be reached on weekdays 7:30-4:00 Eastern Time.
If attempts to contact the examiner and the Customer Service Center are unsuccessful, supervisor Claire Wang can be contacted at 571-270-1051.

/Stephen M Brinich/
Examiner, Art Unit 2663