Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 3/8/21 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 1 of U.S. Patent No. 10,944,898 in view of Wrenninge et al. (US 10,235,601). 
Regarding claim 1 of the instant application:

Claim 1 of instant application:
1. A system for guiding image sensor angle settings, the system comprising: a memory storing computer program instructions; and at least one processor configured to execute the computer program instructions to perform operations comprising: 
Claim 1 of US 10,944,898:
1. A system for guiding image sensor angle settings, the system comprising:
a memory storing executable instructions; and at least one processor configured to execute instructions to perform operations comprising:
obtaining synthetic images depicting a set of synthetic scenes, wherein each synthetic image of the synthetic images depicts a synthetic scene of the set of synthetic scenes, the synthetic images comprising (i) a first synthetic image that depicts a first synthetic scene corresponding to a first classification and (ii) a second synthetic image that depicts a second synthetic scene corresponding to a second classification;
obtaining a plurality of synthetic images, the synthetic images representing a plurality of scenes;
capturing, by an image sensor, a plurality of images of an environment of an automated teller machine (ATM);
capturing, by an image sensor, a plurality of images from an environment of a user;
identifying the first classification for a live scene in the live image stream of the ATM environment based on the live scene being similar to the first synthetic scene with respect to a criteria set, the first classification corresponding to the first synthetic scene being identified for the live scene over the second classification corresponding to the second synthetic scene
training a classification model to classify the captured images based on the comparison,
wherein the classification of the captured images includes classification of the captured images into a plurality of groups based on characteristics of objects identified in the captured images;
causing, based on the identification of the first classification for the live scene, image sensor angle parameters of the image sensor to be adjusted from a different set of image sensor angle parameters to a first set of image sensor angle parameters corresponding to the first synthetic scene such that the live scene in the live image stream becomes more similar to the first synthetic scene with respect to the criteria set as a result of the adjustment from the different set of image sensor angle parameters to the first set of image sensor angle parameters
determining an angular position of the image sensor based on the classification of the captured images;
adjusting, based on the comparison of the angular position of the image sensor to the predetermined angular position, the angular position of the image sensor.


In an analogous art, Wrenninge teaches:
the synthetic images comprising (i) a first synthetic image that depicts a first synthetic scene corresponding to a first classification and (ii) a second synthetic image that depicts a second synthetic scene corresponding to a second classification (Figs. 1A and 2, wherein a set of synthetic image dataset is generated using different video scenes and later used for training a classification model. Col. 23, lines 17-33 teaches repeatedly performing steps S100-S300 for numerous video scenes used to generate the synthetic image dataset. Additionally, col. 4, lines 13-33 teaches “labels” such as labels of object within synthetic images, object types, layouts, relative orientations and positions, etc. The set of scenes are met by the synthetic images, which naturally includes a “scene”);
It would have been obvious to one of ordinary skill in the art at the time of the invention to incorporate the teachings of Wrenninge into claim 1 of 10,944,898 such that the combined system also utilizes the ability to obtain image data sets wherein each of the plurality of labels indicates a scene of a set of scenes so that it allows the combined system to determine whether a user’s face is ideally located in the image frame (prior to the decision to adjust the camera) because such incorporation allows for the benefit of improving the accuracy of image based comparisons (Wrenninge: col. 1, lines 31-61).
Claims 3 and 11 of the instant application also inherit the same analysis of claim 1 of the instant application above.
Dependent claims 2, 4-10 and 12-20 are also rejected based on the recitations of claim 1 of the instant application above.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Li (CN108877099A) in view of Wrenninge et al. (US 10,235,601).
Regarding claim 1, Li teaches a system for guiding image sensor angle settings for an automated teller machine (ATM) environment (Fig. 1 and abstract of a camera system that changes its angle settings), the system comprising: 
a memory storing executable instructions (abstract, wherein the face recognition technology requires at least software stored on a medium being executed by a processor)
at least one processor configured to execute the instructions to perform operations (abstract, wherein the face recognition technology requires at least software stored on a medium being executed by a processor) comprising:
obtaining, via an image sensor, a live image stream of an ATM environment (abstract, camera images the user of the ATM);
causing, based on the identification of the first classification for the live scene, image sensor angle parameters of the image sensor to be adjusted from a different set of image sensor angle parameters to a first set of image sensor angle parameters corresponding to the first synthetic scene such that the live scene in the live image stream becomes more similar to the first synthetic scene with respect to the criteria set as a result of the adjustment from the different set of image sensor angle parameters to the first set of image sensor angle parameters (abstract teaches “the ATM … adjusts the angle and height of the camera through the adjusting mechanism” based on face recognition technology. It also teaches wherein the image sensor is determined to be in a position based on the need to “recognize the user”. Therefore, when the camera is determined to be moved to “recognize the user”, the image sensor/camera’s current orientation is determined to assist in moving the position to another position. The another position is associated with the image sets that are used to help in recognizing the user. Each of the position of the image sensor angle is therefore correlated to the image sets used in determining/recognizing the user. When a user enters the field of view of the camera, each of the images taken is therefore used to determine the location of the user. The claimed becoming more similar is achieved by the system aligning itself to the matching image set for the ideal frame to recognize the user); and
It is noted that Li teaches a system wherein a determination is made that the face recognition result is not ideal to comprehensively recognize a user and thereby determines the current orientation to be non-ideal, and therefore results in the moving of the image sensor to capture the user better by changing the angle and height of the camera. Therefore, while Li recognizes the need to identify a user’s face, Li fails to explicitly teach the following limitations:
obtaining synthetic images depicting a set of synthetic scenes, wherein each synthetic image of the synthetic images depicts a synthetic scene of the set of synthetic scenes, the synthetic images comprising (i) a first synthetic image that depicts a first synthetic scene corresponding to a first classification and (ii) a second synthetic image that depicts a second synthetic scene corresponding to a second classification; 
identifying the first classification for a live scene in the live image stream of the ATM environment based on the live scene being similar to the first synthetic scene with respect to a criteria set, the first classification corresponding to the first synthetic scene being identified for the live scene over the second classification corresponding to the second synthetic scene;
In an analogous art, Wrenninge teaches:
obtaining synthetic images depicting a set of synthetic scenes, wherein each synthetic image of the synthetic images depicts a synthetic scene of the set of synthetic scenes, the synthetic images comprising (i) a first synthetic image that depicts a first synthetic scene corresponding to a first classification and (ii) a second synthetic image that depicts a second synthetic scene corresponding to a second classification (Figs. 1A and 2, wherein a set of synthetic image dataset is generated/obtained using different video scenes and later used for training a classification model. Col. 23, lines 17-33 teaches repeatedly performing steps S100-S300 for numerous video scenes used to generate the synthetic image dataset. Additionally, col. 4, lines 13-33 teaches “labels” such as labels of object within synthetic images, object types, layouts, relative orientations and positions, etc. The set of scenes are met by the synthetic images, which naturally includes a “scene”);
identifying the first classification for a live scene in the live image stream of the ATM environment based on the live scene being similar to the first synthetic scene with respect to a criteria set, the first classification corresponding to the first synthetic scene being identified for the live scene over the second classification corresponding to the second synthetic scene (Figs. 1A and 2, wherein a set of synthetic image dataset is generated using different video scenes and later used for training/testing a classification model in steps S600 and S700. Wrenninge also teaches a convolutional neural network that repeats the process to retrain (to fine tune) and/or to test/evaluate the trained model in col. 23, lines 17-33, col. 20, lines 6-61 and col. 1, lines 31-61. Furthermore, regarding step S500, which generates the image dataset, col. 19, lines 24-38 states that the images for the image dataset include real-world images. Following that, in step S700, the model is tested based on the image dataset (which includes the real-world images). The comparison checks whether the same labels (labels of object within synthetic images, object types, layouts, relative orientations and positions, etc. as discussed above) appear between the model and the image dataset to classify and thereafter evaluate the model. Therefore, the ability to detect the same first/second classifications of objects/layouts/etc. between a live image and existing trained image datasets is taught by Wrenninge);
It would have been obvious to one of ordinary skill in the art at the time of the invention to incorporate the teachings of Wrenninge into the system of Li such that Li also utilizes the ability to obtain image data sets and to classify captured images (similar to the real images from the image dataset in Wrenninge) using the above synthetic images (from the generated model) into various groups during the testing/evaluation so that it allows Li to determine whether a user’s face is ideally located in the image frame (prior to the decision to adjust the camera) because such incorporation allows for the benefit of improving the accuracy of image based comparisons (Wrenninge: col. 1, lines 31-61).
Regarding claim 2, Wrenninge teaches the claimed wherein identifying the first classification for the live scene comprises: determining, via a classification model, the first classification for the live scene in the image stream of the ATM environment, wherein the classification model indicates that the live scene matches the first synthetic scene corresponding to the first classification (Figs. 1A and 2, wherein a set of synthetic image dataset is generated using different video scenes and later used for training/testing a classification model in steps S600 and S700. Wrenninge also teaches a convolutional neural network that repeats the process to retrain (to fine tune) and/or to test/evaluate the trained model in col. 23, lines 17-33, col. 20, lines 6-61 and col. 1, lines 31-61. Furthermore, regarding step S500, which generates the image dataset, col. 19, lines 24-38 states that the images for the image dataset include real-world images. Following that, in step S700, the model is tested based on the image dataset (which includes the real-world images). The comparison checks whether the same labels (labels of object within synthetic images, object types, layouts, relative orientations and positions, etc. as discussed above) appear between the model and the image dataset to classify and thereafter evaluate the model. Therefore, the ability to detect the same first/second classifications of objects/layouts/etc. between a live image and existing trained image datasets is taught by Wrenninge).
Regarding claims 3 and 11, Li teaches a method comprising:
obtaining, via an image sensor, an image stream of a kiosk environment (abstract, camera images the user of the ATM. The machinery for an ATM would meet the same requirements for a kiosk per se);
causing adjustment of image sensor angle parameters of the image sensor based on (i) the identification of the first classification for the live scene and (ii) a first set of image sensor angle parameters corresponding to the first synthetic scene (abstract teaches “the ATM … adjusts the angle and height of the camera through the adjusting mechanism” based on face recognition technology. It also teaches wherein the image sensor is determined to be in a position based on the need to “recognize the user”. The determination of a user’s face on incoming video frames meets the claimed first classification since it classifies whether a user’s face is identified or not. Therefore, when the camera is determined to be moved to “recognize the user”, the image sensor/camera’s current orientation is determined to assist in moving the position to another position. The another position is associated with the image sets that are used to help in recognizing the user. Each of the position of the image sensor angle is therefore correlated to the image sets used in determining/recognizing the user. When a user enters the field of view of the camera, each of the images taken is therefore used to determine the location of the user. The claimed becoming more similar is achieved by the system aligning itself to the matching image set for the ideal frame to recognize the user); and
It is noted that Li teaches a system wherein a determination is made that the face recognition result is not ideal to comprehensively recognize a user and thereby determines the current orientation to be non-ideal, and therefore results in the moving of the image sensor to capture the user better by changing the angle and height of the camera. Therefore, while Li recognizes the need to identify a user’s face, Li fails to explicitly teach, but Wrenninge teaches, the following limitations:
storing representations of synthetic scenes of a set of synthetic scenes, the stored representations comprising (i) a first representation of a first synthetic scene corresponding to a first classification and (ii) a second representation of a second synthetic scene corresponding to a second classification (Figs. 1A and 2, wherein a set of synthetic image dataset is generated using different video scenes and later used for training a classification model. Col. 23, lines 17-33 teaches repeatedly performing steps S100-S300 for numerous video scenes used to generate the synthetic image dataset. Additionally, col. 4, lines 13-33 teaches “labels” such as labels of object within synthetic images, object types, layouts, relative orientations and positions, etc. The set of scenes are met by the synthetic images, which naturally includes a “scene”. These labels for objects/layouts/orientation/etc. meet the first and second types of classifications applicable to the current scene/frame);
identifying the first classification for a live scene in the image stream of the kiosk environment based on the live scene matching the first synthetic scene, the first classification corresponding to the first synthetic scene being identified for the live scene over the second classification corresponding to the second synthetic scene (Figs. 1A and 2, wherein a set of synthetic image dataset is generated using different video scenes and later used for training/testing a classification model in steps S600 and S700. Wrenninge also teaches a convolutional neural network that repeats the process to retrain (to fine tune) and/or to test/evaluate the trained model in col. 23, lines 17-33, col. 20, lines 6-61 and col. 1, lines 31-61. Furthermore, regarding step S500, which generates the image dataset, col. 19, lines 24-38 states that the images for the image dataset include real-world images. Following that, in step S700, the model is tested based on the image dataset (which includes the real-world images). The comparison checks whether the same labels (labels of object within synthetic images, object types, layouts, relative orientations and positions, etc. as discussed above) appear between the model and the image dataset to classify and thereafter evaluate the model. Therefore, the ability to detect the same first/second classifications of objects/layouts/etc. between a live image and existing trained image datasets is taught by Wrenninge).
It would have been obvious to one of ordinary skill in the art at the time of the invention to incorporate the teachings of Wrenninge into the system of Li such that Li also utilizes the ability to obtain image data sets and to classify captured images (similar to the real images from the image dataset in Wrenninge) using the above synthetic images (from the generated model) into various groups during the testing/evaluation so that it allows Li to determine whether a user’s face is ideally located in the image frame (prior to the decision to adjust the camera) because such incorporation allows for the benefit of improving the accuracy of image based comparisons (Wrenninge: col. 1, lines 31-61). The prior motivation as discussed above is incorporated herein.
Regarding claims 4 and 12, the proposed combination of Li and Wrenninge teaches the claimed wherein causing the adjustment of the image sensor angle parameters of the image sensor comprises adjusting the image sensor angle parameters of the image sensor from a different set of image sensor angle parameters to the first set of image sensor angle parameters corresponding to the first synthetic scene (Li: abstract teaches “the ATM … adjusts the angle and height of the camera through the adjusting mechanism” based on face recognition technology. It also teaches wherein the image sensor is determined to be in a position based on the need to “recognize the user”. Therefore, when the camera is determined to be moved to “recognize the user”, the image sensor/camera’s current orientation is determined to assist in moving the position to another position. The another position is associated with the image sets that are used to help in recognizing the user. Each of the position of the image sensor angle is therefore correlated to the image sets used in determining/recognizing the user. When a user enters the field of view of the camera, each of the images taken is therefore used to determine the location of the user. The claimed becoming more similar is achieved by the system aligning itself to the matching image set for the ideal frame to recognize the user. Furthermore, in combining Li with Wrenninge, Wrenninge teaches the ability to associate a set of synthetic image datasets to assist in the process of machine learning system, such as Li’s, to improve the accuracy of Li’s face detection and orientation correction). The prior motivation as discussed above is incorporated herein.
Regarding claims 5 and 13, the proposed combination of Li and Wrenninge teaches the claimed wherein causing the adjustment of the image sensor angle parameters of the image sensor comprises causing the adjustment of the image sensor angle parameters of the image sensor based on (i) the identification of the first classification for the live scene and (ii) the first set of image sensor angle parameters corresponding to the first synthetic scene such that the live scene in the image stream becomes more similar to the first synthetic scene as a result of the adjustment to the first set of image sensor angle parameters (Li: abstract teaches “the ATM … adjusts the angle and height of the camera through the adjusting mechanism” based on face recognition technology. It also teaches wherein the image sensor is determined to be in a position based on the need to “recognize the user”. The determination of a user’s face on incoming video frames meets the claimed first classification since it classifies whether a user’s face is identified or not. Therefore, when the camera is determined to be moved to “recognize the user”, the image sensor/camera’s current orientation is determined to assist in moving the position to another position. The another position is associated with the image sets that are used to help in recognizing the user. Each of the position of the image sensor angle is therefore correlated to the image sets used in determining/recognizing the user. When a user enters the field of view of the camera, each of the images taken is therefore used to determine the location of the user. The claimed becoming more similar is achieved by the system aligning itself to the matching image set for the ideal frame to recognize the user. 
Furthermore, in combining Li with Wrenninge, Wrenninge teaches the ability to associate a set of synthetic image datasets to assist in the process of machine learning system, such as Li’s, to improve the accuracy of Li’s face detection and orientation correction). The prior motivation as discussed above is incorporated herein.
Regarding claims 6 and 14, the proposed combination of Li and Wrenninge teaches the claimed wherein causing the adjustment of the image sensor angle parameters of the image sensor comprises causing the adjustment of the image sensor angle parameters of the image sensor based on (i) the identification of the first classification for the live scene and (ii) the first set of image sensor angle parameters corresponding to the first synthetic scene such that the live scene in the image stream avoids capturing one or more portions of an input device of a kiosk in the kiosk environment as a result of the adjustment to the first set of image sensor angle parameters (Li: abstract teaches “the ATM … adjusts the angle and height of the camera through the adjusting mechanism” based on face recognition technology. It also teaches wherein the image sensor is determined to be in a position based on the need to “recognize the user”. The determination of a user’s face on incoming video frames meets the claimed first classification since it classifies whether a user’s face is identified or not. Therefore, when the camera is determined to be moved to “recognize the user”, the image sensor/camera’s current orientation is determined to assist in moving the position to another position. The another position is associated with the image sets that are used to help in recognizing the user. Each of the position of the image sensor angle is therefore correlated to the image sets used in determining/recognizing the user. When a user enters the field of view of the camera, each of the images taken is therefore used to determine the location of the user. The claimed becoming more similar is achieved by the system aligning itself to the matching image set for the ideal frame to recognize the user. Li’s camera system therefore avoids capturing portions around the ATM/kiosk that’s incorrect. 
Furthermore, in combining Li with Wrenninge, Wrenninge teaches the ability to associate a set of synthetic image datasets to assist in the process of machine learning system, such as Li’s, to improve the accuracy of Li’s face detection and orientation correction). The prior motivation as discussed above is incorporated herein.
Regarding claims 7 and 15, the proposed combination of Li and Wrenninge teaches the claimed wherein causing the adjustment of the image sensor angle parameters of the image sensor comprises causing the adjustment of the image sensor angle parameters of the image sensor based on (i) the identification of the first classification for the live scene and (ii) the first set of image sensor angle parameters corresponding to the first synthetic scene such that the live scene in the image stream includes a face of an individual in the kiosk environment as a result of the adjustment to the first set of image sensor angle parameters (Li: abstract teaches “the ATM … adjusts the angle and height of the camera through the adjusting mechanism” based on face recognition technology. It also teaches wherein the image sensor is determined to be in a position based on the need to “recognize the user”. The determination of a user’s face on incoming video frames meets the claimed first classification since it classifies whether a user’s face is identified or not. Therefore, when the camera is determined to be moved to “recognize the user”, the image sensor/camera’s current orientation is determined to assist in moving the position to another position. The another position is associated with the image sets that are used to help in recognizing the user. Each of the position of the image sensor angle is therefore correlated to the image sets used in determining/recognizing the user. When a user enters the field of view of the camera, each of the images taken is therefore used to determine the location of the user. The claimed becoming more similar is achieved by the system aligning itself to the matching image set for the ideal frame to recognize the user. Li’s camera system therefore moves the camera towards positions that would capture a user’s face better. 
Furthermore, in combining Li with Wrenninge, Wrenninge teaches the ability to associate a set of synthetic image datasets to assist in the process of machine learning system, such as Li’s, to improve the accuracy of Li’s face detection and orientation correction). The prior motivation as discussed above is incorporated herein.
Regarding claims 8 and 16, the proposed combination of Li and Wrenninge teaches the claimed further comprising: detecting, via one or more sensors, use of a kiosk by a user in the kiosk environment (Li: as discussed in claims 3 and 11 above, wherein the user’s face is detected in the environment of the ATM/kiosk),
wherein causing the adjustment of the image sensor angle parameters of the image sensor comprises causing the adjustment of the image sensor angle parameters of the image sensor based on (i) the identification of the first classification for the live scene, (ii) the first set of image sensor angle parameters corresponding to the first synthetic scene, and (iii) the detection of the use of the kiosk by the user (Li: abstract teaches “the ATM … adjusts the angle and height of the camera through the adjusting mechanism” based on face recognition technology. It also teaches wherein the image sensor is determined to be in a position based on the need to “recognize the user”. The determination of a user’s face on incoming video frames meets the claimed first classification since it classifies whether a user’s face is identified or not. Therefore, when the camera is determined to be moved to “recognize the user”, the image sensor/camera’s current orientation is determined to assist in moving the position to another position. The another position is associated with the image sets that are used to help in recognizing the user. Each of the position of the image sensor angle is therefore correlated to the image sets used in determining/recognizing the user. When a user enters the field of view of the camera, each of the images taken is therefore used to determine the location of the user. The claimed becoming more similar is achieved by the system aligning itself to the matching image set for the ideal frame to recognize the user. Li’s camera system therefore moves the camera towards positions that would capture a user’s face better. Furthermore, in combining Li with Wrenninge, Wrenninge teaches the ability to associate a set of synthetic image datasets to assist in the process of machine learning system, such as Li’s, to improve the accuracy of Li’s face detection and orientation correction). The prior motivation as discussed above is incorporated herein.
Regarding claims 9 and 17, Li and Wrenninge teach the claimed wherein identifying the first classification for the live scene comprises:
determining, via a classification model, the first classification for the live scene in the image stream of the kiosk environment (Li: abstract teaches images captured live at an ATM/Kiosk), wherein the classification model indicates that the live scene matches the first synthetic scene corresponding to the first classification (Wrenninge: Figs. 1A and 2, wherein a synthetic image dataset is generated using different video scenes and later used for training/testing a classification model in steps S600 and S700. Wrenninge also teaches a convolutional neural network that repeats the process to retrain (to fine tune) and/or to test/evaluate the trained model in col. 23, lines 17-33, col. 20, lines 6-61 and col. 1, lines 31-61. Furthermore, regarding step S500, in generating the image dataset, it is stated in col. 19, lines 24-38 that the images for the image dataset include real-world images. Following that, in step S700, the model is tested based on the image dataset (which includes the real-world images). The comparison checks whether the same labels (labels of objects within synthetic images, object types, layouts, relative orientations and positions, etc. as discussed above) appear between the model and the image dataset to classify and thereafter evaluate the model). The prior motivation as discussed above is incorporated herein.
Regarding claims 10 and 18, Li and Wrenninge teach the claimed wherein identifying the first classification for the live scene comprises:
determining, via a neural network, the first classification for the live scene in the image stream of the kiosk environment (Li: abstract teaches images captured live at an ATM/Kiosk), wherein the neural network is trained on a collection of synthetic images depicting synthetic scenes (Wrenninge: Figs. 1A and 2, wherein a synthetic image dataset is generated using different video scenes and later used for training/testing a classification model in steps S600 and S700. Wrenninge also teaches a convolutional neural network that repeats the process to retrain (to fine tune) and/or to test/evaluate the trained model in col. 23, lines 17-33, col. 20, lines 6-61 and col. 1, lines 31-61. Furthermore, regarding step S500, in generating the image dataset, it is stated in col. 19, lines 24-38 that the images for the image dataset include real-world images. Following that, in step S700, the model is tested based on the image dataset (which includes the real-world images). The comparison checks whether the same labels (labels of objects within synthetic images, object types, layouts, relative orientations and positions, etc. as discussed above) appear between the model and the image dataset to classify and thereafter evaluate the model). The prior motivation as discussed above is incorporated herein.
Regarding claims 19 and 20, Li and Wrenninge teach the claimed wherein causing the adjustment of the image sensor angle parameters of the image sensor comprises causing an adjustment of an orientation of the image sensor based on (i) the identification of the first classification for the live scene and (ii) a first (angular position)/(orientation) of the first set of image sensor angle parameters corresponding to the first synthetic scene (Li: abstract teaches “the ATM … adjusts the angle and height of the camera through the adjusting mechanism” based on face recognition technology. It also teaches wherein the image sensor is determined to be in a position based on the need to “recognize the user”. The determination of a user’s face on incoming video frames meets the claimed first classification since it classifies whether a user’s face is identified or not. Therefore, when the camera is determined to be moved to “recognize the user”, the image sensor/camera’s current orientation is determined to assist in moving the position to another position. The other position is associated with the image sets that are used to help in recognizing the user. Each position of the image sensor angle is therefore correlated to the image sets used in determining/recognizing the user. When a user enters the field of view of the camera, each of the images taken is therefore used to determine the location of the user. The claimed becoming more similar is achieved by the system aligning itself to the matching image set for the ideal frame to recognize the user. Furthermore, in combining Li with Wrenninge, Wrenninge teaches the ability to associate a set of synthetic image datasets to assist in the process of a machine learning system, such as Li’s, to improve the accuracy of Li’s face detection and orientation correction). The prior motivation as discussed above is incorporated herein.


Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GELEK W TOPGYAL whose telephone number is (571)272-8891. The examiner can normally be reached M-F (9:30-6 PST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, William Vaughn, can be reached at 571-272-3922. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GELEK W TOPGYAL/Primary Examiner, Art Unit 2481