DETAILED ACTION
Claims 1-20 have been examined.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 4, 6, 7, 9-12, 14, 15, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Okuda et al. “Machine Learning of Shooting Technique for Controlling a Robot Camera” (hereinafter Okuda), in view of Carr et al. (US 2016/0277673, hereinafter Carr). 

As per claim 1, Okuda teaches the invention as claimed, including a computer-implemented method for controlling a robot, the method comprising: 
control data selecting a first cinematic technique from a plurality of different cinematic techniques to execute when capturing sensor data (i.e., predicted status ȓt+1 of the robot camera, 
generating, by a configured network, a set of commands based on the control data (i.e., the learned neural network sends a control signal to bring the robot camera status to the predicted status ȓt+1, see at least pages 1111-1112, section II), wherein the configured network is trained based on the plurality of different cinematic techniques (i.e., neural network learns a cameraman’s shooting technique in individual scenes, see at least page 1111, right column, paragraph 3, pages 1111-1112, section II); and 
causing a robotic camera to execute the first cinematic technique, based on the set of commands, to capture the sensor data (i.e., neural network sends a control signal to bring the robot camera status to the predicted status ȓt+1, see at least pages 1111-1112, section II).
Okuda does not explicitly teach that the control data selecting a first cinematic technique is received as a first input.
Carr teaches receiving a first input comprising control data selecting a first cinematic technique from a plurality of different cinematic techniques to execute when capturing sensor data (i.e., the regressor outputs pan-tilt-zoom settings predicting what a human camera operator would choose, the automatic broadcasting application may control motors in the autonomous robotic camera to achieve the planned pan-tilt-zoom settings, executing a separate algorithm which determines the signals that need to be sent to the autonomous robotic camera to control servo motors to achieve the desired pan-tilt-zoom settings, [0013], [0018], [0035]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Okuda such that the control data selecting a first cinematic technique is received as an input as similarly taught by Carr because Okuda discloses using the predicted status to 
	
As per claim 2, Okuda teaches wherein the configured network is trained to generate the set of commands based on training data associated with an exemplary robotic camera capturing exemplary sensor data (i.e., train the neural network using information sent to it when a cameraman manually controls the robot camera, see at least pages 1111-1114, sections II and III).

As per claim 4, Okuda teaches causing the robotic camera to execute the first cinematic technique by: processing the sensor data to identify a first cue (i.e., receive camera subject status, see at least pages 1111-1114, sections II and III); 
and configuring the robotic camera in response to the first cue (i.e., calculate predicted camera status based on the camera subject status, see at least pages 1111-1114, sections II and III).

As per claim 6, Okuda teaches causing the robotic camera to execute the first cinematic technique by adjusting a characteristic style with which the robotic camera captures the sensor data based on an exemplary characteristic style (i.e., the neural network is trained to bring ȓt+1 as close as possible to the value of the status rt+1 of the robot camera when it was being controlled by the cameraman, see at least pages 1111-1112, section II).

As per claim 7, Okuda teaches wherein the first cinematic technique comprises at least one cinematographic operation or a sequence of cinematographic operations (i.e., status of the robot camera including pan angle, tilt angle, zoom position, focus position, three-dimensional position, see at least pages 1111-1112, section II).

As per claim 9, Okuda teaches wherein the configured network comprises an artificial neural network trained via a machine learning algorithm (see at least pages 1111-1112, section II, pages 1113-1114, section III.C.). 

As per claims 10, 11, 15, 17, and 18, these are the computer-readable medium claims corresponding to claims 1, 2, 4, 6, and 7. Therefore, claims 10, 11, 15, 17, and 18 are rejected for the same reasons as claims 1, 2, 4, 6, and 7.

As per claim 12, Okuda teaches wherein the training data indicates a mapping between a set of cues and corresponding cinematographic operations to be performed by the robotic camera in response to the set of cues (i.e., status rt of the robot camera at time t and the camera subject status st at time t, see at least pages 1111-1112, section II).

As per claim 14, Okuda teaches wherein the training data indicates a set of style selections that influence execution of the first cinematic technique by the robotic camera (i.e., 
As per claim 19, this is the system claim corresponding to claim 1. Therefore, claim 19 is rejected for the same reasons as claim 1.

As per claim 20, Okuda as modified teaches wherein the one or more processors, when executing the control engine, are configured to: receive the first input; generate the set of commands; and cause the robotic camera to execute the first cinematic technique (see at least pages 1111-1112, section II).

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Okuda, in view of Carr, further in view of Kagei (US 2011/0267481).

As per claim 3, Okuda does not explicitly teach translating the set of commands into control signals for controlling one or more operations of the robotic camera.
Kagei teaches translating a set of commands into control signals for controlling one or more operations of a camera (i.e., operation instruction converted into signal, see at least [0032]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Okuda to translate the set of commands into control signals for controlling one or more operations of the robotic camera as similarly taught by Kagei because Okuda teaches control signals for controlling one or more operations of . 

Claims 5 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Okuda, in view of Carr, further in view of Carr, Mistry, and Matthews “Hybrid Robotic/Virtual Pan-Tilt-Zoom Cameras for Autonomous Event Recording” (hereinafter Mistry).

As per claim 5, Okuda does not explicitly teach causing the robotic camera to execute the first cinematic technique by: processing the sensor data to identify a first constraint to enforce; and configuring the robotic camera to enforce the first constraint.
Mistry teaches causing a robotic camera to execute a first cinematic technique by: processing sensor data to identify a first constraint to enforce (i.e., generate signals from the detected player positions output by the vision system, identify subjects of interest, rules of shot composition dictate how the camera should operate such that the object of interest appears in a salient region of the image, see at least pages 193-194, section 1, pages 196-198, section 4); and 
configuring the robotic camera to enforce the first constraint (i.e., robotic camera to follow subject of interest, see at least pages 193-194, section 1, pages 196-198, section 4).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Okuda to cause the robotic camera to execute the first cinematic technique by: processing the sensor data to identify a first constraint to enforce; and configuring the robotic camera to enforce the first constraint as similarly taught by Mistry in order to follow rules of shot composition when capturing a video (see at least pages 193-194, section 1, pages 196-198, section 4 of Mistry).

As per claim 16, this is the computer-readable medium claim corresponding to claim 5. Therefore, claim 16 is rejected for the same reasons as claim 5.

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Okuda, in view of Carr, further in view of Kim et al. (US 2017/0201714, hereinafter Kim).

As per claim 8, Okuda does not explicitly teach converting the captured sensor data into multimedia data that comprises a sequence of video frames and a sequence of audio frames.
Kim teaches converting captured sensor data into multimedia data that comprises a sequence of video frames and a sequence of audio frames (i.e., capture image frames from image sensor, audio frames from audio sensor, generate video data, see at least [0029], [0032], [0034], [0035], [0038]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Okuda to convert the captured sensor data into multimedia data that comprises a sequence of video frames and a sequence of audio frames as similarly taught by Kim because doing so uses known techniques to generate video from a camera.

Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Okuda, in view of Carr, further in view of Lima et al. “Support Vector Machines for Cinematography Real-Time Camera Control in Storytelling Environments” (hereinafter Lima).

As per claim 13, Okuda does not explicitly teach wherein the training data indicates one or more constraints that prevent the robotic camera from performing a set of cinematographic operations under a corresponding set of conditions.
Lima teaches training data indicating one or more constraints that prevent a camera from performing a set of cinematographic operations under a corresponding set of conditions (i.e., based on cinematography rules and principles, we perform the selection of best shots for these scenes and store them in a database together with features from the simulated scenes, training database is composed of several samples of the simulated scenes, each one with the features and the selected shot, see at least page 45, section III, pages 48-49, section V.B).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Okuda such that the training data indicates one or more constraints that prevent the robotic camera from performing a set of cinematographic operations under a corresponding set of conditions as similarly taught by Lima in order to follow cinematography rules when selecting good shots (see at least page 45, section III, pages 48-49, section V.B of Lima).

Response to Arguments
Rejection of claims under §103:
As per independent claims 1, 10, and 19, Applicant argued that none of the cited references teaches or suggests receiving a first input comprising control data selecting a first cinematic technique from a plurality of different cinematic techniques to execute when capturing sensor data, and generating, by a configured network, a set of commands based on the control data, where the configured network is trained based on the plurality of different cinematic techniques.
t+1 of the robot camera, and sending a control signal to the robot camera to bring the robot camera to status ȓt+1, where the status of the robot camera includes pan angle, tilt angle, zoom position, focus position, and three-dimensional position (see at least pages 1111-1112, section II). The camera statuses are cinematic techniques because pan angle, tilt angle, zoom position, focus position, and three-dimensional position are reasonably interpreted as cinematic techniques. The prediction of camera status selects a first cinematic technique from a plurality of different cinematic techniques to execute. Carr teaches that the control data selecting a first cinematic technique can be received as an input. Okuda further teaches that the configured network is trained based on the plurality of different cinematic techniques by teaching that the neural network learns a cameraman’s shooting technique in individual scenes, which includes the status of the robot camera when it was being controlled by the cameraman (see at least page 1111, right column, paragraph 3, pages 1111-1112, section II).

Conclusion
Applicant’s amendment necessitated the new ground(s) of rejection presented in this office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP §706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jue Louie, whose telephone number is 571-270-1655. The examiner can normally be reached M-F, 9:30 am - 5:00 pm (EST).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li Zhen, can be reached at 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Jue Louie/
Primary Examiner
Art Unit 2121