DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed on 5/17/2022 have been fully considered as follows:
Applicant argues that the 35 U.S.C. 103 rejection of the independent claims should not be maintained because “Oleynik does not disclose or suggest the arrangement where the processor classifies video data through a pre-trained video classification model when the received demonstration data is the video data, the classified video data is pre-labeled video data, and the label of the video data is at least one category or attribute determined according to at least one of content or format of the video data, as described in newly amended claim 1 of the present application. Behrend et al. fails to disclose or suggest the features of newly amended claim 1.” However, a new ground of rejection is set forth below in view of the amendments.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 4, 6, 8, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Oleynik (US 9815191 B2) in view of Behrend (WO 2015187813 A1) in further view of Wolf (US 20200273575 A1).
Regarding Claim 1, Oleynik teaches A cooking robot system, which is server-based and recognizes an image of an object to implement a motion, the cooking robot system comprising (Fig. 7D food preparation software element 14, computer element 16, Abstract: The present disclosure is directed to methods, computer program products, and computer systems for instructing a robot to prepare a food dish by replacing the human chef's movements and actions.): 
a robot to (Col 32 Lines 19-20 dual-arm robotics system comprised of torso 74, arms 72, wrists 71 and multi-fingered hands 72): 
acquire the image of the object through a sensor and generate image data to transmit the image data to a server (Col 45 Lines 61-66 The robotic apparatus analyzes the image of the immediate environment from the visual sensors and compares it with the saved image of the chefs studio cooking program, so that appropriate movements are made to achieve identical results.), and 
implement a motion for the object based on motion data corresponding to the image data (Col 24 Lines 63-57 The quality check module 96 can also be configured to conduct quality testing of an object based on senses, such as the smell of the food, the color of the food, the taste of the food, and the image or appearance of the food. Col 25 Lines 18-21 The chef movements replication module 106 is configured to replicate the chef's precise movements in preparing a dish based on the stored software recipe file in the memory 52) and 
the server configured to detect the motion data for the object (Col 68 Lines 38-44 At step 1426, the computer 16 creates three-dimensional models for all non-standardized objects and stores their type and attributes (size, dimensions, usage, etc.) in the computer's system memory, either on a computing device or on a cloud computing environment, and defines the shape, size and type of the non-standardized objects Col 92 Lines 61-63 The computer devices 12 may represent any or all of the 24, server 10, or any network intermediary devices) and control the robot by searching for a motion corresponding to the image data via a web server to generate the motion data (Fig. 55 Col 70 Lines 4-16 In step 1462, robotic food preparation engine 56 is configured to send instructions to the robotic apparatus to move food or ingredients from standardized containers to the food preparation position. In step 1464, the robotic food preparation engine 56 is configured to instruct the robotic apparatus to start food preparation at the start time “0” by replicating the food dish from the software recipe script file. In step 1466, the robotic apparatus in the standardized kitchen 50 replicates the food dish with the same movement as the chef's arms and fingers, the same ingredients, with the same pace, and using the same standardized kitchen equipment and tools.) 
wherein the cooking robot system interworks with an artificial intelligence server and is implemented based on an artificial intelligence to generate the motion data by automatically recognizing the image of the object (Col 1 Lines 55-61 The present invention relates generally to the interdisciplinary fields of robotics and artificial intelligence, more particularly to computerized robotic food preparation systems for food preparation by digitizing the food preparation process of professional and non-professional chef dishes and subsequently replicating a chef's cooking movements, processes and techniques with real-time electronic adjustments Col 45 Lines 61-66 The robotic apparatus analyzes the image of the immediate environment from the visual sensors and compares it with the saved image of the chefs studio cooking program, so that appropriate movements are made to achieve identical results) 
wherein the robot includes: 
the sensor to generate the image data by acquiring the image of the object and (Col 40 Lines 7-11 FIG. 9A is a block diagram illustrating an example of the robotic hand 72 with five fingers and a wrist with RGB-D sensor, camera sensors and sonar sensor capabilities for detecting and moving a kitchen tool, an object, or an item of kitchen equipment.); 
wherein the server includes: 
a database to store at least one of the image data for the object, the demonstration data, or the motion data (Col 56 Lines 20-23 At step 968, the computer 16 stores the updated revision information to the knowledge database pertaining to the corrected process, condition and parameters.); and 
a processor to: 
search for and compare the image data received from the robot in the database (Col 53 Lines 3-7 the one or more robotic arms 70 and hands 72 compare the results of cooking against the controlled data (such as temperature, weight, loss, etc.) and the media data (such as color, appearance, smell, portion-size, etc.), as illustrated in step 858.)
transmit the motion data to control the robot (Col 4 Lines 54-57 transmitting the respective electronic record for a food dish to a robotic apparatus capable of replicating the sequence of stored mini-manipulations, corresponding to the original actions of the chef)
Oleynik does not expressly disclose, but Behrend discloses or receive a motion of a user with respect to the object from an input device upon a request of the server and generate demonstration data to transmit the demonstration data to the server (Fig. 1 [0012] a computing system including one or more processors and one or more memories that comprises receiving, using the one or more processors, a request for a product demonstration from a demo application executing on a client device of a user; receiving, using the one or more processors, survey response data input by the user into the demo application and transmitted via a computer network;), 
or the demonstration data ([0012] the user's interactions with the product demonstration; ); 
or by generating the motion data corresponding to the demonstration data.  ([0077]  In block 506, responsive to the user being authenticated, the demo curation module 236 collects demo metadata and configuration data from the user.) 
the input device to generate the demonstration data by receiving the motion of the user upon the request of the server and  ([0078]  block 510, the demo curation module 234 receives form data (demo data) which includes the user-submitted information about that feature of the product.) 
In this way, the system of Behrend includes automated product demonstration. Like Oleynik, Behrend is concerned with demonstration.
Therefore, from these teachings of Behrend and Oleynik, one of ordinary skill in the art at the time the filing was made would have found it obvious to apply the teachings of Behrend to the system of Oleynik since doing so would enhance the system by improving viewer satisfaction by saving them time and helping them quickly get to the video content they need to see.
Oleynik does not expressly disclose, but Wolf discloses request the demonstration data to the robot when matched data is absent, and ([0535] A surgical procedure may include a procedure performed by one or more surgeons. A surgeon may include any person performing a surgical procedure, including a doctor or other medical professional, any person assisting a surgical procedure, and/or a surgical robot. [0208] a surgeon may request surgical summary footage for review or training purposes. The user may submit the request through a computer device, such as a laptop, a desktop computer, a mobile phone, a tablet, smart glasses or any other form of computing device capable of submitting requests. In some embodiments, the request may be received electronically through a network and the aggregate may be presented based on receipt of the request.)
wherein the processor is configured to classify video data through a pre-trained video classification model when received demonstration data is the video data, ([0586] received image data may be analyzed using an artificial neural network configured to detect and/or identify an anatomical structure from images and/or videos. Training examples may include image data labeled or otherwise classified as depicting an anatomical structure (e.g., images classified as depicting a pancreas).)
wherein the classified video data is pre-labeled video data, and ([0586] Training examples may include image data labeled or otherwise classified as depicting an anatomical structure)
wherein a label of the video data is at least one category or attribute determined according to at least one of content or format of the video data. ([0189] An example of such training example may include a video clip depicting an event together with a label indicating the event type. [0218] upon request of a user, presenting to the user an aggregate of the first group of frames of the particular surgical footage, while omitting presentation to the user of the second group of frames. The request of the user may be received from a computing device which may include a user interface enabling the user to make the request. In some embodiments, the user may further request frames associated with a particular type or category of intraoperative events.)
In this way, the system of Wolf includes methods for analysis of videos of surgical procedures. Like Oleynik, Wolf is concerned with robots.
Therefore, from these teachings of Wolf and Oleynik, one of ordinary skill in the art at the time the filing was made would have found it obvious to apply the teachings of Wolf to the system of Oleynik since doing so would enhance the system by providing various types of decision support to surgeons.
Regarding Claim 4, Oleynik teaches wherein the robot further includes: a communication device to transmit data acquired from the sensor or the input device to the server. (Col 46 Lines 43-49 The head costume 628 includes feedback devices with vision camera, sonar, laser, or radio frequency identification (RFID) and a custom pair of glasses that are used to sense, capture, and transmit the captured data to the computer 16 for recording and storing images that the chef 48 observes during the food preparation process)
Regarding Claim 6, Oleynik teaches wherein the robot further includes: an output device including a speaker or a display to notify a current status or progress of the robot to an outside of the robot by using a voice or an image. (Col 56 Lines 5-11 At step 956, the computer 16 monitors the food preparation process via a multimodal sensor that generates raw data supplied to abstraction software where the robotic apparatus compares real-world output against controlled data based on multimodal sensory data (visual, audio, and any other sensory feedback). Col 147 Lines 13-15 an output device (such as a screen, speaker, and/or the like))
Regarding Claim 8, Oleynik teaches wherein the server further includes: a communicator to receive data acquired from the sensor and the input device and transmit the motion data to the robot. (Fig. 88 communication module element 2112, input/output module element 1362, Col 4 Lines 54-57 transmitting the respective electronic record for a food dish to a robotic apparatus capable of replicating the sequence of stored mini-manipulations, corresponding to the original actions of the chef.)
Regarding Claim 19, Oleynik does not expressly disclose, but Behrend discloses wherein the robot requests a demonstration video to the user and generates the demonstration data using the motion of the user received through the input device to transmit the demonstration data to the server. (Fig. 1 [0081] the stakeholder logs into the demonstration system through an application interface (e.g., a web interface), selects a button to create a new product demo, fills out online forms for demo properties, adds custom questions for personalization survey, uploads a short summary video and a longer in-depth video for each feature (e.g., topic), finishes creating the demo and embeds the demo into a website he/she administers using embed code provided by the demonstration system, which allows the stakeholder's customers to view the demo via the website)
In this way, the system of Behrend includes automated product demonstration. Like Oleynik, Behrend is concerned with demonstration.
Therefore, from these teachings of Behrend and Oleynik, one of ordinary skill in the art at the time the filing was made would have found it obvious to apply the teachings of Behrend to the system of Oleynik since doing so would enhance the system by improving viewer satisfaction by saving them time and helping them quickly get to the video content they need to see.
Claims 5, 9, 10, 12, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Oleynik (US 9815191 B2) in view of Behrend (WO 2015187813 A1) in further view of Francis (US 8996429 B1) in further view of Wolf (US 20200273575 A1).
Regarding Claim 5, Oleynik teaches wherein the sensor uses at least one of an RGB sensor or a depth sensor to generate the image data by recognizing the object (Col 40 Lines 14-17 The RGB-D sensor 500 or the sonar sensor 504 f is capable of detecting the location, dimensions and shape of the object to create a three-dimensional model of the object. Col 40 Lines 32-37 A suitable example of RGB-D (a red light beam, a green light beam, a blue light beam, and depth) sensor is the Kinect system by Microsoft, which features an RGB camera, depth sensor and multi-array microphone running on software, which provide full-body 3D motion capture, facial recognition and voice recognition capabilities), 
Oleynik does not expressly disclose, but Francis discloses and uses an RGBD recorder to generate the demonstration data.  (Col 13 Lines 27-37 At block 510, the method 500 includes storing the received information for future recognitions. For example, after receiving the information from the cloud, the robot would be able to recognize the object in the future enabling the robot to learn and adapt. At block 512, the method 500 includes performing an action based on the received information. The action may vary based on a type of received information, or the query that is presented by the robot. As an example, the robot may query the cloud to identify an object and details of the object to enable the robot to interact with the object)
In this way, the system of Francis includes a robot requesting information about an image and receiving from a cloud/user information about an object to enable the robot to perform an action to interact with the object. Like Oleynik, Francis is concerned with user and robot instruction.
Therefore, from these teachings of Oleynik and Francis, one of ordinary skill in the art at the time the filing was made would have found it obvious to apply the teachings of Francis to the system of Oleynik since doing so would enhance the system by including a robot requesting information about an image and receiving from a cloud/user information about an object to enable the robot to perform an action to interact with the object.
Regarding Claim 9, Oleynik teaches wherein the processor includes: a searcher to search for and compare the image data received from the robot in the database (Col 53 Lines 3-7 the one or more robotic arms 70 and hands 72 compare the results of cooking against the controlled data (such as temperature, weight, loss, etc.) and the media data (such as color, appearance, smell, portion-size, etc.), as illustrated in step 858.)
Oleynik does not expressly disclose, but Francis discloses a calculator to estimate the motion of the user from the demonstration data received from the robot (Col 6 Lines 66-67 – Col 7 Line 1 The cloud 102 may be configured to perform calculations or analysis on the data and return processed data to the robot client 118); and a converter to generate motion data for converting the motion estimated by the calculator into a motion of the robot. (Col 13 Lines 27-37 At block 510, the method 500 includes storing the received information for future recognitions. For example, after receiving the information from the cloud, the robot would be able to recognize the object in the future enabling the robot to learn and adapt. At block 512, the method 500 includes performing an action based on the received information. The action may vary based on a type of received information, or the query that is presented by the robot. As an example, the robot may query the cloud to identify an object and details of the object to enable the robot to interact with the object)
In this way, the system of Francis includes a robot requesting information about an image and receiving from a cloud/user information about an object to enable the robot to perform an action to interact with the object. Like Oleynik, Francis is concerned with user and robot instruction.
Therefore, from these teachings of Oleynik and Francis, one of ordinary skill in the art at the time the filing was made would have found it obvious to apply the teachings of Francis to the system of Oleynik since doing so would enhance the system by including a robot requesting information about an image and receiving from a cloud/user information about an object to enable the robot to perform an action to interact with the object.
Regarding Claim 10, Oleynik teaches A method of controlling a cooking robot, the method comprising (Abstract: The present disclosure is directed to methods, computer program products, and computer systems for instructing a robot to prepare a food dish by replacing the human chef's movements and actions): 
searching for and comparing image data in a database by a processor of a server that receives the image data of an object acquired through a sensor of a robot (Col 45 Lines 61-66 The robotic apparatus analyzes the image of the immediate environment from the visual sensors and compares it with the saved image of the chefs studio cooking program, so that appropriate movements are made to achieve identical results); 
wherein the cooking robot interworks with an artificial intelligence server and is implemented based on an artificial intelligence to automatically recognize an image of the object to generate the motion data.  (Col 1 Lines 55-61 The present invention relates generally to the interdisciplinary fields of robotics and artificial intelligence, more particularly to computerized robotic food preparation systems for food preparation by digitizing the food preparation process of professional and non-professional chef dishes and subsequently replicating a chef's cooking movements, processes and techniques with real-time electronic adjustments Col 45 Lines 61-66 The robotic apparatus analyzes the image of the immediate environment from the visual sensors and compares it with the saved image of the chefs studio cooking program, so that appropriate movements are made to achieve identical results)
to transmit the motion data to the robot (Col 4 Lines 54-57 transmitting the respective electronic record for a food dish to a robotic apparatus capable of replicating the sequence of stored mini-manipulations, corresponding to the original actions of the chef)
Oleynik does not expressly disclose, but Behrend discloses searching for demonstration data that represents a motion for a new object from a web server when the image data is determined as new image data not stored in the database, and requesting the demonstration data with respect to the new object to a user when the demonstration data is absent ([0081] the stakeholder logs into the demonstration system through an application interface (e.g., a web interface), selects a button to create a new product demo, fills out online forms for demo properties, adds custom questions for personalization survey, uploads a short summary video and a longer in-depth video for each feature (e.g., topic), finishes creating the demo and embeds the demo into a website he/she administers using embed code provided by the demonstration system, which allows the stakeholder's customers to view the demo via the website); and 
receiving the demonstration data for the new object by the processor to store the received demonstration data in the database and generating motion data corresponding to the image data by the processor ([0080] In block 512, upon receiving the demo data (incrementally, in bulk, etc.), the demo curation module 234 processes the demo data according to the future to which it corresponds and stores it in block 514 accordingly in the data store 210), 
wherein the searching for demonstration data further includes requesting for the demonstration data to the user when the video matching the image data is absent from the web server and ([0081] the stakeholder logs into the demonstration system through an application interface (e.g., a web interface), selects a button to create a new product demo, fills out online forms for demo properties, adds custom questions for personalization survey, uploads a short summary video and a longer in-depth video for each feature (e.g., topic), finishes creating the demo and embeds the demo into a website he/she administers using embed code provided by the demonstration system, which allows the stakeholder's customers to view the demo via the website) 
In this way, the system of Behrend includes automated product demonstration. Like Oleynik, Behrend is concerned with demonstration.
Therefore, from these teachings of Behrend and Oleynik, one of ordinary skill in the art at the time the filing was made would have found it obvious to apply the teachings of Behrend to the system of Oleynik since doing so would enhance the system by improving viewer satisfaction by saving them time and helping them quickly get to the video content they need to see.
Oleynik does not expressly disclose, but Francis discloses and performing the motion for the object by the robot. (Col 13 Lines 32-37 At block 512, the method 500 includes performing an action based on the received information. The action may vary based on a type of received information, or the query that is presented by the robot. As an example, the robot may query the cloud to identify an object and details of the object to enable the robot to interact with the object) 
wherein the searching for demonstration data includes: 
extracting a video matching the image data from the web server by the processor (Col 13 Lines 16-22 At block 508, the method 500 includes receiving information or retrieving information associated with the object. For example, the robot may receive data from the cloud indicating an identity of an object in the image, or other information related to or associated with characteristics about the object. In some examples, the cloud may perform object recognition on the uploaded image or video.); and 
generating motion data by extracting the motion for the object from the video. (Col 13 Lines 32-37 At block 512, the method 500 includes performing an action based on the received information. The action may vary based on a type of received information, or the query that is presented by the robot. As an example, the robot may query the cloud to identify an object and details of the object to enable the robot to interact with the object) 
wherein receiving the demonstration data for the new object further includes: 
estimating a motion of the user from the demonstration data received from the robot by a calculator (Col 6 Lines 66-67 – Col 7 Line 1 The cloud 102 may be configured to perform calculations or analysis on the data and return processed data to the robot client 118.); and 
generating the motion data configured to convert the motion estimated by the calculation unit into a motion of the robot by a converter. (Col 13 Lines 27-37 At block 510, the method 500 includes storing the received information for future recognitions. For example, after receiving the information from the cloud, the robot would be able to recognize the object in the future enabling the robot to learn and adapt. At block 512, the method 500 includes performing an action based on the received information. The action may vary based on a type of received information, or the query that is presented by the robot. As an example, the robot may query the cloud to identify an object and details of the object to enable the robot to interact with the object)
In this way, the system of Francis includes a robot requesting information about an image and receiving from a cloud/user information about an object to enable the robot to perform an action to interact with the object. Like Oleynik, Francis is concerned with user and robot interaction.
Therefore, from these teachings of Oleynik and Francis, one of ordinary skill in the art at the time the filing was made would have found it obvious to apply the teachings of Francis to the system of Oleynik since doing so would enhance the system by including a robot requesting information about an image and receiving from a cloud/user information about an object to enable the robot to perform an action to interact with the object.
Oleynik does not expressly disclose, but Wolf discloses wherein the processor is configured to classify video data through a pre-trained video classification model when the received demonstration data is the video data, ([0586] received image data may be analyzed using an artificial neural network configured to detect and/or identify an anatomical structure from images and/or videos. Training examples may include image data labeled or otherwise classified as depicting an anatomical structure (e.g., images classified as depicting a pancreas).)
wherein the classified video data is pre-labeled video data, and ([0586] Training examples may include image data labeled or otherwise classified as depicting an anatomical structure)
wherein a label of the video data is at least one category or attribute determined according to at least one of content or format of the video data ([0189] An example of such training example may include a video clip depicting an event together with a label indicating the event type. [0218] upon request of a user, presenting to the user an aggregate of the first group of frames of the particular surgical footage, while omitting presentation to the user of the second group of frames. The request of the user may be received from a computing device which may include a user interface enabling the user to make the request. In some embodiments, the user may further request frames associated with a particular type or category of intraoperative events.)
In this way, the system of Wolf includes methods for analysis of videos of surgical procedures. Like Oleynik, Wolf is concerned with robots.
Therefore, from these teachings of Wolf and Oleynik, one of ordinary skill in the art at the time the filing was made would have found it obvious to apply the teachings of Wolf to the system of Oleynik since doing so would enhance the system by providing various types of decision support to surgeons.
Regarding Claim 12, Oleynik teaches wherein searching for and comparing image data in a database includes: generating the motion data from the demonstration data when the demonstration data matching the image data is present in the database (Col 45 Lines 61-66 The robotic apparatus analyzes the image of the immediate environment from the visual sensors and compares it with the saved image of the chefs studio cooking program, so that appropriate movements are made to achieve identical results); and transmitting the motion data to the robot to perform a motion for the object. (Fig. 55 Col 70 Lines 4-16 In step 1462, robotic food preparation engine 56 is configured to send instructions to the robotic apparatus to move food or ingredients from standardized containers to the food preparation position. In step 1464, the robotic food preparation engine 56 is configured to instruct the robotic apparatus to start food preparation at the start time “0” by replicating the food dish from the software recipe script file. In step 1466, the robotic apparatus in the standardized kitchen 50 replicates the food dish with the same movement as the chef's arms and fingers, the same ingredients, with the same pace, and using the same standardized kitchen equipment and tools)
Regarding Claim 16, Oleynik does not expressly disclose, but Francis discloses wherein receiving the demonstration data for the new object further includes: re-searching for a video from the web server when the motion of the robot is not converted. (Col 13 Lines 27-37 At block 510, the method 500 includes storing the received information for future recognitions. For example, after receiving the information from the cloud, the robot would be able to recognize the object in the future enabling the robot to learn and adapt. At block 512, the method 500 includes performing an action based on the received information. The action may vary based on a type of received information, or the query that is presented by the robot. As an example, the robot may query the cloud to identify an object and details of the object to enable the robot to interact with the object)
In this way, the system of Francis includes a robot requesting information about an image/video and receiving from a cloud/user information about an object to enable the robot to perform an action to interact with the object. Like Oleynik, Francis is concerned with user and robot interaction.
Therefore, from these teachings of Oleynik and Francis, one of ordinary skill in the art at the time the filing was made would have found it obvious to apply the teachings of Francis to the system of Oleynik since doing so would enhance the system by including a robot requesting information about an image/video and receiving from a cloud/user information about an object to enable the robot to perform an action to interact with the object.
Regarding Claim 17, Oleynik does not expressly disclose, but Francis discloses wherein receiving the demonstration data for the new object further includes: generating a motion model of the robot with respect to the object by accumulating at least one of the image data, the demonstration data, or the motion data into the database. (Col 13 Lines 32-37 At block 512, the method 500 includes performing an action based on the received information. The action may vary based on a type of received information, or the query that is presented by the robot. As an example, the robot may query the cloud to identify an object and details of the object to enable the robot to interact with the object)
In this way, the system of Francis includes a robot requesting information about an image/video and receiving from a cloud/user information about an object to enable the robot to perform an action to interact with the object. Like Oleynik, Francis is concerned with user and robot interaction.
Therefore, from these teachings of Oleynik and Francis, one of ordinary skill in the art at the time the filing was made would have found it obvious to apply the teachings of Francis to the system of Oleynik since doing so would enhance the system by including a robot requesting information about an image/video and receiving from a cloud/user information about an object to enable the robot to perform an action to interact with the object.
Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SARAH TRAN whose telephone number is (313)446-6642. The examiner can normally be reached 7:30am-4:30pm M-Th.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Khoi Tran, can be reached at (571) 272-6919. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/S.A.T./Examiner, Art Unit 3664

/KHOI H TRAN/Supervisory Patent Examiner, Art Unit 3664