DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 01/20/2022 has been entered.
Response to Amendment
The Amendment filed 07/29/2021 overcomes the objection to claims 13 and 20 for informalities.  
Response to Arguments
Applicant’s arguments with respect to claims 1-21 have been considered but are moot in view of the new grounds of rejection set forth below, as necessitated by the amendment.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 5-6, 8-10, 12-13, 15-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over US PG Pub. 2019/0204907 A1 (hereinafter “Xie”) in view of “Characterizing Structural Relationships in Scenes Using Graph Kernels” (hereinafter “Fisher”, 2011), and further in view of US PG Pub. 2012/0213426 A1 (hereinafter “Fei-Fei”).
Regarding claim 1, Xie teaches a method implemented on at least one machine including at least one processor, memory, and communication platform capable of connecting to a network for determining a type of a scene including a user (Xie, Fig. 1A-1B, Fig. 2, Fig. 8, ¶0034-0046, 0099-0100; Xie discloses a method for achieving human-machine interaction which includes determining scene information), the method comprising: 
receiving image data acquired by a camera with respect to the scene (Xie, ¶0057-0060: "The HMI processing unit 540 may be configured to process the information received or stored by the server 150. For example, the HMI processing unit 540 may perform calculations on the information, make a judgment on the information, or the like. The information may include image information...The image information may include a photo or video relating to a user and an application scene."); 
Although Xie further teaches that "The scene recognition unit 543 may perform a scene recognition using the input information collected by the input device 120 to obtain a target scene in which the user uses the HMI system 100...For example, the scene recognition unit 543 may determine the target scene based on the image signal captured by the camera/video camera" (Xie, ¶0068), Xie does not expressly teach the limitations as further claimed, but, in an analogous field of endeavor, Fisher does as follows.
Fisher teaches detecting, from the image data, discrete objects present in the scene (Fisher, Section 2.2; “we first segment our scene into meaningful objects, then insert edges that represent relationships between pairs of objects.”);
analyzing the discrete objects to extract spatial locations in the scene associated therewith (Fisher, Section 2.4 – Section 3; “We define a polygon in mesh A to be a contact polygon with respect to mesh B if there is a polygon in mesh B such that the distance between the two polygons is less than a small epsilon, and the angle between the unoriented face normals of the polygons is less than one degree. The results in this paper use a contact epsilon of 2mm, using the unit scaling provided with the scene. For the databases we use, we also have a well defined gravity vector that describes the global orientation of the scene.”);  
determining spatial relationships among the discrete objects based on the spatial locations (Fisher, Section 2.4 – Section 3; “In order to represent a scene as a graph, we need a way to take the geometric representation of the scene and produce a set of relationships between pairs of objects. These relationships might be largely geometric in nature (“object A is horizontally displaced by two meters relative to object B”) or largely semantic (“object A is in front of object B”).”); and 
inferring a type of the scene based on the discrete objects and the spatial relationships thereof in accordance with at least one scene context-free grammar model, each of which models one type of scene by specifying multiple discrete objects and spatial relationships thereof generally observed in the type of scene (Fisher, p.2, Section 2.4 – Section 4; “By constructing a node and edge set for each input scene we obtain its representation as a relationship graph. Our goal is to compare two relationship graphs or subparts of relationship graphs.” A database of scene models comprising object nodes and a relationship graph is used to determine if an input scene matches a model in the database.).
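For illustration only, the contact test quoted from Fisher above (a distance below a small epsilon and an angle of less than one degree between unoriented face normals) can be sketched as follows. This is not Fisher's implementation: the function names, the vertex-tuple mesh representation, and the use of the minimum vertex-to-vertex distance as a stand-in for the polygon-to-polygon distance are assumptions made for this sketch. Only the 2 mm epsilon and the one-degree threshold come from the quoted passage.

```python
import math

CONTACT_EPSILON = 0.002       # 2 mm expressed in meters (value quoted from Fisher)
MAX_NORMAL_ANGLE_DEG = 1.0    # angle threshold quoted from Fisher

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def face_normal(polygon):
    # Normal of a planar polygon, taken from its first three vertices.
    return cross(sub(polygon[1], polygon[0]), sub(polygon[2], polygon[0]))

def unoriented_angle_deg(n1, n2):
    # The absolute value makes the angle independent of normal orientation.
    cosine = abs(dot(n1, n2)) / (norm(n1) * norm(n2))
    return math.degrees(math.acos(min(1.0, cosine)))

def approx_distance(p, q):
    # ASSUMPTION: minimum vertex-to-vertex distance approximates the
    # polygon-to-polygon distance; an exact test would be more involved.
    return min(norm(sub(a, b)) for a in p for b in q)

def is_contact_polygon(p, q):
    return (approx_distance(p, q) < CONTACT_EPSILON
            and unoriented_angle_deg(face_normal(p), face_normal(q)) < MAX_NORMAL_ANGLE_DEG)

def in_contact(mesh_a, mesh_b):
    # A mesh is a list of polygons; a polygon is a list of (x, y, z) tuples.
    return any(is_contact_polygon(p, q) for p in mesh_a for q in mesh_b)

def contact_edges(meshes):
    # Build the "contact" edges of a relationship graph over named objects.
    names = sorted(meshes)
    return {(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if in_contact(meshes[a], meshes[b])}
```

Under these assumptions, a cup whose base lies 1 mm above a table top yields a contact edge between the two objects, while an object two meters away, or a surface meeting the table at a right angle, does not.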
Fisher is considered analogous art because it pertains to image-based scene classification. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the scene classification method taught by Xie to use scene relationship graph representations as the models corresponding to each type of scene, incorporating both objects and object spatial relationships (Fisher, Section 7).
The combination of Xie in view of Fisher does not expressly teach the limitations as further claimed, but, in an analogous field of endeavor, Fei-Fei does as follows. 
Fei-Fei teaches detecting, from the image data, discrete objects present in the scene based on one or more object recognition models, wherein the discrete objects are different from the user in the scene and the camera (Fei-Fei, ¶0039-0040; “a number of object detectors 406 are run across an image 402 at different scales 404.”).
Fei-Fei is considered analogous art because it pertains to image-based scene classification. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the scene classification step taught by the combination of Xie and Fisher to include the steps of analyzing the image and detecting objects based on multiple object detectors, as taught by Fei-Fei, in order to more accurately distinguish between and identify different types of scenes based on the detected objects (Fei-Fei, ¶0065).

Regarding claim 2, claim 1 is incorporated, and Xie in the combination further teaches wherein the scene further includes a user engaged in a dialogue with a machine (Xie, ¶0068; “The scene recognition unit 543 may perform a scene recognition using the input information collected by the input device 120 to obtain a target scene in which the user uses the HMI system 100.”).

Regarding claim 3, claim 2 is incorporated, and Xie in the combination further teaches wherein the type of the scene, once inferred based on the at least one scene context-free grammar model, is to be used by the machine to facilitate dialogue control (Xie, ¶0069-0070; "The output information generation unit 544 may generate output information based on the semantic analysis result generated by the semantic judgment unit 542 and the image information, the text information, information regarding the geographic location, the scene information, and other information received by the input device 120…the output information may include information related to a verbal expression, a tone, a voiceprint information, etc. of the language represented by the avatar that can generate a voice signal.").

Regarding claim 5, claim 1 is incorporated, and Fisher in the combination further teaches wherein each of the scene context-free grammar model corresponds to a type of scene, represented by at least one of
a first type of node representing a first object and specifying a plurality of sub-objects that need to be all present in the scene in order for the scene to qualify as the type; and a second type of node representing a second object present in the scene and specifying a plurality of alternative instances of the object (Fisher, Figure 2, Section 3: Representing Scenes as Graphs; “The nodes of a relationship graph represent all objects in the scene and the edges represent the relationships between these objects… Object nodes can be connected by both contact and scene graph inheritance relationships. Note how the monitor and keyboard nodes are both scene graph children of the computer node.”). 

Regarding claim 6, claim 1 is incorporated, and Xie in the combination further teaches wherein the step of inferring the type of the scene is further based on sound associated with the scene (Xie, ¶0060-0061, 0068; “In some embodiments, the scene recognition unit 543 may determine the target scene by matching the user intention information with information of specific scenes stored in the database 160.” The user intention information is based on voice input by the user, which has been interpreted to correspond to the claimed “sound associated with the scene”). 

Claim 8 recites a machine readable and non-transitory medium having information recorded thereon for determining a type of a scene having features corresponding to elements recited in method claim 1. The elements recited in claim 8 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 1. Additionally, the rationale and motivation to combine the Xie, Fisher, and Fei-Fei references presented in the rejection of claim 1 apply to this claim, and Xie in the combination further teaches a machine readable and non-transitory medium having information recorded thereon for determining a type of a scene (Xie, ¶0118-0119).

Claim 9 recites a machine readable and non-transitory medium having features corresponding to elements recited in method claim 2. The elements recited in claim 9 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 2. Additionally, the rationale and motivation to combine the Xie, Fisher, and Fei-Fei references presented in the rejection of claim 2 apply to this claim.

Claim 10 recites a machine readable and non-transitory medium having features corresponding to elements recited in method claim 3. The elements recited in claim 10 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 3. Additionally, the rationale and motivation to combine the Xie, Fisher, and Fei-Fei references presented in the rejection of claim 3 apply to this claim.

Claim 12 recites a machine readable and non-transitory medium having features corresponding to elements recited in method claim 5. The elements recited in claim 12 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 5. Additionally, the rationale and motivation to combine the Xie, Fisher, and Fei-Fei references presented in the rejection of claim 5 apply to this claim.

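Claim 13 recites a machine readable and non-transitory medium having features corresponding to elements recited in method claim 6. The elements recited in claim 13 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 6. Additionally, the rationale and motivation to combine the Xie, Fisher, and Fei-Fei references presented in the rejection of claim 6 apply to this claim.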

Claim 15 recites a system having features corresponding to elements recited in method claim 1. The elements recited in claim 15 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 1. Additionally, the rationale and motivation to combine the Xie, Fisher, and Fei-Fei references presented in the rejection of claim 1 apply to this claim.

Claim 16 recites a system having features corresponding to elements recited in method claim 2. The elements recited in claim 16 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 2. Additionally, the rationale and motivation to combine the Xie, Fisher, and Fei-Fei references presented in the rejection of claim 2 apply to this claim.

Claim 17 recites a system having features corresponding to elements recited in method claim 3. The elements recited in claim 17 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 3. Additionally, the rationale and motivation to combine the Xie, Fisher, and Fei-Fei references presented in the rejection of claim 3 apply to this claim.

Claim 19 recites a system having features corresponding to elements recited in method claim 5. The elements recited in claim 19 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 5. Additionally, the rationale and motivation to combine the Xie, Fisher, and Fei-Fei references presented in the rejection of claim 5 apply to this claim.

Claim 20 recites a system having features corresponding to elements recited in method claim 6. The elements recited in claim 20 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 6. Additionally, the rationale and motivation to combine the Xie, Fisher, and Fei-Fei references presented in the rejection of claim 6 apply to this claim.

Claims 4, 11 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Xie in view of Fisher and Fei-Fei, as applied to claims 1, 8 and 15 above, and further in view of US PG Pub. 2017/0249504 A1 (hereinafter “Martinson”).
Regarding claim 4, claim 1 is incorporated, and the combination of Xie, Fisher and Fei-Fei does not expressly teach the limitations as further claimed, but, in an analogous field of endeavor, Kollar does as follows.
Kollar teaches receiving training data related to different scenes, wherein the training data include information related to objects in each of the training scenes and spatial relationships thereof, and machine learning, based on the training data, the at least one scene context-free grammar model (Kollar, Introduction, Section III.C.; “we will show that by using object-object and object-scene context learned from captions attached to photos on Flickr, we can robustly predict the locations of a wide variety of other objects and scenes” and “we require a large database of information about which objects tend to be spatially co-located, and which objects tend to occur in which scenes.”).
Kollar is considered analogous art because it pertains to image-based scene recognition by a robot interacting with a user. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the scene modeling method taught by the combination of Xie, Fisher and Fei-Fei to model the scenes based on learning using a large database of information about which objects tend to be spatially co-located, and which objects tend to occur in which scenes, as taught by Kollar, in order to improve the accuracy of scene recognition and thereby improve the ability of the robot to interact with the user (Kollar, Introduction).
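For illustration only, the kind of co-occurrence statistics described in the passage quoted from Kollar (which objects tend to be co-located, and which objects tend to occur in which scenes) can be sketched as follows. This is not Kollar's model: the function names, the representation of a caption as a list of labels, and the simple relative-frequency estimate are assumptions made for this sketch.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(captions):
    """Count how often each label, and each unordered pair of labels, appears.

    `captions` is a list of label lists, one per photo (hypothetical format).
    """
    single = Counter()
    pair = Counter()
    for words in captions:
        labels = set(words)  # count each label at most once per caption
        single.update(labels)
        pair.update(frozenset(p) for p in combinations(sorted(labels), 2))
    return single, pair

def conditional(single, pair, a, b):
    """Relative-frequency estimate of P(a present | b present)."""
    if single[b] == 0:
        return 0.0  # b was never observed, so no estimate is available
    return pair[frozenset((a, b))] / single[b]
```

With captions such as ["kitchen", "stove", "kettle"], ["kitchen", "stove"], and ["office", "monitor"], the estimate for a stove given a kitchen is 1.0, and for a monitor given a kitchen it is 0.0.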

Claim 11 recites a machine readable and non-transitory medium having features corresponding to elements recited in method claim 4. The elements recited in claim 11 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 4. Additionally, the rationale and motivation to combine the Xie, Fisher, Fei-Fei and Kollar references presented in the rejection of claim 4 apply to this claim.

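Claim 18 recites a system having features corresponding to elements recited in method claim 4. The elements recited in claim 18 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 4. Additionally, the rationale and motivation to combine the Xie, Fisher, Fei-Fei and Kollar references presented in the rejection of claim 4 apply to this claim.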

Claims 7, 14 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Xie in view of Fisher and Fei-Fei, as applied to claims 1, 8 and 15 above, and further in view of US PG Pub. 2017/0249504 A1 (hereinafter “Martinson”).
Regarding claim 7, claim 5 is incorporated, and the combination of Xie, Fisher, and Fei-Fei does not expressly teach the limitations as further claimed, but, in an analogous field of endeavor, Martinson does as follows. 
Martinson teaches updating a 3D space occupancy record corresponding to the scene based on the objects (Martinson, ¶0008; “…using each measurement from a place recognition algorithm to update an occupancy grid as the robot moves through the space. Importantly, each measurement update reflects the region of view observed by the camera, attempting to learn a classification for both obstacles and empty space in the occupancy grid.”). 
Martinson is considered analogous art because it pertains to scene classification using a robot engaged in dialog with a human user. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the human-machine interaction and scene classification method taught by the combination of Xie, Fisher, and Fei-Fei to include a step of updating an occupancy grid of a space observed by the machine, as taught by Martinson, in order to (Martinson, ¶0008).
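For illustration only, the per-measurement occupancy grid update described in the passage quoted from Martinson can be sketched as follows. This is not Martinson's algorithm: the dictionary-of-voxels grid, the function name, the uniform prior, and the Bayesian multiply-and-normalize update are assumptions made for this sketch. The only element taken from the quoted passage is that each place recognition measurement updates the cells within the region viewed by the camera.

```python
def update_occupancy_record(grid, visible_cells, likelihood, labels):
    """Update per-cell label beliefs for the cells in the camera's view.

    grid:          dict mapping (x, y, z) voxel index -> {label: probability}
    visible_cells: iterable of voxel indices inside the current view region
    likelihood:    {label: P(measurement | label)} from a place recognizer
    labels:        all labels under consideration (hypothetical label set)
    """
    uniform = {label: 1.0 / len(labels) for label in labels}
    for cell in visible_cells:
        prior = grid.get(cell, uniform)
        posterior = {label: prior[label] * likelihood[label] for label in labels}
        total = sum(posterior.values())
        if total == 0:
            continue  # measurement is inconsistent with every label; keep prior
        grid[cell] = {label: p / total for label, p in posterior.items()}
    return grid
```

Cells outside the view region are left untouched, so repeated observations sharpen the belief only where the camera has actually looked.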

Claim 14 recites a machine readable and non-transitory medium having features corresponding to elements recited in method claim 7. The elements recited in claim 14 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 7. Additionally, the rationale and motivation to combine the Xie, Fisher, Fei-Fei and Martinson references presented in the rejection of claim 7 apply to this claim.

Claim 21 recites a system having features corresponding to elements recited in method claim 7. The elements recited in claim 21 are therefore mapped to the proposed combination in the same manner as the corresponding elements of claim 7. Additionally, the rationale and motivation to combine the Xie, Fisher, Fei-Fei and Martinson references presented in the rejection of claim 7 apply to this claim.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure in that the references cited pertain to image-based object detection and scene classification.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAMAH A BEG whose telephone number is (571)270-7912. The examiner can normally be reached M-F 9 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, VU LE can be reached on 571-272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/SAMAH A BEG/Primary Examiner, Art Unit 2668