DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 10/19/2021 has been entered.

Response to Arguments
Applicant's arguments and amendments filed in the Amendment with RCE filed October 19, 2021 (herein “Amendment”), regarding the rejection of claims 1-21 under 35 U.S.C. 103 have been fully considered but they are not persuasive. 
On page 11 of the Amendment, Applicant first argues that none of the cited references of record disclose that the claimed obtained semantics include at least one functional aspect implied by the input. Regarding the cited Sproat reference, Applicant argues that Sproat performs linguistic analysis on the input text to understand the spatial semantics based on dependency structures of the input text, but does not 
However, Sproat is directed towards a comprehensive analysis of input text to convert same into a three-dimensional scene description, and sets forth right in the Abstract that in generating the scene, objects, poses, facial expressions, and environments are combined so that they represent the input set of words. Sproat details the “posing” and “facial expression” descriptor generation within its broader disclosure, and accordingly these passages are cited below in the updated rejection rationale. As such, Applicant’s remarks regarding Sproat not teaching or suggesting the newly amended limitations are not persuasive.
Applicant further argues that Sproat does not teach human machine dialogue at all. While Sproat is not relied upon for disclosing the teachings of “in a human machine dialogue” or that the received input is “from one of an automated dialogue companion and a person engaged in the human machine dialogue with the automated dialogue companion,” Sproat does disclose that text may be input by typing it into a computer using a computer keyboard (Sproat col. 3, ll. 13-17). Sproat further discloses potential use cases for its system, including integration into a videogame by a user
Applicant further argues on the bottom of page 11 onto page 12 that Hatami-Hanza “does not disclose human machine dialogue at all,” despite Hatami-Hanza literally stating in cited para. 119: “Using the methods, a chat-robot or chatting machine can produce relevant responses to the input of a chatter so as to make the conversation between the user and the machine an intelligent conversation.  A system can be envisioned that can converse with a user in which a user write or say something and the system, using the disclosed method, response back in some form or type of media content that has certain semantic relationship with the user input.  Such system can be used as a Q&A service for users and clients wherein the system provides variety of contents for the user in response to his input (question).”
Applicant then characterizes the above paragraph of Hatami-Hanza as “speculation” and argues that it does not provide sufficient enablement, for the claimed limitations for which it was relied upon, to qualify as prior art. However, the limitations for which para. 119 of Hatami-Hanza was relied upon recite exactly: 1) in a human machine dialogue; and 2) an input from one of an automated dialogue companion and a person engaged in the human machine dialogue with the automated dialogue companion.
It is noted that prior art is presumed to be operable/enabling, and once such a reference is found, the burden is on applicant to rebut the presumption of operability. See MPEP 2121(I). In this way, Applicant’s remarks characterizing Hatami-Hanza as being non-enabled simply because, in one part, Hatami-Hanza uses the phrase “a system can be envisioned” fail to adequately rebut this presumption of operability. Indeed, so long as a “vision” is detailed enough to enable one of ordinary skill in the art See MPEP 2121.01. In any event, "Even if a reference discloses an inoperative device, it is prior art for all that it teaches." MPEP 2121.01 (II), citing Beckman Instruments v. LKB Produkter AB, 892 F.2d 1547, 1551 (Fed. Cir. 1989). Given that Hatami-Hanza literally discloses that a chat-robot or chatting machine can produce relevant responses to the input of a chatter so as to make the conversation between the user and the machine an intelligent conversation, Hatami-Hanza, at least in para. 119, is enabled for the teachings of a human machine dialogue; and an input from one of an automated dialogue companion and a person engaged in the human machine dialogue with the automated dialogue companion.
	Therefore in view of the above, while all of Applicant’s arguments have been fully considered, they are not persuasive and the rejection in view of the combination of Sproat and Hatami-Hanza is maintained.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
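A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.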

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-21 are rejected under 35 U.S.C. 103 as being unpatentable over Sproat, (US 8,086,028 B2, herein “Sproat”), further in view of Hatami-Hanza, (US 2011/0093343 A1, herein “Hatami-Hanza”).
Regarding claim 1, Sproat teaches a method implemented on at least one machine including at least one processor, memory, (Sproat col. 30, lines 40-62, apparatus implementing a text-to-scene conversion system including a processor and memory) and communication platform capable of connecting to a network (Sproat col. 30, line 44, and col. 31, lines 4-9, network interface that operatively couples the processor to a communications network) for visualizing a scene, the method comprising (Sproat col. 4, lines 10-24, flow of actions embodied by the invention with an end result that a three-dimensional scene is rendered onto an image): 
receiving an input, wherein the input provides a description of a visual scene to be created (Sproat col. 4, lines 10-16 and col. 3, lines 13-23, text is input to the system (thus the system receiving the input), where the text is for example something like “John said that the cat is on the table” (which is a description of a visual scene to be created)); 
performing linguistic processing of the input to obtain semantics of the input (Sproat col. 4, lines 16-21, text is passed to a part of speech tagger, which tags the text with grammatical parts of speech, and parsed to form a dependency structure (linguistic processing), and the dependency structure is semantically interpreted and converted into a scene description (semantics)), wherein the semantics include at least one functional aspect implied by the input (Sproat col. 18, lines 10-55, and col. 7, lines 3-45, and col. 12, lines 53-63, the scene description including depiction of actions from the input text which can include visualizations of those actions via poses, the poses being implied from the action verb of the input text); 
generating a scene log to be used for rendering the visual scene based on the semantics of the input, wherein the scene log specifies at least one of (Sproat col. 4, lines 21-24, and col. 10, lines 44-48, the scene description (scene log) is generated from the dependency structure being semantically interpreted, and where the scene description is interpreted into a three-dimensional scene for rendering into an image)
a background of the visual scene (Sproat col. 12, lines 44-45, ground planes and lighting is added to the scene (background elements), where col. 19, lines 42-51, teaches that the environment or setting of the scene is specified by the entered text, and is considered to be the background of the scene, and gives an example of “John walked through a forest” to be the background of the scene to be a forest, where an environmental database is accessed to render the background – thus the background determined based on the meaning of the input text (semantics) to be a forest), 
one or more entities/objects that are to appear in the visual scene to achieve the at least one functional aspect (Sproat col. 7, lines 9-61, scene description including nodes of type OBJECT, and the designations therein that are a reference to a viewpoint™ catalog number, where the example given for “John said that the cat is on the table” shows an object referring to John with an “Action” defined as “say”), and 
at least one parameter associated with the one or more entities/objects for dynamically configuring the one or more entities/objects in the background in a manner that the rendered one or more entities/objects are capable of satisfying the at least one functional aspect (Sproat col. 7, lines 4-61, scene description is a description of the objects to be depicted in the scene and the relationships between the objects as determined from the processing/parsing of the input text sentences, where each object node includes a reference to a viewpoint™ catalog number to a 3D model representation of the object, and where col. 19, lines 42-61 teaches that the objects analyzed to be in the input text are placed upon the specified or default background, where col. 12, lines 52-61, and col. 13, line 29 – col. 14, line 10, teach each description element in the scene description having a type such as ACTION, and where ACTION is disclosed further as having instances and giving an example where the ACTION instance is “kick” and includes a make-pose-depictor “kick” which configures the subject “John” from input text “John kicked the ball” to be three feet behind a ball object (in the background) and in a “kick ball” pose to satisfy the kick action (functional aspect)); and 
rendering the visual scene based on the scene log by visualizing the background and the one or more entities/objects in accordance with the at least one parameter (Sproat col. 20, lines 24-56, after a three-dimensional scene description has been generated, a three-dimensional image is rendered using any number of three-dimensional rendering programs, where col. 15, line 64 – col. 16, line 44 teach that the viewpoint™ objects (from the catalog number (parameter)) define polygonal models for a library of objects and include additional information such as parts, color parts, opacity parts, default size and spatial tags for visual rendering of the objects).
Sproat does not explicitly teach visualizing a scene in a human machine dialogue, or an input from one of an automated dialogue companion and a person engaged in the human machine dialogue with the automated dialogue companion, or determined based on a current state of the human machine dialogue.
Hatami-Hanza teaches visualizing a scene in a human machine dialogue (Hatami-Hanza paras. [0010], [0077], [0096]-[0098], system for generating visual representative content for a given textual content which partitions input and matches it to an inventory of content partitions including descriptions of a scene, where para. [0119] teaches that one of the applications for the disclosed invention is in a chat-robot or chatting machine), an input from one of an automated dialogue companion and a person engaged in the human machine dialogue with the automated dialogue companion (Hatami-Hanza para. [0119] chat-robot or chatting machine to produce relevant responses to the input of a chatter so as to make the conversation between the user and the machine an intelligent conversation, where the (human) user writes something and the system responds back using media content that has a semantic relationship with the user input). Hatami-Hanza further teaches determined based on a current state of the human machine dialogue (Hatami-Hanza para. [0119] given that the chat-robot is responding to user input and generating content with a semantic relationship to the user input, it is based on a current state of the chat-robot and user chat).
Therefore, taking the teachings of Sproat and Hatami-Hanza together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the text-to-scene conversion aspects specifically cited above from Sproat with the chat-robot application disclosed in Hatami-Hanza at least because a chat-bot that could convert text to a visual message would be more appealing and informative in a question/answer service chat-bot context seeking to answer questions from a user (see Hatami-Hanza paras. [0009] and [0119]).
Regarding claims 2, 9, and 16, Sproat teaches wherein the input is at least one of an utterance and a text (Sproat col. 3, lines 13-23, text input is by entering the text via a voice-to-text translation program or by typing the text into a computer using the computer keyboard).
Regarding claims 3 and 10, Sproat teaches wherein the step of performing linguistic processing of the input comprises: recognizing a plurality of words in the input based on a vocabulary (Sproat col. 4, lines 16-21 and 49-63, text is tagged and parsed to form a dependency structure, where the tags are all parts of speech, where each word in the sentence is recognized as a specific part of speech including that “John” is a proper noun, based on the MXPOST tagger mapping words to parts of speech (thus using a vocabulary)); 
generating a language processing result based on the plurality of words in accordance with a language model (Sproat col. 4, lines 18-20 and col. 4, line 64 – col. 5, line 48, tagged text is parsed and converted to a dependency structure, including generating a parse tree representing the structure of the sentence using a statistical parser organizing the parts of speech tags into grammar groups (thus based on a language model) such as prepositional phrase and noun phrase); and 
identifying the semantics of the input based on the language processing result (Sproat col. 5, line 49 – col. 5, line 62, the parse tree is converted into a dependency structure which is one possible representation of the semantic relations of a sentence).
Regarding claims 4, 11, and 18, Sproat teaches wherein the at least one parameter includes at least one of a spatial parameter, a functional parameter, a contextual parameter, and a semantic parameter (Sproat col. 15, line 65 – col. 16, line 44, noting that the claim only requires “at least one,” additional information is associated with a viewpoint model such as a spatial tag (spatial parameter), also col. 17, lines 17-55 teaching that additional information including functional properties to describe objects operating in a particular way, also col. 7, line 62 – col. 8, line 57 teaching that scene description fragments are derived from the dependency structure by semantic interpretation frames, where a thesaurus is referenced to provide various kinds of semantic relations such as a hypernym and hyponym relation (cat is a kind of animal, table is a type of furniture) and the three-dimensional model will include a list of all cats, and a list of all tables – for example see col. 11, lines 1-49 giving an example of a scene description for “the animal was next to a bowl of apples,” and including that the two main objects “animal” and “bowl of apples” includes various types of animals and various types of bowls such as just a bowl or a fruit bowl specifically, also see the “stative-relation” (another semantic parameter) giving a positional relation “next to” for the animal object with the bowl object).
Regarding claims 5, 12, and 19, Sproat teaches wherein the spatial parameter associated with an entity/object specifies at least one of a pose of the entity/object in the background, an orientation of the entity/object, and a spatial relatedness of the entity/object with respect to another entity/object (noting that the claim only requires “at least one of,” Sproat col. 16, line 38 – col. 17, line 16, spatial tags used to spatially arrange and juxtapose objects together, such as depicting the “in” or “on” spatial relationship, where col. 15, lines 51-58 teaches spatial tags enclosure and base provide a rendering of a bird on the floor of a birdcage (spatial relatedness of a bird object to a birdcage object)).
Regarding claims 6, 13, and 20, Sproat teaches wherein the functional parameter associated with an entity/object specifies a function of the entity/object determined based on the semantics of the input and an associated visual feature of another entity/object due to the function of the entity/object (Sproat col. 17, lines 17-55 teaching that additional information including functional properties to describe objects operating in a particular way, such as translating the sentence “John rides to the store” as associating “ride” (a verb entity) as a vehicle based on what makes sense from the input sentence (the semantics – where a human “rides” a vehicle), and objects that function as land vehicles specifically are marked, where a land vehicle (as another entity/object) would have an associated visual feature of contact with land).
Regarding claims 7, 14, and 21, Sproat teaches wherein the semantic parameter associated with an entity/object specifies a first visual feature associated with the entity/object and a second visual feature associated with a different entity/object, wherein the first and second features are in alignment based on the semantics (Sproat col. 10, line 50 – col. 11, line 49, and col. 12, line 60 to col. 13, line 40, scene description parameters such as “POSSIBLE-COREFERENT” and “STATIVE-RELATION” providing information from the semantics of the input sentence (thus alignment based on the semantics) to have placement of the various nouns (objects) in the sentence such as animal and bowl of apples to have the animal be placed next to (visual feature) the bowl of apples, and that there is a visual relationship (visual feature) for placement of the apples with respect to the bowl, for example, the figure and ground references).
Regarding claim 8, Sproat teaches machine readable and non-transitory medium having information recorded thereon (Sproat col. 30, lines 40-62, apparatus implementing a text-to-scene conversion system including a processor and memory) for visualizing a scene (Sproat col. 4, lines 10-24, flow of actions embodied by the invention with an end result that a three-dimensional scene is rendered onto an image), wherein the information, when read by the machine, causes the machine to perform (Sproat col. 30, lines 40-67, memory storing instructions to perform a method according to the disclosed invention, and the processor being programmed to execute instructions to perform the method):
receiving an input wherein the input provides a description of a visual scene to be created (Sproat col. 4, lines 10-16 and col. 3, lines 13-23, text is input to the system (thus the system receiving the input), where the text is for example something like “John said that the cat is on the table” (which is a description of a visual scene to be created)); 
performing linguistic processing of the input to obtain semantics of the input (Sproat col. 4, lines 16-21, text is passed to a part of speech tagger, which tags the text with grammatical parts of speech, and parsed to form a dependency structure (linguistic processing), and the dependency structure is semantically interpreted and converted into a scene description (semantics)), wherein the semantics include at least one functional aspect implied by the input (Sproat col. 18, lines 10-55, and col. 7, lines 3-45, and col. 12, lines 53-63, the scene description including depiction of actions from the input text which can include visualizations of those actions via poses, the poses being implied from the action verb of the input text); 
generating a scene log to be used for rendering the visual scene based on the semantics of the input, wherein the scene log specifies at least one of (Sproat col. 4, lines 21-24, and col. 10, lines 44-48, the scene description (scene log) is generated from the dependency structure being semantically interpreted, and where the scene description is interpreted into a three-dimensional scene for rendering into an image)
a background of the visual scene (Sproat col. 12, lines 44-45, ground planes and lighting is added to the scene (background elements), where col. 19, lines 42-51, teaches that the environment or setting of the scene is specified by the entered text, and is considered to be the background of the scene, and gives an example of “John walked through a forest” to be the background of the scene to be a forest, where an environmental database is accessed to render the background – thus the background determined based on the meaning of the input text (semantics) to be a forest), 
one or more entities/objects that are to appear in the visual scene to achieve the at least one functional aspect (Sproat col. 7, lines 9-61, scene description including nodes of type OBJECT, and the designations therein that are a reference to a viewpoint™ catalog number, where the example given for “John said that the cat is on the table” shows an object referring to John with an “Action” defined as “say”), and 
at least one parameter associated with the one or more entities/objects for dynamically configuring the one or more entities/objects in the background in a manner that the rendered one or more entities/objects are capable of satisfying the at least one functional aspect (Sproat col. 7, lines 4-61, scene description is a description of the objects to be depicted in the scene and the relationships between the objects as determined from the processing/parsing of the input text sentences, where each object node includes a reference to a viewpoint™ catalog number to a 3D model representation of the object, and where col. 19, lines 42-61 teaches that the objects analyzed to be in the input text are placed upon the specified or default background, where col. 12, lines 52-61, and col. 13, line 29 – col. 14, line 10, teach each description element in the scene description having a type such as ACTION, and where ACTION is disclosed further as having instances and giving an example where the ACTION instance is “kick” and includes a make-pose-depictor “kick” which configures the subject “John” from input text “John kicked the ball” to be three feet behind a ball object (in the background) and in a “kick ball” pose to satisfy the kick action (functional aspect)); and 
rendering the visual scene based on the scene log by visualizing the background and the one or more entities/objects in accordance with the at least one parameter (Sproat col. 20, lines 24-56, after a three-dimensional scene description has been generated, a three-dimensional image is rendered using any number of three-dimensional rendering programs, where col. 15, line 64 – col. 16, line 44 teach that the viewpoint™ objects (from the catalog number (parameter)) define polygonal models for a library of objects and include additional information such as parts, color parts, opacity parts, default size and spatial tags for visual rendering of the objects).
Sproat does not explicitly teach visualizing a scene in a human machine dialogue, or an input from one of an automated dialogue companion and a person engaged in the human machine dialogue with the automated dialogue companion, or determined based on a current state of the human machine dialogue.
Hatami-Hanza teaches visualizing a scene in a human machine dialogue (Hatami-Hanza paras. [0010], [0077], [0096]-[0098], system for generating visual representative content for a given textual content which partitions input and matches it to an inventory of content partitions including descriptions of a scene, where para. [0119] teaches that one of the applications for the disclosed invention is in a chat-robot or chatting machine), an input from one of an automated dialogue companion and a person engaged in the human machine dialogue with the automated dialogue companion (Hatami-Hanza para. [0119] chat-robot or chatting machine to produce relevant responses to the input of a chatter so as to make the conversation between the user and the machine an intelligent conversation, where the (human) user writes something and the system responds back using media content that has a semantic relationship with the user input). Hatami-Hanza further teaches determined based on a current state of the human machine dialogue (Hatami-Hanza para. [0119] given that the chat-robot is responding to user input and generating content with a semantic relationship to the user input, it is based on a current state of the chat-robot and user chat).
Therefore, taking the teachings of Sproat and Hatami-Hanza together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the text-to-scene conversion aspects specifically cited above from Sproat with the chat-robot application disclosed in Hatami-Hanza at least because a chat-bot that could convert text to a visual message would be more appealing and informative in a question/answer service chat-bot context seeking to answer questions from a user (see Hatami-Hanza paras. [0009] and [0119]).
Regarding claim 15, Sproat teaches a system for visualizing a scene, comprising (Sproat col. 30, lines 40-62, col. 4, lines 10-24, apparatus implementing a text-to-scene conversion system including a processor and memory, where the flow of actions embodied by the invention with an end result that a three-dimensional scene is rendered onto an image):
a textual input based scene understanding unit implemented by a processor and configured for (Sproat col. 30, lines 40-67, processor configured to execute instructions required to perform the method in accordance with the invention) 
receiving an input wherein the input provides a description of a visual scene to be created (Sproat col. 4, lines 10-16 and col. 3, lines 13-23, text is input to the system (thus the system receiving the input), where the text is for example something like “John said that the cat is on the table” (which is a description of a visual scene to be created)); 
performing linguistic processing of the input to obtain semantics of the input (Sproat col. 4, lines 16-21, text is passed to a part of speech tagger, which tags the text with grammatical parts of speech, and parsed to form a dependency structure (linguistic processing), and the dependency structure is semantically interpreted and converted into a scene description (semantics)), wherein the semantics include at least one functional aspect implied by the input (Sproat col. 18, lines 10-55, and col. 7, lines 3-45, and col. 12, lines 53-63, the scene description including depiction of actions from the input text which can include visualizations of those actions via poses, the poses being implied from the action verb of the input text); 
generating a scene log to be used for rendering the visual scene based on the semantics of the input, wherein the scene log specifies at least one of (Sproat col. 4, lines 21-24, and col. 10, lines 44-48, the scene description (scene log) is generated from the dependency structure being semantically interpreted, and where the scene description is interpreted into a three-dimensional scene for rendering into an image)
a background of the visual scene (Sproat col. 12, lines 44-45, ground planes and lighting is added to the scene (background elements), where col. 19, lines 42-51, teaches that the environment or setting of the scene is specified by the entered text, and is considered to be the background of the scene, and gives an example of “John walked through a forest” to be the background of the scene to be a forest, where an environmental database is accessed to render the background – thus the background determined based on the meaning of the input text (semantics) to be a forest), 
one or more entities/objects that are to appear in the visual scene to achieve the at least one functional aspect (Sproat col. 7, lines 9-61, scene description including nodes of type OBJECT, and the designations therein that are a reference to a viewpoint™ catalog number, where the example given for “John said that the cat is on the table” shows an object referring to John with an “Action” defined as “say”), and 
at least one parameter associated with the one or more entities/objects for dynamically configuring the one or more entities/objects in the background in a manner that the rendered one or more entities/objects are capable of satisfying the at least one functional aspect (Sproat col. 7, lines 4-61, scene description is a description of the objects to be depicted in the scene and the relationships between the objects as determined from the processing/parsing of the input text sentences, where each object node includes a reference to a viewpoint™ catalog number to a 3D model representation of the object, and where col. 19, lines 42-61 teaches that the objects analyzed to be in the input text are placed upon the specified or default background, where col. 12, lines 52-61, and col. 13, line 29 – col. 14, line 10, teach each description element in the scene description having a type such as ACTION, and where ACTION is disclosed further as having instances and giving an example where the ACTION instance is “kick” and includes a make-pose-depictor “kick” which configures the subject “John” from input text “John kicked the ball” to be three feet behind a ball object (in the background) and in a “kick ball” pose to satisfy the kick action (functional aspect)); and 
a semantics based visual scene rendering unit implemented by a processor and configured for (Sproat col. 30, lines 40-67, processor configured to execute instructions required to perform the method in accordance with the invention) rendering the visual scene based on the scene log by visualizing the background and the one or more entities/objects in accordance with the at least one parameter (Sproat col. 20, lines 24-56, after a three-dimensional scene description has been generated, a three-dimensional image is rendered using any number of three-dimensional rendering programs, where col. 15, line 64 – col. 16, line 44 teach that the viewpoint™ objects (from the catalog number (parameter)) define polygonal models for a library of objects and include additional information such as parts, color parts, opacity parts, default size and spatial tags for visual rendering of the objects).
Sproat does not explicitly teach visualizing a scene in a human machine dialogue, or an input from one of an automated dialogue companion and a person engaged in the human machine dialogue with the automated dialogue companion, or determined based on a current state of the human machine dialogue.

Hatami-Hanza teaches visualizing a scene in a human machine dialogue (Hatami-Hanza paras. [0010], [0077], [0096]-[0098], system for generating visual representative content for a given textual content which partitions input and matches it to an inventory of content partitions including descriptions of a scene, where para. [0119] teaches that one of the applications for the disclosed invention is in a chat-robot or chatting machine), an input from one of an automated dialogue companion and a person engaged in the human machine dialogue with the automated dialogue companion (Hatami-Hanza para. [0119] chat-robot or chatting machine to produce relevant responses to the input of a chatter so as to make the conversation between the user and the machine an intelligent conversation, where the (human) user writes something and the system responds back using media content that has a semantic relationship with the user input). Hatami-Hanza further teaches determined based on a current state of the human machine dialogue (Hatami-Hanza para. [0119] given that the chat-robot is responding to user input and generating content with a semantic relationship to the user input, it is based on a current state of the chat-robot and user chat).
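Therefore, taking the teachings of Sproat and Hatami-Hanza together as a whole, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the text-to-scene conversion aspects specifically cited above from Sproat with the chat-robot application disclosed in Hatami-Hanza at least because a chat-bot that could convert text to a visual message would be more appealing and informative in a question/answer service chat-bot context seeking to answer questions from a user (see Hatami-Hanza paras. [0009] and [0119]).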
Regarding claim 17, Sproat teaches wherein the textual input based scene semantics understanding unit comprises: a signal processing unit implemented by a processor and configured for (Sproat col. 30, lines 40-67, processor configured to execute instructions required to perform the method in accordance with the invention) recognizing a plurality of words in the input based on a vocabulary (Sproat col. 4, lines 16-21 and 49-63, text is tagged and parsed to form a dependency structure, where the tags are all parts of speech, where each word in the sentence is recognized as a specific part of speech including that “John” is a proper noun, based on the MXPOST tagger mapping words to parts of speech (thus using a vocabulary)); 
a language understanding unit implemented by a processor and configured for (Sproat col. 30, lines 40-67, processor configured to execute instructions required to perform the method in accordance with the invention) generating a language processing result based on the plurality of words in accordance with a language model (Sproat col. 4, lines 18-20 and col. 4, line 64 – col. 5, line 48, tagged text is parsed and converted to a dependency structure, including generating a parse tree representing the structure of the sentence using a statistical parser organizing the parts of speech tags into grammar groups (thus based on a language model) such as prepositional phrase and noun phrase); and 
Sproat col. 30, lines 40-67, processor configured to execute instructions required to perform the method in accordance with the invention) identifying the semantics of the input based on the language processing result (Sproat col. 5, line 49 – col. 5, line 62, the parse tree is converted into a dependency structure which is one possible representation of the semantic relations of a sentence).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHELLE M KOETH whose telephone number is (571)272-5908. The examiner can normally be reached Monday-Friday, 09:30-18:30 EDT/EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an 
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta can be reached on 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MICHELLE M. KOETH
Primary Examiner
Art Unit 2656