Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 01/24/2020, 03/19/2021 and 09/29/2021 are being considered by the examiner.
Drawings
The drawing submitted on 11/05/2019 is accepted by the examiner.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


Claims 1-4, 6-9, 11-13, 15-18 and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Knipp et al. (US 2016/0225187 A1).
Regarding Claims 1, 11, and 20, Knipp et al. teach: A method comprising: receiving, by a processing device, audio data ([0025] Similarly, one embodiment of Narratarium employs natural language processing (sometimes referred to herein as automatic speech recognition) to interpret stories told by a user or to convert spoken words, such as those read from a book, into text. [0041] Returning to FIGS. 1A, 1B, and 1C, example system 100 also includes one or more presentation environment sensor(s) 145. Sensor(s) 145 provide information from the presentation environment to storytelling platform 110 for facilitating storytelling. For example, such information might include spoken information from a user, such as words spoken, cadence, energy level, rate, spoken sound effects, singing, or other spoken and/or audible information; other sound generated by a user (e.g., clapping, tapping or patting, playing an instrument, etc.)); analyzing contextual data associated with a user ([0027] In some embodiments, the storytelling experience provided by the Narratarium is contextually tailored to the user and/or the user's environment, based on, for example, user history or preferences, information about the presentation environment, storytelling conditions or other contextual input. [0028] The story experience provided by embodiments of Narratarium are dynamic because the story elements included in a presented story may be modified based on user and/or contextual input, including information provided in real time. 
[0041] …physical information about the presentation environment, such as dimensions, presence, and/or location of objects in the presentation environment, such as toys, furniture, windows, doorways, etc., which may be incorporated into a story; ambient light; and information about the user, such as location, motion or gestures, activity level, position (e.g., the direction a user is facing), the number of users in the presentation environment, or other information that may be sensed from the presentation environment.); determining a match between the audio data and data of a text source ([0025] For example, in one embodiment, text-matching is employed to recognize a specific known story text or scene and provide corresponding imagery and sounds. Similarly, one embodiment of Narratarium employs natural language processing (sometimes referred to herein as automatic speech recognition) to interpret stories told by a user or to convert spoken words, such as those read from a book, into text.); and initiating a physical effect in response to the determining of the match, wherein the physical effect corresponds to the text source and is based on the contextual data ([0025] For example, in one embodiment, text-matching is employed to recognize a specific known story text or scene and provide corresponding imagery and sounds. Similarly, one embodiment of Narratarium employs natural language processing (sometimes referred to herein as automatic speech recognition) to interpret stories told by a user or to convert spoken words, such as those read from a book, into text. [0026] For example, a story might involve the user exercising, dancing, or reaching out to touch or interact with a projected character. 
Other cues and objects in the presentation environment, such as doorways, windows, and furniture, may also be incorporated into a story experience; for example, a character's appearance and sound may be projected as though entering through an actual doorway to the room or flying through an actual window of the room. [0067] In an embodiment, story content is included (or pointed to by the coded story) in a manner corresponding to locations of the story wherein the story element should be introduced (for example, playing a sound of a wave crashing when the story shows an image of a beach). [0068] In some embodiments, this includes an immersive story experience that wraps a projected visual story around the audience while playing layered audio, such as music, character dialog, sound effects, etc. [0134] In other words, story content comprising images, sounds, animations, settings, etc., corresponding to ice, icebergs, south pole, etc., is determined. In the example above, this story content may include visual and/or audio information of story elements such as a penguin, ice, howling winds, icebergs, water, splashing, etc.).

Regarding Claims 2 and 12, Knipp et al. teach: The method of claim 1, wherein the physical effect corresponding to the text source modifies an environment of the user and comprises at least one of an acoustic effect, an optical effect, or a haptic effect (See rejection of claim 1).

Regarding Claims 3 and 13, Knipp et al. teach: The method of claim 1, wherein the contextual data comprises at least one of sound data, light data, time data, weather data, calendar data, or user profile data (See rejection of claim 1 and [0026] For example, a story might involve the user exercising, dancing, or reaching out to touch or interact with a projected character. Other cues and objects in the presentation environment, such as doorways, windows, and furniture, may also be incorporated into a story experience; for example, a character's appearance and sound may be projected as though entering through an actual doorway to the room or flying through an actual window of the room. [0027] In some embodiments, the storytelling experience provided by the Narratarium is contextually tailored to the user and/or the user's environment, based on, for example, user history or preferences, information about the presentation environment, storytelling conditions or other contextual input. For example, the time of day may determine story length and level of excitement such that a story presented 10 minutes before a user's bedtime is made an appropriate length and winds down the excitement level so as to prepare a child for bed. [0066] In one embodiment, story logic includes logic for altering the story (or presenting story content for facilitating story branching, such as the firefly example) based on a determined attention level of the child. Where the child appears distracted, logic may specify introducing a heightened energy level, which may correspond to more sound effects, visual movement, or character actions. [0067] In an embodiment, story content is included (or pointed to by the coded story) in a manner corresponding to locations of the story wherein the story element should be introduced (for example, playing a sound of a wave crashing when the story shows an image of a beach).).

Regarding Claim 4, Knipp et al. teach: The method of claim 1, wherein the contextual data comprises sound data (images or visual component of a story associated with a sound, i.e. beach, penguin etc.) of an environment of the user and wherein the physical effect comprises an acoustic effect at a volume (sound effect) based on the sound data (See rejection of claim 1 and [0049] For example, a bedtime rule or condition may indicate that a story should end by the user's bedtime and should wind down energy level so as to prepare a user for sleep. In some embodiments, story logic 127 may also be associated with items in the resources library 125 and/or relationships knowledge representation component 115; for example, a penguin might be associated with library items such as scenes and sounds of the arctic.
 [0067] In an embodiment, story content is included (or pointed to by the coded story) in a manner corresponding to locations of the story wherein the story element should be introduced (for example, playing a sound of a wave crashing when the story shows an image of a beach). [0068] In some embodiments, this includes an immersive story experience that wraps a projected visual story around the audience while playing layered audio, such as music, character dialog, sound effects, etc. [0134] In other words, story content comprising images, sounds, animations, settings, etc., corresponding to ice, icebergs, south pole, etc., is determined. In the example above, this story content may include visual and/or audio information of story elements such as a penguin, ice, howling winds, icebergs, water, splashing, etc.).
 
Regarding Claims 6 and 15, Knipp et al. teach: The method of claim 1, wherein the text source comprises a word and wherein initiating the physical effect is responsive to detecting that the audio data comprises the word (See rejection of claim 1).

Regarding Claims 7 and 16, Knipp et al. teach: The method of claim 6, further comprising: selecting the physical effect based on the word of the text source; and updating an attribute of the physical effect based on the contextual data (See rejection of claim 1).

Regarding Claims 8 and 17, Knipp et al. teach: The method of claim 1, wherein determining the match between the audio data and data of a text source comprises detecting that the audio data comprises a word of the text source (story) using phoneme data (text of the story) of the text source (See rejection of claim 1, specifically [0025] For example, in one embodiment, text-matching is employed to recognize a specific known story text or scene and provide corresponding imagery and sounds. Similarly, one embodiment of Narratarium employs natural language processing (sometimes referred to herein as automatic speech recognition) to interpret stories told by a user or to convert spoken words, such as those read from a book, into text.). Note: determining the match between the audio data and data of a text source using phoneme data is inherent in the process of natural language processing of text matching and conversion.

Regarding Claims 9 and 18, Knipp et al. teach: The method of claim 1, wherein the data of the text source comprises phoneme data (story), and wherein determining the match comprises calculating a phoneme edit distance between the phoneme data of the text source and phoneme data of the audio data (See rejection of claim 1, specifically [0025] For example, in one embodiment, text-matching is employed to recognize a specific known story text or scene and provide corresponding imagery and sounds. Similarly, one embodiment of Narratarium employs natural language processing (sometimes referred to herein as automatic speech recognition) to interpret stories told by a user or to convert spoken words, such as those read from a book, into text.). Note: calculating a phoneme edit distance between the phoneme data of the text source and phoneme data of the audio data is inherent in the process of natural language processing of text matching and conversion.
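
For illustration only, and not as a characterization of how either the claimed method or the Knipp et al. system is actually implemented, the "phoneme edit distance" recited in claims 9 and 18 can be understood as a Levenshtein-style distance computed over phoneme sequences rather than characters. The sketch below is a minimal example under stated assumptions: the function names, the ARPAbet-style phoneme symbols, and the 0.25 match threshold are all hypothetical and are not taken from the application or from the cited reference.

```python
from typing import Sequence


def phoneme_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences.

    Each insertion, deletion, or substitution of one phoneme costs 1.
    """
    # prev[j] holds the distance between the first i-1 phonemes of `a`
    # and the first j phonemes of `b`.
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            substitution_cost = 0 if pa == pb else 1
            curr.append(
                min(
                    prev[j] + 1,                      # delete pa
                    curr[j - 1] + 1,                  # insert pb
                    prev[j - 1] + substitution_cost,  # substitute or keep
                )
            )
        prev = curr
    return prev[-1]


def is_match(text_phonemes: Sequence[str],
             audio_phonemes: Sequence[str],
             max_ratio: float = 0.25) -> bool:
    """Hypothetical match rule: distance normalized by the text length.

    The 0.25 threshold is an assumption chosen only for this example.
    """
    if not text_phonemes:
        return False
    distance = phoneme_edit_distance(text_phonemes, audio_phonemes)
    return distance / len(text_phonemes) <= max_ratio


# Hypothetical phoneme data for the word "penguin" from a text source,
# and a slightly different sequence recognized from audio.
TEXT = ["P", "EH", "NG", "G", "W", "IH", "N"]
AUDIO = ["P", "EH", "N", "G", "W", "IH", "N"]

if __name__ == "__main__":
    print(phoneme_edit_distance(TEXT, AUDIO))
    print(is_match(TEXT, AUDIO))
```

In this sketch a single substituted phoneme still yields a match, which is the practical reason an edit distance is used instead of exact comparison: it tolerates small recognition or pronunciation differences between the spoken audio and the text source.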

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 5 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Knipp et al. in view of Roche et al. (US 10586369 B1).
Regarding Claim 5, Knipp et al. teach:  contextual data comprises data of an environment (including ambient light) of the user and wherein the physical effect comprises an optical effect that modifies data based on the story and contextual data (See rejection of claim 1 and [0025] Thus, some embodiments of the Narratarium may be considered to be context aware. For example, as a child tells a story about a jungle, the child's room is filled with greens and browns and foliage comes into view. Animals that live in the jungle may be introduced or suggested as characters to the story. Similarly, as a parent tells a story to a child (including a parent, grandparent, or other person(s) telling the story from a remote location), the room is filled with images, colors, sounds, and presence, based on the story. Narratarium further determines other story elements consistent with the provided information, such as appropriate settings or characters; for example, a penguin might appear in an arctic setting but not in a desert setting. [0041] Returning to FIGS. 1A, 1B, and 1C, example system 100 also includes one or more presentation environment sensor(s) 145. Sensor(s) 145 provide information from the presentation environment to storytelling platform 110 for facilitating storytelling. 
For example, such information might include spoken information from a user, such as words spoken, cadence, energy level, rate, spoken sound effects, singing, or other spoken and/or audible information; other sound generated by a user (e.g., clapping, tapping or patting, playing an instrument, etc.); physical information about the presentation environment, such as dimensions, presence, and/or location of objects in the presentation environment, such as toys, furniture, windows, doorways, etc., which may be incorporated into a story; ambient light; and information about the user, such as location, motion or gestures, activity level, position (e.g., the direction a user is facing), the number of users in the presentation environment, or other information that may be sensed from the presentation environment. Examples of sensor(s) 145 include, by way of example and not limitation, one or more cameras, depth-imaging or depth-determining systems, microphones, which might include noise-canceling functionality, ambient light sensors, motion sensors, scanners, GPS or location sensors, or other such devices capable of receiving information from the presentation environment.).
Knipp et al. however do not explicitly teach: wherein the contextual data comprises light data of an environment of the user and wherein the physical effect comprises an optical effect that modifies a luminance of a light source based on the light data.
Roche et al. teach: wherein the contextual data comprises light data of an environment of the user and wherein the physical effect comprises an optical effect that modifies a luminance of a light source based on the light data (Col 3, line 51 to Col 4, line 22, In various embodiments, the text may indicate effects associated with objects in the virtual environment and/or objects in the real world environment (e.g., Internet of Things (IoT) and/or other devices capable of exchanging data with the various services and/or devices described herein, such as via wireless signals. The effects may include animations associated with virtual objects, such as movement of those objects and/or operation of the objects (e.g., output of light, sounds, etc.). For example, input text may include “I am turning on the lights.” The services described herein may create animations to show an avatar interact with a virtual light switch in the virtual environment and then depict a light changing from a state of “off” to a state of “on” by creating visual effects to show the same. When configured, the input text may cause a real world light to turn on or may control other real world devices such as haptic devices, sound producing devices, air movement devices (e.g., fans, mist devices, etc.), vibration devices for haptic feedback, and/or other real world devices, possibly via transmission of wireless signals. The speech input ingestion service may analyze the input text to determine virtual objects and/or real world objects that may be candidates for animation or other effects. The speech input ingestion service may create metadata and/or include information in the SMD to initiate the object animations and/or effects, such as in coordination with animation and speech associated with words that indicate the object animation and/or effect. The input text may include explicit words to enable association of tags, speakers, and/or real world objects (e.g., IoT devices, etc.). 
However, in some instances, contextual information may be used to create these associations. For example when turning on a light, the context may suggest that a tag “light switch” is relevant to the phrase “turning on a light”. Other contextual information may be used, which may aggregated based on machine learning algorithms and/or historical data, as discussed herein.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Knipp et al. to include the teaching of Roche et al. above in order to allow contextual information to be used to create visual effects based on the input text.

Regarding Claim 14, Knipp et al. teach: The system of claim 11, wherein: the contextual data comprises sound data of an environment of the user and wherein the physical effect comprises an acoustic effect at a volume based on the sound data (See rejection of claim 4) and contextual data comprises data of an environment (including ambient light) of the user and wherein the physical effect comprises an optical effect that modifies data based on the story and contextual data (see Knipp et al. teaching in rejection of claim 5).
Knipp et al. however do not explicitly teach: wherein the contextual data comprises light data of an environment of the user and wherein the physical effect comprises an optical effect that modifies a luminance of a light source based on the light data.
Roche et al. teach: wherein the contextual data comprises light data of an environment of the user and wherein the physical effect comprises an optical effect that modifies a luminance of a light source based on the light data (Col 3, line 51 to Col 4, line 22, In various embodiments, the text may indicate effects associated with objects in the virtual environment and/or objects in the real world environment (e.g., Internet of Things (IoT) and/or other devices capable of exchanging data with the various services and/or devices described herein, such as via wireless signals. The effects may include animations associated with virtual objects, such as movement of those objects and/or operation of the objects (e.g., output of light, sounds, etc.). For example, input text may include “I am turning on the lights.” The services described herein may create animations to show an avatar interact with a virtual light switch in the virtual environment and then depict a light changing from a state of “off” to a state of “on” by creating visual effects to show the same. When configured, the input text may cause a real world light to turn on or may control other real world devices such as haptic devices, sound producing devices, air movement devices (e.g., fans, mist devices, etc.), vibration devices for haptic feedback, and/or other real world devices, possibly via transmission of wireless signals. The speech input ingestion service may analyze the input text to determine virtual objects and/or real world objects that may be candidates for animation or other effects. The speech input ingestion service may create metadata and/or include information in the SMD to initiate the object animations and/or effects, such as in coordination with animation and speech associated with words that indicate the object animation and/or effect. The input text may include explicit words to enable association of tags, speakers, and/or real world objects (e.g., IoT devices, etc.). 
However, in some instances, contextual information may be used to create these associations. For example when turning on a light, the context may suggest that a tag “light switch” is relevant to the phrase “turning on a light”. Other contextual information may be used, which may aggregated based on machine learning algorithms and/or historical data, as discussed herein.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Knipp et al. to include the teaching of Roche et al. above in order to allow contextual information to be used to create visual effects based on the input text.

Claims 10 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Knipp et al.
Regarding Claims 10 and 19, Knipp et al. teach: wherein the contextual data comprises user profile data including an identity and preference of a user (i.e. a child) and wherein the physical effect comprises an acoustic effect selected based on the story and the identity and preference of the user (i.e. child) ([0024] At a high level, embodiments of the Narratarium are designed to augment storytelling and creative play of the user, which might include a child or a parent. The story experience provided by some embodiments of Narratarium may include story elements (such as plotlines, characters, settings, themes, duration, or other aspects of a story) based on information provided by the user and/or the environment, including information provided in real time, as well as information derived from printed stories, audio recordings, toys, or other sources. Thus, some embodiments of the Narratarium may be considered to be context aware. For example, as a child tells a story about a jungle, the child's room is filled with greens and browns and foliage comes into view. Animals that live in the jungle may be introduced or suggested as characters to the story. Similarly, as a parent tells a story to a child (including a parent, grandparent, or other person(s) telling the story from a remote location), the room is filled with images, colors, sounds, and presence, based on the story. Narratarium further determines other story elements consistent with the provided information, such as appropriate settings or characters; for example, a penguin might appear in an arctic setting but not in a desert setting. 
[0050] Some embodiments of storage 120 also stores user information 129, which may include, by way of example and not limitation, user preferences (e.g., favorite characters, themes, user(s)' bedtime information), which may be used by storytelling engine 160 for determining story length and energy level; previous stories (including dynamic stories) presented to a user, which may be used for presenting a user's favorite story elements (e.g., characters, themes, etc.) more frequently or presenting new story elements (e.g., new plots, settings, characters, etc.); user profiles, which may be used for storing user information when there is more than one user, and which may include voice profile information for identifying a user based on their voice; user accounts or account information, which may be used by embodiments providing content through a subscription model or downloadable story packages or expansion sets, or may facilitate users sharing their stories or story content with other users on other Narratarium systems. [0101] Furthermore, in some embodiments, the virtual assistant is tailored to the storyteller or listener, for example, accounting for age and ability to interact. [0108] Additionally, some embodiments of storytelling engine 160 are capable of modifying the story experience (including real-time changes) to keep a child-user (or audience) appropriately engaged for a suitable length of time, such as after the parent leaves the room and/or until the child-user falls asleep.).
Knipp et al. do not explicitly teach: contextual data comprises user profile data indicating an age of a child and wherein the physical effect comprises an acoustic effect selected based on the age of the child.
However, contextual data comprising user profile data indicating an age of a child, and a physical effect comprising an acoustic effect selected based on the age of the child, would have been obvious in view of the Knipp et al. teaching, since Knipp et al. teach that Narratarium is context aware, i.e., a child tells a story and Narratarium further determines other story elements consistent with the provided information, such as appropriate settings or characters ([0024]). Knipp et al. further teach user profiles containing user information, which include information for identifying a user and account information and which may be used in providing content ([0050]); that the virtual assistant is tailored to the storyteller or listener, for example, accounting for age and ability to interact ([0101]); that storytelling engine 160 is capable of modifying the story experience (including real-time changes) to keep a child-user (or audience) appropriately engaged for a suitable length of time ([0108]); and that the contextual information comprises at least one of the time of day, day of week, age level of a user located in the presentation environment, bed time of a user located in the presentation environment, or emotional-energy level of speech by a storyteller in the presentation environment (Claim 6).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Knipp et al. to explicitly include “contextual data comprises user profile data indicating an age of a child and wherein the physical effect comprises an acoustic effect selected based on the age of the child” in order to determine story elements consistent with user information, such as appropriate settings or characters, with respect to the age of the user located in the presentation environment.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Goslin et al. (US 2017/0228026 A1) teach: identifying a plurality of storytelling devices available to participate in a storytelling experience, including a first storytelling device at a first physical location and a second storytelling device at a remote second physical location. The method further comprises receiving, based on user input during playback of a story, an instruction to perform a first action of a predetermined plurality of actions using the second storytelling device, the user input indicating a user interaction with a depiction of the second storytelling device at the first physical location.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD K ISLAM whose telephone number is (571)270-5878. The examiner can normally be reached Monday-Friday, EST (IFP).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/MOHAMMAD K ISLAM/Primary Examiner, Art Unit 2656