DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
	This Office Action has been issued in response to Applicant’s Communication of amended application S/N 17/090,939 filed on November 18, 2022.  Claims 1, 3 to 8, 10 to 15, 17 to 22, and 24 are currently pending with the application.
	
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1, 3 to 8, 10 to 15, 17 to 22, and 24 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 1 recites “reproduces one or more acoustic properties of the location and perspectives at different points in time based on interactions with the one or more images” in line 7. These elements are not described in the specification. In other words, the examiner could not find where the specification describes this process, and more specifically, how the acoustic properties are reproduced based on interactions with the one or more images.
The same rationale applies to independent claims 8 and 15, since they recite similar limitations, and to claims 3 to 7, 10 to 14, 17 to 22, and 24, by virtue of their dependency on claims 1, 8, and 15.
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1, 3 to 8, 10 to 15, 17 to 22, and 24 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites the limitation “the embedded audio that reproduces one or more acoustic properties of the location and perspectives at different points in time” in line 6.  There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 7, 8, 14, 15, 21, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Schofield et al. (U.S. Publication No. 2019/0318525) hereinafter Schofield, in view of Zellner (U.S. Publication No. 2020/0250588), and further in view of Jarske et al. (U.S. Publication No. 2015/0316640) hereinafter Jarske.
As to claim 1:
Schofield discloses:
A computer-implemented method comprising: 
dynamically generating audio for one or more images associated with a location based on contextual information that satisfies a request [Paragraph 0030 teaches demonstrating aesthetic and sound characteristics of various materials and items; Paragraph 0042 teaches, upon receiving user input selecting an environment, environmental attributes, and items, retrieving visual and audio data relevant for the virtual reality experience selected by the user, hence, dynamically generating audio and visual data associated with a location or environment based on the user request];
	embedding the generated audio into the one or more images [Paragraph 0024 teaches collecting sound samples; Paragraph 0042 teaches simulating items, by retrieving visual and audio data relevant for the experience selected by the user, therefore, embedding generated audio into the image presented to the user; Paragraph 0044 teaches output one or more sounds associated with the one or more items displayed in the rendering]; and
emulating audio in a generated user interface that provides a realistic representation of the location using the embedded audio that reproduces one or more acoustic properties of the location based on interactions with the one or more images [Paragraph 0029 teaches providing an immersive visual and audio experience to users, providing a virtual representation of the aesthetic and sound characteristics of various items; Paragraph 0030 teaches demonstrating aesthetic and sound characteristics of various items; Paragraph 0036 teaches displaying an image, overlaying one or more surfaces displayed within the displayed image with a virtual representation of an item (e.g., a flooring material and/or ceiling material), to provide a virtually augmented representation of a particular environment, including one or more items; Paragraph 0039 teaches realistically replicating sound characteristics of particular items presented to the user; Paragraph 0044 teaches executing sound files to output sounds representative of sound characteristics of the one or more of the displayed items, that can be in response to a trigger event generated by a selection menu, therefore, based on interactions with the images; Paragraph 0047 teaches sound files may be representative of ambient noises generated within a particular room, to provide an audible understanding of how a particular item impacts sound travel within a given space; Paragraph 0052 teaches the virtual reality system thereby demonstrates how various sounds are perceptible to a user during simulated, real-world experiences].
Schofield does not appear to expressly disclose reproducing perspectives at different points in time; implementing a feedback mechanism that solicits feedback indicating satisfaction levels for the emulated audio; and in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, altering depiction of the one or more images and corresponding emulated audio.
Zellner discloses:
implementing a feedback mechanism that solicits feedback indicating satisfaction levels for the emulated audio [Paragraph 0027 teaches feedback application can facilitate sound feedback provided by the user while the user is listening to the audio input, which allows the user to provide sound feedback such as more or less of different frequencies, etc.; Paragraph 0032 teaches when user is satisfied with the ambient noise conditions, the user can select a continue option, therefore, a feedback mechanism that requests feedback indicating satisfaction for the audio].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by implementing a feedback mechanism that solicits feedback indicating satisfaction levels for the emulated audio, as taught by Zellner [Paragraphs 0027, 0032], because both applications are directed to processing of media including simulations of audio in an environment; including the ability to solicit feedback enables the improvement of the user’s experience, since the obtained feedback can be used to optimize personalization, and the listening experience of the user (See Zellner Paras [0022],[0028]).
Neither Schofield nor Zellner appears to expressly disclose reproducing perspectives at different points in time; and in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, altering depiction of the one or more images and corresponding emulated audio.
Jarske discloses:
reproducing perspectives at different points in time [Paragraph 0041 teaches one or more audio objects may be determined from metadata associated with a content item, for example, a given shopping district at a given time of day during a shopping day; Paragraph 0047 teaches audio objects may be associated with different times of a day, days of a week, etc.; Paragraph 0049 teaches user may view various content items and hear various associated audio objects and their elements under different conditions (e.g., daytime, night time, rush hour, lunch time, weekends, etc.)]; and 
in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, altering depiction of the one or more images and corresponding emulated audio [Paragraph 0044 teaches audio objects and its elements may be recomposed, altered, manipulated, etc., based on viewer preferences, therefore, in response to user feedback; Paragraph 0048 teaches UI application with various elements may be presented to a user whereby the user may indicate various preferences; Paragraph 0049 teaches causing a modification to the objects based on user interactions, i.e., when a user selects different angles or positions for viewing various content items, and for hearing various associated audio objects and their elements under different conditions, therefore, in response to receiving user feedback, altering depiction of the images and corresponding audio].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by reproducing perspectives at different points in time; and in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, altering depiction of the one or more images and corresponding emulated audio, as taught by Jarske [Paragraphs 0041, 0044, 0047, 0049, 0032], because the applications are directed to processing of media including simulations of audio in an environment; reproducing perspectives at different times, and altering depiction of images and audio based on user feedback increases flexibility of the system, and allows for a more complete and improved user experience, with a true sense of the place of interest to the user (See Jarske Paras [0002],[0026]).


	As to claim 7:
Schofield discloses:
generating a score that indicates a noise level associated with an object depicted in respective images of the one or more images [Paragraph 0020 teaches sound samples may be generated for various sound rating levels along the Sound Transmission Class (STC) rating scale, for various values of the Noise Reduction Coefficient (NRC) rating scale, etc.; Paragraph 0047 teaches providing the user with an understanding of the volume of noise that passes through one or more items]; and
in response to the generated score meeting or exceeding a threshold score for the noise level, recommending an action that alters acoustic properties of the object [Paragraph 0016 teaches comparing sound transmission characteristics against other reference points, and outputting other sound samples of perceived “quieter” products, e.g., identified based on sound transmission ratings, therefore, the reference points represent the threshold scores for the noise level; Paragraph 0050 teaches provide users with an indication regarding how similar—or how different—sound is transmitted through products having adjacent rating levels (e.g., to compare similarities in sound transmission between products having an IIC-50 rating versus products having an IIC-49 rating)].

As to claim 8:
Schofield discloses:
A computer program product comprising: one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the program instructions comprising: 
program instructions to dynamically generate audio for one or more images associated with a location based on contextual information that satisfies a request [Paragraph 0030 teaches demonstrating aesthetic and sound characteristics of various materials and items; Paragraph 0042 teaches, upon receiving user input selecting an environment, environmental attributes, and items, retrieving visual and audio data relevant for the virtual reality experience selected by the user, hence, dynamically generating audio and visual data associated with a location or environment based on the user request];
	program instructions to embed the generated audio into the one or more images [Paragraph 0024 teaches collecting sound samples; Paragraph 0042 teaches simulating items, by retrieving visual and audio data relevant for the experience selected by the user, therefore, embedding generated audio into the image presented to the user; Paragraph 0044 teaches output one or more sounds associated with the one or more items displayed in the rendering]; and
program instructions to emulate audio in a generated user interface that provides a realistic representation of the location using the embedded audio that reproduces one or more acoustic properties of the location based on interactions with the one or more images [Paragraph 0029 teaches providing an immersive visual and audio experience to users, providing a virtual representation of the aesthetic and sound characteristics of various items; Paragraph 0030 teaches demonstrating aesthetic and sound characteristics of various items; Paragraph 0036 teaches displaying an image, overlaying one or more surfaces displayed within the displayed image with a virtual representation of an item (e.g., a flooring material and/or ceiling material), to provide a virtually augmented representation of a particular environment, including one or more items; Paragraph 0039 teaches realistically replicating sound characteristics of particular items presented to the user; Paragraph 0044 teaches executing sound files to output sounds representative of sound characteristics of the one or more of the displayed items, that can be in response to a trigger event generated by a selection menu, therefore, based on interactions with the images; Paragraph 0047 teaches sound files may be representative of ambient noises generated within a particular room, to provide an audible understanding of how a particular item impacts sound travel within a given space; Paragraph 0052 teaches the virtual reality system thereby demonstrates how various sounds are perceptible to a user during simulated, real-world experiences].
Schofield does not appear to expressly disclose reproducing perspectives at different points in time; implementing a feedback mechanism that solicits feedback indicating satisfaction levels for the emulated audio; and in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, altering depiction of the one or more images and corresponding emulated audio.
Zellner discloses:
program instructions to implement a feedback mechanism that solicits feedback indicating satisfaction levels for the emulated audio [Paragraph 0027 teaches feedback application can facilitate sound feedback provided by the user while the user is listening to the audio input, which allows the user to provide sound feedback such as more or less of different frequencies, etc.; Paragraph 0032 teaches when user is satisfied with the ambient noise conditions, the user can select a continue option, therefore, a feedback mechanism that requests feedback indicating satisfaction for the audio].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by implementing a feedback mechanism that solicits feedback indicating satisfaction levels for the emulated audio, as taught by Zellner [Paragraphs 0027, 0032], because both applications are directed to processing of media including simulations of audio in an environment; including the ability to solicit feedback enables the improvement of the user’s experience, since the obtained feedback can be used to optimize personalization, and the listening experience of the user (See Zellner Paras [0022],[0028]).
Neither Schofield nor Zellner appears to expressly disclose reproducing perspectives at different points in time; and in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, altering depiction of the one or more images and corresponding emulated audio.
Jarske discloses:
reproducing perspectives at different points in time [Paragraph 0041 teaches one or more audio objects may be determined from metadata associated with a content item, for example, a given shopping district at a given time of day during a shopping day; Paragraph 0047 teaches audio objects may be associated with different times of a day, days of a week, etc.; Paragraph 0049 teaches user may view various content items and hear various associated audio objects and their elements under different conditions (e.g., daytime, night time, rush hour, lunch time, weekends, etc.)]; and 
program instructions to, in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, alter depiction of the one or more images and corresponding emulated audio [Paragraph 0044 teaches audio objects and its elements may be recomposed, altered, manipulated, etc., based on viewer preferences, therefore, in response to user feedback; Paragraph 0048 teaches UI application with various elements may be presented to a user whereby the user may indicate various preferences; Paragraph 0049 teaches causing a modification to the objects based on user interactions, i.e., when a user selects different angles or positions for viewing various content items, and for hearing various associated audio objects and their elements under different conditions, therefore, in response to receiving user feedback, altering depiction of the images and corresponding audio].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by reproducing perspectives at different points in time; and in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, altering depiction of the one or more images and corresponding emulated audio, as taught by Jarske [Paragraphs 0041, 0044, 0047, 0049, 0032], because the applications are directed to processing of media including simulations of audio in an environment; reproducing perspectives at different times, and altering depiction of images and audio based on user feedback increases flexibility of the system, and allows for a more complete and improved user experience, with a true sense of the place of interest to the user (See Jarske Paras [0002],[0026]).

	As to claim 14:
Schofield discloses:
program instructions to generate a score that indicates a noise level associated with an object depicted in respective images of the one or more images [Paragraph 0020 teaches sound samples may be generated for various sound rating levels along the Sound Transmission Class (STC) rating scale, for various values of the Noise Reduction Coefficient (NRC) rating scale, etc.; Paragraph 0047 teaches providing the user with an understanding of the volume of noise that passes through one or more items]; and
program instructions to, in response to the generated score meeting or exceeding a threshold score for the noise level, recommend an action that alters acoustic properties of the object [Paragraph 0016 teaches comparing sound transmission characteristics against other reference points, and outputting other sound samples of perceived “quieter” products, e.g., identified based on sound transmission ratings, therefore, the reference points represent the threshold scores for the noise level; Paragraph 0050 teaches provide users with an indication regarding how similar—or how different—sound is transmitted through products having adjacent rating levels (e.g., to compare similarities in sound transmission between products having an IIC-50 rating versus products having an IIC-49 rating)].

As to claim 15:
Schofield discloses:
A computer system comprising: one or more computer processors; one or more computer readable storage media; and program instructions stored on the one or more computer readable storage media for execution by at least one of the one or more computer processors, the program instructions comprising:
program instructions to dynamically generate audio for one or more images associated with a location based on contextual information that satisfies a request [Paragraph 0030 teaches demonstrating aesthetic and sound characteristics of various materials and items; Paragraph 0042 teaches, upon receiving user input selecting an environment, environmental attributes, and items, retrieving visual and audio data relevant for the virtual reality experience selected by the user, hence, dynamically generating audio and visual data associated with a location or environment based on the user request];
	program instructions to embed the generated audio into the one or more images [Paragraph 0024 teaches collecting sound samples; Paragraph 0042 teaches simulating items, by retrieving visual and audio data relevant for the experience selected by the user, therefore, embedding generated audio into the image presented to the user; Paragraph 0044 teaches output one or more sounds associated with the one or more items displayed in the rendering]; and
program instructions to emulate audio in a generated user interface that provides a realistic representation of the location using the embedded audio that reproduces one or more acoustic properties of the location based on interactions with the one or more images [Paragraph 0029 teaches providing an immersive visual and audio experience to users, providing a virtual representation of the aesthetic and sound characteristics of various items; Paragraph 0030 teaches demonstrating aesthetic and sound characteristics of various items; Paragraph 0036 teaches displaying an image, overlaying one or more surfaces displayed within the displayed image with a virtual representation of an item (e.g., a flooring material and/or ceiling material), to provide a virtually augmented representation of a particular environment, including one or more items; Paragraph 0039 teaches realistically replicating sound characteristics of particular items presented to the user; Paragraph 0044 teaches executing sound files to output sounds representative of sound characteristics of the one or more of the displayed items, that can be in response to a trigger event generated by a selection menu, therefore, based on interactions with the images; Paragraph 0047 teaches sound files may be representative of ambient noises generated within a particular room, to provide an audible understanding of how a particular item impacts sound travel within a given space; Paragraph 0052 teaches the virtual reality system thereby demonstrates how various sounds are perceptible to a user during simulated, real-world experiences].
Schofield does not appear to expressly disclose reproducing perspectives at different points in time; implementing a feedback mechanism that solicits feedback indicating satisfaction levels for the emulated audio; and in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, altering depiction of the one or more images and corresponding emulated audio.
Zellner discloses:
program instructions to implement a feedback mechanism that solicits feedback indicating satisfaction levels for the emulated audio [Paragraph 0027 teaches feedback application can facilitate sound feedback provided by the user while the user is listening to the audio input, which allows the user to provide sound feedback such as more or less of different frequencies, etc.; Paragraph 0032 teaches when user is satisfied with the ambient noise conditions, the user can select a continue option, therefore, a feedback mechanism that requests feedback indicating satisfaction for the audio].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by implementing a feedback mechanism that solicits feedback indicating satisfaction levels for the emulated audio, as taught by Zellner [Paragraphs 0027, 0032], because both applications are directed to processing of media including simulations of audio in an environment; including the ability to solicit feedback enables the improvement of the user’s experience, since the obtained feedback can be used to optimize personalization, and the listening experience of the user (See Zellner Paras [0022],[0028]).
Neither Schofield nor Zellner appears to expressly disclose reproducing perspectives at different points in time; and in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, altering depiction of the one or more images and corresponding emulated audio.
Jarske discloses:
reproducing perspectives at different points in time [Paragraph 0041 teaches one or more audio objects may be determined from metadata associated with a content item, for example, a given shopping district at a given time of day during a shopping day; Paragraph 0047 teaches audio objects may be associated with different times of a day, days of a week, etc.; Paragraph 0049 teaches user may view various content items and hear various associated audio objects and their elements under different conditions (e.g., daytime, night time, rush hour, lunch time, weekends, etc.)]; and 
program instructions to, in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, alter depiction of the one or more images and corresponding emulated audio [Paragraph 0044 teaches audio objects and its elements may be recomposed, altered, manipulated, etc., based on viewer preferences, therefore, in response to user feedback; Paragraph 0048 teaches UI application with various elements may be presented to a user whereby the user may indicate various preferences; Paragraph 0049 teaches causing a modification to the objects based on user interactions, i.e., when a user selects different angles or positions for viewing various content items, and for hearing various associated audio objects and their elements under different conditions, therefore, in response to receiving user feedback, altering depiction of the images and corresponding audio].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by reproducing perspectives at different points in time; and in response to receiving solicited feedback indicated satisfaction levels for the emulated audio, altering depiction of the one or more images and corresponding emulated audio, as taught by Jarske [Paragraphs 0041, 0044, 0047, 0049, 0032], because the applications are directed to processing of media including simulations of audio in an environment; reproducing perspectives at different times, and altering depiction of images and audio based on user feedback increases flexibility of the system, and allows for a more complete and improved user experience, with a true sense of the place of interest to the user (See Jarske Paras [0002],[0026]).

	As to claim 21:
	Schofield as modified by Zellner discloses:
simulating audio prior to a user entering the location based on health parameters of the user [Zellner - Paragraph 0025 teaches utilize hearing profiles to modify the audio output for the user; Paragraph 0031 teaches hearing profiles can include results of hearing tests conducted by an audiologist, and can comprise percentage of hearing loss for the user, frequencies he or she has difficulty hearing, etc.].

	As to claim 22:
Schofield as modified by Jarske further discloses:
altering emulated audio of the location based on a time and weather data associated with the location [Jarske - Paragraph 0047 teaches audio objects may be associated with different times of a day, days of a week, etc.; Paragraph 0049 teaches user may view various content items and hear various associated audio objects and their elements under different conditions (e.g., daytime, night time, rush hour, lunch time, weekends, etc.); Paragraph 0032 teaches obtaining various audio objects for detected sources including waterfalls, rain, etc.; Paragraph 0040 teaches determining audio sources including persons, rain, date, time, associated with the POI, therefore, including weather data].

Claims 3, 4, 10, 11, 17, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Schofield et al. (U.S. Publication No. 2019/0318525) hereinafter Schofield, in view of Zellner (U.S. Publication No. 2020/0250588), in view of Jarske et al. (U.S. Publication No. 2015/0316640) hereinafter Jarske, and further in view of Harris et al. (U.S. Publication No. 2016/0014219) hereinafter Harris.
As to claim 3:
Schofield discloses:
generating one or more images that match the contextual information [Paragraph 0042 teaches, upon receiving user input selecting an environment, environmental attributes, and items, retrieving visual data relevant for the virtual reality experience selected by the user, hence, generating visual data matching the user request; 303, Fig. 3, Retrieve visual display data regarding the environment and selected products]; and
generating audio associated with the one or more generated images that match the contextual information [Paragraph 0042 teaches, upon receiving user input selecting an environment, environmental attributes, and items, retrieving audio data relevant for the virtual reality experience selected by the user, hence, generating audio data matching the user request; 304, Fig. 3, Retrieve audio files representing sound characteristics of the one or more selected products].
Schofield does not appear to expressly disclose prioritizing contextual information associated with the location.
Harris discloses:
prioritizing contextual information associated with the location [Paragraph 0017 teaches a rank or score may be assigned to the received content based on various ranking factors, i.e., relevance to a specified geo-location, or other attributes or factors that can be used to rank the content].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by prioritizing contextual information associated with the location, as taught by Harris [Paragraph 0017], because both applications are directed to processing and aggregation of media; by ranking or prioritizing the information and content, relevancy of the content can be improved.

	As to claim 4:
	Schofield discloses:
simulating different events at the location by altering at least one object of a plurality of identified objects based on contextual information [Paragraph 0052 teaches execute a first sound file comprising a sound representative of a conversation occurring within a particular room (e.g., the simulated room in which the user is virtually occupying) and a second sound file comprising a sound representative of high-heels walking on a floor above the simulated room; Paragraph 0041 teaches selection menu enables a user to select one or more different items, different environments, and/or the like to be represented via the virtual reality system, therefore, simulating different events].
Schofield does not appear to expressly disclose wherein an event simulates interactions of users at the location, and wherein altering at least one object comprises: generating one or more images to represent the event; and embedding audio representative of audio emitted by the generated one or more images representing the event.
Jarske further discloses:
an event simulates interactions of users at the location [Paragraph 0031 teaches based on user interactions with the images, the objects are modified, and corresponding audio, where the audio object’s sources include people talking], and 
wherein altering at least one object comprises: generating one or more images to represent the event [Paragraph 0032 teaches enhancing, altering, augmenting, or modifying objects to improve its quality or to add one or more elements for a higher quality to allow better audio rendering; Paragraph 0035 teaches content items may be rendering of various representations of images of objects including people]; and 
embedding audio representative of audio emitted by the generated one or more images representing the event [Paragraph 0032 teaches i.e., image contains people talking, and modifying the object to render audio of a generic chatter of a crowd; Paragraph 0078 teaches determining audio objects associated with audio sources including people].

As to claim 10:
Schofield discloses:
program instruction to generate one or more images that match the contextual information [Paragraph 0042 upon receiving user input selecting an environment, environmental attributes, and items, retrieving visual data relevant for the virtual reality experience selected by the user, hence, generating visual data matching the user request; 303, Fig. 3, Retrieve visual display data regarding the environment and selected products]; and 
program instruction to generate audio associated with the one or more generated images that match the contextual information [Paragraph 0042 upon receiving user input selecting an environment, environmental attributes, and items, retrieving audio data relevant for the virtual reality experience selected by the user, hence, generating audio data matching the user request; 304, Fig. 3, Retrieve audio files representing sound characteristics of the one or more selected products].
Schofield does not appear to expressly disclose prioritizing contextual information associated with the location.
Harris discloses:
program instruction to prioritize contextual information associated with the location [Paragraph 0017 teaches a rank or score may be assigned to the received content based on various ranking factors, i.e., relevance to a specified geo-location, or other attributes or factors that can be used to rank the content].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by prioritizing contextual information associated with the location, as taught by Harris [Paragraph 0017], because both applications are directed to processing and aggregation of media; by ranking or prioritizing the information and content, relevancy of the content can be improved.

As to claim 11:
	Schofield discloses:
program instructions to simulate different events at the location by altering at least one object of a plurality of identified objects based on contextual information [Paragraph 0052 teaches execute a first sound file comprising a sound representative of a conversation occurring within a particular room (e.g., the simulated room in which the user is virtually occupying) and a second sound file comprising a sound representative of high-heels walking on a floor above the simulated room; Paragraph 0041 teaches selection menu enables a user to select one or more different items, different environments, and/or the like to be represented via the virtual reality system, therefore, simulating different events].
Schofield does not appear to expressly disclose wherein an event simulates interactions of users at the location, and wherein altering at least one object comprises: generating one or more images to represent the event; and embedding audio representative of audio emitted by the generated one or more images representing the event.
Jarske further discloses:
an event simulates interactions of users at the location [Paragraph 0031 teaches based on user interactions with the images, the objects are modified, and corresponding audio, where the audio object’s sources include people talking], and 
wherein the program instructions to alter at least one object comprise: program instructions to generate one or more images to represent the event [Paragraph 0032 teaches enhancing, altering, augmenting, or modifying objects to improve its quality or to add one or more elements for a higher quality to allow better audio rendering; Paragraph 0035 teaches content items may be rendering of various representations of images of objects including people]; and 
program instructions to embed audio representative of audio emitted by the generated one or more images representing the event [Paragraph 0032 teaches i.e., image contains people talking, and modifying the object to render audio of a generic chatter of a crowd; Paragraph 0078 teaches determining audio objects associated with audio sources including people].

As to claim 17:
Schofield discloses:
program instruction to generate one or more images that match the contextual information [Paragraph 0042 upon receiving user input selecting an environment, environmental attributes, and items, retrieving visual data relevant for the virtual reality experience selected by the user, hence, generating visual data matching the user request; 303, Fig. 3, Retrieve visual display data regarding the environment and selected products]; and 
program instruction to generate audio associated with the one or more generated images that match the contextual information [Paragraph 0042 upon receiving user input selecting an environment, environmental attributes, and items, retrieving audio data relevant for the virtual reality experience selected by the user, hence, generating audio data matching the user request; 304, Fig. 3, Retrieve audio files representing sound characteristics of the one or more selected products].
Schofield does not appear to expressly disclose prioritizing contextual information associated with the location.
Harris discloses:
program instruction to prioritize contextual information associated with the location [Paragraph 0017 teaches a rank or score may be assigned to the received content based on various ranking factors, i.e., relevance to a specified geo-location, or other attributes or factors that can be used to rank the content].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by prioritizing contextual information associated with the location, as taught by Harris [Paragraph 0017], because both applications are directed to processing and aggregation of media; by ranking or prioritizing the information and content, relevancy of the content can be improved.

As to claim 18:
Schofield discloses:
program instructions to simulate different events at the location by altering at least one object of a plurality of identified objects based on contextual information [Paragraph 0052 teaches execute a first sound file comprising a sound representative of a conversation occurring within a particular room (e.g., the simulated room in which the user is virtually occupying) and a second sound file comprising a sound representative of high-heels walking on a floor above the simulated room; Paragraph 0041 teaches selection menu enables a user to select one or more different items, different environments, and/or the like to be represented via the virtual reality system, therefore, simulating different events].
Schofield does not appear to expressly disclose wherein an event simulates interactions of users at the location, and wherein altering at least one object comprises: generating one or more images to represent the event; and embedding audio representative of audio emitted by the generated one or more images representing the event.
Jarske further discloses:
an event simulates interactions of users at the location [Paragraph 0031 teaches based on user interactions with the images, the objects are modified, and corresponding audio, where the audio object’s sources include people talking], and 
wherein the program instructions to alter at least one object comprise: program instructions to generate one or more images to represent the event [Paragraph 0032 teaches enhancing, altering, augmenting, or modifying objects to improve its quality or to add one or more elements for a higher quality to allow better audio rendering; Paragraph 0035 teaches content items may be rendering of various representations of images of objects including people]; and 
program instructions to embed audio representative of audio emitted by the generated one or more images representing the event [Paragraph 0032 teaches i.e., image contains people talking, and modifying the object to render audio of a generic chatter of a crowd; Paragraph 0078 teaches determining audio objects associated with audio sources including people].

Claims 5, 12, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Schofield et al. (U.S. Publication No. 2019/0318525) hereinafter Schofield, in view of Zellner (U.S. Publication No. 2020/0250588), in view of Jarske et al. (U.S. Publication No. 2015/0316640) hereinafter Jarske, in view of HARRIS et al. (U.S. Publication No. 2016/0014219) hereinafter Harris, and further in view of Eronen et al. (U.S. Publication No. 2012/0102066) hereinafter Eronen.
As to claim 5:
	Schofield discloses all the limitations as set forth in the rejections of claim 4 above, but does not appear to expressly disclose indexing the plurality of identified objects based on acoustic properties of each identified object of the plurality of identified objects.
	Eronen discloses:
indexing the plurality of identified objects based on acoustic properties of each identified object of the plurality of identified objects [Paragraph 0044 teaches audio attributes may be utilized in searching for still images, by storing the audio clip analysis in association with the images; Paragraph 0051 teaches extracting audio attributes and features, and storing as image metadata or associated with the image in some other way].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by indexing the plurality of identified objects based on acoustic properties of each identified object of the plurality of identified objects, as taught by Eronen [Paragraphs 0044, 0051], because the applications are directed to processing and aggregation of media; storing and indexing the images based on acoustic properties enables the search of images and media by sound, which may improve the search results, as audio contains a set of information related, e.g., to the context, situation, or the environment where the image was taken (sounds of nature and people) (See Eronen Para [0043]).

As to claim 12:
	Schofield discloses all the limitations as set forth in the rejections of claim 11 above, but does not appear to expressly disclose indexing the plurality of identified objects based on acoustic properties of each identified object of the plurality of identified objects.
	Eronen discloses:
program instructions to index the plurality of identified objects based on acoustic properties of each identified object of the plurality of identified objects [Paragraph 0044 teaches audio attributes may be utilized in searching for still images, by storing the audio clip analysis in association with the images; Paragraph 0051 teaches extracting audio attributes and features, and storing as image metadata or associated with the image in some other way].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by indexing the plurality of identified objects based on acoustic properties of each identified object of the plurality of identified objects, as taught by Eronen [Paragraphs 0044, 0051], because the applications are directed to processing and aggregation of media; storing and indexing the images based on acoustic properties enables the search of images and media by sound, which may improve the search results, as audio contains a set of information related, e.g., to the context, situation, or the environment where the image was taken (sounds of nature and people) (See Eronen Para [0043]).

As to claim 19:
	Schofield discloses all the limitations as set forth in the rejections of claim 18 above, but does not appear to expressly disclose indexing the plurality of identified objects based on acoustic properties of each identified object of the plurality of identified objects.
	Eronen discloses:
program instructions to index the plurality of identified objects based on acoustic properties of each identified object of the plurality of identified objects [Paragraph 0044 teaches audio attributes may be utilized in searching for still images, by storing the audio clip analysis in association with the images; Paragraph 0051 teaches extracting audio attributes and features, and storing as image metadata or associated with the image in some other way].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by indexing the plurality of identified objects based on acoustic properties of each identified object of the plurality of identified objects, as taught by Eronen [Paragraphs 0044, 0051], because the applications are directed to processing and aggregation of media; storing and indexing the images based on acoustic properties enables the search of images and media by sound, which may improve the search results, as audio contains a set of information related, e.g., to the context, situation, or the environment where the image was taken (sounds of nature and people) (See Eronen Para [0043]).

Claims 6, 13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Schofield et al. (U.S. Publication No. 2019/0318525) hereinafter Schofield, in view of Zellner (U.S. Publication No. 2020/0250588), in view of Jarske et al. (U.S. Publication No. 2015/0316640) hereinafter Jarske, in view of HARRIS et al. (U.S. Publication No. 2016/0014219) hereinafter Harris, and further in view of Hunter et al. (U.S. Publication No. 2012/0221687) hereinafter Hunter.
As to claim 6:
Schofield as modified by Harris discloses:
generating one or more graphical icons to be overlaid on the one or more generated images that represents at least one object of the plurality of objects [Harris - Paragraph 0110 teaches content may include an icon, logo, or other identifying indicia, which, when selected, moused over, clicked, touched, or otherwise interacted with, may cause content detail to appear, which may be the content itself, such as video, photo, audio, text, etc.].
Neither Schofield nor Harris appears to expressly disclose overlaying the at least one or more generated graphical icons over a generated image of the one or more generated images displayed on the user device; and in response to selecting at least one generated graphical icon of the generated one or more graphical icons, playing audio associated with a respective object of the plurality of objects.
Hunter discloses:
overlaying the at least one or more generated graphical icons over a generated image of the one or more generated images displayed on the user device [Paragraph 0169 teaches media items may be presented to a user as a visual, augmented reality overlay to images viewed in the user device; Paragraph 0116 teaches each media file is represented by an image and/or via a map view that displays the specific locations of each file on an interactive map]; and 
in response to selecting at least one generated graphical icon of the generated one or more graphical icons, playing audio associated with a respective object of the plurality of objects [Paragraph 0163 teaches embedded items may be displayed as an image on a map or gallery view and can be played; Paragraph 0042 teaches the user may listen to sounds, stories, music, educational information, or other types of audio content collected and geotagged; Paragraph 0130 teaches each audio file has an image associated with the audio file, a play button, etc.].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield as modified by Harris, by overlaying the generated graphical icons over a generated image displayed on the user device and, in response to selection of a generated graphical icon, playing audio associated with a respective object of the plurality of objects, as taught by Hunter [Paragraphs 0130, 0163, 0169], because the applications are directed to processing and aggregation of media; providing the media as icons or images displayed over the associated image may provide an immersive and/or enhanced sensory, educational, or entertainment experience to a user of a mobile device, augmenting the user’s real world, physical experience at a given location (See Hunter Para [0042]).

As to claim 13:
Schofield as modified by Harris discloses: 
program instructions to generate one or more graphical icons to be overlaid on the one or more generated images that represents at least one object of the plurality of objects [Harris - Paragraph 0110 teaches content may include an icon, logo, or other identifying indicia, which, when selected, moused over, clicked, touched, or otherwise interacted with, may cause content detail to appear, which may be the content itself, such as video, photo, audio, text, etc.].
Neither Schofield nor Harris appear to expressly disclose overlaying the at least one or more generated graphical icons over a generated image of the one or more generated images displayed on the user device; and in response to selecting at least one generated graphical icon of the generated one or more graphical icons, playing audio associated with a respective object of the plurality of objects.
Hunter discloses:
program instructions to overlay the at least one or more generated graphical icons over a generated image of the one or more generated images displayed on the user device [Paragraph 0169 teaches media items may be presented to a user as a visual, augmented reality overlay to images viewed in the user device; Paragraph 0116 teaches each media file is represented by an image and/or via a map view that displays the specific locations of each file on an interactive map]; and 
program instructions to, in response to selecting at least one generated graphical icon of the generated one or more graphical icons, play audio associated with a respective object of the plurality of objects [Paragraph 0163 teaches embedded items may be displayed as an image on a map or gallery view and can be played; Paragraph 0042 teaches the user may listen to sounds, stories, music, educational information, or other types of audio content collected and geotagged; Paragraph 0130 teaches each audio file has an image associated with the audio file, a play button, etc.].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield as modified by Harris, by overlaying the generated graphical icons over a generated image displayed on the user device and, in response to selection of a generated graphical icon, playing audio associated with a respective object of the plurality of objects, as taught by Hunter [Paragraphs 0130, 0163, 0169], because the applications are directed to processing and aggregation of media; providing the media as icons or images displayed over the associated image may provide an immersive and/or enhanced sensory, educational, or entertainment experience to a user of a mobile device, augmenting the user’s real world, physical experience at a given location (See Hunter Para [0042]).

As to claim 20:
Schofield as modified by Harris discloses:
program instructions to generate one or more graphical icons to be overlaid on the one or more generated images that represents at least one object of the plurality of objects [Harris - Paragraph 0110 teaches content may include an icon, logo, or other identifying indicia, which, when selected, moused over, clicked, touched, or otherwise interacted with, may cause content detail to appear, which may be the content itself, such as video, photo, audio, text, etc.].
Neither Schofield nor Harris appears to expressly disclose overlaying the at least one or more generated graphical icons over a generated image of the one or more generated images displayed on the user device; and in response to selecting at least one generated graphical icon of the generated one or more graphical icons, playing audio associated with a respective object of the plurality of objects.
Hunter discloses:
program instructions to overlay the at least one or more generated graphical icons over a generated image of the one or more generated images displayed on the user device [Paragraph 0169 teaches media items may be presented to a user as a visual, augmented reality overlay to images viewed in the user device; Paragraph 0116 teaches each media file is represented by an image and/or via a map view that displays the specific locations of each file on an interactive map]; and 
program instructions to, in response to selecting at least one generated graphical icon of the generated one or more graphical icons, play audio associated with a respective object of the plurality of objects [Paragraph 0163 teaches embedded items may be displayed as an image on a map or gallery view and can be played; Paragraph 0042 teaches the user may listen to sounds, stories, music, educational information, or other types of audio content collected and geotagged; Paragraph 0130 teaches each audio file has an image associated with the audio file, a play button, etc.].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield as modified by Harris, by overlaying the generated graphical icons over a generated image displayed on the user device and, in response to selection of a generated graphical icon, playing audio associated with a respective object of the plurality of objects, as taught by Hunter [Paragraphs 0130, 0163, 0169], because the applications are directed to processing and aggregation of media; providing the media as icons or images displayed over the associated image may provide an immersive and/or enhanced sensory, educational, or entertainment experience to a user of a mobile device, augmenting the user’s real world, physical experience at a given location (See Hunter Para [0042]).

Claim 24 is rejected under 35 U.S.C. 103 as being unpatentable over Schofield et al. (U.S. Publication No. 2019/0318525) hereinafter Schofield, in view of Zellner (U.S. Publication No. 2020/0250588), in view of Jarske et al. (U.S. Publication No. 2015/0316640) hereinafter Jarske, and further in view of Whitley et al. (U.S. Publication No. 2012/0185769) hereinafter Whitley.
As to claim 24:
Schofield discloses all the limitations as set forth in the rejections of claim 1 above, but does not appear to expressly disclose generating recommendations to place one or more objects to optimize audio coverage based on number of users present and current location of each of the users that are present.
Whitley discloses:
generating recommendations to place one or more objects to optimize audio coverage based on number of users present and current location of each of the users that are present [Paragraph 0023 teaches creating any number of spot focused sound regions that correspond to the number of locations where each one of the users are likely to be in the media room; Paragraph 0024 teaches the number of and orientation of the spot focused sound regions may be adjusted based on the actual number of and actual location of the user; Paragraph 0047 teaches recommending orientation change and/or location change is based upon improving the sound quality of a spot focused sound region, and based upon a determined optimal orientation and/or location of the sound reproducing element for generation of the associated spot focused sound region; Paragraph 0074 teaches making recommendations for the location and/or orientation of the sound reproducing elements based on locations of users].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, to combine the teachings of the cited references and modify the invention as taught by Schofield, by generating recommendations to place one or more objects to optimize audio coverage based on number of users present and current location of each of the users that are present, as taught by Whitley [Paragraphs 0024, 0047], because the applications are directed to processing and presentation of audio content in an environment; providing recommendations to the user to optimize audio coverage increases the system’s flexibility, and thereby enhances the user’s experience.

Response to Arguments
	The following is in response to arguments filed on November 18, 2022.  Applicant’s arguments have been carefully and respectfully considered. 
Claim Rejections - 35 USC § 112
	Applicant’s arguments have been fully and respectfully considered, but are not persuasive.  
	With regard to claim 1, Applicant argues that “support for “emulating audio in a generated user interface that provides a realistic representation of the location using the embedded audio that reproduces one or more acoustic properties of the location” can be found in paragraphs [0031]”.
	In response to the preceding argument, Examiner respectfully disagrees, and respectfully points out that, although paragraph [0031] describes “sound emulator 110 generates audio emulations by determining contextually relevant information, prioritizing the relevant information, and generating images and respective audio emulations that match the contextual information”, it does not describe the elements indicated as deficient in the rejection. The specification was not found to contain a description of “embedded audio that reproduces acoustic properties … based on interactions with the one or more images”, and more specifically, of “reproducing acoustic properties … based on interactions with the one or more images”.  Simulating interactions of generated noises with the materials at a location, or simulating interactions of users at a location, is not the same as “reproducing acoustic properties based on interactions with the one or more images”.  Therefore, the 112(a) rejections are hereby sustained.

Claim Rejections - 35 USC § 103
	Applicant’s arguments have been fully and respectfully considered, but are moot in view of new grounds of rejections, as necessitated by the amendments.
	
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RAQUEL PEREZ-ARROYO whose telephone number is (571)272-8969. The examiner can normally be reached Monday - Friday, 8:00am - 5:30pm, Alt Friday, EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Usmaan Saeed, can be reached at 571-272-4046. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/RAQUEL PEREZ-ARROYO/Primary Examiner, Art Unit 2169