DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Delamont (US20200368616) in view of Tseng et al. (US20170064154, hereinafter “Tseng”).
Regarding claim 1, Delamont teaches a system (abstract, mixed reality system) comprising:
a microphone (Fig. 1B, microphone);
a sensor (Fig. 1B, cameras);
a display (Fig. 1A, display 3);
an audio output (Fig. 1B, speakers); and
one or more processors (Fig. 1B, processors) configured to execute a method comprising:
generating, via the microphone, an audio stream (Fig. 1B, microphone capturing audio and generating audio stream);
generating, via the sensor, a video stream (Fig. 1B, cameras capturing video and generating video stream);
determining that a trigger event has occurred (Fig. 6, modules capable of detecting a “trigger event” [collision event]);
in accordance with a determination that the trigger event has occurred:
identifying a timestamp associated with the trigger event (¶815, server is capable of identifying event trigger information and attaching a time stamp attribute to it);
and
generating a first audio signal based on at least one of the portion of the audio stream and the portion of the video stream (¶221, audio manager takes into consideration a plurality of variables [e.g. projectile, directional vector, user positioning, etc.] when determining/processing audio for output; ¶219, realistic audio is desired to create a 3D experience);
presenting, on the display, a virtual object colliding with a surface, wherein the surface is associated with the trigger event (¶251, Fig. 1A, a collision manager and associated modules detect collisions between objects [real and virtual] and display them to the user as AR);
generating a second audio signal based on the first audio signal (¶219, 221, audio manager responsible for creating a realistic 3D audio experience by considering a plurality of variables when determining audio output); and
presenting, via the audio output, the second audio signal (Fig. 1B, speakers for outputting audio).
Delamont fails to explicitly teach identifying a portion of the audio stream based on the timestamp;
identifying a portion of the video stream based on the timestamp.
Tseng teaches identifying a portion of the audio stream based on the timestamp;
identifying a portion of the video stream based on the timestamp (¶28, Fig. 5, various methods can be used to embed a “timestamp” or equivalent thereof in both video and audio such that the audio and video are in sync during playback).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the technique of timestamping both audio and video (as taught by Tseng) to the mixed reality system (as taught by Delamont). The rationale for doing so is to apply a known technique to a known device ready for improvement to yield the predictable result of audio and video that are in sync.
Regarding claim 2, Delamont in view of Tseng teaches wherein the trigger event comprises a footstep on the surface (Delamont, Fig. 6, trigger event is a collision between two objects; it would have been obvious to one of ordinary skill in the art to apply the teachings of Delamont to detect a collision between a footstep and a surface instead).
Regarding claim 3, Delamont in view of Tseng teaches further comprising a second sensor, wherein determining that the trigger event has occurred is based on movement data captured by the second sensor (Delamont, ¶168, various sensors capable of tracking user movement).
Regarding claim 4, Delamont in view of Tseng teaches further comprising an inertial measurement unit, wherein determining that the trigger event has occurred is based on inertial data captured by the inertial measurement unit (Delamont, ¶144, various sensors can include a sensor to track inertial measurements).
Regarding claim 5, Delamont in view of Tseng teaches further comprising an auxiliary device and a wearable head device (Delamont, Fig. 1A, 5A, auxiliary apparatus 47 and wearable device 1), wherein:
determining that the trigger event has occurred is based on inertial data of the auxiliary device (Delamont, ¶8, sensor inputs can have an effect on the auxiliary device), and
the auxiliary device is coupled to the wearable head device (Delamont, ¶6, actions by the auxiliary device can be received/displayed on the wearable device).
Regarding claim 6, Delamont in view of Tseng teaches wherein the method further comprises:
in accordance with the determination that the trigger event has occurred:
identifying a position of a wearable head device of the system based on the timestamp;
determining a position of the trigger event based on the position of the wearable head device of the system; and
associating the position of the trigger event with the first audio signal (Delamont, ¶199-201, user positioning data is used when calculating audio and video data for playback such that determining whether a collision [i.e. trigger event] has occurred is also based on said user positioning data).
Regarding claim 7, Delamont in view of Tseng teaches wherein the method further comprises:
determining a position of the collision of the virtual object with the surface; and
determining whether the position of the collision of the virtual object with the surface is associated with the position of the trigger event,
wherein generating the second audio signal is further based on a determination that the position of the collision of the virtual object with the surface is associated with the position of the trigger event (Delamont, ¶199-201, Fig. 6, when determining whether a collision or hit [i.e. trigger event] has occurred, user positioning data as well as position data for the actual hit/collision are also determined; the realistic 3D audio playback is based on said collision data).
Regarding claim 8, Delamont in view of Tseng teaches wherein generating the second audio signal is further based on at least one of a physical model of the surface and a physical model of the virtual object (Delamont, ¶65, 85, Fig. 6, the surface model as well as virtual object models can change with time, and various sensors are used to track and display updates to said models as the user moves around the real world).
Regarding claim 9, Delamont in view of Tseng teaches wherein generating the second audio signal is further based on analysis-and-resynthesis of the first audio signal (Delamont, ¶219, audio synthesis techniques are used to create 3D audio for the user).
Regarding claim 10, Delamont in view of Tseng teaches wherein the method further comprises: in accordance with the determination that the trigger event has occurred, associating the first audio signal with the surface (Delamont, ¶65, 200, both realistic audio and video are presented to the user at the time of collision).
Regarding claim 11, Delamont in view of Tseng teaches wherein the second audio signal corresponds to the collision of the virtual object with the surface (Delamont, ¶65, 200, both realistic audio and video are presented to the user at the time of collision).
Regarding claim 12, Delamont in view of Tseng teaches wherein the method further comprises storing the first audio signal (Delamont, Fig. 1B, storage module), wherein:
the collision of the virtual object with the surface is presented on a display of a second system (Delamont, Fig. 7, plurality of users, each with a system similar to that shown in Fig. 1 and Fig. 1B, wherein a collision would have been shown to a different user of a different system), and
generating the second audio signal is further based on the stored first audio signal (Delamont, Fig. 1B, speakers for playback of stored signals).
Regarding claim 13, Delamont in view of Tseng teaches wherein the method further comprises:
generating a second audio stream;
generating a second video stream;
determining that a second trigger event has occurred;
in accordance with a determination that the second trigger event has occurred:
identifying a second timestamp associated with the trigger event;
identifying a portion of the second audio stream based on the second timestamp;
identifying a portion of the second video stream based on the second timestamp; and
generating a third audio signal based on at least one of the portion of the second audio stream and the portion of the second video stream;
wherein generating the second audio signal is further based on the third audio signal (the limitations of this claim are rejected similarly to claim 1; the additional “second…” elements are taught by Delamont, see Fig. 7, which shows a plurality of users, each with a system similar to that shown in Figs. 1A-1B).
Regarding claim 14, Delamont in view of Tseng teaches wherein the video stream includes information associated with the surface (Delamont, ¶251, surface data is determined which can be displayed for the user [Fig. 1A, display 3]).
Regarding claim 15, Delamont in view of Tseng teaches wherein the virtual object comprises a foot of a virtual character (Delamont, Fig. 6, trigger event is a collision between two objects; it would have been obvious to one of ordinary skill in the art to apply the teachings of Delamont to detect a collision between a virtual foot and a surface instead).
Regarding claim 16, Delamont in view of Tseng teaches wherein:
a material of the surface is associated with an acoustic property, and
generating the second audio signal is further based on the acoustic property of the material of the surface (Delamont, ¶204, acoustic properties of the surface [such as its reflective or absorbent properties] are taken into consideration when generating audio playback).
Regarding claim 17, Delamont in view of Tseng teaches wherein the method further comprises determining the acoustic property of the material of the surface based on at least one of measured coefficient of absorption of the material, manual definition, acoustic data, and inertial data (Delamont, ¶204, acoustic properties of the surface [such as its reflective or absorbent properties] are taken into consideration when generating audio playback).
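As a hedged illustration of the limitation addressed for claims 16 and 17 (an audio signal generated based on an acoustic property of the surface material), and not as a description of Delamont's implementation, the sketch below scales a signal by a reflection gain derived from an absorption coefficient. It assumes the conventional definition of the absorption coefficient as the fraction of incident sound energy that is absorbed, so that the reflected amplitude scales by the square root of one minus that coefficient. The function name and sample values are hypothetical.

```python
import math
from typing import List


def reflect_off_surface(samples: List[float], absorption: float) -> List[float]:
    """Attenuate an audio signal as if reflected once off a surface.

    absorption is the energy absorption coefficient in [0, 1]:
    0 means fully reflective, 1 means fully absorbent.
    """
    if not 0.0 <= absorption <= 1.0:
        raise ValueError("absorption coefficient must lie in [0, 1]")
    # Reflected energy fraction is (1 - absorption); amplitude is its square root.
    gain = math.sqrt(1.0 - absorption)
    return [gain * s for s in samples]


# Hypothetical first audio signal and a surface absorbing 75% of incident energy.
second_signal = reflect_off_surface([1.0, -0.5], 0.75)
```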
Regarding claim 18, it is rejected similarly to claim 1. The method can be found in Delamont (¶13, method).
Regarding claim 19, it is rejected similarly to claim 11. The method can be found in Delamont (¶13, method).
Regarding claim 20, it is rejected similarly to claim 1. The method can be found in Delamont (Fig. 1B, computer-readable medium).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited for a listing of analogous art.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to QIN ZHU whose telephone number is (571)270-1304.  The examiner can normally be reached on Monday-Thursday 6AM-4PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/QIN ZHU/
Primary Examiner, Art Unit 2651