DETAILED ACTION
Remarks
This communication is in response to the Applicant’s response filed on January 8, 2021 to a prior Office Action.  Claims 7 and 13 have been canceled.  Therefore, claims 1-6, 8-12 and 14-17 are pending for examination.  
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Examiner Notes
Examiner cites particular columns and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.
The examiner requests that, in response to this Office action, support be shown for language added to any original claims on amendment and for any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.


Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on January 8, 2021 has been entered.  

Information Disclosure Statement
As required by M.P.E.P. 609(C), the applicant’s submission of the Information Disclosure Statements dated January 11, 2021 is acknowledged by the examiner, and the cited references have been considered in the examination of the claims now pending. As required by M.P.E.P. 609(C)(2), a copy of the PTOL-1449 initialed and dated by the examiner is attached to the instant Office action.

Response to Arguments
Applicant’s arguments with respect to claim(s) 6, 8-12 and 14-17 have been considered but are moot in view of Zhang et al. (US Patent Application No. 2014/0325354 A1) and Baker et al. (US Patent No. 6,106,399).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.


Claims 1-5 are rejected under 35 U.S.C. 103 as being unpatentable over Mufti (US Application No. 2015/0221316 A1) in view of Baker et al. (US Patent No. 6,106,399, ‘Baker’, hereafter) and further in view of Neath et al. (US Patent No. 8,463,612 B1, ‘Neath’, hereafter).

As per claim 1, Mufti teaches a method for audio processing, comprising (Mufti, par. [0005], “[…] a new and innovative system, method, and apparatus for marking and tracking identification codes within media content […]” where the media content is interpreted to comprise audio data, and the marking and tracking of identification codes is interpreted as processing the identification codes, which can be associated with the media data): 
detecting (Mufti, par. [0069], detecting audio signatures corresponding to media content), by a server device that interacts with user equipment in an interactive system (Mufti, fig. 11, par. [0033]-[0035], “[…] one or more servers […], and […] consumer devices […]” where the one or more servers are interpreted to comprise the server device. The consumer devices are interpreted as the user equipment, where the servers and consumer devices are interpreted to interact with each other in order to transmit and receive calls/requests for audio files, for example. Further, par. [0067], “[…] an operating system of the consumer device […]” where the operating system of the consumer device is interpreted as the interactive system that interacts with the server device), a first function call to a first audio interface from an interactive application (Mufti, figs. 1-2, 9, par. [0039]-[0042], [0059], a request to embed the audio file into media content can be performed from an interface, where the request to embed is interpreted as the function call, which is interpreted to request/call audio files to be presented in a user interface, and where the interface is interpreted as an audio interface. Further, an application programmable interface (API) is disclosed, where the application programmable interface is interpreted as the interactive application. Furthermore, fig. 4, par. [0079], an application interface capable of detecting audio content, where the application allows the user to interact with the displayed audio content. For example, the user can play the detected audio via the application interface); 
generating (Mufti, par. [0039], “[…] generating and encoding an identification code within an audio file […]”), by the server device (Mufti, fig. 1, par. [0032], one or more servers), according to a type of the first audio interface (Mufti, fig. 2, par. [0058]-[0059], “[…] The audio file may include a pulse-coded modulated Waveform Audio File Format ("WAV"), AC-3 format, advanced audio coding ("AAC") format, MP3, etc […]” wherein each audio format is interpreted as being presented via an interface and the type is interpreted as the format), a first audio instruction corresponding to the type of the first audio interface when the function call is detected (Mufti, fig. 4, par. [0079], an identification code is generated in association with content information including a file, a video, an image, music, a gift card, a hyperlink to a web service, or a link to a social media application, where the identification code can be interpreted as a record comprising the instruction corresponding to the type when the function call is detected. Further, fig. 5, par. [0039], [0085], “[…] generate a new identification code and modulate the code into media content for redistribution, the `Content Creator 1` third-party client only has to access the database 124 and update the content information […]” wherein the new identification code is interpreted to comprise audio instructions corresponding to the type when the function call is detected);
sending, by the server device in response to the determination that the first record exists (Mufti, fig. 1, par. [0032], one or more servers), the first audio instruction without the first audio data to the user equipment (Mufti, figs. 4, 14, par. [0124], “[…] if at least one identification code is detected, the application 120 on the consumer device 118 transmits a message 1409 including the detected identification codes to the identification code service 102 […]” where the message is interpreted to comprise the audio instruction);
Mufti does not explicitly teach determining, by the server device, that a first record exists in a storage of the server device, the first record indicating that first audio data corresponding to the first audio instruction has been previously sent to the user equipment and is cached at the user equipment;
However, Baker in the same field of endeavor, teaches
determining, by the server device, that a first record exists in a storage of the server device, the first record indicating that first audio data corresponding to the first audio instruction has been previously sent to the user equipment and is cached at the user equipment (The server keeps track of which sounds are currently playing in each sector of the game world, and then passes that information to the client with coordinates relative to the player's position within the sector. The client plays the sounds accordingly. Additionally, the server keeps track of the direction a player's game persona is facing, so that if he moves north, he will hear sounds to the west on his left and to the east on his right, and vice versa if he moves south, Baker, Column 5, Lines 38-48.  Data is cached on the client machine. A large part of the data that is normally sent between client and server comprises object and sector descriptions, many of which are sent repetitively. According to the present invention, whenever the server wishes the client to display a description, it sends a number representing that object (2 bytes compared with up to 500 bytes for a full text description). The client refers to the cache and prints the description. If it does not find the description in its cache, it will request the description from the server at that time. Thus, no description is ever sent more than once, Baker, Column 14, line 62 – Column 15, line 5); and 
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and having the teachings of Mufti and Baker before him/her, to modify Mufti with the teaching of Baker’s internet multi-user role-playing game.  One would have been motivated to do so for the benefit of improving how data is passed between client and server, such as caching long object and sector descriptions on the client so that they only have to be sent once, and extensive client-side error checking on player commands to avoid sending some data at all (Baker, Column 5, Lines 25-30).
Mufti and Baker do not explicitly teach wherein the first audio instruction causes the user equipment to execute the audio instruction using the first audio data cached at the user equipment.
However, Neath, in the same field of endeavor, teaches wherein the first audio instruction causes the user equipment to execute the audio instruction using the first audio data cached at the user equipment (Neath, figs. 3-4, Column 11, Lines 28-36, a client computer device comprising an event cache, wherein the client computer device can be interpreted as the user equipment. Further, the event cache may be used to store representations of audio events including without limitation audio input and/or output data from an audio application, as well as re-created audio streams based on such data, wherein the representations of audio events are interpreted as the audio instructions, which can be executed to produce audio output data on the computer device via the audio application, where the computer device can be, for example, user equipment).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and having the teachings of Mufti, Baker and Neath before him/her, to further modify Mufti with the teaching of Neath’s monitoring of audio events on computer systems.  One would have been motivated to do so for the benefit of providing Mufti with the capability of executing data from a cache memory of a computer device.  The motivation for doing so would be to have an improved solution for monitoring audio events on a computer (Neath, Column 6, Lines 1-30).
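
By way of illustration only, the client-side caching behavior quoted from Baker above (the server sends a short number, the client consults its cache, and the full description is requested only on a miss) may be sketched as follows. This is a minimal sketch and not code from Baker; the names `DescriptionServer`, `CachingClient`, `fetch`, and `show` are hypothetical, and direct method calls stand in for the network exchange.

```python
class DescriptionServer:
    """Hypothetical server holding full descriptions keyed by a short numeric ID."""

    def __init__(self, descriptions):
        self._descriptions = dict(descriptions)
        self.full_sends = 0  # counts how often a full description is transmitted

    def fetch(self, object_id):
        # Called by the client only on a cache miss.
        self.full_sends += 1
        return self._descriptions[object_id]


class CachingClient:
    """Hypothetical client that caches each description after first receipt."""

    def __init__(self, server):
        self._server = server
        self._cache = {}

    def show(self, object_id):
        # The server sends only the short ID; the client resolves it locally.
        if object_id not in self._cache:
            self._cache[object_id] = self._server.fetch(object_id)
        return self._cache[object_id]


server = DescriptionServer({7: "A damp stone corridor."})
client = CachingClient(server)
first = client.show(7)   # cache miss: full description requested once
second = client.show(7)  # cache hit: nothing further is transmitted
```

In this sketch the decision to fetch is made at the client, which corresponds to the quoted statement that the client requests a description it does not find in its cache.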

As per claim 2, Mufti teaches the method according to claim 1, further comprising: 
detecting (Mufti [0069]), by the server device (Mufti, Fig. 11, [0033]-[0035]), a second function call to a second audio interface from the interactive application (Mufti, Figs. 1-2, 4, 9, [0039]-[0042], [0059], [0079]); 
generating (Mufti [0039]), by the server device (Mufti, Figs. 1, 11, [0032]-[0035]), according to a type of the second audio interface, a second audio instruction corresponding to the type of the second audio interface when the second function call is detected (Mufti, Fig. 4, [0079]); 2Application No. 15/975,547 
extracting from the storage, by the server device in response to the determination that the second record does not exist in the storage, second audio data corresponding to the second audio instruction (Mufti Fig. 1, [0032], [0087]); and 
sending the second audio data extracted from the storage and the second audio instruction to the user equipment (Mufti [0070]).
Mufti does not explicitly teach 
determining, by the server device, that a second record does not exist in the storage of the server device;
However, Baker, in the same field of endeavor, teaches
determining, by the server device, that a second record does not exist in the storage of the server device (Baker, Column 5, Lines 38-48, Column 14, line 62 – Column 15, line 5);
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and having the teachings of Mufti and Baker before him/her, to modify Mufti with the teaching of Baker’s internet multi-user role-playing game.  One would have been motivated to do so for the benefit of improving how data is passed between client and server, such as caching long object and sector descriptions on the client so that they only have to be sent once, and extensive client-side error checking on player commands to avoid sending some data at all (Baker, Column 5, Lines 25-30). 
Mufti and Baker do not explicitly teach wherein the second audio instruction causes the user equipment to execute the second audio instruction using the second audio data.
However, Neath, in the same field of endeavor, teaches wherein the second audio instruction causes the user equipment to execute the second audio instruction using the second audio data (Neath, Figs. 3-4, Column 11, Lines 28-36).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and having the teachings of Mufti, Baker and Neath before him/her, to further modify Mufti with the teaching of Neath’s monitoring of audio events on computer systems.  One would have been motivated to do so for the benefit of providing Mufti with the capability of executing data from a cache memory of a computer device.  The motivation for doing so would be to have an improved solution for monitoring audio events on a computer (Neath, Column 6, Lines 1-30).

As per claim 3, Mufti teaches further comprising: adding, in a data sending record, the record that the audio data, extracted from the storage and corresponding to the audio instruction, has been sent (Mufti, figs. 5-6, par. [0032], [0062], “[…] the identification code service 102 includes one or more servers configured to generate identification codes, encode identification codes within audio files, manage identification codes, and provide tracking of identification codes […]”. Further, as illustrated in figs. 5-6 and recited in par. [0082]-[0083], tracking of the data sent to the consumer device is implemented, where the tracking is interpreted as adding, to the data sending record, the record that the audio data corresponding to the audio instruction has been sent to the consumer device).
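
For the convenience of the applicant, the server-side limitations addressed in claims 1-3 (checking a record, sending the instruction alone when the record exists, and otherwise extracting the data, sending it with the instruction, and adding the record) may be pictured with the following minimal sketch. It is illustrative only and does not represent the implementation of the applicant or of any cited reference; the names `AudioServer`, `on_function_call`, and the dictionary-based instruction are assumptions made for the example.

```python
class AudioServer:
    """Hypothetical server that records which audio data each user equipment holds."""

    def __init__(self, storage):
        self._storage = storage  # audio_id -> bytes
        self._sent = set()       # data sending record: (equipment_id, audio_id)

    def on_function_call(self, equipment_id, interface_type, audio_id):
        # Generate an instruction that matches the type of the called audio interface.
        instruction = {"type": interface_type, "audio_id": audio_id}
        if (equipment_id, audio_id) in self._sent:
            # Record exists: the data is assumed to be cached at the user equipment.
            return instruction, None
        # No record: extract the data, send it with the instruction, add the record.
        data = self._storage[audio_id]
        self._sent.add((equipment_id, audio_id))
        return instruction, data


server = AudioServer({"bgm": b"\x00\x01\x02"})
first = server.on_function_call("ue-1", "play", "bgm")   # data accompanies instruction
second = server.on_function_call("ue-1", "play", "bgm")  # instruction only
```

Unlike the client-initiated request described in the Baker passage, the decision in this sketch is made at the server from its own record.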

As per claim 4, Neath teaches wherein the detecting the first function call to the first audio interface from the interactive application comprises: using a Hook function to monitor the first function call to the audio interface (Neath, fig. 7, Column 13, Lines 55-64, an API hook, which is interpreted as the Hook function that monitors function calls to the audio interface. The API hook monitors/intercepts input and output of audio data); and 
detecting the first function call to the first audio interface from the interactive application when a calling instruction of the interactive application to the first audio interface is detected by the Hook function (Neath, fig. 7, Column 13, Lines 50-59, “[…] The audio-enabled application 515 sends an output audio stream in buffers to the WinMM dll WaveOutWrite function call 705. […]” wherein the API hook function intercepts or detects the function call in an application that enables audio data to be output, wherein the function call is interpreted as the calling instruction of the interactive application to the audio interface).

As per claim 5, Neath teaches wherein the extracting from the storage the second audio data corresponding to the second audio instruction comprises: 
using a Hook function to monitor accesses to the storage for the second audio data corresponding to the second audio instruction (Neath, figs. 5-7, Column 14, Lines 18-26, an API hook is configured to intercept/monitor the WaveinUnprepareHeader function call or access of the input audio buffer, where the audio buffer is interpreted to store audio data); 
detecting, by the Hook function, a calling instruction to access the storage for the second audio data corresponding to the second audio instruction (Neath, figs. 5-7, Column 14, Lines 18-26, an API hook intercepts or detects a calling instruction that accesses the buffer/storage where the audio data and its corresponding instructions are stored); and 
calling the second audio data corresponding to the audio instruction (Neath, fig. 5, Column 13, Lines 5-15, the audio data is captured via an API hook function, wherein the audio data is interpreted to comprise the audio instruction).
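
By way of illustration only, the general notion of a hook that detects a call to an audio interface and then forwards it, as discussed for claims 4 and 5, may be sketched as follows. This is a generic function-wrapping sketch and is not the WinMM API hooking described in Neath; `AudioInterface`, `install_hook`, and the callback signature are hypothetical names chosen for the example.

```python
import functools


class AudioInterface:
    """Hypothetical stand-in for an audio interface exposed to an application."""

    def play(self, name):
        return f"playing {name}"


def install_hook(obj, method_name, on_call):
    """Replace obj.method_name with a wrapper that reports each call, then forwards it."""
    original = getattr(obj, method_name)

    @functools.wraps(original)
    def hooked(*args, **kwargs):
        on_call(method_name, args, kwargs)  # detection step
        return original(*args, **kwargs)    # original behavior is preserved

    setattr(obj, method_name, hooked)
    return original  # returned so the hook can be removed later


detected = []
audio = AudioInterface()
install_hook(audio, "play", lambda name, args, kwargs: detected.append((name, args)))
result = audio.play("click.wav")
```

The application continues to receive the original return value, while the callback records that the call occurred.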

Claims 6 and 12 are rejected under 35 U.S.C. § 103 as being unpatentable over Zhang et al. (US Publication No. 2014/0325354 A1, ‘Zhang’, hereafter) in view of Neath et al. (US Patent No. 8,463,612 B1, ‘Neath’, hereafter).

As per claim 6, Zhang teaches a method for audio processing, comprising (Zhang, Abstract, [0006] discloses a method and the method includes steps for initiating capture of an audio stream by shaking the mobile device; capturing the audio stream via a microphone in the mobile device; converting the captured audio stream into an audio fingerprint): 
receiving, by user equipment that interacts with a server device in an interactive system, an audio instruction sent by the server device, the audio instruction associated with a first function call to an audio interface from an interactive application (Zhang, [0009] discloses receiving a query from a mobile device, the query including an audio fingerprint derived from an audio stream captured by a microphone of the mobile device; comparing the audio fingerprint with a plurality of candidate audio fingerprints, each candidate audio fingerprint corresponding to a respective live program broadcast that has an associated interactive content accessible to the server; and returning an interactive content to the mobile device.  Zhang, [0003] discloses that some live program broadcasts ask users to call a phone number, email, text, or Tweet in their vote for a particular television show's contestant); 
receiving, by the user equipment separately from the audio instruction, audio data that is sent by the server device, the audio data associated with a second function call to the audio interface from the interactive application (Zhang, [0009] discloses receiving a query from a mobile device, the query including an audio fingerprint derived from an audio stream captured by a microphone of the mobile device; comparing the audio fingerprint with a plurality of candidate audio fingerprints, each candidate audio fingerprint corresponding to a respective live program broadcast that has an associated interactive content accessible to the server; and returning an interactive content to the mobile device.  Zhang, [0003] discloses that some live program broadcasts ask users to call a phone number, email, text, or Tweet in their vote for a particular television show's contestant); and 
executing, by the user equipment, the audio instruction to generate audio signals based on the audio data obtained from the cache (Zhang, Abstract, [0006] discloses capturing the audio stream via a microphone in the mobile device; converting the captured audio stream into an audio fingerprint; sending the audio fingerprint to a server.  Zhang, [0019] discloses that FIG. 2 is a block diagram illustrating different components of the mobile device 102 that may be configured for capturing, converting, and transmitting the audio stream 101a to the remote server 107 in response to a user activating the mobile device to interact with the live program broadcast).
Zhang does not explicitly teach 
storing, by the user equipment, the audio data in a cache; 
accessing the cache to obtain the audio data corresponding to the audio instruction;
However, Neath in the same field of endeavor, teaches 
storing, by the user equipment, the audio data in a cache (Neath, Column 11, Lines 28-54, store in an event cache where the event cache herein is interpreted as the cache. Further the event cache is interpreted to store audio data);
accessing the cache to obtain the audio data corresponding to the audio instruction (Neath, figs. 3-4, Column 11, Lines 28-36, the event cache can be used to store representations of audio events including without limitation audio input and/or output data from an audio application, as well as re-created audio streams based on such data, wherein the event cache is interpreted as the cache, and audio data including audio instructions, such as instructions to re-create audio streams based on such data, can be obtained or retrieved from the event cache, where the re-created audio streams are interpreted to correspond to the audio instruction);
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and having the teachings of Zhang and Neath before him/her, to modify Zhang with the teaching of Neath’s monitoring of audio events on computer systems.  One would have been motivated to do so for the benefit of providing Zhang with the capability of executing data from a cache memory of a computer device.  The motivation for doing so would be to have an improved solution for monitoring audio events on a computer (Neath, Column 6, Lines 1-30). 

As per claim 12, although claim 12 is directed to a system, it is similar in scope to claim 6.  The method steps of claim 6 substantially encompass the system recited in claim 12. Therefore, claim 12 is rejected for at least the same reason as claim 6 above.
 
Claims 8-11 and 14-17 are rejected under 35 U.S.C. § 103 as being unpatentable over Zhang et al. (US Publication No. 2014/0325354 A1, ‘Zhang’, hereafter) in view of Neath et al. (US Patent No. 8,463,612 B1, ‘Neath’, hereafter) and further in view of Mufti (US Application No. 2015/0221316 A1).

As per claim 8, Zhang and Neath do not teach
wherein the receiving, by the user equipment, the audio data that is sent by the server device further comprises: receiving the audio data with a header that identifies the audio data.
However, Mufti teaches wherein the receiving, by the user equipment, the audio data that is sent by the server device further comprises: receiving the audio data with a header that identifies the audio data (Mufti, fig. 2, par. [0042], identification code parameters, wherein the identification code parameters are interpreted as the audio data with a header that identifies the audio data).
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention and having the teachings of Zhang, Neath and Mufti before him/her, to further modify Zhang with the teaching of Mufti’s system, method, and apparatus for media content marking and tracking.  One would have been motivated to do so for the benefit of providing Zhang with the capability of dynamic management and tracking of media content (i.e., live program broadcast) (Mufti, Abstract and [0004]).

As per claim 9, Mufti as modified teaches wherein the receiving the audio data with the header that identifies the audio data further comprises: receiving the audio data with the header that is indicative of an audio file to which the audio data belongs and a position of the audio data in the audio file (Mufti, fig. 2, par. [0042], [0046], the identification code parameters comprise a name, which can be interpreted as indicative of an audio file to which the audio data belongs, and a channel, which can be interpreted to determine the position of the audio data in the audio file, see fig. 17).

As per claim 10, Mufti as modified teaches wherein the receiving the audio data with the header that is indicative of the audio file to which the audio data belongs and the position of the audio data in the audio file further comprises: receiving the audio data with the header that includes a key value for identifying the audio file to which the audio data belongs, an offset position of the audio data in the audio file and a length of the audio data (Mufti, fig. 2, par. [0042], [0046], “[…] a requestor name 202 (e.g., an identifier of a requesting third-party client) […]” where the name that identifies the audio data owner is interpreted as a key value. Further, “[…] an offset 206 specifying a start of a timecode […]”. Furthermore, “[…] a duration during which an identification code (or an identifier of an identification code) is to be repeated within an audio file 204 […]” where the duration is interpreted to identify the length of the audio data).

As per claim 11, Mufti as modified teaches wherein the storing the audio data in the cache comprises: 
determining, by the user equipment, the audio file according to the key value (Mufti, fig. 14, par. [0122]-[0124], a consumer device receives audio data, which is interpreted to comprise an audio file, wherein the consumer device is interpreted to identify the audio data based on a key value such as the audio file name, where the consumer device is interpreted as the user equipment); 
determining, by the user equipment, a storage position of the audio data according to the offset position and the length of the audio data (Mufti, fig. 14, par. [0120], a consumer device determines where to store received media content information, wherein the consumer device is interpreted to determine the location based on the offset position and the length of the media content received, where the consumer device is interpreted as the user equipment, and the media content is interpreted to comprise the audio data); and 
caching, by the user equipment, the audio data according to the storage position (Mufti, fig. 14, par. [0123], a consumer device receives and plays media content, where the action of playing the media content is interpreted to include caching the media content on the consumer device based on the location where the media content is stored, where the consumer device is interpreted as the user equipment, and the media content is interpreted to comprise the audio data).
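
For the convenience of the applicant, the header limitations addressed in claims 10 and 11 (a key value identifying the audio file, an offset position, and a length, used to determine where the audio data is cached) may be pictured with the following minimal sketch. The 12-byte header layout, the helper names `pack_chunk` and `cache_chunk`, and the in-memory cache are assumptions made solely for illustration; they are not taken from the application or from Mufti.

```python
import struct

# Hypothetical header layout: 4-byte key, 4-byte offset, 4-byte length (big-endian).
HEADER = struct.Struct(">III")


def pack_chunk(key, offset, payload):
    """Prefix the payload with a header carrying key, offset, and length."""
    return HEADER.pack(key, offset, len(payload)) + payload


def cache_chunk(cache, packet):
    """Parse the header and place the payload at its offset in the keyed file buffer."""
    key, offset, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    buf = cache.setdefault(key, bytearray())
    if len(buf) < offset + length:
        buf.extend(b"\x00" * (offset + length - len(buf)))  # grow to fit the chunk
    buf[offset:offset + length] = payload
    return key


cache = {}
cache_chunk(cache, pack_chunk(42, 4, b"WXYZ"))  # second half arrives first
cache_chunk(cache, pack_chunk(42, 0, b"ABCD"))  # first half arrives second
```

Because each chunk carries its own offset and length, the chunks can arrive in any order and still be placed correctly within the buffer for the file identified by the key.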

As per claim 14, Mufti as modified teaches wherein: the interface circuitry is configured to receive the audio data with a header that identifies the audio data (Mufti, fig. 2, par. [0042], identification code parameters, wherein the identification code parameters are interpreted as the audio data with a header that identifies the audio data).

As per claim 15, Mufti as modified teaches wherein: the interface circuitry is configured to receive the audio data with the header that is indicative of an audio file to which the audio data belongs and a position of the audio data in the audio file (Mufti, fig. 2, par. [0042], [0046], the identification code parameters comprise a name, which can be interpreted as indicative of an audio file to which the audio data belongs, and a channel, which can be interpreted to determine the position of the audio data in the audio file, see fig. 17).

As per claim 16, Mufti as modified teaches wherein: the interface circuitry is configured to receive the audio data with the header that includes a key value for identifying the audio file to which the audio data belongs, an offset position of the audio data in the audio file and a length of the audio data (Mufti, fig. 2, par. [0042], [0046], “[…] a requestor name 202 (e.g., an identifier of a requesting third-party client) […]” where the name that identifies the audio data owner is interpreted as a key value. Further, “[…] an offset 206 specifying a start of a timecode […]”. Furthermore, “[…] a duration during which an identification code (or an identifier of an identification code) is to be repeated within an audio file 204 […]” where the duration is interpreted to identify the length of the audio data).

As per claim 17, Mufti as modified teaches wherein: the processing circuitry is configured to: determine the audio file according to the key value (Mufti, fig. 14, par. [0122]-[0124], a consumer device receives audio data, which is interpreted to comprise an audio file, wherein the consumer device is interpreted to identify the audio data based on a key value such as the audio file name, where the consumer device is interpreted as the user equipment); 
determine a storage position of the audio data according to the offset position and the length of the audio data (Mufti, fig. 14, par. [0120], a consumer device determines where to store received media content information, wherein the consumer device is interpreted to determine the location based on the offset position and the length of the media content received, where the consumer device is interpreted as the user equipment, and the media content is interpreted to comprise the audio data); and 
store the audio data in the storage circuitry according to the storage position (Mufti, fig. 14, par. [0123], a consumer device receives and plays media content, where the action of playing the media content is interpreted to include storing the media content on the consumer device based on the location where the media content is stored, where the consumer device is interpreted as the user equipment, and the media content is interpreted to comprise the audio data).



Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HASANUL MOBIN whose telephone number is (571)270-1289.  The examiner can normally be reached on 9:30AM to 6:00PM EST M-F.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya can be reached on 571-272-4034.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.  Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/HASANUL MOBIN/
Primary Examiner, Art Unit 2168