DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they do not include the following reference sign mentioned in the description: S360. 
The drawings are objected to because:
In Figure 4, reference signs 310 and 320 should be S310 and S320 to be consistent with the specification.
In Figure 5, reference signs 330, 340, and 350 should be S330, S340, and S350 to be consistent with the specification.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
The disclosure is objected to because of the following informalities:
In paragraph 0002, line 1, the acronym “VoIP” is used without being defined.
In paragraph 0002, line 1, the acronym “LTE” is used without being defined.
In paragraph 0002, line 2, the acronym “IP” is used without being defined.
In paragraph 0002, line 4, the acronym “CS” is used without being defined.
In paragraph 0002, line 4, the acronym “PS” is used without being defined.
In paragraph 0002, line 6, the acronym “VoBB” is used without being defined.
In paragraph 0002, line 6, the acronym “PSTN” is used without being defined.
In paragraph 0002, line 7, the acronym “3GPP” is used without being defined.
In paragraph 0002, line 8, the acronym “GSMA” is used without being defined.
In paragraph 0005, line 6, the acronym “SID” is used without being defined.
In paragraph 0081, line 8, “step 11 in step 2” should read “step 11 in FIG. 2”.
In paragraph 0096, line 1, the step “S350: The terminal receives authorization information sent by the apparatus.” is not shown in Figure 5.
In paragraph 0098, lines 1-3, the step “S360: The terminal obtains, from buffered data based on the quantity of to-be-sent bytes, voice data corresponding to the quantity of to-be-sent bytes, and sends the voice data to the apparatus.” is labeled 350 in Figure 5.
In paragraph 0103, line 3, “80 ms, 10 ms, 120 ms” should read “80 ms, 100 ms, 120 ms”.
In paragraph 0109, line 2, the acronym “eNB” is used without being defined.
In paragraph 0147, line 10, “computer instruction” should read “computer instructions”.
Appropriate correction is required.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1 – 2, 4, 7 – 8, 10, 18 – 19, 21 – 22, 25 and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Harris et al. (US Patent No. 6,999,921), hereinafter Harris, in view of Kim et al. (US Patent No. 9,116,658), hereinafter Kim.
Regarding claim 1, Harris discloses a method for improving voice call quality, implemented by a terminal (Column 2, lines 28-30, “System 100 comprises a system infrastructure, fixed network equipment (FNE) 110, and numerous mobile stations (MSs)”; The mobile stations read on a terminal.), wherein the method comprises:
wherein the maximum allowable buffer duration limits a buffer duration for voice data of a buffer of the terminal (Column 3, lines 8-9, "Processor 112 stores the voice frames in frame buffer 113 after they are received."; Column 3, lines 30-34, "When the number of frames stored in either buffer exceeds a predetermined size threshold (e.g., 300 milliseconds worth of voice frames), then processor 104/112 attempts to delete one or more silent frames."; Deleting silent frames when the number of frames stored in a buffer exceeds a predetermined size threshold reads on the maximum allowable buffer duration limiting a buffer duration for voice data of a buffer.);
buffering the voice data according to the maximum allowable buffer duration (Column 3, lines 8-9, "Processor 112 stores the voice frames in frame buffer 113 after they are received."; Column 3, lines 30-34, "When the number of frames stored in either buffer exceeds a predetermined size threshold (e.g., 300 milliseconds worth of voice frames), then processor 104/112 attempts to delete one or more silent frames."; Storing the voice frames in a frame buffer and deleting silent frames when the number of frames stored in a buffer exceeds a predetermined size threshold reads on buffering the voice data according to the maximum allowable buffer duration.);
determining that the voice data is in an accumulated state (Column 4, lines 16-24, "Logic flow 200 begins (202) with a communication device (an MS and/or FNE) intermittently receiving (204) and storing voice frames in a frame buffer, as it does throughout the duration of a wireless call. When (206) the audio overhang feature is enabled, the number of frames stored in the buffer is monitored (208). When (210) the number stored exceeds a threshold or maximum number, then the wireless call is developing overhang, and thus delay beyond what is optimal."; Monitoring the number of voice frames stored in a buffer and determining when the number of voice frames stored exceeds a threshold or maximum number reads on determining that the voice data is in an accumulated state.);
and cutting off a first silence insertion descriptor (SID) frame in the voice data, wherein the first SID frame does not comprise semantic data (Column 2, line 63 - Column 3, line 1, "Receiver 111 receives the voice frames that convey the voice information of the call from MS 101. Some of these frames are so-called "silent frames." In one embodiment, these frames have been marked by MS 101 to indicate that they convey either low voice activity or no voice activity."; Column 3, lines 30-34, "When the number of frames stored in either buffer exceeds a predetermined size threshold (e.g., 300 milliseconds worth of voice frames), then processor 104/112 attempts to delete one or more silent frames."; The silent frames read on the silence insertion descriptor (SID) frame, the silent frames indicating low voice activity or no voice activity reads on the SID frame not comprising semantic data, and deleting silent frames reads on cutting off a SID frame.).
Harris does not specifically disclose: receiving a maximum allowable buffer duration from an apparatus.
Kim teaches:
receiving a maximum allowable buffer duration from an apparatus (Column 14, lines 40-41, "Subsequently, the UPnP server 101 transmits an updated client profile to the UPnP client 201"; Column 15, lines 22-26, "FIG. 5 is a table of a client profile according to an embodiment of the present invention. Referring to FIG. 5, a client file is general information on the UPnP client 201 and may contain connectivity information, RTP information and icon preference."; Column 15, lines 32-38, "The RTP information is the information on RTP (real time protocol) of the UPnP client 201 and may contain supportable RTP payload type information (payloadtype), minimum buffer size information (audioIPL) required for playing back audio data received by RTP, maximum buffer size information (audioMPL) for buffering audio data received by RTP and the like."; The client profile containing RTP information, with the RTP information including maximum buffer size information for buffering audio data, and the client profile being transmitted from the UPnP server to the UPnP client, reads on receiving a maximum allowable buffer duration from an apparatus, where the UPnP server reads on the apparatus.).
Kim teaches a client device receiving client profile information from a server device, with the client profile information including maximum buffer size information for buffering audio data, in order to adjust the compression of a transmission for a mobile terminal image display device (Column 1, lines 45-53, "An object of the present invention is to provide a mobile terminal, image display device mounted on a vehicle and data processing method using the same, by which a transmission rate of screen data can be improved in a manner of adjusting a compression scheme for a transmission screen, a loss rate of the transmission screen and the like to be suitable for a screen situation in transmitting screen data of the mobile terminal to the image display device to enable the image display device and the mobile terminal to share a screen with each other.").
Harris and Kim are considered to be analogous to the claimed invention because they are in the same field of voice communication systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris to incorporate the teachings of Kim to implement a client device receiving client profile information from a server device, with the client profile information including maximum buffer size information for buffering audio data.  Doing so would allow for adjusting the compression of a transmission for a mobile terminal image display device.
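For illustration only, the combination applied to claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not code from Harris, Kim, or the application: the 20 ms frame duration, the Frame type, and the function names are hypothetical, and max_buffer_ms stands in for the maximum allowable buffer duration that the terminal would receive from the apparatus. The 300 ms value echoes the example threshold quoted from Harris.

```python
from dataclasses import dataclass
from typing import List

FRAME_MS = 20  # assumed frame duration; not stated in the cited references


@dataclass
class Frame:
    silent: bool  # True for a frame marked as low or no voice activity


def buffered_ms(buffer: List[Frame]) -> int:
    """Duration of voice data currently held in the buffer."""
    return len(buffer) * FRAME_MS


def is_accumulated(buffer: List[Frame], max_buffer_ms: int) -> bool:
    """The buffer is in an accumulated state once it exceeds the limit."""
    return buffered_ms(buffer) > max_buffer_ms


def drop_silent_frames(buffer: List[Frame], max_buffer_ms: int) -> List[Frame]:
    """Delete silent frames, oldest first, until the buffer fits the limit.

    Voiced frames are never deleted, so the result may still exceed the
    limit when there are not enough silent frames to remove.
    """
    kept = list(buffer)
    i = 0
    while buffered_ms(kept) > max_buffer_ms and i < len(kept):
        if kept[i].silent:
            del kept[i]
        else:
            i += 1
    return kept
```

With max_buffer_ms set to 300, a buffer of ten voiced frames followed by ten silent frames (400 ms) is reduced to ten voiced and five silent frames (300 ms).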
Regarding claim 2, Harris in view of Kim discloses the method as claimed in claim 1.
Harris further discloses:
determining that the voice data is in the accumulated state when the buffer duration meets a first preset threshold (Column 4, lines 16-24, "Logic flow 200 begins (202) with a communication device (an MS and/or FNE) intermittently receiving (204) and storing voice frames in a frame buffer, as it does throughout the duration of a wireless call. When (206) the audio overhang feature is enabled, the number of frames stored in the buffer is monitored (208). When (210) the number stored exceeds a threshold or maximum number, then the wireless call is developing overhang, and thus delay beyond what is optimal."; Monitoring the number of voice frames stored in a buffer and determining when the number of voice frames stored exceeds a threshold or maximum number reads on determining that the voice data is in an accumulated state when the buffer duration meets a preset threshold.).
Regarding claim 4, Harris in view of Kim discloses the method as claimed in claim 1.
Harris further discloses:
detecting a plurality of SID frames in the voice data, wherein the SID frames are consecutive (Column 3, lines 37-41, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length."; The consecutive silent frames read on a plurality of consecutive SID frames.);
and cutting off, in response to detecting the SID frames, from a second SID frame in the voice data until the buffer duration meets a third preset threshold (Column 3, lines 37-46, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length. In another embodiment, processor 104/112 monitors the voice frames as they are stored in the buffer. Processor 104/112 determines that a threshold number of consecutive silent frames have been stored in the frame buffer and deletes a percentage of subsequent consecutive silent frames as they are being received and stored."; Scanning a frame buffer for consecutive silent frames reads on detecting SID frames in voice data, deleting silent frames for consecutive silent frames that exceed a predetermined length reads on cutting off from a second SID frame, for the case where the predetermined length is set to one frame, and deleting silent frames when a threshold number of consecutive silent frames have been stored in the frame buffer reads on cutting off SID frames until the buffer duration meets a third preset threshold.).
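The first Harris embodiment quoted above (scanning for runs of consecutive silent frames and deleting a percentage of them) can be sketched as follows. This is an illustrative reading, not Harris's implementation: the 20 ms frame duration and the function name are assumptions, and the sketch assumes the percentage applies to the whole qualifying run, rounded down. The 90 ms and 25% defaults are the example values quoted from Harris.

```python
from itertools import groupby
from typing import List

FRAME_MS = 20  # assumed frame duration; not stated in the cited passage


def trim_silent_runs(flags: List[bool],
                     min_run_ms: int = 90,
                     fraction: float = 0.25) -> List[bool]:
    """Shorten long runs of consecutive silent frames.

    flags holds one entry per buffered frame, True meaning silent.
    Runs of silent frames longer than min_run_ms lose `fraction` of
    their frames (rounded down); shorter runs and voiced frames are
    left untouched.
    """
    out: List[bool] = []
    for silent, group in groupby(flags):
        run = list(group)
        if silent and len(run) * FRAME_MS > min_run_ms:
            drop = int(len(run) * fraction)
            run = run[: len(run) - drop]
        out.extend(run)
    return out
```

Under these assumptions, a run of eight silent frames (160 ms) loses two frames, while a run of four silent frames (80 ms) is below the 90 ms length and is kept whole.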
Regarding claim 7, Harris discloses a terminal (Column 2, lines 28-30, “System 100 comprises a system infrastructure, fixed network equipment (FNE) 110, and numerous mobile stations (MSs)”; The mobile stations read on a terminal.), comprising:
a buffer comprising voice data (Figure 1, “Frame Buffer 105”);
a processor coupled to the buffer (Figure 1, “Processor 104”);
and a memory coupled to the processor and configured to store instructions that, when executed by the processor (Column 2, lines 35-38, “In particular, MS 102 comprises receiver 103, speaker 106, frame buffer 105, and processor 104 (comprising one or more memory devices and processing devices such as microprocessors and digital signal processors).”), cause the terminal to be configured to:
wherein the maximum allowable buffer duration limits a buffer duration for the voice data (Column 3, lines 8-9, "Processor 112 stores the voice frames in frame buffer 113 after they are received."; Column 3, lines 30-34, "When the number of frames stored in either buffer exceeds a predetermined size threshold (e.g., 300 milliseconds worth of voice frames), then processor 104/112 attempts to delete one or more silent frames."; Deleting silent frames when the number of frames stored in a buffer exceeds a predetermined size threshold reads on the maximum allowable buffer duration limiting a buffer duration for voice data of a buffer.);
buffer the voice data according to the maximum allowable buffer duration (Column 3, lines 8-9, "Processor 112 stores the voice frames in frame buffer 113 after they are received."; Column 3, lines 30-34, "When the number of frames stored in either buffer exceeds a predetermined size threshold (e.g., 300 milliseconds worth of voice frames), then processor 104/112 attempts to delete one or more silent frames."; Storing the voice frames in a frame buffer and deleting silent frames when the number of frames stored in a buffer exceeds a predetermined size threshold reads on buffering the voice data according to the maximum allowable buffer duration.);
determine that voice data is in an accumulated state (Column 4, lines 16-24, "Logic flow 200 begins (202) with a communication device (an MS and/or FNE) intermittently receiving (204) and storing voice frames in a frame buffer, as it does throughout the duration of a wireless call. When (206) the audio overhang feature is enabled, the number of frames stored in the buffer is monitored (208). When (210) the number stored exceeds a threshold or maximum number, then the wireless call is developing overhang, and thus delay beyond what is optimal."; Monitoring the number of voice frames stored in a buffer and determining when the number of voice frames stored exceeds a threshold or maximum number reads on determining that the voice data is in an accumulated state.);
and cut off a first silence insertion descriptor (SID) frame in the voice data, wherein the first SID frame does not comprise semantic data (Column 2, line 63 - Column 3, line 1, "Receiver 111 receives the voice frames that convey the voice information of the call from MS 101. Some of these frames are so-called "silent frames." In one embodiment, these frames have been marked by MS 101 to indicate that they convey either low voice activity or no voice activity."; Column 3, lines 30-34, "When the number of frames stored in either buffer exceeds a predetermined size threshold (e.g., 300 milliseconds worth of voice frames), then processor 104/112 attempts to delete one or more silent frames."; The silent frames read on the silence insertion descriptor (SID) frame, the silent frames indicating low voice activity or no voice activity reads on the SID frame not comprising semantic data, and deleting silent frames reads on cutting off a SID frame.).
Harris does not specifically disclose: receive a maximum allowable buffer duration from an apparatus.
Kim teaches:
receive a maximum allowable buffer duration from an apparatus (Column 14, lines 40-41, "Subsequently, the UPnP server 101 transmits an updated client profile to the UPnP client 201"; Column 15, lines 22-26, "FIG. 5 is a table of a client profile according to an embodiment of the present invention. Referring to FIG. 5, a client file is general information on the UPnP client 201 and may contain connectivity information, RTP information and icon preference."; Column 15, lines 32-38, "The RTP information is the information on RTP (real time protocol) of the UPnP client 201 and may contain supportable RTP payload type information (payloadtype), minimum buffer size information (audioIPL) required for playing back audio data received by RTP, maximum buffer size information (audioMPL) for buffering audio data received by RTP and the like."; The client profile containing RTP information, with the RTP information including maximum buffer size information for buffering audio data, and the client profile being transmitted from the UPnP server to the UPnP client, reads on receiving a maximum allowable buffer duration from an apparatus, where the UPnP server reads on the apparatus.).
Kim teaches a client device receiving client profile information from a server device, with the client profile information including maximum buffer size information for buffering audio data, in order to adjust the compression of a transmission for a mobile terminal image display device (Column 1, lines 45-53, "An object of the present invention is to provide a mobile terminal, image display device mounted on a vehicle and data processing method using the same, by which a transmission rate of screen data can be improved in a manner of adjusting a compression scheme for a transmission screen, a loss rate of the transmission screen and the like to be suitable for a screen situation in transmitting screen data of the mobile terminal to the image display device to enable the image display device and the mobile terminal to share a screen with each other.").
Harris and Kim are considered to be analogous to the claimed invention because they are in the same field of voice communication systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris to incorporate the teachings of Kim to implement a client device receiving client profile information from a server device, with the client profile information including maximum buffer size information for buffering audio data.  Doing so would allow for adjusting the compression of a transmission for a mobile terminal image display device.
Regarding claim 8, Harris in view of Kim discloses the terminal as claimed in claim 7.
Harris further discloses:
wherein the instructions further cause the terminal to be configured to determine that the voice data is in the accumulated state when the buffer duration meets a first preset threshold (Column 4, lines 16-24, "Logic flow 200 begins (202) with a communication device (an MS and/or FNE) intermittently receiving (204) and storing voice frames in a frame buffer, as it does throughout the duration of a wireless call. When (206) the audio overhang feature is enabled, the number of frames stored in the buffer is monitored (208). When (210) the number stored exceeds a threshold or maximum number, then the wireless call is developing overhang, and thus delay beyond what is optimal."; Monitoring the number of voice frames stored in a buffer and determining when the number of voice frames stored exceeds a threshold or maximum number reads on determining that the voice data is in an accumulated state when the buffer duration meets a preset threshold.).
Regarding claim 10, Harris in view of Kim discloses the terminal as claimed in claim 7.
Harris further discloses wherein the instructions further cause the terminal to be configured to:
detect a plurality of SID frames in the voice data, wherein the SID frames are consecutive (Column 3, lines 37-41, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length."; The consecutive silent frames read on a plurality of consecutive SID frames.);
and cut off, in response to detecting the SID frames, from a second SID frame in the voice data until the buffer duration meets a third preset threshold (Column 3, lines 37-46, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length. In another embodiment, processor 104/112 monitors the voice frames as they are stored in the buffer. Processor 104/112 determines that a threshold number of consecutive silent frames have been stored in the frame buffer and deletes a percentage of subsequent consecutive silent frames as they are being received and stored."; Scanning a frame buffer for consecutive silent frames reads on detecting SID frames in voice data, deleting silent frames for consecutive silent frames that exceed a predetermined length reads on cutting off from a second SID frame, for the case where the predetermined length is set to one frame, and deleting silent frames when a threshold number of consecutive silent frames have been stored in the frame buffer reads on cutting off SID frames until the buffer duration meets a third preset threshold.).
Regarding claim 18, Harris discloses a computer program product comprising computer-executable instructions stored on a non-transitory computer-readable medium that, when executed by a processor (Column 2, lines 35-38, “In particular, MS 102 comprises receiver 103, speaker 106, frame buffer 105, and processor 104 (comprising one or more memory devices and processing devices such as microprocessors and digital signal processors).”), cause a terminal (Column 2, lines 28-30, “System 100 comprises a system infrastructure, fixed network equipment (FNE) 110, and numerous mobile stations (MSs)”; The mobile stations read on a terminal.) to:
wherein the maximum allowable buffer duration limits a buffer duration for voice data of a buffer of the terminal (Column 3, lines 8-9, "Processor 112 stores the voice frames in frame buffer 113 after they are received."; Column 3, lines 30-34, "When the number of frames stored in either buffer exceeds a predetermined size threshold (e.g., 300 milliseconds worth of voice frames), then processor 104/112 attempts to delete one or more silent frames."; Deleting silent frames when the number of frames stored in a buffer exceeds a predetermined size threshold reads on the maximum allowable buffer duration limiting a buffer duration for voice data of a buffer.);
buffer the voice data according to the maximum allowable buffer duration (Column 3, lines 8-9, "Processor 112 stores the voice frames in frame buffer 113 after they are received."; Column 3, lines 30-34, "When the number of frames stored in either buffer exceeds a predetermined size threshold (e.g., 300 milliseconds worth of voice frames), then processor 104/112 attempts to delete one or more silent frames."; Storing the voice frames in a frame buffer and deleting silent frames when the number of frames stored in a buffer exceeds a predetermined size threshold reads on buffering the voice data according to the maximum allowable buffer duration.);
determine that the voice data is in an accumulated state (Column 4, lines 16-24, "Logic flow 200 begins (202) with a communication device (an MS and/or FNE) intermittently receiving (204) and storing voice frames in a frame buffer, as it does throughout the duration of a wireless call. When (206) the audio overhang feature is enabled, the number of frames stored in the buffer is monitored (208). When (210) the number stored exceeds a threshold or maximum number, then the wireless call is developing overhang, and thus delay beyond what is optimal."; Monitoring the number of voice frames stored in a buffer and determining when the number of voice frames stored exceeds a threshold or maximum number reads on determining that the voice data is in an accumulated state.);
and cut off a first silence insertion descriptor (SID) frame of the voice data, wherein the first SID frame has no semantic data (Column 2, line 63 - Column 3, line 1, "Receiver 111 receives the voice frames that convey the voice information of the call from MS 101. Some of these frames are so-called "silent frames." In one embodiment, these frames have been marked by MS 101 to indicate that they convey either low voice activity or no voice activity."; Column 3, lines 30-34, "When the number of frames stored in either buffer exceeds a predetermined size threshold (e.g., 300 milliseconds worth of voice frames), then processor 104/112 attempts to delete one or more silent frames."; The silent frames read on the silence insertion descriptor (SID) frame, the silent frames indicating low voice activity or no voice activity reads on the SID frame not comprising semantic data, and deleting silent frames reads on cutting off a SID frame.).
Harris does not specifically disclose: receive a maximum allowable buffer duration from an apparatus.
Kim teaches:
receive a maximum allowable buffer duration from an apparatus (Column 14, lines 40-41, "Subsequently, the UPnP server 101 transmits an updated client profile to the UPnP client 201"; Column 15, lines 22-26, "FIG. 5 is a table of a client profile according to an embodiment of the present invention. Referring to FIG. 5, a client file is general information on the UPnP client 201 and may contain connectivity information, RTP information and icon preference."; Column 15, lines 32-38, "The RTP information is the information on RTP (real time protocol) of the UPnP client 201 and may contain supportable RTP payload type information (payloadtype), minimum buffer size information (audioIPL) required for playing back audio data received by RTP, maximum buffer size information (audioMPL) for buffering audio data received by RTP and the like."; The client profile containing RTP information, with the RTP information including maximum buffer size information for buffering audio data, and the client profile being transmitted from the UPnP server to the UPnP client, reads on receiving a maximum allowable buffer duration from an apparatus, where the UPnP server reads on the apparatus.).
Kim teaches a client device receiving client profile information from a server device, with the client profile information including maximum buffer size information for buffering audio data, in order to adjust the compression of a transmission for a mobile terminal image display device (Column 1, lines 45-53, "An object of the present invention is to provide a mobile terminal, image display device mounted on a vehicle and data processing method using the same, by which a transmission rate of screen data can be improved in a manner of adjusting a compression scheme for a transmission screen, a loss rate of the transmission screen and the like to be suitable for a screen situation in transmitting screen data of the mobile terminal to the image display device to enable the image display device and the mobile terminal to share a screen with each other.").
Harris and Kim are considered to be analogous to the claimed invention because they are in the same field of voice communication systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris to incorporate the teachings of Kim to implement a client device receiving client profile information from a server device, with the client profile information including maximum buffer size information for buffering audio data.  Doing so would allow for adjusting the compression of a transmission for a mobile terminal image display device.
Regarding claim 19, Harris in view of Kim discloses the computer program product as claimed in claim 18.
Harris further discloses:
wherein the instructions further cause the terminal to determine that the voice data is in the accumulated state when the buffer duration meets a first preset threshold (Column 4, lines 16-24, "Logic flow 200 begins (202) with a communication device (an MS and/or FNE) intermittently receiving (204) and storing voice frames in a frame buffer, as it does throughout the duration of a wireless call. When (206) the audio overhang feature is enabled, the number of frames stored in the buffer is monitored (208). When (210) the number stored exceeds a threshold or maximum number, then the wireless call is developing overhang, and thus delay beyond what is optimal."; Monitoring the number of voice frames stored in a buffer and determining when the number of voice frames stored exceeds a threshold or maximum number reads on determining that the voice data is in an accumulated state when the buffer duration meets a preset threshold.).
Regarding claim 21, Harris in view of Kim discloses the computer program product as claimed in claim 18.
Harris further discloses wherein the instructions further cause the terminal to:
detect a plurality of SID frames of the voice data, wherein the SID frames are consecutive (Column 3, lines 37-41, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length."; The consecutive silent frames read on a plurality of consecutive SID frames.);
and cut off, in response to detecting the SID frames, from a second SID frame in the voice data until the buffer duration meets a third preset threshold (Column 3, lines 37-46, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length. In another embodiment, processor 104/112 monitors the voice frames as they are stored in the buffer. Processor 104/112 determines that a threshold number of consecutive silent frames have been stored in the frame buffer and deletes a percentage of subsequent consecutive silent frames as they are being received and stored."; Scanning a frame buffer for consecutive silent frames reads on detecting SID frames in voice data, deleting silent frames for consecutive silent frames that exceed a predetermined length reads on cutting off from a second SID frame, for the case where the predetermined length is set to one frame, and deleting silent frames when a threshold number of consecutive silent frames have been stored in the frame buffer reads on cutting off SID frames until the buffer duration meets a third preset threshold.).
Regarding claim 22, Harris in view of Kim discloses the computer program product as claimed in claim 18.
Harris further discloses wherein the instructions further cause the terminal to:
detect a plurality of SID frames in the voice data, wherein the SID frames are consecutive (Column 3, lines 37-41, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length."; The consecutive silent frames read on a plurality of consecutive SID frames.);
detect whether the voice data comprises a speech frame (Column 3, lines 46-52, "In another embodiment, the deletion processing is triggered by the receipt of the last voice frame of each dispatch session within the dispatch call. Processor 104/112 determines that a threshold number of silent frames have been consecutively stored in the frame buffer prior to the last voice frame and deletes a percentage of prior consecutive silent frames."; Initiating the deletion of silent frames based on the detection of a voice frame reads on detecting whether the voice data comprises a speech frame.);
and cut off, in response to detecting the SID frames, from a second SID frame in the voice data until the speech frame is detected (Column 3, lines 37-41, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length."; Column 3, lines 46-52, "In another embodiment, the deletion processing is triggered by the receipt of the last voice frame of each dispatch session within the dispatch call. Processor 104/112 determines that a threshold number of silent frames have been consecutively stored in the frame buffer prior to the last voice frame and deletes a percentage of prior consecutive silent frames."; Scanning a frame buffer for consecutive silent frames reads on detecting SID frames in voice data; deleting silent frames for consecutive silent frames that exceed a predetermined length reads on cutting off from a second SID frame, for the case where the predetermined length is set to one frame; and deleting silent frames prior to a voice frame reads on cutting off SID frames until a speech frame is detected.).
Regarding claim 25, Harris in view of Kim discloses the method as claimed in claim 1.
Harris further discloses:
detecting a plurality of SID frames in the voice data, wherein the SID frames are consecutive (Column 3, lines 37-41, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length."; The consecutive silent frames read on a plurality of consecutive SID frames.);
detecting whether the voice data comprises a speech frame (Column 3, lines 46-52, "In another embodiment, the deletion processing is triggered by the receipt of the last voice frame of each dispatch session within the dispatch call. Processor 104/112 determines that a threshold number of silent frames have been consecutively stored in the frame buffer prior to the last voice frame and deletes a percentage of prior consecutive silent frames."; Initiating the deletion of silent frames based on the detection of a voice frame reads on detecting whether the voice data comprises a speech frame.);
and cutting off, in response to detecting the SID frames, from a second SID frame in the voice data until the speech frame is detected (Column 3, lines 37-41, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length."; Column 3, lines 46-52, "In another embodiment, the deletion processing is triggered by the receipt of the last voice frame of each dispatch session within the dispatch call. Processor 104/112 determines that a threshold number of silent frames have been consecutively stored in the frame buffer prior to the last voice frame and deletes a percentage of prior consecutive silent frames."; Scanning a frame buffer for consecutive silent frames reads on detecting SID frames in voice data; deleting silent frames for consecutive silent frames that exceed a predetermined length reads on cutting off from a second SID frame, for the case where the predetermined length is set to one frame; and deleting silent frames prior to a voice frame reads on cutting off SID frames until a speech frame is detected.).
Regarding claim 27, Harris in view of Kim discloses the terminal as claimed in claim 7.
Harris further discloses wherein the instructions further cause the terminal to be configured to:
detect a plurality of SID frames in the voice data, wherein the SID frames are consecutive (Column 3, lines 37-41, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length."; The consecutive silent frames read on a plurality of consecutive SID frames.);
detect whether the voice data comprises a speech frame (Column 3, lines 46-52, "In another embodiment, the deletion processing is triggered by the receipt of the last voice frame of each dispatch session within the dispatch call. Processor 104/112 determines that a threshold number of silent frames have been consecutively stored in the frame buffer prior to the last voice frame and deletes a percentage of prior consecutive silent frames."; Initiating the deletion of silent frames based on the detection of a voice frame reads on detecting whether the voice data comprises a speech frame.);
and cut off, in response to detecting the SID frames, from a second SID frame in the voice data until the speech frame is detected (Column 3, lines 37-41, "In one embodiment, processor 104/112 scans frame buffer 105/113 for consecutive silent frames longer than a predetermined length (e.g., 90 msecs) and deletes a percentage (e.g., 25%) of the consecutive silent frames that exceed this length."; Column 3, lines 46-52, "In another embodiment, the deletion processing is triggered by the receipt of the last voice frame of each dispatch session within the dispatch call. Processor 104/112 determines that a threshold number of silent frames have been consecutively stored in the frame buffer prior to the last voice frame and deletes a percentage of prior consecutive silent frames."; Scanning a frame buffer for consecutive silent frames reads on detecting SID frames in voice data; deleting silent frames for consecutive silent frames that exceed a predetermined length reads on cutting off from a second SID frame, for the case where the predetermined length is set to one frame; and deleting silent frames prior to a voice frame reads on cutting off SID frames until a speech frame is detected.).
Claims 3, 9 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Harris in view of Kim, and further in view of Wu et al. (US Patent No. 10,602,139), hereinafter Wu.
Regarding claim 3, Harris in view of Kim discloses the method as claimed in claim 1, but does not specifically disclose: further comprising, determining that the voice data is in the accumulated state when a ratio of the buffer duration to the maximum allowable buffer duration meets a second preset threshold.
Wu teaches:
determining that the voice data is in the accumulated state when a ratio of the buffer duration to the maximum allowable buffer duration meets a second preset threshold (Column 8, lines 3-6, "The bandwidth adaptiveness block 330 may determine whether the buffer fill ratio of the video packet buffer is above a buffer fill threshold and determine whether the video packet loss rate is above a packet loss threshold."; Column 8, lines 14-17, "In some embodiments, the buffer fill ratio is defined as the number of buffered video packets divided by the total or maximum buffer size. The buffer fill threshold then may be forty percent or four tenths."; The buffer fill ratio reads on a ratio of the buffer duration to the maximum allowable buffer duration, and determining when the buffer fill ratio for a buffer is above a buffer fill threshold reads on determining that the voice data is in the accumulated state when a ratio of the buffer duration to the maximum allowable buffer duration meets a preset threshold.).
Wu teaches determining when the buffer fill ratio for a buffer is above a buffer fill threshold in order to implement adaptive rate control for power efficiency (Column 3, lines 4-15, "The system 100 may implement an adaptive rate control to achieve power-efficient video streaming. In some embodiments, the system 100 may adjust a bit rate of the video data (e.g., by encoding at least a portion of the video data at a first bit rate different from the source bit rate with the video encoder 118) and a power of the wireless chip 111 to achieve the power-efficient video streaming. Additionally, the system 100 may determine whether or not to send a packet (e.g., video packets representative of portions of the video data) from the sender 101 to the receiver 104 based on at least one of a transmission rate threshold, a transmission interval threshold, or a buffer fill threshold of sender 101.").
Harris, Kim, and Wu are considered to be analogous to the claimed invention because they are in the same field of communication systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris in view of Kim to incorporate the teachings of Wu to determine when the buffer fill ratio for a buffer is above a buffer fill threshold.  Doing so would allow for implementing adaptive rate control for power efficiency.
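For illustration only, the buffer fill ratio test quoted from Wu above can be sketched as follows. This is a minimal sketch: the function names are assumptions introduced here, the default threshold of 0.4 is taken from the "forty percent or four tenths" example in the quoted passage, and the comparison is strict because the passage tests whether the ratio "is above" the threshold.

```python
# Illustrative sketch only; function names are assumptions, not Wu's API.

def buffer_fill_ratio(buffered_packets: int, max_buffer_size: int) -> float:
    """Number of buffered packets divided by the total or maximum buffer size."""
    if max_buffer_size <= 0:
        raise ValueError("max_buffer_size must be positive")
    return buffered_packets / max_buffer_size


def is_above_fill_threshold(buffered_packets: int,
                            max_buffer_size: int,
                            fill_threshold: float = 0.4) -> bool:
    """Return True when the fill ratio is above the buffer fill threshold."""
    return buffer_fill_ratio(buffered_packets, max_buffer_size) > fill_threshold
```

Under these assumptions, 5 buffered packets in a buffer of size 10 give a ratio of 0.5, which is above the 0.4 threshold, whereas 4 buffered packets give a ratio equal to the threshold and are not flagged.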
Regarding claim 9, Harris in view of Kim discloses the terminal as claimed in claim 7, but does not specifically disclose: wherein the instructions further cause the terminal to be configured to, determine that the voice data is in the accumulated state when a ratio of the buffer duration to the maximum allowable buffer duration meets a second preset threshold.
Wu teaches:
wherein the instructions further cause the terminal to be configured to, determine that the voice data is in the accumulated state when a ratio of the buffer duration to the maximum allowable buffer duration meets a second preset threshold (Column 8, lines 3-6, "The bandwidth adaptiveness block 330 may determine whether the buffer fill ratio of the video packet buffer is above a buffer fill threshold and determine whether the video packet loss rate is above a packet loss threshold."; Column 8, lines 14-17, "In some embodiments, the buffer fill ratio is defined as the number of buffered video packets divided by the total or maximum buffer size. The buffer fill threshold then may be forty percent or four tenths."; The buffer fill ratio reads on a ratio of the buffer duration to the maximum allowable buffer duration, and determining when the buffer fill ratio for a buffer is above a buffer fill threshold reads on determining that the voice data is in the accumulated state when a ratio of the buffer duration to the maximum allowable buffer duration meets a preset threshold.).
Wu teaches determining when the buffer fill ratio for a buffer is above a buffer fill threshold in order to implement adaptive rate control for power efficiency (Column 3, lines 4-15, "The system 100 may implement an adaptive rate control to achieve power-efficient video streaming. In some embodiments, the system 100 may adjust a bit rate of the video data (e.g., by encoding at least a portion of the video data at a first bit rate different from the source bit rate with the video encoder 118) and a power of the wireless chip 111 to achieve the power-efficient video streaming. Additionally, the system 100 may determine whether or not to send a packet (e.g., video packets representative of portions of the video data) from the sender 101 to the receiver 104 based on at least one of a transmission rate threshold, a transmission interval threshold, or a buffer fill threshold of sender 101.").
Harris, Kim, and Wu are considered to be analogous to the claimed invention because they are in the same field of communication systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris in view of Kim to incorporate the teachings of Wu to determine when the buffer fill ratio for a buffer is above a buffer fill threshold.  Doing so would allow for implementing adaptive rate control for power efficiency.
Regarding claim 20, Harris in view of Kim discloses the computer program product as claimed in claim 18, but does not specifically disclose: wherein the instructions further cause the terminal to determine that the voice data is in the accumulated state when a ratio of the buffer duration to the maximum allowable buffer duration meets a second preset threshold.
Wu teaches:
wherein the instructions further cause the terminal to determine that the voice data is in the accumulated state when a ratio of the buffer duration to the maximum allowable buffer duration meets a second preset threshold (Column 8, lines 3-6, "The bandwidth adaptiveness block 330 may determine whether the buffer fill ratio of the video packet buffer is above a buffer fill threshold and determine whether the video packet loss rate is above a packet loss threshold."; Column 8, lines 14-17, "In some embodiments, the buffer fill ratio is defined as the number of buffered video packets divided by the total or maximum buffer size. The buffer fill threshold then may be forty percent or four tenths."; The buffer fill ratio reads on a ratio of the buffer duration to the maximum allowable buffer duration, and determining when the buffer fill ratio for a buffer is above a buffer fill threshold reads on determining that the voice data is in the accumulated state when a ratio of the buffer duration to the maximum allowable buffer duration meets a preset threshold.).
Wu teaches determining when the buffer fill ratio for a buffer is above a buffer fill threshold in order to implement adaptive rate control for power efficiency (Column 3, lines 4-15, "The system 100 may implement an adaptive rate control to achieve power-efficient video streaming. In some embodiments, the system 100 may adjust a bit rate of the video data (e.g., by encoding at least a portion of the video data at a first bit rate different from the source bit rate with the video encoder 118) and a power of the wireless chip 111 to achieve the power-efficient video streaming. Additionally, the system 100 may determine whether or not to send a packet (e.g., video packets representative of portions of the video data) from the sender 101 to the receiver 104 based on at least one of a transmission rate threshold, a transmission interval threshold, or a buffer fill threshold of sender 101.").
Harris, Kim, and Wu are considered to be analogous to the claimed invention because they are in the same field of communication systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris in view of Kim to incorporate the teachings of Wu to determine when the buffer fill ratio for a buffer is above a buffer fill threshold.  Doing so would allow for implementing adaptive rate control for power efficiency.
Claims 6, 12 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Harris in view of Kim, and further in view of Lawrence et al. (US Patent No. 10,424,299), hereinafter Lawrence.
Regarding claim 6, Harris in view of Kim discloses the method as claimed in claim 1, but does not specifically disclose: wherein the voice data is of a fifth generation (5G) call.
Lawrence teaches:
wherein the voice data is of a fifth generation (5G) call (Column 6, lines 6-9, "The telecommunications network 104 may also include cellular networks, such as LTE, 4G, 5G, HSDPA/HSUPA, TD-SCDMA, W-CDMA, CDMA, WIFI, Bluetooth, EvDo, GSM, and iDEN networks."; Column 8, lines 59-61, "In act 306, the command processor determines whether the acoustic data buffered in the audio buffer includes an executable speech command sequence."; The 5G cellular network reads on a fifth generation (5G) call.).
Lawrence teaches buffering audio data for a fifth generation (5G) cellular network in order to mask speech commands for telecommunications devices other than the devices directed to receive the commands (Column 8, lines 51-62, "The systems and methods disclosed herein reliably mask speech commands directed to one or more telecommunications devices to prevent these speech commands from being rendered via other telecommunications devices to remote participants in a teleconference.").
Harris, Kim, and Lawrence are considered to be analogous to the claimed invention because they are in the same field of communication systems. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris in view of Kim to incorporate the teachings of Lawrence to buffer audio data for a fifth generation (5G) cellular network. Doing so would allow for masking speech commands for telecommunications devices other than the devices directed to receive the commands.
Regarding claim 12, Harris in view of Kim discloses the terminal as claimed in claim 7, but does not specifically disclose: wherein the voice data is voice data of a fifth generation (5G) call or a video call.
Lawrence teaches:
wherein the voice data is voice data of a fifth generation (5G) call or a video call (Column 6, lines 6-9, "The telecommunications network 104 may also include cellular networks, such as LTE, 4G, 5G, HSDPA/HSUPA, TD-SCDMA, W-CDMA, CDMA, WIFI, Bluetooth, EvDo, GSM, and iDEN networks."; Column 8, lines 59-61, "In act 306, the command processor determines whether the acoustic data buffered in the audio buffer includes an executable speech command sequence."; The 5G cellular network reads on a fifth generation (5G) call.).
Lawrence teaches buffering audio data for a fifth generation (5G) cellular network in order to mask speech commands for telecommunications devices other than the devices directed to receive the commands (Column 8, lines 51-62, "The systems and methods disclosed herein reliably mask speech commands directed to one or more telecommunications devices to prevent these speech commands from being rendered via other telecommunications devices to remote participants in a teleconference.").
Harris, Kim, and Lawrence are considered to be analogous to the claimed invention because they are in the same field of communication systems. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris in view of Kim to incorporate the teachings of Lawrence to buffer audio data for a fifth generation (5G) cellular network. Doing so would allow for masking speech commands for telecommunications devices other than the devices directed to receive the commands.
Regarding claim 23, Harris in view of Kim discloses the computer program product as claimed in claim 18, but does not specifically disclose: wherein the voice data is of a fifth generation (5G) call.
Lawrence teaches:
wherein the voice data is of a fifth generation (5G) call (Column 6, lines 6-9, "The telecommunications network 104 may also include cellular networks, such as LTE, 4G, 5G, HSDPA/HSUPA, TD-SCDMA, W-CDMA, CDMA, WIFI, Bluetooth, EvDo, GSM, and iDEN networks."; Column 8, lines 59-61, "In act 306, the command processor determines whether the acoustic data buffered in the audio buffer includes an executable speech command sequence."; The 5G cellular network reads on a fifth generation (5G) call.).
Lawrence teaches buffering audio data for a fifth generation (5G) cellular network in order to mask speech commands for telecommunications devices other than the devices directed to receive the commands (Column 8, lines 51-62, "The systems and methods disclosed herein reliably mask speech commands directed to one or more telecommunications devices to prevent these speech commands from being rendered via other telecommunications devices to remote participants in a teleconference.").
Harris, Kim, and Lawrence are considered to be analogous to the claimed invention because they are in the same field of communication systems. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris in view of Kim to incorporate the teachings of Lawrence to buffer audio data for a fifth generation (5G) cellular network. Doing so would allow for masking speech commands for telecommunications devices other than the devices directed to receive the commands.
Claims 24 and 26 are rejected under 35 U.S.C. 103 as being unpatentable over Harris in view of Kim, and further in view of Boehme (US Patent No. 10,362,173).
Regarding claim 24, Harris in view of Kim discloses the computer program product as claimed in claim 18, but does not specifically disclose: wherein the voice data is of a video call.
Boehme teaches:
wherein the voice data is of a video call (Column 7, line 62 - Column 8, line 6, "Additionally or alternatively, in some embodiments, the sizes of the audio data increments that may form the audio stream 166 may be based on properties of a remote WebRTC client and associated remote system (described in further detail below) that may receive the audio stream 166 from the WebRTC client 152. For example, the remote WebRTC client and associated remote system may have an audio buffer of a particular size that may receive the audio stream 166. The sizes of the audio data increments may be selected such that more information than the audio buffer can handle at a particular point in time may not be received by the audio buffer."; Column 13, lines 12-16, "In some embodiments, the system 208 may operate as an exchange configured to establish communication sessions, such as telephone calls, video calls, etc., between devices such as the telephone 210 and another device or devices as described in this disclosure, among other operations.").
Boehme teaches buffering audio data for a video call in order to implement a transcription system that provides text transcriptions of the audio of video mail messages (Column 4, lines 6-11, "Additionally or alternatively, one or more embodiments of the present disclosure may include a video mail messaging service. In these or other embodiments, the video mail messaging service may include a transcription system that may provide text transcriptions of audio of the video mail messages.").
Harris, Kim, and Boehme are considered to be analogous to the claimed invention because they are in the same field of communication systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris in view of Kim to incorporate the teachings of Boehme to buffer audio data for a video call.  Doing so would allow for implementing a transcription system that provides text transcriptions of the audio of video mail messages.
Regarding claim 26, Harris in view of Kim discloses the method as claimed in claim 1, but does not specifically disclose: wherein the voice data is of a video call.
Boehme teaches:
wherein the voice data is of a video call (Column 7, line 62 - Column 8, line 6, "Additionally or alternatively, in some embodiments, the sizes of the audio data increments that may form the audio stream 166 may be based on properties of a remote WebRTC client and associated remote system (described in further detail below) that may receive the audio stream 166 from the WebRTC client 152. For example, the remote WebRTC client and associated remote system may have an audio buffer of a particular size that may receive the audio stream 166. The sizes of the audio data increments may be selected such that more information than the audio buffer can handle at a particular point in time may not be received by the audio buffer."; Column 13, lines 12-16, "In some embodiments, the system 208 may operate as an exchange configured to establish communication sessions, such as telephone calls, video calls, etc., between devices such as the telephone 210 and another device or devices as described in this disclosure, among other operations.").
Boehme teaches buffering audio data for a video call in order to implement a transcription system that provides text transcriptions of the audio of video mail messages (Column 4, lines 6-11, "Additionally or alternatively, one or more embodiments of the present disclosure may include a video mail messaging service. In these or other embodiments, the video mail messaging service may include a transcription system that may provide text transcriptions of audio of the video mail messages.").
Harris, Kim, and Boehme are considered to be analogous to the claimed invention because they are in the same field of communication systems.  Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Harris in view of Kim to incorporate the teachings of Boehme to buffer audio data for a video call.  Doing so would allow for implementing a transcription system that provides text transcriptions of the audio of video mail messages.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to James Boggs whose telephone number is (571)272-2968. The examiner can normally be reached M-F 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JAMES BOGGS/Examiner, Art Unit 2657
/DANIEL C WASHBURN/Supervisory Patent Examiner, Art Unit 2657