DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Claim Rejections - 35 USC § 103
1.	In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
2.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

3.	Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Lemon et al. (US 10,431,216 B1 hereinafter, Lemon ‘216) in combination with Gordon (US 20170085506 A1 hereinafter, Gordon ‘506).
Regarding claim 11; Lemon ‘216 discloses a system (Fig. 1A, Enhanced Transcription System 100);
comprising: 
one or more processors (Fig. 1A, Processor(s) 102, 104, 106)
and a non-transitory computer readable storage medium comprising instructions that when executed by the one or more processors cause the one or more processors to perform operations (i.e. The memory 108, 110, and/or 112 may be implemented as computer-readable storage media (“CRSM”), which may be any available physical media accessible by the processor(s) 102, 104, and/or 106 to execute instructions stored on the memory 108, 110, and/or 112. Column 3, lines 40-64); 
comprising: in a messaging system for exchanging data over a network (Fig. 1A, Network 128 i.e. The user speech may include a message directed to a second device associated with a second user. The audio data may be generated by at least one microphone associated with the first device. The audio data may include the user speech and other components, such as, for example, background noise. The audio data corresponding to user speech may be received over a network. The network may represent an array of wired networks, wireless networks (e.g., WiFi), or combinations thereof. Column 11, lines 6-22); 
detecting a request to communicate a voice chat message from a sender client device associated with a sender identification to a recipient associated with a recipient identification (Fig. 2, Blocks 202-204 i.e. At block 202, the method 200 may include receiving, from a first device associated with a first user, audio data corresponding to user speech. The user speech may include a message directed to a second device associated with a second user. At block 204, the method 200 may include receiving, from the first device, an indication that the message is directed to the second device. The indication that the message is directed to the second device may be a selection by the first user of a name of the second user and/or contact information associated with the second user. The indication may also correspond to the first user speaking or otherwise entering a command to send the audio data and/or start a conversation with the second user. Column 11, lines 6-51); 
in response to the request: receiving input audio stream associated with the sender identification (Fig. 2, Block 202 i.e. At block 202, the method 200 may include receiving, from a first device associated with a first user, audio data corresponding to user speech. The user speech may include a message directed to a second device associated with a second user. The audio data may be generated by at least one microphone associated with the first device. Column 11, lines 6-22); 
synchronously with the receiving of the input audio stream generating text corresponding to audio content from the input audio stream (Fig. 2, Block 206 i.e. At block 206, the method 200 may include performing speech recognition on the audio data to generate text data representing a transcription of the user speech. Column 11, line 52 thru Column 12, line 2);
synchronously with the generating of text corresponding to audio content from the input audio stream, causing rendering the generated text on the sender client device (Fig. 2, Block 208 i.e. At block 208, the method 200 may include sending the text data representing the transcription to the first device. Column 12, lines 3-4);
and causing to synchronously render the generated text and the audio content from the input audio stream on a recipient client device associated with the recipient identification (Fig. 2, Blocks 210 & 214 i.e. At block 210, the method 200 may include sending the audio data and the text data representing the transcription to the second device. At block 214, the method 200 may include causing the second device to display the transcription, or a portion thereof. The transcription may be displayed as typed text, for example. The method 200 may also include causing the second device to display an icon adjacent to the transcription, the icon corresponding to the audio data. The icon, when selected, may cause one or more speakers to output the audio associated with the audio data. Column 12, lines 5-33);
Examiner reasonably believes that Lemon ‘216 discloses a non-transitory computer readable medium at Column 3, lines 40-64 because Lemon ‘216 teaches that a CRSM may include random access memory (“RAM”) and Flash memory. In other implementations, CRSM may include, but is not limited to, read-only memory (“ROM”), electrically erasable programmable read-only memory (“EEPROM”), or any other tangible medium which can be used to store the desired information and which can be accessed by the processor(s). However, to the extent that Lemon ‘216 is found not to expressly disclose a non-transitory computer readable medium, Examiner cites Gordon ‘506 to remedy any such deficiency. Moreover, note that Gordon ‘506 at Paragraph 0036 teaches that by input through an interface of an instant messaging application, a first instant message is received from a first user of a first mobile device, at least a portion of which first instant message is recorded as a voice input. The voice input is automatically transcribed as text as it is received. The first instant message is transmitted to an instant messaging application on a second mobile device. Voice and transcribed text portions of the first instant message are transmitted substantially simultaneously as the first instant message is received.
Lemon ‘216 discloses a non-transitory computer readable medium (i.e. Embodiments of method 600 are implemented as a computer-readable storage media or non-transitory computer-readable medium comprising instructions that, when executed by one or more processors of a device, cause the device to perform method 600. Column 16, lines 7-17).
Lemon ‘216 and Gordon ‘506 are combinable because they are from the same field of endeavor of speech systems (Gordon ‘506 at “Field of Invention”).
	At the time the invention was effectively filed, it would have been obvious to a person of ordinary skill in the art to modify the speech system as taught by Lemon ‘216 by adding a non-transitory computer readable medium as taught by Gordon ‘506. The motivation for doing so would have been to provide a system that allows for transitioning between voice and text input while benefiting from the convenience and cost-efficiency of an instant messaging system, and that allows for a full transcript of all portions of the conversation, including those input by voice. Therefore, it would have been obvious to combine Lemon ‘216 with Gordon ‘506 to obtain the invention as specified.

Regarding claim 12; Lemon ‘216 discloses wherein the operations caused by instructions executed by the one or more processors include: causing inserting in a first chat conversation user interface (Fig. 1A First User Interface 136) associated with the sender identification (Fig. 1A, User B) a first voice chat message comprising the generated text (i.e. The remote system 126 may be further configured to send the audio data and text data representing the transcription to a first user interface 136 of the first device 120. The first user interface 136 may be the same or a different user interface that was used to record the audio. Sending the text data representing the transcription to the first user interface 136 may cause the first device 120 to display the transcription, or a portion thereof, on the first user interface 136. The transcription may be displayed as typed text. Column 5, lines 37-50);
causing inserting in a second chat conversation user interface (Fig. 1A Second User Interface 140) associated with the recipient identification (Fig. 1A, User A) a second voice chat message comprising the generated text (i.e. The remote system 126 may be further configured to send the audio data and/or the text data representing the transcription to a second user interface 140 of the second device 122. Sending the text data representing the transcription to the second user interface 140 may cause the second device 122 to display the transcription, or a portion thereof, on the second user interface 140. The transcription may be displayed as typed text and may include the emphasized words and/or punctuation as described herein. Column 5, line 66 thru Column 6, line 21); 
the first chat conversation user interface included in a first messaging client implemented by one or more processors of the sender client device (i.e. The memory 110 on the first device 120 may, when executed by the processor(s) 104, cause the processor(s) 104 to perform operations such as presenting the first user interface 136 on the first device 120. Column 7, lines 30-44);
the second chat conversation user interface included in a second messaging client implemented by one or more processors of the recipient client device (i.e. The operations may also include causing the transcription and the audio data to be sent to the second device 122 to be displayed on the second user interface 140. The operations may further include receiving, from one or more processors, such as processor(s) 102, second audio data corresponding to user speech recorded using the second device 122 and a second transcription corresponding to the second audio data. Column 7, lines 45-58).

Regarding claim 13; Lemon ‘216 discloses wherein the generating of the text from the audio content is using a speech recognition engine included in the first messaging client (i.e. User A may operate her device to open a messaging application that allows her to choose to send a text message or a voice message to a device associated with User B's profile. In the latter instances, User A presses an icon or the like and a microphone from User A's device generates audio data that is sent to a remote system, for example, for performing automatic speech recognition thereon. Column 2, line 35).

Regarding claim 14; Lemon ‘216 discloses wherein the generating of the text from the audio content is using a speech recognition engine hosted at a backend system, the backend system providing the first messaging client and the second messaging client (i.e. The remote system 126 may be local to an environment associated with the first device 120, the second device 122, and/or the third device 124. For instance, the remote system 126 can be located within the third device 124. In some instances, some or all of the functionality of the remote system 126 may be performed by one or more of the first device 120, the second device 122, and/or the third device 124. Column 4, lines 50).

Regarding claim 15; Lemon ‘216 discloses wherein the second voice chat message comprises a visual indication of an audio source, the visual indication of an audio source positioned in the second chat conversation user interface as associated with the generated text and as associated with the sender identification (i.e. The remote system 126 may be further configured to send the audio data and/or the text data representing the translated transcription to a second user interface 140 of the second device 122. Sending the text data representing the translated transcription to the second user interface 140 may cause the second device 122 to display the translated transcription, or a portion thereof, on the second user interface 140. The translated transcription may be displayed as typed text or characters. Sending the audio data to the second user interface 140 may also cause the second device 122 to display an icon corresponding to the audio data on the second user interface 140. The icon, when selected by the second user 132, may cause one or more speakers 144 to output the audio associated with the audio data. Column 8, lines 30-53).

Regarding claim 16; Lemon ‘216 discloses wherein the second voice chat message is actionable to playback the audio content subsequent to completion of the rendering of the generated text on the recipient client device (i.e. Sending the audio data to the second user interface 140 may also cause the second device 122 to display an icon corresponding to the audio data on the second user interface 140. The icon, when selected by the second user 132, may cause one or more speakers 144 to output the audio associated with the audio data. Column 8, lines 30-53).

Regarding claim 17; Lemon ‘216 discloses wherein the operations caused by instructions executed by the one or more processors include: storing the audio content in the messaging system as associated with the sender identification (i.e. The memory 108 of the remote system 126 may, when executed by the processor(s) 102, cause the processor(s) 102 to perform operations such as receiving, from the first device 120 of a first user 130, audio data corresponding to user speech in a first language. The contacts service may store contact data that is manually entered by a user, such as, for example, in the contacts storage 152, and/or the contacts service may integrate with external providers 154 that store contact information. Column 7, line 59 thru Column 8, line 3 & Column 9, line 64 thru Column 10, line 6). 
Lemon ‘216 does not expressly disclose the limitations as expressed below.
Gordon ‘506 discloses receiving an instruction to delete the first voice chat message (i.e. The IM participant initiating the entering or adding of any content may retain control over the content that is being shared in the IM or Chat session and may erase/delete/modify/edit the content at will. In another embodiment either participant in an IM session may erase/delete/modify/edit the content including the transcribed text and the voice clips being shared in the IM or Chat session. Paragraph 0151);
and deleting the audio content stored in the messaging system in response to the request to delete the first voice chat message (i.e. The IM participant initiating the entering or adding of any content may retain control over the content that is being shared in the IM or Chat session and may erase/delete/modify/edit the content at will. In another embodiment either participant in an IM session may erase/delete/modify/edit the content including the transcribed text and the voice clips being shared in the IM or Chat session. Paragraph 0151).
Lemon ‘216 and Gordon ‘506 are combinable because they are from the same field of endeavor of speech systems (Gordon ‘506 at “Field of Invention”).
	At the time the invention was effectively filed, it would have been obvious to a person of ordinary skill in the art to modify the speech system as taught by Lemon ‘216 by adding the limitations as taught by Gordon ‘506. The motivation for doing so would have been to provide a system that allows for transitioning between voice and text input while benefiting from the convenience and cost-efficiency of an instant messaging system, and that allows for a full transcript of all portions of the conversation, including those input by voice. Therefore, it would have been obvious to combine Lemon ‘216 with Gordon ‘506 to obtain the invention as specified.

Regarding claim 18; Lemon ‘216 discloses wherein the operations caused by instructions executed by the one or more processors further include analyzing one or more audio characteristics of the input audio stream to determine a qualitative characteristic representing the input audio stream, wherein the second voice chat message comprises a representation of the qualitative characteristic (i.e. The techniques may also include determining characteristics of the user speech. The characteristics may include at least one of volume changes, pitch changes, or inflection changes of the user speech. For example, volume changes may be determined based at least in part on signal strength variation corresponding to the volume at which the user is speaking. By way of further example, pitch changes may be determined based at least in part on frequency and/or amplitude changes in the audio data corresponding to changes in voice tones by the user. Column 4, line 63 thru column 5, line 36).

Regarding claim 19; Lemon ‘216 discloses wherein the one or more audio characteristics include one or more of volume, pitch, and tone (i.e. The techniques may also include determining characteristics of the user speech. The characteristics may include at least one of volume changes, pitch changes, or inflection changes of the user speech. Column 4, line 63 thru column 5, line 36).

Regarding claims 1 & 20; Claims 1 & 20 contain substantially the same subject matter as claim 11. Therefore, claims 1 & 20 are rejected on the same grounds as claim 11.

Regarding claim 2; Claim 2 contains substantially the same subject matter as claim 12. Therefore, claim 2 is rejected on the same grounds as claim 12.

Regarding claim 3; Claim 3 contains substantially the same subject matter as claim 13. Therefore, claim 3 is rejected on the same grounds as claim 13.

Regarding claim 4; Claim 4 contains substantially the same subject matter as claim 14. Therefore, claim 4 is rejected on the same grounds as claim 14.

Regarding claim 5; Claim 5 contains substantially the same subject matter as claim 15. Therefore, claim 5 is rejected on the same grounds as claim 15.

Regarding claim 6; Claim 6 contains substantially the same subject matter as claim 16. Therefore, claim 6 is rejected on the same grounds as claim 16.

Regarding claim 7; Claim 7 contains substantially the same subject matter as claim 17. Therefore, claim 7 is rejected on the same grounds as claim 17.

Regarding claim 8; Claim 8 contains substantially the same subject matter as claim 18. Therefore, claim 8 is rejected on the same grounds as claim 18.

Regarding claim 9; Claim 9 contains substantially the same subject matter as claim 19. Therefore, claim 9 is rejected on the same grounds as claim 19.

Regarding claim 10; Lemon ‘216 discloses wherein the representation of the qualitative characteristic indicates a whisper or a laugh (i.e. The techniques may also include determining characteristics of the user speech. The characteristics may include at least one of volume changes, pitch changes, or inflection changes of the user speech. For example, volume changes may be determined based at least in part on signal strength variation corresponding to the volume at which the user is speaking. By way of further example, pitch changes may be determined based at least in part on frequency and/or amplitude changes in the audio data corresponding to changes in voice tones by the user. Column 4, line 63 thru column 5, line 36).


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARCUS T. RILEY, ESQ. whose telephone number is (571)270-1581. The examiner can normally be reached 9-5 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tammy P. Goddard, can be reached at 571-272-7773. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MARCUS T. RILEY, ESQ.
Primary Examiner
Art Unit 2677



/MARCUS T RILEY/Primary Examiner, Art Unit 2677