DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 12/23/2021 has been entered.
 
Response to Arguments
Applicant's arguments filed 12/23/2021 have been fully considered but they are moot in view of the newly cited art.
Claim Objections
Claims 4 and 11 are objected to because of the following informalities: the claims depend upon cancelled claims 3 and 10; they should depend upon new claims 18 and 19. Appropriate correction is required.


Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: 
an instruction acquisition module, configured for…  
a voice acquisition module, configured for… 
a recognition module, configured for… 
a text moving module, configured for… 
in claims 8, 9 and 12-14.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 2, 5-9 and 12-16 are rejected under 35 U.S.C. 103 as being unpatentable over Gruber U.S. PAP 2017/0263248 A1, in view of Johansson U.S. PAP 2014/01280667 A1, in view of Lee U.S. PAP 2009/0177300 A1.

Regarding claim 1 Gruber teaches a method for quickly inserting a text of a voice carrier, comprising: 
obtaining a voice acquisition instruction from a user in a first document editing software (receiving, from the microphone, a natural language user input, determining whether the natural language user input comprises a pre-defined editing command, see par. [0006]; dictation-based editing can be used to modify text in a document, see par. [0254]);

recognizing a text corresponding to the voice of interest in the first document editing software (modifying the textual data based on the predefined editing command, see par. [0006]; Method 1000 can be performed using one or more electronic devices that include speech recognition and natural-language processing capabilities, see par. [0356]); 
opening a plurality of documents to be edited through the first document editing software (and text input module 234, search module 251 includes executable instructions to search for text, music, sound, image, video, and/or other files in memory 202 that match one or more search criteria e.g., one or more user-specified search terms, in accordance with user instructions, see par. [0129]);
adding the text corresponding to the voice of interest into the plurality of documents to be edited opened by the first document editing software (transcribing the natural language user input and adding the transcribed natural language user input to the textual data, see par. [0006]). 
Although Gruber teaches accepting voice input (see par. [0090]) and receiving natural language user input (see par. [0006]), Gruber does not teach automatically modifying the format of the text to the format of an original text in the documents to be edited.
In the same field of endeavor Johansson teaches a real-time multimedia event reporting system and method that enable reporters to generate accurate reports or contents simultaneously in multiple languages accessible by users from anywhere in any form in real-time as the live event proceeds. The disclosed system also enhances the real-time performance of the reporting process by enabling dynamic adjustment to the speech transcription operating parameters and by providing real-time editing of transcribed text using configurable event-specific text representations, see abstract. According to an example embodiment, when speech is detected from a speaker, this generates a signal to the real-time editor 60, which in turn accesses the location representation of source 14 stored in memory, retrieves the associated identity representation of source 14, and incorporates at least a portion of the associated identity representation of source 14 into the transcribed text 32, with any suitable modification or formatting thereof, which may comprise the time the audio 12 was detected, see par. [0078]. The real-time editor 60 accesses the stored data representing the location of the commissioner, and obtains the stored data representing the identity of the commissioner. The real-time editor 60 may perform formatting on the identity representation if need be, to produce a text representation of the identity. For example, the real-time editor 60 may prefix the identity text with timestamp, and/or add punctuations as appropriate, as in: "@ 10:45:57 THE COMMISSIONER:". The real-time editor 60 then incorporates this formatted text into the transcription text stream 32, see par. [0079].
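For illustration only, and not as part of the record of Johansson, the timestamp-prefixed identity formatting described in par. [0079] can be sketched as follows. The function names and the choice to upper-case the identity are assumptions made for this sketch; Johansson gives only the example output "@ 10:45:57 THE COMMISSIONER:".

```python
from datetime import time


def format_speaker_label(identity: str, detected_at: time) -> str:
    """Hypothetical helper: prefix the identity text with a timestamp
    and add punctuation, in the style of the example in par. [0079]."""
    return f"@ {detected_at.strftime('%H:%M:%S')} {identity.upper()}:"


def incorporate_label(transcript: list[str], identity: str, detected_at: time) -> list[str]:
    """Return a new transcript with the formatted label appended to the stream."""
    return transcript + [format_speaker_label(identity, detected_at)]


label = format_speaker_label("the commissioner", time(10, 45, 57))
stream = incorporate_label(["...prior transcribed text..."], "the commissioner", time(10, 45, 57))
```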
It would have been obvious to one of ordinary skill in the art to combine the teachings of Gruber with the Johansson reference for the benefit of enhancing the real-time performance of the reporting process by enabling dynamic adjustment to the speech transcription operating parameters and by providing real-time editing of transcribed text using configurable event-specific text representations, see abstract.
Regarding claim 2 Gruber teaches the method according to claim 1, wherein the voice acquisition instruction is a selecting instruction; obtaining a voice of interest according to the voice acquisition instruction comprises: 
selecting one or more voice carrier files from stored voice carrier files as the voice of interest according to the selecting instruction (selectively provide information e.g. user data stored on the device, see par. [0094]). 
Regarding claim 5 Gruber teaches the method according to claim 1, wherein the voice acquisition instruction is a recording instruction; 
obtaining a voice of interest according to the voice acquisition instruction comprises: recording a voice by using an audio input device and using the recorded voice as the voice of interest (audio circuitry receives sound signals from microphone, see par. [0090]). 
Regarding claim 6 Gruber teaches the method according to claim 1, wherein adding the text into a document to be edited in the first document editing software comprises: adding the text into a location to be inserted in the document to be edited, wherein the location to be inserted is a location of a mouse cursor, or a location of a touch screen cursor (device adds the transcribed text to the previously obtained text by appending transcribed text to the end of the previously obtained text by inserting it at a focus location such as a location of a cursor, see par. [0267]). 
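For illustration only, the behavior cited from Gruber par. [0267] (appending the transcribed text to the end of the existing text, or inserting it at a focus location such as a cursor) can be sketched as below. The function name, the character-offset representation of the cursor, and the clamping of out-of-range offsets are assumptions of this sketch and are not taken from Gruber.

```python
from typing import Optional


def add_transcribed_text(existing: str, transcribed: str, cursor: Optional[int] = None) -> str:
    """Hypothetical helper: insert at the cursor offset if one is given,
    otherwise append to the end of the existing text."""
    if cursor is None:
        return existing + transcribed
    # Clamp the offset so an out-of-range cursor cannot raise or wrap around.
    cursor = max(0, min(cursor, len(existing)))
    return existing[:cursor] + transcribed + existing[cursor:]


inserted = add_transcribed_text("Hello world", " big", cursor=5)
appended = add_transcribed_text("Hello", " world")
```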
Regarding claim 7 Gruber teaches the method according to claim 6, wherein the format comprising one or more of font, font size, color and line spacing (modifying the textual data comprises modifying a visual formatting of the textual data, such as underlining, italicizing, striking through, changing letters to uppercase or lowercase, capitalizing etc. see par. [0295]). 
Regarding claim 8 Gruber teaches an apparatus for quickly inserting a text of a voice carrier (an electronic device, see par. [0009]), comprising: 
an instruction acquisition module, configured for obtaining a voice acquisition instruction from a user, in a first document editing software (receiving, from the microphone, a natural language user input, determining whether the natural language user input comprises a pre-defined editing command, see par. [0006]; dictation-based editing can be used to modify text in a document, see par. [0254]; device 104 determines that the natural-language user input 820 includes a predetermined editing command, see par. [0273]);
a voice acquisition module, configured for obtaining a voice of interest according to the voice acquisition instruction (modifying the textual data based on the predefined editing command, see par. [0006]; Method 1000 can be performed using one or more electronic devices that include speech recognition and natural-language processing capabilities, see par. [0356]);
a recognition module, configured for recognizing a text corresponding to the voice of interest in the first document editing software (modifying the textual data based on the predefined editing command, see par. [0006]; Method 1000 can be performed using one or more electronic devices that include speech recognition and natural-language processing capabilities, see par. [0356]); 
234, search module 251 includes executable instructions to search for text, music, sound, image, video, and/or other files in memory 202 that match one or more search criteria e.g., one or more user-specified search terms, in accordance with user instructions, see par. [0129])). 
However Gruber does not teach a format modifying module, configured for automatically modifying the format of the text to the format of an original text in the documents to be edited.
In the same field of endeavor Johansson teaches a real-time multimedia event reporting system and method that enable reporters to generate accurate reports or contents simultaneously in multiple languages accessible by users from anywhere in any form in real-time as the live event proceeds. The disclosed system also enhances the real-time performance of the reporting process by enabling dynamic adjustment to the speech transcription operating parameters and by providing real-time editing of transcribed text using configurable event-specific text representations, see abstract. According to an example embodiment, when speech is detected from a speaker, this generates a signal to the real-time editor 60, which in turn accesses the location representation of source 14 stored in memory, retrieves the associated identity representation of source 14, and incorporates at least a portion of the associated identity representation of source 14 into the transcribed text 32, with any suitable modification or formatting thereof, which may comprise the time the audio 12 was detected, see par. [0078]. The real-time editor 60 accesses the stored data representing the location of the commissioner, and obtains the stored data representing the identity of the commissioner. The real-time editor 60 may perform formatting on the identity representation if need be, to produce a text representation of the identity. For example, the real-time editor 60 may prefix the identity text with timestamp, and/or add punctuations as appropriate, as in: "@ 10:45:57 THE COMMISSIONER:". The real-time editor 60 then incorporates this formatted text into the transcription text stream 32, see par. [0079].
It would have been obvious to one of ordinary skill in the art to combine the teachings of Gruber with the Johansson reference for the benefit of enhancing the real-time performance of the reporting process by enabling dynamic adjustment to the speech transcription operating parameters and by providing real-time editing of transcribed text using configurable event-specific text representations, see abstract.
Regarding claim 9 Gruber teaches the apparatus according to claim 8, wherein the voice acquisition instruction is a selecting instruction; the voice acquisition module is specifically configured for selecting one or more voice carrier files from stored voice carrier files as the voice of interest according to the selecting instruction (selectively provide information e.g. user data stored on the device, see par. [0094]). 
Regarding claim 12 Gruber teaches the apparatus according to claim 8, wherein the voice acquisition instruction is a recording instruction; the voice acquisition module is specifically configured for: recording a voice by using an audio input device and using the recorded voice as the voice of interest (audio circuitry receives sound signals from microphone, see par. [0090]). 
Regarding claim 13 Gruber teaches the apparatus according to claim 8, wherein the text moving module is specifically configured for: adding the text into a location to be inserted in the document to be edited, wherein the location to be inserted is a location of a mouse cursor, or a location of a touch screen cursor (device adds the transcribed text to the previously obtained text by appending transcribed text to the end of the previously obtained text by inserting it at a focus location such as a location of a cursor, see par. [0267]). 
Regarding claim 14 Gruber teaches the apparatus according to claim 13, wherein the device further comprises: a format modifying module, configured for modifying the format of the text to the format of text in the document to be edited after adding the text into the document to be edited, the format comprising one or more of font, font size, color and line spacing (modifying the textual data comprises modifying a visual formatting of the textual data, such as underlining, italicizing, striking through, changing letters to uppercase or lowercase, capitalizing etc. see par. [0295]). 
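For illustration only, the claimed limitation of modifying the format of the added text to the format of the text already in the document (font, font size, color and line spacing) can be sketched as below. This sketch is not drawn from Gruber or Johansson; the TextRun type, its field names, and the match_format function are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TextRun:
    """Hypothetical container for a span of text and its visual format."""
    text: str
    font: str
    font_size: float
    color: str
    line_spacing: float


def match_format(inserted: TextRun, original: TextRun) -> TextRun:
    """Return the inserted run with its format replaced by the original run's format."""
    return replace(
        inserted,
        font=original.font,
        font_size=original.font_size,
        color=original.color,
        line_spacing=original.line_spacing,
    )


original = TextRun("Existing paragraph.", "Times New Roman", 12.0, "#000000", 1.5)
dictated = TextRun(" Dictated sentence.", "Arial", 10.0, "#333333", 1.0)
matched = match_format(dictated, original)
```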
Regarding claim 15 Gruber teaches an electronic device, wherein it comprises a processor and a memory, the memory is configured for storing a computer program; the processor is configured for implementing steps of the method according to claim 1 when executing the program stored in the memory (see figure 6B). 
Regarding claim 16. 
Claims 4, 11, 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Gruber U.S. PAP 2017/0263248 A1, in view of Johansson U.S. PAP 2014/01280667 A1, further in view of Lee U.S. PAP 2009/0177300 A1.

Regarding claim 18 Gruber in view of Johansson does not disclose the method according to claim 1, wherein the voice acquisition instruction is an extracting instruction; obtaining a voice of interest according to the voice acquisition instruction comprises: obtaining a voice start point and a voice termination point in a voice carrier file through a voice extracting instruction; extracting a voice segment between the voice start point and the voice termination point in the voice carrier file as a voice of interest.
In the same field of endeavor Lee teaches a system which can utilize audio files containing given content having a first frequency characteristic corresponding to the spoken voice of one or more individuals, see par. [0008]. The media device may store audio data files in memory. The start and end times of the audio data file may also be specified in "start" and "end" tags respectively, see par. [0083]. Body 1004 of audio data file 1000 may also include an indication of the sections of the audio data file that a user wants to alter. As an example, tags "alter_start_n" and "alter_end_n" may signify the start and end times of the nth section of the audio data file that a user wants to alter. If a user selects a section of the audio data file to alter, the media device may create tags in audio data file 1000 to specify the section that has been selected, see par. [0084]. In some embodiments, a user may provide the input for both tags "alter_start_n" and "alter_end_n". For example, a user may use the media device to supply the media device with the start and end times of selected audio data files. In other embodiments, an audio data file may contain a first tag that labels the beginning of the audio data file as the beginning of a user-selected audio data file (e.g., the "alter_start_n" tag). While a user is playing the audio data file, the user may instruct the media device to establish a second tag to indicate the end of the user-selected audio data file (e.g., the "alter_end_n" tag), see par. [0085]. 
It would have been obvious to one of ordinary skill in the art to combine the Gruber in view of Johansson invention with the teachings of Lee for the benefit of adjusting audio output signals to emphasize important parts of the transcription, see par. [0004]. 
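For illustration only, extracting the segment of an audio data file lying between a start time and an end time, in the manner of the "alter_start_n" and "alter_end_n" tags cited from Lee pars. [0084]-[0085], can be sketched as below. The function name, the representation of the audio as a flat list of samples, and the sample-rate parameter are assumptions of this sketch and are not taken from Lee.

```python
from typing import Sequence


def extract_segment(samples: Sequence[int], sample_rate: int,
                    start_s: float, end_s: float) -> Sequence[int]:
    """Hypothetical helper: return the samples between start_s and end_s (in seconds)."""
    if not 0 <= start_s <= end_s:
        raise ValueError("start and end times must satisfy 0 <= start <= end")
    first = int(round(start_s * sample_rate))
    # Clamp the end index so a tag past the end of the file is tolerated.
    last = min(int(round(end_s * sample_rate)), len(samples))
    return samples[first:last]


# Ten samples at 2 samples per second; the tags mark 1.0 s through 3.0 s.
segment = extract_segment(list(range(10)), sample_rate=2, start_s=1.0, end_s=3.0)
```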

Regarding claim 4 Gruber in view of Johansson does not teach the method according to claim 1, wherein obtaining a voice start point and a voice termination point in a voice carrier file according to the extracting instruction comprises: determining the voice start point and the voice termination point in the voice carrier file by using a location of a mouse cursor, or determining the voice start point and the voice termination point in the voice carrier file by using a location of a touch screen cursor.
Lee teaches the method according to claim 1, wherein obtaining a voice start point and a voice termination point in a voice carrier file according to the extracting instruction comprises: determining the voice start point and the voice termination point in the voice carrier file by using a location of a mouse cursor, or determining the voice start point and the voice termination point in the voice carrier file by using a location of a touch screen cursor (user input component 106 could also be a mouse, keyboard, audio trackball, slider bar, one or more buttons, media device pad, dial, keypad, click wheel, switch, touch screen, any other input component or device, and/or a combination thereof. User input component 106 may also include a multi-touch screen such as that shown in FIG. 2, see par. [0021]). 
It would have been obvious to one of ordinary skill in the art to combine the Gruber in view of Johansson invention with the teachings of Lee for the benefit of adjusting audio output signals to emphasize important parts of the transcription, see par. [0004]. 
Regarding claim 11 Gruber in view of Johansson does not teach the apparatus according to claim 8, wherein the interval acquisition sub-module is specifically configured for: determining the voice start point and the voice termination point in the voice carrier file by using a location of a mouse cursor, or determining the voice start point and the voice termination point in the voice carrier file by using a location of a touch screen cursor.
Lee teaches the apparatus according to claim 8, wherein the interval acquisition sub-module is specifically configured for: determining the voice start point and the voice termination point in the voice carrier file by using a location of a mouse cursor, or determining the voice start point and the voice termination point in the voice carrier file by using a location of a touch screen cursor (user input component 106 could also be a mouse, keyboard, audio trackball, slider bar, one or more buttons, media device pad, dial, keypad, click wheel, switch, touch screen, any other input component or device, and/or a combination thereof. User input component 106 may also include a multi-touch screen such as that shown in FIG. 2, see par. [0021]).


Regarding claim 19 Gruber in view of Johansson does not teach the apparatus according to claim 8, wherein the voice acquisition instruction is an extracting instruction; the voice acquisition module comprises: an interval acquisition sub-module, configured for obtaining a voice start point and a voice termination point in a voice carrier file according to the extracting instruction; an extracting sub-module, configured for extracting a voice segment between the voice start point and the voice termination point in the voice carrier file as the voice of interest.
In the same field of endeavor Lee teaches a system which can utilize audio files containing given content having a first frequency characteristic corresponding to the spoken voice of one or more individuals, see par. [0008]. The media device may store audio data files in memory. The start and end times of the audio data file may also be specified in "start" and "end" tags respectively, see par. [0083]. Body 1004 of audio data file 1000 may also include an indication of the sections of the audio data file that a user wants to alter. As an example, tags "alter_start_n" and "alter_end_n" may signify the start and end times of the nth section of the audio data file that a user wants to alter. If a user selects a section of the audio data file to alter, the media device may create tags in audio data file 1000 to specify the section that has been selected, see par. [0084]. In some embodiments, a user may provide the input for both tags "alter_start_n" and "alter_end_n". For example, a user may use the media device to supply the media device with the start and end times of selected audio data files. In other embodiments, an audio data file may contain a first tag that labels the beginning of the audio data file as the beginning of a user-selected audio data file (e.g., the "alter_start_n" tag). While a user is playing the audio data file, the user may instruct the media device to establish a second tag to indicate the end of the user-selected audio data file (e.g., the "alter_end_n" tag), see par. [0085]. 
It would have been obvious to one of ordinary skill in the art to combine the Gruber in view of Johansson invention with the teachings of Lee for the benefit of adjusting audio output signals to emphasize important parts of the transcription, see par. [0004]. 


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Pertinent prior art is available on form 892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Ortiz-Sanchez whose telephone number is (571)270-3711. The examiner can normally be reached Monday-Friday, 9AM-6PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/MICHAEL ORTIZ-SANCHEZ/Primary Examiner, Art Unit 2656