DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Examiner acknowledges the amendments to the claims received on 12/3/2021 have been entered, and that no new matter has been added.

Response to Arguments
Argument 1: Applicant argues on pages 10-11 of the filing on 12/3/2021 that “neither a calendar invite nor the presentation attachment used to reject the similar subject matter are collaborative documents having the claimed comment functionality” in claim 1.
Response to Argument 1: Argument 1 is moot in view of the new grounds of rejection necessitated by Applicant’s amendment.  The amendment has changed the scope of the claims, and new art has been applied.

The newly applied art meets the claim limitations as currently claimed.  Applicant’s remaining statements regarding the remaining independent and dependent claims are moot for the reasons stated above.

Claim Rejections - 35 USC § 103
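In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.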
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 3-7, 9-13, 18-19, and 21-24 are rejected under 35 U.S.C. 103 as being unpatentable over Lewis et al., Patent Application Publication Number US 20180365232 A1 (hereinafter “Lewis”), in view of Skarbovsky et al., Patent Application Publication Number US 20180143956 A1 (hereinafter “Skarbovsky”), in view of Seeley et al., Patent Application Publication Number US 20190179501 A1 (hereinafter “Seeley”).
Claim 1:  Lewis teaches “A computer-implemented method comprising:
receiving, by a content creation system, audio data including speech of a… speaker…, the audio data corresponding to a collaborative document (i.e. sends the audio to the transcription service 104 to extract text from the audio. The transcription service 104 may transcribe the text from the audio using a transcription data source [Lewis 0018, Fig. 3] note: 3 users are collaborating using dictation application);
responsive to receiving the audio data, accessing a custom lexicon (i.e. first user selects Italian as the language to receive text of the conversation on a first device corresponding to the first user. The second user selects English on a second device, and the third user selects French on a third device… the first user may say "Ciao, come stai?" which may be transcribed at a cloud service. The transcription may be sent back to the first device, displaying text of what the first user spoke. The second user may see a translation displayed on the second device in English of "Hello, how are you?" and the third user may see a translation displayed on the third device in French of "Bonjour, ca va?" [Lewis 0011] note: each person choosing a different language to receive text is a selection of a custom lexicon.  Note2: in response to each speech input, a custom language lexicon is accessed for the translation to appear) that is generated by:
identifying, by the content creation system, a respective account (i.e. a user may have a customized model for transcription [Lewis 0013]) for each speaker of the plurality of speakers (i.e. second user and the fourth user may receive different transcriptions if their personalized or customized transcription… models include different outputs [Lewis 0012]), each respective account associated with one or more documents stored by the content creation system (i.e. transcription personalization model data source 106… may retrieve or store information regarding local context, such as from contact lists, search history, geolocation, use case data, or the like, to customize the transcription… to a relevant scenario [Lewis 0025]… Personalization models may be created by a user… may upload data to be used in automatically generating a personalization model (e.g., a slideshow deck may be uploaded as an input for a presentation model other documents containing technical language, address books containing personal or place names, or the like) [Lewis 0027]), each document of the one or more documents associated with information identifying a set of accounts as collaborators having accessed the document (i.e. Personalization models may be created by a user… may upload data to be used in automatically generating a personalization model (e.g., a slideshow deck may be uploaded as an input for a presentation model other documents containing technical language, address books containing personal or place names, or the like) [Lewis 0027] note: a user uploading documents indicates that these documents are associated with the account of the uploading user),…
generating, by the content creation system for the… speaker…, the custom lexicon (i.e. text may be transcribed using a model customized to a user of the at least one of the plurality of user devices [Lewis 0035])…;
transcribing, by the content creation system, the audio data into text representative of the speech (i.e. sends the audio to the transcription service 104 to extract text from the audio. The transcription service 104 may transcribe the text from the audio using a transcription data source [Lewis 0018]) using the custom lexicon (i.e. after a personalization model is created for a user, the user may use the model to modify or accentuate transcription [Lewis 0027]); and
(i.e. after a personalization model is created for a user, the user may use the model to modify or accentuate transcription [Lewis 0027]).”
Lewis teaches a plurality of speakers each with a custom lexicon, and uploading documents to personalize transcription models.  Lewis teaches audio data for each user.  Lewis is silent regarding “audio data including speech of a plurality of speakers, each speaker of the plurality of speakers included in the audio data.”  Lewis teaches a custom lexicon for each speaker.  Lewis is silent regarding “generating, by the content creation system for the plurality of speakers, the custom lexicon based on the set of documents on which each speaker is a collaborator.”  Lewis is silent regarding “determining, by the content creation system, a plurality of collaborative documents on which each speaker is a collaborator based on the information identifying the set of accounts as collaborators, each collaborative document of the plurality of collaborative documents having been determined for inclusion in the plurality of collaborative documents based on it being stored in association with accounts of all of the plurality of speakers…”
Skarbovsky teaches “receiving, by a content creation system, audio data including speech of a plurality of speakers, each speaker of the plurality of speakers included in the audio data, the audio data corresponding to a collaborative document (i.e. speech data for conversion to text [Skarbovsky 0038]… an event to be transcribed is a meeting between department heads… the transcript [Skarbovsky 0040] note: the transcription document is a collaboration from each speaker);…
identifying, by the content creation system, a respective account for each speaker of the plurality of speakers (i.e. the n most recently accessed documents for each department head [Skarbovsky 0041] note: each department head having access to documents indicates that each has an account), each respective account associated with one or more documents stored by the content creation system (i.e. the n most recently accessed documents for each department head [Skarbovsky 0041] note: each account has at least read access, thus the account is associated with the documents), each document of the one or more documents associated with information identifying a set of accounts as collaborators having accessed the document (i.e. the n most recently accessed documents for each department head [Skarbovsky 0041]),
determining, by the content creation system, a plurality of collaborative documents (i.e. contextual information for the event from attendee/presenter lists, a meeting invitation, an attached presentation, etc [Skarbovsky 0040] note: these two documents are a plurality of documents) on which each speaker is a collaborator based on the information identifying the set of accounts as collaborators (i.e. contextual information for the event from attendee/presenter lists, a meeting invitation, an attached presentation, etc [Skarbovsky 0040] note: a POSITA understands that a meeting invitation is a first document sent electronically (most commonly via email) to each speaker/attendee of a meeting.  The invitation is edited by each user accepting the meeting, or collaborated.  An attached presentation gives each speaker/attendee a second document in common, with read access, or collaboration ability.  The emailed invitation/attachment document has a recipient field, indicating the email addresses, or accounts, that are invited to accept the invitation, or view the attachment), each collaborative document of the plurality of collaborative documents having been determined for inclusion in the plurality of collaborative documents based on it being stored in association with accounts of all of the plurality of speakers (i.e. contextual information for the event from attendee/presenter lists, a meeting invitation, an attached presentation, etc [Skarbovsky 0040] note: a POSITA understands a meeting invitation and/or attachment is sent to each speaker’s account.  Meeting invitations/attachments are stored in a user’s email account or a calendar account.  All attendees (speakers) in the meeting will at least have this document in common between all attendees/speakers),…
generating, by the content creation system for the plurality of speakers (i.e. a meeting between department heads [Skarbovsky 0040]), the custom lexicon (i.e. supplemental contextual data [Skarbovsky 0041]… supplemental contextual information… names and terms from the [supplemental] data retrieved are parsed and are used as supplemental contextual information to augment the contextual dictionary 130 [Skarbovsky 0042] note: the contextual dictionary indicates that there is one contextual dictionary for the meeting of multiple speakers) based on the plurality of collaborative documents on which each speaker is a collaborator (i.e. contextual information for the event from attendee/presenter lists, a meeting invitation, an attached presentation, etc [Skarbovsky 0040]);
transcribing, by the content creation system, the audio data into text representative of the speech using the custom lexicon (i.e. an event to be transcribed is a meeting… by querying the graph database for persons or documents related to the department heads… the contextual dictionary 130 can be expanded to include or reweight terms and names discovered that may be spoken during the event [Skarbovsky 0040] note: the contextual dictionary indicates that there is one contextual dictionary for the audio data of the meeting of multiple speakers);”
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Lewis to include the feature of having the ability to customize audio to text lexicons as disclosed by Skarbovsky.  
One would have been motivated to do so, before the effective filing date of the invention because it provides the benefit where “speech-to-text algorithm may be improved and made more efficient [Skarbovsky 0006].”
Lewis and Skarbovsky are silent regarding “the plurality of collaborative documents each having one or more content level comments that are annotated in visual association with one or more text spans but displayed in an interface separate from the one or more text spans.”
Seeley teaches “the plurality of collaborative documents each having one or more content level comments that are annotated in visual association with one or more text spans but displayed in an interface separate from the one or more text spans (i.e. FIGS. 4A and 4B illustrate examples of various ways comments 119 may be created in collaborative documents [Seeley 0060]… may select a portion of the text (“10%”) in the collaborative text document 115 and select an icon or option to create a comment. The first user may enter text “Maybe we can shoot for 20%” into the comment 402 [Seeley 0064, Fig. 4B] note: the comment “maybe we can shoot for 20%” is visually associated with the highlighted “10%” in Fig. 4B, and displayed in a separate sidebar interface).”
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Lewis and Skarbovsky to include the feature of having the ability to add and share collaborative documents containing comments as disclosed by Seeley.  
One would have been motivated to do so, before the effective filing date of the invention, because it provides the benefit of communicating and collaborating on changes in a document without changing the content of the document itself.

Claim 3:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 1, above.  Skarbovsky teaches “wherein generating the custom lexicon comprises: identifying, by the content creation system, a set of words or n-grams… the plurality of collaborative documents; and modifying, by the content creation system, a default lexicon to include the identified set of words or n-grams to generate the custom lexicon (i.e. names and terms from the data retrieved are parsed and are used as supplemental contextual information to augment the contextual dictionary [Skarbovsky 0042]).”
One would have been motivated to combine Lewis, Skarbovsky, and Seeley, before the effective filing date of the invention because it provides the benefit where “speech-to-text algorithm may be improved and made more efficient [Skarbovsky 0006].”

Claim 4:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 1, above.  Lewis teaches “wherein generating the custom lexicon for the plurality of speakers comprises selecting among a plurality of lexicons (i.e. a model may be selected based on a profession, a topic, a location (e.g., using regional dialects), or from a user-selected pre-defined model (e.g., including industry-specific language or in-jokes for a group) [Lewis 0013]) associated with each speaker of the plurality of speakers (i.e. text may be transcribed using a model customized to a user of the at least one of the plurality of user devices [Lewis 0035]) based on a subject matter of the audio data (i.e. personalization models may be created based on one or more sets of topics… [Lewis 0028]).”  

Claim 5:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 1, above.  Lewis teaches “wherein generating the custom lexicon for the plurality of speakers comprises selecting among a plurality of lexicons associated with each speaker of the plurality of speakers (i.e. text may be transcribed using a model customized to a user of the at least one of the plurality of user devices [Lewis 0035]) based on one or more characteristics of the one or more speakers (i.e. a model may be selected based on a profession, a topic, a location (e.g., using regional dialects), or from a user-selected pre-defined model (e.g., including industry-specific language or in-jokes for a group) [Lewis 0013]).”  

Claim 6:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 1, above.  Skarbovsky teaches “wherein the set of documents is accessible to each of the plurality of speakers (i.e. contextual information for the event from attendee/presenter lists, a meeting invitation, an attached presentation, etc. [Skarbovsky 0040] note: a meeting invitation and/or an attached presentation is a document sent to each attendee/speaker of a meeting, and are accessible to the receiver/attendee/speaker).”
One would have been motivated to combine Lewis, Skarbovsky, and Seeley, before the effective filing date of the invention because it provides the benefit where “speech-to-text algorithm may be improved and made more efficient [Skarbovsky 0006].”

Claim 7:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 1, above.  Skarbovsky teaches “wherein each document of the plurality of collaborative documents is stored within an account of one or more of the plurality of speakers (i.e. the n most recently accessed documents for each department head [Skarbovsky 0041]… contextual information for the event from attendee/presenter lists, a meeting invitation, an attached presentation, etc. [Skarbovsky 0040] note: each department head having access to documents indicates that each has an account.  Note2: a meeting invitation and/or an attached presentation are sent to each attendee/speaker, and thus stored within their respective accounts).”


Claim 9:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 1, above.  Skarbovsky teaches “wherein at least one document of the plurality of collaborative documents comprises the collaborative document (i.e. contextual information for the event from attendee/presenter lists, a meeting invitation, an attached presentation, etc. [Skarbovsky 0040] note: a meeting invitation is viewed and accepted (edited) among each attendee, and an attached presentation is shared for viewing (collaborating)).”
One would have been motivated to combine Lewis, Skarbovsky, and Seeley, before the effective filing date of the invention because it provides the benefit where “speech-to-text algorithm may be improved and made more efficient [Skarbovsky 0006].”

Claim 10:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 1, above.  Skarbovsky teaches “wherein the set of documents is associated with a subject matter of the audio data (i.e. contextual information for the event from…an attached presentation [Skarbovsky 0040] note: an attached presentation to a meeting invite is a document in a set of documents accessible by each attendee/speaker, which includes a subject matter of the meeting).”
One would have been motivated to combine Lewis, Skarbovsky, and Seeley, before the effective filing date of the invention because it provides the benefit where “speech-to-text algorithm may be improved and made more efficient [Skarbovsky 0006].”

Claim 11:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 1, above.  Lewis teaches “wherein the set of documents is selected by a speaker of the plurality of speakers (i.e. a user may upload data to be used in automatically generating a personalization model (e.g., a slideshow deck may be uploaded as an input for a presentation model other documents containing technical language, address books containing personal or place names, or the like) [Lewis 0027]).”

Claim 12:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 1, above.  Lewis teaches “wherein the audio data is captured during a meeting (i.e. participants may have natural conversations that include interruptions and multiple people speaking at same time or during overlapping time periods [Lewis 0025]).”

Claim 13:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 12, above.  Lewis teaches “wherein the set of documents is selected based on a characteristic of the meeting (i.e. a user may upload data to be used in automatically generating a personalization model (e.g., a slideshow deck may be uploaded as an input for a presentation model other documents containing technical language, address books containing personal or place names, or the like) [Lewis 0027]).”  

Claim 18: Lewis, Skarbovsky, and Seeley teach a system (i.e. system [Lewis 0040]) comprising: 
one or more processors (i.e. a hardware processor [Lewis 0040]); and 
a non-transitory computer-readable storage medium storing executable instructions (i.e. a machine readable medium 522 that is non-transitory on which is stored one or more sets of data structures or instructions [Lewis 0041]) that, when executed by the one or more processors (i.e. instructions may also reside… within the hardware processor 502 during execution [Lewis 0041]), perform steps comprising the operations corresponding to the method of claim 1; therefore, it is rejected under the same rationale.

Claim 19:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 18, above.  Lewis teaches “wherein the custom lexicon includes a set of terms included within the plurality of collaborative documents and not included within a default lexicon (i.e. transcription personalization model data source 106 may include… language-specific models (e.g., based on… foreign-language words) [Lewis 0018] note: one skilled in the art understands that foreign language words include sets of terms not in a base language).”

Claim 21: Lewis, Skarbovsky, and Seeley teach a non-transitory computer-readable medium comprising memory with instructions encoded thereon that, when executed, cause one or more processors to perform operations (i.e. a machine readable medium 522 that is non-transitory on which is stored one or more sets of data structures or instructions [Lewis 0041]), the instructions comprising instructions to perform operations corresponding to the method of claim 1, therefore it is rejected under the same rationale.

Claim 22: Claim 22 is similar in content and in scope to claim 3, thus it is rejected under the same rationale.

Claim 23: Claim 23 is similar in content and in scope to claim 4, thus it is rejected under the same rationale.

Claim 24: Claim 24 is similar in content and in scope to claim 5, thus it is rejected under the same rationale.

Claims 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over Lewis, in view of Skarbovsky, in view of Seeley, in view of Chang et al., Patent Number US 8447608 B1 (hereinafter “Chang”).
Claim 14:  Lewis, Skarbovsky, and Seeley teach all the limitations of claim 1, above.  Lewis, Skarbovsky, and Seeley are silent regarding “wherein a second custom lexicon is accessed in response to the custom lexicon not including text associated with a spoken word.”
Chang teaches “wherein a second custom lexicon is accessed in response to the custom lexicon not including text associated with a spoken word (i.e. if the audio file has both Spanish and English dialogue, the system may require both Spanish and English source texts to create the type-specific language model [Chang Col 7, lines 36-38]).”
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Lewis, Skarbovsky, and Seeley to include the feature of having the ability to use a second lexicon as disclosed by Chang.  
One would have been motivated to do so, before the effective filing date of the invention because it provides the benefit which “allows for highly accurate speech-to-text transcription of input texts belonging to a particular type. Additionally, the speech-to-text transcription of specialized input texts can be improved relative to speech-to-text transcription system using generic language models [Chang Col 2, lines 48-52].”

Claim 15:  Lewis, Skarbovsky, Seeley, and Chang teach all the limitations of claim 14, above.  Lewis teaches “wherein the second custom lexicon is generated based on a second set of documents (i.e. each user… may have a customized speech recognition model [Lewis 0024]… a user may upload data to be used in automatically generating a personalization model (e.g., a slideshow deck may be uploaded as an input for a presentation model other documents containing technical language, address books containing personal or place names, or the like) [Lewis 0027] note: each user has their own customized model, which is generated by uploading their respective set of documents).”

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Bultrowicz (US 20100095198 A1) listed on 892 is related to sharing collaborative documents, specifically including comments.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAMUEL SHEN whose telephone number is (469)295-9169. The examiner can normally be reached Monday-Thursday, 7:00 am - 5:00 pm CT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Ell, can be reached at (571) 270-3264. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






/S.S./Examiner, Art Unit 2171

/MATTHEW ELL/Supervisory Primary Examiner, Art Unit 2171