DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims recite various limitations that, but for the recitation of generic computer components, cover performance of the limitations in the mind and/or using pencil and paper. Independent claims 1, 13, and 17 recite, “collecting media; extracting one or more features from the media; and transcribing the media based on the extracted one or more features and one or more models.”
The limitations of “collecting…”, “extracting…”, and “transcribing…”, under their broadest reasonable interpretation, cover mental processes. More specifically, a human can receive media and memorize it, depict one or more features of the media, and transcribe the media based on the features and on models developed with pencil and paper. These steps require only pen and paper or head and hand; see MPEP 2106.04(a)(2)(III).
This judicial exception is not integrated into a practical application because the claims recite the additional elements “computer-implemented”, “computer program product”, “computer-readable storage medium”, and “processors”. Furthermore, the specification indicates that the computer program instructions may be provided to a processor of a general purpose computer; see para. 89 (“these computer readable program instructions may be provided to a processor of a general purpose computer”). These elements are used to perform the claimed method steps and are recited at a high level of generality using generic computer components; see paras. 53-54. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
The remaining dependent claims 2-12, 14-16, and 18-20 serve to further describe the computerized method more generally alluded to in the parent claims. These claims deal with: notifying a user, which may be done by writing or speaking; a user checking the transcription for accuracy and confirming it, and notifying one or more other users of the transcription; pencil-and-paper models correlating one or more features with an appropriate transcription style and appropriately transcribing the media; receiving feedback indicating whether the transcription was accurate, and adjusting the one or more pencil-and-paper models based on the received feedback; collecting training information, depicting training features from the training data, and training the one or more models based on the extracted features, as through pencil and paper; determining a transcription style based on the extracted features and one or more pencil-and-paper models, wherein the media is transcribed (with pencil and paper or mentally) according to the determined transcription style; selecting the transcription style from a group comprising a transcription, outline, summary, presentation with notes, blog with comments, and tutorial with examples; notifying the user of the transcription, vocally or in writing, along with playback of the media, with the transcription synchronized with the media based on the media’s content; a transcription with timestamps, i.e., timing the transcription and including the times in it; a transcription that is searchable by the user, e.g., the user looks for a keyword within the transcription; and features that include topics, importance, frequency, vocabulary, tones, moods, pointing, waving, facial expressions, eye direction, and eye movement.
The additional elements included in the claims, such as “computer-implemented”, “computer program product”, “computer-readable storage medium”, and “processors”, are not sufficient to amount to significantly more than the judicial exception, for the reasons discussed above with respect to integration of the abstract idea into a practical application. Viewed as a whole, these additional claim elements do not provide meaningful limitations that transform the abstract idea into a patent-eligible application of the abstract idea such that the claims amount to significantly more than the abstract idea itself. The claims are not patent eligible.
The improvements have been considered (para. 47, improved accuracy); however, other than the generic computer components (the client device, computer program product, computer-readable storage medium, and processors), there are no additional elements besides the judicial exception. The asserted improvement is therefore insufficient to integrate the judicial exception into a practical application, such as an improvement in the functioning of a computer or an improvement to other technology or technical field, as discussed in MPEP §§ 2106.04(d)(1) and 2106.05(a); please also refer to MPEP 2106.04(d)(II) and (III). Adding “computer-implemented” and generic computer components to each limitation merely replicates the application of a generic computer component to each limitation, much as initially considered for the “computer-implemented method”, in which the processor performs each of the limitations. The additional limitation of the processor does not integrate the judicial exception into a practical application because the processor is generic and merely applies the method, with no significant improvement to the generic computer component itself. The generic computer components therefore neither integrate the abstract idea into a practical application in Step 2A Prong Two nor add significantly more in Step 2B; they merely set forth instructions to apply the exception, and these additional elements are well-understood, routine, conventional activity. Therefore, claims 1-20 are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter in the form of an abstract idea without significantly more.

Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-4, 7, 9-11, and 13-20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Bradley et al. (US Pub. No. 2022/0115020 A1), hereinafter Bradley.

Regarding claim 1, Bradley teaches a computer-implemented method for transcribing media (Para. 66, a system that is configured to implement a method and system for automatic conversation transcription with non-exhaustive exemplary terminals and connected components; furthermore, para. 231 indicates modifications and variations may be made to the disclosed embodiments while remaining within the scope of the embodiments of the invention as defined by the following claims), the method comprising:
collecting media (Para. 66, first terminal device 21 can capture and play audio and/or video, both of which can be transmitted to a conferencing server 25 in the network 24);
extracting one or more features from the media (Para. 95, the system can perform speech feature extraction separately on each segment. Speech segments can be clustered based on the location of their speech feature vectors within a vector space i.e. figure 4 is exemplary of voice activity detection process performed in the method); and
transcribing the media based on the extracted one or more features and one or more models (Para. 70, the Automatic Speech Recognition system can adopt a domain-specific language model 29 to generate the transcripts; furthermore, where features extracted from the media are used within the ASR and NLU system, see para. 78).

Regarding claim 2, Bradley teaches the method of claim 1 (see claim 1 above). In addition, Bradley teaches the method further comprising:
notifying a user of the transcription (Para. 73, Figure 2 has non-exhaustive exemplary terminals and connected components, where the system can push transcript changes to all terminals so that they can display the latest version of the transcripts, i.e., transcripts are pushed to the user and displayed; furthermore, the broadest reasonable interpretation of notifying is presenting information, which correlates with the example given in the specification of the instant application at para. 44).

Regarding claim 3, Bradley teaches the method of claim 2 (see claim 2 above). In addition, Bradley teaches the method further comprising:
based on receiving confirmation of an accuracy of the transcription from the user (Para. 74, two or more editors can jointly modify the stored transcripts, i.e., the editors, who are users, confirm accuracy and edit any inaccuracies in the transcription), notifying one or more other users of the transcription (Para. 74, the system can, at the same time, push transcript changes to all terminals so that they can display the latest version of the transcripts, i.e., the system presents transcripts based on receiving confirmations of accuracy through review by the editors/users).

Regarding claim 4, Bradley teaches the method of claim 1 (see claim 1 above). In addition, Bradley teaches,
wherein the one or more models correlate the one or more features with an appropriate transcription style and appropriately transcribing the media (Para. 70, The system can enable customization of a language model for a specific company, a specific person, a specific service subscriber, or other selection from a group. A domain-specific language model can allow a system to recognize words that are known and frequently used by certain people even when they are unknown or uncommon among a broader range of speakers. A language model can include custom dictionaries of recognizable words and their pronunciations i.e. model correlates features of the speaker with an appropriate transcription style as with speaker indicators in transcription; therefore appropriately transcribing the media according to different speakers as explained in para. 76).

Regarding claim 7, Bradley teaches the method of claim 1 (see claim 1 above). In addition, Bradley teaches the method further comprising:
determining a transcription style based on the extracted one or more features and one or more models (Para. 97, When a video meeting is conducted, the video stream captured by the embedded cameras can comprise the speaker's head or body images. For example, the system can analyze the images and identify a speaker by his or her facial features that was previously registered or determined in the system i.e. extracted features are images from the video; furthermore, Para. 70, The system can enable customization of a language model for a specific company, a specific person, a specific service subscriber, or other selection from a group. A domain-specific language model can allow a system to recognize words that are known and frequently used by certain people even when they are unknown or uncommon among a broader range of speakers. A language model can include custom dictionaries of recognizable words and their pronunciations i.e. model correlates features of the speaker with an appropriate transcription style as with speaker indicators in transcription i.e. determination of transcription style is differentiating speakers within the media content), wherein the media is transcribed according to the determined transcription style (Para. 146, as shown in FIG. 8C, the system can match the text of the later speaker to align with the text of the earlier speaker according to their recorded timestamps i.e. determination is made of different speakers through models and extracted features; therefore, transcription contains speech correlated to different speakers as seen in fig. 8C with Alice and Bob).

Regarding claim 9, Bradley teaches the method of claim 2 (see claim 2 above). In addition, Bradley teaches wherein:
the user is notified of the transcription along with audio or video of the media (Para. 74, can continuously update the transcript according to the audio streams in real-time. While two or more editors can jointly modify the stored transcripts, the system can, at the same time, push transcript changes to all terminals so that they can display the latest version of the transcripts i.e. transcript appears along with audio stream; furthermore, para. 141 depicts video of the media along with viewing transcription, while replaying the relevant audio segments, the system can also play the corresponding video data on the display 82. This would enable the editor to visually check and confirm the content of the transcript while listening to the audio); and
the transcription notification is synchronized with the audio or video of the media (Para. 153, FIG. 10 shows a GUI for playback of a meeting with multimedia in synchronization with a transcript i.e. transcription appears in sync with audio or video of the media), wherein the synchronization is based on the media's content (Para. 81, the system can provide the transcript 35 for live viewing by meeting participants in real-time i.e. live viewing indicates sync of transcript based on media’s content).

Regarding claim 10, Bradley teaches the method of claim 1 (see claim 1 above). In addition, Bradley teaches,
wherein the transcription includes one or more timestamps (Para. 132, Transcripts can also contain metadata such as tags with timestamps of the beginning of speech segments and possibly the ending of speech segments).

Regarding claim 11, Bradley teaches the method of claim 2 (see claim 2 above). In addition, Bradley teaches,
wherein the transcription is searchable by the user (Para. 127, A document editing application can read the transcript and continuously update the display as new text is combined. Furthermore, some editing applications can provide a search capability to search for text within the transcript).

Regarding claim 13, the claim is directed to a computer program product counterpart of method claim 1 and is rejected on the same grounds as method claim 1.
A computer program product for transcribing media, the computer program product (Para. 60, These can be implemented with computers that execute software instructions stored on non-transitory computer-readable media) comprising:
one or more non-transitory computer-readable storage media and program instructions stored on the one or more non-transitory computer-readable storage media capable of performing a method (Para. 60, These can be implemented with computers that execute software instructions stored on non-transitory computer-readable media; furthermore, paras. 223-224 shows an example of non-transitory computer readable medium that stores instructions executed by a computer to perform the steps with the difference being a rotating magnetic disk to flash random access memory).

Regarding claim 14, the claim is directed to a computer program product counterpart of method claim 2 and is rejected on the same grounds as method claim 2.

Regarding claim 15, the claim is directed to a computer program product counterpart of method claim 3 and is rejected on the same grounds as method claim 3.

Regarding claim 16, the claim is directed to a computer program product counterpart of method claim 4 and is rejected on the same grounds as method claim 4.

Regarding claim 17, the claim is directed to a computer system counterpart of method claim 1 and is rejected on the same grounds as method claim 1.
A computer system for transcribing media, the computer system (Para. 60, These can be implemented with computers that execute software instructions stored on non-transitory computer-readable media)  comprising:
one or more computer processors, one or more computer-readable storage media, and program instructions stored on the one or more of the computer-readable storage media for execution by at least one of the one or more processors capable of performing a method (Para. 60, These can be implemented with computers that execute software instructions stored on non-transitory computer-readable media; furthermore, paras. 223-224 shows an example of non-transitory computer readable medium that stores instructions executed by a computer to perform the steps with the difference being a rotating magnetic disk to flash random access memory).

Regarding claim 18, the claim is directed to a computer system counterpart of method claim 2 and is rejected on the same grounds as method claim 2.

Regarding claim 19, the claim is directed to a computer system counterpart of method claim 3 and is rejected on the same grounds as method claim 3.

Regarding claim 20, the claim is directed to a computer system counterpart of method claim 4 and is rejected on the same grounds as method claim 4.


Claims 1, 5-6, 13, and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Steelberg et al. (US Pub. No. 2020/0286485 A1), hereinafter Steelberg.

Regarding claim 1, Steelberg teaches a computer-implemented method for transcribing media (Para. 34, Transcripts may generally include transcribed texts of the audio portion of the media files. Transcript may also generally include features of the image portions of the media files. Transcripts may be generated and stored in segments having start times, end times, duration, text specific metadata, etc. Process 100 may use one or more network-connected servers, each including one or more processors and non-transitory computer readable memory storing instructions), the method comprising:
collecting media (Para. 36, Process 100 starts at 105 where an input media file to be transcribed is received and processed by a plurality of data preprocessors);
extracting one or more features from the media (Para. 36, an audio analysis preprocessor that is configured to extract audio features such as mel-frequency cepstral coefficients (MFCC)); and
transcribing the media based on the extracted one or more features and one or more models (Para. 37, Once the input media file is received, it can go through several preprocessors to condition, normalize, and/or to extract features in the content (data) of the input media file prior to being used as inputs of a transcription model; furthermore, para. 40 indicates transcribing the media which is outputted from the one or more models and extracted features).

Regarding claim 5, Steelberg teaches the method of claim 1 (see claim 1 above). In addition, Steelberg teaches,
receiving feedback indicative of whether the transcription was accurate (Para. 32, on a high level, the transcription method and system with reinforcement learning has the capability to ingest feedback, in the form of a reward function, to generate a revised (improved) transcription based on the received reward function. The revised transcription is then analyzed, and a second reward function is generated as feedback to the transcription engine, which then uses the second reward function to generate yet another revised transcription. This process is repeated until the desired accuracy threshold for the transcription is reached); and
adjusting the one or more models based on the received feedback (Para. 47, One or more steps 110 through 165 can be considered to be part of a “conductor” which is configured to: train transcription models; select a transcription engine based on a trained model to transcribe the input media file; identify one or more segments of the transcribed media file with a low confidence of accuracy; select a new transcription engine to transcribe the one or more segments with a low confidence of accuracy; develop a new micro training model (e.g., reinforcement learning enabled transcription model) to transcribe one or more segments that cannot be transcribed to a desired level of accuracy by previously selected transcription engines (after several cycles); transcribe the one or more segments using a new micro engine, which is based on the new micro training model i.e. feedback is received from reward system indicative of accuracy of transcription and adjusting the models through a micro training model/engine depending on the received feedback).

Regarding claim 6, Steelberg teaches the method of claim 1 (see claim 1 above). In addition, Steelberg teaches,
collecting training data (Para. 46, The content of the accumulator may be joined with training data sets at 160 (described further below), which may then be used to further train one or more transcription models at 165);
extracting training features from the training data (Para. 50, Each training data set may include data from one or more media files and their corresponding features profiles and transcripts. Each training data set may be a segment of or an entire portion of a large media file. Additionally, each time a media file is ingested and transcribed, it can be added to the training data set i.e. as a media file is ingested, those features are extracted and are included as feature profiles as training data); and
training the one or more models based on the extracted training features (Para. 50, further describes Training module 200 may train one or more transcription models to improve an engine or to optimize the selection of engines using one or more training data sets from training database 215. Training module 200, shown with training modules 200-1 and 200-2, may train a transcription model using multiple, e.g., thousands or millions, of training data sets. Each training data set may include data from one or more media files and their corresponding features profiles and transcripts).

Regarding claim 13, the claim is directed to a computer program product counterpart of method claim 1 and is rejected on the same grounds as method claim 1.
A computer program product for transcribing media, the computer program product (Para. 131, FIG. 12 illustrates an exemplary system or apparatus 1200 in which processes 100, 200, 300, 400, 500, 800 and 100 can be implemented… Para. 134 indicates One or more processing circuits 1204 in the processing system may execute software or software components. Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, etc.) comprising:
one or more non-transitory computer-readable storage media and program instructions stored on the one or more non-transitory computer-readable storage media capable of performing a method (Para. 135, The software may reside on machine-readable medium 1206. The machine-readable medium 1206 may be a non-transitory machine-readable medium. A non-transitory processing circuit-readable, machine-readable or computer-readable medium… the various methods described herein may be fully or partially implemented by instructions and/or data that may be stored in a “machine-readable medium,” “computer-readable medium,” “processing circuit-readable medium” and/or “processor-readable medium” and executed by one or more processing circuits, machines and/or devices).

Regarding claim 17, the claim is directed to a computer system counterpart of method claim 1 and is rejected on the same grounds as method claim 1.
A computer system for transcribing media, the computer system (Para. 131, FIG. 12 illustrates an exemplary system or apparatus 1200 in which processes 100, 200, 300, 400, 500, 800 and 100 can be implemented… Para. 134 indicates One or more processing circuits 1204 in the processing system may execute software or software components. Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, etc.)  comprising:
one or more computer processors, one or more computer-readable storage media, and program instructions stored on the one or more of the computer-readable storage media for execution by at least one of the one or more processors capable of performing a method (Para. 135, The software may reside on machine-readable medium 1206. The machine-readable medium 1206 may be a non-transitory machine-readable medium. A non-transitory processing circuit-readable, machine-readable or computer-readable medium… the various methods described herein may be fully or partially implemented by instructions and/or data that may be stored in a “machine-readable medium,” “computer-readable medium,” “processing circuit-readable medium” and/or “processor-readable medium” and executed by one or more processing circuits, machines and/or devices).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 8 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Bradley in view of Talieh et al. (US Pat. No. 11,315,569 B1), hereinafter Talieh.

Regarding claim 8, Bradley teaches the method of claim 7 (see claim 7 above). In addition, Bradley teaches,
wherein the transcription style is selected from a group comprising a transcription (Para. 175, a selection of the hippopotamuses sample image 112 in the grid of FIG. 11A. After receiving a selection of the sample image 112, the system can display the corresponding text transcript 114, i.e., a selection is made from various configurations comprising a transcription), outline (Para. 146, as shown in FIG. 8C, the system can match the text of the later speaker to align with the text of the earlier speaker according to their recorded timestamps, i.e., an outline of speakers with related transcription), presentation with notes (Para. 166, lower threshold for a slide-view of a presentation, i.e., as with FIG. 11A, in which a presentation with slides is provided with associated notes on the corresponding animals, which may correspond to lecture notes or to the transcription itself as an application of use recognized in para. 3), blog with comments (Para. 117, FIG. 7, the display 70 can show a speaker name at the beginning of the transcribed text of the speaker. As such, the speaker name can be an indicator to mark a speaker. According to some embodiments, other indicators, such as avatars and user IDs, can also be adopted; furthermore, para. 124 indicates that transcriptions may contain comments of editors, i.e., the structure of various speakers is written as a blog with comments included; see figure 7, specifically the text related to speakers and comments such as “I’ll handle this” from Porter), and tutorial with examples (Para. 166, lower threshold for a slide-view of a presentation, i.e., as with FIG. 11A, in which a presentation with slides is provided with associated notes on the corresponding animals, which may correspond to lecture notes (a lecture may be a tutorial, as it is teaching) or to the transcription itself as an application of use recognized in para. 3, where the examples may be images that are used to enrich).
However, Bradley fails to explicitly disclose a summary as one of the transcription styles.
In a related field of endeavor (e.g., generating a transcription of media), Talieh teaches a generative summary: in some embodiments, analytics subsystem 118 may provide a short, automatically generated abstract or summary of each transcript, condensing the gist of the information about a meeting. For example, the summary can be “This meeting was about discussing tasks to reach our next milestone on project A”, see lines 45-50 on col. 9.
Bradley, modified to include the features disclosed by Talieh, discloses:
wherein the transcription style is selected from a group comprising a transcription, outline, summary, presentation with notes, blog with comments, and tutorial with examples (e.g., Bradley’s computerized method now also including a summary among the transcription styles, as taught by Talieh, see lines 45-50 on col. 9).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the teachings of Talieh to the method of Bradley. Doing so would have been predictable to one of ordinary skill in the art given the similar nature of the two disclosures, for example transcribing from media. Further, doing so would have provided the users of Bradley with the added benefits described by Talieh: analytics subsystem 118 may not display all information, and not everything will be explicitly shown to speakers. Some of this knowledge could be used to make informed decisions for enhancing user experience and displaying transcripts in a better way which maximizes productivity. Some information can be highlighted (e.g., key phrases, named entities) to help users identify important parts of transcripts. In some embodiments, based on the action items and intents determined by analytics subsystem 118, collaboration subsystem 112 can create tasks or tickets in project management tools, saving time for users, see lines 51-65 on col. 9.

Regarding claim 12, Bradley teaches the method of claim 1 (see claim 1 above). In addition, Bradley teaches,
wherein the one or more features include frequency (Para. 188, filter banks may be applied to determine values for one or more frequency domain features, such as Mel-Frequency Cepstral Coefficients, i.e., features extracted from the audio), vocabulary (Para. 64, A networked server can perform higher accuracy ASR using larger models, large vocabulary, organization-specific vocabulary, custom phrase replacement, natural language grammar processing, or some combination of such features and techniques, i.e., the language model is correlated with vocabulary features extracted from the media), and facial expressions (Para. 97, the system can analyze the user's facial movement, such as mouth movement, i.e., facial movements are particular expressions).
However, Bradley fails to explicitly disclose:
wherein the one or more features include topics, importance, tones, moods, pointing, waving, eye direction, and eye movement.
In a related field of endeavor (e.g. generating a transcription of a media), Talieh teaches, feature extraction subsystem 116 processes an audio recording (e.g., first audio recording 204a or other audio recordings) or a transcript (e.g., first speaker-specific transcript 214a, meeting transcript 216, or other transcript) to extract multiple features associated with the meeting. A feature describes or indicates a characteristic of the meeting. The features can include vocabulary, semantic information of conversations, summarization of a call, voice signal associated features (e.g., a speech rate, a speech volume, a tone, and a timber), emotions of speakers (e.g., fear, anger, happiness, timidity, fatigue), personal attributes of speakers (e.g., an age, an accent, and a gender), non-aural features (e.g., visual features such as body language or facial expressions of the speaker i.e. including movements and directions of eye and body language with pointing or waving), or any other features. The features can also include subject matter related features such as a subject of the meeting, an industry or technology area related to the meeting, a product or service discussed during the meeting, or other features, see lines 14-31 on col. 8.
It would have been obvious to one of ordinary skill in the art at the time the invention was filed to apply the teachings of Talieh to the method of Bradley. Doing so would have been predictable to one of ordinary skill in the art given the similar nature of the two disclosures, for example transcribing from media. Further, doing so would have provided the users of Bradley with the added benefits recognized by Talieh: the analytics subsystem 118 may not display all information, and not everything will be explicitly shown to speakers. Some of this knowledge could be used to make informed decisions for enhancing user experience and displaying transcripts in a better way which maximizes productivity. Some information can be highlighted (e.g., key phrases, named entities) to help users identify important parts of transcripts. In some embodiments, based on the action items and intents determined by analytics subsystem 118, collaboration subsystem 112 can create tasks or tickets in project management tools, saving time for users, see lines 51-65 on col. 9. Furthermore, analytics subsystem 118 processes the features to determine various analytics that can provide different types of information regarding the meeting or speakers, as recognized by Talieh, see lines 38-43 on col. 8, i.e. information may be gathered on the following features to determine information regarding the meeting or speakers.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Danilo et al. (US 2021/0050000 A1) teaches, a system for generating a personality assessment that uses multimodal behavioral signal processing technology and machine learning prediction technology. This system takes a video as input, processes it through an artificial intelligence software built for extracting hundreds of behavioral features, and consequently generates an accurate and reliable personality assessment with its machine-learning predictive software. The personality assessment is based on the five-factor model (FFM), also known as the big 5 personality traits.

Thomson et al. (US 2020/0243094 A1) teaches, a method may include obtaining first audio data originating at a first device during a communication session between the first device and a second device. The method may also include obtaining an availability of revoiced transcription units in a transcription system and in response to establishment of the communication session, selecting, based on the availability of revoiced transcription units, a revoiced transcription unit instead of a non-revoiced transcription unit to generate a transcript of the first audio data. The method may also include obtaining revoiced audio generated by a revoicing of the first audio data by a captioning assistant and generating a transcription of the revoiced audio using an automatic speech recognition system. The method may further include in response to selecting the revoiced transcription unit, directing the transcription of the revoiced audio to the second device as the transcript of the first audio data.

Taple et al. (US 2018/0315429 A1) teaches, techniques for accurately recording sworn deposition testimony without use of a court reporter are described herein. According to these techniques, participants in a deposition or other legal proceeding are identified in such a manner that speech in one or more audio files representing the deposition can be associated with the respective participants. The association of participants with recorded speech is used to automatically generate an accurate transcript sequentially reflecting what was said at the deposition proceeding and by which of the respective participants.

Raanani et al. (US 2018/0046710 A1) teaches, automatically generating a playlist of conversations having a specified moment. A moment can be occurrence of a specific event or a specific characteristic in a conversation, or any event that is of specific interest for an application for which the playlist is being generated. For example, a moment can include laughter, fast-talking, objections, response to questions, a discussion on a particular topic such as budget, behavior of a speaker, intent to buy, etc., in a conversation. A moment identification system analyzes each of the conversations to determine if one or more features of a conversation correspond to a specified moment, and includes those of the conversations in the playlist having one or more features that correspond to the specified moment. The playlist may include a portion of a conversation that has the specified moment rather than the entire conversation.

Yoshioka et al. (US 2020/0349950 A1) teaches, a computer implemented method processes audio streams recorded during a meeting by a plurality of distributed devices. Operations include performing speech recognition on each audio stream by a corresponding speech recognition system to generate utterance-level posterior probabilities as hypotheses for each audio stream, aligning the hypotheses and formatting them as word confusion networks with associated word-level posterior probabilities, performing speaker recognition on each audio stream by a speaker identification algorithm that generates a stream of speaker-attributed word hypotheses, formatting speaker hypotheses with associated speaker label posterior probabilities and speaker-attributed hypotheses for each audio stream as a speaker confusion network, aligning the word and speaker confusion networks from all audio streams to each other to merge the posterior probabilities and align word and speaker labels, and creating a best speaker-attributed word transcript by selecting the sequence of word and speaker labels with the highest posterior probabilities. Specifically, para. 122 indicates, analysis of video data may indicate an eye gaze or track eye movements to infer where a user is looking. Eye gaze analysis may result in control commands for the AI application, and may differ based on fusion with audio data.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN E AMAYA HERNANDEZ whose telephone number is (571)272-2484. The examiner can normally be reached Monday - Friday 7:30 am - 3:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Flanders, can be reached on 571-272-7516. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.E.A./             Examiner, Art Unit 2655   

/ANDREW C FLANDERS/             Supervisory Patent Examiner, Art Unit 2655