DETAILED ACTION
Response to Arguments
The amendment filed 5/02/2022 has been entered and made of record.

The application has pending claim(s) 1-20.

In response to the amendments filed on 5/02/2022:
The amendments addressing the “Objections to the claims” have been entered and therefore the Examiner withdraws the objections to the claims.
The amendments addressing the “Claim rejections under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph” have been entered and therefore the Examiner withdraws the rejections under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph.

The Applicant's arguments with respect to claims 1-20 have been considered but are moot in view of the new ground(s) of rejection at least because the Applicant has amended independent claim(s) 1, 11, and 20.
Applicant’s arguments, see pages 6-9, filed 5/02/2022, with respect to the rejection(s) of claim(s) 1-20 under 35 U.S.C. 103 have been fully considered and are persuasive with regard to the amended limitations.  Therefore, the rejections have been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in further view of the newly found prior art reference Farha et al (“MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation” – arXiv – March 5, 2019, pages 1-10).  Further discussions are addressed in the prior art rejection section below.  Therefore claims 1-20 are still not in condition for allowance because they are still not patentably distinguishable over the prior art references.

Priority
Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, 365(c), or 386(c) is acknowledged. Applicant has not complied with one or more conditions for receiving the benefit of an earlier filing date under 35 U.S.C. 119(e), 120, 121, 365(c), or 386(c) as follows:
The later-filed application must be an application for a patent for an invention which is also disclosed in the prior application (the parent or original nonprovisional application or provisional application). The disclosure of the invention in the parent application and in the later-filed application must be sufficient to comply with the requirements of 35 U.S.C. 112(a) or the first paragraph of pre-AIA 35 U.S.C. 112, except for the best mode requirement.  See Transco Products, Inc. v. Performance Contracting, Inc., 38 F.3d 551, 32 USPQ2d 1077 (Fed. Cir. 1994).
The disclosure of the prior-filed application, Application No. 62/806,164, fails to provide adequate support or enablement in the manner provided by 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph for one or more claims of this application.  Accordingly, amended claims 1-20 are not entitled to the benefit of the prior-filed application because the prior-filed application fails to provide adequate support or enablement toward at least the amended truncated mean-square error (T-MSE) and the combination of the T-MSE and the sigmoid binary cross-entropy loss.  Therefore this current application, Application No. 16/791,919, is entitled to the benefit of only its prior-filed application 62/944,033 with priority date 12/05/2019.

Claim Objections
Claims 8, 11, 18, and 20 are objected to because of the following informalities:  
Claim 8: Due to the amendments, the equation of claim 8 is too blurry to be legible.
Claim 11 at line 8; and claim 18 at line 3 respectively: Due to the amendments, “neural,” should be -- neural network, --.
Claim 11 at lines 10-11; and claim 20 at line 10 respectively: Due to the amendments, “loss; and a regression” should be -- loss and a regression --.
Claim 11 at the last line: Due to the amendments, “detection. .” should be -- detection. --.
Claim 20 at line 15: Due to the amendments, “the binary” should be -- the sigmoid binary --.

Appropriate correction is required.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4, 9-11, 14, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Divine et al (US 2018/0247023 A1, as applied in previous Office Action) in view of Risman et al (US 2018/0033144 A1, as applied in previous Office Action) and further in view of Farha et al (“MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation” – arXiv – March 5, 2019, pages 1-10).
Regarding claim 1, Divine teaches a system for automatically generating data structures adapted for storing classifications relating to an adverse event based on audio or video data, the classifications based at least on a plurality of classification tasks (AR assistance module including procedure characterization component, procedure identification component, and procedure assessment component to identify the type of procedure being performed and evaluate the correctness of the procedure [potential errors resulting in severity classification] based on the descriptive information generated from visual or audio input, Figures 1 and 17, Paragraphs 0005, 0062-0064, 0073-0074, 0076, 0116, and 0176), the system comprising: a processor, operating in conjunction with computer memory, the processor configured to: receive a set of audio or video data (Paragraph 0029, lines 4-10; Paragraph 0073, lines 1-9); extract, using a feature extractor neural network, a vector of latent features from the set of audio or video data (descriptive information [latent features] generated by procedure characterization component according to video image data or audio input, Paragraphs 0068-0070). 
However, Divine fails to explicitly disclose provide, to each of a plurality of time-based classifiers, the vector of latent features from the feature extractor neural network, each time-based classifier corresponding to a classification task of the plurality of classification tasks; train the feature extractor neural network on a training data set using a sigmoid binary cross-entropy loss and a regression loss; and train each time-based classifier of the plurality of time-based classifiers separately on each classification task of the plurality of classification tasks with a loss function that includes at least the sigmoid binary cross-entropy loss and the regression loss; wherein the regression loss is a truncated mean-square error (T-MSE) that minimizes a number of transitions from one action to another, and the combination of the regression loss and the sigmoid binary cross-entropy loss adapts the feature extractor neural network for multi-task event detection.
Risman teaches extract, using a feature extractor neural network, a vector of latent features from the set of audio or video data (features extracted from each 2D image by using convolutional neural network [CNN], Fig. 4, Paragraphs 0037-0038); provide, to each of a plurality of time-based classifiers, the vector of latent features from the feature extractor neural network, each time-based classifier corresponding to a classification task of the plurality of classification tasks (features generated by convolution neural network are fed through recurrent neural network [RNN, time-based classifier], Fig. 4, Paragraph 0039; Fig. 12, multiple CNN-LSTM architectures used for different applications such as detection of intracranial hemorrhage, lung node detection, etc., Paragraphs 0083-0084, 0087-0088); train the feature extractor neural network on a training data set using a sigmoid binary cross-entropy loss (CNN trained using a sigmoid activation and binary cross-entropy loss for optimization, Figures 9 and 12, Paragraph 0070); and train each time-based classifier of the plurality of time-based classifiers separately on each classification task of the plurality of classification tasks with a loss function that includes at least the sigmoid binary cross-entropy loss (RNN trained using a sigmoid activation to optimize a binary cross-entropy loss using the Adam optimizer, Figures 10 and 12, Paragraph 0078).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Divine’s system using Risman’s teachings by including CNN-RNN models in Divine’s AR assistance module in order to improve the anomaly detection and determine the correct procedure to be performed (Risman, Paragraphs 0006-0007).
However, Divine as modified by Risman fails to explicitly disclose a combination of the sigmoid binary cross-entropy loss and a regression loss wherein the regression loss is a truncated mean-square error (T-MSE) that minimizes a number of transitions from one action to another, and the combination of the regression loss and the sigmoid binary cross-entropy loss adapts the feature extractor neural network for multi-task event detection.
Farha teaches a combination of the sigmoid binary cross-entropy loss and a regression loss wherein the regression loss is a truncated mean-square error (T-MSE) that minimizes a number of transitions from one action to another, and the combination of the regression loss and the sigmoid binary cross-entropy loss adapts the feature extractor neural network for multi-task event detection (Sections 3.3 and 4.3).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Divine’s system, as modified by Risman, using Farha’s teachings by including the combination of sigmoid cross-entropy loss and truncated mean squared error loss in Divine’s [as modified by Risman] loss function in order to improve the quality of the predictions (Farha, Sections 3.3 and 4.3).
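
For illustration only, the loss combination discussed for claim 1 can be sketched as follows. This is a minimal sketch and not code from the application or from any cited reference. The function names, the truncation threshold tau, the weight lam, and the normalization by the number of frames times the number of classes are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over all frames t and classes c."""
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            count += 1
    return total / count

def t_mse(probs, tau=4.0, eps=1e-7):
    """Truncated mean-square error over frame-to-frame log-probability changes.

    Each absolute difference is capped at tau (an assumed value) before squaring,
    so small fluctuations are penalized while genuine transitions are not
    penalized without bound.
    """
    T, C = len(probs), len(probs[0])
    total = 0.0
    for t in range(1, T):
        for c in range(C):
            cur = math.log(max(probs[t][c], eps))
            prev = math.log(max(probs[t - 1][c], eps))
            total += min(abs(cur - prev), tau) ** 2
    return total / (T * C)

def combined_loss(logits, labels, lam=0.15, tau=4.0):
    """Sigmoid binary cross-entropy plus lam times T-MSE (lam is an assumed weight)."""
    probs = [[sigmoid(z) for z in row] for row in logits]
    return bce(probs, labels) + lam * t_mse(probs, tau)
```

In this sketch a prediction sequence that flips between frames incurs a larger smoothing term than a constant sequence, which is the sense in which the regression term discourages spurious transitions from one action to another.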

Regarding claim 4, the combination of Divine as modified by Risman and Farha teaches the system of claim 1, wherein the feature extractor neural network is a three dimensional (3D) or two-dimensional (2D) convolutional network (Risman, features extracted from each 2D image by using convolutional neural network [CNN], Fig. 4, Paragraphs 0037-0038; Risman, 2D CNN fed into RNN, Paragraph 0042, lines 10-12, Paragraph 0049).  See claim 1 for obviousness and motivation statements.

Regarding claim 9, the combination of Divine as modified by Risman and Farha teaches the system of claim 1, wherein the processor is configured to receive a set of audio data, and the feature extractor neural network extracts the vector of latent features from a combination of the set of audio data and the set of video data (Divine, Figures 8-9, the computer system can receive various forms of input that can be used to monitor and characterize various aspects of a procedure being performed, e.g., image data and audio data, Paragraph 0029, lines 4-18, Paragraph 0030; Divine, descriptive information [latent features] generated by procedure characterization component according to video image data and/or audio input, Paragraphs 0069, Paragraph 0070, Col. 1, lines 1-2, Col. 2, lines 1-5, Paragraphs 0071 and 0073).

Regarding claim 10, the combination of Divine as modified by Risman and Farha teaches the system of claim 9, wherein the training data set includes both training video data and training audio data (Divine, procedure characterization component can generate descriptive information based on a combination of data which can be learned by machine learning techniques for defined procedure events, Paragraphs 0071 and 0073; Divine, the optimization component can perform machine learning or deep learning techniques to improve the determinations made by the procedure assessment component using compiled data including descriptive information generated from image and audio data, Paragraph 0109).

Regarding claims 11, 14, and 19, the rationale provided in the rejection of claims 1, 4, and 9 is incorporated herein, respectively.  Further, Divine discloses a system comprising a processor, operating in conjunction with computer memory, to execute the method (Divine, Fig. 1, Paragraphs 0005, 0042-0043, and 0176).

Regarding claim 20, the rationale provided in the rejection of claim 1 is incorporated herein. Further, Divine teaches a non-transitory computer readable medium storing machine interpretable instructions, the machine interpretable instructions, which when executed by a processor, cause the processor to perform the method (Divine, Fig. 1, Paragraphs 0005, 0042-0043, and 0185).


Claims 2-3 and 12-13 are rejected under 35 U.S.C. 103 as being unpatentable over Divine as modified by Risman and Farha, and further in view of Liu et al (“Bundled camera paths for video stabilization”, ACM Transactions on Graphics (TOG), July 2013, pages 1–10, as applied in previous Office Action).  The teachings of Divine as modified by Risman and Farha have been discussed above.
Regarding claim 2, the combination of Divine as modified by Risman and Farha teaches the system of claim 1, but fails to disclose wherein the set of audio or video data includes a set of video frames that have been stabilized to reduce camera motion through the use of bundled-camera path stabilization that reduces jitter and smooths camera paths so that the latent features are accumulated across a plurality of frames.
Liu teaches wherein the set of audio or video data includes a set of video frames that have been stabilized to reduce camera motion through the use of bundled-camera path stabilization that reduces jitter and smooths camera paths so that the latent features are accumulated across a plurality of frames (a bundle of camera paths are smoothed for video stabilization by optimizing a single camera path to remove jitters and then doing a space-time optimization of all paths to enforce smoothness between neighboring camera paths, Section 4 – Path Optimization).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Divine’s system, as modified by Risman and Farha, using Liu’s teachings by including bundled-camera path stabilization in Divine’s [as modified by Risman and Farha] AR assistance module in order to reduce jitter in video images captured by a camera and improve image quality for feature extraction (Liu, smoothing bundled-camera path to maintain spatial and temporal coherences across frames, Page 2/10, Col. 1, Paragraph 2).

Regarding claim 3, the combination of Divine as modified by Risman, Farha, and Liu teaches the system of claim 2, wherein stabilization includes warping images to align each frame's camera view based at least on homography (Liu, warping from frame t to frame t+1 based on homography, Page 3/10, Section 3.1, Model; Section 3.3, Pre-warping).  See claim 2 for obviousness and motivation statements.
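
For illustration only, the basic operation underlying homography-based warping can be sketched as mapping a pixel coordinate through a 3x3 matrix. This is a minimal sketch of the general projective mapping, not Liu's bundled-path method. The function name and the nested-list matrix layout are assumptions made for the example.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as row-major nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to image coordinates.
    return xh / w, yh / w
```

Warping a whole frame amounts to applying such a mapping (or its inverse) at every pixel and resampling the image.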

Regarding claims 12 and 13, the rationale provided in the rejection of claims 2 and 3 is incorporated herein, respectively.  Further, Divine discloses a system comprising a processor, operating in conjunction with computer memory, to execute the method (Divine, Fig. 1, Paragraphs 0005, 0042-0043, and 0176).


Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Divine as modified by Risman and Farha as applied to claim 1 above, and further in view of Darty (US 2020/0240840 A1, as applied in previous Office Action).  The teachings of Divine as modified by Risman and Farha have been discussed above.
Regarding claim 5, the combination of Divine as modified by Risman and Farha teaches the system of claim 1, but fails to explicitly disclose wherein the classification tasks include at least bleeding and thermal injury detection, and wherein the classification tasks are causally distinct and include distinguishing active injury events from prior injury artifacts.
Darty teaches wherein the classification tasks include at least bleeding and thermal injury detection, and wherein the classification tasks are causally distinct and include distinguishing active injury events from prior injury artifacts (multispectral imaging system used to diagnose medical conditions by identifying wounds, burn, and capable of determining new or old injury, Paragraphs 0163, 0191, 0203, and 0215).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Divine’s system, as modified by Risman and Farha, using Darty’s teachings by including injury detection in Divine’s [as modified by Risman and Farha] medical procedures in order to assist in image-guided surgery (Darty, Paragraph 0219, lines 1-6).

Regarding claim 15, the rationale provided in the rejection of claim 5 is incorporated herein. Further, Divine discloses a system comprising a processor, operating in conjunction with computer memory, to execute the method (Divine, Fig. 1, Paragraphs 0005, 0042-0043, and 0176).


Claims 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Divine as modified by Risman and Farha as applied to claim 1 above, and further in view of Lin et al (“Focal Loss for Dense Object Detection”, 2017 IEEE International Conference on Computer Vision (ICCV) - pages 2999-3007, as applied in previous Office Action).  The teachings of Divine as modified by Risman and Farha have been discussed above.
Regarding claim 6, the combination of Divine as modified by Risman and Farha teaches the system of claim 1, but fails to disclose wherein the loss function for each time-based classifier further includes focal loss.
Lin teaches wherein the loss function for each time-based classifier further includes focal loss (focal loss is used to balance the loss from different classes by adding a weight factor to the cross entropy function, Section 3 – Focal Loss, subsections 3.1 and 3.2). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Divine’s system, as modified by Risman and Farha, using Lin’s teachings by including focal loss in Divine’s [as modified by Risman and Farha] loss function in order to reduce the effects of imbalanced classes on the training of neural network models (Lin, Section 6. Conclusion).
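
For illustration only, a focal loss of the kind discussed above, in which a weight factor is added to the cross-entropy function, can be sketched for a single binary prediction as follows. This is a minimal sketch. The function name and the default values of gamma and alpha are assumptions for the example and are not taken from the claims.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one binary prediction.

    p is the predicted probability of the positive class and y is the label (0 or 1).
    The factor (1 - p_t) ** gamma shrinks the loss of well-classified examples,
    and alpha_t balances positive against negative examples.
    """
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to an alpha-weighted cross-entropy, which makes the relationship to the ordinary cross-entropy loss explicit.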

Regarding claim 16, the rationale provided in the rejection of claim 6 is incorporated herein.  Further, Divine discloses a system comprising a processor, operating in conjunction with computer memory, to execute the method (Divine, Fig. 1, Paragraphs 0005, 0042-0043, and 0176).


Claims 7 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Divine as modified by Risman and Farha as applied to claim 1 above, and further in view of Taieb et al (“Uncertainty Driven Multi-loss Fully Convolutional Networks for Histopathology” -CVII-STENT/LABELS 2017 – pages 155-163, as applied in previous Office Action).  The teachings of Divine as modified by Risman and Farha have been discussed above.
Regarding claim 7, the combination of Divine as modified by Risman and Farha teaches the system of claim 1, but fails to disclose wherein the loss function for each time-based classifier further includes uncertainty loss.
Taieb teaches wherein the loss function includes uncertainty loss (using a measure of uncertainty to weight each term in the multi-loss objective function, Page 158, Paragraph 3, equation 2).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Divine’s system, as modified by Risman and Farha, using Taieb’s teachings by including uncertainty loss in Divine’s [as modified by Risman and Farha] loss function in order to optimize the learning of parameters in neural network models (Taieb, using uncertainty to weight each term to reduce the influence of uncertain terms on the total loss and hence on the update of the model’s parameters, Page 158, Paragraph 3).
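
For illustration only, weighting each term of a multi-loss objective by a learnable uncertainty can be sketched as follows. This is a minimal sketch of one common formulation of uncertainty weighting and is an assumption made for the example. It is not confirmed to match equation 2 of Taieb or the equation of claim 8. The function name and the log-variance parameterization are likewise hypothetical.

```python
import math

def uncertainty_weighted_loss(losses, log_vars):
    """Combine per-term losses using learnable log-variances s_i = log(sigma_i ** 2).

    Each term contributes 0.5 * exp(-s_i) * L_i + 0.5 * s_i. A larger variance
    lowers the weight on L_i, while the 0.5 * s_i term keeps the variance from
    growing without bound during training.
    """
    total = 0.0
    for loss, s in zip(losses, log_vars):
        total += 0.5 * math.exp(-s) * loss + 0.5 * s
    return total
```

In a training loop the entries of log_vars would be optimized together with the network parameters; here they are passed in as plain numbers to keep the sketch self-contained.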

Regarding claim 17, the rationale provided in the rejection of claim 7 is incorporated herein.  Further, Divine discloses a system comprising a processor, operating in conjunction with computer memory, to execute the method (Divine, Fig. 1, Paragraphs 0005, 0042-0043, and 0176).


Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Divine as modified by Risman and Farha as applied to claim 1 above, and further in view of Lin and Taieb.  The teachings of Divine as modified by Risman and Farha have been discussed above.
Regarding claim 8, the combination of Divine as modified by Risman and Farha teaches the system of claim 1, wherein the loss function is based on the relation where C is a number of classes, N is a number of samples, λ is a smoothing loss constant, LBCEcn is a corresponding binary cross entropy of class c and sample n, pcn is a confidence probability of class c at sample n (Farha, Section 3.3., equation 12) [see claim 1 for obviousness and motivation statements].  
However, the combination of Divine as modified by Risman and Farha fails to disclose wherein the loss function for each time-based classifier further includes both focal and uncertainty loss and σ2c is a learnable scalar added from the uncertainty loss.
Lin teaches wherein the loss function further includes focal loss (focal loss is used to balance the loss from different classes by adding a weight factor to the cross entropy function, Section 3 – Focal Loss, subsections 3.1 and 3.2). 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Divine’s system, as modified by Risman and Farha, using Lin’s teachings by including focal loss in Divine’s [as modified by Risman and Farha] loss function in order to reduce the effects of imbalanced classes on the training of neural network models (Lin, Section 6. Conclusion).
However, the combination of Divine as modified by Risman, Farha, and Lin fails to explicitly disclose wherein the loss function for each time-based classifier further includes uncertainty loss and σ2c is a learnable scalar added from the uncertainty loss.
Taieb teaches wherein the loss function includes uncertainty loss and σ2c is a learnable scalar added from the uncertainty loss (using a measure of uncertainty to weight each term in the multi-loss objective function, Pages 158-159, equations 2, 6, and 8).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to further modify Divine’s system, as modified by Risman, Farha, and Lin, using Taieb’s teachings by including uncertainty loss in Divine’s [as modified by Risman, Farha, and Lin] loss function in order to optimize the learning of parameters in neural network models (Taieb, using uncertainty to weight each term to reduce the influence of uncertain terms on the total loss and hence on the update of the model’s parameters, Pages 158-159).

Regarding claim 18, the rationale provided in the rejection of claim 8 is incorporated herein.  Further, Divine discloses a system comprising a processor, operating in conjunction with computer memory, to execute the method (Divine, Fig. 1, Paragraphs 0005, 0042-0043, and 0176).


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. CN 107451620 A discloses a multi-task loss function in equation 11.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BERNARD KRASNIC whose telephone number is (571)270-1357.  The examiner can normally be reached on Mon. - Thur. and every other Friday from 8am - 4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vincent Rudolph, can be reached at (571) 272-8243.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/Bernard Krasnic/
Primary Examiner, Art Unit 2661
May 26, 2022