Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
Applicant’s amendments to the claims no longer invoke 35 U.S.C. 112(f).

Response to Arguments
Applicant's arguments filed June 15th, 2022 have been fully considered but they are not persuasive.
Applicant asserts that the combination of Brock et al. (“Generative and discriminative voxel modeling with convolutional neural networks,” arXiv preprint arXiv:1608.04236 (2016); hereafter: Brock), Park et al. (US 2020/0167911 A1; hereafter: Park), and Zhang et al. (US 2020/0013165 A1; hereafter: Zhang) fails to disclose or suggest the limitations of claim 1, more specifically “a region of interest visualization part configured to generate a heat map which visualizes a region of interest identified in the 3D medical image in artificial intelligence generating a diagnostic result of the 3D medical image; and a 3D convolution structure disposed between the 3D Inception-Resnet block structure and the global pooling average structure … wherein the region of interest visualization part is configured to generate the heat map by multiplying first features which are output of the 3D convolution structure and weights learned at the fully connected layer and summing multiplications of the first features and the weights.” In particular, Applicant asserts that Zhang does not teach “generate the heat map by multiplying first features which are output of the 3D convolution structure and weights learned at the fully connected layer and summing multiplications of the first features and the weights.” Examiner respectfully disagrees. The previous Office Action cited ¶62 of Zhang as teaching “multiplying first features which are output of the 3D convolution structure and weights” because Zhang discloses the application of a set of weights to generate heat maps. As is commonly known to one of ordinary skill in the art, applying a set of weights to a set of features to generate a final output typically multiplies the weights with the features. Furthermore, ¶¶42, 44, 46, and 50 of Zhang disclose the use of a weighted add process, which multiplies weights with associated features and sums the results together, to generate the heat map and image visualization.
Therefore, Zhang teaches the limitation of “multiplying first features which are output of the 3D convolution structure and weights.”
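For illustration only, the weighted add operation discussed above can be sketched as follows. This is a minimal sketch of a generic class activation map computation, not the implementation of Zhang or of the claimed apparatus. The function name class_activation_map, the nested-list volume layout, and the numeric values are assumptions introduced for the example.

```python
from typing import List

# A 3D feature volume indexed as [depth][height][width] (assumed layout).
Volume = List[List[List[float]]]


def class_activation_map(features: List[Volume], weights: List[float]) -> Volume:
    """Return heat[d][h][w] = sum over k of weights[k] * features[k][d][h][w]."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature channel")
    depth = len(features[0])
    height = len(features[0][0])
    width = len(features[0][0][0])
    heat = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    for f_k, w_k in zip(features, weights):
        for d in range(depth):
            for h in range(height):
                for w in range(width):
                    # Multiply each feature value by its channel weight, then accumulate.
                    heat[d][h][w] += w_k * f_k[d][h][w]
    return heat


# Two hypothetical 1x2x2 feature channels (standing in for the output of the
# last 3D convolution) and one hypothetical fully connected weight per channel.
features = [
    [[[1.0, 0.0], [0.0, 2.0]]],
    [[[0.0, 3.0], [1.0, 1.0]]],
]
weights = [0.5, 2.0]
heat_map = class_activation_map(features, weights)
print(heat_map)  # [[[0.5, 6.0], [2.0, 3.0]]]
```

Each voxel of the resulting map is the sum of the per-channel products, which is the multiply-and-sum pattern that a weighted add process describes.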

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6 and 9-11 are rejected under 35 U.S.C. 103 as being unpatentable over Brock, and further in view of Park and Zhang.
Claims 1-6 and 9 recite an apparatus comprising structural components of a specific kind of convolutional neural network, the Inception-Resnet model. Brock discloses the general structure of an Inception-Resnet model for three-dimensional image processing and computer vision. Park discloses the use of neural networks and deep learning, and more specifically any convolutional neural network, to analyze medical images of muscle and tendon tears and classify the extent of the tears. Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified Brock with the teachings of Park to incorporate the use of an Inception-Resnet model in classifying muscle and tendon tears in medical images. The motivation in doing so would lie in the ability to process three-dimensional medical images and obtain accurate results. As the majority of the limitations are found in Brock, the claims will be rejected over Brock and in view of Park.
Regarding Claim 1, Brock, in view of Park, teaches: an automated classification apparatus for a shoulder disease comprising at least one hardware processor configured to implement (Park: Figure 5; Park: Abstract: “A determination process is performed at the data processing system to determine one or more characteristics relating to one or more abnormalities of the one or more target tendon”; Park: ¶15: “Optionally, the region of interest is a shoulder region of a human or animal body”): a 3D (three dimensional) Inception-Resnet block structure comprising a 3D Inception-Resnet structure (Brock: Figure 4, which shows an entire Inception-Resnet block architecture; Brock discloses a Voxception-Resnet network, which is functionally the same as the Inception-Resnet network but operates on 3D image data) configured to receive a 3D medical image of a patient’s shoulder and extract features from the 3D medical image (Park: ¶54: “medical image data is received at a data processing system. The data processing system may be an artificial intelligence-based system, such as a machine learning-based system”), and a 3D Inception-Downsampling structure configured to downsample information of a feature map including the features (Brock: Figure 4: “Voxception-ResNet 45 Layer Architecture. DS are Voxception-Downsample blocks.”; Brock: Section 3.1.1: “For downsampling layers (Figure 3, center), we stack 3x3x3 convolutions with strided pooling operations, using both max and average pooling, and concatenate those features with strided 3x3x3 and 1x1x1 convolutions.”); a global average pooling structure configured to operate an average pooling for an output of the 3D Inception-Resnet block structure (Brock: Figure 4: Global pool block); a fully connected layer disposed after the 3D global average pooling structure (Brock: Figure 4: FC1 block; Brock: Section 3.1.2: ¶2: “Our best-performing architecture is shown in Figure 4, and consists of an initial convolutional layer, four main units, each containing three stacked VRN [Voxception-ResNet] blocks and a Voxception-Downsample block, a final convolution with a residual connection and keep probability of 0.5, then a global pooling layer and two fully-connected layers.”), but does not explicitly teach a region of interest visualization part configured to generate a heat map which visualizes a region of interest identified in the 3D medical image in artificial intelligence generating a diagnostic result of the 3D medical image.
In a related art, Zhang teaches: a region of interest visualization part configured to generate a heat map which visualizes a region of interest identified in the 3D medical image in artificial intelligence generating a diagnostic result of the 3D medical image (Zhang: ¶28: “To facilitate localization of one or more diseases associated with the medical imaging data, the machine learning component 104 can perform a local pooling process for an activation map associated with a convolutional layer of the convolutional neural network. Additionally, or alternatively, the machine learning component 104 can generate the learned medical imaging output based on a class activation mapping process that applies a set of weights to a set of heat maps associated with the medical image data.”) for easier identification of a region of interest in an output of a neural network.
Therefore, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified Brock, in view of Park, with the above teachings of Zhang to incorporate the generation of a heat map. The motivation in doing so would lie in the ability to easily identify regions of interest from the output of a neural network.
Brock, in view of Park, and in further view of Zhang teaches: a 3D convolution structure disposed between the 3D Inception-Resnet block structure and the global pooling average structure (Brock: Figure 4: ResConv Block; Brock: Section 3.1.2: ¶2: “Our best-performing architecture is shown in Figure 4, and consists of an initial convolutional layer, four main units, each containing three stacked VRN [Voxception-ResNet] blocks and a Voxception-Downsample block, a final convolution with a residual connection and keep probability of 0.5, then a global pooling layer and two fully-connected layers.”), wherein the automated classification apparatus is configured to automatically classify the 3D medical image into a plurality of categories (Park: ¶31: “the determination process is a classification process for classifying the one or more target tendons with respect to the tendon tear based on the one or more characteristics”; Park: ¶33: “the classification attributed to the one or more target tendons is selected from a list of classifications including two or more classifications”), and further teaches: wherein the region of interest visualization part is configured to generate the heat map by multiplying first features which are output of the 3D convolution structure and weights learned at the fully connected layer and summing multiplications of the first features and the weights (Zhang: ¶62: “the methodology 1200 can additionally or alternatively include generating, by the system, the learned medical imaging output based on a class activation mapping process that applies a set of weights to a set of heat maps associated with the medical imaging data”; Zhang: Figure 8: element 820; Zhang: ¶28: “To facilitate localization of one or more diseases associated with the medical imaging data, the machine learning component 104 can perform a local pooling process for an activation map associated with a convolutional layer of the convolutional neural network. Additionally, or alternatively, the machine learning component 104 can generate the learned medical imaging output based on a class activation mapping process that applies a set of weights to a set of heat maps associated with the medical image data.”; Zhang: ¶50: “The global pooling layer process 818 can be followed by a weighted add process 820. The fully connected layer process 808 can also be followed by the weighted add process 820.”) for the same reasons stated above.
Regarding Claim 2, Brock, in view of Park, and in further view of Zhang, teaches: the automated classification apparatus of claim 1, wherein the plurality of the categories includes ‘None’ which means a rotator cuff tear of the patient’s shoulder is not present, and ‘Partial’, ‘Small’, ‘Medium’ and ‘Large’ according to a size of the patient’s rotator cuff tear (Park: ¶33: “the classification attributed to the one or more target tendons is selected from a list of classifications including two or more classifications, each classification in the list indicating: no tendon tear; a presence of a tendon tear; a partial tear; a low grade partial tear; a high grade partial tear; or a full tear.”; the rotator cuff is a group of muscles and tendons found in the shoulder).
Regarding Claim 3, Brock, in view of Park, and in further view of Zhang, teaches: the automated classification apparatus of claim 1, wherein the 3D medical image sequentially passes through a first 3D convolution structure (Brock: Figure 4: Input Block and Conv0 Block; Brock: Section 3.1.2: ¶2: “Our best-performing architecture is shown in Figure 4, and consists of an initial convolutional layer, four main units, each containing three stacked VRN [Voxception-ResNet] blocks and a Voxception-Downsample block, a final convolution with a residual connection and keep probability of 0.5, then a global pooling layer and two fully-connected layers.”), a first 3D Inception-Resnet block structure, a second 3D Inception-Resnet block structure (Brock: Figure 4: Block 1 and Block 2; Brock: Section 3.1.2: ¶2: “Our best-performing architecture is shown in Figure 4, and consists of an initial convolutional layer, four main units, each containing three stacked VRN [Voxception-ResNet] blocks and a Voxception-Downsample block, a final convolution with a residual connection and keep probability of 0.5, then a global pooling layer and two fully-connected layers.”; Brock discloses that the architecture in Figure 4 is their best-performing architecture, implying that there were other architectures having a varying number of components and different elements. Having two Inception-Resnet block structures instead of four would be a reasonable alternative and substitution), a second 3D convolution structure (Brock: Figure 4: ResConv Block), the global average pooling structure and the fully connected layer (Brock: Figure 4: Global pool and FC1 Block).
Regarding Claim 4, Brock, in view of Park, and in further view of Zhang, teaches: the automated classification apparatus of claim 1, wherein the 3D Inception-Resnet block structure comprises three of the 3D Inception-Resnet structures (Brock: Section 3.1.2: ¶2: “Our best-performing architecture is shown in Figure 4, and consists of an initial convolutional layer, four main units, each containing three stacked VRN [Voxception-ResNet] blocks and a Voxception-Downsample block, a final convolution with a residual connection and keep probability of 0.5, then a global pooling layer and two fully-connected layers.”).
Regarding Claim 5, Brock, in view of Park, and in further view of Zhang, teaches: the automated classification apparatus of claim 1, wherein the 3D Inception-Resnet structure comprises: a first 3D convolution structure, a second 3D convolution structure and a third 3D convolution structure which are connected in series and form a first path (Brock: Figure 3: Voxception-Resnet diagram; shows 3 convolution layers in series along one path); a fourth 3D convolution structure and a fifth 3D convolution structure which are connected in series and form a second path (Brock: Figure 3: Voxception-Resnet diagram; shows 2 convolution layers in series along one path); a first concatenate structure configured to concatenate an output of the third 3D convolution structure and an output of the fifth 3D convolution structure (Brock: Figure 3: Voxception-Resnet diagram; the concatenate block joins together the two paths having 2 and 3 convolution layers); and an add structure configured to operate an element-wise add operation of an output of the first concatenate structure and an input of the 3D Inception-Resnet structure (Brock: Figure 3: Voxception-Resnet diagram; plus sign circle on the bottom left combining the output of the concatenate block and the input).
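For illustration only, the two-path, concatenate, and element-wise add arrangement recited in Claim 5 can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Brock: each 3D convolution is replaced by a hypothetical 1x1x1 (channel-mixing) convolution, feature maps are stored as flattened per-channel lists, and the names pointwise_conv, run_path, and residual_block, as well as all kernel values, are introduced for the example.

```python
from typing import Callable, List

Channels = List[List[float]]  # indexed [channel][flattened voxel] (assumed layout)
Layer = Callable[[Channels], Channels]


def pointwise_conv(kernel: List[List[float]]) -> Layer:
    """Stand-in for a 3D convolution: a 1x1x1 convolution that only mixes channels.

    kernel[o][c] is the weight from input channel c to output channel o.
    """
    def apply(x: Channels) -> Channels:
        n_voxels = len(x[0])
        return [
            [sum(k * x[c][v] for c, k in enumerate(row)) for v in range(n_voxels)]
            for row in kernel
        ]
    return apply


def run_path(x: Channels, layers: List[Layer]) -> Channels:
    """Apply the layers of one path in series."""
    for layer in layers:
        x = layer(x)
    return x


def residual_block(x: Channels, first_path: List[Layer], second_path: List[Layer]) -> Channels:
    """Concatenate both path outputs along the channel axis, then add the block input."""
    merged = run_path(x, first_path) + run_path(x, second_path)  # list concatenation
    if len(merged) != len(x):
        raise ValueError("concatenated channels must match the input for the add")
    return [[m + i for m, i in zip(m_ch, x_ch)] for m_ch, x_ch in zip(merged, x)]


# Hypothetical input: 2 channels, 2 voxels each.
x = [[1.0, 2.0], [3.0, 4.0]]
first_path = [pointwise_conv([[1.0, 0.0]]), pointwise_conv([[2.0]]), pointwise_conv([[1.0]])]
second_path = [pointwise_conv([[0.0, 1.0]]), pointwise_conv([[-1.0]])]
output = residual_block(x, first_path, second_path)
print(output)  # [[3.0, 6.0], [0.0, 0.0]]
```

The first path applies three convolutions in series and the second path applies two, so the sketch mirrors the path lengths recited in the claim while leaving kernel sizes and channel counts unspecified.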
Regarding Claim 6, Brock, in view of Park, and in further view of Zhang, teaches: the automated classification apparatus of claim 5, wherein the 3D Inception-Downsampling structure comprises: a sixth 3D convolution structure and a maximum pooling structure forming a third path, the maximum pooling structure configured to select a maximum value from an output of the sixth 3D convolution structure (Brock: Figure 3: Voxception-Downsample diagram: a convolution layer and max pooling layer are along a single path; Brock: Section 3.1.1: ¶2: “For downsampling layers (Figure 3, center), we stack 3x3x3 convolutions with strided pooling operations, using both max and average pooling, and concatenate those features with strided 3x3x3 and 1x1x1 convolutions.”); a seventh 3D convolution structure and an average pooling structure forming a fourth path, the average pooling structure configured to select an average value from the output of the seventh 3D convolution structure (Brock: Figure 3: Voxception-Downsample diagram: a convolution layer and average pooling layer are along a single path; Brock: Section 3.1.1: ¶2: “For downsampling layers (Figure 3, center), we stack 3x3x3 convolutions with strided pooling operations, using both max and average pooling, and concatenate those features with strided 3x3x3 and 1x1x1 convolutions.”); a first stride 3D convolution structure including a convolution filter having an increased moving unit and forming a fifth path (Brock: Figure 3: Voxception-Downsample diagram: a strided convolution layer is shown along one path; Brock: Section 3.1.1: ¶2: “For downsampling layers (Figure 3, center), we stack 3x3x3 convolutions with strided pooling operations, using both max and average pooling, and concatenate those features with strided 3x3x3 and 1x1x1 convolutions.”); a second stride 3D convolution structure different from the first stride 3D convolution structure, including a convolution filter having an increased moving unit and forming a sixth path (Brock: Figure 3: Voxception-Downsample diagram: a different strided convolution layer is shown along one path; Brock: Section 3.1.1: ¶2: “For downsampling layers (Figure 3, center), we stack 3x3x3 convolutions with strided pooling operations, using both max and average pooling, and concatenate those features with strided 3x3x3 and 1x1x1 convolutions.”); and a second concatenate structure configured to concatenate an output of the maximum pooling structure, an output of the average pooling structure, an output of the first stride 3D convolution structure and an output of the second stride 3D convolution structure (Brock: Figure 3: Voxception-Downsample diagram: concatenate block combines the output of all four paths of the Downsample structure).
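For illustration only, the four-path downsampling arrangement recited in Claim 6 can be sketched as follows. This is a minimal sketch under stated assumptions and is not code from Brock: a one-dimensional, single-channel signal stands in for the 3D feature map, and the function names conv1d, max_pool, avg_pool, and downsample_block, the kernel values, and the stride of 2 are all introduced for the example.

```python
from typing import List


def conv1d(signal: List[float], kernel: List[float], stride: int = 1) -> List[float]:
    """Valid 1D cross-correlation standing in for a 3D convolution."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def max_pool(signal: List[float], size: int = 2) -> List[float]:
    """Keep the maximum of each non-overlapping window."""
    return [max(signal[i:i + size]) for i in range(0, len(signal) - size + 1, size)]


def avg_pool(signal: List[float], size: int = 2) -> List[float]:
    """Keep the average of each non-overlapping window."""
    return [sum(signal[i:i + size]) / size for i in range(0, len(signal) - size + 1, size)]


def downsample_block(signal: List[float]) -> List[List[float]]:
    """Run four parallel paths and concatenate their outputs as channels."""
    third_path = max_pool(conv1d(signal, [1.0]))        # convolution, then max pooling
    fourth_path = avg_pool(conv1d(signal, [1.0]))       # convolution, then average pooling
    fifth_path = conv1d(signal, [1.0, -1.0], stride=2)  # first strided convolution
    sixth_path = conv1d(signal, [1.0], stride=2)        # second, different strided convolution
    return [third_path, fourth_path, fifth_path, sixth_path]


downsampled = downsample_block([1.0, 3.0, 2.0, 6.0])
print(downsampled)  # [[3.0, 6.0], [2.0, 4.0], [-2.0, -4.0], [1.0, 2.0]]
```

Each path halves the length of the input, so the four outputs can be stacked as channels of a single downsampled feature map.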
Regarding Claim 9, Brock, in view of Park, and in further view of Zhang, teaches: the classification apparatus of claim 1, wherein the heat map is a 3D class activation map (Zhang: ¶28: “To facilitate localization of one or more diseases associated with the medical imaging data, the machine learning component 104 can perform a local pooling process for an activation map associated with a convolutional layer of the convolutional neural network. Additionally, or alternatively, the machine learning component 104 can generate the learned medical imaging output based on a class activation mapping process that applies a set of weights to a set of heat maps associated with the medical image data.”).
Regarding Claim 10, Claim 10 recites a method that is implemented by the apparatus of claim 1. Therefore, the rejection of Claim 1 is equally applied. (See Brock: Section 3 and Park: Figure 1)
Regarding Claim 11, Claim 11 recites a non-transitory computer readable storage medium storing instructions that, when executed, execute the method of Claim 10 using the apparatus of Claim 1. Therefore, the rejection of Claim 1 is equally applied. (See Park: ¶37)

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JULIUS CHAI whose telephone number is (571)272-4209. The examiner can normally be reached Monday-Thursday and Alternate Friday 8am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached on (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JULIUS CHAI/Examiner, Art Unit 2668
/VU LE/Supervisory Patent Examiner, Art Unit 2668