DETAILED ACTION
This action is in response to claims filed 17 September 2021 for application 15/889275 filed 06 February 2018. Currently claims 1, 2, 4, and 8-18 are pending.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 25 October 2021 has been entered.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 2, 4, 9-12, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Ferdman et al. (US 20190220734) in view of Bebee et al. (US Patent 10,409,560).

Regarding claims 1 and 15, Ferdman discloses: A method for accelerating a neural network, the method comprising: 
(“Accordingly, as a certain data region of the input feature maps 102 is processed, successive intermediate data regions of all other fused convolution layers that depend on that data region are also processed as follows, without writing intermediate data off chip to external memory 106. Only the output feature maps 104 of the last fused layer 114 are written off chip to external memory 106. More specifically, an output data region of intermediate feature maps computed and outputted by a convolutional layer depends only on an input data region of input feature maps that are inputted to that convolutional layer. The exploitation of this data locality in the fused layer dataflow of the fused layer convolutional neural network 100 allows the data to be passed directly from one convolutional layer to the next, without writing and reading the intermediate data to and from the external memory 106. This fusion of convolutional layers will be described in greater detail below with reference to FIGS. 3A-3D.” [0059]); 
generating code ([0111 and 0115]) … based on the identified neural network layers, wherein the generated code is used to implement depth first processing on the neural network ([0111 and 0115]);
performing the depth-first processing on the neural network based on the generated code (“Unlike the conventional CNN layer-by-layer methodology as illustrated in FIG. 1A, which processes each CNN layer independently through to completion, reading and writing the intermediate data between convolutional layers off chip to and from the external memory, the fused layer convolutional neural network 100 exploits inter-layer data locality among feature-map data (e.g., the three-dimensional tile structures) of the convolutional layers such that already processed intermediate data of the first convolutional layer 110 can be reused by the second convolutional layer 112, and so on with other fused convolutional layers, without reading and writing intermediate data between convolutional layers off chip to and from the external memory 106.” [0056]), wherein identifying the neural network layers comprises: 
identifying parts of the neural network that perform a serial processing of functions that are mergeable to obtain a stack (“As illustrated in the fused layer examples 108, 116 of FIG. 1B and FIG. 1C, the key to accelerating processing of the convolutional neural network 100 is restructuring the conventional CNN layer-by-layer processing with fusion of adjacent CNN layers and iterative processing across all of the fused layers using data regions of feature-map data (e.g., three-dimensional tile structures) that exploit inter-data locality among the fused layers, which largely eliminates intermediate data transfer off chip to and from the external memory 106. More specifically, a three-dimensional tile structure of the input feature maps 102 is an input data region from which other data regions, which can also be three-dimensional tile structures, of the intermediate feature maps of the fused convolution layers depend.” [0058], Figs. 2A and 2B are interpreted as disclosing a stack, note: the merged method is serial instead of parallel layer-by-layer); and 
mapping each function in the stack to obtain at least one operation, wherein each operation has a loop type property designating whether the operation is based on a convolution layer, a pooling layer, or an element-wise operational layer, (“Additionally, well-known elements of the disclosure will not be described in detail or will be omitted so as not to obscure the relevant details of the disclosure. For example, there are additional intermediate layers, such as activation layer, padding layer, and pooling layer, which can be interspersed between the convolutional layers and which are also fused in the fused layer convolutional neural network. However, such intermediate layers are well known and will not be described in detail hereinafter.” [0049], “Example input feature maps 302 are read into the first convolutional layer 307, such as from external memory 106. The input feature maps 302 include N different feature maps (e.g., N=3) of example size R×C (e.g., R=7 and C=7). More specifically, there are N-number of feature maps 302, with each of the feature maps 302 including R×C data elements 304. There are associated with the first convolutional layer 307, M-number of filters 306 of K×K×N weights (e.g., K=3). The filters 306 can be similarly read from external memory 106.” [0071], “In this regard, 5×5×N data values are read from off-chip external memory 106 into one or more on-chip buffers. Convolutional layer 307 convolves these read data values (5×5×N) with M filters 306 (3×3×N) across the three-dimensional tile structure 318, producing the intermediate three-dimensional tile structure 320 (e.g., dashed box labeled Intermediate Tile 1). The intermediate three-dimensional tile structure 320 is considered the middle of the pyramid, includes 3×3×M intermediate data values, and extends through all M intermediate feature maps 308.” [0078], note: [0070-84] disclose various loop type property operations for convolution and element-wise operational layer. 
Pooling layers are between some of these layers and would have their own operations associated with their processing)
and wherein the loop type property designates whether the operation has a requirement for data evaluation (“Because the first and second pyramids overlap, a certain number of the same intermediate data values can be used to process the output three-dimensional tile structures 322, 332 in the output feature maps 314. There are two possible approaches. On the one hand, the data values can be re-processed (re-computed) each time they are needed in the subsequent pyramid. On the other hand, the data values can be cached and then reused when processing the subsequent pyramid. Re-processing the data values can add significant extra cost in terms of additional convolutional operations, but also has the benefit of simplicity (e.g., each pyramid's internal dataflow is the same). However, caching the intermediate overlapping data values eliminates this extra processing, but requires on-chip buffering and makes the computation for each pyramid irregular because certain pyramids will perform more processing than some other pyramids.” [0088]).

However, while Ferdman states that CPUs, GPUs and other architectures can be used, Ferdman does not explicitly disclose: generating code for different hardware … and wherein the different hardware comprises one or more central processing units (CPUs) or graphical processing units (GPUs).

However, Bebee teaches: generating code
 for different hardware (“The graph program acceleration system may be responsible in one embodiment for parsing and analyzing the source code, potentially transforming the source code into several types of intermediary data structures in multiple phases, searching for opportunities to optimize the algorithm in various platform-independent and platform-dependent ways, implementing some combination of optimizations (if any optimizations are found), optionally identifying the specific particular hardware platforms to be used, generating or selecting executable code modules tuned or optimized for the particular hardware platforms, executing the code at the hardware platforms, and providing the results of the execution to one or more destinations (e.g., to a submitter of the source code).”, C3 L1-14) … and wherein the different hardware comprises one or more central processing units (CPUs) or graphical processing units (GPUs) (“The programmer may submit source code (which may include a reference to one or more input graph data sets which are to be processed using the algorithm) to a graph program acceleration system (GPAS) in one embodiment for optimization and eventual execution. Various types of hardware platforms may be used in different embodiments for the execution of the algorithm indicated in the source code. In one embodiment, a platform may include one or more graphics processing units (GPUs). In another embodiment, the platform may include one or more conventional central processing unit (CPUs). A hybrid host or device which includes one or more GPUS and one or more CPUs may be used in one embodiment. A platform may include one or more FPGAs (field programmable gate arrays) in another embodiment. Accelerators implemented using system-on-chip (SOC) designs may be used in some embodiments. 
In one embodiment, a programmer may design and develop source code for the graph analysis algorithm in a programming language which does not require extensive expertise in parallel programming or detailed knowledge of the hardware platforms which might be used. The programming language used for the algorithms in such an embodiment may be referred to as a graphics program acceleration language or “GPALang” herein.” C2 L44-67).

Ferdman and Bebee are both in the same field of endeavor of accelerating neural networks and are analogous art. Ferdman teaches depth-first processing of convolutional neural networks. Bebee teaches that architecture-specific code blocks can be used for generating code for the respective architectures. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the accelerator of Ferdman to include appropriate code for different architectures as taught by Bebee. One would have been motivated to combine because Bebee states that this optimizes the code for the respective hardware and makes use of the strengths of each (C3 L1-14).
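
For illustration only, the difference between the conventional layer-by-layer order and the fused (depth-first) order described in the quoted passages of Ferdman can be sketched as follows. This is a minimal one-dimensional sketch and is not Ferdman's implementation or the claimed method; the function names `conv1d`, `layer_by_layer`, and `depth_first`, the `tile` parameter, and the choice of plain Python lists are all assumptions made for the example. Overlapping intermediate values are simply re-computed for each tile, which corresponds to the re-processing alternative described in [0088] rather than the caching alternative.

```python
def conv1d(signal, kernel):
    """Valid 1D convolution (correlation form, no padding, stride 1)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def layer_by_layer(signal, kernels):
    """Conventional order: finish each layer completely before the next."""
    data = signal
    for kernel in kernels:
        data = conv1d(data, kernel)  # the full intermediate result is materialized
    return data


def depth_first(signal, kernels, tile=1):
    """Fused order: push one small input region through all layers at a time."""
    # Extra input elements needed, beyond the tile width, to produce one tile.
    halo = sum(len(kernel) - 1 for kernel in kernels)
    out_len = len(signal) - halo
    output = []
    for start in range(0, out_len, tile):
        width = min(tile, out_len - start)
        region = signal[start:start + width + halo]  # input region for this tile
        for kernel in kernels:
            region = conv1d(region, kernel)  # only a small intermediate exists
        output.extend(region)
    return output
```

With two kernels of length 3 and `tile=1`, each output value is produced from a 5-element input region through a 3-element intermediate region, mirroring the 5×5 to 3×3 to 1×1 tile progression in the quoted example, and both orders yield the same output.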

Regarding claim 2, Ferdman does not explicitly disclose: The method according to claim 1, wherein generating the code comprises: determining one or more pre-defined building blocks from the identified neural network layers; and combining the one or more pre-defined building blocks to obtain the code.
However, Bebee teaches these limitations
(“The graph program acceleration system may be responsible in one embodiment for parsing and analyzing the source code, potentially transforming the source code into several types of intermediary data structures in multiple phases, searching for opportunities to optimize the algorithm in various platform-independent and platform-dependent ways, implementing some combination of optimizations (if any optimizations are found), optionally identifying the specific particular hardware platforms to be used, generating or selecting executable code modules tuned or optimized for the particular hardware platforms, executing the code at the hardware platforms, and providing the results of the execution to one or more destinations (e.g., to a submitter of the source code).” C3 L1-14, “In at least one embodiment, the dependency graphs may be passed on to an execution coordinator component of the GPAS. The execution coordinator may perform at least some hardware-platform-specific operations in some embodiments. For example, the tasks or nodes of the dependency graph may be analyzed, and corresponding compute kernels, functions or routines may be identified from a library.” C4 L1-9).

Regarding claim 4, Ferdman does not explicitly disclose: The method according to claim 1, wherein the CPUs include fewer single instruction multiple data (SIMD) units compared to the GPUs.


Regarding claim 9, Ferdman discloses: The method according to claim 1, wherein identifying the neural network layers further comprises: merging the at least one operation into one or more steps, wherein a step includes only one operation with a loop type property designating that the one operation has a requirement for data evaluation (“At operations 412-418, any overlapping intermediate data are reused and non-overlapping intermediate data are computed for the first or current convolution layer. More specifically, at operation 412, a determination is made as to whether there are any overlapping intermediate data (convolved data) of a previous three-dimensional tile structure that are available for reuse in connection with the current three-dimensional tile structure. If it is determined at operation 412 that there are overlapping intermediate data, then at operation 414 these intermediate data of the previous three-dimensional tile structures are reused for the current three-dimensional tile structure, i.e., without re-computing the overlapping intermediate data.” [0095], see also [0070-84]).

Regarding claim 10, Ferdman discloses: The method according to claim 9, wherein identifying the neural network layers further comprises: grouping the one or more steps into one or more sequences, wherein a sequence includes steps with compatible loop types (“Because the first and second pyramids overlap, a certain number of the same intermediate data values can be used to process the output three-dimensional tile structures 322, 332 in the output feature maps 314. There are two possible approaches. On the one hand, the data values can be re-processed (re-computed) each time they are needed in the subsequent pyramid. On the other hand, the data values can be cached and then reused when processing the subsequent pyramid. Re-processing the data values can add significant extra cost in terms of additional convolutional operations, but also has the benefit of simplicity (e.g., each pyramid's internal dataflow is the same). However, caching the intermediate overlapping data values eliminates this extra processing, but requires on-chip buffering and makes the computation for each pyramid irregular because certain pyramids will perform more processing than some other pyramids.” [0088]).

Regarding claim 11, Ferdman does not explicitly disclose: The method according to claim 10, wherein sequences in the one or more sequences intended for CPUs have more steps than sequences in the one or more sequences intended for GPUs.

Bebee teaches: wherein sequences in the one or more sequences intended for CPUs have more steps than sequences in the one or more sequences intended for GPUs (“In one embodiment, using an appropriate parallel programming library, a given graph analysis program 412 may be converted by the GPAS into a sequence of sequential and parallel sections as shown. Serial code 425A and 425B of execution sections 420A and 420B may be run on a conventional host using a single CPU thread 430 in the depicted embodiment. In contrast, parallel compute kernels such as 426A and 426B of platform-specific parallel execution sections 440A and 440B may be run using parallel platform threads 450 (e.g., GPU threads) in various embodiments. In one embodiment, the GPAS may be responsible for determining the achievable parallelism possible in various parts of the code, for selecting the appropriate hardware-specific tuned compute kernels to be used based on the execution platforms available in various embodiments.” C11 L40-54, note: the serial nature of CPUs and parallel nature of GPUs disclose the CPU operations having more steps).

Regarding claim 12, Ferdman discloses: The method according to claim 10, wherein grouping the one or more steps into the one or more sequences includes determining how each step grouped in a sequence influences data requirements of the sequence so as to reduce an amount of available memory below a memory threshold (“Accordingly, there is a tradeoff between the costs incurred and the benefits obtained. In the case where all convolutional layers are fused into a single pyramid, the costs of on-chip memory increase by the largest amount but save the most bandwidth in writing and reading from external memory, as illustrated in the second fused layer example in FIG. 2A. As an example, a certain five-layer convolutional neural network that is fused into a single pyramid can reduce by 95% the external memory bandwidth requirements in exchange for only 20% of extra on-chip memory (e.g., convolutional neural network called VGGNet-E). On balance, this represents a significant acceleration of the processing of convolutional layers in the convolutional neural network. However, other cost/benefit tradeoffs can be chosen, in terms of efficiencies in processing, energy, memory capacity, decomposing the fusion of the convolutional layers using more than one pyramid, as illustrated in the second fused layer example in FIG. 2B. The tradeoff can be especially useful in convolutional neural networks that have many convolutional layers.” [0065]).
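
For illustration only, the relationship between the number of fused layers and the on-chip data requirement discussed in the quoted passage can be sketched with a short calculation. This is a hedged sketch under stated assumptions, not a computation taken from Ferdman: the helper names `pyramid_tile_sizes` and `buffer_elements` are hypothetical, square tiles and square kernels are assumed, and the channel count of 4 used for the intermediate and output levels is an arbitrary placeholder (only N=3 and K=3 appear in the quoted example).

```python
def pyramid_tile_sizes(output_tile, kernel_sizes, strides=None):
    """Work backward from the output tile to the tile needed at each level.

    Returns side lengths ordered from the input level to the output level.
    """
    strides = strides or [1] * len(kernel_sizes)
    sizes = [output_tile]
    for k, s in zip(reversed(kernel_sizes), reversed(strides)):
        # A layer with kernel k and stride s needs (out - 1) * s + k inputs.
        sizes.append((sizes[-1] - 1) * s + k)
    return list(reversed(sizes))


def buffer_elements(tile_sizes, channels):
    """Count data elements held on chip if every level's tile is buffered."""
    return sum(side * side * c for side, c in zip(tile_sizes, channels))


# Two fused 3x3 layers producing a 1x1 output tile.
sizes = pyramid_tile_sizes(1, [3, 3])
# Channel counts: N=3 from the quoted example; 4 is a placeholder for M.
elements = buffer_elements(sizes, [3, 4, 4])
```

Here `sizes` is `[5, 3, 1]`, which agrees with the 5×5 input tile and 3×3 intermediate tile in the quoted example. Adding a layer to the fused group enlarges every upstream tile, which is the memory cost that the quoted passage weighs against the saved external-memory bandwidth.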

Regarding claim 16, Ferdman discloses: The method according to claim 1, wherein the stack comprises a first subset of neural network layers from the neural network layers, and wherein generating the code to implement the depth first processing for the different hardware based on the identified neural network layers comprises: generating code to loop back and re-process the first subset of neural network layers after completing an iteration of processing the first subset of neural network layers (“As illustrated in the fused layer examples 108, 116 of FIG. 1B and FIG. 1C, the key to accelerating processing of the convolutional neural network 100 is restructuring the conventional CNN layer-by-layer processing with fusion of adjacent CNN layers and iterative processing across all of the fused layers using data regions of feature-map data (e.g., three-dimensional tile structures) that exploit inter-data locality among the fused layers, which largely eliminates intermediate data transfer off chip to and from the external memory 106. More specifically, a three-dimensional tile structure of the input feature maps 102 is an input data region from which other data regions, which can also be three-dimensional tile structures, of the intermediate feature maps of the fused convolution layers depend.” [0058]).

Regarding claim 17, Ferdman discloses: The method according to claim 1, wherein the stack comprises a first subset of neural network layers from the neural network layers, wherein the neural network layers further comprise at least one other neural network layer that is immediately subsequent to the first subset of neural network layers (Figs. 2A and 2B), and wherein generating the code for the different hardware based on the identified neural network layers comprises: 
generating code to: 
process the first subset of neural network layers (“As illustrated in the fused layer examples 108, 116 of FIG. 1B and FIG. 1C, the key to accelerating processing of the convolutional neural network 100 is restructuring the conventional CNN layer-by-layer processing with fusion of adjacent CNN layers and iterative processing across all of the fused layers using data regions of feature-map data (e.g., three-dimensional tile structures) that exploit inter-data locality among the fused layers, which largely eliminates intermediate data transfer off chip to and from the external memory 106. More specifically, a three-dimensional tile structure of the input feature maps 102 is an input data region from which other data regions, which can also be three-dimensional tile structures, of the intermediate feature maps of the fused convolution layers depend.” [0058]); 
store an output from processing the first subset of neural network layers in main memory (“As illustrated for the second fused layer example 116, the convolutional neural network 100 is partitioned such that only convolutional layers 110 and 112 are fused into example pyramid 203, while convolutional layer 114 is not fused in example pyramid 205. The fused dataflow of the pyramid 203 is from input feature maps 102, through intermediate feature maps 204, and to output feature maps 210, with input feature maps 102 being read from the external memory 106, processed iteratively through the example pyramid 203, and output feature maps 210 being written to external memory 106. The non-fused data flow of the pyramid 205 is from output feature maps 210 (i.e., input feature maps in the dataflow of the pyramid 205) to output feature maps 104, with output feature maps 210 being read as input feature maps from the external memory 106, processed through the example pyramid 205 in a conventional manner, and output feature maps 104 being written to external memory 106.” [0062]); and 
process the at least one other neural network layer based on retrieving the output from the main memory (“As illustrated for the second fused layer example 116, the convolutional neural network 100 is partitioned such that only convolutional layers 110 and 112 are fused into example pyramid 203, while convolutional layer 114 is not fused in example pyramid 205. The fused dataflow of the pyramid 203 is from input feature maps 102, through intermediate feature maps 204, and to output feature maps 210, with input feature maps 102 being read from the external memory 106, processed iteratively through the example pyramid 203, and output feature maps 210 being written to external memory 106. The non-fused data flow of the pyramid 205 is from output feature maps 210 (i.e., input feature maps in the dataflow of the pyramid 205) to output feature maps 104, with output feature maps 210 being read as input feature maps from the external memory 106, processed through the example pyramid 205 in a conventional manner, and output feature maps 104 being written to external memory 106.” [0062]).

Regarding claim 18, Ferdman discloses: The method according to claim 9, wherein the step further includes a second operation with a loop type property designating that the second operation does not have a requirement for data evaluation (“Because the first and second pyramids overlap, a certain number of the same intermediate data values can be used to process the output three-dimensional tile structures 322, 332 in the output feature maps 314. There are two possible approaches. On the one hand, the data values can be re-processed (re-computed) each time they are needed in the subsequent pyramid. On the other hand, the data values can be cached and then reused when processing the subsequent pyramid. Re-processing the data values can add significant extra cost in terms of additional convolutional operations, but also has the benefit of simplicity (e.g., each pyramid's internal dataflow is the same). However, caching the intermediate overlapping data values eliminates this extra processing, but requires on-chip buffering and makes the computation for each pyramid irregular because certain pyramids will perform more processing than some other pyramids.” [0088]).

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Ferdman and Bebee and further in view of Gao et al. (TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory).

Regarding claim 8, Ferdman does not explicitly disclose: The method according to claim 1, wherein the at least one operation comprises an accumulation operation and a normalization operation, and wherein the accumulation operation has a loop type property requiring data to be processed in a certain area and the normalization operation has a loop type property with no data requirements.

However, Gao teaches: wherein the at least one operation comprises an accumulation operation and a normalization operation, and wherein the accumulation operation has a loop type property requiring data to be processed in a certain area and the normalization operation has a loop type property with no data requirements (“We also move simple accumulation operations close to the data locations (DRAM banks) in order to reduce memory accesses and improve performance and energy.” P752 ¶1, “Pooling (POOL) and normalization (NORM) layers are also common, but do not use trained weights and are fast to process through streaming, thus we focus on CONV and FC layers.” P752 §2.1 ¶3).

Ferdman, Bebee and Gao are in the same field of endeavor of accelerating neural networks and are analogous art. Ferdman teaches depth-first processing of convolutional neural networks. Bebee teaches that architecture-specific code blocks can be used for generating code for the respective architectures. Gao teaches accumulation and normalization operations where accumulation operations have the requirement of being close to data locations. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the accelerator operations of Ferdman and . 

Claims 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Ferdman and Bebee and further in view of Chakradhar et al. (US 2010/0088490).

Regarding claim 13, Ferdman does not explicitly disclose: The method according to claim 11, wherein a patch size is reduced based on available memory exceeding the memory threshold, the patch size being related to an amount of data input to the sequence.

Chakradhar teaches: wherein a patch size is reduced based on available memory exceeding the memory threshold, the patch size being related to an amount of data input to the sequence (“Operator splitting ensures that a template can be executed on an accelerator memory regardless of its memory limitations. It is important to note that in this exemplary implementation, the operators are data parallel such that the parallelism is over several independent data units. Thus, if an operation cannot be fit in the accelerator memory, it can be split so that it is made to execute on a small portion of the data. Splitting enables execution of arbitrarily sized data units on the accelerator, thereby providing scalability. An exemplary process for splitting data structures includes: computing the memory requirements of all operators; splitting the operators that need more memory than present in the accelerator; repeat computing and splitting until all operators are feasible to run on the accelerator; and conduct any post processing to avoid unnecessary data splits. It should be noted that if both the input and output of an operator are split, then the operator should be split as well, even if it fits the GPU memory. This does not change the feasibility, but permits more degrees of freedom so that execution on the accelerator may be performed with fewer data transfers.” [0043]).

Ferdman, Bebee and Chakradhar are in the same field of endeavor of accelerating neural networks and are analogous art. Ferdman teaches depth-first processing of convolutional neural networks. Bebee teaches that architecture-specific code blocks can be used for generating code for the respective architectures. Chakradhar teaches splitting data to fit in memory and also avoiding underutilizing the hardware. It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the accelerator operations of Ferdman and Bebee to include known input data patch sizing as taught by Chakradhar to yield predictable results. One would have been motivated to do so because Chakradhar states that splitting the data provides scalability and optimization [0043].

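Regarding claim 14, Ferdman does not explicitly disclose: The method according to claim 13, wherein reduction of the patch size is limited by an underutilization of the different hardware.

Chakradhar teaches: wherein reduction of the patch size is limited by an underutilization of the different hardware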
(“Operator splitting ensures that a template can be executed on an accelerator memory regardless of its memory limitations. It is important to note that in this exemplary implementation, the operators are data parallel such that the parallelism is over several independent data units. Thus, if an operation cannot be fit in the accelerator memory, it can be split so that it is made to execute on a small portion of the data. Splitting enables execution of arbitrarily sized data units on the accelerator, thereby providing scalability. An exemplary process for splitting data structures includes: computing the memory requirements of all operators; splitting the operators that need more memory than present in the accelerator; repeat computing and splitting until all operators are feasible to run on the accelerator; and conduct any post processing to avoid unnecessary data splits. It should be noted that if both the input and output of an operator are split, then the operator should be split as well, even if it fits the GPU memory. This does not change the feasibility, but permits more degrees of freedom so that execution on the accelerator may be performed with fewer data transfers.” [0043]).
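
For illustration only, the splitting process described in the quoted passage of Chakradhar (compute the memory requirement, split what does not fit, repeat until feasible) can be sketched as follows. This is a simplified sketch under stated assumptions and is not Chakradhar's implementation: the function name `split_to_fit`, its parameters, and the halving strategy are hypothetical, and the `min_chunk` floor is an assumed stand-in for a limit on how small the pieces may become before the hardware would be underutilized.

```python
def split_to_fit(num_items, bytes_per_item, memory_limit, min_chunk=1):
    """Split num_items independent data units into ranges that fit in memory.

    Returns a list of (start, end) half-open index ranges.
    Raises ValueError if no chunk of at least min_chunk items fits.
    """
    if num_items <= 0:
        return []
    chunk = num_items
    # Repeat: check the memory requirement, halve the chunk if it is too large.
    while chunk * bytes_per_item > memory_limit and chunk > min_chunk:
        chunk = max(min_chunk, (chunk + 1) // 2)
    if chunk * bytes_per_item > memory_limit:
        raise ValueError("cannot fit a chunk of min_chunk items in memory")
    return [(start, min(start + chunk, num_items))
            for start in range(0, num_items, chunk)]
```

For example, `split_to_fit(10, 4, 16)` halves the chunk from 10 to 5 to 3 items and returns four ranges, while `split_to_fit(10, 4, 16, min_chunk=4)` stops at 4 items per chunk and returns three ranges.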
Response to Arguments
Applicant’s arguments with respect to claim(s) 1, 2, 4, and 8-18 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC NILSSON whose telephone number is (571)272-5246. The examiner can normally be reached M-F: 9-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on (571)-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ERIC NILSSON/           Primary Examiner, Art Unit 2122