DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Response to Amendment
This Office Action is in response to applicant’s communication filed 28 June 2022, in response to the Office Action mailed 26 May 2022.  The applicant’s remarks and any amendments to the claims or specification have been considered, with the results that follow.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-29 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

The term “performance-critical” in claim 1 is a relative term which renders the claim indefinite. The term “performance-critical” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Claims 2-14 depend upon claim 1, and thus include the aforementioned limitation(s).

The term “performance-critical” in claim 15 is a relative term which renders the claim indefinite. The term “performance-critical” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Claims 16-19 depend upon claim 15, and thus include the aforementioned limitation(s).

The term “performance-critical” in claim 20 is a relative term which renders the claim indefinite. The term “performance-critical” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
Claims 21-29 depend upon claim 20, and thus include the aforementioned limitation(s).


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-5, 15-18, and 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over Nurvitadhi (US 2018/0315158) in view of Herrera (US 7,191,163).

As per claim 1, Nurvitadhi teaches a programmable hardware system for machine learning (ML) comprising: a core configured to: receive a plurality of commands and data from a host to be analyzed and inferred via machine learning [a subsystem including a plurality of processors/cores connected to a host CPU to receive graphics and/or machine learning operations and data (abstract, paras. 0043-46; figs. 2A, 4A; etc.)]; transmit each command of a first subset of commands of the plurality of commands for performance-critical operations and associated data thereof to an inference engine for processing [a subsystem including a plurality of processors connected to a host CPU to receive graphics and/or machine learning operations and data (abstract, paras. 0043-46; figs. 2A, 4A; etc.)]; an instruction streaming engine coupled to the core and further coupled to the inference engine, wherein the streaming engine is configured to: retrieve and maintain the each command of the first subset of commands and/or the associated data at a specific location in a buffer [a front end and scheduler receives commands from the host and buffers commands to be passed to the array of processing units, as well as connecting to the memory crossbar to store to memory (paras. 0058-60, fig. 2A, etc.)]; stream the each command of the first subset of commands and/or its associated data to the inference engine from the buffer [a front end and scheduler receives commands from the host and buffers commands to be passed to the array of processing units (paras. 0058-60, fig. 2A, etc.)]; and said inference engine configured to: retrieve the each command of the first subset of commands and/or its associated data streamed from the buffer [the parallel processing units process the incoming commands and data (paras. 0062, 0067, etc.)]; and perform the performance-critical operations according to the each command of the first subset of commands to analyze and infer a subject from the data [the parallel processing units process the incoming commands and data (paras. 0062, 0067, etc.)].
While Nurvitadhi teaches transmitting commands and data to the inference engine (see above), it does not explicitly teach transmitting them via function call, wherein each command of the first subset of commands and/or the associated data are encapsulated as parameters in the function call.
Herrera teaches transmitting data via function call, wherein each command of the first subset of commands and/or the associated data are encapsulated as parameters in the function call [the inference engine can be invoked via function calls using various data structures for input/output arguments (parameters) to begin operations (col. 16, lines 20-56; etc.)].
Nurvitadhi and Herrera are analogous art, as they are within the same field of endeavor, namely processing for expert systems.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to utilize a function call to provide the inputs for the inference engine, as taught by Herrera, for the inputs to the inference processing engine in the system taught by Nurvitadhi.
Herrera provides motivation as [various types of function calls allow the user to encapsulate the required data as well as various rules, monitoring, and processing states (cols. 16-18, etc.)].
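
For illustration only, and not as part of the grounds of rejection, the limitation of encapsulating a command and its associated data as parameters of a function call can be sketched as follows. The names `Command`, `inference_engine`, and the `"dot"` opcode are hypothetical and are not taken from Nurvitadhi, Herrera, or the instant application; the sketch only shows the general pattern of passing a command and its data as call arguments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Command:
    """Hypothetical command record; field names are assumptions."""
    opcode: str
    operands: List[float]


def inference_engine(command: Command, data: List[float]) -> float:
    """Hypothetical entry point: the command and its associated data
    arrive together as parameters of a single function call."""
    if command.opcode == "dot":
        # Weighted sum of the data using the command's operands as weights.
        return sum(w * x for w, x in zip(command.operands, data))
    raise ValueError(f"unsupported opcode: {command.opcode}")


# The caller encapsulates the command and its data as call parameters.
result = inference_engine(Command("dot", [1.0, 2.0]), [3.0, 4.0])
```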

As per claim 2, Nurvitadhi/Herrera teaches wherein the streaming engine is further configured to receive a streamed inferred data from the inference engine [the computation results may be written back to the memory of the processing unit and then to the shared memory (Nurvitadhi: paras. 0062, 0124; figs. 2A-D; etc.)].

As per claim 3, Nurvitadhi/Herrera teaches wherein the streaming engine is further configured to stream the received inferred data to the core [the computation results may be written back to the memory of the processing unit and then to the shared memory (Nurvitadhi: paras. 0062, 0124; figs. 2A-D; etc.)].

As per claim 4, Nurvitadhi/Herrera teaches wherein the streaming engine comprises: an instruction streaming engine coupled to the core, wherein the instruction streaming engine is configured to stream the first subset of commands to the inference engine [a subsystem including a plurality of processors connected to a host CPU to receive graphics and/or machine learning operations and data (Nurvitadhi: abstract, paras. 0043-46; figs. 2A, 4A; etc.)]; and a data streaming engine coupled to the inference engine and configured to generate one or more streams of data associated with the first subset of commands, and wherein the data streaming engine is configured to stream the one or more streams of data to the inference engine to be analyzed and inferred [a subsystem including a plurality of processors connected to a host CPU to receive graphics and/or machine learning operations and data (Nurvitadhi: abstract, paras. 0043-46; figs. 2A, 4A; etc.)].

As per claim 5, Nurvitadhi/Herrera teaches wherein the streaming engine is configured to stream instructions to the inference engine in an instruction set architecture that is different from an instruction set architecture format received from the core [the processing units may use a different instruction set from other units or the host (Nurvitadhi: paras. 0192, 0248, etc.)].

As per claim 15, Nurvitadhi/Herrera teaches a programmable hardware system for machine learning (ML), comprising: a core configured to receive a plurality of commands and a plurality of data from a host to be analyzed and inferred via machine learning [a subsystem including a plurality of processors/cores connected to a host CPU to receive graphics and/or machine learning operations and data (Nurvitadhi: abstract, paras. 0043-46; figs. 2A, 4A; etc.)], wherein the core is further configured to transmit a first subset of commands of the plurality of commands that is performance-critical operations and associated data thereof of the plurality of data for efficient processing thereof [a front end and scheduler receives commands from the host and buffers commands to be passed to the array of processing units, as well as connecting to the memory crossbar to store to memory (Nurvitadhi: paras. 0058-60, fig. 2A, etc.)], wherein the first subset of commands and the associated data are passed through via a function call [the inference engine can be invoked via function calls using various data structures for input/output arguments (parameters) to begin operations (Herrera: col. 16, lines 20-56; etc.)]; and a streaming engine coupled to the core configured to receive the first subset of commands and the associated data from the core [a front end and scheduler receives commands from the host and buffers commands to be passed to the array of processing units, as well as connecting to the memory crossbar to store to memory (Nurvitadhi: paras. 0058-60, fig. 2A, etc.)], and wherein the streaming engine is configured to stream a second subset of commands of the first subset of commands and its associated data to an inference engine by executing a single instruction [the parallel processing units process the incoming commands and data (Nurvitadhi: paras. 0062, 0067, etc.) including executing each instruction (Nurvitadhi: para. 0289, etc.)].
Examiner’s Note: the reasoning and motivation for the combination are provided, above, in the rejection of claim 1.

As per claim 16, see the rejections of claims 2-3, above.

As per claim 17, see the rejection of claim 4, above.

As per claim 18, see the rejection of claim 5, above.

As per claim 20, see the rejection of claim 1, above.

As per claim 21, see the rejection of claim 2, above.

As per claim 22, see the rejection of claim 3, above.


Claims 6-13, 19, and 23-29 are rejected under 35 U.S.C. 103 as being unpatentable over Nurvitadhi and Herrera as applied to claims 1, 15, and 20, above, and further in view of Schober (US 7,089,380).

As per claim 6, Nurvitadhi/Herrera teaches wherein the buffer is coupled to the core and further coupled to the streaming engine [a front end and scheduler receives commands from the host and buffers commands to be passed to the array of processing units, as well as connecting to the memory crossbar to store to memory (Nurvitadhi: paras. 0058-60, fig. 2A, etc.)].
Nurvitadhi/Herrera does not explicitly teach wherein the core continuously writes to the buffer until a certain condition is met, and wherein the streaming engine continuously reads from the buffer until another certain condition is met.
Schober teaches wherein the core continuously writes to the buffer until a certain condition is met, and wherein the streaming engine continuously reads from the buffer until another certain condition is met [new commands are written into the buffer until the queue is full or near full (fig. 4 and associated description), and commands are read until the queue is empty (fig. 5 and associated description)].
Nurvitadhi/Herrera and Schober are analogous art, as they are within the same field of endeavor, namely storing commands/data for parallel processing.
It would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to include the queue status monitoring taught by Schober for the buffer holding commands/data for the processing units in the system taught by Nurvitadhi/Herrera.
Schober provides motivation as [monitoring the status of the queue can prevent overflow/underflow errors (abstract; col. 1, lines 14-33; etc.)].
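
For illustration only, the write-until-full and read-until-empty behavior discussed above can be sketched as a fixed-size circular queue. This is a minimal sketch under stated assumptions, not the implementation of Schober or of the instant application: the class name `CommandRing`, its methods, and the `threshold` parameter are hypothetical, the head/tail naming follows the claims (head is the write position, tail is the read position), and a single object holds state that the claims divide between the core and the streaming engine.

```python
from typing import List


class CommandRing:
    """Fixed-size circular command buffer (illustrative only)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.slots: List[str] = [""] * size
        self.head = 0          # next slot to write (claim naming)
        self.tail = 0          # next slot to read (claim naming)
        self.available = size  # free slots remaining

    def write_until_full(self, commands: List[str], threshold: int = 0) -> int:
        """Write commands until free space is at or below `threshold`.

        Returns the number of commands actually written.
        """
        written = 0
        for cmd in commands:
            if self.available <= threshold:
                break  # full (threshold=0) or near-full condition reached
            self.slots[self.head] = cmd
            self.head = (self.head + 1) % self.size  # increment on write
            self.available -= 1                      # decrement on write
            written += 1
        return written

    def read_until_empty(self) -> List[str]:
        """Read commands in FIFO order until no entries remain."""
        out: List[str] = []
        while self.available < self.size:
            out.append(self.slots[self.tail])
            self.tail = (self.tail + 1) % self.size  # increment on read
            self.available += 1
        return out


ring = CommandRing(4)
accepted = ring.write_until_full(["a", "b", "c", "d", "e"])  # fifth is refused
drained = ring.read_until_empty()
```

In this sketch the producer stops as soon as the free-slot count reaches the threshold, which is what prevents an overflow, and the consumer stops when every slot is free again, which is what prevents an underflow.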

As per claim 7, Nurvitadhi/Herrera/Schober teaches wherein the certain condition is when available buffer associated with the buffer is below a threshold value [new commands are written into the buffer until the queue is full or near full (Schober: fig. 4 and associated description), and commands are read until the queue is empty (Schober: fig. 5 and associated description)].

As per claim 8, Nurvitadhi/Herrera/Schober teaches wherein the available buffer is tracked using a head pointer maintained by the core locally, and wherein the head pointer is incremented each time the core writes to the buffer and the available buffer associated with the buffer is decremented each time the core writes to the buffer [a front end and scheduler receives commands from the host and buffers commands to be passed to the array of processing units, as well as connecting to the memory crossbar to store to memory (Nurvitadhi: paras. 0058-60, fig. 2A, etc.), including utilizing a head pointer stored in registers or memory (Schober: col. 5, lines 45-58; etc.); when an entry is added to the queue, the tail pointer (the tail and head pointers are flipped for the claimed invention compared to Schober) is incremented and the status of the queue is updated to detect a near full condition (Schober: col. 2, lines 43-50, etc.)].

As per claim 9, Nurvitadhi/Herrera/Schober teaches wherein the core reads a value stored in a memory mapped input/output (MMIO) responsive to the certain condition being met, wherein the MMIO stores a value of the head pointer and a tail pointer associated with a location the streaming engine reads from the buffer, and wherein the core is configured to set the available buffer size [the head and tail pointers may be stored in memory or registers associated with the memory (Schober: col. 1, lines 14-23; col. 5, lines 45-58; etc.) where the registers are accessed via memory mapped input/output (Nurvitadhi: para. 0261)].

As per claim 10, Nurvitadhi/Herrera/Schober teaches wherein the core is configured to set the available buffer size to the tail pointer minus the head pointer and result thereof modulo actual size of the buffer [for monitoring the full/empty and near full/empty status the available space is tail-head pointer modulo size of the queue (Schober: col. 2, line 22 to col. 3, line 15; etc.)].
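
For illustration only, the arithmetic recited in claim 10 (tail pointer minus head pointer, with the result taken modulo the actual buffer size) can be worked through as follows. The function name `available_space` and the sample pointer values are hypothetical. The sketch is a literal reading of the recited formula and makes no assumption about how either Schober or the instant application distinguishes a full queue from an empty one.

```python
def available_space(head: int, tail: int, size: int) -> int:
    """Literal reading of the recited formula: (tail - head) mod size.

    `head` is the write position and `tail` the read position, following
    the naming used in the claims. Python's % operator returns a
    non-negative result for a positive modulus, so wrap-around is handled.
    """
    return (tail - head) % size


# Buffer of 8 slots, writer at slot 6, reader at slot 2:
# slots 2-5 are occupied, slots 6, 7, 0, 1 are free.
free_slots = available_space(head=6, tail=2, size=8)
```

Note that when the head and tail pointers are equal the formula evaluates to zero, so a practical design needs some additional means (for example a separate count or a reserved slot) to tell an empty buffer from a full one.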

As per claim 11, Nurvitadhi/Herrera/Schober teaches wherein the another certain condition is when buffer size to read from is greater than zero [new commands are written into the buffer until the queue is full or near full (Schober: fig. 4 and associated description), and commands are read until the queue is empty (Schober: fig. 5 and associated description)].

As per claim 12, Nurvitadhi/Herrera/Schober teaches wherein the buffer size to read from is tracked using a tail pointer maintained by the streaming engine locally, and wherein the tail pointer is incremented each time the streaming engine reads from the buffer and wherein the buffer size to read from is the tail pointer minus a head pointer and result thereof modulo actual size of the buffer, wherein the head pointer is maintained by the core locally and incremented each time the core writes to the buffer [the head and tail pointers may be stored in memory or registers associated with the memory (Schober: col. 1, lines 14-23; col. 5, lines 45-58; etc.) and the head pointer is incremented when an entry is read from the queue and tail pointer when a new entry is added (head and tail are reversed in the claimed invention compared to Schober) where the remaining size is tail-head pointer modulo size of the queue (Schober: col. 2, line 22 to col. 3, line 15; etc.)].

As per claim 13, Nurvitadhi/Herrera/Schober teaches wherein the core maintains a head pointer where the core writes to and wherein the streaming engine maintains a tail pointer where the streaming engine reads from, and wherein the head pointer and the tail pointer are stored in a memory mapped input/output (MMIO) space that is mapped into registers in the streaming engine [the head and tail pointers may be stored in memory or registers associated with the memory (Schober: col. 1, lines 14-23; col. 5, lines 45-58; etc.) where the registers are accessed via memory mapped input/output (Nurvitadhi: para. 0261)].

As per claim 19, see the rejection of claim 6, above.

As per claim 23, see the rejection of claim 6, above.

As per claim 24, see the rejection of claim 7, above.

As per claim 25, see the rejection of claim 8, above.

As per claim 26, see the rejection of claim 9, above.

As per claim 27, see the rejection of claim 10, above.

As per claim 28, see the rejection of claim 11, above.

As per claim 29, see the rejection of claim 12, above.


Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Nurvitadhi, Herrera, and Schober as applied to claim 1, and further in view of well-known practices in the art.

As per claim 14, Nurvitadhi/Herrera/Schober teaches wherein the buffer is a circular buffer allocated in memory, and wherein a size of the buffer is fixed a-priori at compile time [the command queue is a circular queue stored in specific memory space (Schober: abstract; col. 1, lines 14-48; etc.)].
Examiner’s Note: the reasoning and motivation for the combination of Nurvitadhi/Herrera/Schober is provided, above, in the rejection of claim 6.
Nurvitadhi/Herrera/Schober does not explicitly teach that the memory is DDR memory.  However, the examiner takes official notice that using DDR memory is old and well known within the art.  Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to implement the memory storing the circular buffer in Nurvitadhi/Herrera/Schober as DDR memory to achieve the predictable result of providing higher throughput/transfer rates.


Response to Arguments
Applicant's arguments filed 28 June 2022 have been fully considered but they are not persuasive.

Applicant argues that the term “performance-critical” is not a relative term because the “specification of the instant application explicitly provides examples of what is considered ‘performance-critical’”.
However, as shown by the citation given by the applicant, the specification does not provide an explicit definition of what constitutes “performance-critical”, and at best gives only an example.  Therefore, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.

Applicant also argues that the cited art does not teach a core configured to receive a plurality of commands… [and] transmit each command of a first subset of commands of the plurality of commands for performance-critical operations and associated data thereof to an inference engine; and an instruction streaming engine to retrieve and maintain each command of the first subset of commands and/or associated data from the call at a specific location in a buffer.
However, as described above, “performance-critical” is a relative term that is undefined.  Furthermore, Nurvitadhi teaches a subsystem including a plurality of processors connected to a host CPU to receive graphics and/or machine learning operations and data (abstract, paras. 0043-46; figs. 2A, 4A; etc.), and a front end and scheduler that receives commands from the host and buffers commands to be passed to the array of processing units, as well as connecting to the memory crossbar to store to memory (paras. 0058-60, fig. 2A, etc.).  As the commands must be performed, they are within the broadest reasonable interpretation of “performance-critical”.

Applicant also argues that the cited art does not teach that each command of the first subset of commands and/or associated data are encapsulated as parameters in a function call.
However, Herrera teaches the inference engine can be invoked via function calls using various data structures for input/output arguments (parameters) to begin operations (col. 16, lines 20-56; etc.).

Applicant also argues that the cited art does not teach that the buffer is a circular buffer and the size of the buffer is fixed a-priori at compile time.
However, Schober teaches the command queue is a circular queue stored in specific memory space (Schober: abstract; col. 1, lines 14-48; etc.).


Conclusion
The following is a summary of the treatment and status of all claims in the application as recommended by M.P.E.P. 707.07(i): claims 1-29 are rejected.

The examiner requests, in response to this Office action, that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.

When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections.  See 37 CFR 1.111(c).

THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEORGE GIROUX whose telephone number is (571)272-9769. The examiner can normally be reached M-F 10am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached on 571-272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/GEORGE GIROUX/Primary Examiner, Art Unit 2128