Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Detailed Action
This office action is in response to applicant’s communication filed on 05/01/20. Claims 1-20 are pending in this application. 
Information Disclosure Statement
The information disclosure statements filed on 05/24/22 and 09/23/21 have been received and are being considered.
Claim Rejections Under 35 U.S.C. §102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.


(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-11 and 14-20 are rejected under 35 U.S.C. §102 as being anticipated by Aydonat (US 20170103299 A1).
Regarding claim 1, Aydonat, at least at fig 9, discloses a method implemented on a computer system for implementing a machine learning network on a machine learning accelerator (MLA) (see para [0033] disclosing a CNN (convolutional neural network)), the MLA comprising a plurality of interconnected processing elements (see para [0046] disclosing interconnected PEs 901-904 on an FPGA) implemented on a semiconductor die (see paras [0059]-[0061] disclosing on-chip configuration), the method comprising:
receiving a description of the machine learning network (see paras [0097], [0098] disclosing receiving descriptions);
allocating computations that implement the machine learning network to processing elements for execution (see paras [0102]-[0103]: configuration tool manager 1600 provides an interface that allows a user to input data into the configuration tool 1600 for instruction execution);
determining data transfers for transfer of data to or from processing elements in order to execute the computations (see paras [0041], [0061], [0096] disclosing a sequencer unit that coordinates transmission of data to processing elements);
generating a computer program comprising compute instructions that implement the computations and data transfer instructions that implement the data transfers (see fig 5, step 503; 1500 generates and programs a sequencer unit that coordinates transmission of the data to appropriate processing element arrays on the CNN accelerator; see paras [0041], [0061], [0096] disclosing a sequencer unit that coordinates transmission of data to processing elements), comprising:
determining non-conflicting data transfer paths (see para [0041]) for the data transfers based on a topology of the interconnections between processing elements (see para [0044] disclosing logic gates and mapping of an optimized logical representation), on dependencies of the instructions and on a duration for execution of the instructions (see para [0044] disclosing dependencies/logic relationships and para [0068] disclosing a sequencer unit that coordinates the transmission of data to appropriate processing elements on the CNN accelerator at appropriate times in order to time multiplex computations on the processing elements); wherein each data transfer path specifies a routing and a time slot for the data transfer (see para [0041] disclosing time multiplexing);
generating data transfer instructions that specify routing of the data transfers (see para [0087] disclosing instructions); and 
generating a static schedule that schedules execution of the data transfer instructions during the time slots for the data transfers (see para [0041] disclosing time multiplexing), the static schedule also scheduling execution of the compute instructions (element 503 coordinating time multiplexing instructions); and 
outputting the computer program (see fig 5, 504 and para [0055] disclosing outputting design).
Regarding claim 2, Aydonat discloses the computer-implemented method of claim 1, wherein determining non-conflicting data transfer paths comprises: determining candidate data transfer paths for the data transfers (see paras [0041] and [0047] disclosing multiplexing data paths); determining which of the candidate data transfer paths are available (see paras [0041] and [0047] disclosing multiplexing data paths); and selecting from among the available candidate data transfer paths (see paras [0041] and [0047] disclosing multiplexing data paths).
Regarding claim 3, Aydonat discloses the computer-implemented method of claim 1, wherein determining non-conflicting data transfer paths comprises:
generating default data transfer paths for the data transfers (see paras [0041] and [0047] disclosing multiplexing data paths and defining slack);
determining conflicts among the default data transfer paths (see para [0066]); and
modifying the data transfer paths to resolve the conflicts (see para [0066] describing resolving convolution layer).
Regarding claim 4, Aydonat discloses the computer-implemented method of claim 1, wherein:
adjacent processing elements are connected by multiple data transfer lanes; and
the data transfer instructions also specify which data transfer lanes to use for the data transfers (see fig 9 disclosing interconnected data transfer lines).
Regarding claim 5, Aydonat discloses the computer-implemented method of claim 1, wherein:
adjacent processing elements are connected by variable-width lanes (see para [0035] disclosing height and width regions having functional relationships); and the data transfer instructions also specify widths of the lanes to use for the data transfers (see para [0035]).
Regarding claim 6, Aydonat discloses the computer-implemented method of claim 1, wherein:
data transfer paths are virtualized; and the data transfer instructions specify virtual data transfer paths for the data transfers (see para [0047] disclosing data paths categorized by timing).
Regarding claim 7, Aydonat discloses the computer-implemented method of claim 1, wherein: processing elements have multiple I/O ports; and the data transfer instructions also specify which I/O ports to use for the data transfers (see para [0088] disclosing different interfaces).
Regarding claim 8, Aydonat discloses the computer-implemented method of claim 1, wherein the data transfer instructions and the compute instructions are executed by the processing elements (see para [0041] disclosing processing elements processing transfer/compute instructions).
Regarding claim 9, Aydonat discloses the computer-implemented method of claim 1, wherein the data transfer instructions include a store data instruction to transfer data from one of the processing elements to a memory also on the semiconductor die but outside the interconnected processing elements (see para [0087] disclosing store data instructions being transferred by bus).
Regarding claim 10, Aydonat discloses the computer-implemented method of claim 1, wherein: the data transfer instructions include a load data instruction to transfer data from a memory also on the semiconductor die but outside the interconnected processing elements to one of the processing elements (see fig 14 disclosing memory storing compute and operate instructions, traveling through a bus); and the data transfer path for the load data instruction (see fig 14, 1401) includes a request path from the processing element to the memory to provide a request for the data to the memory (see fig 14 disclosing paths), and a response path from the memory to the processing element to provide the data to the processing element responsive to the request (see fig 14, 1401 disclosing the data paths).
Regarding claim 11, Aydonat discloses the computer-implemented method of claim 1, wherein the data transfer instructions include a move data instruction to transfer data from one of the processing elements to another one of the processing elements (see fig 14, 1401 disclosing a bus that takes data/transfer/compute instructions from processor to memory to storage), and the data transfer instruction includes a sequence of codes defining the routing of the data transfer path (see para [0091] disclosing sequences of instructions and fig 15), each code defining a direction of data transfer between two adjacent processing elements (see fig 9, disclosing processing elements being adjacent to one another).
Regarding claim 14, Aydonat discloses the computer-implemented method of claim 1, wherein the interconnected processing elements (921-944, etc.) include interior processing elements that are directly connected to other processing elements and edge processing elements that are also directly connected to a memory (see figs 9 and 14) also on the semiconductor die but outside the interconnected processing elements (see the exterior bus connection and fig 9 disclosing different processing elements).
Regarding claim 15, Aydonat discloses the computer-implemented method of claim 1, wherein at least one interior processing element is directly connected to the memory (see fig 14 disclosing interconnected bus).
Regarding claim 16, Aydonat discloses the computer-implemented method of claim 14, wherein the data transfer instructions include a load data instruction to transfer data from the memory to an interior processing element via one of the edge processing elements (see paras [0090] and [0091] disclosing the EDA tool and the CNN accelerator interconnected via the bus in fig 14).
Regarding claim 17, Aydonat discloses the computer-implemented method of claim 1, wherein the machine learning network comprises a plurality of layers (see para [0093] disclosing layers) and allocating computations to processing elements comprises: determining, for each layer, a partial computation metric based on the computations performed to implement that layer (see para [0065] disclosing partial sums at each ReLU in fig 9); and allocating the processing elements to layers based on the partial computation metric.
Regarding claim 18, Aydonat discloses the computer-implemented method of claim 1, wherein the processing elements execute the data transfer instructions without using a hardware routing table (see para [0098] disclosing HDL routing).
Regarding claim 19, Aydonat discloses the computer-implemented method of claim 1, wherein the processing elements execute the data transfer instructions without performing congestion or collision arbitration in hardware (see para [0098] disclosing HDL).
Regarding claim 20, Aydonat discloses a non-transitory computer readable storage medium storing instructions (see para [0025] and fig 14 disclosing computer/storage of instructions) for implementing a machine learning network on a machine learning accelerator (MLA) (see para [0033] disclosing a CNN (convolutional neural network)), the MLA comprising a plurality of interconnected processing elements (see para [0046] disclosing interconnected PEs 901-904 on an FPGA) implemented on a semiconductor die (see paras [0059]-[0061] disclosing on-chip configuration), the method comprising:
receiving a description of the machine learning network (see paras [0097], [0098] disclosing receiving descriptions);
allocating computations that implement the machine learning network to processing elements for execution (see paras [0102]-[0103]: configuration tool manager 1600 provides an interface that allows a user to input data into the configuration tool 1600 for instruction execution);
determining data transfers for transfer of data to or from processing elements in order to execute the computations (see paras [0041], [0061], [0096] disclosing a sequencer unit that coordinates transmission of data to processing elements);
generating a computer program comprising compute instructions that implement the computations and data transfer instructions that implement the data transfers (see fig 5, step 503; 1500 generates and programs a sequencer unit that coordinates transmission of the data to appropriate processing element arrays on the CNN accelerator; see paras [0041], [0061], [0096] disclosing a sequencer unit that coordinates transmission of data to processing elements), comprising:
determining non-conflicting data transfer paths (see para [0041]) for the data transfers based on a topology of the interconnections between processing elements (see para [0044] disclosing logic gates and mapping of an optimized logical representation), on dependencies of the instructions and on a duration for execution of the instructions (see para [0044] disclosing dependencies/logic relationships and para [0068] disclosing a sequencer unit that coordinates the transmission of data to appropriate processing elements on the CNN accelerator at appropriate times in order to time multiplex computations on the processing elements); wherein each data transfer path specifies a routing and a time slot for the data transfer (see para [0041] disclosing time multiplexing);
generating data transfer instructions that specify routing of the data transfers (see para [0087] disclosing instructions); and 
generating a static schedule that schedules execution of the data transfer instructions during the time slots for the data transfers (see para [0041] disclosing time multiplexing), the static schedule also scheduling execution of the compute instructions (element 503 coordinating time multiplexing instructions); and 
outputting the computer program (see fig 5, 504 and para [0055] disclosing outputting design).

Claim Rejections Under 35 U.S.C. §103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 12 and 13 are rejected under 35 U.S.C. §103 as being unpatentable over Aydonat in view of Kim (US 20210174137 A1).
Regarding claim 12, Aydonat discloses the computer-implemented method of claim 11, wherein the processing elements along the data transfer path: determine a next processing element along the data transfer path using a first code of the sequence of codes (see para [0091] disclosing sequences of instructions and fig 15).
Kim further discloses generating an updated sequence of codes by replacing the first code with a second code (see paras [0115] and [0121] disclosing replacing first and second neural networks), the second code defining a direction of data transfer opposite to the direction defined by the first code (see para [0255] disclosing bidirectional code); and providing the updated sequence of codes to the next processing element (see paras [0115] and [0121] disclosing updating with a new neural network).
Aydonat and Kim are in the same or similar fields of endeavor. It would have been obvious to combine Aydonat and Kim. Aydonat and Kim may be combined by forming the neural network of Aydonat with the replacement method of Kim in order to optimize performance (see Kim, para [0115]).
Regarding claim 13, Aydonat discloses the computer-implemented method of claim 11, wherein the processing elements along the data transfer path: use the sequence of codes to route first data along the data transfer path; generate an updated sequence of codes that specify a reverse of the data transfer path (Kim discloses a bidirectional recurrent deep neural network and adversarial networks); and use the updated sequence of codes to route second data along the reverse of the data transfer path (see Kim, para [0255]).
Aydonat and Kim are in the same or similar fields of endeavor. It would have been obvious to combine Aydonat and Kim. Aydonat and Kim may be combined by forming the neural network of Aydonat with the replacement method of Kim in order to optimize performance (see Kim, para [0115]).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EDWARD CHIN, whose telephone number is (571) 270-1827. The examiner can normally be reached M-F 9AM-5PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Steven Gauthier, can be reached at (571) 270-0373. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/EDWARD CHIN/ Primary Examiner, Art Unit 2813