DETAILED ACTION
Claims 1-20 are pending in the application.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Examiner’s Notes
The Examiner cites particular sections in the references as applied to the claims below for the convenience of the applicant(s). Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant(s) fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the Examiner.

Specification
Applicant is reminded of the proper language and format for an abstract of the disclosure.
The abstract should be in narrative form and generally limited to a single paragraph on a separate sheet within the range of 50 to 150 words in length. The abstract should describe the disclosure sufficiently to assist readers in deciding whether there is a need for consulting the full patent text for details.
The language should be clear and concise and should not repeat information given in the title. It should avoid using phrases which can be implied, such as, “The disclosure concerns,” “The disclosure defined by this invention,” “The disclosure describes,” etc.  In addition, the form and legal phraseology often used in patent claims, such as “means” and “said,” should be avoided.
The abstract of the disclosure is objected to because it uses phrases which can be implied, such as “Technologies are disclosed herein”.  Correction is required.  See MPEP § 608.01(b).
The use of the term BLUETOOTH, which is a trade name or a mark used in commerce, has been noted in this application. The term should be accompanied by the generic terminology; furthermore the term should be capitalized wherever it appears or, where appropriate, include a proper symbol indicating use in commerce such as ™, SM, or ® following the term.
Although the use of trade names and marks used in commerce (i.e., trademarks, service marks, certification marks, and collective marks) is permissible in patent applications, the proprietary nature of the marks should be respected and every effort made to prevent their use in any manner which might adversely affect their validity as commercial marks.

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites the limitation "the interconnect topology" in lines 5-6.  There is insufficient antecedent basis for this limitation in the claim. It is not clear if this limitation is referring to “an interconnect topology for transmitting data between a plurality of graphical processing units (GPUs)”, “an inter-GPU point-to-point topology”, or “a shared interconnect topology”.
For the following analysis, the Examiner will consider this limitation as referring to --the interconnect topology for transmitting the data between the plurality of GPUs--.
Claims 2-7 inherit the features of claim 1 and are rejected accordingly.
Claim 5 contains the trademark/trade name “NVIDIA NVLINK”.  Where a trademark or trade name is used in a claim as a limitation to identify or describe a particular material or product, the claim does not comply with the requirements of 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph.  See Ex parte Simpson, 218 USPQ 1020 (Bd. App. 1982).  The claim scope is uncertain since the trademark or trade name cannot be used properly to identify any particular material or product.  A trademark or trade name is used to identify a source of goods, and not the goods themselves.  Thus, a trademark or trade name does not identify or describe the goods associated with the trademark or trade name.  In the present case, the trademark/trade name is used to identify/describe a GPU interconnect topology and, accordingly, the identification/description is indefinite.
Claim 8 recites the limitation "the interconnect topology" in lines 6-7.  There is insufficient antecedent basis for this limitation in the claim. It is not clear if this limitation is referring to “an interconnect topology for transmitting data between a plurality of graphical processing units (GPUs)”, “an inter-GPU point-to-point topology”, or “a shared interconnect topology”.
For the following analysis, the Examiner will consider this limitation as referring to --the interconnect topology for transmitting the data between the plurality of GPUs--.
Claims 9-14 inherit the features of claim 8 and are rejected accordingly.
Claim 9 recites the limitation "the interconnect topology" in lines 1-2.  There is insufficient antecedent basis for this limitation in the claim. It is not clear if this limitation is referring to “an interconnect topology for transmitting data between a plurality of graphical processing units (GPUs)” or “an inter-GPU point-to-point topology” previously recited in claim 8.
For the following analysis, the Examiner will consider this limitation as referring to --the interconnect topology for transmitting the data between the plurality of GPUs--.
Claim 10 inherits the features of claim 9 and is rejected accordingly.
Claim 10 contains the trademark/trade name “NVIDIA NVLINK”.  Where a trademark or trade name is used in a claim as a limitation to identify or describe a particular material or product, the claim does not comply with the requirements of 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph.  See Ex parte Simpson, 218 USPQ 1020 (Bd. App. 1982).  The claim scope is uncertain since the trademark or trade name cannot be used properly to identify any particular material or product.  A trademark or trade name is used to identify a source of goods, and not the goods themselves.  Thus, a trademark or trade name does not identify or describe the goods associated with the trademark or trade name.  In the present case, the trademark/trade name is used to identify/describe a GPU interconnect topology and, accordingly, the identification/description is indefinite.
Claim 15 recites the limitation "the interconnect topology" in lines 8-9.  There is insufficient antecedent basis for this limitation in the claim. It is not clear if this limitation is referring to “an interconnect topology for transmitting data between a plurality of graphical processing units (GPUs)” or “an inter-GPU point-to-point topology”.
For the following analysis, the Examiner will consider this limitation as referring to --the interconnect topology for transmitting the data between the plurality of GPUs--.
Claims 16-20 inherit the features of claim 15 and are rejected accordingly.
Claim 16 recites the limitation "the interconnect topology" in line 1.  There is insufficient antecedent basis for this limitation in the claim. It is not clear if this limitation is referring to “an interconnect topology for transmitting data between a plurality of graphical processing units (GPUs)” or “an inter-GPU point-to-point topology” previously recited in claim 15.
For the following analysis, the Examiner will consider this limitation as referring to --the interconnect topology for transmitting the data between the plurality of GPUs--.
Claim 16 contains the trademark/trade name “NVIDIA NVLINK”.  Where a trademark or trade name is used in a claim as a limitation to identify or describe a particular material or product, the claim does not comply with the requirements of 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph.  See Ex parte Simpson, 218 USPQ 1020 (Bd. App. 1982).  The claim scope is uncertain since the trademark or trade name cannot be used properly to identify any particular material or product.  A trademark or trade name is used to identify a source of goods, and not the goods themselves.  Thus, a trademark or trade name does not identify or describe the goods associated with the trademark or trade name.  In the present case, the trademark/trade name is used to identify/describe a GPU interconnect topology and, accordingly, the identification/description is indefinite.

Claim Rejections - 35 USC § 112(d)
The following is a quotation of 35 U.S.C. 112(d):
(d) REFERENCE IN DEPENDENT FORMS.—Subject to subsection (e), a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

The following is a quotation of pre-AIA 35 U.S.C. 112, fourth paragraph:
Subject to the following paragraph [i.e., the fifth paragraph of pre-AIA 35 U.S.C. 112], a claim in dependent form shall contain a reference to a claim previously set forth and then specify a further limitation of the subject matter claimed. A claim in dependent form shall be construed to incorporate by reference all the limitations of the claim to which it refers.

Claims 14 and 20 are rejected under 35 U.S.C. 112(d) or pre-AIA 35 U.S.C. 112, 4th paragraph, as being of improper dependent form for failing to further limit the subject matter of the claim upon which it depends, or for failing to include all the limitations of the claim upon which it depends.
Claim 14, which depends on claim 8, recites “having further computer-executable instructions stored thereupon to execute the program code to transmit the data between the GPUs based on the directed spanning trees” in lines 1-3. However, claim 8 already recites “program code which, when executed, will cause the data to be transmitted between the GPUs based on the directed spanning trees”. Therefore, claim 14 is in improper dependent form as it fails to further limit the subject matter of claim 8.
Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.
Claim 20, which depends on claim 15, recites “has further computer-executable instructions stored thereupon to execute the program code to transmit the data between the GPUs based on the directed spanning trees” in lines 2-4. However, claim 15 already recites “program code which, when executed, will cause the data to be transmitted between the GPUs based on the directed spanning trees”. Therefore, claim 20 is in improper dependent form as it fails to further limit the subject matter of claim 15.
Applicant may cancel the claim(s), amend the claim(s) to place the claim(s) in proper dependent form, rewrite the claim(s) in independent form, or present a sufficient showing that the dependent claim(s) complies with the statutory requirements.

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1-6, 8-12, 14-18, and 20 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Smola et al. (US 11,176,489 B1; hereinafter Smola).

With respect to claim 1, Smola teaches: A computer-implemented method, comprising: 
determining an interconnect topology for transmitting data between a plurality of graphical processing units (GPUs) (see e.g. column 6, lines 52-55: “In FIG. 2, a current topology 205 with three nodes—e.g., three GPUs—and edges representing links between ones of the nodes (representing links between the GPUs) is shown as a ring”; column 6, lines 26-31: “determine, based on the topology information, a communication schedule to be used by the involved processing elements 130 during execution of the application. For example, the communication schedule may be used for gather-scatter type operations involved in distributed training of ML models”; and Fig. 2), the interconnect topology comprising an inter-GPU point-to-point topology (see e.g. column 3, lines 33-38: “architecture of two high-performance GPU-based servers shown in FIG. 3 and FIG. 8. For example, in FIG. 3, which roughly corresponds to the architecture of the DGX-1 (TM) from NVIDIA™ that is used in some provider networks, the communication design is shown with solid lines representing 300 GB NVLink interconnects between GPUs”; and Fig. 3, 8) and a shared interconnect topology (see e.g. column 3, lines 33-44: “architecture of two high-performance GPU-based servers shown in FIG. 3 and FIG. 8… dotted lines are PCI-Express bus lanes… in FIG. 8, the solid lines are PCI-Express bus lanes between GPUs (shown as “G0-G15”)”; and Fig. 3, 8); 
packing a quantity of directed spanning trees (see e.g. column 7, lines 27-29: “identify a single set of spanning trees (e.g., using a greedy algorithm described above), which will result in a very performant (and possibly optimal) schedule”) corresponding to the interconnect topology (see e.g. column 7, lines 21-24: “communications “schedule” 250 can be identified in which nodes A and C send half of their data to node B, nodes B and C send half of their data to node A, and nodes A and B send half of their data to node C”; column 6, lines 26-27: “determine, based on the topology information, a communication schedule”; and Fig. 2), the directed spanning trees comprising data defining communication links between the GPUs (see e.g. column 6, lines 52-61: “a current topology 205 with three nodes—e.g., three GPUs—and edges representing links between ones of the nodes (representing links between the GPUs) is shown as a ring… In a first iteration 200A, a minimal spanning tree can then be selected, using the current topology 205. For example, a minimal spanning tree having a maximal total weight of edges can be selected as “TREE 1””; column 7, lines 1-6: “In a second iteration 200B, a next minimal spanning tree can then be selected, using the “current” topology (here, the updated topology 215 from iteration 200A having edge weights of ½, ½, and 1). For example, a minimal spanning tree having a maximal total weight of edges can be selected as “TREE 2”—here, a tree B-A-C is selected at 210”; column 7, lines 11-17: “As another spanning tree can be selected, another third iteration 200C is performed to select the tree using the “current” topology (here, the updated topology 215 from second iteration 200B having edge weights of 0, ½, and ½). Thus, a minimal spanning tree having a maximal total weight of edges can be selected as “TREE 3”—here, a tree A-C-D is selected at 210”; and Fig. 2) and an amount of the data to be transmitted on the communication links (see e.g. column 6, lines 55-56: “Each edge has a weight indicating a proportion of bandwidth available on the link”; column 6, lines 63-65: “Each weight of each selected edge will be half consumed by data for this tree, so each link has a weight of ½”; column 7, lines 6-10: “Each selected link will have its weight/bandwidth reduced by ½, resulting in the updated topology 215 with weights (0, ½, and ½)”; and column 7, lines 17-19: “Each selected link will have its weight/bandwidth reduced by ½, resulting in the updated topology 215 with weights (0, 0, and 0)”); and 
generating program code which, when executed, will cause the data to be transmitted between the GPUs based on the directed spanning trees (see e.g. column 7, lines 27-29: “identify a single set of spanning trees (e.g., using a greedy algorithm described above), which will result in a very performant (and possibly optimal) schedule”; column 6, lines 26-31: “determine, based on the topology information, a communication schedule to be used by the involved processing elements 130 during execution of the application. For example, the communication schedule may be used for gather-scatter type operations involved in distributed training of ML models”; and Fig. 2).
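
For illustration only, the iterations quoted above from Smola (a three-node ring, a tree of maximal total remaining weight selected in each iteration, and each selected link reduced by ½) can be sketched as follows. This is a minimal sketch under stated assumptions and is not code taken from Smola or from the application: the function names `spanning_trees` and `greedy_pack`, the exhaustive enumeration of candidate trees, the fixed step of ½, and the undirected treatment of links are all hypothetical choices made for the example.

```python
from itertools import combinations


def spanning_trees(nodes, edges):
    """Yield each subset of len(nodes) - 1 edges that contains no cycle.

    On a connected node set, such a subset is a spanning tree. Exhaustive
    enumeration is an assumption that only suits very small topologies.
    """
    for subset in combinations(edges, len(nodes) - 1):
        parent = {n: n for n in nodes}

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        acyclic = True
        for u, v in subset:
            root_u, root_v = find(u), find(v)
            if root_u == root_v:  # this edge would close a cycle
                acyclic = False
                break
            parent[root_u] = root_v
        if acyclic:
            yield subset


def greedy_pack(nodes, weights, step=0.5):
    """Greedily select trees until no spanning tree has capacity left.

    `weights` maps each link to its remaining bandwidth share; every link
    of a selected tree is reduced by `step` (assumed to be 1/2 here).
    """
    remaining = dict(weights)
    trees = []
    while True:
        usable = [e for e, w in remaining.items() if w >= step]
        candidates = list(spanning_trees(nodes, usable))
        if not candidates:
            break
        best = max(candidates, key=lambda t: sum(remaining[e] for e in t))
        for e in best:
            remaining[e] -= step
        trees.append(best)
    return trees, remaining


# Three GPUs in a ring, each link starting with its full bandwidth share.
ring = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 1.0}
trees, remaining = greedy_pack(["A", "B", "C"], ring)
```

Under these assumptions the sketch selects three trees and leaves every link weight at zero, which is consistent with the weights (½, ½, 1), (0, ½, ½), and (0, 0, 0) recited in the quoted passages.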

With respect to claim 2, Smola teaches: The computer-implemented method of claim 1, wherein the quantity of directed spanning trees is selected to minimize the number of directed spanning trees (see e.g. column 7, lines 27-28: “identify a single set of spanning trees (e.g., using a greedy algorithm described above”; column 6, lines 60-61: “a minimal spanning tree having a maximal total weight of edges can be selected as “TREE 1””; column 7, lines 4-6: “a minimal spanning tree having a maximal total weight of edges can be selected as “TREE 2””; and column 7, lines 14-15: “a minimal spanning tree having a maximal total weight of edges can be selected as “TREE 3””) and to maximize utilization of bandwidth available on the communication links (see e.g. column 11, lines 63-67: “possible bandwidth for a particular schedule of a topology can be determined, and if it is the same as a maximal theoretical bandwidth for the topology (or within some threshold amount) then the schedule can be deemed to be “optimal.””; and column 7, lines 25-29: “schedule can be determined to be an “optimal” schedule as detailed below. Accordingly, some embodiments may simply identify a single set of spanning trees (e.g., using a greedy algorithm described above), which will result in a very performant (and possibly optimal) schedule”).

With respect to claim 3, Smola teaches: The computer-implemented method of claim 1, wherein the program code is configured to select a chunk size for chunks of the data to be transferred between the GPUs (see e.g. column 10, lines 8-13: “send a fraction λ of the data using T and the remainder of the data (1−λ) using T′. Hence, the amount of data flowing along an edge e is given by λT_e+(1−λ)T′_e where T_e=1 if e∈T and zero otherwise. This will consequently lead to c_e·(λT_e+(1−λ)T′_e) time to complete the data transfer. If the data flow is thus partitioned appropriately”) and to pipeline transmission of the chunks of the data between the GPUs (see e.g. column 10, line 14: “data flow is thus partitioned appropriately”; and Fig. 2-3, 8).
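
For illustration only, the per-edge expression quoted from Smola, column 10, lines 8-13, can be evaluated as in the following minimal sketch. The function names `edge_loads` and `completion_time`, the example link costs, and the assumption that links transfer in parallel (so that the slowest link determines the overall time) are hypothetical and are not drawn from Smola or from the application.

```python
def edge_loads(edges, tree_t, tree_t_prime, lam):
    """Share of the data on each edge: lam * T_e + (1 - lam) * T'_e.

    T_e is 1 when edge e belongs to tree T and 0 otherwise (likewise T'_e).
    """
    return {
        e: lam * (e in tree_t) + (1 - lam) * (e in tree_t_prime)
        for e in edges
    }


def completion_time(costs, tree_t, tree_t_prime, lam):
    """Largest c_e * load_e over all edges.

    Assumption: all links transfer in parallel, so the slowest link sets
    the completion time of the whole transfer.
    """
    loads = edge_loads(costs, tree_t, tree_t_prime, lam)
    return max(costs[e] * loads[e] for e in costs)


# Hypothetical per-unit transfer costs c_e on a three-GPU ring.
costs = {("A", "B"): 2.0, ("B", "C"): 1.0, ("A", "C"): 2.0}
tree_t = {("A", "B"), ("B", "C")}
tree_t_prime = {("A", "C"), ("B", "C")}

single_tree = completion_time(costs, tree_t, tree_t_prime, 1.0)  # all data on T
even_split = completion_time(costs, tree_t, tree_t_prime, 0.5)   # half on each tree
```

With these example costs, sending everything over T takes 2.0 time units, while an even split across T and T′ takes 1.0, which illustrates why partitioning the data flow between trees can shorten the transfer.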

With respect to claim 4, Smola teaches: The computer-implemented method of claim 1, wherein the data comprises model parameters for a deep neural network (DNN) model (see e.g. column 6, lines 29-31: “the communication schedule may be used for gather-scatter type operations involved in distributed training of ML models”; and column 2, lines 53-64: “Machine learning (ML) algorithms, especially those considered to be “deep learning” algorithms, require large amounts of computation for the training of ML models. Deep learning is a type of machine learning that typically “trains” a computer to perform human-like tasks, such as recognizing human speech, identifying objects within images, understanding/generating human language, making predictions, etc. Deep learning algorithms may include, for example, neural networks such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), fully-connected deep neural nets (DNNs), deep stacking networks (DSNs), etc.”).

With respect to claim 5, Smola teaches: The computer-implemented method of claim 1, wherein the inter-GPU point-to-point topology comprises an NVIDIA NVLINK topology (see e.g. column 3, lines 34-38: “FIG. 3, which roughly corresponds to the architecture of the DGX-1 (TM) from NVIDIA™ that is used in some provider networks, the communication design is shown with solid lines representing 300 GB NVLink interconnects between GPUs”; and Fig. 3).

With respect to claim 6, Smola teaches: The computer-implemented method of claim 1, wherein the shared interconnect topology comprises a Peripheral Component Interconnect Express (PCIe) topology (see e.g. column 3, lines 33-44: “architecture of two high-performance GPU-based servers shown in FIG. 3 and FIG. 8… dotted lines are PCI-Express bus lanes… in FIG. 8, the solid lines are PCI-Express bus lanes between GPUs (shown as “G0-G15”)”; and Fig. 3, 8).

With respect to claims 8-12 and 14: Claims 8-12 and 14 are directed to a computer-readable storage medium having instructions stored thereupon which, when executed by a processor, cause the processor to perform active steps corresponding to the method recited in claims 1-3 and 5-6; please see the rejections directed to claims 1-3 and 5-6 above, which also cover the limitations recited in claims 8-12 and 14. Note that Smola also discloses a computer-accessible storage medium comprising instructions to perform operations corresponding to those recited in claims 8-12 and 14 (see e.g. column 17, lines 61-67; column 18, lines 1-5).

With respect to claims 15-18 and 20: Claims 15-18 and 20 are directed to a computing system, comprising a processor and a computer-readable storage medium having instructions stored thereupon which, when executed by the processor, cause the processor to perform active steps corresponding to the method recited in claims 1-3 and 5-6; please see the rejections directed to claims 1-3 and 5-6 above, which also cover the limitations recited in claims 15-18 and 20. Note that Smola also discloses a computer system 1600 comprising processors 1610A-N and a system memory 1620 to perform operations corresponding to those recited in claims 15-18 and 20 (see e.g. Fig. 16).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 7, 13, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Smola in view of Roy et al. (US 2011/0185328 A1; hereinafter Roy).

With respect to claim 7, Smola teaches: The computer-implemented method of claim 1.
Smola does not teach, but Roy teaches:
wherein the program code comprises Compute Unified Device Architecture (CUDA) program code (see e.g. Roy, paragraph 27: “CUDA 2.3 SDK may be used for GPU implementations”; and paragraph 13: “For NVIDIA GPUs, the CUDA (Compute Unified Device Architecture) programming model provides a platform for non-graphics applications to exploit the computation bandwidth of the GPU”).
Smola and Roy are analogous art because they are in the same field of endeavor: configuring GPU topologies. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Smola with the teachings of Roy. The motivation/suggestion would be to provide a platform for non-graphics applications to exploit the computation bandwidth of a GPU, thus improving the overall extensibility of the system.

With respect to claim 13: Claim 13 is directed to a computer-readable storage medium having instructions stored thereupon which, when executed by a processor, cause the processor to perform active steps corresponding to the method recited in claim 7; please see the rejection directed to claim 7 above, which also covers the limitations recited in claim 13.

With respect to claim 19: Claim 19 is directed to a computing system, comprising a processor and a computer-readable storage medium having instructions stored thereupon which, when executed by the processor, cause the processor to perform active steps corresponding to the method recited in claim 7; please see the rejection directed to claim 7 above, which also covers the limitations recited in claim 19.

CONCLUSION
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US 10,587,534 B2 by Gray et al. discloses an interconnect topology utilizing spanning trees for implementing communications between GPUs.

Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Umut Onat whose telephone number is (571)270-1735. The examiner can normally be reached M-Th 9:00-7:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hyung (Sam) S Sough can be reached on (571) 272-6799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/UMUT ONAT/
Primary Examiner, Art Unit 2194