DETAILED ACTION


Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.



Response to Amendment
Received 11/19/2020

	Claim(s) 1-5 and 11-20 are pending.
	Claim(s) 1, 11, and 16 have been amended.
Claim(s) 6-10 have been cancelled.
	The 35 U.S.C. § 112(f) interpretation of claims 11-15 has been maintained in view of the amendments received 11/19/2020.


Response to Arguments
Received 11/19/2020


Regarding independent claims 1, 11, and 16:

Applicant’s arguments (Remarks, Page 9: ¶ 2-3), filed 11/19/2020, with respect to the rejection(s) of claim(s) 1, 11, and 16 under 35 U.S.C. § 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn in view of Applicant's amendments.



Applicant's remaining arguments filed 11/19/2020 have been fully considered but are not persuasive, as explained below.


Regarding 35 U.S.C. 112(f):

Applicant argues (Remarks, Page 6, ¶ 2), that “Claims 11-15 are rejected under 35 U.S.C. 112(f), sixth paragraph. Various claims have been amended to address the rejections under 35 USC § 112(f).”
The Examiner disagrees. Applicant’s arguments and amendments to independent claim 11 fail to overcome the interpretation under 35 U.S.C. 112(f).



Regarding the prior art of Chen et al. (US PGPUB No. 20180330239 A1):

Applicant argues (Remarks, Page 6, ¶ 4), “…  that the rejections under 35 USC §103(a) are improper as a matter of law because: (1) Chen is not prior art to this application, and (2) there is no evidence on the record that PCT/CN2016/078448 describes the subject matter used to support the rejection. This application was filed April 8, 2017. Chen was filed July 20, 2018, and therefore does not constitute prior art. The Action appears to rely on the filing date of the parent application PCT/CN2016/078448 to establish Chen as prior art. However, Chen was filed as a continuation-in-part of PCT/CN2016/078448. Under 35 USC § 102(d)(2) Chen is entitled to claim the Effective Date of PCT/CN2016/078448 only for the subject matter described in PCT/CN2016/078448. There is no evidence of record to establish that PCT/CN2016/078448 discloses the subject matter for which Chen is cited in the Action. Without such evidence, Chen is not entitled to the Effective Date of PCT/CN2016/078448. The rejection based on Chen is therefore improper as a matter of law and must be withdrawn.”
The Examiner disagrees. Applicant’s arguments fail to recognize that the continuation-in-part of PCT/CN2016/078448 is proper under MPEP 2152.01, which states:

“(A) If the application is a continuation or divisional of one or more earlier U.S. applications or international applications and if the requirements of 35 U.S.C. 120, 365(c), or 386(c) have been satisfied, the effective filing date is the same as the earliest filing date in the line of continuation or divisional applications.
(B) If the application is a continuation-in-part of an earlier U.S. application or international application, any claims in the new application not supported by the specification and claims of the parent application have an effective filing date equal to the actual filing date of the new application. Any claims which are fully supported under 35 U.S.C. 112 by the earlier parent application have the effective filing date of that earlier parent application.
(C) If the application properly claims benefit under 35 U.S.C. 119(e) to a provisional application, the effective filing date is the filing date of the provisional application for any claims which are fully supported under 35 U.S.C. 112 by the provisional application.

See MPEP § 1893.03(c) for a discussion of claims for priority to, or the benefit of, the filing date of a prior-filed foreign or domestic application in an application that entered the national stage under 35 U.S.C. 371. See MPEP §§ 211.01(c) and 1895 for additional information on determining the effective filing date of a continuation, divisional, or continuation-in-part of a PCT application designating the U.S. See also MPEP §§ 1895.01 and 1896 which discuss differences between applications filed under 35 U.S.C. 111(a) and international applications that enter national stage under 35 U.S.C. 371.”
Additionally, machine translations of PCT/CN2016/078448 are provided.
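
The MPEP 2152.01 passage quoted above can be read as a per-claim decision. The following is a minimal Python sketch of that reading, offered for illustration only. The function name, the ApplicationKind labels, and the fully_supported_by_earlier flag are hypothetical and are not part of the MPEP or of the record; whether a claim is fully supported under 35 U.S.C. 112 is a legal determination that the sketch takes as an input.

```python
from datetime import date
from enum import Enum


class ApplicationKind(Enum):
    # Hypothetical labels for the three cases (A)-(C) quoted from MPEP 2152.01.
    CONTINUATION_OR_DIVISIONAL = "A"
    CONTINUATION_IN_PART = "B"
    PROVISIONAL_BENEFIT = "C"


def effective_filing_date(kind, actual_filing, earlier_filing,
                          fully_supported_by_earlier):
    """Return the effective filing date of a single claim (illustrative only)."""
    if kind is ApplicationKind.CONTINUATION_OR_DIVISIONAL:
        # Case (A): earliest date in the chain, assuming the requirements of
        # 35 U.S.C. 120, 365(c), or 386(c) have been satisfied.
        return earlier_filing
    # Cases (B) and (C): the earlier date applies only to claims that are
    # fully supported under 35 U.S.C. 112 by the earlier application.
    if fully_supported_by_earlier:
        return earlier_filing
    return actual_filing
```

Under this reading, a claim in a continuation-in-part keeps the parent's date only when fully_supported_by_earlier is true; otherwise it takes the actual filing date of the continuation-in-part.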



Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

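The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.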

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
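Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.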
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
Claim limitation “instruction to perform operations” (within claim 11) has been interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because it uses a generic placeholder “instruction” coupled with functional language “to perform” without reciting sufficient structure to achieve the function. Furthermore, the generic placeholder is not preceded by a structural modifier.
Since the claim limitation invokes 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, claims 11-15 have been interpreted to cover the corresponding structure described in the specification that achieves the claimed function, and equivalents thereof.
A review of the specification shows that the following appears to be the corresponding structure described in the specification for the 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph limitation: wherein, an instruction to perform corresponds to a processing circuit (specification; [¶ 0058, ¶ 0060, ¶ 0066-0067, and ¶ 0072]) of a processor circuit 234 as depicted within Fig. 2D.  
If applicant wishes to provide further explanation or dispute the examiner’s interpretation of the corresponding structure, applicant must identify the corresponding structure with reference to the specification by page and line number, and to the drawing, if any, by reference characters in response to this Office action. 
If applicant does not intend to have the claim limitation treated under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may amend the claims so that they will clearly not invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, or present a sufficient showing that the claims recite sufficient structure, material, or acts for performing the claimed function to preclude application of 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
For more information, see MPEP § 2173 et seq. and Supplementary Examination Guidelines for Determining Compliance With 35 U.S.C. 112 and for Treatment of Related Issues in Patent Applications, 76 FR 7162, 7167 (Feb. 9, 2011).




EXAMINER’S AMENDMENT

Claims 1-5 and 11-20 are allowed.
	Claims 1, 11, and 16 are amended.

An Examiner’s Amendment to the record appears below. Should the changes and/or additions be unacceptable to Applicant, an amendment may be filed as provided by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be submitted no later than the payment of the issue fee.
Authorization for this Examiner’s Amendment was given in a telephone interview with Jed W. Caven on February 11, 2021.

Amended claims 1, 11, and 16 are as follows:


AMENDMENTS TO THE CLAIMS:

1. (Currently Amended) A general purpose graphics processor comprising: 
an instruction cache to receive a stream of instructions; 
an instruction unit to execute the stream of instructions; 
a general-purpose graphics processing compute block comprising a plurality of graphics processing cores;

a processor to:
apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network;
apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank;
characterize one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix;
determine a scalar associated with each of the one or more independent rows of the matrix;
encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data;
apply delta compression to compress the encoded weight data;
store the encoded weight data in the shared memory; and
load the matrix into the neural network using hardware when the rank is beneath a threshold. 

11. (Currently Amended) A method, comprising:
receiving, in an instruction cache, a stream of instructions;
executing, in an instruction unit, the stream of instructions;

applying a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network;
applying a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank;
characterizing one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix;
determining a scalar associated with each of the one or more independent rows of the matrix;
encoding a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data;
implementing a delta compression algorithm to compress the encoded weight data;
storing the encoded weight data in the shared memory; and 
loading the matrix into the neural network using hardware when the rank is beneath a threshold.

16. (Currently Amended) An electronic device comprising:
a computer readable memory; and 

an instruction cache to receive a stream of instructions; 
an instruction unit to execute the stream of instructions;
a general-purpose graphics processing compute block comprising a plurality of graphics processing cores;
a shared memory communicatively coupled to the plurality of graphics processing cores; and
a processor communicatively coupled to the shared memory to:
apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network;
apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank;
characterize one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix;
encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data;
implement a delta compression algorithm to compress the encoded weight data;
store the encoded weight data in the shared memory; and
load the matrix into the neural network using hardware when the rank is beneath a threshold.
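
For readers unfamiliar with the encoding steps recited in the amended claims, the following minimal Python sketch illustrates one possible reading of "encode ... rows with the scalar associated with the row" followed by delta compression. It is an illustration under stated assumptions, not the applicant's implementation and not a construction of the claims: the claims do not say how the scalar is chosen, so the sketch assumes the scalar is the largest absolute weight in the row, and the quantization range LEVELS and all function names are hypothetical. The matrix interpolation and singular value decomposition steps are omitted, and the sketch performs only the delta transform, with no entropy coding.

```python
from itertools import accumulate

LEVELS = 127  # hypothetical quantization range, chosen only for illustration


def encode_row_with_scalar(row):
    """Return (scalar, integer codes) for one row of weights.

    Assumption: the scalar is the largest absolute value in the row.
    """
    scalar = max((abs(v) for v in row), default=0.0) or 1.0
    codes = [round(v / scalar * LEVELS) for v in row]
    return scalar, codes


def delta_encode(codes):
    """Keep the first code, then store each code as a difference from its predecessor."""
    if not codes:
        return []
    return [codes[0]] + [b - a for a, b in zip(codes, codes[1:])]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    return list(accumulate(deltas))


def decode_row(scalar, deltas):
    """Recover approximate weights from the scalar and the delta-encoded codes."""
    return [c * scalar / LEVELS for c in delta_decode(deltas)]
```

Because the codes are integers, the delta transform is exactly invertible; the only loss in this sketch comes from the rounding in encode_row_with_scalar.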


	

Allowable Subject Matter

Claims 1-5 and 11-20 are allowed.

	The following is an Examiner’s statement of reasons for allowance:
Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee. Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Independent claims 1, 11, and 16 are distinguished from the closest known prior art, alone or in reasonable combination, in consideration of each claim as a whole, particularly the limitations:

(Claim 1)
a processor to:
apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network;

characterize one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix;
determine a scalar associated with each of the one or more independent rows of the matrix;
encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data;
apply delta compression to compress the encoded weight data;
store the encoded weight data in the shared memory; and
load the matrix into the neural network using hardware when the rank is beneath a threshold. 

(Claim 11)
passing the stream of instructions to a general-purpose graphics processing compute block comprising a plurality of graphics processing cores, plurality of graphics processor cores communicatively coupled to a shared memory, the instruction to perform operations comprising:
applying a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network;
applying a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank;

determining a scalar associated with each of the one or more independent rows of the matrix;
encoding a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data;
implementing a delta compression algorithm to compress the encoded weight data;
storing the encoded weight data in the shared memory; and 
loading the matrix into the neural network using hardware when the rank is beneath a threshold.

(Claim 16)
a processor communicatively coupled to the shared memory to:
apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network;
apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank;
characterize one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix; 

implement a delta compression algorithm to compress the encoded weight data;
store the encoded weight data in the shared memory; and 
load the matrix into the neural network using hardware when the rank is beneath a threshold.

Wherein:

Claims 1, 11, and 16 are similar but not identical. Although only the subject matter of claim 1 is addressed below in view of the prior art, the same reasoning applies similarly to the subject matter of claims 11 and 16.

Laine et al. (US PGPUB No. 20180101768 A1) teaches executing a stream of instructions, a general-purpose graphics processing compute block comprising a plurality of graphics processing cores, a shared memory communicatively coupled to the plurality of graphics processing cores, and to store data in the shared memory. However, Laine et al. fails to disclose a processor to: apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; characterize one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent 
Ambrose et al. (US PGPUB No. 201703448822 A1) teaches storing the encoded weight data in the shared memory. However, Ambrose et al. fails to disclose a processor to: apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; characterize one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix; determine a scalar associated with each of the one or more independent rows of the matrix; encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; apply delta compression to compress the encoded weight data; store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.
Chen et al. (US PGPUB No. 20180330239 A1) teaches an instruction cache to receive instructions; applying a matrix interpolation operation to one or more linear rows of a matrix comprising weights of a neural network; characterizing one or more rows of the matrix comprising weights of a neural network; determining a scalar associated with 
Kirchberg (US Patent No. 6058386) teaches applying a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network. However, Kirchberg fails to disclose a processor to: apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; characterize one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix; determine a scalar associated with each of the one or more independent rows of the matrix; encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; apply delta compression to compress the encoded weight data; store the encoded weight data 
Annapureddy et al. (US PGPUB No. 20160217369 A1) teaches characterizing one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix. However, Annapureddy et al. fails to disclose a processor to: apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; apply delta compression to compress the encoded weight data; store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.
Suh et al. (US PGPUB No. 20160086047 A1) teaches a matrix corresponding to learning having a rank lower than a rank threshold. However, Suh et al. fails to disclose to store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.
He et al. (US PGPUB No. 20170061047 A1) teaches a rank of a matrix being less than a threshold. However, He et al. fails to disclose to encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; apply delta compression to compress the encoded weight data; 
Sainath et al. (US PGPUB No. 20160092766 A1) teaches a low rank hidden input layer of a deep neural network not satisfying a threshold accuracy. However, Sainath et al. fails to disclose a processor to: apply delta compression to compress the encoded weight data; store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.
Garimella (US PGPUB No. 20150170020 A1) teaches low rank matrices, using a low rank matrix to produce an output layer, compressing low rank matrices, and a trained neural network weight matrix as a product of low rank matrices. However, Garimella fails to disclose a processor to: store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.
Cheng et al. (US PGPUB No. 20190229997 A1) teaches a rank of a low-rank matrix that is less than or equal to a dimension of coordinates in a network coordinate system. However, Cheng et al. fails to disclose a processor to: apply delta compression to compress the encoded weight data; store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.
Moghadam et al. (US PGPUB No. 20130223523 A1) teaches apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; 
Li et al. (US PGPUB No. 2012089888 A1) teaches apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; and a rank of a matrix lower than the matrix columns/threshold. However, Li et al. fails to disclose a processor to: encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; apply delta compression to compress the encoded weight data; store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.
Guevara et al. (US Patent No. 9767410 B1) teaches ranking a neural network and a compression to compress encoded weight data. However, Guevara et al. fails to disclose a processor to: encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; apply delta compression to compress the encoded weight data; store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.
Goldberg et al. (US PGPUB No. 20060202893 A1) teaches a rank of a matrix lower than the number of an antenna array. However, Goldberg et al. fails to disclose a processor to: apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; characterize one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix; determine a scalar associated with each of the one or more independent rows of the matrix; encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; apply delta compression to compress the encoded weight data; store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.
Yuan et al. (US Patent No. 5966444) teaches a rank of a matrix less than a dimension/threshold in relation with deciphering image data. However, Yuan et al. fails to disclose a processor to: apply delta compression to compress the encoded weight data; store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.
As a result of the limitations of independent claims 1, 11, and 16, dependent claims 2-5, 12-15, and 17-20 are also considered to be distinguished from the closest known prior art, alone or in reasonable combination.




Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited, for a listing of analogous art.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Charles Lloyd Beard, whose telephone number is (571) 272-5735. The examiner can normally be reached Monday - Friday, 8:00 AM - 5:00 PM EST, alternate Fridays.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Devona Faulk, can be reached at (571) 272-7515. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private 


CHARLES LLOYD BEARD
Examiner
Art Unit 2616



/CHARLES L BEARD/Examiner, Art Unit 2616