DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on February 21, 2022, has been entered.
 
Claims 1-37 are pending in this office action and presented for examination. Claims 1, 8, 19-20, 29, and 37 are newly amended by the response received February 21, 2022. 

Applicant is advised that should claim 31 be found allowable, claim 32 will be objected to under 37 CFR 1.75 as being a substantial duplicate thereof. When two claims in an application are duplicates or else are so close in content that they both cover the same thing, despite a slight difference in wording, it is proper after allowing one claim to object to the other as being a substantial duplicate of the allowed claim. See MPEP § 608.01(m).

Specification
The disclosure is objected to because of the following informalities. Appropriate correction is required.
In paragraph [0053], a dash should be used to connect a number and “bit” or “byte”, in view of the surrounding language. 
In paragraphs [0062], [0072], [0089]: “filed” should be “field”. 
In paragraph [00139], “ymm0-16” should be “ymm0-15”.
In paragraph [00188]: “an static” should be “a static”.
In [00194], line 2, “instructions the vector friendly instruction format” should be rephrased for clarity. 
In [00196], “a Intel” should be “an Intel”. 
In [00198], “can may be” should be rephrased for clarity.  

Drawings
The drawings are objected to because:
In Figure 5, step 509, a vertical line appears to cross “THIS”, and it is further unclear as to what this vertical line inside the block is intended to represent. 
MPEP 608.02, section V, states that “[l]ead lines are required for each reference character except for those which indicate the surface or cross section on which they are placed. Such a reference character must be underlined to make it clear that a lead line has not been left out by mistake.” However, FIGS. 7A, 7B, 8A, 8B, 8C, 9, 10A, 10B, 11, 15, 16, and 17 each contain reference characters that are neither underlined nor associated with lead lines.
The view numbers must be larger than the numbers used for reference characters. However, Figures 8A, 8B, 8C, 15, 16, and 17 do not meet this requirement. 
Words must appear in a horizontal, left-to-right fashion when the page is either upright or turned so that the top becomes the right side, except for graphs utilizing standard scientific 
In FIG. 13, “KEyBOARD” should be “KEYBOARD”. 
In FIG. 13, “MEMORY”, preceding each of 1332 and 1334, should not be underlined. 
In FIG. 14, “MEMORY”, preceding each of 1332 and 1334, should not be underlined. 
In paragraphs [0076] and [0092], reference character 750 is associated with a “round operation control field”; however, this reference character is associated with an augmentation operation field in FIG. 7A and FIG. 7B. Reference characters 758 and 759A, respectively, may have been intended. 
In Figure 10A, reference character 1004 is associated with a “local subset of the L2 cache” in particular. However, in paragraph [00155], reference character 1004 is associated with an L1 cache.
Paragraph [00177] discloses physical resources 1210, 1215 in general; however, reference characters 1210 and 1215 in Figure 12 are used to refer to processors in particular.
In Figure 13, reference character 1320 appears to be pointing to the node at which the audio I/O 1324 meets the second bus, rather than the second bus itself. 
In Figure 13, it is unclear as to whether element 1338 is connected to element 1392, given that the associated lead line is pointing to a boundary separating element 1392 from the rest of the chipset 1390.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing 

Claim Objections
Claims 2-5, 11, 18, 21, 26, 30, and 36 are objected to because of the following informalities.  Appropriate correction is required.
In claim 2, line 2, “64-bits” should be “64 bits”.
Claims 3-4 are objected to for failing to alleviate the objection of claim 2 above. 

In claim 5, line 2, “64-bits” should be “64 bits”. 
In claim 5, line 3, “512-bits” should be “512 bits”. 

In claim 11, line 2, “64-bits” should be “64 bits”. 

In claim 18, line 2, “64-bits” should be “64 bits”. 

In claim 21, line 2, “64-bits” should be “64 bits”. 

In claim 26, line 2, “64-bits” should be “64 bits”.
In claim 26, line 3, “512-bits” should be “512 bits”. 

In claim 30, line 2, “64-bits” should be “64 bits”.

In claim 36, line 2, “64-bits” should be “64 bits”.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-37 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.
Claim 1 recites the limitation “a total number of data elements” in the sixth-to-last line. However, it is indefinite as to whether this total number of data elements is the same as or different from “a total number of data elements” as recited in claim 1, tenth-to-last line. If the same, antecedent basis language should be used. For the purposes of prior art examination, Examiner is interpreting the limitation as “the total number of data elements”. 
Claims 2-7 and 9-18 are rejected for failing to alleviate the rejection of claim 1 above. 

Claim 8 recites the limitation “a total number of data elements” in the seventh-to-last line. However, it is indefinite as to whether this total number of data elements is the same as or different from “a total number of data elements” as recited in claim 8, eleventh-to-last line. If the same, antecedent basis language should be used. For the purposes of prior art examination, Examiner is interpreting the limitation as “the total number of data elements”.

Claim 19 recites the limitation “a total number of data elements” in the ninth-to-last line. However, it is indefinite as to whether this total number of data elements is the same as or different from “a total number of data elements” as recited in claim 19, thirteenth-to-last line. If the same, antecedent basis language should be used. For the purposes of prior art examination, Examiner is interpreting the limitation as “the total number of data elements”.

Claim 20 recites the limitation “a total number of data elements” in the sixth-to-last line. However, it is indefinite as to whether this total number of data elements is the same as or different from “a total number of data elements” as recited in claim 20, tenth-to-last line. If the same, antecedent basis language should be used. For the purposes of prior art examination, Examiner is interpreting the limitation as “the total number of data elements”. 
Claims 21-28 are rejected for failing to alleviate the rejection of claim 20 above. 

Claim 29 recites the limitation “a total number of data elements” in the seventh-to-last line. However, it is indefinite as to whether this total number of data elements is the same as or different from “a total number of data elements” as recited in claim 29, eleventh-to-last line. If the same, antecedent basis language should be used. For the purposes of prior art examination, Examiner is interpreting the limitation as “the total number of data elements”. 
Claims 30-36 are rejected for failing to alleviate the rejection of claim 29 above. 

Claim 37 recites the limitation “a total number of data elements” in the eighth-to-last line. However, it is indefinite as to whether this total number of data elements is the same as or different from “a total number of data elements” as recited in claim 37, twelfth-to-last line. If the same, antecedent basis language should be used. For the purposes of prior art examination, Examiner is interpreting the limitation as “the total number of data elements”.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-37 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  The claim(s) does/do not fall within at least one of the four categories of patent eligible subject matter because the claim(s) can be interpreted as software per se and thus can be made without an actual hardware apparatus. While the claim(s) do recite circuitry, paragraph [00192] discloses: ‘One or more aspects of at least one embodiment may be implemented by representative instructions stored on a machine-readable medium which 

Claim Rejections - 35 USC § 103
The following is a quotation of pre-AIA  35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.

Claims 1-5, 7-18, 20, 21, 23-26, 28-30, 33, 34, and 36 are rejected under pre-AIA 35 U.S.C. 103(a) as being unpatentable over Cavin et al. (US 2009/0172348 A1, herein Cavin), Dulong et al. (US 2002/0002666 A1, herein Dulong), Van Hook et al. (US 7,197,625 B1, herein van Hook), Rusterholz et al. (US 4,873,630, herein Rusterholz), Fletcher (US 2009/0327553 A1), and Taunton et al. (US 2005/0094551 A1, herein Taunton).
Regarding Claim 1, Cavin teaches a system comprising: a first processor (Paragraph 31 Core performing scalar operations) to process a first type of instructions (scalar operations); a second processor coupled to the first processor (Fig. 1 Processing Core 33) over an on-chip interconnect (Fig. 1 170), the second processor to process a second type of instructions (Paragraph 33, LoadUnpack and PackStore instructions), the second processor comprising: a plurality of 512-bit vector registers (Fig. 1, Figs. 3-4, vector registers) including: a first source vector register to store a first plurality of data elements (Fig. 1, vector register stores data elements); a second source vector register to store a second plurality of data elements (Fig. 1 vector register stores data elements), each of the second plurality of data elements to be stored in a data element location in the second source vector register corresponding to a data element location of a different one of the first plurality of data elements in the first source vector register (Fig. 4 shows elements being stored in register); a plurality of vector mask registers including a source vector mask register (Fig. 1 Mask registers), the source vector mask register to store predicate data comprising a plurality of bits (Fig. 3, 4); a decoder to decode an instruction (Fig. 1 165); and execution circuitry (Fig. 1 145), wherein the source vector mask register has a number of bits (Fig. 3, 4).
Cavin, however, does not explicitly teach a destination vector register to store a blended combination of the first and second pluralities of data elements. Cavin also does not teach that the instruction specifies a data blend operation and has a format specifying the source vector mask register. Cavin also does not teach that the execution circuitry performs the data blend operation, the execution circuitry to select data elements from the first plurality of data elements to be stored in corresponding locations in the destination vector register if corresponding bits of the predicate data have a first value and to select data elements from the second plurality of data 
Dulong teaches a destination vector register to store a blended combination of first and second pluralities of data elements (Fig. 2). Dulong teaches an instruction specifying a data blend operation (Paragraph 20 select instruction). Dulong teaches execution circuitry to perform the data blend operation, the execution circuitry to select data elements from the first plurality of data elements to be stored in corresponding locations in the destination vector register if corresponding bits of predicate data have a first value and to select data elements from the second plurality of data elements to be stored in corresponding locations in the destination vector register if corresponding bits of the predicate data have a second value (Fig. 2, 3, Paragraphs 25-28).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin and Dulong before them, to implement a select instruction in the processing system of Cavin. This would result in a destination vector register to store a blended 
One of ordinary skill in the art would be motivated to do so as this would allow for conditional selection of elements, which is useful for programming purposes (Dulong Paragraphs 5-7).
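For illustration only, the per-element selection described above for the data blend operation can be sketched as follows. This is a minimal sketch and is not taken from Dulong, Cavin, or the claims: the function name `blend`, the mapping of mask bit i to element position i, and the choice of 1 as the "first value" are all assumptions made for the example.

```python
def blend(src1, src2, mask_bits, first_value=1):
    """Select, per element position, from src1 or src2 based on a predicate bit.

    Assumptions (hypothetical, for illustration): bit i of mask_bits governs
    element position i, and a bit equal to first_value selects from src1.
    """
    if len(src1) != len(src2):
        raise ValueError("source vectors must have the same number of elements")
    dest = []
    for i, (a, b) in enumerate(zip(src1, src2)):
        bit = (mask_bits >> i) & 1  # predicate bit for element position i
        dest.append(a if bit == first_value else b)
    return dest


src1 = [10, 11, 12, 13]
src2 = [20, 21, 22, 23]
# Mask 0b0101: positions 0 and 2 come from src1, positions 1 and 3 from src2.
blended = blend(src1, src2, 0b0101)
print(blended)  # [10, 21, 12, 23]
```

Each destination position holds an element from exactly one of the two sources, so the destination is a blended combination of the two pluralities of data elements.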
The combination, thus far, still does not teach that the number of bits of the source vector mask register is either more bits than a total number of data elements in the first source vector register or a same number of bits as the total number of data elements in the first source vector register, depending upon a size of the data elements in the first source vector register. The combination thus far also does not teach a graphics processor coupled to the on-chip interconnect to perform graphics operations; and an integrated memory controller to couple the first processor and the second processor to a system memory. The combination thus far also does not teach when the number of bits of the source vector mask register is more bits than a total number of data elements in the first source vector register a most significant bit of the source vector mask register is used to select data as part of the data blend operation.

It would be obvious to one of ordinary skill in the art, at the time the invention was made, to implement the blend instruction to support different element sizes (8 bit, 16 bit, etc.). One of ordinary skill in the art would be motivated to do so as this would allow a vector to hold data elements of different sizes, so that, at different times, data elements of different sizes can each be operated on simultaneously.
The combination, thus far, still does not teach that the number of bits of the source vector mask register is either more bits than a total number of data elements in the first source vector register or a same number of bits as the total number of data elements in the first source vector register, depending upon a size of the data elements in the first source vector register. The combination thus far also does not teach a graphics processor coupled to the on-chip interconnect to perform graphics operations; and an integrated memory controller to couple the first processor and the second processor to a system memory. The combination thus far also does not teach when the number of bits of the source vector mask register is more bits than a total number of data elements in the first source vector register a most significant bit of the source vector mask register is used to select data as part of the data blend operation.
Rusterholz teaches a 64-bit mask register where not all bits are used (Column 148 lines 41-55 ‘The mask register...’).
It would be obvious to one of ordinary skill in the art, at the time the invention was
made, with the teachings of Cavin, Dulong, van Hook, and Rusterholz before them, to

source vector register or a same number of bits as the total number of data elements in
the first source vector register, depending upon a size of the data elements in the first
source vector register. This modification would have been obvious based on legal
precedent. In re Rose, 220 F.2d 459, 105 USPQ 237 (CCPA 1955) has established that
a change in size or range is not sufficient to patentably distinguish over the prior art, and
that one of ordinary skill in the art would find a change in size or range to be an obvious
modification of the prior art. (See MPEP 2144.04 IV A). In this case, the change in size
is of the mask register to 64 bits. Also, this would allow for using the same mask
registers for different numbers of vector elements (Rusterholz Column 148 lines 41-55).
The combination thus far still does not teach a graphics processor coupled to the on-chip interconnect to perform graphics operations; and an integrated memory controller to couple the first processor and the second processor to a system memory. The combination thus far also does not teach when the number of bits of the source vector mask register is more bits than a total number of data elements in the first source vector register a most significant bit of the source vector mask register is used to select data as part of the data blend operation.
Cavin teaches a core being a graphics processing unit (Paragraph 30).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, and Rusterholz before them, to implement another core in the processing system of Cavin to be a graphics processing unit. This would result in a graphics processor coupled to the on-chip interconnect to perform graphics operations.

The combination thus far still does not explicitly teach an integrated memory controller to couple the first processor and the second processor to a system memory. The combination thus far also does not teach when the number of bits of the source vector mask register is more bits than a total number of data elements in the first source vector register a most significant bit of the source vector mask register is used to select data as part of the data blend operation. 
Fletcher teaches an integrated memory controller to couple processors to a system memory (Fig. 1 Paragraph 14).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, and Fletcher before them, to implement an integrated memory controller coupling the first processor and the second processor to a system memory.
One of ordinary skill in the art would be motivated to do so as the memory controller would manage the flow of data going to and from the system’s memory.
However, the combination thus far does not teach that, when the number of bits of the source vector mask register is more bits than a total number of data elements in the first source vector register, a most significant bit of the source vector mask register is used to select data as part of the data blend operation.
	On the other hand, Taunton teaches that when a number of bits of a storage location is more bits than a total number of bits needed to store a value, a most significant bit of the source location is used to store the value ([0050], lines 5-20, in one embodiment, the bits representing the value may be aligned at the most-significant (left-hand) end of the 16-bit field, and lower bits 
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, when a number of bits of a storage location is more bits than a total number of bits needed to store a value, for a most significant bit (rather than a least significant bit) of the source location to be used to store the value, as this modification merely entails simple substitution of one known element (a least significant bit being used to store the value) for another (a most significant bit being used to store the value) to obtain predictable results (a most significant bit, rather than a least significant bit, being used to store the value; further note that Taunton discloses that one skilled in the art would recognize the two possibilities as alternatives), which is an exemplary rationale that may support a conclusion of obviousness, as per MPEP 2143. Note that other rationales under MPEP 2143 may be applicable; for example, this modification also merely entails "Obvious to try" – choosing from a finite number of identified, predictable solutions (data being aligned to the most-significant side and data being aligned to the least-significant side), with a reasonable expectation of success (as noted, Taunton discloses that one skilled in the art would recognize the two possibilities as implementable alternatives). Note that 

Regarding Claim 2, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 1 wherein each of the plurality of vector mask registers comprises 64-bits (the combination results in 64-bit registers), wherein the source vector mask register is to have said more bits (van Hook Column 6 lines 18-22; in other words, the combination would result, as an example, in 512-bit registers holding 32-bit elements, and thus 16 elements per register, while the mask register has 64 bits) when the data elements in the first source vector register are larger than a smallest data element size supported by the instruction (32-bit elements, as an example, are larger than a smallest data element size of 8 bits).
The combination thus far does not explicitly teach a shared cache coupled to and shared by the first processor and the second processor.
Fletcher teaches a shared cache shared by processors (Fig. 1 Paragraph 12).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement a shared cache coupled to and shared by the first processor and the second processor.
One of ordinary skill in the art would be motivated to do so as this reduces cache underutilization, since when one processor is idle, the other processor can have access to the whole shared resource.
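For illustration only, the element-count arithmetic relied upon in the discussion of claim 2 above (512-bit vector registers, 32-bit elements, a 64-bit mask register, and a smallest element size of 8 bits) can be checked with the short sketch below. The helper names `element_count` and `mask_relation` are hypothetical, and the 16-bit and 64-bit element sizes in the table are assumptions added for comparison; they are not drawn from the cited references.

```python
REGISTER_BITS = 512  # vector register width from the discussion above
MASK_BITS = 64       # mask register width resulting from the combination


def element_count(element_bits, register_bits=REGISTER_BITS):
    """Number of data elements of the given size that fit in one vector register."""
    if register_bits % element_bits != 0:
        raise ValueError("element size must evenly divide the register width")
    return register_bits // element_bits


def mask_relation(element_bits):
    """Compare the number of mask bits with the number of data elements."""
    count = element_count(element_bits)
    if MASK_BITS > count:
        return "more"
    if MASK_BITS == count:
        return "same"
    return "fewer"


# Element sizes of 16 and 64 bits are illustrative assumptions.
table = {size: (element_count(size), mask_relation(size)) for size in (8, 16, 32, 64)}
for size, (count, relation) in table.items():
    print(f"{size}-bit elements: {count} elements, mask has {relation} bits")
```

Under these assumptions, only the smallest element size (8 bits) yields the same number of mask bits as data elements; every larger element size leaves the 64-bit mask with more bits than there are data elements.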

Regarding Claim 3, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 2 wherein the first processor comprises a plurality of cores that are capable of multithreading (Cavin Paragraph 31, more than one core as the first processor, Paragraph 23).

Regarding Claim 4, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 3. 
The combination thus far does not explicitly teach that the second processor comprises a digital signal processor (DSP).
Fletcher teaches a digital signal processor (Paragraph 13).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the second processor as a digital signal processor.
One of ordinary skill in the art would be motivated to do so as this would allow for efficient processing of digital signal processing operations.

Regarding Claim 5, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 1, wherein each of the plurality of vector mask registers comprises 64 bits 

Regarding Claim 7, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 1, further comprising status registers to maintain data related to an execution state of the second processor (Cavin Paragraph 32).

Claim 8 is a system claim with its limitations included in system claim 4. Claim 8 is rejected for the same reasons as Claim 4.

Regarding Claim 9, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 1. 
The combination thus far does not explicitly teach that the second processor comprises a digital signal processor (DSP).
Fletcher teaches a digital signal processor (Paragraph 13).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the second processor as a digital signal processor. 
One of ordinary skill in the art would be motivated to do so as this would allow for efficient processing of digital signal processing operations.

Regarding Claim 10, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 9 wherein the first processor comprises a plurality of cores that are capable of multi-threading (Cavin Paragraph 31, more than one core as the first processor, Paragraph 23).

Regarding Claim 11, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 10 wherein each of the plurality of vector mask registers comprises 64-bits (The combination results in 64-bit mask registers), and further comprising a plurality of status registers to maintain data related to an execution state of the second processor (Cavin Paragraph 32).

Regarding Claim 12, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 10 wherein the first plurality of data elements comprises eight data elements (Cavin Paragraph 3).
The combination thus far does not explicitly teach a shared cache coupled to and shared by the first processor and the second processor.
Fletcher teaches a shared cache shared by processors (Fig. 1 Paragraph 12).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement a shared cache coupled to and shared by the first processor and the second processor.
One of ordinary skill in the art would be motivated to do so as this reduces cache underutilization, since when one processor is idle, the other processor can have access to the whole shared resource.

Regarding Claim 13, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 12, further comprising a plurality of status registers to maintain data related to an execution state of the second processor (Cavin Paragraph 32).

Regarding Claim 14, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 1 wherein the first processor comprises a plurality of cores that are capable of multi-threading (Cavin Paragraph 31, more than one core as the first processor, Paragraph 23).

Regarding Claim 15, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 14.
The combination thus far does not explicitly teach that the second processor comprises a digital signal processor (DSP).
Fletcher teaches a digital signal processor (Paragraph 13).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the second processor as a digital signal processor.
One of ordinary skill in the art would be motivated to do so as this would allow for efficient processing of digital signal processing operations.

Regarding Claim 16, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 15 wherein the first plurality of data elements comprises eight data elements (Cavin Paragraph 3), and further comprising a plurality of status registers to maintain data related to an execution state of the second processor (Cavin Paragraph 32).

Regarding Claim 17, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 1, further comprising a plurality of status registers (Cavin Paragraph 32).
The combination thus far does not explicitly teach that the second processor comprises a digital signal processor (DSP), or that the system further comprises a shared cache coupled to and shared by the first processor and the second processor.
Fletcher teaches a digital signal processor (Paragraph 13).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the second processor as a digital signal processor.
One of ordinary skill in the art would be motivated to do so as this would allow for efficient processing of digital signal processing operations.
The combination thus far does not explicitly teach a shared cache coupled to and shared by the first processor and the second processor.
Fletcher teaches a shared cache shared by processors (Fig. 1 Paragraph 12).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement a shared cache coupled to and shared by the first processor and the second processor.


Regarding Claim 18, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 1, wherein each of the plurality of vector mask registers comprises 64-bits (the combination results in 64-bit mask registers), and wherein the first processor comprises a plurality of cores that are capable of multi-threading (Cavin Paragraph 31, more than one core as the first processor, Paragraph 23), and the system further comprising a plurality of status registers (Cavin Paragraph 32).
The combination thus far does not explicitly teach that the system further comprises a shared cache coupled to and shared by the first processor and the second processor.
Fletcher teaches a shared cache shared by processors (Fig. 1 Paragraph 12).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement a shared cache coupled to and shared by the first processor and the second processor.
One of ordinary skill in the art would be motivated to do so as this reduces cache underutilization, since when one processor is idle, the other processor can have access to the whole shared resource.

Claim 20 is a system claim corresponding to system claim 1. The difference is that claim 20 recites “a system memory to store instructions and data”, “a first processor coupled to the 

Regarding Claim 21, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 20, wherein each of the plurality of vector mask registers comprises 64-bits (the combination results in 64-bit registers), wherein the source vector mask register is to have said more bits (van Hook Column 6 lines 18-22; in other words, the combination would result, as an example, in 512-bit registers holding 32-bit elements, and thus 16 elements per register, while the mask register has 64 bits) when the data elements in the first source vector register are larger than a smallest data element size supported by the instruction (32-bit elements are larger than a smallest data element size of 8 bits).
The combination thus far does not explicitly teach that the system memory comprises a dynamic random access (DRAM) memory.
Fletcher teaches a system memory comprising a DRAM (Paragraph 14).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the system memory as a DRAM.
One of ordinary skill in the art would be motivated to do so as it is cheaper compared to other forms of RAM and has a high storage capacity.
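As an illustrative sketch only, the arithmetic behind the example above (a 512-bit vector register, 32-bit data elements, and a 64-bit mask register) can be worked through as follows. The function names and the list of element sizes are hypothetical and are chosen solely for illustration; they are not taken from the cited references or from the claims.

```python
# Illustrative arithmetic only; function names and element sizes are assumptions.

def element_count(vector_bits: int, element_bits: int) -> int:
    """Number of data elements that fit in a vector register."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element size must evenly divide the register width")
    return vector_bits // element_bits


def surplus_mask_bits(mask_bits: int, vector_bits: int, element_bits: int) -> int:
    """Mask bits beyond one bit per data element."""
    return mask_bits - element_count(vector_bits, element_bits)


VECTOR_BITS = 512  # example register width used in the rejection
MASK_BITS = 64     # mask register width resulting from the combination

for element_bits in (8, 16, 32, 64):  # hypothetical set of element sizes
    count = element_count(VECTOR_BITS, element_bits)
    surplus = surplus_mask_bits(MASK_BITS, VECTOR_BITS, element_bits)
    print(f"{element_bits}-bit elements: {count} elements, {surplus} surplus mask bits")
```

Under these assumptions, 32-bit elements yield 16 elements, so a 64-bit mask register has 48 more bits than one bit per element, whereas 8-bit elements yield 64 elements and no surplus bits.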

Regarding Claim 23, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 20, wherein the source vector mask register is to have said more bits (van Hook Column 6 lines 18-22; in other words, the combination, as an example, would result in 512-bit registers with 32-bit elements and thus 16 elements, while the mask register has 64 bits) when the data elements in the first source vector register are larger than a smallest data element size supported by the instruction (32-bit elements, as an example, are larger than a smallest data element size of 8 bits).
The combination thus far does not explicitly teach a shared cache to be shared by the first processor and the second processor.
Fletcher teaches a shared cache to be shared by processors (Fig. 1 Paragraph 12).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement a shared cache to be shared by the first processor and the second processor.
One of ordinary skill in the art would be motivated to do so as this reduces cache underutilization, since when one processor is idle, the other processor can have access to the whole shared resource.

Regarding Claim 24, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 20.
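The combination thus far does not explicitly teach that the first processor comprises a plurality of cores capable of multi-threading to execute multiple threads including the first type of instructions.
Cavin teaches cores executing the first type of instructions (Paragraph 31). Cavin teaches cores capable of multithreading (Paragraph 23).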
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the cores executing the first type of instructions as multithreading capable cores. This would result in the first processor comprising a plurality of cores capable of multi-threading to execute multiple threads including the first type of instructions.
One of ordinary skill in the art would be motivated to do so as multithreading would provide better throughput by threads running in parallel and maximizing the use of cores.

Regarding Claim 25, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 24. 
The combination thus far does not explicitly teach that the second processor comprises a digital signal processor (DSP).
Fletcher teaches a digital signal processor (Paragraph 13).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the second processor as a digital signal processor.
One of ordinary skill in the art would be motivated to do so as this would allow for efficient processing of digital signal processing operations.

Regarding Claim 26, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 20, wherein each of the plurality of vector mask registers comprises 64-bits, 

Regarding Claim 28, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 20, further comprising status registers to maintain data related to an execution state of the second processor (Cavin Paragraph 32).

Claim 29 is a system claim corresponding to system claim 1. The difference is that claim 29 recites “a system memory to store instructions and data, wherein the system memory comprises a dynamic random access (DRAM) memory”, “a first processor coupled to the system memory”, “the second processor comprises a digital signal processor (DSP)”, “a shared cache coupled to the first processor and the second processor” and “a graphics processor coupled to the first processor over the on-chip interconnect”. 
However, Cavin teaches a system memory to store instructions and data (Paragraph 29, Fig. 1 RAM 26). Cavin teaches the first processor coupled to the system memory (Fig. 1). The combination teaches a graphics processor and as all components are coupled, the graphics processor would be coupled to the first processor over the on-chip interconnect. 
Moreover, Fletcher teaches a system memory comprising a DRAM (Paragraph 14).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the system memory as a DRAM.
One of ordinary skill in the art would be motivated to do so as it is cheaper compared to other forms of RAM and has a high storage capacity.

It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the second processor as a digital signal processor.
One of ordinary skill in the art would be motivated to do so as this would allow for efficient processing of digital signal processing operations.
Moreover, Fletcher teaches a shared cache shared by processors (Fig. 1 Paragraph 12).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement a shared cache coupled to and shared by the first processor and the second processor.
One of ordinary skill in the art would be motivated to do so as this reduces cache underutilization, since when one processor is idle, the other processor can have access to the whole shared resource.
Regarding the analogous limitations, Claim 29 is rejected for the same reasons as Claim 1.

Regarding Claim 30, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 29, wherein each of the plurality of vector mask registers comprises 64-bits (The combination results in 64-bit registers), and further comprising a display coupled with the first processor (Cavin Paragraph 25).

Regarding Claim 33, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 29.
The combination does not explicitly teach that the first processor comprises a plurality of cores capable of multi-threading to execute multiple threads including the first type of instructions.
Cavin teaches cores executing the first type of instructions (Paragraph 31). Cavin teaches cores capable of multithreading (Paragraph 23).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the cores executing the first type of instructions as multithreading capable cores. This would result in the first processor comprising a plurality of cores capable of multi-threading to execute multiple threads including the first type of instructions. 
One of ordinary skill in the art would be motivated to do so as multithreading would provide better throughput by threads running in parallel and maximizing the use of cores.

Regarding Claim 34, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 29, wherein the source vector mask register is to have said more bits (van Hook Column 6 lines 18-22; in other words, the combination, as an example, would result in 512-bit registers with 32-bit elements and thus 16 elements, while the mask register has 64 bits) when the data elements in the first source vector register are larger than a smallest data element size supported by the instruction (32-bit elements are larger than a smallest data element size of 8 bits), further comprising a plurality of status registers to maintain data related to an execution state of the second processor (Cavin


Regarding Claim 36, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 29, wherein each of the plurality of vector mask registers comprises 64-bits (The combination results in 64-bit registers), and further comprising a plurality of status registers to maintain data related to an execution state of the second processor (Cavin paragraph 32, IP registers).

Claims 6, 19, and 27 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton (in the case of claims 6 and 27, as applied to claims 1 and 20 above), and further in view of Song et al (US 5,991,531, herein Song).
Regarding Claim 6, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 1 further comprising scalar execution circuitry to execute one or more scalar instructions (Cavin Paragraph 31).
The combination thus far does not explicitly teach the scalar execution circuitry including a plurality of scalar registers.
Song teaches a plurality of scalar registers (Column 1 line 61).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, Taunton, and Song before them, to implement a plurality of scalar registers.
One of ordinary skill in the art would be motivated to do so as scalar registers would provide quick access to scalar data required for scalar operations.

Claim 19 is a system claim corresponding to system claim 1. The difference is that claim 19 recites “the first processor comprises a plurality of cores that are capable of multithreading”, “the second processor comprises a digital signal processor (DSP)”, “a plurality of status registers to maintain data related to an execution state of the second processor”, “scalar execution circuitry to execute one or more scalar instructions, the scalar execution circuitry including a plurality of scalar registers”, and “a shared cache coupled to the first processor and the second processor”.
However, Cavin discloses the first processor comprises a plurality of cores that are capable of multithreading (Cavin Paragraph 31, more than one core as the first processor, Paragraph 23).
In addition, Fletcher teaches a digital signal processor (Paragraph 13).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the second processor as a digital signal processor.
One of ordinary skill in the art would be motivated to do so as this would allow for efficient processing of digital signal processing operations.
In addition, Cavin discloses a plurality of status registers to maintain data related to an execution state of the second processor (Cavin Paragraph 32).
In addition, Fletcher teaches a shared cache shared by processors (Fig. 1 Paragraph 12).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement a shared cache coupled to and shared by the first processor and the second processor.
One of ordinary skill in the art would be motivated to do so as this reduces cache underutilization, since when one processor is idle, the other processor can have access to the whole shared resource.
Cavin teaches scalar execution circuitry to execute one or more scalar instructions (Cavin Paragraph 31).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement scalar execution circuitry to execute one or more scalar instructions. 
One of ordinary skill in the art would be motivated to do so as this would allow for execution of scalar instructions either alone or in parallel with vector instructions, thus improving performance. It would also prevent wastage of resources, such as using vector execution circuitry to execute scalar instructions.
However, the combination thus far does not explicitly teach the scalar execution circuitry including a plurality of scalar registers.
Song teaches a plurality of scalar registers (Column 1 line 61).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, Taunton, and Song before them, to implement a plurality of scalar registers.
One of ordinary skill in the art would be motivated to do so as scalar registers would provide quick access to scalar data required for scalar operations.


Regarding Claim 27, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 20 further comprising scalar execution circuitry to execute one or more scalar instructions (Cavin Paragraph 31).
The combination thus far does not explicitly teach the scalar execution circuitry including a plurality of scalar registers.
Song teaches a plurality of scalar registers (Column 1 line 61).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, Taunton, and Song before them, to implement a plurality of scalar registers.
One of ordinary skill in the art would be motivated to do so as scalar registers would provide quick access to scalar data required for scalar operations.

Claims 22, 31-32, 35, and 37 are rejected under pre-AIA  35 U.S.C. 103(a) as being unpatentable over Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton (in the case of claims 22, 31-32, and 35, as applied to claims 20 and 29 above), and further in view of Van Dyke et al. (US 7,661,107 B1, herein Van Dyke).
Regarding Claim 22, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 20, wherein the source vector mask register is to have said more bits (van Hook Column 6 lines 18-22; in other words, the combination, as an example, would result in 512-bit registers with 32-bit elements and thus 16 elements, while the mask register has 64 bits) when the data 
The combination thus far does not explicitly teach an audio input/output device coupled to the first processor.
Van Dyke teaches an audio input/output device coupled to a processor (Fig. 3 310).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, Taunton, and Van Dyke before them, to implement an audio input/output device in the system. This would result in an audio input/output device coupled to the first processor.
One of ordinary skill in the art would be motivated to do so as this would allow for input of audio for processing and also output of audio signals.

Regarding Claim 31, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 29.
The combination thus far does not explicitly teach an audio input/output device coupled to the first processor.
Van Dyke teaches an audio input/output device coupled to a processor (Fig. 3 310).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, Taunton, and Van Dyke before them, to implement an audio input/output device in the system. This would result in an audio input/output device coupled to the first processor. 
One of ordinary skill in the art would be motivated to do so as this would allow for input of audio for processing and also output of audio signals.

Regarding Claim 32, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 29.
The combination thus far does not explicitly teach an audio input/output device coupled to the first processor.
Van Dyke teaches an audio input/output device coupled to a processor (Fig. 3 310).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, Taunton, and Van Dyke before them, to implement an audio input/output device in the system. This would result in an audio input/output device coupled to the first processor.
One of ordinary skill in the art would be motivated to do so as this would allow for input of audio for processing and also output of audio signals.

Regarding Claim 35, Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton teach the system of claim 29, further comprising: a plurality of status registers to maintain data related to an execution state of the second processor (Cavin paragraph 32, IP registers).
The combination thus far does not teach an audio input/output device coupled to the first processor.
Van Dyke teaches an audio input/output device coupled to a processor (Fig. 3 310).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, Taunton, and Van Dyke before them, to implement an audio input/output device in the system. This would result in an audio input/output device coupled to the first processor.
One of ordinary skill in the art would be motivated to do so as this would allow for input of audio for processing and also output of audio signals.

Claim 37 is a system claim corresponding to system claim 1. The difference is that claim 37 recites “a system memory to store instructions and data, wherein the system memory comprises a dynamic random access (DRAM) memory”, “a first processor coupled to the system memory”, “the first processor comprises a plurality of cores capable of multithreading to execute multiple threads including the first type of instructions”, “the second processor comprises a digital signal processor (DSP)”, “a plurality of status registers to maintain data related to an execution state of the second processor”, a “scalar execution unit to execute one or more scalar instructions”, “an audio input/output device coupled to the first processor”, and “a graphics processor coupled to the first processor over the on-chip interconnect”.
However, Cavin teaches a system memory to store instructions and data (Paragraph 29, Fig. 1 RAM 26). Cavin teaches the first processor coupled to the system memory (Fig. 1). The combination teaches a graphics processor and as all components are coupled, the graphics processor would be coupled to the first processor over the on-chip interconnect.
Moreover, Fletcher teaches a system memory comprising a DRAM (Paragraph 14).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the system memory as a DRAM.
One of ordinary skill in the art would be motivated to do so as it is cheaper compared to other forms of RAM and has a high storage capacity.

It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the cores executing the first type of instructions as multithreading capable cores. This would result in the first processor comprising a plurality of cores capable of multi-threading to execute multiple threads including the first type of instructions.
One of ordinary skill in the art would be motivated to do so as multithreading would provide better throughput by threads running in parallel and maximizing the use of cores.
Moreover, Fletcher teaches a digital signal processor (Paragraph 13).
It would be obvious to one of ordinary skill in the art, at the time the invention was made, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement the second processor as a digital signal processor.
One of ordinary skill in the art would be motivated to do so as this would allow for efficient processing of digital signal processing operations.
In addition, Cavin discloses a plurality of status registers to maintain data related to an execution state of the second processor (Cavin Paragraph 32).
In addition, Cavin teaches a scalar execution unit to execute one or more scalar instructions (Cavin Paragraph 31).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, and Taunton before them, to implement a scalar execution unit to execute one or more scalar instructions. 

However, the combination thus far does not teach an audio input/output device coupled to the first processor.
Van Dyke teaches an audio input/output device coupled to a processor (Fig. 3 310).
It would be obvious to one of ordinary skill in the art at the time of the invention, with the teachings of Cavin, Dulong, van Hook, Rusterholz, Fletcher, Taunton, and Van Dyke before them, to implement an audio input/output device in the system. This would result in an audio input/output device coupled to the first processor.
One of ordinary skill in the art would be motivated to do so as this would allow for input of audio for processing and also output of audio signals.
Regarding the analogous limitations, Claim 37 is rejected for the same reasons as Claim 1.

Response to Arguments
Applicant on page 1 of the remarks argues: “Applicants respectfully submit that the claims have been amended to overcome the rejection. Additionally, Applicants respectfully submit that it is not the number of bits in the xxx that Accordingly, Applicants respectfully request that the Examiner withdraw the rejection of claims 1-37.”
In view of the aforementioned amendment, the previously presented indefiniteness rejections are withdrawn.

Applicant on page 3 argues: “As understood by Applicants, Cavin, Dulong, Van Hook, Rusterholz, and Fletcher do not disclose these limitations or render them obvious.”
In view of the aforementioned amended limitations, Examiner is newly relying upon the Taunton reference; see the Claim Rejections - 35 USC § 103 section above.

Applicant across pages 3-9 argues the rejection of further claims by citing reasons set forth with respect to claim 1, and by noting that further prior art relied upon to render obvious dependent claims does not cure the deficiencies of the prior art relied upon in the rejection of the independent claims.
Examiner’s responses to arguments with respect to claim 1 above are likewise applicable to the arguments directed to the aforementioned further claims.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Cox (US 20020116594 A1) discloses, in paragraph [0062], storing information in least significant bits, most significant bits, or non-contiguous bits, as alternatives. Therefore, Cox is relevant to the newly amended limitation directed to using a most significant bit.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEITH E VICARY whose telephone number is (571)270-1314. The examiner can normally be reached Monday to Friday, 9:00 AM to 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jyoti Mehta, can be reached at (571)270-3995. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KEITH E VICARY/            Primary Examiner, Art Unit 2182