Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed. Examiner notes that a reference to Atrous Spatial Pyramid Pooling, or an equivalent thereof, would more accurately portray the essence of the instant invention.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
  
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.



Claim(s) 1-5, 7-12, 14, 15 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Jansson (20200043517).

As per claim 1, Jansson (20200043517) teaches a source separation method (as voice source separation – para 0047), suitable for machine learning (as using a U-net architecture – para 0049; fig. 11, subblock 1195 – machine learning engine), the source separation method comprising:
obtaining a one-dimensional signal, wherein the one-dimensional signal is generated by at least one source (as operating on voice/vocal separation from music – para 0047; examiner notes that applicant's specification defines a one-dimensional signal as, among other embodiments, sound that is recorded by a band with vocals – see applicant's spec, para 0024), encoding the one-dimensional signal in levels to form a plurality of encoded signals (as, in the contracting path of the U-net, encoding the signals – para 0070),
wherein the encoded signal output by an encoding block of each level serves as an input of the encoding block of a next level, and the encoded signals output by the encoding blocks of different levels have different lengths (as, in the encoding path, contracting the feature length in half, on each turn, via convolution – para 0070 and fig. 5; examiner notes that applicant's spec discloses a "U-net" architecture performing the encoding aspect – see applicant's spec, para 0025; Jansson (20200043517), para 0070, discloses such a "U-net" architecture);
and decoding the encoded signals in levels to obtain a signal generated by at least one source to be separated in the at least one source, wherein the encoded signal of a low level is subjected to time-to-depth conversion (as, performing the decoding (expansion) back to the original sample length – para 0049, wherein the encoder compacts/contracts to a low level, followed by expansion, via the decoder, to a high resolution output – para 0050)
to form a multi-dimensional signal having the same length as a decoded signal of a high level, the high level corresponding to the encoding block of a next level of the low level (as, the encoder performs downsampling and feature compression – para 0070, along with expansion on the decoder level – para 0050; also allowing for skip connections between the layers – see fig. 5, wherein on the down path the feature space is halved via convolution and expanded on the up path; the skip connections, however, provide a path from encoding to decoding – as an example, fig. 5, subblock 502d to 504d – wherein the dimensionality is maintained; or, following the full path, 502a to 506, then to 504a-n, the original dimensionality is re-established),
and the multi-dimensional signal of the low level and the decoded signal of the high level are combined to preserve a receptive field and perform a decoding operation, the decoded signal being an output or input of the decoding operation (as, fig. 5, subblock 504(e) is a combination of the encoded 502(c) as well as the fully encoded path to 502(n) to 506 to 504a to 504(e)).
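
For illustration only, the level-wise structure mapped above (each encoding level halving the length and feeding the next level, each decoding level restoring the length and combining with a same-length skip signal) can be sketched as follows. This is a minimal sketch under stated assumptions and is neither the code of Jansson (20200043517) nor of applicant: the names `downsample`, `upsample`, `encode_levels` and `decode_levels` are hypothetical, pairwise averaging stands in for a learned stride-2 convolution, sample repetition stands in for a learned expansion, and the skip combination is an average where a U-net would typically concatenate along the channel axis.

```python
def downsample(signal):
    """Halve the length by averaging neighbouring samples
    (assumed stand-in for a learned stride-2 convolution)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def upsample(signal):
    """Double the length by repeating each sample
    (assumed stand-in for a learned expansion layer)."""
    return [x for x in signal for _ in range(2)]


def encode_levels(signal, num_levels):
    """Return the encoded signal of every level; each level feeds the next."""
    encoded = []
    current = signal
    for _ in range(num_levels):
        current = downsample(current)
        encoded.append(current)
    return encoded


def decode_levels(signal, encoded):
    """Expand from the deepest level, combining with the skip signal at each step."""
    skips = [signal] + encoded[:-1]
    current = encoded[-1]
    for skip in reversed(skips):
        expanded = upsample(current)
        # Combine the expanded signal with the skip signal of the same length.
        current = [(a + b) / 2 for a, b in zip(expanded, skip)]
    return current


mixture = [float(i % 4) for i in range(16)]
encoded = encode_levels(mixture, 3)
decoded = decode_levels(mixture, encoded)
lengths = [len(e) for e in encoded]
```

In this sketch the encoded lengths are 8, 4 and 2 for a 16-sample input, and the decoded output returns to 16 samples.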

As per claim 2, Jansson (20200043517) teaches the source separation method according to claim 1, wherein forming the multi-dimensional signal comprises: 
equally dividing a channel in the encoded signal of the low level into a plurality of signal groups (as generating a plurality of signal groups – separating the mix of instruments into separate sound sets – para 0075); sequentially combining input features at the same position in the signal groups into a plurality of one-dimensional second encoded signals (as feature combining using a content manager – para 0120; the features are extracted using the U-net architecture, as defined in fig. 5, performing extraction and concatenation of features – see para 0074); and combining the second encoded signals to form the multi-dimensional signal (as, using concatenation on the feature space, cumulative to subblock 502(n) at the end of the encoding process – see para 0074 – supplied to the decoding side 504).
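
For illustration only, one possible reading of the three recited steps (dividing the channels into equal groups, interleaving the features found at the same position in each group, and stacking the results) is sketched below. The function name `channels_to_length`, the `factor` parameter and the chosen interleaving order are assumptions made for this sketch; they are not taken from applicant's specification or from Jansson (20200043517).

```python
def channels_to_length(encoded, factor):
    """Rearrange a (channels x length) signal into (channels/factor x length*factor).

    encoded: list of channels, each a list of samples of equal length.
    factor:  number of signal groups the channels are divided into (assumed name).
    """
    num_channels = len(encoded)
    if num_channels % factor != 0:
        raise ValueError("channel count must be divisible by factor")
    per_group = num_channels // factor
    length = len(encoded[0])

    # Step 1: equally divide the channels into `factor` signal groups.
    groups = [encoded[g * per_group:(g + 1) * per_group] for g in range(factor)]

    output = []
    for k in range(per_group):
        # Step 2: interleave the features at the same position across groups,
        # giving one longer one-dimensional signal per output channel.
        row = []
        for t in range(length):
            for g in range(factor):
                row.append(groups[g][k][t])
        output.append(row)
    # Step 3: the stacked rows form the multi-dimensional signal.
    return output


low_level = [
    [1, 2, 3],
    [10, 20, 30],
    [4, 5, 6],
    [40, 50, 60],
]
rearranged = channels_to_length(low_level, 2)
```

Under this reading, four channels of length 3 become two channels of length 6, so the channel count drops by the same factor by which the length grows.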

As per claim 3, Jansson (20200043517) teaches the source separation method according to claim 1, wherein decoding the encoded signals in levels comprises: 
changing, in the decoding block of at least one level, a dimension of the decoded signal of the decoding block of the level through depth-to-time conversion, or subjecting the decoded signal output by the decoding block of at least one level to depth-to-time conversion (applicant's spec defines depth-to-time conversion as a space-to-depth conversion in a time series; Jansson (20200043517) teaches operating on the spectrogram information of the signal – para 0075, which is a time series of frequency/amplitude data by definition, and the encoder/decoder operates on such image time slices – para 0074),
to form a second decoded signal having the same length as the one-dimensional signal (as performing the decoding path such that the output at the decoding end, for example fig. 5, subblock 508, being the end of the decoding path, has the same dimensionality as the input signal found in subblock 502a).

As per claim 4, Jansson (20200043517) teaches the source separation method according to claim 1, wherein decoding the encoded signals in levels comprises: 
obtaining a mask according to the encoded signal output by the encoding block of a highest level (as deriving a mask from the estimated magnitude spectrum from the input signal to the encoder – para 0075), 
wherein the mask relates to filtering a time segment in which the at least one source to be separated has no output (as, using the mask to filter/separate an instrument from a mix of instruments – para 0075);
and filtering, according to the mask, the decoded signal output by the decoding block of a lowest level (as, performing the mask on the low level – on layer 504(n) – para 0075, and fig. 5, subblock 504n – where it is discussed, performing the masking to generate a complex spectrogram independent of the amplitude of the original signal – para 0075).
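
For illustration only, filtering a decoded signal with a mask that suppresses time segments in which the source to be separated has no output can be sketched as an element-wise multiplication. The names `silence_mask` and `apply_mask`, the per-sample `activity` scores and the 0.5 threshold are hypothetical; in a trained network the mask would be predicted from the encoded signal rather than thresholded by hand.

```python
def silence_mask(activity, threshold=0.5):
    """Build a binary mask: 1.0 where the source is judged active, 0.0 elsewhere.

    activity:  assumed per-sample activity scores in [0, 1].
    threshold: assumed cut-off for treating the source as silent.
    """
    return [1.0 if a >= threshold else 0.0 for a in activity]


def apply_mask(decoded, mask):
    """Filter the decoded signal by element-wise multiplication with the mask."""
    if len(decoded) != len(mask):
        raise ValueError("mask and decoded signal must have the same length")
    return [x * m for x, m in zip(decoded, mask)]


decoded = [0.2, -0.4, 0.9, 0.1]
activity = [0.9, 0.8, 0.1, 0.0]
mask = silence_mask(activity)
filtered = apply_mask(decoded, mask)
```

Samples falling in the segment judged silent are zeroed, while the remaining samples pass through unchanged.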

As per claim 5, Jansson (20200043517) teaches the source separation method according to claim 1, wherein decoding the encoded signals in levels comprises: changing a weight of a convolution kernel based on the encoded signal output by the encoding block of a highest level and the decoded signal output by the decoding block of a lowest level (as, defining the kernel size – para 0071, and provision for altering the weighting/kernel – para 0143).

As per claim 7, Jansson (20200043517) teaches the source separation method according to claim 1, wherein encoding the one-dimensional signal in levels to form the encoded signals comprises: performing downsampling processing in the encoding block of at least one level (as performing downsampling on multiple layers – para 0070) according to a depth separable convolution having a stride greater than one (as, performing the downsampling from layer to layer, para 0070, and more than one layer/stride – see fig. 5, path 502(c) to 502(n); applicant's spec defines the stride to be at least one level – see para 0029, with a stride of 2, the 1x16384 features are reduced to 1x8192; Jansson (20200043517) teaches halving on each downsampling on the encoding side – para 0070 – the number of feature channels is halved; see also para 0071 for stride length).
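
For illustration only, downsampling with a depthwise separable convolution having a stride greater than one can be sketched as a per-channel strided convolution followed by a 1x1 (pointwise) mixing of channels. The function names, kernels and weights below are hypothetical values chosen for this sketch, no padding is applied, and the convolution is computed as a cross-correlation, as is conventional in deep-learning frameworks.

```python
def depthwise_conv1d(channels, kernels, stride):
    """Convolve each channel with its own kernel (no padding, given stride)."""
    out = []
    for signal, kernel in zip(channels, kernels):
        k = len(kernel)
        out.append([
            sum(signal[start + j] * kernel[j] for j in range(k))
            for start in range(0, len(signal) - k + 1, stride)
        ])
    return out


def pointwise_conv1d(channels, weights):
    """Mix channels at every time step; weights[o][c] maps input c to output o."""
    length = len(channels[0])
    return [
        [sum(w * channels[c][t] for c, w in enumerate(row)) for t in range(length)]
        for row in weights
    ]


def depthwise_separable_conv1d(channels, kernels, weights, stride):
    """Depthwise step (per channel, strided) followed by the pointwise step."""
    return pointwise_conv1d(depthwise_conv1d(channels, kernels, stride), weights)


x = [[1, 2, 3, 4, 5, 6, 7, 8],
     [8, 7, 6, 5, 4, 3, 2, 1]]
kernels = [[1, 1], [1, -1]]          # one kernel per input channel
weights = [[1, 0], [0, 1], [1, 1]]   # three output channels from two inputs
y = depthwise_separable_conv1d(x, kernels, weights, stride=2)
```

With a kernel of size K, stride S and no padding, an input of length L yields floor((L - K) / S) + 1 output samples, so the 8-sample channels above are reduced to 4 samples each.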

	Claims 8-12, 14 are apparatus claims that perform the method steps of claims 1-5, 7 above and as such, claims 8-12, 14 are similar in scope and content to method claims 1-5, 7 above and therefore, claims 8-12, 14 are rejected under similar rationale as presented against claims 1-5, 7 above.  Furthermore, Jansson (20200043517) teaches processor/processing devices (para 0091), with memory (paras 0092, 0093, 0095, 0096).

	Claim 15 is a non-transitory computer readable medium claim storing code that is loaded and executed by a processor, performing the method steps of claim 1 above and as such, claim 15 is similar in scope and content to claim 1 above and therefore, claim 15 is rejected under similar rationale as presented against claim 1 above.  Furthermore, Jansson (20200043517) teaches processor/processing devices (para 0091), with memory (paras 0092, 0093, 0095, 0096).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 6, 13 are rejected under 35 U.S.C. 103 as being unpatentable over Jansson (20200043517) in view of Huang et al. (20180260956).

As per claims 6, 13, Jansson (20200043517) teaches the source separation method according to claim 1, wherein encoding the one-dimensional signal in levels to form the encoded signals comprises: performing, in the encoding block of at least one level (Jansson (20200043517), fig. 5, encoding block by block, left side progression, and as discussed above in claims 1-5, 7).  Jansson (20200043517), however, does not explicitly teach Atrous Spatial Pyramid Pooling in the encoding/downsampling space; Huang et al. (20180260956) teaches the expansion of dilated convolutions to Atrous Spatial Pyramid Pooling, especially in a semantic segmentation framework and audio generation (para 0004).  Therefore, it would have been obvious to one of ordinary skill in the art of convolutional networks to modify the convolutional kernels as taught in Jansson (20200043517) with dilated convolutional kernels in an ASPP structure, as taught by Huang et al. (20180260956), because it would advantageously enlarge the field of convolutional kernels thereby improving feature extraction, in the application of audio generation (Huang et al. (20180260956), para 0004).
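
For illustration only, the enlargement of the field covered by a kernel through dilation, and the pyramid arrangement of several dilation rates in parallel, can be sketched in one dimension as follows. This is a minimal sketch and not the structure of Huang et al. (20180260956) or of applicant: the names `dilated_conv1d`, `aspp1d` and `receptive_field` are hypothetical, zero padding is assumed so that every branch keeps the input length, and a single shared kernel is used for all branches where a trained network would learn a separate kernel per branch.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1-D dilated convolution (cross-correlation) with zero padding."""
    k = len(kernel)
    span = dilation * (k - 1)          # extra samples the dilated kernel reaches over
    pad_left = span // 2
    padded = [0.0] * pad_left + list(signal) + [0.0] * (span - pad_left)
    return [
        sum(padded[t + j * dilation] * kernel[j] for j in range(k))
        for t in range(len(signal))
    ]


def aspp1d(signal, kernel, dilations):
    """Apply the kernel at several dilation rates and stack the results as channels."""
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
branches = aspp1d(signal, [1.0, 1.0, 1.0], dilations=[1, 2])
```

For the unit impulse above, the branch with dilation 1 responds over 3 adjacent samples, while the branch with dilation 2 responds at every other sample over a span of 5, without any increase in the number of kernel weights.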

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Please see related art listed on the PTO-892 form.
As to applicant's atrous convolution (also known as dilated convolution):
Kameoka et al. (20200395036) teaches convolution dilation in sound source separation applications – paras 0002, 0060
Mesgarani et al. (20190066713) teaches sound separation – para 0005, using dilated convolutional networks – para 1112
Gan et al. (20200342234) teaches sound separation (paras 0024, 0055) extracted from video signals, using dilated convolutions (para 0025)

As to max pooling:

Matsikawa (20200036354) teaches max pooling schemes – para 0054, in sound source separation applications – para 0037 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Michael Opsasnick, telephone number (571)272-7623, who is available Monday-Friday, 9am-5pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Mr. Richemond Dorvil, can be reached at (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/Michael N Opsasnick/
Primary Examiner, Art Unit 2658
07/30/2022