DETAILED ACTION


In the response to this office action, the examiner respectfully requests that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line numbers in the specification and/or drawing figure(s). This will assist the examiner in prosecuting this application.


Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Claim Objections
Claims 1-8 and 10-21 are objected to because of the following informalities:  

Claim 1 states “and a known geometry of the microphone array” which should be something like “and a known geometry of the one or more microphone arrays”
Claims 20 and 21 are objected to in an analogous manner.

Claim 18 states “causing the vehicle to initiate a comfort stop is initiated” which should be “causing the vehicle to initiate a comfort stop”.

Claim 20 should have an “and” before “memory” because it is the last item in a list of items.

Claim 20 states “capturing, by one or more microphone arrays” which should be “capturing, by the one or more microphone arrays”.  

Claim 20 states “extracting, using one or more processors” which should be “extracting, by the one or more processors”.  
Claim 21 is objected to in an analogous manner.

Claims 2-8 and 10-19 are objected to as inheriting the same problems as above.
Appropriate correction is required.


Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/process/file/efs/guidance/eTD-info-I.jsp.

Claims 1-8 and 10-21 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-18 and 37 of U.S. Patent No. 11295757.  Although the claims at issue are not identical, they are not patentably distinct from each other because the present claims are broader versions of the patent’s claims or contain only obvious differences from the patent’s claims.

Claim 1 of present application:
1. (Currently Amended) A method comprising:
capturing, by one or more microphone arrays of a vehicle, sound signals in an environment;
extracting, using one or more processors, frequency spectrum features from the sound signals;
predicting, using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications, wherein the acoustic scene classifier predicts labels that indicate the presence of one or more of a plurality of different types of siren signals;
converting, using the one or more processors, the one or more siren signal classifications into one or more siren signal event detections;
computing time delay of arrival estimates for the one or more detected siren signals;
estimating, using the one or more processors, one or more bearing angles to one or more sources of the one or more detected siren signals using the time delay of arrival estimates and a known geometry of the microphone array; and
tracking, using a Bayesian filter, the one or more bearing angles.

Claim 1 of Patent 11295757:
1. A method comprising:
capturing, by one or more microphone arrays of a vehicle, sound signals in an environment;
extracting, using one or more processors, frequency spectrum features from the sound signals;
predicting, using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications;
converting, using the one or more processors, the one or more siren signal classifications into one or more siren signal event detections;
computing time delay of arrival estimates for the one or more siren signal event detections;
estimating, using the one or more processors, one or more bearing angles to one or more sources of the one or more siren signal event detections using the time delay of arrival estimates and a known geometry of the one or more microphone arrays; and
tracking, using a Bayesian filter, the one or more bearing angles,
wherein predicting, using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications, further comprises continuously predicting labels indicating the presence or absence of the one or more siren signals and their respective start and end times.


Regarding claims 2-6, the limitations of claims 2-6 are found in claims 2-6 of patent 11295757.  
Regarding claim 7, the limitations of claim 7 are found in claim 8 of patent 11295757.  
Regarding claim 8, the limitations of claim 8 are found in claim 7 of patent 11295757.  
Regarding claims 10-18, the limitations of claims 10-18 are found in claims 9-17 of patent 11295757.  

Regarding claim 19, patent 11295757 discloses wherein the acoustic scene classifier predicts labels that indicate the presence of one or more of a plurality of different types of siren signals based, at least in part, on the frequency spectrum features as above in claim 1.
Although patent 11295757 does not expressly disclose what the frequency spectrum features include, the examiner takes official notice that frequency spectrums of audio signals represent the power of oscillations (of a medium such as air) along the audio frequency spectrum by definition, which was well known in the art.  At the time of filing, it would have been obvious to one of ordinary skill in the art to use some of these oscillation/frequency correlations for the frequency spectrum features in the system of claim 1 of patent 11295757 for the benefit of reducing the amount of information used in evaluation.  Therefore, it would have been obvious to further comprise wherein the frequency spectrum features comprises an oscillation frequency of the sound signals, and the acoustic scene classifier predicts labels that indicate the presence of one or more of a plurality of different types of siren signals based, at least in part, on the oscillation frequency.

Regarding claims 20 and 21, the limitations of claims 20 and 21 are found in claims 18 and 37 of patent 11295757.  


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1-8 and 10-21 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Claim 1 states “the one or more siren signal classifications into one or more siren signal event detections; computing time delay of arrival estimates for the one or more detected siren signals” which is unclear. It is unclear if these are talking about the same thing. The wording should be consistent. What is detected are events.  Perhaps “the one or more siren signal classifications into one or more siren signal event detections; computing time delay of arrival estimates for the one or more detected siren signal events” could be used.
Claims 20 and 21 are rejected in an analogous manner.
Claims 2-8 and 10-19 are rejected as inheriting the problems as above.

Similar to the above, claim 8 states “difference of the one or more detected siren signals at each microphone pair in the microphone array”. It is unclear if “detected siren signal events” should be used.  It is unclear if this is intended to mean each microphone pair throughout all arrays, each pair in only a specific array, or each pair within each microphone’s own array. Also, this may be missing a “one or more” as above.  Clarification is needed.

Claim 1 states “wherein the acoustic scene classifier predicts labels that indicate the presence of one or more of a plurality of different types of siren signals” which is unclear.  It is unclear if this is intended to mean that the classifier predicts labels that indicate the presence of each of the different types of siren signals (i.e., outputs labels for each type) or that it may (if only one is chosen from the “one or more”) output labels that indicate only one type (e.g., outputs a label for a fire siren only, the fire siren being one of a group such as fire, police, and ambulance sirens).
Claims 20 and 21 are rejected in an analogous manner.
Claims 2-8 and 10-19 are rejected as inheriting the problems as above.

Claim 11 states “wherein the bearing angles are used to triangulate the location of the sound source”. It is unclear what this means when the “one or more bearing angles” of claim 1 is only one. Perhaps the following could be used: “wherein a plurality of the one or more bearing angles are used to triangulate the location of the sound source”.
Claims 12-18 are rejected as inheriting the problems of the above.

Claim 13 recites the limitation "the emergency vehicle". There is insufficient antecedent basis for this limitation in the claim. Only a “vehicle” was mentioned previously. It is unclear if this claim should be dependent on claim 12 or claim 11 as stated.
Claims 14-18 are rejected as being dependent on the above.

Claims 15-17 have what looks like an abbreviation, “AV”, without expressly stating what “AV” stands for.  Looking at the parent case, it looks like some claim language may be missing.

Claim 12 refers to “the vehicle”; however, multiple vehicles have been mentioned.  It is suggested that “a vehicle” of claim 1 be more descriptive or that claim 12 expressly state something along the lines of “where the vehicle is a first vehicle” at the beginning of the claim.
Claim 13 is rejected in an analogous manner.
Claims 14-18 are rejected as inheriting the problems as above.

Claim 11 refers to “the sound source”; however, only “one or more sources of the one or more detected siren signals” were previously mentioned.  It is unclear how best to clear up the different terminology, but the terms should be consistent.  Applicant is asked to clarify.
Claims 12-18 are rejected as inheriting the problems as above.  


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-8, 10 and 19-21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Akotkar et al. (US 20190049989 A1) in view of Silver et al. (US 20180374347 A1).  

Regarding claim 1, Akotkar discloses a method comprising: 
capturing, by one or more microphone arrays (figure 1 item 108, figure 2 item 201) of a vehicle (figures 1 and 2 item 102), sound signals in an environment (such as 104 of figure 1); 
extracting (via 110), using one or more processors (220 of figure 2, 502 of figure 5), frequency spectrum features from the sound signals (paragraph [0017]); 
predicting (via 112), using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications (see abstract and at least paragraphs [0017], [0020], [0027], and [0058]), wherein the acoustic scene classifier predicts labels that indicate the presence of one (“alarm signal” from emergency vehicle 106 of figure 1) or more of a plurality of different types of siren signals (the emergency vehicle siren being one of many types of sirens, e.g., tornado siren, home alarm siren, etc.); 
converting (via 114), using the one or more processors, the one or more siren signal classifications into one or more siren signal event detections (see abstract and at least paragraphs [0017], [0020], [0027], and [0058]); and 
estimating a location of a source of the siren signal (paragraph [0017], “determine a location from which the alarm signal and thus emergency vehicle may be approaching”).  
Akotkar does not expressly disclose use of time delays or bearing angles or the claimed tracking.
Silver discloses capturing, by one or more microphone arrays (figures 3A-3D, mics 152a-152d, see paragraph [0040] for additional arrays) of a vehicle (figure 1 item 100, seen in figures 3A-3D and figure 4), siren sound signals in an environment (see paragraph [0019]); 
computing time delay of arrival estimates for the one or more detected siren signals (“to compute direction from the relative phase of the sound waves that reach each microphone or rather the time difference of arrival”, paragraph [0039]); 
estimating one or more bearing angles (“to compute direction from the relative phase”, paragraph [0039]) to one or more sources of the one or more detected siren signals using the time delay of arrival estimates and a known geometry of the microphone array (see paragraph [0039]); and 
tracking, using a Bayesian filter (via Kalman filter, paragraph [0045], see claim 6 below as to applicant’s admission that a Kalman filter is an implementation of a Bayesian filter), the one or more bearing angles (see paragraph [0045]).
At the time of filing, it would have been obvious to a person of ordinary skill in the art to use the time delay/angle calculations and tracking of Silver in the system of Akotkar for the benefit of accurately calculating locally where the sound is coming from without the need of external systems and then keeping track of its location.  Therefore, it would have been obvious to combine Silver with Akotkar to obtain the invention as specified in claim 1.
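
For illustration only, the relationship between a time delay of arrival estimate, a known microphone geometry, and a bearing angle can be sketched as below. This is a minimal sketch and is not taken from Akotkar, Silver, or the present application. It assumes a single pair of microphones, a far-field (plane wave) source, and a nominal speed of sound of 343 m/s; the function name bearing_from_tdoa and its parameters are hypothetical.

```python
import math

# Assumed nominal speed of sound in air (m/s); not stated in the cited references.
SPEED_OF_SOUND_M_S = 343.0


def bearing_from_tdoa(tdoa_s, mic_spacing_m, speed_of_sound=SPEED_OF_SOUND_M_S):
    """Estimate a bearing angle (degrees from broadside) for one microphone pair.

    Far-field assumption: the extra path length to the farther microphone is
    speed_of_sound * tdoa_s = mic_spacing_m * sin(bearing).
    """
    ratio = speed_of_sound * tdoa_s / mic_spacing_m
    # Clamp so that measurement noise cannot push the ratio outside asin's domain.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # A source directly broadside to the pair arrives at both microphones together.
    print(bearing_from_tdoa(0.0, 0.5))
    # A delay equal to half the maximum possible delay corresponds to about 30 degrees.
    print(bearing_from_tdoa(0.25 / SPEED_OF_SOUND_M_S, 0.5))
```

A single pair leaves a front/back ambiguity, which is one reason an array with more than two microphones and a known geometry is typically used.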

Regarding claim 2, Silver discloses wherein the time delay of arrival estimates are computed using a maximum likelihood criterion obtained by implementing a generalized cross correlation method (“algorithms such as a generalized cross correlation phase transform”, see paragraph [0045]).
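
For illustration only, a generalized cross correlation with phase transform weighting (the variant named in the quoted passage of Silver) can be sketched as below. This is a minimal sketch under stated assumptions, not the implementation of Silver or of the present application: it uses a naive O(n^2) discrete Fourier transform from the standard library, treats the delay as circular, and returns the delay in whole samples. The names dft and gcc_phat_delay are hypothetical.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(n^2)); adequate for short illustrative signals."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n)
        )
        out.append(total / n if inverse else total)
    return out


def gcc_phat_delay(sig_a, sig_b):
    """Return the delay, in samples, of sig_b relative to sig_a (circular model)."""
    n = len(sig_a)
    spec_a = dft(sig_a)
    spec_b = dft(sig_b)
    # Cross-power spectrum between the two channels.
    cross = [b * a.conjugate() for a, b in zip(spec_a, spec_b)]
    # Phase transform: keep only the phase so the correlation peak is sharpened.
    whitened = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    corr = [v.real for v in dft(whitened, inverse=True)]
    peak = max(range(n), key=lambda i: corr[i])
    # Indices past the midpoint represent negative delays.
    return peak if peak <= n // 2 else peak - n


if __name__ == "__main__":
    n = 16
    a = [0.0] * n
    b = [0.0] * n
    a[2] = 1.0
    b[5] = 1.0  # the same impulse arriving three samples later
    print(gcc_phat_delay(a, b))
```

Dividing the delay in samples by the sampling rate gives the time delay of arrival estimate in seconds.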

Regarding claim 3, Silver discloses estimating one or more ranges of the one or more siren signal sources (see paragraphs [0041]-[0045], “One or more of the models described above may include learned models, for instance, those that utilize machine learning, such as classifiers. For instance, one or more classifiers may be used to detect the siren noise, estimate a bearing, estimate a range”) by applying triangularization to the one or more bearing angles (paragraph [0042], second model, timing of siren reaching each microphone is used to provide likely bearing).
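
For illustration only, estimating a range by triangulating two bearing angles can be sketched as below. This is a minimal sketch and is not taken from Silver or the present application. It assumes two observation points with known 2-D positions and bearings measured counterclockwise from the x-axis in radians; the names triangulate and range_from are hypothetical.

```python
import math


def triangulate(p1, bearing1, p2, bearing2):
    """Intersect two bearing lines; return the (x, y) intersection or None if parallel."""
    d1 = (math.cos(bearing1), math.sin(bearing1))
    d2 = (math.cos(bearing2), math.sin(bearing2))
    # 2-D cross product of the direction vectors; zero means parallel bearings.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Solve p1 + t1 * d1 == p2 + t2 * d2 for t1.
    t1 = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


def range_from(observer, source):
    """Euclidean distance from an observation point to the estimated source."""
    return math.hypot(source[0] - observer[0], source[1] - observer[1])


if __name__ == "__main__":
    source = triangulate((0.0, 0.0), math.radians(45), (2.0, 0.0), math.radians(135))
    print(source, range_from((0.0, 0.0), source))
```

The sketch does not check that the intersection lies in front of both observation points, and it needs at least two non-parallel bearings to produce a location.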

Regarding claim 4, Akotkar discloses wherein transforming sound signals into frequency spectrum features includes generating one of a spectrogram, mel-spectrogram or mel-frequency cepstral coefficients (MFCC) (see paragraphs [0025] and [0060]).

Regarding claim 5, Akotkar discloses wherein the acoustic scene classifier is implemented at least in part using a convolutional neural network (CNN) (see paragraph [0029]).

Regarding claim 6, Silver discloses wherein the Bayesian filter is one of a Kalman filter (see paragraph [0045]), extended Kalman filter (EKF), unscented Kalman filter or particle filter.
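
For illustration only, tracking a single bearing angle with a Kalman filter, which is one form of Bayesian filter, can be sketched as below. This is a minimal sketch and is not the filter of Silver or of the present application. It assumes a scalar random-walk motion model, and the class name BearingKalmanFilter and the variance values are hypothetical choices made for the example.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


class BearingKalmanFilter:
    """Scalar Kalman filter over one bearing angle (random-walk model, assumed)."""

    def __init__(self, initial_bearing, initial_var=1.0, process_var=0.01, meas_var=0.1):
        self.bearing = initial_bearing
        self.var = initial_var
        self.process_var = process_var
        self.meas_var = meas_var

    def update(self, measured_bearing):
        # Predict: the bearing is assumed to drift, so uncertainty grows.
        self.var += self.process_var
        # Correct: weight the wrapped innovation by the Kalman gain.
        innovation = wrap_angle(measured_bearing - self.bearing)
        gain = self.var / (self.var + self.meas_var)
        self.bearing = wrap_angle(self.bearing + gain * innovation)
        self.var = (1.0 - gain) * self.var
        return self.bearing


if __name__ == "__main__":
    kf = BearingKalmanFilter(initial_bearing=0.0)
    for measurement in [0.52, 0.48, 0.51, 0.49, 0.50]:
        print(round(kf.update(measurement), 3))
```

Wrapping the innovation keeps the estimate from swinging the long way around when measurements straddle the plus or minus 180 degree boundary.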

Regarding claim 7, Akotkar discloses wherein predicting, using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications, further comprises continuously (see figure 4, networks continuously running, see paragraphs [0025] to [0027], “MFCC's extracted every 10 ms”) predicting labels indicating the presence or absence of the one or more types of siren signals (as above).

Regarding claim 8, Silver discloses wherein the one or more bearing angles are estimated by using a spatio-temporal difference of the one or more detected siren signals at each microphone pair in the microphone array (see paragraph [0042]).

Regarding claim 10, although Akotkar does not expressly disclose the claimed names of the sirens (or alarm sound), it would have been obvious to the designer that any type of alarm or siren may be recognized and that they may name them whatever they desire, at their preference.  Therefore, at the time of filing, it would have been obvious to one of ordinary skill in the art to further comprise wherein the different types of siren signals include one or more of wailing, yelp, hi-lo, rumbler, chirp, pulsar, localizer, and mechanical wail siren signals.  

Regarding claim 19, Akotkar discloses wherein the acoustic scene classifier predicts labels that indicate the presence of one or more of a plurality of different types of siren signals based, at least in part, on the frequency spectrum features as above in claim 1.
Although Akotkar does not expressly disclose what the frequency spectrum features include, the examiner takes official notice that frequency spectrums of audio signals represent the power of oscillations (of a medium such as air) along the audio frequency spectrum by definition, which was well known in the art.  At the time of filing, it would have been obvious to one of ordinary skill in the art to use some of these oscillation/frequency correlations for the frequency spectrum features in the system of Akotkar for the benefit of reducing the amount of information used in evaluation.  Therefore, it would have been obvious to further comprise wherein the frequency spectrum features comprises an oscillation frequency of the sound signals, and the acoustic scene classifier predicts labels that indicate the presence of one or more of a plurality of different types of siren signals based, at least in part, on the oscillation frequency.


Regarding claim 20, Akotkar discloses a vehicle comprising: 
one or more microphone arrays (figure 1 item 108, figure 2 item 201); 
one or more processors (220 of figure 2, 502 of figure 5);
memory (figure 5 item 506) storing instructions (522) that when executed by the one or more processors, cause the one or more processors to perform operations (see at least paragraphs [0034]-[0039]) comprising: 
capturing, by the one or more microphone arrays, sound signals (such as 104 of figure 1) in an environment (figure 1); 
extracting (via 110) frequency spectrum features from the sound signals (paragraph [0017]); 
predicting (via 112), using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications (see abstract and at least paragraphs [0017], [0020], [0027], and [0058]), wherein the acoustic scene classifier predicts labels that indicate the presence of one (“alarm signal” from emergency vehicle 106 of figure 1) or more of a plurality of different types of siren signals (the emergency vehicle siren being one of many types of sirens, e.g., tornado siren, home alarm siren, etc.); 
converting (via 114) the one or more siren signal classifications into one or more siren signal event detections (see abstract and at least paragraphs [0017], [0020], [0027], and [0058]); 
estimating a location of a source of the siren signal (paragraph [0017], “determine a location from which the alarm signal and thus emergency vehicle may be approaching”).  
Akotkar does not expressly disclose use of time delays or bearing angles or the claimed tracking.
Silver discloses capturing, by one or more microphone arrays (figures 3A-3D, mics 152a-152d, see paragraph [0040] for additional arrays) of a vehicle (figure 1 item 100, seen in figures 3A-3D and figure 4), siren sound signals in an environment (see paragraph [0019]); 
computing time delay of arrival estimates for the one or more detected siren signals (“to compute direction from the relative phase of the sound waves that reach each microphone or rather the time difference of arrival”, paragraph [0039]); 
estimating one or more bearing angles (“to compute direction from the relative phase”, paragraph [0039]) to one or more sources of the one or more detected siren signals using the time delay of arrival estimates and a known geometry of the microphone array (see paragraph [0039]); and 
tracking, using a Bayesian filter (via Kalman filter, paragraph [0045], see claim 6 below as to applicant’s admission that a Kalman filter is an implementation of a Bayesian filter), the one or more bearing angles (see paragraph [0045]).
At the time of filing, it would have been obvious to a person of ordinary skill in the art to use the time delay/angle calculations and tracking of Silver in the system of Akotkar for the benefit of accurately calculating locally where the sound is coming from without the need of external systems and then keeping track of its location.  Therefore, it would have been obvious to combine Silver with Akotkar to obtain the invention as specified in claim 20.

Claim 21 is rejected in an analogous manner to claim 20.


Allowable Subject Matter
Claims 11-18 would be allowable if rewritten to overcome the claim objections, obvious double patenting rejections, and rejection(s) under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), 2nd paragraph, set forth in this Office action and to include all of the limitations of the base claim and any intervening claims.


Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DOUGLAS JOHN SUTHERS whose telephone number is (571)272-0563. The examiner can normally be reached M-F, 8 am -5 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivian Chin, can be reached at 571-272-7848. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/DOUGLAS J SUTHERS/           Examiner, Art Unit 2654  

/VIVIAN C CHIN/           Supervisory Patent Examiner, Art Unit 2654