DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This office action is in response to communications on 03/04/2021. Claims 1-20 are pending, and likewise Claims 1-20 have been examined.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 09/16/2022 and 09/20/2022 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Specification
The disclosure is objected to because of the following informalities:
Para [0040], Ln 3-4, “bin normalizer 214” should be “210”.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.



Claims 4-7, 11-13 and 17-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claims 4, 11 and 17 recite the limitation “the second subfingerprint” in the last line of the claim. There is insufficient antecedent basis for this limitation in the claim. Two “second subfingerprints” have been established prior to this limitation, one on the line before, and one in the last limitation of the independent claim.

Claims 5, 12 and 18 recite the limitation “the second subfingerprint” in the last line of the claim. There is insufficient antecedent basis for this limitation in the claim. Two “second subfingerprints” have been established prior to this limitation, one in Claim 4, 11 or 17 respectively, and one in the independent claim.

Claims 7, 13 and 20 recite the limitation “the second subfingerprint” twice, in the second and third lines of the claim. There is insufficient antecedent basis for this limitation in the claim. Two “second subfingerprints” have been established prior to this limitation, one in Claim 4, 11 or 17 respectively, and one in the independent claim.

Claims 6 and 19 are dependent on a claim containing insufficient antecedent basis, and are therefore also rejected for containing the same improper antecedent basis.



Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 8-10 and 14-16 are rejected under 35 U.S.C. 103 as being unpatentable over Sharifi et al. (US 9202472 B1), hereinafter Sharifi, in view of Coover et al., “A POWER MASK BASED AUDIO FINGERPRINT”, hereinafter Coover.

Regarding Claim 1:
Sharifi teaches an apparatus comprising: an audio segmenter to divide an audio signal into a plurality of audio segments including a first audio segment, a second audio segment temporally after and adjacent to the first audio segment, and a third audio segment temporally after and adjacent to the second audio segment(Col 4, Ln 57-60, Transform component 206 can be configured to transform the audio clip received by input component 204 into a time-frequency representation (similar to time-frequency spectrogram 102 of FIG. 1). Col 8, Ln 53-56, a two-dimensional window parallel with the time-frequency plane of the audio clip's time-frequency representation and substantially centered at the interest point to be normalized. Interest point would be second, and adjacent points on the time axis in the window are first and third); 
a bin normalizer to normalize the second audio segment to thereby create a first normalized audio segment, the normalization based on first audio characteristics of the first audio segment, second audio characteristics of the second audio segment, and third audio characteristics the third audio segment(Col 8, Ln 39-42, normalized by their respective neighborhoods. Col 8, Ln 45-50, computing a mean magnitude across a time-frequency window centered or substantially centered at the interest point. Col 8, Ln 53-56, a two-dimensional window parallel with the time-frequency plane of the audio clip's time-frequency representation and substantially centered at the interest point to be normalized); 
a subfingerprint generator to generate a first subfingerprint from the first normalized audio segment, (Col 5, Ln 23-25, generate a descriptor for the received audio clip based on the interest point data. Descriptor is subfingerprint Col 2, Ln 45-50, The encoder may also combine this descriptor with descriptors derived in a similar manner for other subsets of interest points within the audio clip to create a composite identifier that uniquely identifies the audio clip).
Sharifi does not teach the first subfingerprint including a first portion corresponding to a location of an energy extremum in the normalized second audio segment;
a portion strength evaluator to determine a likelihood of the first portion to change based on changes to at least one of the first audio characteristics, the second audio characteristics, or the third audio characteristics;
and a portion replacer to, in response to determining the likelihood does not satisfy a threshold, replace the first portion with a second portion to thereby generate a second subfingerprint.
In the same field of Audio Fingerprinting, Coover teaches the first subfingerprint including a first portion corresponding to a location of an energy extremum in the normalized second audio segment(Pg 2, 3. The Power Mask Based Fingerprint, Para 1, Ln 7-9, Similarly, fingerprint bits extracted from the “strong-bit region”, where the absolute power differences of the audio spectrum are large, are referred to as “strong bits”. Pg 2, 3.1. The Power Mask, Para 1, Ln 4-5, sub-fingerprint of 32 bits);
a portion strength evaluator to determine a likelihood of the first portion to change based on changes to at least one of the first audio characteristics, the second audio characteristics, or the third audio characteristics(Pg 2, 3.2. Matching with the Power Mask, Para 2, Ln 1-2, A strong bit is more noise resistant than a weak bit due to its large absolute power difference. Pg 2, 3.1. The Power Mask, Para 1, Ln 8-10, denote the absolute power difference…: Eq (4)); 
and a portion replacer to, in response to determining the likelihood does not satisfy a threshold, replace the first portion with a second portion to thereby generate a second subfingerprint(Pg 2, 3.1. The Power Mask, Para 1, Ln 4-6, For each sub-fingerprint of 32 bits, a Power Mask is a second 32-bit number, which encodes a strong bit by 1 and a weak bit by 0. Pg 2-3, 3.2. Matching with the Power Mask, Para 4, Ln 3-6, strong bits per sub-fingerprint. By only selecting the number of bits that have an absolute difference value that is greater than an adaptive threshold. Below threshold are weak bits, set to zero in the power mask(replaced)).
It would have been obvious to one skilled in the art, before the effective filing date of the claimed invention, to modify Sharifi with the power masking of Coover, as it improves noise resistance (Abstract, Ln 13-14).
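
For illustration only, the power mask cited above (a second 32-bit number that encodes a strong bit by 1 and a weak bit by 0, selected by comparing the absolute power difference to a threshold) can be sketched as follows. The function name `power_mask`, the bit ordering (index 0 as the least significant bit), and the use of a single threshold passed in by the caller are assumptions made for this sketch; the cited reference describes an adaptive threshold, whose computation is not reproduced here.

```python
def power_mask(abs_power_diffs, threshold):
    """Build a 32-bit mask: 1 marks a strong bit, 0 marks a weak bit.

    abs_power_diffs: 32 absolute power differences, one per fingerprint bit.
    threshold: cutoff above which a bit counts as strong (assumed to be
    supplied by the caller; the adaptive rule is not modeled here).
    """
    if len(abs_power_diffs) != 32:
        raise ValueError("expected 32 absolute power differences")
    mask = 0
    for i, diff in enumerate(abs_power_diffs):
        if diff > threshold:
            # Assumed bit order: index 0 maps to the least significant bit.
            mask |= 1 << i
    return mask


# Example: only bits 0 and 31 exceed the threshold, so only they are strong.
diffs = [0.0] * 32
diffs[0] = 5.0
diffs[31] = 2.0
mask = power_mask(diffs, threshold=1.0)
```

In this sketch, a bit whose difference falls at or below the threshold is recorded as 0 in the mask, which corresponds to the "weak bit" treatment cited in the rejection.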

Regarding Claim 2:
The combination of Sharifi and Coover teaches the apparatus of claim 1, but Sharifi does not teach wherein the portion replacer is to, in response to determining the likelihood does not satisfy a strength threshold, exclude the first portion when matching query subfingerprints to the first subfingerprint.
In the same field of Audio Fingerprinting, Coover teaches wherein the portion replacer is to, in response to determining the likelihood does not satisfy a strength threshold, exclude the first portion when matching query subfingerprints to the first subfingerprint(Pg 1-2, 2. The Philips Fingerprint, Para 4(last), Ln 6-8, Power Mask to the matching process, which weights different bits based on their relevance to the fingerprint. Pg 2, 3.1. The Power Mask, Para 1, Ln 4-6, For each sub-fingerprint of 32 bits, a Power Mask is a second 32-bit number, which encodes a strong bit by 1 and a weak bit by 0. A weight of zero would ignore the first portion).
It would have been obvious to one skilled in the art, before the effective filing date of the claimed invention, to modify the combination of Sharifi and Coover with the power masking of Coover, as it improves noise resistance (Abstract, Ln 13-14).
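
As a minimal sketch of the reasoning that a weight of zero ignores a portion during matching, the following compares two 32-bit sub-fingerprints while counting only positions marked 1 in a mask. The function name `masked_hamming_distance` and the choice of a Hamming-style bit count as the comparison measure are assumptions for illustration; the cited reference's exact matching score is not reproduced here.

```python
def masked_hamming_distance(query, reference, mask):
    """Count differing bits between two 32-bit sub-fingerprints,
    ignoring every position where the mask bit is 0 (a weak bit)."""
    differing = (query ^ reference) & mask & 0xFFFFFFFF
    return bin(differing).count("1")


# Bits 1 and 3 differ; with mask 0b0011 only bit 1 is counted.
d_masked = masked_hamming_distance(0b1011, 0b0001, 0b0011)
# With every bit treated as strong, both differences are counted.
d_full = masked_hamming_distance(0b1011, 0b0001, 0xFFFFFFFF)
```

Under these assumptions, a bit marked weak contributes nothing to the distance, which is the sense in which it is excluded when matching query subfingerprints.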

Regarding Claim 3:
The combination of Sharifi and Coover teaches the apparatus of claim 1, and Sharifi teaches further including a signal transformer to transform the audio signal into a frequency domain to thereby generate a first group of time-frequency bins corresponding to the first audio segment, a second group of time-frequency bins corresponding to the second audio segment, and a third group of time-frequency bins corresponding to the third audio segment(Col 8, Ln 51-60, normalize component…can define a two-dimensional window parallel with the time-frequency plane of the audio clip's time-frequency representation and substantially centered at the interest point to be normalized. Col 7, Ln 62-64, normalized by its neighborhood, as defined by the time-frequency window. & Fig 1. Normalization would include both neighbors in time and frequency dimensions); 
and wherein the normalizing of the second audio segment includes normalizing a time- frequency bin of the second group of time-frequency bins based on a surrounding region of time-frequency bins, the surrounding region of time-frequency bins including ones of the first group of time-frequency bins and ones of the second group of time-frequency bins(Col 8, Ln 51-60, The normalize component 506 can then compute the mean magnitude within the window, and divide the magnitude of the interest point being normalized by this mean magnitude to yield the strength value (i.e., the normalized magnitude value) of the interest point. & Fig 1. Normalization would include both neighbors in time and frequency dimensions).
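
For illustration only, the neighborhood normalization cited above (compute the mean magnitude within a two-dimensional window centered at the point, then divide the point's magnitude by that mean) can be sketched as follows. The function name `normalize_point`, the layout of the spectrogram as a list of frequency rows by time columns, the default window half-widths, the clipping of the window at the edges, and the handling of a zero mean are all assumptions made for this sketch and are not taken from the reference.

```python
def normalize_point(spec, f, t, half_f=1, half_t=1):
    """Divide spec[f][t] by the mean magnitude of its surrounding window.

    spec: 2-D list of magnitudes, indexed as spec[frequency][time] (assumed).
    half_f, half_t: window half-widths in bins and frames (assumed values).
    The window is clipped at the spectrogram boundary (assumed behavior).
    """
    rows = range(max(0, f - half_f), min(len(spec), f + half_f + 1))
    cols = range(max(0, t - half_t), min(len(spec[0]), t + half_t + 1))
    window = [spec[r][c] for r in rows for c in cols]
    mean = sum(window) / len(window)
    # Assumed convention: an all-zero window yields a strength of 0.0.
    return spec[f][t] / mean if mean else 0.0


# A 3x3 spectrogram: the centre point's window spans the frames before
# and after it as well as the bins above and below it.
spec = [
    [1.0, 1.0, 1.0],
    [1.0, 10.0, 1.0],
    [1.0, 1.0, 1.0],
]
strength = normalize_point(spec, 1, 1)
```

Because the window extends one frame to either side of the centre, the normalized value in this sketch depends on the temporally adjacent frames as well as on the frame being normalized.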

Regarding Claim 8:
Sharifi teaches a method comprising: dividing an audio signal into a plurality of audio segments including a first audio segment, a second audio segment temporally after and adjacent to the first audio segment, and a third audio segment temporally after and adjacent to the second audio segment(Col 4, Ln 57-60, Transform component 206 can be configured to transform the audio clip received by input component 204 into a time-frequency representation (similar to time-frequency spectrogram 102 of FIG. 1). Col 8, Ln 53-56, a two-dimensional window parallel with the time-frequency plane of the audio clip's time-frequency representation and substantially centered at the interest point to be normalized. Interest point would be second, and adjacent points on the time axis in the window are first and third); 
normalizing the second audio segment to thereby create a first normalized audio segment, the normalization based on first audio characteristics of the first audio segment, second audio characteristics of the second audio segment, and third audio characteristics the third audio segment(Col 8, Ln 39-42, normalized by their respective neighborhoods. Col 8, Ln 45-50, computing a mean magnitude across a time-frequency window centered or substantially centered at the interest point. Col 8, Ln 53-56, a two-dimensional window parallel with the time-frequency plane of the audio clip's time-frequency representation and substantially centered at the interest point to be normalized); 
generating a first subfingerprint from the first normalized audio segment(Col 5, Ln 23-25, generate a descriptor for the received audio clip based on the interest point data. Descriptor is subfingerprint Col 2, Ln 45-50, The encoder may also combine this descriptor with descriptors derived in a similar manner for other subsets of interest points within the audio clip to create a composite identifier that uniquely identifies the audio clip).
Sharifi does not teach the first subfingerprint including a first portion corresponding to a location of an energy extremum in the normalized second audio segment;
determining a likelihood of the first portion to change based on changes to at least one of the first audio characteristics, the second audio characteristics, or the third audio characteristics; 
and in response to determining the likelihood does not satisfy a threshold, replacing the first portion with a second portion to thereby generate a second subfingerprint.
In the same field of Audio Fingerprinting, Coover teaches the first subfingerprint including a first portion corresponding to a location of an energy extremum in the normalized second audio segment(Pg 2, 3. The Power Mask Based Fingerprint, Para 1, Ln 7-9, Similarly, fingerprint bits extracted from the “strong-bit region”, where the absolute power differences of the audio spectrum are large, are referred to as “strong bits”. Pg 2, 3.1. The Power Mask, Para 1, Ln 4-5, sub-fingerprint of 32 bits); 
determining a likelihood of the first portion to change based on changes to at least one of the first audio characteristics, the second audio characteristics, or the third audio characteristics(Pg 2, 3.2. Matching with the Power Mask, Para 2, Ln 1-2, A strong bit is more noise resistant than a weak bit due to its large absolute power difference. Pg 2, 3.1. The Power Mask, Para 1, Ln 8-10, denote the absolute power difference…: Eq (4)); 
and in response to determining the likelihood does not satisfy a threshold, replacing the first portion with a second portion to thereby generate a second subfingerprint(Pg 2, 3.1. The Power Mask, Para 1, Ln 4-6, For each sub-fingerprint of 32 bits, a Power Mask is a second 32-bit number, which encodes a strong bit by 1 and a weak bit by 0. Pg 2-3, 3.2. Matching with the Power Mask, Para 4, Ln 3-6, strong bits per sub-fingerprint. By only selecting the number of bits that have an absolute difference value that is greater than an adaptive threshold. Below threshold are weak bits, set to zero in the power mask(replaced)).
It would have been obvious to one skilled in the art, before the effective filing date of the claimed invention, to modify Sharifi with the power masking of Coover, as it improves noise resistance (Abstract, Ln 13-14).

Regarding Claim 9:
Claim 9 contains similar limitations as Claim 2, and is therefore rejected for the same reasons.

Regarding Claim 10:
Claim 10 contains similar limitations as Claim 3, and is therefore rejected for the same reasons.

Regarding Claim 14:
Sharifi teaches a non-transitory computer readable medium comprising instructions which, when executed, cause a processor to (Col 5, Ln 53-59, processors…memory…instructions):
divide an audio signal into a plurality of audio segments including a first audio segment, a second audio segment temporally after and adjacent to the first audio segment, and a third audio segment temporally after and adjacent to the second audio segment(Col 4, Ln 57-60, Transform component 206 can be configured to transform the audio clip received by input component 204 into a time-frequency representation (similar to time-frequency spectrogram 102 of FIG. 1). Col 8, Ln 53-56, a two-dimensional window parallel with the time-frequency plane of the audio clip's time-frequency representation and substantially centered at the interest point to be normalized. Interest point would be second, and adjacent points on the time axis in the window are first and third); 
normalize the second audio segment to thereby create a first normalized audio segment, the normalization based on first audio characteristics of the first audio segment, second audio characteristics of the second audio segment, and third audio characteristics the third audio segment(Col 8, Ln 39-42, normalized by their respective neighborhoods. Col 8, Ln 45-50, computing a mean magnitude across a time-frequency window centered or substantially centered at the interest point. Col 8, Ln 53-56, a two-dimensional window parallel with the time-frequency plane of the audio clip's time-frequency representation and substantially centered at the interest point to be normalized); 
generate a first subfingerprint from the first normalized audio segment(Col 5, Ln 23-25, generate a descriptor for the received audio clip based on the interest point data. Descriptor is subfingerprint Col 2, Ln 45-50, The encoder may also combine this descriptor with descriptors derived in a similar manner for other subsets of interest points within the audio clip to create a composite identifier that uniquely identifies the audio clip).
Sharifi does not teach the first subfingerprint including a first portion corresponding to a location of an energy extremum in the normalized second audio segment; 
determine a likelihood of the first portion to change based on changes to at least one of the first audio characteristics, the second audio characteristics, or the third audio characteristics; 
and in response to determining the likelihood does not satisfy a threshold, replace the first portion with a second portion to thereby generate a second subfingerprint.
In the same field of Audio Fingerprinting, Coover teaches the first subfingerprint including a first portion corresponding to a location of an energy extremum in the normalized second audio segment(Pg 2, 3. The Power Mask Based Fingerprint, Para 1, Ln 7-9, Similarly, fingerprint bits extracted from the “strong-bit region”, where the absolute power differences of the audio spectrum are large, are referred to as “strong bits”. Pg 2, 3.1. The Power Mask, Para 1, Ln 4-5, sub-fingerprint of 32 bits);
determine a likelihood of the first portion to change based on changes to at least one of the first audio characteristics, the second audio characteristics, or the third audio characteristics(Pg 2, 3.2. Matching with the Power Mask, Para 2, Ln 1-2, A strong bit is more noise resistant than a weak bit due to its large absolute power difference. Pg 2, 3.1. The Power Mask, Para 1, Ln 8-10, denote the absolute power difference…: Eq (4)); 
and in response to determining the likelihood does not satisfy a threshold, replace the first portion with a second portion to thereby generate a second subfingerprint(Pg 2, 3.1. The Power Mask, Para 1, Ln 4-6, For each sub-fingerprint of 32 bits, a Power Mask is a second 32-bit number, which encodes a strong bit by 1 and a weak bit by 0. Pg 2-3, 3.2. Matching with the Power Mask, Para 4, Ln 3-6, strong bits per sub-fingerprint. By only selecting the number of bits that have an absolute difference value that is greater than an adaptive threshold. Below threshold are weak bits, set to zero in the power mask(replaced)).
It would have been obvious to one skilled in the art, before the effective filing date of the claimed invention, to modify Sharifi with the power masking of Coover, as it improves noise resistance (Abstract, Ln 13-14).

Regarding Claim 15:
Claim 15 contains similar limitations as Claim 2, and is therefore rejected for the same reasons.

Regarding Claim 16:
Claim 16 contains similar limitations as Claim 3, and is therefore rejected for the same reasons.

Allowable Subject Matter
Claims 4-7, 11-13 and 17-20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if amended to overcome the rejections under 35 U.S.C. 112(b) set forth above.

The following is a statement of reasons for the indication of allowable subject matter:  

Claim 4 recites “replacing the first audio segment with a fourth audio segment; normalizing the second audio segment to thereby create a second normalized audio segment based on second audio characteristics of the fourth audio segment and the third audio segment; generating a second subfingerprint from the normalized second audio segment; and determining if the second subfingerprint includes the first portion”. These limitations are not taught by the prior art of record alone or in combination.

Claims 11 and 17 contain similar limitations as Claim 4 and therefore contain allowable subject matter for the same reasons.

Claims 5-7, 12-13 and 18-20 also contain allowable subject matter, as they are dependent on a claim containing allowable subject matter.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.

Ondel et al. “MASK+:DATA-DRIVEN REGIONS SELECTION FOR ACOUSTIC FINGERPRINTING”
Audio fingerprinting using a mask fingerprint.

Anguera et al. “MASK: Robust Local Features for Audio Fingerprinting”
Audio fingerprinting using a mask fingerprint.

Yao et al. “Audio Identification by Sampling Sub-fingerprints and Counting Matches”
Audio fingerprinting using sub-fingerprints and bits.

Haitsma et al. “A Highly Robust Audio Fingerprinting System”
Audio fingerprinting using sub-fingerprints and bits.

李 根 (Google Translate rendering from Japanese: Ri Ne) et al. (JP 2020527255 A)
Audio fingerprinting using masks and bits.

Dong et al. (TW 201640492 A)
Audio fingerprinting with Hamming distance for robustness comparison.

Dumont et al. (US 9299350 B1)
Updating personal audio fingerprints.

Bauer et al. (US 20130279740 A1)
Audio fingerprinting with bits and bit strength.

Selby et al. (US 20110307085 A1)
Audio fingerprinting with bit robustness.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER G MARLOW whose telephone number is (571)272-4536. The examiner can normally be reached Monday - Thursday 10:00 am - 8:00 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571)272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ALEXANDER G MARLOW/Assistant Examiner, Art Unit 2658

/RICHEMOND DORVIL/Supervisory Patent Examiner, Art Unit 2658