Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This action is responsive to the Amendments and Remarks filed in the U.S. on 1/6/2022.  Claims 17-22, 25-31, and 34-40 are pending in the case. Claims 17 and 26 are written in independent form. Claims 1-16, 23-24, and 32-33 have been cancelled.
Applicant’s amendments and remarks filed on 1/6/2022 have been fully considered and were found to overcome the previously cited prior art, thus necessitating the new ground of rejection presented herein. Accordingly, THIS ACTION IS MADE FINAL.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that use the word “configured to” but are nonetheless not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph because the claim limitation(s) recite(s) sufficient structure, materials, or acts to entirely perform the recited function.  Such claim limitation(s) is/are:
a computer device comprising a “memory configured to store instructions” in claim 26;
Because this/these claim limitation(s) is/are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are not being interpreted to cover only the corresponding structure, material, or acts described in the specification as performing the claimed function, and equivalents thereof.
If applicant intends to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to remove the structure, materials, or acts that performs the claimed function; or (2) present a sufficient showing that the claim limitation(s) does/do not recite sufficient structure, materials, or acts to perform the claimed function.


Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 17-22, 26-31, and 35 are rejected under 35 U.S.C. 102(a)(2) as being anticipated by Ingrassia, JR. et al. (U.S. Pre-Grant Publication No. 2011/0295843, hereinafter referred to as Ingrassia).

Regarding Claim 17
Takano teaches a method for searching an audio, performed by a computer device, the method comprising an application having a music play function, the method comprising:
receiving, from the application, a trigger instruction for searching an audio;
Takano teaches receiving a trigger instruction as “an indication made by the user for start of input of the user’s voice” which is to be used for searching audio data (Para. [0040]).
receiving, in response to the trigger instruction for searching the audio, trigger events via a search interface of the application;
Takano teaches receiving trigger events in the form of a singing voice of a user made up of notes having a pitch range, the receiving of the singing voice of the user being done in response to the indication made by the user for start of input of the user’s voice (Paras. [0041]-[0042]).
detecting a predetermined trigger event in the received trigger events, wherein the predetermined trigger event is input to the computer device based on content of the audio to be searched;
Takano teaches detecting predetermined events such as notes having a pitch which are part of the input singing voice (Paras. [0041]-[0042]), wherein the notes and pitch are used for searching audio (Paras. [0046]-[0047]).
recording, each time a predetermined trigger event is detected, a time point when the detected predetermined trigger event occurs;
Takano teaches “the search query includes, in time sequence, pitch differences that have been detected up to a current time point from a start of voice input” (Para. [0045]) where “each of time points t1 to t7 [in Fig. 7] indicates a time point at which a pitch is determined as being stable (i.e. a time point at which a new note is detected) for a corresponding one of periods D1 to D7” (Para. [0046]).
acquiring, when a predetermined end event is detected, all recorded time points to obtain a time point sequence;
Takano teaches “the search query includes, in time sequence, pitch differences that have been detected up to a current time point from a start of voice input” (Para. [0045]) where a predetermined end event being detected can be “a search query may be generated when an amount of data of the symbolized input voice exceeds a set threshold” or “a search query may be generated when a prescribed number of new pitch differences are detected in the input voice” or “a search query may be generated when input of the voice is terminated” (Para. [0095]) thereby teaching multiple examples of predetermined end events that can cause the time point sequences of collected notes/pitches to be acquired.
wherein the time point sequence reflects audio change characteristics of the audio to be searched;
Takano teaches the time point sequences of notes reflecting at least changes in pitch of the audio to be searched (Paras. [0046]-[0047]).
selecting a target reference time sequence matching the time point sequence from a plurality of reference time sequences stored in a database, wherein each of the reference time sequence is a sequence formed of time information of a plurality of contiguous audio units in audio data of a song, the contiguous audio units are obtained by segmenting the audio data of the song, and the time information comprises a start time point or a time duration of an audio unit;
Takano teaches selecting a music track from among a plurality of music tracks in a database to be a matching object to the received search query (time sequence of notes and pitches) (Para. [0062]) wherein each of the music tracks is a contiguous time sequence of notes that make up the music track (Paras. [0074]-[0075]). Takano further teaches segmenting the audio data of the song such that “the database also includes data in which the main melody of each music track is symbolized; for example, the melody of the main vocal section where the music track is a song” (Para. [0062]) thereby teaching the segmented melody with a corresponding start time point of when the melody starts in the corresponding music track.
determining, based on a stored corresponding relationship between audio data and the reference time sequences, target audio data corresponding to the target reference time sequence as selected.
Takano teaches determining, based on a relationship of audio data segments, such as a melody corresponding to music tracks, target audio data corresponding to the time sequence matching the input search information sung by a user (Paras. [0045], [0062] and [0074]-[0075]).
wherein the target audio data is audio data of the audio to be searched; and
Takano teaches that the target audio data is music track audio data to be searched based on the input search information sung by a user (Paras. [0040]-[0042]).
playing the target audio data by the application.
Takano teaches a karaoke system playing back the matching music track (Para. [0088]) and further teaches that “the application example of the music track search system 1 is not limited to a karaoke system. For example, the music track search system may be applied to a music track search in a music distribution service provided via a network, or to a music track search in a music player” (Para. [0107]).
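
For illustration only, the recording and acquiring limitations discussed above can be sketched as follows. The event tuples, the labels "note" and "end", and the function name collect_time_points are assumptions made for this sketch; they do not appear in Takano or in the claims.

```python
from typing import Iterable, List, Tuple


def collect_time_points(
    events: Iterable[Tuple[float, str]],
    trigger: str = "note",  # hypothetical label for a predetermined trigger event
    end: str = "end",       # hypothetical label for the predetermined end event
) -> List[float]:
    """Record the time point of each trigger event until the end event is seen."""
    time_points: List[float] = []
    for timestamp, kind in events:
        if kind == end:
            break  # the end event closes the sequence; later events are ignored
        if kind == trigger:
            time_points.append(timestamp)
    return time_points


# Example input stream: (time in seconds, event label).
events = [
    (0.0, "note"),
    (0.5, "noise"),
    (0.9, "note"),
    (1.4, "note"),
    (2.0, "end"),
    (2.5, "note"),
]
sequence = collect_time_points(events)
```

In this sketch the resulting list plays the role of the time point sequence; events that are not trigger events, or that arrive after the end event, are not recorded.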

Regarding Claim 18
Takano further teaches:
wherein the selecting a target reference time sequence matching the time point sequence from the plurality of reference time sequences stored in the databases comprises:
determining a difference degree between each pre-stored reference time sequence and the time point sequence respectively, and
Takano teaches calculating a matching matrix for all music tracks and the received search query (Para. [0066]) where the matching matrix is used to determine a difference degree, in the form of a score, between the search query and each of the music tracks and a smaller score indicates a stronger degree of similarity and thus a smaller degree of difference (Paras. [0063]-[0065]).
selecting a reference time sequence which has a minimum difference degree from the time point sequence as the target reference time sequence.
Takano teaches calculating a matching matrix for all music tracks and the received search query (Para. [0066]) where the matching matrix is used to determine a difference degree, in the form of a score, between the search query and each of the music tracks and a smaller score indicates a stronger degree of similarity and thus a smaller degree of difference (Paras. [0063]-[0065]). Takano further teaches, in one embodiment, automatically selecting “the music track with the highest degree of similarity (lowest score)” (Para. [0086]).

Regarding Claim 19
Takano further teaches:
wherein the determining the difference degree between each stored reference time sequence and the time point sequence comprises:
calculating an edit distance between each stored reference time sequence and the time point sequence; and
Takano teaches “a search is performed using a search algorithm for partial sequence matching based on an edit distance” (Para. [0038] & Fig. 1) thereby teaching calculating an edit distance between each stored reference time sequence and the time point sequence in the search query.
taking the edit distance between each stored reference time sequence and the time point sequence as the difference degree between each pre-stored reference time sequence and the time point sequence.
Takano teaches “at step S33, the server program calculates a matching matrix for the music track to be matched (specifically, the edit distance in each cell and the last distance (i.e., score) between the search query and the music track)” (Para. [0063]).
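
For illustration only, an edit distance used as a difference degree, followed by selection of the reference with the minimum score, can be sketched as below. This is a plain whole-sequence Levenshtein distance, not the partial sequence matching algorithm that Takano describes; the function names and the sample sequences are assumptions made for this sketch.

```python
from typing import Dict, Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x from a
                curr[j - 1] + 1,     # insert y into a
                prev[j - 1] + cost,  # substitute x with y (or keep a match)
            ))
        prev = curr
    return prev[-1]


def best_match(query: Sequence, references: Dict[str, Sequence]) -> str:
    """Return the name of the reference with the smallest edit distance."""
    return min(references, key=lambda name: edit_distance(query, references[name]))


# Hypothetical symbolized sequences (for example, pitch differences).
query = [2, 2, -1]
references = {"track_a": [2, 3, -1, 4], "track_b": [2, 2, -1, 0]}
selected = best_match(query, references)
```

A smaller distance corresponds to a smaller difference degree, so the minimum-score reference is the one selected.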

Regarding Claim 20
Takano further teaches:
wherein the detecting a predetermined trigger event comprises any one of:
detecting that the computer device is shaken;
detecting a touch signal in a predetermined region of a touch screen of the computer device;
acquiring a plurality of frames of images by an image capturing component of the computer device, and detecting an image of a predetermined user action in the plurality of frames of images; and
acquiring ambient audio data by an audio capturing component of the computer device, and identifying predetermined audio feature information in the ambient audio data.
Takano teaches acquiring ambient audio data in the form of a user’s singing voice as input (Para. [0027]) and detecting predetermined events such as notes having a pitch which are part of the input singing voice (Paras. [0041]-[0042]), wherein the notes and pitch are used for searching audio (Paras. [0046]-[0047]).


Regarding Claim 21
Takano further teaches:
wherein the audio unit is an audio segment corresponding to a note.
Takano teaches detecting predetermined events such as notes having a pitch which are part of the input singing voice (Paras. [0041]-[0042]), wherein the notes and pitch are used for searching audio (Paras. [0046]-[0047]).


Regarding Claim 22
Takano further teaches:
wherein the audio unit is an audio segment corresponding to a word in lyrics corresponding to the audio data.
Takano teaches “in addition to or alternative to identifiers and/or score of music tracks, there may be displayed information for specifying similar sections (e.g., musical scores or lyrics corresponding to similar sections)” (Para. [0068]) thereby teaching audio segments corresponding to lyrics corresponding to musical tracks.

Regarding Claim 26
All of the limitations herein are similar to some or all of the limitations of Claim 17.
Takano further teaches:
a processor (Para. [0032]); and
a memory configured to store instructions executable by the processor (Para. [0032]);
wherein the instructions, when executed by the processor, cause the processor to perform steps (Para. [0032]).

Regarding Claim 27
All of the limitations herein are similar to some or all of the limitations of Claim 18.

Regarding Claim 28
All of the limitations herein are similar to some or all of the limitations of Claim 19.

Regarding Claim 29
All of the limitations herein are similar to some or all of the limitations of Claim 20.

Regarding Claim 30
All of the limitations herein are similar to some or all of the limitations of Claim 21.

Regarding Claim 31
All of the limitations herein are similar to some or all of the limitations of Claim 22.

Regarding Claim 35
Takano further teaches:
a non-transitory computer-readable storage medium (Para. [0032]),
wherein the non-transitory computer-readable storage medium stores at least one instruction, at least one program, a code set or an instruction set, the at least one instruction, the at least one program, the code set or the instruction set being executed and loaded by a processor to perform the method for searching an audio (Para. [0032]-[0033]).

Regarding Claim 36
Takano further teaches:
wherein the start time point of each audio unit comprises:
start time points of audio segments corresponding to each note of the audio unit, or
Takano teaches selecting a music track from among a plurality of music tracks in a database to be a matching object to the received search query (time sequence of notes and pitches) (Para. [0062]) wherein each of the music tracks is a contiguous time sequence of notes that make up the music track (Paras. [0074]-[0075]). Takano further teaches segmenting the audio data of the song such that “the database also includes data in which the main melody of each music track is symbolized; for example, the melody of the main vocal section where the music track is a song” (Para. [0062]) thereby teaching the segmented melody with a corresponding start time point of when the melody starts in the corresponding music track.
Takano also teaches including in the search query “information on a length of each note” where “note length information includes, for example, information indicative of an onset time difference. By onset time difference is meant a time length from a time point at which input of one note starts to a time point at which input of a next note starts.” (Para. [0071]) thereby teaching using known start time points corresponding to each note of an audio unit.
a start time point of an audio segment corresponding to one or more words in a lyric corresponding to the audio data.

Regarding Claim 38
Takano further teaches:
wherein each of the reference time sequences stored in the database comprise time information of a plurality of contiguous audio units corresponding to a climactic or verse part of one song.
Takano teaches selecting a music track from among a plurality of music tracks in a database to be a matching object to the received search query (time sequence of notes and pitches) (Para. [0062]) wherein each of the music tracks is a contiguous time sequence of notes that make up the music track (Paras. [0074]-[0075]). Takano further teaches segmenting the audio data of the song such that “the database also includes data in which the main melody of each music track is symbolized; for example, the melody of the main vocal section where the music track is a song” (Para. [0062]) thereby teaching the segmented melody for a specific verse, the main vocal section. It is further noted that in songs, verses have a corresponding melody.

Regarding Claim 39
Takano further teaches wherein the determining the difference degree between each stored reference sequence and the time point sequence respectively comprises:
calculating a cross-correlation between each stored reference time sequence and the time point sequence by a cross-correlation function; and
Takano teaches calculating a matching matrix for all music tracks and the received search query (Para. [0066]) where the matching matrix is used to determine a difference degree, in the form of a score, between the search query and each of the music tracks and a smaller score indicates a stronger degree of similarity and thus a smaller degree of difference (Paras. [0063]-[0065]) thereby teaching a cross-correlation function to determine a difference degree between matrices representing the search query and the corresponding audio data for each of the music tracks being searched.
taking a cross-correlation between each stored reference time sequence and the time point sequence as a difference degree between each stored reference time sequence and the time point sequence.

Takano teaches calculating a matching matrix for all music tracks and the received search query (Para. [0066]) where the matching matrix is used to determine a difference degree, in the form of a score, between the search query and each of the music tracks and a smaller score indicates a stronger degree of similarity and thus a smaller degree of difference (Paras. [0063]-[0065]) thereby teaching a cross-correlation function to determine a difference degree between matrices representing the search query and the corresponding audio data for each of the music tracks being searched.
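
For illustration only, a discrete cross-correlation between two numeric sequences, as the term is generally understood in signal processing, can be sketched as below. The function name and the sample values are assumptions made for this sketch and are not drawn from Takano or from the application.

```python
from typing import List, Sequence


def cross_correlation(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Full discrete cross-correlation: one value per lag of b relative to a."""
    n, m = len(a), len(b)
    result: List[float] = []
    # Lags run from -(m - 1), where only the last element of b overlaps a,
    # up to n - 1, where only the first element of b overlaps a.
    for lag in range(-(m - 1), n):
        total = 0.0
        for j in range(m):
            i = lag + j
            if 0 <= i < n:
                total += a[i] * b[j]
        result.append(total)
    return result


# Hypothetical sequences standing in for a reference and a query.
values = cross_correlation([1, 2, 3], [0, 1, 0.5])
peak = max(values)
```

The position of the peak indicates the lag at which the two sequences align most closely under this measure.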


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 20, 25, 29, 34, and 37 are rejected under 35 U.S.C. 103 as being unpatentable over Takano, and further in view of Grigg et al. (U.S. Pre-Grant Publication No. 2015/0294096, hereinafter referred to as Grigg).

Regarding Claim 20
Takano, as stated above, teaches:
wherein the detecting a predetermined trigger event comprises:
acquiring ambient audio data by an audio capturing component of the computer device, and identifying predetermined audio feature information in the ambient audio data.

Takano teaches all of the elements of the claimed invention as recited above except:
wherein the detecting a predetermined trigger event comprises any one of:
detecting that the computer device is shaken;
detecting a touch signal in a predetermined region of a touch screen of the computer device;
acquiring a plurality of frames of images by an image capturing component of the computer device, and detecting an image of a predetermined user action in the plurality of frames of images;

However, in the related field of endeavor of comparing search input to metadata associated with an audio clip, Grigg teaches:
wherein the detecting a predetermined trigger event comprises any one of:
detecting that the computer device is shaken;
Grigg teaches “the user may be presented with a portion of a favorite audio clip (e.g., a song, a tune, a beat, a melody, or the like) and will knock, tap, or push buttons in time with the audio clip” (Para. [0023]) where “the user, upon hearing her or his predetermined music playing as initiated by the apparatus, taps out a rhythm of a kickdrum using a sensor associated with the second (wearable) device, such as an accelerometer, a touch sensor, a touch screen, a capacitor, a biometric scan, or the like” (Para. [0037]). Therefore, Grigg teaches using an accelerometer as a sensor for detecting a set of inputs by a user.
detecting a touch signal in a predetermined region of a touch screen of the computer device;
Grigg teaches “the user may be presented with a portion of a favorite audio clip (e.g., a song, a tune, a beat, a melody, or the like) and will knock, tap, or push buttons in time with the audio clip” (Para. [0023]) where “the user, upon hearing her or his predetermined music playing as initiated by the apparatus, taps out a rhythm of a kickdrum using a sensor associated with the second (wearable) device, such as an accelerometer, a touch sensor, a touch screen, a capacitor, a biometric scan, or the like” (Para. [0037]).
acquiring a plurality of frames of images by an image capturing component of the computer device, and detecting an image of a predetermined user action in the plurality of frames of images;

Thus, it would have been obvious to one of ordinary skill in the art, having the teachings of Grigg and Takano at the time that the claimed invention was effectively filed, to have combined the input methods to search for metadata associated with musical data, as taught by Grigg, with the system and method for searching for music tracks, as taught by Takano.
One would have been motivated to make such combination because Grigg teaches an alternative input method for “musically-inclined users” for matching user input to metadata associated with musical data (Para. [0001]), the alternative input using a sensor associated with a device, such as an accelerometer, a touch sensor, a touch screen, a capacitor, a biometric scan, or the like (Para. [0037]) and it would be obvious to a person having ordinary skill in the art to have incorporated the additional input methods taught by Grigg to expand the input capabilities of Takano to provide alternative input methods for musically-inclined users.

Regarding Claim 25
Takano and Grigg further teach:
wherein in a case that the time information comprises the time duration of the audio unit, the selecting the target reference time sequence matching the time point sequence from the plurality of reference time sequences stored in the database comprises:
determining, based on the time point sequence, a time difference between each two adjacent time points in the time point sequence to obtain a time difference sequence; and
Grigg teaches “the sensors are configured to determine at least an input type, an input length, an input duration, an input time (e.g., the time at which the input was received), a length of time between receiving multiple inputs, a rhythm, a tempo, a velocity, a pitch, or the like of each input” (Para. [0040]).
Ingrassia teaches determining context for a user “using any number of classification or regression models” which matches the context data, such as collected relevant sensor readings, to pre-stored contexts (Para. [0064]). Ingrassia further teaches matching collected ambient noise to determine whether “a user is in a motor vehicle” (Para. [0039]) thereby teaching comparing collected ambient noise, which is merely a sequence of collected audio points, as contextual information and matching the collected ambient audio points to pre-stored reference to determine that the ambient noise indicates that “the user is in a motor vehicle”.
selecting the target reference time sequence matching the time difference sequence from the stored reference time sequences.
Grigg teaches “the computer-program product further causes the apparatus to process the plurality of rhythmic inputs, wherein processing the plurality of rhythmic inputs includes comparing the plurality of rhythmic inputs to at least one predetermined rhythmic pattern…to determine that the plurality of rhythmic inputs matches at least one predetermined rhythmic pattern” (para. [0022]) thereby teaching using the time difference of the rhythmic patterns, since rhythmic patterns are rhythmic movement or sound repeated at regular time intervals, to select a matching predetermined rhythmic pattern for an input rhythmic pattern.
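
For illustration only, deriving a time difference sequence from adjacent time points and matching it against stored duration sequences can be sketched as below. The sum of absolute differences used as the mismatch measure, the function names, and the sample data are assumptions made for this sketch; neither Grigg nor Takano is asserted to use this particular measure.

```python
from typing import Dict, List, Sequence


def time_differences(time_points: Sequence[float]) -> List[float]:
    """Difference between each two adjacent time points."""
    return [later - earlier for earlier, later in zip(time_points, time_points[1:])]


def closest_reference(diffs: Sequence[float], references: Dict[str, Sequence[float]]) -> str:
    """Return the reference whose durations differ least from diffs."""
    def mismatch(name: str) -> float:
        ref = references[name]
        if len(ref) != len(diffs):
            return float("inf")  # assumption: only equal-length sequences are compared
        return sum(abs(d - r) for d, r in zip(diffs, ref))
    return min(references, key=mismatch)


# Hypothetical tap times (seconds) and stored duration sequences.
taps = [0.0, 0.5, 1.5, 2.0]
gaps = time_differences(taps)
references = {"song_a": [0.5, 0.5, 0.5], "song_b": [0.5, 1.0, 0.5]}
selected = closest_reference(gaps, references)
```

Working with differences rather than absolute time points makes the comparison independent of when the input started.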

Regarding Claim 29
All of the limitations herein are similar to some or all of the limitations of Claim 20.

Regarding Claim 34
All of the limitations herein are similar to some or all of the limitations of Claim 25.

Regarding Claim 37
Takano and Grigg further teach:
wherein each of the reference time sequences stored in the database comprise time information of a plurality of contiguous audio units obtained by segmenting each song in different ways,
Grigg teaches segmenting each song in different ways by teaching segmenting a track down to “a favorite audio clip (e.g., a song, a tune, a beat, a melody, or the like)” (Para. [0023]) as well as “for example, the predetermined rhythmic pattern may be configured to follow along with an instrument or vocal melody in a song” (Para. [0037]). Therefore, Grigg teaches segmenting each song in different ways based on a favorite clip, such as a tune, beat or melody, as well as reducing the favorite clip further to an instrument or vocal part.
the different types of audio units comprise:
an audio segment corresponding to each note or
Takano teaches including in the search query “information on a length of each note” where “note length information includes, for example, information indicative of an onset time difference. By onset time difference is meant a time length from a time point at which input of one note starts to a time point at which input of a next note starts.” (Para. [0071]) thereby teaching segmenting audio down to the note.
an audio segment corresponding to one or more words in a lyric corresponding to the audio data.
Grigg teaches “for example, the predetermined rhythmic pattern may be configured to follow along with an instrument or vocal melody in a song” (Para. [0037]) thereby teaching an audio segment corresponding to one or more words in the vocal melody corresponding to audio data.

Claim 40 is rejected under 35 U.S.C. 103 as being unpatentable over Takano, and further in view of Xu et al. (U.S. Pre-Grant Publication No. 2007/0201558, hereinafter referred to as Xu).

Regarding Claim 40
Takano teaches all of the elements of the claimed invention as recited above except:
wherein the determining the difference degree between each stored reference time sequence and the time point sequence comprises calculating a cross-correlation between each stored reference time sequence and the time point sequence by an Earth Mover’s Distance (EMD) algorithm.

However, in the related field of endeavor of segmenting media, Xu teaches:
wherein the determining the difference degree between each stored reference time sequence and the time point sequence comprises calculating a cross-correlation between each stored reference time sequence and the time point sequence by an Earth Mover’s Distance (EMD) algorithm.
Xu teaches computing a dissimilarity between audio content signatures for segments “using the well-known Earth Mover’s Distance (EMD) metric. The EMD metric gives a global view of the changes in audio content in the time domain.” (Para. [0051]).

Thus, it would have been obvious to one of ordinary skill in the art, having the teachings of Xu and Takano at the time that the claimed invention was effectively filed, to have combined the use of the Earth Mover’s Distance (EMD) metric for calculating a dissimilarity, as taught by Xu, with the system and method for searching for music tracks, as taught by Takano.
One would have been motivated to make such combination because Xu teaches using the known Earth Mover’s Distance (EMD) metric for improved content-based retrieval from a large database (Para. [0108]) and it would be obvious to a person having ordinary skill in the art that incorporating a distance equation such as EMD that improves the retrieval from a large database would create a system in Takano that ensures the ability to handle a larger database of music tracks.
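
For illustration only, the Earth Mover’s Distance between two one-dimensional histograms defined over the same equally spaced bins can be sketched as below. This is a simplified one-dimensional case in which the distance reduces to the summed absolute difference of the cumulative distributions; it is not asserted to be the computation Xu performs on audio content signatures, and the function name is an assumption made for this sketch.

```python
from itertools import accumulate
from typing import Sequence


def emd_1d(p: Sequence[float], q: Sequence[float]) -> float:
    """EMD between two histograms on the same unit-spaced bins."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    # Normalize each histogram to unit mass, then compare running totals.
    cdf_p = accumulate(x / total_p for x in p)
    cdf_q = accumulate(x / total_q for x in q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


# All mass must move two bins to the right, so the distance is 2.0.
far = emd_1d([1, 0, 0], [0, 0, 1])
same = emd_1d([1, 1], [2, 2])
```

Because both inputs are normalized first, histograms that differ only in scale are treated as identical in this sketch.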


Response to Amendment
Applicant’s Amendments, filed on 1/6/2022, are acknowledged and accepted.
In light of the amendments filed on 1/6/2022, the 101 rejection of claim 35 for being directed to non-statutory subject matter has been withdrawn.
In light of the amendments filed on 1/6/2022, the 101 rejection of claims 17 and 26 for being directed to an abstract idea has been withdrawn.
As stated above and restated here for convenience, Applicant’s amendments and remarks filed on 1/6/2022 have been fully considered and were found to overcome the previously cited prior art, thus necessitating the new ground of rejection presented herein. Accordingly, THIS ACTION IS MADE FINAL.


Response to Arguments
On pages 14-15 of the remarks filed on 1/6/2022, Applicant argues that “in Ingrassia, its purpose does not lie in searching for a specific song, but lies in searching for songs within a certain attribute or characteristic, so that several songs that meet the requirements are added to a playlist” whereas “amended claim 17 provides, for example, a technical solution in which audio data of a specific song can be searched without knowing the title of the song” by using trigger events at time points that make up the input search requirements. The significantly amended independent claims were found to overcome the Ingrassia reference, thereby necessitating the new grounds of rejection presented herein.
On pages 15-16 of the remarks filed on 1/6/2022, Applicant argues that Ingrassia does not teach “the time point sequence reflects audio change characteristics of the audio to be searched” because “in Ingrassia the time point appears to refer to calendar time or real time” when the office action states “Ingrassia teaches ‘if a user previously liked a track in a given situation, the same track and other similar tracks can be expected to be good recommendations when the same user and situation are encountered the next time’ (Para. [0037]) thereby teaching recording time points that the sensor readings are sensed”. The significantly amended independent claims were found to overcome the Ingrassia reference, thereby necessitating the new grounds of rejection presented herein.
On pages 16-18 of the remarks filed on 1/6/2022, Applicant argues that “Partridge does not (and cannot) cure the deficiencies of Ingrassia” and more specifically that “Partridge does not disclose or suggest the subject matter recited in amended claim 17, including, inter alia, ‘detecting a predetermined trigger event in the received trigger events, wherein the predetermined trigger event is input to the computer device based on content of the audio to be searched; recording, each time one predetermined trigger event is detected, a time point when the detected predetermined trigger event occurs; acquiring, when a predetermined end event is detected, all recorded time points to obtain a time point sequence, wherein the time point sequence reflects audio change characteristics of the audio to be searched.’”. The Office agrees with this statement when examining at least the amended independent claims, thereby necessitating the new grounds of rejection presented herein.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Iwamura (U.S. Patent No. 6,188,010) teaches searching for a song title when only the melody is known, comprising an interface that allows a user to enter the melody in an easy-to-understand manner, wherein a remote music database with melody information is searched for the melody entered by the user using, for example, a peak or differential matching algorithm.
Burlik et al. (U.S. Pre-Grant Publication 2007/0254271) teaches sensing a repetitive motion, such as walking, running, or tapping, the rate of the repetitive motion, and matching music to the rate of the repetitive motion using tempo.
Henshall (U.S. Pre-Grant Publication No. 2013/0297599) teaches mapping trigger cues to audio segments and playing the audio segments when the trigger cue is received.
Chang (U.S. Pre-Grant Publication No. 2007/0143499) teaches using trigger conditions such as location and time in order to determine whether a scheduled job/event can be executed.
Plans et al. (U.S. Pre-Grant Publication No. 2015/0093729) teaches generating music in relation to biometric data of a user based on captured image data of a body part and deriving a biometric signal from the image data.
Kumar et al. (U.S. Pre-Grant Publication No. 2016/0088031) teaches receiving a first data set representing an image and an event associated with the image, receiving a second data set representing a media item and the event, and associating the media item with the image, storing the association in storage.
Non Patent Literature S. Xu, S. Chen, K. Y. Yip, F. C. M. Lau and X. Qin, "A Two-Stage Audio Retrieval Method for Searching Unannotated Audio Clips," 2008 Tenth IEEE International Symposium on Multimedia, 2008, pp. 334-339, doi: 10.1109/ISM.2008.46 teaches a two-stage audio retrieval method consisting of a first stage text-based retrieval and a second stage content-based retrieval, wherein the second stage uses audio clips returned in the first stage as input to return, from the second stage, similar audio clips based on a pairwise audio content similarity measure.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ROBERT F MAY whose telephone number is (571)272-3195. The examiner can normally be reached Monday-Friday 9:30am to 6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hosain Alam, can be reached at 571-272-3978. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/ROBERT F MAY/Examiner, Art Unit 2154    5/4/2022

/HOSAIN T ALAM/Supervisory Patent Examiner, Art Unit 2154