DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Response to Amendment
This communication is responsive to the applicant’s amendment dated 12/04/2020.  The applicant(s) amended claims 1 and 4-21.

Response to Arguments
Applicant's arguments with respect to claim 1 have been considered but are moot in view of the new ground(s) of rejection because the arguments pertain to the newly amended limitations.

Claim Rejections - 35 USC § 103
Claims 1 and 3-21 are rejected under 35 U.S.C. 103 as being unpatentable over Fujita et al. (US 20160171100 A1) in view of Moorer et al. (US 8069044 B1).

Regarding claims 1, 13, and 18, Fujita teaches:

“partitioning, by the processor, the first audio message into one or more frames” (sub-word) (par. 0046; ‘Hereinafter, a configuration in which the sub-word recognition technology is used will be exemplarily described based on the assumption that an unknown word is retrieved at fast speed. The keyword detection unit 32 recognizes phonemes of each of a transmitted voice and a received voice, and stores the phonemes in a phoneme sequence table 501 (see FIG. 5) in the call search DB 35 together with the appearance times of the phonemes.’);
“identifying, by the processor, one or more phonemes in the one or more frames of the first audio message” (par. 0046; ‘Hereinafter, a configuration in which the sub-word recognition technology is used will be exemplarily described based on the assumption that an unknown word is retrieved at fast speed. The keyword detection unit 32 recognizes phonemes of each of a transmitted voice and a received voice, and stores the phonemes in a phoneme sequence table 501 (see FIG. 5) in the call search DB 35 together with the appearance times of the phonemes.’);
“concatenating, by the processor, the one or more frames of the first audio message, wherein concatenating the one or more frames produces a first index file comprising a concatenation of the one or more phonemes” (par. 0046; ‘The keyword detection unit 32 recognizes phonemes of each of a transmitted voice and a received voice, and stores the phonemes in a phoneme sequence table 501 (see FIG. 5) in the call search DB 35 together with the appearance times of the phonemes.’);
“identifying, by the processor, a first phoneme string in the first index file by comparing the first index file with one or more past index files, wherein the first phoneme string is associated with a first characteristic” (par. 0059; ‘Then, the keyword search unit 37 searches for a portion in which the phoneme sequence obtained through the conversion is contained as a partial sequence in any of the phoneme sequences in the phoneme sequence table 501.’);
“based on identifying the first phoneme string in the first index file, determining, by the processor, that the first phoneme string indicates the first characteristic (keyword)” (par. 0059; ‘Then, the keyword search unit 37 searches for a portion in which the phoneme sequence obtained through the conversion is contained as a partial sequence in any of the phoneme sequences in the phoneme sequence table 501.’);
“identifying, by the processor, the first phoneme string in the second audio message” (par. 0045; ‘If there is a recorded call, the recorded call acquisition unit 31 obtains from the call recording device 2 a transmitted voice and a received voice in the recorded call corresponding to the new record ID (step S403).’; par. 0046; ‘The keyword detection unit 32 recognizes phonemes of each of a transmitted voice and a received voice, and stores the phonemes in a phoneme sequence table 501 (see FIG. 5) in the call search DB 35 together with the appearance times of the phonemes.’).
Fujita teaches identifying a first phoneme string in the second audio message (par. 0038; ‘The keyword detection unit 32 detects a keyword contained in the obtained recorded call.’).
However, Fujita does not expressly teach:
“receiving, by the processor, a second audio message from a website via the network”; and
“in response to identifying the first phoneme string in the second audio message, determining, by the processor, the second audio message indicates the first characteristic.”
Moorer teaches:
“receiving, by the processor, a second audio message from a website via the network” (col. 4, lines 27-36; ‘ In some examples, data may be sent and received using various types of encoding protocols such as voice-over-Internet-Protocol ("VoIP"), H.263, IEEE or ITU standards, and others. Data sources 220-226 may be local or remote (e.g., networked) sources and are not limited to the types shown or the descriptions provided. In other words, different types and sources of data may be used to provide audio data to application 200 and are not limited to the examples shown.’);
“in response to identifying the first phoneme string in the second audio message, determining, by the processor, the second audio message indicates the first characteristic” (col. 9, lines 21-25; ‘In other words, a score may reflect whether a detected sequence of phonemes is substantially similar in order, type, phonetic pronunciation, and other characteristics, which suggests that corresponding audio data is similar.’).
Moorer also teaches:
“based on identifying the first phoneme string in the first index file, determining, by the processor, that the first phoneme string indicates the first characteristic” and 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to modify Fujita’s voice search system by incorporating Moorer’s method of comparing phoneme sequences to known sequences in order to determine whether a given stream of audio data is a copy of another stream of audio data (Moorer: col. 1, lines 40-44).
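For illustration only, the phoneme indexing and partial-sequence search relied upon above (Fujita: pars. 0046 and 0059) may be sketched as follows. This is a minimal sketch and not the implementation of either reference: the names PhonemeIndex, build_index, and find_phoneme_string are hypothetical, and assigning each phoneme the start time of its frame is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PhonemeIndex:
    """Hypothetical stand-in for one row of a phoneme sequence table."""
    record_id: str
    phonemes: List[str]   # concatenated phoneme sequence for one audio message
    times: List[float]    # appearance time (seconds) recorded for each phoneme


def build_index(record_id: str, frames: List[Tuple[float, List[str]]]) -> PhonemeIndex:
    """Concatenate per-frame phonemes into a single index.

    Each frame is (start_time, phonemes). Assumption: every phoneme in a
    frame is stamped with that frame's start time.
    """
    phonemes: List[str] = []
    times: List[float] = []
    for start_time, frame_phonemes in frames:
        for p in frame_phonemes:
            phonemes.append(p)
            times.append(start_time)
    return PhonemeIndex(record_id, phonemes, times)


def find_phoneme_string(index: PhonemeIndex, query: List[str]) -> List[float]:
    """Return the appearance times at which `query` occurs as a contiguous
    partial sequence of the indexed phoneme sequence."""
    m = len(query)
    if m == 0:
        return []
    hits: List[float] = []
    for i in range(len(index.phonemes) - m + 1):
        if index.phonemes[i:i + m] == query:
            hits.append(index.times[i])
    return hits


# Usage: a phoneme string that spans two frames is still found, because the
# search runs over the concatenated sequence rather than frame by frame.
idx = build_index("rec-1", [(0.0, ["k", "ae"]), (0.2, ["n", "s"]), (0.4, ["ah", "l"])])
print(find_phoneme_string(idx, ["ae", "n", "s"]))
```

The sketch uses a direct sliding comparison for clarity; a production search over many recorded calls would more plausibly use an inverted index over phoneme n-grams, which neither quoted passage specifies.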

Regarding claim 3 (dep. on claim 1), the combination of Fujita in view of Moorer further teaches:
“determining statistical information about the first phoneme string” (Fujita: par. 0054; ‘The emotion score sequence table 503 contains, as the constituent items, a record ID 5031 for uniquely discriminating and identifying a recorded call, a type 5032 that indicates the type of a channel (i.e., a transmission channel or a reception channel) to which the corresponding phoneme sequence belongs, an emotion score sequence 5033 for holding an emotion score value calculated for each phoneme sequence, and a voice start time sequence 5034 for holding the start time of each phoneme sequence that is managed by the phoneme sequence table 501.’).
Regarding claim 4 (dep. on claim 3), the combination of Fujita in view of Moorer further teaches:
“includes a confidence score indicating a degree of confidence that the first phoneme string indicates the first characteristic” (Moorer: abstract; ‘Content matching using phoneme comparison and scoring is described, including extracting phonemes from a file, comparing the phonemes to other phonemes, associating a first score with the phonemes based on a probability of the other phonemes matching the phonemes, and providing the file with another file when a request is received to access one or more files having a second score that is substantially similar to the first score.’).

Regarding claims 5 (dep. on claim 4) and 21 (dep. on claim 19), the combination of Fujita in view of Moorer further teaches:
“wherein the characteristic is associated with a positive or negative sentiment” (Fujita: par. 0048; ‘When a discriminator for discriminating between anger and calmness is learned in advance from a database of an angry voice and a calm voice, using a technology of a support vector machine and the like, it becomes possible to calculate a score of an emotion of anger based on the distance from the discrimination boundary.’).
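For illustration only, scoring an emotion by distance from a discrimination boundary (Fujita: par. 0048) may be sketched as follows. This is a minimal sketch assuming a linear boundary of the kind a linear support vector machine would learn; the weights, bias, and feature values are hypothetical and are not taken from Fujita.

```python
import math
from typing import Sequence


def anger_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Signed distance of a feature vector from the hyperplane w.x + b = 0.

    Positive values fall on the 'angry' side of the boundary and negative
    values on the 'calm' side (an assumed sign convention).
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0.0:
        raise ValueError("weights must not all be zero")
    dot = sum(w * x for w, x in zip(weights, features))
    return (dot + bias) / norm


# Usage with hypothetical, pre-learned parameters.
print(anger_score([3.0, 4.0], weights=[3.0, 4.0], bias=-5.0))
```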

Regarding claim 6 (dep. on claim 4), the combination of Fujita in view of Moorer further teaches:
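“receiving, by the processor, a new set of audio messages” (Fujita: par. 0045; ‘If there is a recorded call, the recorded call acquisition unit 31 obtains from the call recording device 2 a transmitted voice and a received voice in the recorded call corresponding to the new record ID (step S403).’);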
“identifying, by the processor, a second phoneme string in the new set of audio messages, wherein the second phoneme string is associated with a second characteristic” (Fujita: par. 0059; ‘Then, the keyword search unit 37 searches for a portion in which the phoneme sequence obtained through the conversion is contained as a partial sequence in any of the phoneme sequences in the phoneme sequence table 501.’; Moorer: col. 9, lines 21-25; ‘In other words, a score may reflect whether a detected sequence of phonemes is substantially similar in order, type, phonetic pronunciation, and other characteristics, which suggests that corresponding audio data is similar.’);
“comparing, by the processor, the second phoneme string in the new set of audio messages with at least two audio messages in an old set of audio messages” (Fujita: par. 0059; ‘Then, the keyword search unit 37 searches for a portion in which the phoneme sequence obtained through the conversion is contained as a partial sequence in any of the phoneme sequences in the phoneme sequence table 501.’).
However, Fujita in view of Moorer does not expressly teach:
“based on the comparison, determining, by the processor, the second phoneme string is absent from the old set of audio messages”; and
“determining, by the processor, the second characteristic is a new topic present in the new set of audio messages.”
Detecting new topics in new sets of audio messages is well-known in the art (see McCarthy et al. (US 7092888 B1)).
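For illustration only, the two limitations of claim 6 quoted above (a second phoneme string that is present in the new set of audio messages but absent from the old set) may be sketched as follows. This is a minimal sketch under stated assumptions: each message is represented as an already-recognized list of phonemes, and the function names are hypothetical rather than drawn from any cited reference.

```python
from typing import List, Sequence


def contains(sequence: Sequence[str], query: Sequence[str]) -> bool:
    """True if `query` occurs as a contiguous partial sequence of `sequence`."""
    m = len(query)
    if m == 0:
        return False
    return any(list(sequence[i:i + m]) == list(query)
               for i in range(len(sequence) - m + 1))


def is_new_topic(query: Sequence[str],
                 new_messages: List[Sequence[str]],
                 old_messages: List[Sequence[str]]) -> bool:
    """Flag `query` as a new topic when it appears in at least one new
    message and in none of the old messages.

    The claim language recites comparison against at least two old
    messages, so fewer than two is rejected here.
    """
    if len(old_messages) < 2:
        raise ValueError("need at least two old messages to compare against")
    in_new = any(contains(msg, query) for msg in new_messages)
    in_old = any(contains(msg, query) for msg in old_messages)
    return in_new and not in_old


# Usage with made-up phoneme sequences.
old = [["h", "eh", "l", "ow"], ["b", "ay"]]
new = [["r", "iy", "f", "ah", "n", "d"]]
print(is_new_topic(["f", "ah", "n", "d"], new, old))
```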


Regarding claim 7 (dep. on claim 4), claim 15 (dep. on claim 14), and claim 20 (dep. on claim 19), the combination of Fujita in view of Moorer further teaches:
“wherein the confidence score is a probability” (Moorer: abstract; ‘Content matching using phoneme comparison and scoring is described, including extracting phonemes from a file, comparing the phonemes to other phonemes, associating a first score with the phonemes based on a probability of the other phonemes matching the phonemes, and providing the file with another file when a request is received to access one or more files having a second score that is substantially similar to the first score.’), the method further comprising:
“determining the confidence score reaches or crosses a predetermined threshold” and “based on determining the confidence score reaches or crosses the predetermined threshold, signifying the first phoneme string indicates the first characteristic” (Moorer: col. 9, lines 29-39; ‘If the score indicates a match (i.e., a substantial similarity or near-substantial similarity, the latter of which may be above a numerical value (e.g., if the score is greater than or equal to 95%, 0.95, or another quantitative threshold then identify the evaluated file (i.e., sequence of phonemes being compared) as being substantially similar to the known file in the comparison, if a score is less than or equal to 90% then do not identify the evaluated file as a substantially similar file, and others)), then a tag may be generated and associated with the file having the detected sequence of phonemes (410).’).

Regarding claim 8 (dep. on claim 5), the combination of Fujita in view of Moorer further teaches:
“determining the confidence score does not reach or cross a predetermined threshold, receiving a third message” and “in response to determining the confidence score does not reach or cross the predetermined threshold and in response to receiving the third message, re-calculating the confidence score using analysis associated with the third message” (Moorer: col. 9, lines 29-39; ‘If the score indicates a match (i.e., a substantial similarity or near-substantial similarity, the latter of which may be above a numerical value (e.g., if the score is greater than or equal to 95%, 0.95, or another quantitative threshold then identify the evaluated file (i.e., sequence of phonemes being compared) as being substantially similar to the known file in the comparison, if a score is less than or equal to 90% then do not identify the evaluated file as a substantially similar file, and others)), then a tag may be generated and associated with the file having the detected sequence of phonemes (410).’).
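For illustration only, the threshold test quoted from Moorer (col. 9, lines 29-39) may be sketched as follows. The 0.95 and 0.90 values come from the quoted passage; the treatment of scores between them as "undetermined", and the re-calculation of a score as the mean of per-message scores once a further message arrives, are assumptions made for this sketch and are not stated in the quoted passage.

```python
from typing import Sequence

MATCH_THRESHOLD = 0.95      # "greater than or equal to 95%, 0.95" in the quoted passage
NO_MATCH_THRESHOLD = 0.90   # "less than or equal to 90%" in the quoted passage


def classify(score: float) -> str:
    """Map a confidence score in [0, 1] to a match decision."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    if score >= MATCH_THRESHOLD:
        return "match"
    if score <= NO_MATCH_THRESHOLD:
        return "no_match"
    return "undetermined"   # assumption: the quoted passage leaves this band open


def recalculate(per_message_scores: Sequence[float]) -> float:
    """Hypothetical re-calculation: average the scores of all messages
    analyzed so far, including a newly received one."""
    if not per_message_scores:
        raise ValueError("at least one score is required")
    return sum(per_message_scores) / len(per_message_scores)


# Usage: a score that does not reach the threshold is re-evaluated after a
# further message contributes its own score.
first = 0.93
print(classify(first))
print(classify(recalculate([first, 0.99])))
```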

Regarding claim 9 (dep. on claim 6) and claim 16 (dep. on claim 13), the combination of Fujita in view of Moorer further teaches:
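“wherein each of the first and second phoneme strings includes two or more phonemes” (Fujita: par. 0046; ‘The keyword detection unit 32 recognizes phonemes of each of a transmitted voice and a received voice, and stores the phonemes in a phoneme sequence table 501 (see FIG. 5) in the call search DB 35 together with the appearance times of the phonemes.’).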

Regarding claim 10 (dep. on claim 7) and claim 17 (dep. on claim 14), the combination of Fujita in view of Moorer further teaches:
“wherein each of the first and second audio message includes two or more phonemes” (Fujita: par. 0046; ‘The keyword detection unit 32 recognizes phonemes of each of a transmitted voice and a received voice, and stores the phonemes in a phoneme sequence table 501 (see FIG. 5) in the call search DB 35 together with the appearance times of the phonemes.’).

Regarding claim 11 (dep. on claim 1), the combination of Fujita in view of Moorer further teaches:
“wherein the phonemes are associated with the English language” (Moorer: col. 2, lines 62-66; ‘In still other examples, an acoustic processor or phoneme recognizer may be used to identify sequences of phonemes into known words or phrases in a given language (e.g., English, Spanish, Japanese, Mandarin Chinese, and others).’).

Regarding claim 12 (dep. on claim 9), the combination of Fujita in view of Moorer further teaches:
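“analyzing a known negative message” (Fujita: par. 0048; ‘When a discriminator for discriminating between anger and calmness is learned in advance from a database of an angry voice and a calm voice, using a technology of a support vector machine and the like, it becomes possible to calculate a score of an emotion of anger based on the distance from the discrimination boundary.’);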
“identifying phoneme strings in the known negative message” (Fujita: par. 0054; ‘The emotion score sequence table 503 contains, as the constituent items, a record ID 5031 for uniquely discriminating and identifying a recorded call, a type 5032 that indicates the type of a channel (i.e., a transmission channel or a reception channel) to which the corresponding phoneme sequence belongs, an emotion score sequence 5033 for holding an emotion score value calculated for each phoneme sequence, and a voice start time sequence 5034 for holding the start time of each phoneme sequence that is managed by the phoneme sequence table 501.’).

Regarding claim 14 (dep. on claim 13), the combination of Fujita in view of Moorer further teaches:
“further comprising instructions to determine statistical information about the first phoneme string, wherein the statistical information includes a confidence score indicating a degree of confidence that the first phoneme string indicates the first characteristic” (Fujita: par. 0054; ‘The emotion score sequence table 503 contains, as the constituent items, a record ID 5031 for uniquely discriminating and identifying a recorded call, a type 5032 that indicates the type of a channel (i.e., a transmission channel or a reception channel) to which the corresponding phoneme sequence belongs, an emotion score sequence 5033 for holding an emotion score value calculated for each phoneme sequence, and a voice start time sequence 5034 for holding the start time of each phoneme sequence that is managed by the phoneme sequence table 501.’).

Regarding claim 19 (dep. on claim 18), the combination of Fujita in view of Moorer further teaches:
“further comprising a statistics analyzer operable to determine statistical information about the phoneme string, wherein the statistical information includes a confidence score indicating a degree of confidence that the phoneme string indicates the first characteristic” (Fujita: par. 0054; ‘The emotion score sequence table 503 contains, as the constituent items, a record ID 5031 for uniquely discriminating and identifying a recorded call, a type 5032 that indicates the type of a channel (i.e., a transmission channel or a reception channel) to which the corresponding phoneme sequence belongs, an emotion score sequence 5033 for holding an emotion score value calculated for each phoneme sequence, and a voice start time sequence 5034 for holding the start time of each phoneme sequence that is managed by the phoneme sequence table 501.’; Moorer: abstract; ‘Content matching using phoneme comparison and scoring is described, including extracting phonemes from a file, comparing the phonemes to other phonemes, associating a first score with the phonemes based on a probability of the other phonemes matching the phonemes, and providing the file with another file when a request is received to access one or more files having a second score that is substantially similar to the first score.’).

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARK VILLENA whose telephone number is (571)270-3191.  The examiner can normally be reached on 10 am - 6pm EST Monday through Friday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached on (571) 272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access 


MARK . VILLENA
Examiner
Art Unit 2658



/MARK VILLENA/           Examiner, Art Unit 2658