DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
In response to the Office Action mailed 1/13/2021, applicant has submitted an amendment filed 4/13/2021.
Claims 1 and 5-18 have been amended.  New claim 21 has been added.
Response to Arguments
	New prior art rejections necessitated by Applicant’s amendments are presented below.
	This action is non-final because the previous rejection of claim 8 was determined to be improper, and a new rejection of claim 8 is presented below.

Claim Objections
Claim 18 is objected to because of the following informalities:  
Claim 18, in line 13, recites “a hardware processor, coupled to the audio input, visual input and the data store” which would be better recited as --a hardware processor, coupled to the audio input, the visual input and the data store--.
Appropriate correction is required.


Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 15-16 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

Claim 15 was amended to recite “wherein determining whether any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker comprises determining a match…” where “corresponds to a voiceprint associated with a respective consenting user profile” appears to have been deleted (i.e. previous claim 15 recited “wherein determining whether said extracted biometric data associated with the first speaker or the second speaker corresponds to a voiceprint associated with a respective consenting user profile comprises”).  The phrasing “wherein determining whether any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker comprises” is unusual.

The dependent claims incorporate the issues of their respective parent claims.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 7-8, 10-11, 17, 18, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Bapat et al. (US 2018/0349684), hereafter Bapat, in view of Matthews, III et al. (US 2012/0297284), hereafter Matthews, and Watry (US 2016/0361663).

As per Claims 1 and 21, Bapat suggests (along with its corresponding computer-readable medium equivalent) A method comprising: storing one or more user profiles that are each associated with one of one or more users of a computing system, wherein each user profile is associated with a voiceprint that was generated to uniquely characterize voice characteristics of a respective user of the one or more users of the computing system, wherein at least one of the stored one or more user profiles is a consenting user profile…, and wherein the consenting user profile…is associated with a record indicating consent… to store biometric data…; processing an audio signal containing speech data received from a plurality of speakers at the computing system, wherein the plurality of speakers comprise a first… speaker and a second… speaker, and wherein the processing of the audio signal comprises extracting first biometric data associated with the first… speaker and second biometric data associated with the second… speaker; determining whether any of the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated with the second… speaker corresponds to a voiceprint associated with a respective consenting user profile associated with a record indicating consent… to store biometric data… ; responsive to determining that the first extracted biometric data associated with the first… speaker corresponds to the voiceprint associated with the respective consenting user profile associated with the record indicating consent… to store the biometric data…, performing at least one of: (i) processing speech data from the first… speaker; or (ii) storing the speech data from the first… speaker in an archive; responsive to determining that the second extracted biometric data associated with the second… speaker does not correspond to any voiceprint associated with any consenting user profile associated with a record indicating consent… to 
store biometric data…: deleting speech data from the second… speaker within a predetermined time period (paragraphs 9-11, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 5, 7A-7B [all paragraphs and Figures are cited for each limitation with “key” paragraphs and Figures pertaining to each limitation identified below, i.e. all other paragraphs and Figures not specifically referenced for any particular limitation are eligible to provide context and additional support]
	“A method comprising: storing one or more user profiles that are each associated with one of one or more users of a computing system, wherein each user profile is associated with a voiceprint that was generated to uniquely characterize voice 
	“processing an audio signal containing speech data received from a plurality of speakers at the computing system, wherein the plurality of speakers comprise a first… speaker and a second… speaker, and wherein the processing of the audio signal comprises extracting first biometric data associated with the first… speaker and second biometric data associated with the second… speaker;”: Paragraphs 196-197 describe 
“determining whether any of the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated with the second… speaker corresponds to a voiceprint associated with a respective consenting user profile associated with a record indicating consent… to store biometric data…”: [In addition to what was already discussed above] comparing the respective speech characterization data of each of the multiple persons [derived from the audio received at the single device] to speech characterization “voiceprint” data in the “consenting user profiles” [the consenting user profiles are suggested to be associated with a record indicating consent to store biometric data as discussed above in the portion of this rejection 
“responsive to determining that the first extracted biometric data associated with the first… speaker corresponds to the voiceprint associated with the respective consenting user profile associated with the record indicating consent… to store the biometric data…, performing at least one of: (i) processing speech data from the first… speaker; or (ii) storing the speech data from the first… speaker in an archive;”: In addition to what was discussed above, paragraphs 238-239 further describe where characterization data “of the detected person[s]” is compared to stored characterization data and storing characterization data for recognized persons [which is at least suggested to refer to storing the characterization data obtained from the system input, 
“responsive to determining that the second extracted biometric data associated with the second… speaker does not correspond to any voiceprint associated with any consenting user profile associated with a record indicating consent… to store biometric data…: deleting speech data from the second… speaker within a predetermined time period”: In addition to what was discussed above, paragraph 228 suggests where a detected person can be designated as unknown [when there is no predetermined similarity between input characterization data and any stored characterization data in any consenting user profile, where the consenting user profiles are suggested to be 
Bapat does not, but Matthews suggests A method comprising: storing one or more user profiles that are each associated with one of one or more users of a computing system, wherein each user profile is associated with a voiceprint that was generated to uniquely characterize voice characteristics of a respective user of the one or more users of the computing system, wherein at least one of the stored one or more user profiles is a consenting user profile of a child, and wherein the consenting user profile of the child is associated with a record indicating consent… to store biometric data of the child; processing an audio signal containing speech data received from a plurality of speakers at the computing system, wherein the plurality of speakers comprise a first child speaker and a second child speaker, and wherein the processing of the audio signal comprises extracting first biometric data associated with the first child speaker and second biometric data associated with the second child speaker; determining whether any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker corresponds to a voiceprint associated with a respective consenting user profile associated with a record indicating consent… to store biometric data of the respective child; responsive to determining that the first extracted biometric data associated with the first child speaker corresponds to the voiceprint associated with the respective consenting user profile associated with the record indicating consent… to store the biometric data of the respective child, performing at least one of: (i) processing speech data from the first child speaker; or (ii) storing the speech data from the first child speaker in an archive; responsive to determining that the second extracted biometric data associated with the second child speaker does not correspond to any voiceprint associated with any consenting 
user profile associated with a record indicating consent… to store biometric data of the respective child: deleting speech data from the second child speaker within a predetermined time period (Paragraphs 70-71;
As discussed above, Bapat suggests receiving speech from multiple people at a single device and recognizing one of the multiple people based on the recognized person’s speech characterization data being similar to a consenting user profile’s speech characterization data, and determining one of the multiple people to be unknown based on the unknown person’s speech characterization data not being similar to any profile’s speech characterization data, but does not specifically describe 
Matthews describes [in paragraphs 70-71] where a single device [a camera] detects voice input from multiple sources, particularly children that are conversing with each other.
Matthews thus suggests where the multiple people whose speech is received by the single device suggested by Bapat are children [i.e., where the recognized person is “a first child speaker” and where the unknown person is “a second child speaker”].  Logically, in order to recognize the “first child speaker” based on similarity between the “first child speaker’s” speech characterization data and a consenting user profile’s speech characterization data, the consenting user profile must be the child’s profile with the child’s “voiceprint” speech characterization data, where the consenting user profile is suggested to be associated with a record indicating consent to store the child’s speech characterization data [in the same way that every other consenting user profile of any person is suggested to be associated with a record indicating consent to store PII/characterization data].  Also, logically, if the “second child speaker” is determined to be unknown, then that “second child speaker’s” speech characterization data is not determined to be similar to any of the speech characterization data in any of the consenting user profiles [including any child’s profile])
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of person with another because the prior art teaches the claimed invention except for the substitution of a person which is not necessarily a child with a person which is.  Matthews teaches that a 
	Bapat, in view of Matthews, does not, but Watry suggests A method comprising: storing one or more user profiles that are each associated with one of one or more users of a computing system, wherein each user profile is associated with a voiceprint that was generated to uniquely characterize voice characteristics of a respective user of the one or more users of the computing system, wherein at least one of the stored one or more user profiles is a consenting user profile of a child, and wherein the consenting user profile of the child is associated with a record indicating consent by a parent of the child to store biometric data of the child; processing an audio signal containing speech data received from a plurality of speakers at the computing system, wherein the plurality of speakers comprise a first child speaker and a second child speaker, and wherein the processing of the audio signal comprises extracting first biometric data associated with the first child speaker and second biometric data associated with the second child speaker; determining whether any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker corresponds to a voiceprint associated with a respective consenting user profile associated with a record indicating consent by a parent of a respective child to store biometric data of the respective child; responsive to determining that the first extracted biometric data associated with the first child speaker corresponds to the voiceprint associated with the respective consenting user profile associated with the record indicating consent by the parent of the respective child to store the biometric data of the respective child, performing at least one of: (i) processing speech data from the first child speaker; or (ii) storing the speech data from the first child speaker in an archive; responsive to determining that the second extracted biometric data 
associated with the second child speaker does not correspond to any voiceprint associated with any consenting user profile associated with a record indicating consent by a parent of a respective child to store biometric data of the respective child: deleting speech data from the second child speaker within a predetermined time period (Paragraphs 10, 13, 75; Figure 17;
Bapat, in view of Matthews, suggests where the multiple people can be children and where a consenting user profile can be a consenting user profile of a child [because if one of the children is recognized, then that child is suggested to have a user profile that stores the child’s characterization data, where the user profile is suggested to be associated with a record indicating consent to store the child’s characterization data].
Bapat and Matthews do not specifically describe where the suggested consent to store the child’s characterization data in the child’s profile is consent given by a parent of the child.
Watry describes where a parent consents to the personal information that a toy and its application are collecting, storing and sharing [paragraph 10], and where parents can revoke consent, which leads to deletion of personal information stored about a child [paragraph 13].  Paragraph 75 and Figure 17 also suggest where a parent provides consent for a child to play with a toy after reviewing “privacy data/info” [at least suggested to be personal information of the child].  Watry thus suggests where consent to store personal information of a child is provided by a parent of the child.
Watry thus suggests where the suggested consent to store the “first child speaker’s” characterization data is more specifically provided by the first child speaker’s parent [instead of by the first child speaker himself/herself] and suggests where the “second child speaker’s” speech characterization data is determined to not correspond to “any voiceprint associated with any consenting user profile associated with a record indicating consent by a parent of a respective child to store biometric data of the 
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of consent with another because the prior art teaches the claimed invention except for the substitution of consent which is not necessarily consent provided by a parent for a child with consent which is.  Watry teaches that consent provided by a parent for a child was known in the art.  One of ordinary skill in the art could have substituted one type of consent with another to obtain the predictable results of a system which receives, at a single device, audio from multiple people, determines speech characterization data of each person of the multiple people from the received audio, compares each person’s speech characterization data to stored speech characterization data in profiles that each store a respective person’s characterization data with consent, determines that a first one of the multiple people is a recognized person responsive to the first one of the multiple people’s determined speech characterization data being determined to be sufficiently similar to speech characterization data stored in a profile that corresponds to the first one of the multiple people and that is associated with consent corresponding to the first one of the multiple people, determines that a second one of the multiple people is an unknown person responsive to the second one of the multiple people’s determined speech characterization data being determined to not be sufficiently similar to speech characterization data in any of the profiles, and deletes the second one of the multiple people’s determined speech characterization data based on the second one of the 

As per Claim 7, Bapat suggests wherein responsive to determining that the second extracted biometric data associated with the second speaker does not correspond to any voiceprint associated with any consenting user profile, the speech data from the second speaker is processed before being deleted within said predetermined time period (paragraphs 9-11, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 5, 7A-7B;
Bapat, in view of Matthews and Watry, suggests “responsive to determining that the second extracted biometric data associated with the second child speaker does not correspond to any voiceprint associated with any consenting user profile associated with a record indicating consent by a parent of a respective child to store biometric data of the respective child: deleting speech data from the second child speaker within a predetermined time period” for the reasons discussed in the rejection of claim 1.
Paragraphs 9-11, 238-239, and 243-245 describe where PII of an unknown person is stored in response to determining that a person is unknown, and then deleting the PII in response to a predetermined amount of time elapsing without a response from a user [paragraph 9], and where the data for the unrecognized person which is deleted when a pre-set amount of time has elapsed can include “characterization data” 
Storing can be interpreted as “processing”, such that storing the unknown person’s speech characterization data can be interpreted as “processing” the unknown person’s speech characterization data [subjecting the data to a storing “process”] before deleting the unknown person’s speech characterization data).
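For illustration only, the flow that this rejection attributes to the combined references (compare each speaker's characterization data to the voiceprints in consenting user profiles, retain data for a recognized speaker, and hold an unknown speaker's data only until a pre-set time elapses) could be sketched as follows. Every name, the cosine-similarity measure, the 0.8 threshold, and the 60-second retention period are assumptions made for this sketch; none is taken from Bapat, Matthews, Watry, or the claims.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Profile:
    name: str
    voiceprint: list
    parental_consent: bool  # hypothetical flag standing in for the consent record


@dataclass
class Store:
    archive: list = field(default_factory=list)  # retained (name, speech) pairs
    pending: list = field(default_factory=list)  # (expiry_time, speech) for unknown speakers

    def purge(self, now: float) -> None:
        # Delete unknown-speaker data whose retention period has elapsed.
        self.pending = [(t, d) for t, d in self.pending if t > now]


def cosine(a, b):
    # Assumed similarity measure; the references do not specify one.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def handle_speaker(features, speech, profiles, store, now,
                   threshold=0.8, retention_s=60.0):
    # Recognized speaker: features match a voiceprint in a consenting profile.
    for p in profiles:
        if p.parental_consent and cosine(features, p.voiceprint) >= threshold:
            store.archive.append((p.name, speech))
            return p.name
    # Unknown speaker: store temporarily, to be deleted once the period elapses.
    store.pending.append((now + retention_s, speech))
    return None
```

In this sketch, storing the unknown speaker's data in `pending` before `purge` removes it corresponds to the "processed before being deleted" reading discussed above.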

As per Claims 7-8, Bapat suggests wherein responsive to determining that the second extracted biometric data associated with the second speaker does not correspond to any voiceprint associated with any consenting user profile, the speech data from the second speaker is processed before being deleted within said predetermined time period, wherein said predetermined time period is immediately after processing the speech data (paragraphs 9-11, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 5, 7A-7B;
Bapat, in view of Matthews and Watry, suggests “responsive to determining that the second extracted biometric data associated with the second child speaker does not correspond to any voiceprint associated with any consenting user profile associated with a record indicating consent by a parent of a respective child to store biometric data of the respective child: deleting speech data from the second child speaker within a predetermined time period” for the reasons discussed in the rejection of claim 1.
For claims 7-8, Bapat, in view of Matthews and Watry, alternatively suggests “responsive to determining that the second extracted biometric data associated with the second child speaker does not correspond to any voiceprint associated with any 
Paragraphs 9-11, 238-239, and 243-245 describe where PII of an unknown person is stored in response to determining that a person is unknown, and then deleting the PII in response to a predetermined amount of time elapsing without a response from a user [paragraph 9], and where the data for the unrecognized person which is deleted when a pre-set amount of time has elapsed can include “characterization data” [paragraph 245, where, as per paragraphs 160 and 168 “characterization data” can be “speech” data].
As an alternative to what was discussed in the first rejection of claim 7, storing for the predetermined amount of time can be interpreted as “processing”, such that storing the unknown person’s speech characterization data for the predetermined amount of time can be interpreted as “processing” the unknown person’s speech characterization data before deleting the unknown person’s speech characterization data.
For claim 8, the unknown person’s speech characterization data is deleted “immediately after”/“within said predetermined time period” after storing the unknown person’s speech characterization data for the predetermined amount of time [i.e. “immediately after processing the speech data from the second speaker”]).

As per Claims 10-11, Bapat suggests matching additional biometric data acquired from the first… speaker or the second… speaker against stored biometric data associated with a respective consenting user profile, wherein said additional biometric data is selected from: a. image data of the face of the first… speaker or the second… speaker, b. iris pattern data, c. fingerprint data; d. hand geometry data; e. palm blood vessel pattern data f. retinal blood vessel pattern data; g. mouth movement data; or h. behavioural data (paragraphs 9-11, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 5, 7A-7B;
In addition to what was discussed in the rejection of claim 1 [where “the first child speaker” is a recognized person and where “the second child speaker” is an unknown person]
Paragraph 228 and Figure 7B describe where image portion 704 and corresponding characterization data 705 are compared with stored images and characterization data [i.e. where more than one type of comparison is performed, including an image comparison and a characterization data comparison].  
Bapat thus suggests receiving both audio and video, determining, for each of “the first speaker” and the “second speaker” [one which is recognized and one which is unknown, as discussed in the rejection of claim 1] speech characterization data and “additional biometric data”/“image data of the face” [where the “image data of the face” is “image data of the face of the first… speaker or the second… speaker”], comparing each speaker’s speech characterization data to stored speech characterization data of the stored “consenting user profiles” [“determining whether said extracted biometric data associated with the first speaker or the second speaker corresponds to a voiceprint associated with a respective consenting user profile”], and comparing the face images of each speaker to face images of the stored “consenting user profiles” [“matching additional biometric data acquired from the first… speaker or the second… speaker against stored biometric data associated with a respective consenting user profile”], in order to determine a detected person to be a particular person [e.g. John]).  
Bapat does not, but Matthews suggests matching additional biometric data acquired from the first child speaker or the second child speaker against stored biometric data associated with a respective consenting user profile, wherein said additional biometric data is selected from: a. image data of the face of the first child speaker or the second child speaker, b. iris pattern data, c. fingerprint data; d. hand geometry data; e. palm blood vessel pattern data f. retinal blood vessel pattern data; g. mouth movement data; or h. behavioural data (Paragraphs 70-71;

Matthews describes [in paragraphs 70-71] where a single device [a camera] detects voice input from multiple sources, particularly children that are conversing with each other.
Matthews thus suggests where the multiple people whose speech and face images are received by the single device suggested by Bapat are children [i.e., where the recognized person is “a first child speaker” and where the unknown person is “a second child speaker”].  Logically, in order to recognize the “first child speaker” based on similarity between the “first child speaker’s” speech characterization data and face image and a consenting user profile’s speech characterization data and face image, the consenting user profile must be the child’s profile with the child’s “voiceprint” speech characterization data and the child’s face image, where the consenting user profile is suggested to be associated with a record indicating consent to store the child’s speech characterization data and face image [in the same way that every other consenting user 
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of person with another because the prior art teaches the claimed invention except for the substitution of a person which is not necessarily a child with a person which is.  Matthews teaches that a person who is a child was known in the art.  One of ordinary skill in the art could have substituted one type of person with another to obtain the predictable results of a system which receives, at a single device, audio from multiple people, determines speech characterization data of each person of the multiple people from the received audio, compares each person’s speech characterization data to stored speech characterization data in profiles that each store a respective person’s characterization data with consent, determines that a first one of the multiple people is a recognized person responsive to the first one of the multiple people’s determined speech characterization data being determined to be sufficiently similar to speech characterization data stored in a profile that corresponds to the first one of the multiple people and that is associated with consent corresponding to the first one of the multiple people, determines that a second one of the multiple people is an unknown person responsive to the second one of the multiple people’s determined speech characterization data being determined to not be 
	
As per Claim 17, Bapat suggests updating a voiceprint associated with a respective consenting user profile based on the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated the second… speaker (paragraphs 9-11, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 5, 7A-7B;
Paragraphs 238-239 describe where characterization data is associated with a recognized person when a detected person is recognized, characterization data is stored in persons database and where all information for recognized persons is stored.  Paragraph 10 describes previously stored PII of one or more known persons and where images are already stored for a particular person [suggesting that the stored images are stored PII].  Paragraphs 9-11, 238-239, and 243-245 describe where PII of an unknown person is stored in response to determining that a person is unknown, and then deleting the PII in response to a predetermined amount of time elapsing without a response from a user [paragraph 9], and where the data for the unrecognized person which is deleted when a pre-set amount of time has elapsed can include “characterization data” [paragraph 245, which, as per paragraphs 160 and 168 can be “speech” data], such that 
These portions suggest where “the first extracted biometric data associated with the first child speaker” [which is recognized as similar to the first child speaker’s speech characterization data in the first child speaker’s profile, as discussed in the rejection of claim 1] is stored in the first child speaker’s user profile and the oldest/lowest-quality speech characterization data in the first child speaker’s user profile is deleted [such that the “voiceprint” speech characterization data in the “consenting user profile” of the first child speaker is “updated” “based on the first extracted biometric data ”]).
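As a purely illustrative sketch of the "updating" reading above, a profile could hold a bounded set of speech characterization samples, with each newly recognized sample added and the oldest sample dropped once the set is full. The class name, the capacity, and the averaging of samples into a single voiceprint are assumptions made for this sketch and are not described in Bapat; a policy that evicts the lowest-quality sample instead of the oldest would be an equally hypothetical alternative.

```python
from collections import deque


class VoiceprintSamples:
    """Hypothetical per-profile store of speech characterization samples."""

    def __init__(self, capacity=3):
        # A deque with maxlen discards the oldest entry when a new one is appended.
        self.samples = deque(maxlen=capacity)

    def update(self, sample):
        # Add the newly extracted sample; the oldest is deleted if at capacity.
        self.samples.append(sample)

    def voiceprint(self):
        # Assumed aggregation: element-wise mean of the retained samples.
        n = len(self.samples)
        return [sum(col) / n for col in zip(*self.samples)]
```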
Bapat does not, but Matthews suggests updating a voiceprint associated with a respective consenting user profile based on the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated the second child speaker (Paragraphs 70-71;
As discussed above, Bapat suggests receiving speech and face images from multiple people at a single device and recognizing one of the multiple people based on the recognized person’s speech characterization data being similar to a consenting user profile’s speech characterization data and based on the recognized person’s face image being similar to the consenting user profile’s face image, and determining one of the multiple people to be unknown based on the unknown person’s speech characterization data not being similar to any profile’s speech characterization data and based on the unknown person’s face image not being similar to any profile’s face image, but does not specifically describe that the recognized person and the unknown person are children and that any of the people with profiles are children.

Matthews thus suggests where the multiple people whose speech and face images are received by the single device suggested by Bapat are children [i.e., where the recognized person is “a first child speaker” and where the unknown person is “a second child speaker”].  Logically, in order to recognize the “first child speaker” based on similarity between the “first child speaker’s” speech characterization data and face image and a consenting user profile’s speech characterization data and face image, the consenting user profile must be the child’s profile with the child’s “voiceprint” speech characterization data and the child’s face image, where the consenting user profile is suggested to be associated with a record indicating consent to store the child’s speech characterization data and face image [in the same way that every other consenting user profile of any person is suggested to be associated with a record indicating consent to store PII/characterization data].  Also, logically, if the “second child speaker” is determined to be unknown, then that “second child speaker’s” speech characterization data and face image are not determined to be similar to any of the speech characterization data and face images of any of the consenting user profiles [including any child’s profile])
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of person with another because the prior art teaches the claimed invention except for the substitution of a person which is not necessarily a child with a person which is.  Matthews teaches that a 

As per Claim 18, Bapat suggests A computing system programmed to process an audio signal containing speech data, the computing system comprising: an audio input; a visual input; a data store storing one or more user profiles that are each associated with one of one or more users of a computing system, wherein each user profile is associated with a voiceprint that was generated to uniquely characterize voice characteristics of a respective user of the one or more users of the computing system, wherein at least one of the stored one or more user profiles is a consenting user profile…, and wherein the consenting user profile… is associated with a record indicating consent… to store biometric data…; an interface to a storage archive storing speech data; and a hardware processor, coupled to the audio input, visual input and the data store, to: process an audio signal received via the audio input and containing speech data received from a plurality of speakers, wherein the plurality of speakers comprise a first… speaker and a second… speaker, and wherein the audio signal is processed to extract first biometric data associated with the first… speaker and second biometric data associated with the second… speaker; determine whether any of the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated with the second… speaker corresponds to a voiceprint associated with a respective consenting user profile associated with a record indicating consent… to store biometric data…; responsive to determining that the first extracted biometric data associated with the first… speaker corresponds to the voiceprint associated with the respective consenting user profile associated with the record indicating consent… to store the biometric data… , perform at least one of: (i) processing speech data from the first… speaker; or (ii) storing the speech data from the first… speaker in an archive; responsive to determining that the 
second extracted biometric data associated with the second… speaker does not correspond to any voiceprint associated with any consenting user profile associated with a record indicating consent… to store biometric data…: delete speech data from the second… speaker within a predetermined time period. (paragraphs 9-11, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 5, 7A-7B [all paragraphs and Figures are cited for each limitation with “key” paragraphs and Figures pertaining to each limitation identified below, i.e. all other paragraphs and Figures not specifically referenced for any particular limitation are eligible to provide context and additional support]
“a computing system programmed to process an audio signal containing speech data, the computing system comprising: an audio input; a visual input;”: Paragraphs 196-197 describe receiving/capturing video and/or audio information at a single smart device [the “one” embodiment of “one or more smart devices” in paragraph 196].  Paragraphs 227-228 describe where characterization data is obtained from analysis of an input, and comparing characterization data to stored characterization data in a database [i.e. “processing” characterization data] and paragraphs 160 and 168 describe where characterization data can include a person’s speech.  Paragraphs 238-239 describe where characterization data of detected person[s] is compared.  These portions suggest where a system [in this rejection, the “computing system” is interpreted as the collective set of the single device and the server] includes both audio input and a visual input used to receive both video and audio information and where the audio is processed to obtain speech characterization data of respective people [which logically means that the audio itself contained speech data such that the speech characterization 
“a data store storing one or more user profiles that are each associated with one of one or more users of a computing system, wherein each user profile is associated with a voiceprint that was generated to uniquely characterize voice characteristics of a respective user of the one or more users of the computing system, wherein at least one of the stored one or more user profiles is a consenting user profile…, and wherein the consenting user profile… is associated with a record indicating consent… to store biometric data…;”: Figure 7B depicts multiple sets of person-specific data each including an image for a respective person and characterization data, and paragraph 160 describes where characterization data is deleted for persons who do not give consent to have their PII stored [which suggests an embodiment where the PII is only stored if consent is given], which at least suggests an embodiment where each set of person-specific data in Figure 7B is a “consenting user profile” [in the sense that each person “uses” the system to store PII that they consented to be stored] where the “consenting” user profiles are at least suggested to be associated with a data record indicating that consent was given by a respective person to store the respective person’s PII.  Figure 7A and paragraph 168 describe where physical feature information of characterization data can include, among other things, a person’s speech, and paragraphs 227-228 describe where characterization data is compared to identify a person [suggesting that the characterization data in each “user profile” includes a respective person’s “voiceprint”] [paragraphs 160, 168, 227-228; Figures 7A-7B]

“and a hardware processor, coupled to the audio input, visual input and the data store, to:”: Figure 5 and paragraph 194 describe a server that includes, among other things, persons data [the persons database in Figures 7A-7B] and entity recognizer 3152, and paragraph 238 describes where person recognition based on characterization data [including comparing characterization data of detected persons with stored characterization data of previously detected persons] is performed using entity recognition module 3152.  Paragraph 126 also describes an embodiment where all functions other than user-facing input and output processing functions are delegated to the server.  Paragraphs 196-197 also describe where “the data” [suggested to be the video and/or audio data captured at a smart device] is processed to determine if persons are present.  Paragraphs 196-197, 227-228, 238-239, 9-11, 243-245, 160, 168 collectively suggest the functions discussed below [pertaining to the claimed limitations], and also performing a comparison for characterization data obtained by analyzing the 
“process an audio signal received via the audio input and containing speech data received from a plurality of speakers, wherein the plurality of speakers comprise a first… speaker and a second… speaker, and wherein the audio signal is processed to extract first biometric data associated with the first… speaker and second biometric data associated with the second… speaker”: Paragraphs 196-197 describe receiving video and audio information at one or more smart devices [i.e. including an embodiment where video and audio is received at a single device] and processing “the data” [at least suggested to be the captured video and audio data] to determine if persons [i.e. plural] are present.  Paragraph 227 describes where person detection and recognition is performed by analyzing an input image, and obtaining characterization data by analyzing the input image.  Paragraphs 160 and 168 describe where characterization data can include speech data.  Paragraphs 238-239 describe where characterization data “of the detected person[s]” is compared to stored characterization data. These portions suggest an embodiment where audio information including speech from multiple persons is received at a single device’s “audio input” [in this rejection, the “computing system” is interpreted as the collective set of the single device and the 
“determine whether any of the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated with the second… speaker corresponds to a voiceprint associated with a respective consenting user profile associated with a record indicating consent… to store biometric data…”: [In addition to what was already discussed above] comparing the respective speech characterization data of each of the multiple persons [derived from the audio received at the single device] to speech characterization “voiceprint” data in the “consenting user profiles” [the consenting user profiles are suggested to be associated with a record indicating consent to store biometric data as discussed above in the portion of this rejection directed to the “a data store…” limitation] to determine if there is similarity/correspondence, and if there is a similarity/correspondence, identifying a detected person as a particular person, and if there is no similarity/correspondence, identifying the detected person as an unknown person.  For the purposes of this rejection, one of the multiple speakers is recognized and one is determined to be unknown.  Paragraphs 227-228, in particular, at least suggest where, based on similarity between input characterization data for a detected person and John profile characterization data, the detected person is identified to be John, and if there is no similarity between the input characterization data and any characterization data in any of the stored profiles, the detected person is designated/identified as “an unknown person”.  Paragraphs 238-239 also describe an embodiment where the comparison is 
“responsive to determining that the first extracted biometric data associated with the first… speaker corresponds to the voiceprint associated with the respective consenting user profile associated with the record indicating consent… to store the biometric data… , perform at least one of: (i) processing speech data from the first… speaker; or (ii) storing the speech data from the first… speaker in an archive”: In addition to what was discussed above, paragraphs 238-239 further describe where characterization data “of the detected person[s]” is compared to stored characterization data and storing characterization data for recognized persons [which is at least suggested to refer to storing the characterization data obtained from the system input, since the stored characterization information which is compared to the input characterization information is already stored].  This further suggests where, in response to determining that speech characterization data of one of the multiple detected persons [where speech characterization data can be interpreted as “speech data from the… speaker”] is similar to speech characterization data of a person’s “consenting user profile” [the consenting user profiles are suggested to be associated with a record indicating consent to store biometric data as discussed above in the portion of this rejection directed to the “a data store…” limitation], recognizing the one of the detected persons as the person corresponding to the “consenting user profile” and then storing the one of the detected person’s speech characterization data in the persons database [i.e. the user profile database depicted in Figure 7B].  As discussed 
“responsive to determining that the second extracted biometric data associated with the second… speaker does not correspond to any voiceprint associated with any consenting user profile associated with a record indicating consent… to store biometric data…: delete speech data from the second… speaker within a predetermined time period”: In addition to what was discussed above, paragraph 228 suggests where a detected person can be designated as unknown [when there is no predetermined similarity between input characterization data and any stored characterization data in any consenting user profile, where the consenting user profiles are suggested to be associated with a record indicating consent to store biometric data as discussed above in the portion of this rejection directed to the “a data store…” limitation], and paragraphs 9-11 and 243-245 describe where, in response to determining that a person is unknown [and in response to determining that a user has not classified the unknown person], the characterization data/PII [where speech characterization data can be interpreted as “speech data from the… speaker”] of an unrecognized [suggested to be unknown] person is deleted after a predetermined period of time.  As discussed above, for the purposes of this rejection, one of the multiple speakers is recognized and one is determined to be unknown, where the unknown one of the multiple speakers is interpreted as the “second…speaker” [paragraphs 9-11, 243-245, 227-228, 238-239, 196-197, 160, 168; Figures 7A-7B])
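For illustration only, the recognize-or-delete flow discussed in the mapping above can be sketched as follows.  This is a minimal sketch under stated assumptions and is not drawn from Bapat or from the application: the names Profile, similarity, find_consenting_match and route_speakers are hypothetical, cosine similarity and the 0.9 threshold are assumed stand-ins for the unspecified "predetermined similarity", and the feature vectors are placeholders for speech characterization data.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Profile:
    name: str
    voiceprint: tuple          # hypothetical stored speech characterization vector
    consent_on_record: bool    # record indicating consent to store biometric data


def similarity(a, b):
    """Cosine similarity; an assumed stand-in for the unspecified comparison."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_consenting_match(features, profiles, threshold=0.9):
    """Return the best consenting profile at or above the threshold, else None."""
    best, best_score = None, threshold
    for profile in profiles:
        if not profile.consent_on_record:
            continue  # only consenting user profiles are eligible for a match
        score = similarity(features, profile.voiceprint)
        if score >= best_score:
            best, best_score = profile, score
    return best


def route_speakers(extracted, profiles, archive, delete_by, now, retention_s):
    """Archive data of recognized speakers; schedule deletion for unknown ones."""
    for speaker_id, features in extracted.items():
        match = find_consenting_match(features, profiles)
        if match is not None:
            # Recognized speaker: store the speech characterization data.
            archive.setdefault(match.name, []).append(features)
        else:
            # Unknown speaker: record the time by which the data must be deleted.
            delete_by[speaker_id] = now + retention_s
```

In this sketch the recognized speaker's data is appended to an archive keyed by the matching profile, while the unknown speaker receives only a deletion deadline, mirroring the two "responsive to determining" branches addressed above.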
A computing system programmed to process an audio signal containing speech data, the computing system comprising: an audio input; a visual input; a data store storing one or more user profiles that are each associated with one of one or more users of a computing system, wherein each user profile is associated with a voiceprint that was generated to uniquely characterize voice characteristics of a respective user of the one or more users of the computing system, wherein at least one of the stored one or more user profiles is a consenting user profile of a child, and wherein the consenting user profile of the child is associated with a record indicating consent… to store biometric data of the child; an interface to a storage archive storing speech data; and a hardware processor, coupled to the audio input, visual input and the data store, to: process an audio signal received via the audio input and containing speech data received from a plurality of speakers, wherein the plurality of speakers comprise a first child speaker and a second child speaker, and wherein the audio signal is processed to extract first biometric data associated with the first child speaker and second biometric data associated with the second child speaker; determine whether any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker corresponds to a voiceprint associated with a respective consenting user profile associated with a record indicating consent… to store biometric data of the respective child; responsive to determining that the first extracted biometric data associated with the first child speaker corresponds to the voiceprint associated with the respective consenting user profile associated with the record indicating consent… to store the biometric data of the respective child, perform at least one of: (i) processing speech data from the first child speaker; or (ii) storing the 
speech data from the first child speaker in an archive; responsive to determining that the second extracted biometric data associated with the second child speaker does not correspond to any voiceprint associated with any consenting user profile associated with a record indicating consent… to store biometric data of the respective child: delete speech data from the second child speaker within a predetermined time period (Paragraphs 70-71;
As discussed above, Bapat suggests receiving speech from multiple people at a single device and recognizing one of the multiple people based on the recognized person’s speech characterization data being similar to a consenting user profile’s speech characterization data, and determining one of the multiple people to be unknown based on the unknown person’s speech characterization data not being similar to any profile’s speech characterization data, but does not specifically describe that the recognized person and the unknown person are children and that any of the people with profiles are children.
Matthews describes [in paragraphs 70-71] where a single device [a camera] detects voice input from multiple sources, particularly children that are conversing with each other.
Matthews thus suggests where the multiple people whose speech is received by the single device suggested by Bapat are children [i.e., where the recognized person is “a first child speaker” and where the unknown person is “a second child speaker”].  
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of person with another because the prior art teaches the claimed invention except for the substitution of a person which is not necessarily a child with a person which is.  Matthews teaches that a person who is a child was known in the art.  One of ordinary skill in the art could have substituted one type of person with another to obtain the predictable results of a system which receives, at a single device, audio from multiple people, determines speech characterization data of each person of the multiple people from the received audio, compares each person’s speech characterization data to stored speech characterization data in profiles that each store a respective person’s characterization data with consent, determines that a first one of the multiple people is a recognized person responsive to the first one of the multiple people’s determined speech characterization data being 
	Bapat, in view of Matthews, do not, but Watry suggests A computing system programmed to process an audio signal containing speech data, the computing system comprising: an audio input; a visual input; a data store storing one or more user profiles that are each associated with one of one or more users of a computing system, wherein each user profile is associated with a voiceprint that was generated to uniquely characterize voice characteristics of a respective user of the one or more users of the computing system, wherein at least one of the stored one or more user profiles is a consenting user profile of a child, and wherein the consenting user profile of the child is associated with a record indicating consent by a parent of the child to store biometric data of the child; an interface to a storage archive storing speech data; and a hardware processor, coupled to the audio input, visual input and the data store, to: process an audio signal received via the audio input and containing speech data received from a plurality of speakers, wherein the plurality of speakers comprise a first child speaker and a second child speaker, and wherein the audio signal is processed to extract first biometric data associated with the first child speaker and second biometric data associated with the second child speaker; determine whether any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker corresponds to a voiceprint associated with a respective consenting user profile associated with a record indicating consent by a parent of a respective child to store biometric data of the respective child; responsive to determining that the first extracted biometric data associated with the first child speaker corresponds to the voiceprint associated with the respective consenting user profile associated with the record indicating consent by the parent of the respective child to store 
the biometric data of the respective child, perform at least one of: (i) processing speech data from the first child speaker; or (ii) storing the speech data from the first child speaker in an archive; responsive to determining that the second extracted biometric data associated with the second child speaker does not correspond to any voiceprint associated with any consenting user profile associated with a record indicating consent by a parent of a respective child to store biometric data of the respective child: delete speech data from the second child speaker within a predetermined time period (Paragraphs 10, 13, 75; Figure 17;
Bapat, in view of Matthews, suggests where the multiple people can be children and where a consenting user profile can be a consenting user profile of a child [because if one of the children is recognized, then that child is suggested to have a user profile 
Bapat and Matthews do not specifically describe where the suggested consent to store the child’s characterization data in the child’s profile is consent given by a parent of the child.
Watry describes where a parent consents to the collection, storage and sharing of personal information by a toy and its application [paragraph 10], and where parents can revoke consent, which leads to deletion of the personal information stored about the child [paragraph 13].  Paragraph 75 and Figure 17 also suggest where a parent provides consent for a child to play with a toy after reviewing “privacy data/info” [at least suggested to be personal information of the child].  Watry thus suggests where consent to store personal information of a child is provided by a parent of the child.
Watry thus suggests where the suggested consent to store the “first child speaker’s” characterization data is more specifically provided by the first child speaker’s parent [instead of by the first child speaker himself/herself] and suggests where the “second child speaker’s” speech characterization data is determined to not correspond to “any voiceprint associated with any consenting user profile associated with a record indicating consent by a parent of a respective child to store biometric data of the respective child” [the second child speaker’s speech characterization data is determined to not correspond to any profiles, including the first child speaker’s profile which stores the first child speaker’s characterization data with the first child speaker’s consent])
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of consent with .

Claims 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over Bapat, in view of Matthews and Watry, as applied to Claims 1 and 7, above, and further in view of Mulhern et al. (US 2016/0203699), hereafter Mulhern.

As per Claim 5, Bapat suggests wherein responsive to determining that the second extracted biometric data associated with the second speaker does not correspond to any voiceprint associated with any consenting user profile, the speech data from the second speaker is deleted (see rejection of claim 1, particularly the portion directed to the last limitation).  Bapat, in view of Matthews and Watry, do not, but Mulhern suggests wherein responsive to determining that the second extracted biometric data associated with the second speaker does not correspond to any voiceprint associated with any consenting user profile, the speech data from the second speaker is deleted without being processed further (“data samples are discarded immediately or after a predetermined period of time”, paragraph 30; “all analyzed data samples are location, date- and time-stamped by the program 110 for later retrieval and for inclusion in reports. However, the raw data may be discarded immediately or after a predetermined period, for privacy reasons”, paragraph 161; 
As discussed in the rejection of claim 1, Bapat suggests deleting PII/characterization data of a person determined to be unknown within a predetermined time period.

Mulhern thus suggests where Bapat’s system, instead of storing the unknown person characterization data for a predetermined time to give the user an opportunity to classify the person, immediately deletes the input-audio-based speech characterization data of the detected person determined to be unknown [because the input-audio-based speech characterization data of the detected person determined to be unknown does not match any stored characterization data in any of the profiles, where the detected person determined to be unknown is “the second child speaker” in the combination applied to reject claim 1], and immediately deleting logically deletes “without processing” the input-audio-based speech characterization data corresponding to the detected person determined to be unknown “further” [for claim 5])
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one deletion timing with another because the prior art teaches the claimed invention except for the substitution of deletion timing which is not immediate with deletion timing which is immediate.  Mulhern teaches that deletion timing which is immediate (as an alternative to deletion within a predetermined amount of time) was known in the art.  One of ordinary skill in the art could have substituted one deletion timing with another to obtain the predictable results of a system which receives, at a single device, audio from multiple people, determines speech characterization data of each person of the multiple people from the received audio, compares each person’s speech characterization data to stored speech characterization data in profiles that each store a respective person’s characterization 

As per Claim 6, Bapat suggests wherein said predetermined time period is… after determining that the second extracted biometric data associated with the second speaker does not correspond to any voiceprint associated with any consenting user profile (see rejection of claim 1, particularly the portion directed to the last limitation).  Bapat, in view of Matthews and Watry, do not, but Mulhern suggests wherein said predetermined time period is immediately after determining that the second extracted biometric data associated with the second speaker does not correspond to any voiceprint associated with any consenting user profile (“data samples are discarded immediately or after a predetermined period of time”, paragraph 30; “all analyzed data samples are location, date- and time-stamped by the program 110 for later retrieval and for inclusion in reports. However, the raw data may be discarded immediately or after a predetermined period, for privacy reasons”, paragraph 161; 
As discussed in the rejection of claim 1, Bapat suggests deleting PII/characterization data of a person determined to be unknown within a predetermined time period.
Mulhern describes where what can be discarded/deleted within a predetermined time period can also be discarded immediately [for privacy reasons].
Mulhern thus suggests where Bapat’s system, instead of storing the unknown person characterization data for a predetermined time to give the user an opportunity to classify the person, immediately deletes the input-audio-based speech characterization data of the detected person determined to be unknown [because the input-audio-based speech characterization data of the detected person determined to be unknown does not match any stored characterization data in any of the profiles, where the detected person determined to be unknown is “second child speaker” in the combination applied to reject claim 1], and immediately deleting logically deletes “immediately after determining that the second extracted biometric data associated with the second speaker does not correspond to any voiceprint associated with any consenting user profile” [for claim 6])
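For illustration only, the difference between deletion after a predetermined period and immediate deletion can be sketched as a single deadline computation in which the retention period is set to zero.  This is a minimal sketch under stated assumptions: the names deletion_deadline and purge_expired and the in-memory dictionary are hypothetical and are not taken from Bapat or Mulhern.

```python
def deletion_deadline(determined_at, retention_s=0.0):
    """Time by which unknown-speaker data must be gone.

    retention_s == 0.0 models immediate deletion; a positive value models
    deletion after a predetermined period.
    """
    return determined_at + retention_s


def purge_expired(held, now):
    """Delete every held entry whose deadline has passed.

    held maps a speaker id to a (data, deadline) pair; returns the ids removed.
    """
    expired = [sid for sid, (_data, deadline) in held.items() if deadline <= now]
    for sid in expired:
        del held[sid]
    return expired
```

With a zero retention period, the entry is already expired at the moment the speaker is determined to be unknown, so the first purge removes it before any further processing step can run.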

Claims 9 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Bapat, in view of Matthews and Watry, as applied to claim 1, above, and further in view of Parker et al. (US 2017/0133018), hereafter Parker, and Coffin et al. (US 5,991,429), hereafter Coffin, and Chebolu et al. (US 2005/0060412), hereafter Chebolu.

As per Claim 9, Bapat suggests creating a consenting user profile, wherein creating a consenting user profile comprises:…; receiving speech data of… second user; extracting biometric data from said speech data of… second user; storing said biometric data from said speech data of… second user and associating said biometric data from said speech data of… second user with… user profile associated with… second user; and storing… user profile associated with… second user as a consenting user profile (paragraphs 9-11, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 5, 7A-7B;
	Figure 7B depicts multiple sets of person-specific data each including an image for a respective person and characterization data [i.e. “user profiles”].  Figure 7A and paragraph 168 describe where physical feature information of characterization data can include, among other things, information regarding a person’s speech, and paragraphs 227-228 describe where characterization data is compared to identify a 
Bapat suggests “creating a consenting user profile” [the “user profiles” in the persons database in Figure 7B exist and were thus logically “created” at some point, and paragraph 160 suggests where the “user profiles” are “consenting user profiles”] “wherein creating a consenting user profile comprises:…; receiving speech data of… second user; extracting biometric data from said speech data of… second user” [in order to have profiles that each include a respective person’s speech characterization data, it is at least suggested that at some point, for each profile, a respective “user”/person’s speech was received and the person’s speech was analyzed to “extract” the person’s speech characterization/“biometric” data, particularly because a person’s actual speech is logically the best place to derive information about the person’s speech] “storing said 
Bapat suggests creating a consenting user profile, wherein creating a consenting user profile comprises:…; receiving speech data of… second user; extracting biometric data from said speech data of… second user; storing said biometric data from said speech data of… second user and associating said biometric data from said speech data of… second user with… user profile associated with… second user; and storing… user profile associated with… second user as a consenting user profile.  Bapat, in view of Matthews and Watry, do not, but Parker suggests creating a consenting user profile, wherein creating a consenting user profile comprises:… initialising a user profile associated with a second user…; receiving speech data of the second user; extracting biometric data from said speech data of the second user; storing said biometric data from said speech data of the second user and associating said biometric data from said speech data of the second user with said user profile associated with the second user; and storing said user profile associated with the second user as a consenting user profile  (“As a part of creating a new speaker profile in the database, a voice sample may be recorded from the unidentified voice signal, e.g., taken from the ongoing conversation, and stored in the new speaker profile. Thus, the voice sample stored in the new speaker profile may be correlated against voice signals received in the future. Moreover, the new speaker profile may be integrated with other existing databases, e.g., a supplemental database such as a government database, a local database, etc.”, paragraph 79; “Once a new speaker profile has been created and populated with speaker identification information and a voice sample, the new speaker profile may then be used in the correlation of subsequently received voice signals”, paragraph 80
	Bapat suggests where a “consenting user profile” is created [based on the existence of a “consenting user profile”] by [at least] receiving a “consenting user’s” speech, “extracting” the “consenting user’s” speech characterization data, storing the “consenting user’s” speech characterization data on the system before storing the “consenting user’s” speech characterization data in the “consenting user’s” profile, and then storing the “consenting user’s” profile as one of the “consenting user profiles”.
	Bapat does not, however, describe where the “consenting user” profile [“user profile associated with a second user”] is first “initialized” as part of being “created”.
	Parker describes [in paragraphs 79-80] where a new speaker profile “in the database” is “created and populated” with speaker identification information and a voice sample such that the new speaker profile may be used in the correlation of subsequently received voice signals, which at least suggests where the creating of the profile produces an “initial” speaker profile data structure that can be “populated” with 
	Parker suggests where the “consenting user’s” profile was created by “initializing a user profile associated with a second user” [i.e. creating an “initial” profile data structure for the “consenting user”/”second user” to contain the “consenting user’s” image and characterization data] and then populating the “user profile associated with a second user”/ “consenting user’s”-profile with the “consenting user’s”/second-user’s speech characterization data and then storing the populated “user profile associated with a second user”/ “consenting user’s”-profile as one of the “consenting user profiles” in the persons database.)
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of profile with another because the prior art teaches the claimed invention except for the substitution of a profile which is not necessarily created, in part, by initializing the profile, with a profile which is.  Parker suggests that a profile which is created, in part, by initializing the profile, was known in the art.  One of ordinary skill in the art could have substituted one type of profile with another to obtain the predictable results of a system which receives, at a single device, audio from multiple people, determines speech characterization data of each person of the multiple people from the received audio, compares each person’s speech characterization data to stored speech characterization data in profiles that each store a respective person’s characterization data with consent, determines that a first one of the multiple people is a recognized person responsive to the first one of the 
Bapat, in view of Matthews, Watry, and Parker, suggest creating a consenting user profile, wherein creating a consenting user profile comprises:… initialising a user profile associated with a second user….  Bapat, in view of Matthews, Watry, and Parker, do not, but Coffin suggests creating a consenting user profile, wherein creating a consenting user profile comprises: verifying credentials of a first user of the computing system against a data source to ensure that the first user is authorised to provide consent to store speech data; initialising a user profile associated with a second user, on instruction of the first user (col. 2, line 48 – col. 3, line 21; Col. 3, line 61 – col. 4, line 20; Col. 6, lines 5-32; Abstract;
The combination [thus far] is as discussed in the portion of this rejection of claim 9 based on Parker [including where a consenting user profile is generated by creating the consenting user profile and populating the consenting user profile with speech characterization data of a consenting user and storing the consenting user profile as one of the consenting user profiles in the persons database].
In Coffin, col. 2, line 48 – col. 3, line 21 describes where a facial recognition system is operated in an enrollment mode at the direction of an authorized operator or system administrator, and where an administrator is logged into a system with an operator password with a level of responsibility [i.e. by “verifying” password “credentials”, where it is suggested that the password is associated with an operator/administrator that has a certain level of responsibility] and can have the ability to enroll new data.  Col. 3, line 61 – col. 4, line 20 describes where an operator can choose to add a new person to the system and add a new record, and where an operator can add all relevant data [at least suggested to be relevant data to be added to the new record] such as a full name and social security number and clearance level [suggesting that the new record corresponds to a single person, and thus is a “profile” for a single person].  Col. 3, line 61 – col. 4, line 20 also describes adding a new person or editing demographic data of a person already present in the database, and “To edit an existing data record” [suggesting that each person in the database corresponds to a record/”profile”]. Col. 6, lines 5-32 describe where image information for use in a surveillance mode includes opening an image information enrollment screen in order to 
	Coffin thus suggests where, in the Bapat/Matthews/Watry/Parker combination, at least one of the “consenting user profiles” is created by logging in a system administrator with enrollment authority using a password that corresponds to a level of responsibility/authority [i.e. “verifying credentials of a first user of the computing system against a data source to ensure that the first user is authorized to provide consent to store speech data”, where successfully logging in with a password “verifies” both the system administrator’s password “credential” and the system administrator’s responsibility level “credential” “against a data source” that is at least suggested to include a valid password and a corresponding responsibility level for the system administrator, “to ensure that” the system administrator is authorized to initiate/consent-to creation/enrollment of the “consenting user profile” which includes, among other things, person’s speech characterization data] and where creation of the consenting user profile [which, as per Parker, includes creating/”initializing” a consenting user profile to be populated with the speech characterization data for a consenting user that was extracted from the consenting user’s speech] is initiated at the direction of the system administrator [where “initializing a user profile associated with a second user” is done “on instruction of the first user”])

	Bapat, in view of Matthews, Watry, Parker, and Coffin, do not, but Chebolu suggests wherein the second user is a child and the first user is a parent of the second user (paragraph 50;
	As discussed above, Coffin thus suggests where, in the Bapat/Matthews/Watry/Parker combination, at least one of the “consenting user profiles” is created by logging in a system administrator with enrollment authority using a password that corresponds to a level of responsibility/authority [i.e. “verifying credentials of a first user of the computing system against a data source to ensure that the first user is authorized to provide consent to store speech data”, where successfully logging in with a password “verifies” both the system administrator’s password “credential” and the system administrator’s responsibility level “credential” “against a data source” that is at least suggested to include a valid password and a corresponding responsibility level for the system administrator, “to ensure that” the system administrator is authorized to initiate/consent-to creation/enrollment of the “consenting user profile” which includes, among other things, person’s speech characterization data] and where creation of the consenting user profile [which, as per Parker, includes creating/”initializing” a 
	Chebolu [paragraph 50] similarly describes where an administrator creates a profile for a person, and more specifically describes where the administrator is a parent and where the profile is created for a child of the parent.
	Chebolu thus suggests “wherein the second user is a child and the first user is a parent of the second user”: where the administrator in the Bapat/Matthews/Watry/Parker/Coffin combination is more specifically a parent of a child and where the created “consenting user profile” is created for the child of the parent)
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of administrator with another and one type of user corresponding to a created profile with another because the prior art teaches the claimed invention except for the substitution of an administrator which is not necessarily a parent of a child with an administrator which is, and the substitution of a user corresponding to a created profile who is not the child of the parent with a user corresponding to a created profile who is.  Chebolu teaches that an administrator who is a parent of a child and a user corresponding to a created profile who is the child of the parent was known in the art.  One of ordinary skill in the art could have substituted one type of administrator with another and one type of user corresponding to a created profile with another to obtain the predictable results of a system which receives, at a single device, audio from multiple people, determines 

matching additional biometric data acquired from the first… speaker or the second… speaker against stored biometric data associated with a respective consenting user profile, and storing non-speech biometric data during profile creation (paragraphs 9-11, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 5, 7A-7B;
Paragraph 228 and Figure 7B describe where image portion 704 and corresponding characterization data 705 are compared with stored images and characterization data [i.e. where more than one type of comparison is performed, including an image comparison and a characterization data comparison].  Paragraphs 196-197 describe an embodiment where video and audio are both received, and paragraphs 160 and 168 describe where characterization data can include person speech data.
Bapat thus suggests “matching additional biometric data acquired from the first… speaker or the second… speaker against stored biometric data associated with a respective consenting user profile”: comparing multiple types of data [including “person’s speech” characterization data and “additional biometric data”] to corresponding stored data in the “consenting user profiles” in order to determine a detected person to be a particular person [e.g. John]
Bapat further suggests “storing non-speech biometric data during profile creation”:  Figure 7B, paragraph 160, and paragraph 168 describe where characterization data for persons who do not give consent to have PII stored is deleted [suggesting that the person-specific “profiles” in Figure 7B are “consenting user profiles”] where characterization data includes non-speech information like posture, 
Bapat does not, but Matthews suggests matching additional biometric data acquired from the first child speaker or the second child speaker against stored biometric data associated with a respective consenting user profile, and storing non-speech biometric data during profile creation (Paragraphs 70-71;
As discussed above, Bapat suggests receiving speech from multiple people at a single device and recognizing one of the multiple people based on the recognized person’s speech characterization data being similar to a consenting user profile’s speech characterization data, and determining one of the multiple people to be unknown based on the unknown person’s speech characterization data not being similar to any profile’s speech characterization data, but does not specifically describe that the recognized person and the unknown person are children and that any of the people with profiles are children.

Matthews thus suggests where the multiple people whose speech is received by the single device suggested by Bapat are children [i.e., where the recognized person is “a first child speaker” and where the unknown person is “a second child speaker”].  Logically, in order to recognize the “first child speaker” based on similarity between the “first child speaker’s” speech characterization data and a consenting user profile’s speech characterization data, the consenting user profile must be the child’s profile with the child’s “voiceprint” speech characterization data, where the consenting user profile is suggested to be associated with a record indicating consent to store the child’s speech characterization data [in the same way that every other consenting user profile of any person is suggested to be associated with a record indicating consent to store PII/characterization data].  Also, logically, if the “second child speaker” is determined to be unknown, then that “second child speaker’s” speech characterization data is not determined to be similar to any of the speech characterization data in any of the consenting user profiles [including any child’s profile])
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of person with another because the prior art teaches the claimed invention except for the substitution of a person who is not necessarily a child with a person who is.  Matthews teaches that a person who is a child was known in the art.  One of ordinary skill in the art could have substituted one type of person with another to obtain the predictable results of a system 

Claims 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Bapat, in view of Matthews and Watry, as applied to Claim 1, above, and further in view of Tussy (US 2016/0063235).

As per Claim 13, Bapat suggests wherein determining whether the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated with the second… speaker corresponds to a voiceprint associated with a respective consenting user profile comprises determining a match against a user profile of a… user (paragraphs 9-11, 96, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 1, 5, 7A-7B;
See rejection of claim 1, particularly the portions corresponding to the “storing one or more user profiles…” and “determining whether any of the …” and “responsive to determining that the first extracted biometric data associated with the first… speaker corresponds to the voiceprint…” limitations.
For the purposes of rejecting claim 13, Figures 7A-7B also depict where the person being compared to the persons database is holding what appears to be an electronic device [see also Figure 1 and paragraph 96, which depict and describe element 166 being held by a person in a similar way and describe element 166 as a portable electronic device].  Bapat thus suggests an embodiment where the first speaker who is determined to be a particular known person [by determining there is sufficient similarity/match between the first speaker’s speech characterization data and the particular person’s stored speech characterization data] is also a person who is using an electronic device.)
Bapat does not, but Matthews suggests wherein determining whether the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker corresponds to a voiceprint associated with a respective consenting user profile comprises determining a match against a user profile of a… user (Paragraphs 70-71;

Matthews describes [in paragraphs 70-71] where a single device [a camera] detects voice input from multiple sources, particularly children that are conversing with each other.
Matthews thus suggests where the multiple people whose speech is received by the single device suggested by Bapat are children [i.e., where the recognized person is “a first child speaker” and where the unknown person is “a second child speaker”].  Logically, in order to recognize the “first child speaker” based on similarity between the “first child speaker’s” speech characterization data and a consenting user profile’s speech characterization data, the consenting user profile must be the child’s profile with the child’s “voiceprint” speech characterization data, where the consenting user profile is suggested to be associated with a record indicating consent to store the child’s speech characterization data [in the same way that every other consenting user profile of any person is suggested to be associated with a record indicating consent to store PII/characterization data].  Also, logically, if the “second child speaker” is determined to be unknown, then that “second child speaker’s” speech characterization data is not 
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of person with another because the prior art teaches the claimed invention except for the substitution of a person who is not necessarily a child with a person who is.  Matthews teaches that a person who is a child was known in the art.  One of ordinary skill in the art could have substituted one type of person with another to obtain the predictable results of a system which receives, at a single device, audio from multiple people, determines speech characterization data of each person of the multiple people from the received audio, compares each person’s speech characterization data to stored speech characterization data in profiles that each store a respective person’s characterization data with consent, determines that a first one of the multiple people is a recognized person responsive to the first one of the multiple people’s determined speech characterization data being determined to be sufficiently similar to speech characterization data stored in a profile that corresponds to the first one of the multiple people and that is associated with consent corresponding to the first one of the multiple people, determines that a second one of the multiple people is an unknown person responsive to the second one of the multiple people’s determined speech characterization data being determined to not be sufficiently similar to speech characterization data in any of the profiles, and deletes the second one of the multiple people’s determined speech characterization data based on the second one of the multiple people being determined to be an unknown person, 
Bapat, in view of Matthews and Watry do not, but Tussy suggests wherein determining whether the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker corresponds to a voiceprint associated with a respective consenting user profile comprises determining a match against a user profile of a logged-in user (“facial recognition login system provides a convenient manner for a user to login to an account with a mobile device… user simply needs to image himself or herself”, paragraph 113; “if the credentials provided by the user are not verified… display on the screen of the mobile device… login attempt failed… allow the user to try again to log in via the facial recognition login”, paragraph 104; “If in one of the attempts… required level of correspondence is met… user may be verified and access may be granted”, paragraph 105;
As discussed above, for the purposes of rejecting claim 13, Bapat suggests an embodiment where the first speaker who is determined to be a particular known person [by determining there is sufficient similarity/match between the first speaker’s speech characterization data and the particular person’s stored speech characterization data] is also a person who is using an electronic device.
Tussy more specifically describes where a mobile device user is logged in.
Tussy thus suggests an embodiment of Bapat’s system where the detected person [i.e. the “first child speaker”] who is using an electronic device and whose speech characterization data is compared to and matched with the speech 
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of person who is using an electronic device with another because the prior art teaches the claimed invention except for the substitution of a person who is using an electronic device who is not necessarily logged in with a person who is using an electronic device who is.  Tussy teaches that a person who is using an electronic device who is logged in was known in the art.  One of ordinary skill in the art could have substituted one type of person who is using an electronic device with another to obtain the predictable results of a system which receives, at a single device, audio from multiple people, determines speech characterization data of each person of the multiple people from the received audio, compares each person’s speech characterization data to stored speech characterization data in profiles that each store a respective person’s characterization data with consent, determines that a first one of the multiple people is a recognized person responsive to the first one of the multiple people’s determined speech characterization data being determined to be sufficiently similar to speech characterization data stored in a profile that corresponds to the first one of the multiple people and that is associated with consent corresponding to the first one of the multiple people, determines that a second one of the multiple people is an unknown person responsive to the second one of the 

As per Claim 14, Bapat suggests wherein said… user is one of said one or more users of said computing system and detection of biometric data associated with the…user (paragraphs 9-11, 96, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 1, 5, 7A-7B;
See rejection of claim 1, particularly the portions corresponding to the “storing one or more user profiles…” and “determining whether extracted biometric data associated with…” and “responsive to determining that said extracted biometric data associated with the first speaker corresponds to the voiceprint…” limitations, and see also rejection of claim 13.
Of particular relevance to claim 14 are the parts of the rejection of claim 1 that state that “each set of person-specific data in Figure 7B is a “consenting user profile” [in the sense that each person “uses” the system to store PII that they consented to be 
More specifically, when a person who is using an electronic device [e.g. the person in Figures 7A-7B in element 704] is a person who is recognized as a person corresponding to a “consenting user profile” [e.g. one of the people in the persons database in Figure 7B], that person is a “one of said one or more users of said computing system” [in the sense that he/she “uses” the system to store PII that he/she consented to be stored], and [as per the rejection of claim 13] is also a user of an electronic device [in the rejection of claim 13, above, based on Bapat, in view of Tussy, a “logged-in user”]
Figures 7A-7B and paragraphs 227-228 also describe identification/recognition of a person based on face image comparison [i.e. “detection of” face “biometric data associated with the” person that appears to be using the electronic device]).
Bapat, in view of Matthews and Watry, do not, but Tussy suggests wherein said logged-in user is one of said one or more users of said computing system (“facial recognition login system provides a convenient manner for a user to login to an account with a mobile device… user simply needs to image himself or herself”, paragraph 113; “if the credentials provided by the user are not verified… display on the screen of the mobile device… login attempt failed… allow the user to try again to log in via the facial recognition login”, paragraph 104; “If in one of the attempts… required level of 
As discussed above, for the purposes of rejecting claim 13, Bapat suggests an embodiment where the first speaker who is determined to be a particular known person [by determining there is sufficient similarity/match between the first speaker’s speech characterization data and the particular person’s stored speech characterization data] is also a person who is using an electronic device.
Tussy more specifically describes where a mobile device user is logged in.
Tussy thus suggests an embodiment of Bapat’s system where the detected person [i.e. the “first child speaker”] who is using an electronic device and whose speech characterization data is compared to and matched with the speech characterization data of a “consenting user profile” [thereby identifying the detected person as a particular person corresponding to the “consenting user profile”] is a person who has used the electronic device to log in to an account and is recognized/identified as the particular person corresponding to the stored “consenting user profile” [such that the person’s profile is a “user profile of a logged-in user”])
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of person who is using an electronic device with another because the prior art teaches the claimed invention except for the substitution of a person who is using an electronic device who is not necessarily logged in with a person who is using an electronic device who is.  Tussy teaches that a person who is using an electronic device who is logged in was known in the art.  One of ordinary skill in the art could have substituted one type of person who is 
wherein said logged-in user has been logged into said computing system in response to detection of biometric data associated with the logged-in user (Figure 3; “A log in module 328 is also provided to allow a user to log in, such as with password protection, to the mobile device 304”, paragraph 72; “facial recognition login system provides a convenient manner for a user to login to an account with a mobile device… user simply needs to image himself or herself”, paragraph 113; “if the credentials provided by the user are not verified… display on the screen of the mobile device… login attempt failed… allow the user to try again to log in via the facial recognition login”, paragraph 104; “If in one of the attempts… required level of correspondence is met… user may be verified and access may be granted”, paragraph 105;
	As discussed above, Tussy suggests an embodiment of Bapat’s system where the detected person [i.e. the “first child speaker”]  who is using an electronic device and whose speech characterization data is compared to and matched with the speech characterization data of a “consenting user profile” [thereby identifying the detected person as a particular person corresponding to the “consenting user profile”] is a person who has used the electronic device to log in to an account and is recognized/identified as the particular person corresponding to the stored “consenting user profile” [such that the person’s profile is a “user profile of a logged-in user”, and such that the person is a “logged-in user” who is logged into an account, and such that the person is “one of said one or more users of said computing system” in the sense that the person “uses” the computing system to store PII that he/she consented to be stored]

	Tussy thus further suggests where, in response to recognizing the face of the “logged-in user” [the “first child speaker” detected person who is recognized and who is using an electronic device and who is logged into an account, where recognizing a face detects biometric data of a particular person], Bapat’s system [“said computing system” which performs the recognition/comparison] logs the “logged-in user” into Bapat’s system [similar to how the mobile device login module in Tussy is used to log into the mobile device])
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of processing performed in response to recognizing a person’s face with facial recognition with another because the prior art teaches the claimed invention except for the substitution of processing performed in response to recognizing a person’s face with facial recognition which does not include logging in the recognized person with processing performed in response to recognizing a person’s face with facial recognition which does.  Tussy teaches that processing performed in response to recognizing a person’s face with facial recognition which includes logging in the recognized person was known in the art.  One of ordinary skill in the art could have substituted one type of processing performed in response to recognizing a person’s face with facial recognition with another to obtain the predictable results of a system which receives, at a single device, audio and video from multiple .

Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Bapat, in view of Matthews and Watry, as applied to Claim 1, above, and further in view of Zeppenfeld et al. (US 2014/0254778), hereafter Zeppenfeld.

As per Claim 15, Bapat suggests wherein determining whether any of the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated with the second… speaker comprises determining a match for any of the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated with the second… speaker against… consenting user profiles…  (paragraphs 9-11, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 5, 7A-7B;
As discussed in the rejection of claim 1, above, Bapat suggests “determining whether any of the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated with the second… speaker corresponds to a voiceprint associated with a respective consenting user profile associated with a record indicating consent… to store biometric data…”: [In addition to what was already discussed above] comparing the respective speech characterization data of each of the multiple persons [derived from the audio received at the single device] to speech characterization “voiceprint” data in the “consenting user profiles” [the consenting user profiles are suggested to be associated with a record indicating consent to store 
Bapat suggests “wherein determining whether any of the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated with the second… speaker comprises determining a match for any of the first extracted biometric data associated with the first… speaker or the second extracted biometric data associated with the second… speaker against… consenting user profiles…”: where the comparing of the respective speech characterization data of each of the multiple persons [derived from the audio received at the single device] to speech characterization “voiceprint” data in the “consenting user profiles” to determine if there is 
Bapat does not, but Matthews suggests wherein determining whether any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker comprises determining a match for any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker against… consenting user profiles… (Paragraphs 70-71;
As discussed above, Bapat suggests receiving speech from multiple people at a single device and recognizing one of the multiple people based on the recognized person’s speech characterization data being similar to a consenting user profile’s speech characterization data, and determining one of the multiple people to be unknown based on the unknown person’s speech characterization data not being similar to any profile’s speech characterization data, but does not specifically describe 
Matthews describes [in paragraphs 70-71] where a single device [a camera] detects voice input from multiple sources, particularly children that are conversing with each other.
Matthews thus suggests where the multiple people whose speech is received by the single device suggested by Bapat are children [i.e., where the recognized person is “a first child speaker” and where the unknown person is “a second child speaker”].  Logically, in order to recognize the “first child speaker” based on similarity between the “first child speaker’s” speech characterization data and a consenting user profile’s speech characterization data, the consenting user profile must be the child’s profile with the child’s “voiceprint” speech characterization data, where the consenting user profile is suggested to be associated with a record indicating consent to store the child’s speech characterization data [in the same way that every other consenting user profile of any person is suggested to be associated with a record indicating consent to store PII/characterization data].  Also, logically, if the “second child speaker” is determined to be unknown, then that “second child speaker’s” speech characterization data is not determined to be similar to any of the speech characterization data in any of the consenting user profiles [including any child’s profile])
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of person with another because the prior art teaches the claimed invention except for the substitution of a person which is not necessarily a child with a person which is.  Matthews teaches that a 
Bapat, in view of Matthews and Watry, do not, but Zeppenfeld suggests wherein determining whether any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker comprises determining a match for any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker against both consenting user profiles and non-consenting user profiles, wherein a non-consenting user profile is a user profile not associated with a record indicating consent to store biometric data (paragraphs 4, 19-20, 24, 29;
Paragraph 19 describes where, in addition to a private customer database, a biometric analysis cloud can include a “fraudster voiceprint database” and where the databases can store “both authorized and unauthorized” voiceprints.  Paragraph 20 also teaches where identifying an unauthorized transaction causes a fraudulent voice print to be transferred from a private customer voiceprint database to a fraudulent database, and where a fraudulent database contains voice prints of unauthorized callers. Paragraph 24 describes comparing a caller’s voice prints to a list of authorized voices and where “The voice prints may also be compared to unauthorized voice prints” and where “The voices of known fraudsters may be stored in a database and used to compare with the voice of an incoming caller”.  Paragraph 29 also describes a score that indicates likelihood that a caller is either authorized or unauthorized, and comparing a caller’s voice to authorized voice prints and also known fraudulent voice prints.  Paragraph 4 also describes the concept of an “unauthorized user”.
	Zeppenfeld suggests where Bapat’s system [a surveillance system that can identify someone as a particular person and as an unknown person, and which allows a user to classify an unknown person as a stranger] further includes, in the person database, in addition to the “consenting user profiles” of known people, profiles of known “fraudsters”/”unauthorized users”, and where the respective speech characterization data of each of the multiple persons [children, as per Matthews] is compared “against both consenting user profiles and non-consenting user profiles, wherein a non-consenting user profile is a user profile not associated with a record indicating consent to store biometric data” [i.e. where the comparing compares to both known person profiles and fraudster/”unauthorized user” profiles, where the fraudster/”unauthorized user” profile is at least suggested to be “non-consenting” and “not associated with a record indicating consent to store biometric data” since a person, particularly an unauthorized user/criminal/fraud, most likely would not consent to being identified as an unauthorized user/fraud/criminal])
	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to modify Bapat, in view of Matthews, to include the suggestion of Zeppenfeld of wherein determining whether any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker comprises determining a match for any of the first extracted biometric data associated with the first child speaker or the second extracted biometric data associated with the second child speaker against both consenting user profiles and non-consenting user profiles, wherein a non-consenting user profile is a user profile not associated with a record indicating consent to store biometric data, in order to provide the ability to identify known frauds/unauthorized-users (as per paragraphs 24 and 29).

Claim 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Bapat, in view of Matthews, Watry and Zeppenfeld, as applied to Claim 15, above, and further in view of Parker et al. (US 2017/0133018), hereafter Parker.

As per Claim 16, Bapat suggests creating a… user profile, wherein creating a… user profile comprises:…; receiving speech data of… third user; extracting third biometric data from said speech data of… third user; storing said third biometric data and associating said third biometric data with… third user profile;… (paragraphs 9-11, 126, 160, 168, 194, 196-197, 227-228, 238-239, 243-245; Figures 5, 7A-7B;
	Figure 7B depicts multiple sets of person-specific data each including an image for a respective person and characterization data [i.e. “user profiles”].  Figure 7A and Paragraph 168 describe where physical feature information of characterization data can include, among other things, information regarding a person’s speech, and paragraphs 227-228 describe where characterization data is compared to identify a person and is associated with a profile for “the person designed as ‘John’” [suggesting that the person-specific data in the persons database in Figure 7B are “user profiles” in the sense that each person “uses” the system to store PII that they consented to be stored and where the characterization data in each “user profile” includes a respective person’s “voiceprint”].  Paragraphs 238-239 further describe 
	Bapat suggests “creating a… user profile,” [the “user profiles” in the persons database in Figure 7B exist and were thus logically “created” at some point] “wherein creating a… user profile comprises:…; receiving speech data of… third user; extracting third biometric data from said speech data of… third user;” [in order to have profiles that each include a respective person’s speech characterization data, it is at least suggested that at some point, for each profile, a respective “user”/person’s speech was received and the person’s speech was analyzed to “extract” the person’s speech characterization/”biometric” data, particularly because a person’s actual speech is logically the best place to derive information about the person’s speech] “storing said third biometric data and associating said third biometric data with…third user profile;” [storing a person’s speech characterization data in the person’s profile “associates” the person’s speech characterization data with the person’s profile, and paragraphs 238-239 suggest an embodiment where the speech characterization data exists in, and is thus “stored” in the system at some time after “extraction” and before being stored in the person’s profile])
Bapat suggests creating a… user profile, wherein creating a… user profile comprises:…; receiving speech data of… third user; extracting third biometric data from said speech data of… third user; storing said third biometric data and associating said third biometric data with… third user profile;… Bapat, in view of Matthews and Watry, do not, but Zeppenfeld suggests creating a non-consenting user profile, wherein creating a non-consenting user profile comprises:…; receiving speech data of… third user; extracting third biometric data from said speech data of… third user; storing said third biometric data and associating said third biometric data with… third user profile; and storing… third user profile as a non-consenting user profile (paragraphs 4, 19-20, 24, 29;
Paragraph 19 describes where, in addition to a private customer database, a biometric analysis cloud can include a “fraudster voiceprint database” and where the databases can store “both authorized and unauthorized” voiceprints.  Paragraph 20 also teaches where identifying an unauthorized transaction causes a fraudulent voice print to be transferred from a private customer voiceprint database to a fraudulent database, and where a fraudulent database contains voice prints of unauthorized callers. Paragraph 24 describes comparing a caller’s voice prints to a list of authorized voices and where “The voice prints may also be compared to unauthorized voice prints” and where “The voices of known fraudsters may be stored in a database and used to compare with the voice of an incoming caller”.  Paragraph 29 also describes a score that indicates likelihood that a caller is either authorized or unauthorized, and comparing a caller’s voice to authorized voice prints and also known fraudulent voice prints.  Paragraph 4 also describes the concept of an “unauthorized user”.
	Zeppenfeld suggests where Bapat’s system [a surveillance system that can identify someone as a particular person and as an unknown person, and which allows a user to classify an unknown person as a stranger] further includes, in the person database, in addition to the “consenting user profiles” of known people, profiles of 
This further suggests where one of the profiles in the persons database that were “created” in the manner suggested by Bapat [see portion of this rejection of claim 16 based on Bapat] is a “non-consenting user profile” [a “fraudster”/”unauthorized user”/”third user” profile created, in part, by receiving a fraudster’s/unauthorized-user’s/”third user’s” speech, analyzing the fraudster/”unauthorized user” speech to “extract” the “unauthorized user’s”/fraudster’s speech characterization data, where the “unauthorized user’s”/fraudster’s speech characterization data exists and is thus “stored” somewhere in the system prior to being stored in the profile, and is then stored in the “unauthorized user’s”/fraudster’s profile, where, as discussed in the rejection of claim 15, a fraudster/unauthorized-user profile is at least suggested to be “non-consenting” since a person, and particularly a fraud/criminal/unauthorized-user, most likely would not consent to being identified as a fraud/criminal/unauthorized-user] and “storing… third user profile as a non-consenting user profile” [storing the fraudster’s/unauthorized-user’s profile as one of the fraudster/unauthorized-user profiles in the persons database, which at least suggests associating the fraudster’s/unauthorized-user’s profile with an indicator that the profile corresponds to a known fraud/unauthorized-user])

	Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to modify Bapat to include the suggestion of Zeppenfeld of creating a non-consenting user profile, wherein creating a non-consenting user profile comprises:…; receiving speech data of… third user; extracting third biometric data from said speech data of… third user; storing said third biometric data and associating said third biometric data with… third user profile; and storing… third user profile as a non-consenting user profile, in order to provide the ability to identify known frauds/unauthorized-users (as per paragraphs 24 and 29).
	Bapat, in view of Matthews, Watry, and Zeppenfeld suggest creating a non-consenting user profile, wherein creating a non-consenting user profile comprises:…; receiving speech data of… third user; extracting third biometric data from said speech data of… third user; storing said third biometric data and associating said third biometric data with… third user profile; and storing… third user profile as a non-consenting user profile. Bapat, in view of Matthews, Watry, and Zeppenfeld, do not, but Parker suggests creating a non-consenting user profile, wherein creating a non-consenting user profile comprises: initialising a third user profile associated with a third user; receiving speech data of the third user; extracting third biometric data from said speech data of the third user; storing said third biometric data and associating said third biometric data with said third user profile; and storing said third user profile as a non-consenting user profile (“As a part of creating a new speaker profile in the database, a voice sample may be recorded from the unidentified voice signal, e.g., taken from the ongoing conversation, and stored in the new speaker profile. Thus, the voice sample stored in the new speaker profile may be correlated against voice signals received in the future. Moreover, the new speaker profile may be integrated with other existing databases, e.g., a supplemental database such as a government database, a local database, etc.”, paragraph 79; “Once a new speaker profile has been created and populated with speaker identification information and a voice sample, the new speaker profile may then be used in the correlation of subsequently received voice signals”, paragraph 80
	Bapat and Zeppenfeld suggest where a fraudster/”unauthorized user”/”third user” profile is created [based on the existence of a fraudster/unauthorized-user profile] by [at least] receiving fraudster/”unauthorized user”/”third user” speech, “extracting” fraudster/unauthorized-user speech characterization data, storing the fraudster/unauthorized-user speech characterization data on the system before storing the fraudster/unauthorized-user speech characterization data in the fraudster/unauthorized-user profile, and then storing the fraudster/unauthorized-user profile as one of the “non-consenting” fraudster/unauthorized-user profiles.
	Bapat and Zeppenfeld do not, however, describe where the fraudster/unauthorized-user profile [“third user profile”] is first “initialized” as part of being “created”.

	Parker suggests where the fraudster/unauthorized-user profile was created by “initializing a third user profile associated with a third user” [i.e. creating an “initial” profile data structure for the fraud/”unauthorized user”/”third user” to contain the fraudster’s/unauthorized-user’s image and characterization data] and then populating the “third user profile”/fraudster-profile/unauthorized-user-profile with the “third user’s”/unauthorized-user’s/fraud’s speech characterization data and then storing the populated fraudster-profile/unauthorized-user-profile/”third user profile” as one of the “non-consenting user profiles”/fraudster-profiles/unauthorized-user-profiles in the persons database.)
Therefore, it would have been obvious to one of ordinary skill in the art at the time of effective filing to perform a simple substitution of one type of profile with another because the prior art teaches the claimed invention except for the substitution of a profile which is not necessarily created, in part, by initializing the profile, with a profile which is.  Parker suggests that a profile which is created, in part, by initializing the profile, was known in the art.  One of ordinary skill in the art could have substituted one .
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
2004/0003142 teaches “After the image processing, the data before the image processing is deleted, and then, the image after the image processing is displayed”
4091242 teaches “the delta mod decoder 12 at this time to the gainsave register 40. This is the gain of the speech signal at the point just prior to the speech being deleted. The gain of this first portion of the speech input signal will subsequently be transferred via the gate 42 and line 45 at A13”
8463488 teaches “The administrator, by use of the software 300, can configure multiple user profile and specify specific profile parameters for each user profile. For example, if the administrator is a parent of two children, each of which has obtained a driver's license, the administrator may develop a different profile for each child. Alternately, if the administrator is a fleet owner, the administrator may develop profiles for each driver role, e.g., a first profile for a delivery truck driver, a second profile for a long range freight carrier driver, etc.”; “Developing each profile can be facilitated by use of a user interface. For example, the user interface can includes files in which the administrator can input a profile name, a maximum speed, a maximum range, security testing requirements, and specific rules that are to be applied to the user profile. In some implementations, the user interface can include a map interface through which the administrator can selectively define an allowed operational area for a user, e.g., a city radius, etc.”
2010/0223660 teaches “for each child or for each member of a restricted class of user, an administrator (e.g., parent) may define within a user profile permission settings that designate an allotted time for viewing multimedia content during a tracking period, the duration of a tracking period, the times during which multimedia content may be viewed, the maximum amount of time that may be rolled over from one tracking period to the next, programs that are designated as permitted for a limited number of accesses, the amount of time that is provided for exercising the limited number of accesses, and other such parameters” (paragraph 59)
8688306 teaches “a vehicle's autonomous driving computer may receive profile data for a user. The profile data may be provided by an authorized administrator such as parent, guardian, law enforcement representative, etc. and may define the access rights and permissions of the user with respect to the vehicle”
2013/0185220 “For example, children under the age of 18 need parental consent to create a profile and chat on the online platform; administrators of the online platform will automatically scan sites for inappropriate content, such as inappropriate language or inappropriate pictures; users can report abuse on sites throughout the platform, and inappropriate content can be deleted by site administrators;” (paragraph 133)
2017/0061959 “In some implementations, not shown in FIG. 3, spoken words including keywords may overlap in digitized speech 308, such as when multiple children are playing a voice-controlled video game and speak over one another” (paragraph 17)
2018/0204577 “an optional feature that may be provided to avoid two people rapidly switching personalization settings on the electronic device 302, such as two children repeatedly speaking their keywords one after the other. Speech inputs are 
5029214 teaches “In hardware for transmitting the doll communications code TDCC between the dolls, many competing considerations must be taken into account. The child may be playing near other children or in the same room as adults who are talking”
2008/0235162 “the system could by used in a talking teddy bear that listens to children and responds to comments by the children” (paragraph 52).  This reference does not specifically describe that two children are speaking at the same time.
2020/0335098 (PCT filed date precedes effective date of this application, Foreign application date further precedes effective date of this application) teaches “conversation between children” (paragraphs 193-194), but “conversation between children” in this reference appears to refer to a conversation between a system and a child (see e.g. paragraph 200), not a conversation between two children.
2002/0120866 “Regardless of who creates the user profile and when, the parent at some point must provide consent information with regard to the affiliate server to be stored in the user profile of the child. It should be appreciated that the parent can be given the opportunity to provide the consent information in various manners. For 
2005/0193093 “If the user is a minor, then server 170 proceeds to a parent granting minor consent process at 508” (paragraphs 88-89)
2009/0173786 “the transaction screen may include the text instructing the parent that they are giving consent for their child to send a song they recorded to other users” (paragraph 32).  This reference is directed to providing consent for a child to send a song they recorded, NOT to consent for a child to record a song.
2014/0278366 teaches “the FTC's recent implementation of the Children's Online Privacy and Protection Act (COPPA) concerns itself in part with children under 13 years old. Changes to the Act may require that audio files that contain a child's voice be NOT require consent.
2016/0019372 teaches “The media of method 700 may be at least one of a digital media, a video clip, an audio recording, or any combination thereof. Prompting the at least one person for digital consent may include displaying a Health Insurance Portability and Accountability Act (HIPAA) compliant form to the at last one person. Receiving the at least one person's digital consent may include receiving the at least one person's digital signature. The steps of generating, prompting, receiving and delivering may be connected using a single handheld computing device. Delivering the digital consent may include emailing the digital consent. The method 700 may also include storing the digital consent in a database. Receiving the at last one person's digital consent may include receiving information including at least one of a name and signature of the at least one person (or a parent or guardian of the at least one person in the case of a minor) and entering the information into an electronic consent form. Prompting the at least one person for digital consent may occur automatically after generating the media. Prompting at least one person for digital consent may include prompting each of a plurality of people included in the media for digital consent. The method 700 may include delivering the media with the digital consent to the storage location” (paragraph 61)

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ERIC YEN whose telephone number is (571)272-4249.  The examiner can normally be reached on M-F 12:00PM -8:30PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RICHEMOND DORVIL can be reached on (571)272-7602.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see https://ppair-my.uspto.gov/pair/PrivatePair. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.






EY 6/15/2021
/ERIC YEN/Primary Examiner, Art Unit 2658