DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
This office action is in response to the amendment filed on 09/14/2022.
Claims 1-14, 16, and 21-25 are pending for examination. Applicant amends claims 1, 3, 7-8, 11, 13, and 16 and cancels claims 15 and 17-20. The amendments have been fully considered and entered.

Response to Arguments
For convenience, the newly introduced limitations, as made by amendments, are marked as underlined.
Applicant’s arguments, see Remarks, filed 09/14/2022, with respect to the rejection of claims 1, 11, and 16 under 35 U.S.C. § 103 have been fully considered but are not persuasive. The following are Applicant's arguments recited in the Remarks, each followed by Examiner's response:
a.	Applicant argues that neither Abdi’s “discussion of first token 316 and second token 324, nor the representation of first token 316 and second token 324 in [Abdi’s] Fig. 3, teaches or suggests ‘wherein the hashed sensitive information comprises an audio clip that replaces the sensitive information … and that retains linguistic characteristics of the sensitive information’ as recited in the independent claims.” (Remarks, pg. 8).
Examiner respectfully disagrees and submits, as explained below, that Abdi’s first token 316 and second token 324 represent audible information exchanged between the user and the customer service representative (see col. 6 lines 20-22) and retain linguistic characteristics of the sensitive information because language is still being used (see 204 and 208 of Fig. 2 compared to 304 and 308 of Fig. 3). For example, responses 216 and 224 use numbers and letters to provide a social security number and a date of birth. Tokens 316 and 324 still consist of numbers and letters (i.e., retain linguistic characteristics) although they differ from responses 216 and 224.
b.	Applicant argues that “none of the cited references, whether considered alone or in combination, performs the two-step process described in claims 3 and 13 and including ‘a sensitive information database’ and ‘a sensitive score’ for extracted portions that do ‘not match any record in a sensitive information database,’” as currently amended. (Remarks, pg. 9).
Examiner respectfully disagrees and submits that Abdi reasonably teaches the two-step process described in claims 3 and 13, as explained below in the prior art rejection. Specifically, col. 8 lines 27-44 of Abdi teaches comparing the contents of a response to a keyword list (i.e., database), finding a match or no match in the keyword list, assigning likelihoods (i.e., sensitivity scores) to each component of the response, and comparing the likelihoods to a threshold to determine whether the response is sensitive.
c.	Applicant argues, with regard to claims 21-23, that none of the cited references teaches or suggests linguistic characteristics of the sensitive information being phonetic, syntactic, or semantic. (Remarks, pg. 10).
Examiner respectfully disagrees and submits that Abdi reasonably teaches these limitations as explained below.

Examiner Note
Regarding the 35 U.S.C. 101 analysis for claims 11 and 16, paragraphs [0075]-[0076] of the originally filed specification disclose that the processor is a CPU, which is a known hardware element in the art. Therefore, a 35 U.S.C. 101 “software per se” rejection is not appropriate for claim 11. Furthermore, paragraph [0106] of the originally filed specification discloses that the computer-readable storage media are not to be construed as being transitory signals per se; therefore, a 35 U.S.C. 101 “signals per se” rejection is not appropriate for claim 16.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7, 9, 11-14, 16, and 21-23 are rejected under 35 U.S.C. 103 as being unpatentable over Abdi Taghi Abad et al. (US 10380380 B1; hereinafter “Abdi”) in view of Newman et al. (US 10468026 B1; hereinafter “Newman”) and further in view of Bao et al. (US 20160004873 A1; hereinafter “Bao”).
As per claims 1, 11, and 16, Abdi discloses: a computer-implemented method, system, and computer program product comprising one or more computer readable storage media, the system comprising: 
one or more processors (Abdi, Fig. 9, processing unit); and 
one or more computer-readable storage media storing program instructions (Abdi, Fig. 8, storage medium comprising executable instructions) which, when executed by the one or more processors, are configured to cause the one or more processors to perform a method comprising: 
identifying sensitive information in first audio data from a first client device (Abdi, Fig. 4 and col. 8 lines 53-66, controller component 112 receives customer data in audio form (i.e., audio data) from user device 104 (i.e., first client device) wherein personal data (i.e., sensitive data) is identified); 
generating second audio data including a token, wherein the token comprises an audio clip that replaces the sensitive information, that is based on the sensitive information, and that retains linguistic characteristics of the sensitive information (Abdi, Fig. 4 and col. 9 lines 19-24 and col. 4 line 61 – col. 5 line 9, replacing the personal data with a generated token to form modified customer data (i.e., second audio data), col. 9 lines 5-11 and col. 5 lines 1-2, wherein the token is voice or audio data (i.e. audio clip), Figs. 2-3 and col. 6 lines 20-22, audible information exchanged between the user and the customer service representative where sensitive information is replaced with a token voice/audio clip that replaces the sensitive information, is based on the sensitive information, and retains linguistic characteristics of the sensitive information because language is still being used (see 204 and 208 of Fig. 2 compared to 304 and 308 of Fig. 3)); 
transmitting the second audio data including the token to a second client device (Abdi, col. 9 lines 25-26, the modified customer data including the audio token is provided to the customer service representative 106 via remote system 108 (i.e., second client device), col. 5 lines 4-9, “replacement voice data can then be provided within the verbal message from the user 102 for playback and/or presentation to the customer service representative 106”);
receiving third audio data including the token from the second client device (Abdi, col. 5 lines 31-46, controller component 112 receives a query from remote system 108 (i.e., second client device) to retrieve the sensitive personal information, wherein the query includes the token, col. 9 lines 5-11, wherein the token includes audio data (i.e., third audio data including the token)); and 
generating fourth audio data by replacing the token with the sensitive information (Abdi, col. 5 lines 31-46, controller component receives the sensitive personal information in audio form (i.e., fourth audio data) using the audio token included in the query).
While Abdi teaches using an audio token to replace sensitive information during a phone call between a customer and a customer service representative (Abdi, Fig. 4) and generating fourth audio data by replacing the token with the sensitive information in response to a query for the sensitive information (Abdi, col. 5 lines 31-46), Abdi does not explicitly disclose, but Newman teaches or suggests: transmitting fourth audio data including sensitive information to the first client device (Newman, col. 6 lines 24-27, “a call agent may repeat the PII (i.e., sensitive information) or SPI back to the customer for confirmation and the numbers repeated by the call agent are the spoken digits that are detected in the reverse process”).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify/combine the teachings of Abdi to include transmitting fourth audio data including the sensitive information to the first client device as taught or suggested by Newman for the benefit of allowing the call agent to confirm the personally identifiable information of the customer of the first client device before providing service to the customer (Newman, col. 6 lines 24-27).
While the modified Abdi teaches an audio token to replace sensitive information in audio data (Abdi, Fig. 4), the modified Abdi does not disclose, however, Bao teaches or suggests including hashed sensitive information in audio data (Bao, [0107], encrypting private information using a hash function, [0045], private information was provided via audio).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify/combine the teachings of the modified Abdi to include hashed sensitive data in audio data as taught or suggested by Bao for the benefit of improving a user’s privacy by controlling an information type shared to the public (Bao, [0020]). Furthermore, the combination would have been obvious because a person of ordinary skill in the art would know to simply substitute one known element (i.e., tokens) for another (i.e., hashed sensitive data) to obtain predictable results (i.e., replacing sensitive information with a value to protect the sensitive information).
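For illustration only, the replace-and-restore flow mapped above for claims 1, 11, and 16 can be sketched in Python. This sketch is not taken from Abdi, Newman, or Bao. It operates on a text transcript as a stand-in for audio data, and the names `make_token` and `TokenVault`, the salt value, and the choice of SHA-256 are all assumptions introduced here. The token is derived from a hash of the sensitive value and keeps digits as digits and letters as letters, which is one possible reading of "retains linguistic characteristics."

```python
import hashlib


def make_token(secret: str, salt: str = "demo-salt") -> str:
    """Derive a hash-based token with the same shape as the input.

    Digits map to digits, letters map to letters of the same case, and
    any other character (such as a hyphen) is kept unchanged.
    The salt value is a hypothetical placeholder.
    """
    digest = hashlib.sha256((salt + secret).encode("utf-8")).digest()
    out = []
    for i, ch in enumerate(secret):
        b = digest[i % len(digest)]
        if ch.isdigit():
            out.append(str(b % 10))
        elif ch.isalpha():
            base = "A" if ch.isupper() else "a"
            out.append(chr(ord(base) + b % 26))
        else:
            out.append(ch)
    return "".join(out)


class TokenVault:
    """Hypothetical store linking each token to the value it replaced."""

    def __init__(self) -> None:
        self._by_token: dict[str, str] = {}

    def redact(self, transcript: str, secret: str) -> str:
        # Replace the sensitive value with its token and remember the link.
        token = make_token(secret)
        self._by_token[token] = secret
        return transcript.replace(secret, token)

    def restore(self, transcript: str) -> str:
        # Swap every known token back to the original sensitive value.
        for token, secret in self._by_token.items():
            transcript = transcript.replace(token, secret)
        return transcript


if __name__ == "__main__":
    vault = TokenVault()
    first = "My SSN is 123-45-6789"
    second = vault.redact(first, "123-45-6789")  # sent to the second device
    fourth = vault.restore(second)               # token swapped back
    print(second)
    print(fourth == first)
```

In this sketch the mapping lives only in memory. A real system would need a protected store for the token-to-value link, which the sketch does not attempt to model.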

As per claims 2 and 12, claims 1 and 11 are incorporated, respectively, and the modified Abdi discloses: wherein identifying the sensitive information in the first audio data further comprises: comparing extracted portions of the first audio data to a sensitive information database (Abdi, col. 6 lines 13-16, comparing any audio data from the user to a keyword list); and 
classifying respective extracted portions matching a respective entry in the sensitive information database as the sensitive information (Abdi, col. 6 lines 13-16, detecting/classifying portions of audio data as sensitive personal information based on the comparison of the audio data to the keyword list).

As per claim 3, claim 1 is incorporated and the modified Abdi discloses: wherein identifying the sensitive information in the first audio data further comprises: determining that an extracted portion of the first audio data does not match any record in a sensitive information database (Abdi, col. 8 lines 27-44, “the controller component 112 can compare the contents of the responses 212 and 220 to a keyword list, can compare the formats of any data within the responses 212 and 220 to formats used for providing sensitive information, and/or can assign likelihoods to each individual component of the responses 212 and 220. The likelihoods can be a measure of the probability a specific individual portion or word contains (e.g., text string or an audio component) contains sensitive personal information. The likelihoods can then be compared to a threshold. If the assigned likelihood exceeds the threshold, then the corresponding word or portion of the response 212 and 220 can be flagged be replaced by a token.” (Emphasis added). In other words, it is implied that, after comparing the contents of the response to a keyword list (i.e., database) and finding no match, the controller component assigns likelihoods (i.e., sensitivity scores) to each component of the response and compares them to a threshold to determine whether the response is sensitive);
generating a sensitivity score for the extracted portion of the first audio data in response to determining that the extracted portion of the first audio data does not match any record in the sensitive information database (Abdi, col. 8 lines 27-44, implies that, after comparing the contents of the response to a keyword list (i.e., database) and finding no match, the controller component assigns likelihoods (i.e., sensitivity scores) to each component of the response and compares them to a threshold to determine whether the response is sensitive); 
determining that the sensitivity score satisfies a sensitivity score threshold (Abdi, col. 8 lines 27-44); and 
classifying the extracted portion of the first audio data as the sensitive information (Abdi, col. 8 lines 27-44, “if the assigned likelihood exceeds the threshold, then the corresponding word or portion of the response 212 and 220 can be flagged be replaced by a token”).
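For illustration only, the two-step process addressed in claim 3 (a database lookup, followed by a sensitivity score compared to a threshold when the lookup finds no match) can be sketched in Python. This sketch is not Abdi's implementation. The keyword entries, the format patterns, the likelihood values, and the 0.5 threshold are all hypothetical, and a simple format check stands in for the trained model discussed with respect to claim 4.

```python
import re

# Hypothetical sensitive information database (keyword list).
KEYWORD_LIST = {"social security number", "date of birth"}

# Hypothetical format patterns paired with illustrative likelihoods.
FORMAT_LIKELIHOODS = [
    (re.compile(r"^\d{3}-\d{2}-\d{4}$"), 0.95),  # SSN-like shape
    (re.compile(r"^\d{2}/\d{2}/\d{4}$"), 0.80),  # date-like shape
]


def sensitivity_score(portion: str) -> float:
    """Return an illustrative likelihood that the portion is sensitive."""
    for pattern, likelihood in FORMAT_LIKELIHOODS:
        if pattern.match(portion):
            return likelihood
    return 0.0


def is_sensitive(portion: str, threshold: float = 0.5) -> bool:
    """Classify an extracted portion using the two-step process."""
    # Step 1: compare the portion to the keyword list.
    if portion.lower() in KEYWORD_LIST:
        return True
    # Step 2: no match, so score the portion and compare to the threshold.
    return sensitivity_score(portion) > threshold


if __name__ == "__main__":
    for sample in ["Date of Birth", "123-45-6789", "hello"]:
        print(sample, is_sensitive(sample))
```

The sketch scores a portion only when the keyword lookup finds no match, which mirrors the ordering recited in the claim. The passage quoted from Abdi joins the steps with "and/or," so other orderings are also consistent with that passage.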

As per claim 4, claim 3 is incorporated and the modified Abdi discloses: wherein the sensitivity score is generated by a content sensitivity model that is trained using machine learning algorithms (Abdi, col. 5 lines 58-61, “the controller component 112 can implement machine learning techniques and/or can be part of a recurrent neural network (RNN) that can be trained to recognize sensitive personal information”).

As per claims 5 and 14, claims 1 and 11 are incorporated, respectively, and while the modified Abdi teaches wherein generating the second audio data including the token further comprises storing an indicator specifying the correspondence between the sensitive personal information and the token (Abdi, col. 4 lines 26-30); and wherein generating fourth audio data by replacing the token with the sensitive information further comprises matching the token with the sensitive personal information (Abdi, col. 5 lines 31-46), the modified Abdi does not explicitly disclose, but Bao teaches or suggests: storing a correspondence between the sensitive information and a hashed sensitive information in a mapping table (Bao, [0113], mapping table 312 based on pairing the encrypted/hashed information and the corresponding instance of the private information); and matching the hashed sensitive information with the sensitive information based on the correspondence in the mapping table (Bao, [0053], restored information requires comparing/matching the encrypted/hashed information with the unencrypted information).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify/combine the teachings of the modified Abdi to include a mapping table for hashed sensitive data and sensitive data as taught or suggested by Bao for the benefit of improving a user’s privacy by controlling an information type shared to the public (Bao, [0020]).

As per claim 6, claim 1 is incorporated and the modified Abdi discloses: wherein the hashed sensitive information includes an indicator that identifies the hashed sensitive information as data with a sensitive information classification (Abdi, col. 4 lines 24-30, “The controller component 112 can also store the token along with an indicator or other data for indicating or specifying the link or relationship between the sensitive personal information and the generated replacement token data.” Bao teaches the hashed sensitive information as explained above).

As per claim 7, claim 6 is incorporated and the modified Abdi discloses: wherein the indicator further includes an explanation of the sensitive information classification, wherein the explanation relates to a match in a sensitive information database (Abdi, col. 6 lines 14-16, “sensitive personal information can be detected by comparison of any data from the user 102 to a keyword list,” col. 8 lines 32-33, “the controller component 112 can compare the contents of the responses 212 and 220 to a keyword list,” in other words, the indicator explains a match in a sensitive information keyword list/database).

As per claim 9, claim 1 is incorporated and the modified Abdi discloses: wherein the method is performed by a data security application according to software that is downloaded to the data security application from a remote data processing system (Abdi, col. 2 lines 60-61, “The online service provided by the remote system 108 can be any type of website or application accessible over, for example, the Internet,” col. 4 lines 1-2, “The controller component 112 can be implemented in software”).

As per claim 13, claim 11 is incorporated and the modified Abdi discloses: wherein identifying the sensitive information in the first audio data further comprises: determining that an extracted portion of the first audio data does not match any record in a sensitive information database (Abdi, col. 8 lines 27-44, “the controller component 112 can compare the contents of the responses 212 and 220 to a keyword list, can compare the formats of any data within the responses 212 and 220 to formats used for providing sensitive information, and/or can assign likelihoods to each individual component of the responses 212 and 220. The likelihoods can be a measure of the probability a specific individual portion or word contains (e.g., text string or an audio component) contains sensitive personal information. The likelihoods can then be compared to a threshold. If the assigned likelihood exceeds the threshold, then the corresponding word or portion of the response 212 and 220 can be flagged be replaced by a token.” In other words, it is implied that, after comparing the contents of the response to a keyword list (i.e., database) and finding no match, the controller component assigns likelihoods (i.e., sensitivity scores) to each component of the response and compares them to a threshold to determine whether the response is sensitive);
generating, in response to determining that the extracted portion of the first audio data does not match any record in the sensitive information database, a sensitivity score for the extracted portion of the first audio data based on inputting the extracted portion of the first audio data to a content sensitivity model that is trained using machine learning algorithms (Abdi, col. 8 lines 27-44, implies that, after comparing the contents of the response to a keyword list (i.e., database) and finding no match, the controller component assigns likelihoods (i.e., sensitivity scores) to each component of the response and compares them to a threshold to determine whether the response is sensitive, col. 5 line 65 – col. 6 line 7, assigning a likelihood (i.e., sensitivity score) to a particular spoken word, col. 5 lines 58-61, “the controller component 112 can implement machine learning techniques and/or can be part of a recurrent neural network (RNN) that can be trained to recognize sensitive personal information”); 
determining that the sensitivity score satisfies a sensitivity score threshold (Abdi, col. 8 lines 27-44 and col. 5 line 65 – col. 6 line 7, likelihood (i.e., sensitivity score) of a particular spoken word exceeds a threshold); and 
classifying the extracted portion of the first audio data as the sensitive information (Abdi, col. 8 lines 27-44, “if the assigned likelihood exceeds the threshold, then the corresponding word or portion of the response 212 and 220 can be flagged be replaced by a token,” col. 5 line 65 – col. 6 line 7, the particular spoken word is flagged (i.e., classified as sensitive information)).

As per claim 21, claim 1 is incorporated and the modified Abdi discloses: wherein the linguistic characteristics comprise phonetic characteristics (Abdi, Figs. 2-3 and col. 6 lines 20-22, wherein the linguistic characteristics of the sensitive information comprise phonetic characteristics because phonetics or speech sounds are still being used (see 204 and 208 of Fig. 2 compared to 304 and 308 of Fig. 3)).

As per claim 22, claim 1 is incorporated and the modified Abdi discloses: wherein the linguistic characteristics comprise syntactic characteristics (Abdi, Figs. 2-3 and col. 6 lines 20-22, wherein the linguistic characteristics of the sensitive information comprise syntactic characteristics because rules of language are still being used (see 204 and 208 of Fig. 2 compared to 304 and 308 of Fig. 3)).

As per claim 23, claim 1 is incorporated and the modified Abdi discloses: wherein the linguistic characteristics comprise semantic characteristics (Abdi, Figs. 2-3 and col. 6 lines 20-22, wherein the linguistic characteristics of the sensitive information comprise semantic characteristics because the meaning in the language is still being used (see 204 and 208 of Fig. 2 compared to 304 and 308 of Fig. 3)).

Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Abdi in view of Newman and Bao, and further in view of Enuka et al. (US 20200050966 A1; hereinafter “Enuka”).
As per claim 8, claim 7 is incorporated and the modified Abdi does not disclose, however, Enuka teaches or suggests: wherein the method further comprises: receiving feedback related to an accuracy of the sensitive information classification (Enuka, [0165], “upon displaying predictive results relating to the training data at step 835, the user may review the results and provide feedback 840 (e.g., reject one or more of the results). The user feedback may then be provided to the machine learning model,” [0035]-[0037], updating personal information rules in a rules database (i.e., sensitive information database), [0188], where rules are stored in rules database); and 
updating, based on the feedback, the sensitive information database (Enuka, [0165], [0057]-[0058], update personal information rules based on feedback from user which affects scoring of personal information, [0036], “the system may be adapted to allow users to manually create and/or update personal information rules”).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify/combine the teachings of the modified Abdi to include updating, based on feedback, a sensitive information database as taught or suggested by Enuka for the benefit of providing an organized inventory of personal information, indexed by attribute, to facilitate the management of data risk and customer privacy (Enuka, [0005]).

Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Abdi in view of Newman and Bao, and further in view of Coen et al. (US 20160203319 A1; hereinafter “Coen”).
As per claim 10, claim 9 is incorporated and the modified Abdi does not disclose, however, Coen teaches or suggests: wherein the method further comprises: metering a usage of the software; and generating an invoice based on metering the usage (Coen, [0245], “invoices are based upon the usage metered across the applications and services used across the platform”).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify/combine the teachings of the modified Abdi to include generating an invoice based on metering a usage of a software as taught or suggested by Coen for the benefit of effectively and seamlessly improving the customer's security posture in a manner that meshes with the cloud philosophies of rapid deployment, virtualization, and pay-per-use (Coen, [0008]).

Claims 24-25 are rejected under 35 U.S.C. 103 as being unpatentable over Abdi in view of Newman and Bao, and further in view of Joshi et al. (US 20200125746 A1; hereinafter “Joshi”).
As per claim 24, claim 6 is incorporated and the modified Abdi does not disclose, however, Joshi teaches or suggests: wherein the indicator further includes an explanation of the sensitive information classification, wherein the explanation relates to a sensitivity score generated by a sensitivity score model above a sensitivity score threshold (Joshi, [0047]-[0050] and [0162], sensitivity data discovery process (i.e., model) assigns confidence score of a data object and if confidence score is above a threshold, an operation is performed including notifying/explaining to a user regarding the detected sensitive data).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify/combine the teachings of the modified Abdi to include explaining the sensitive classification related to a score above a threshold as taught or suggested by Joshi for the benefit of automatically discovering and protecting sensitive data (Joshi, [0008]).

As per claim 25, claim 24 is incorporated and the modified Abdi does not disclose, however, Joshi teaches or suggests: wherein the method further comprises: receiving feedback related to an accuracy of the sensitive information classification; and updating, based on the feedback, the sensitivity score model (Joshi, [0080] and [0237], user provides feedback to the machine-learning data discovery process/model as more examples are labeled to improve accuracy of discovering sensitive information).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to modify/combine the teachings of the modified Abdi to include providing feedback to the machine-learning model as taught or suggested by Joshi for the benefit of improving the accuracy of discovering sensitive information (Joshi, [0080] and [0237]).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited for a listing of analogous art.
Fox et al. (US 20200074108 A1) teaches retaining the syntax of sensitive data without retaining the sensitive data itself in a clean copy ([0040]).

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALEXANDER R LAPIAN whose telephone number is (571)272-7552. The examiner can normally be reached M-F, 9:30 AM-6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kristine Kincaid, can be reached at 571-272-4063. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

ALEXANDER R. LAPIAN
Examiner
Art Unit 2437



/ALEXANDER R LAPIAN/Examiner, Art Unit 2437  

/KRISTINE L KINCAID/Supervisory Patent Examiner, Art Unit 2437