DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, 16/008,765, was filed on 06/14/2018, and claims foreign priority from Chinese Application 201710457967.6, filed 06/16/2017. The effective filing date is after the AIA date of March 16, 2013, and so the application is being examined under the “first inventor to file” provisions of the AIA.
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Status of the Application
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on Dec. 10, 2021 has been entered.
This Non-Final Office Action is in response to Applicant’s most recent communication of Dec. 10, 2021.
Claims 1, 2, 8, 9, 11-13, 19-20, and 55-57 are pending, of which claims 1, 11, and 12 are independent.
In the most recent response, independent claims 1, 11, and 12 have been amended.
Claims 3-7, 10, 14-18, and 21-54 were previously cancelled.  
All pending claims have been examined on the merits.  

Claim Rejections - 35 USC § 103
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: 
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 2, 8, 9, 11-13, 19-20, and 55-57 are rejected under 35 U.S.C. 103 as being unpatentable over US 2018/0068317 A1 to Gilbey et al. (“Gilbey”, Eff. Filed on Sept. 7, 2016. Published on Mar. 8, 2018) in view of US 2017/0160813 A1 to Divakaran et al. (“Divakaran”, Filed on Oct. 24, 2016. Published on Nov. 22, 2018), in view of US 2002/0010581 A1 to Euler et al. (“Euler”, Eff. Filed Jun. 19, 2000. Published Jan. 24, 2002), and further in view of US 2014/0379354 A1 to Zhang et al. (“Zhang”, Eff. Filed on Jun. 20, 2013. Published on Dec. 25, 2014).
In regards to claim 1, Gilbey discloses: 
1.    (Currently Amended) A payment method, wherein the method comprises:
providing specified information to a user via a user interface of an application, wherein the specified information is requesting the user to speak a user-defined payment keyword previously defined by the user during a registration process;

(See Gilbey, para. [0045]: “The microphone 115 is for receiving a user's verbal instructions, which can include instructions for carrying out the transaction. The processing components 110 can be configured to determine a nature of the transaction, and in parallel, to either communicate with a voice biometric authentication server (via the transceiver 112) to authenticate the user's voice data providing the verbal transaction instructions, or to perform such authentication on the device 100 itself. There can be a voice authentication module 92 for performing the authentication locally on the device 100.”)

When the authentication is carried out on the voice biometric server 12, there is transmitting, to a voice biometric authentication server 12 (shown in FIG. 3), the user's voice data (412). The authentication of the user's voice data at the voice biometric authentication server 12 can be carried out by assessment of at least one portion of the user's voice data (414). It should be appreciated that positive authentication of the user is to confirm the identity of the user such that the user is authorised to carry out a desired transaction(s).”)

(See Gilbey, para. [0057]: “Subsequently, there is receiving, at the processor 110, an authentication result of the user's voice data (416).”)

The Examiner interprets that the authentication of user’s voice data inherently requires a stored sample of the user’s voice.  Moreover, Gilbey’s disclosure that “The processing components 110 can be configured to determine a nature of the transaction, and in parallel, … to authenticate the user's voice data” suggests that the stored sample of the user’s voice is one of the predefined purchase commands recited in Gilbey, para. [0046].

Moreover, while Gilbey does not expressly disclose who is responsible for “predefining” the “predefined words” in Gilbey’s para. [0046], it would have been obvious that the user’s voice sample used for authentication would also be one or more of the “predefined words” listed in Gilbey’s para. [0046], as selected by the user. That way, a user’s purchase command can be used “in parallel” both to determine a purchase transaction (as disclosed in Gilbey’s para. [0045]) and for authentication.

receiving a spoken payment instruction of the user via the user interface of the application, wherein the spoken payment instruction includes the user-defined payment keyword;

(See Gilbey, para. [0045]: “The microphone 115 is for receiving a user's verbal instructions, which can include instructions for carrying out the transaction. The processing components 110 can be configured to determine a nature of the transaction, and in parallel, to either communicate with a voice biometric authentication server (via the transceiver 112) to authenticate the user's voice data providing the verbal transaction instructions, or to perform such authentication on the device 100 itself. There can be a voice authentication module 92 for performing the authentication locally on the device 100.”)

(See Gilbey, para. [0046]: “Determining the nature of the transaction can include assessing, using, for example, a known speech to text conversion methodology before determining text of the verbal instructions, at least one parameter such as, for example, identity of merchant, type of goods, type of services, time of delivery, keywords of the transaction instructions and so forth. The keywords can be predefined words for either carrying out particular tasks or providing particular information, such as, for example, “buy”, “purchase”, “transfer”, “deliver to”, “pre-order”, “confirm”, “gift to”, “expedite delivery” and so forth. It should be appreciated that the user can be interfacing with an intelligent assistant 94 integrated with the mobile device 100 when providing the verbal transaction instructions. This is possible due to the ability of the intelligent assistant 94 to interface with software applications installed/running on the mobile device 100.”)

generating audio information according to 
(i) the spoken payment instruction and 
(ii) a voice input that is made by the user based on the specified information;

(See Gilbey, para. [0045]: “The microphone 115 is for receiving a user's verbal instructions, which can include instructions for carrying out the transaction. The processing components 110 can be configured to determine a nature of the transaction, and in parallel, to either communicate with a voice biometric authentication server (via the transceiver 112) to authenticate the user's voice data providing the verbal transaction instructions, or to perform such authentication on the device 100 itself. There can be a voice authentication module 92 for performing the authentication locally on the device 100.”)

determining whether the user-defined payment keyword in the voice input of the user is the same as a preset keyword; 
obtaining payment object information by:
performing voice recognition on the audio information, to obtain payment object description information,
querying a locally stored object information set according to the payment object description information, to obtain at least one piece of payment object information,

(See Gilbey, para. [0045]: “The microphone 115 is for receiving a user's verbal instructions, which can include instructions for carrying out the transaction. The processing components 110 can be configured to determine a nature of the transaction, and in parallel, to either communicate with a voice biometric authentication server (via the transceiver 112) to authenticate the user's voice data providing the verbal transaction instructions, or to perform such authentication on the device 100 itself. There can be a voice authentication module 92 for performing the authentication locally on the device 100.”)

However, under a conservative interpretation of Gilbey, it does not expressly teach the following features, which are taught by Divakaran:
generating, according to the audio information, 
a feature matrix comprising at least one feature of audio data in the audio information, the at least one feature of the audio data comprising at least one of frequency data or amplitude data;

(See Divakaran, Fig. 22 and para. [0274]: “In implementations in which one or more participant's vocal features (e.g., non-speech features and/or paralinguistics such as voice pitch, speech tone, energy level, and OpenEars features) are analyzed, the vocal feature recognizer 2244 can extract and classify the sound, language, and/or acoustic features from the inputs 2228, 2230. In some implementations, voice recognition algorithms may use Mel-frequency cepstral coefficients to identify the speaker of particular vocal features. Language recognition algorithms may use shifted delta cepstrum coefficients and/or other types of transforms (e.g., cepstrum plus deltas, ProPol 5th order polynomial transformations, dimensionality reduction, vector quantization, etc.) to analyze the vocal features. To classify the vocal features, SVMs and/or other modeling techniques (e.g., GMM-UBM Eigenchannel, Euclidean distance metrics, etc.) may be used. In some implementations, a combination of multiple modeling approaches may be used, the results of which are combined and fused using, for example, logistic regression calibration. In this way, the vocal feature recognizer 2244 can recognize vocal cues including indications of, for example, excitement, confusion, frustration, happiness, calmness, agitation, and the like.”)

(See Divakaran, Fig. 4 and para. [0206]: “In various implementations, the spoken command analyzer 1800 may use a single model to both identify a speaker and to determine what the speaker has said. A “joint” or “combined” speaker and content model models both person-specific and command-specific acoustic properties of a person's speech. The joint speaker and content model can be implemented using, for example, a phonetic model or a i-vector. An i-vector is a compact representation of a speaker's utterance. In various implementations, an i-vector for a short phrase (e.g. one lasting two to five seconds or two to three seconds) can be extracted from training data obtained either during an explicit enrollment process or passively collected as a person speaks while operating a device that includes the spoken command analyzer 1800. I-vector extraction can result in both text identification and speaker identification information being included in the i-vector. I-vectors allows for comparison between similarly constructed i-vectors extract from later-entered speech input.”)

The Examiner interprets a vector as being a simplified version of a matrix.  Moreover, Divakaran also discloses the use of matrices in the context of video recognition:

(See also Divakaran, Fig. 19 and para. [0234]: “The example video event mode 1914 can include one or more computer-accessible data and/or programming structures (e.g., vectors, matrices, databases, lookup tables, or the like), and may include one or more indexed or otherwise searchable stores of information. The video event model 1914 may contain or reference data, arguments, parameters, and/or machine-executable algorithms that can be applied to input images being classified by the video classification system 1912.”)
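
For illustration only, the claimed “feature matrix comprising ... frequency data or amplitude data” can be pictured with the following minimal sketch. It is not taken from Divakaran or from the application; the frame length, hop size, naive DFT, and the function names `frame_signal`, `dft_magnitudes`, and `feature_matrix` are all assumptions made for this example. Each row of the resulting matrix holds the dominant frequency and its amplitude for one frame of audio.

```python
import cmath
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a tail shorter than frame_len is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def dft_magnitudes(frame):
    """Naive discrete Fourier transform; returns amplitudes for bins 0..N/2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def feature_matrix(samples, sample_rate, frame_len=64, hop=32):
    """Hypothetical feature matrix: one row per frame, [dominant frequency (Hz), amplitude]."""
    rows = []
    for frame in frame_signal(samples, frame_len, hop):
        mags = dft_magnitudes(frame)
        # Skip bin 0 (the DC component) when looking for the dominant frequency.
        k = max(range(1, len(mags)), key=mags.__getitem__)
        rows.append([k * sample_rate / frame_len, mags[k]])
    return rows

# Synthetic input: a pure 1000 Hz tone sampled at 8000 Hz (an assumption for the example).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
matrix = feature_matrix(tone, sample_rate)
```

A production system would more likely use Mel-frequency cepstral coefficients, as Divakaran para. [0274] mentions; the sketch only shows how frequency and amplitude data can be arranged as rows of a matrix.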

authenticating an identity of the user by: 
performing matching between the voice feature vector and a user feature vector; and

(See Divakaran, Fig. 18 and para. [0204]: “In various implementations, the voice biometric information can be used to identify a specific speaker. For example, the spoken command analyzer 1800 can use the voice biometric information to determine that an input phrase was spoken by John rather than by Sam. In some implementations, the speaker's identity can be used to authenticate the speaker. For example, the speaker's identity can be used to determine whether the speaker is authorized to issue a specific instruction. In some cases, a voice-driven system may be configured to only allow particular people to issue some instructions (e.g., “unlock my car”). In other cases, the system may be configured to allow broad categories of people (e.g., adults only) to issue some instructions, while other instructions can be issued by anyone. In most cases, the spoken command analyzer 1800 can identify and authenticate the speaker from the same speech sample that contains the instruction. The spoken command analyzer 1800 may provide the speaker's identification information and the content of the speaker's input to other devices or systems, which may be configured to authenticate the speaker and/or to execute the speaker's instructions.”)
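
For illustration only, “performing matching between the voice feature vector and a user feature vector” can be sketched as a similarity comparison against a threshold. The cosine similarity measure, the 0.8 threshold, and the names `cosine_similarity` and `is_same_speaker` are assumptions made for this example; neither the claims nor Divakaran specify them.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def is_same_speaker(voice_vec, user_vec, threshold=0.8):
    """Hypothetical matching rule: authenticate when similarity meets the threshold."""
    return cosine_similarity(voice_vec, user_vec) >= threshold

# Made-up vectors: an enrolled user vector, a close sample, and a dissimilar sample.
enrolled = [0.9, 0.1, 0.4]
same_user = [0.85, 0.15, 0.38]
other_user = [0.1, 0.9, -0.3]

match_same = is_same_speaker(same_user, enrolled)
match_other = is_same_speaker(other_user, enrolled)
```

Other distance measures (for example, the Euclidean distance metrics named in Divakaran para. [0274]) could be substituted without changing the overall structure.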

It would have been obvious to a person having ordinary skill in the art (PHOSITA), at the effective filing date of the Application, to combine the method for authenticating and identifying a voice command for a purchase transaction, as taught by Gilbey, with the authentication using voice biometric information for a “virtual personal assistant”, as taught by Divakaran above, because both are directed to the same art of user authentication by voice recognition.  
However, under a conservative interpretation of Gilbey and Divakaran, they do not expressly teach the following features, which are taught by Euler:
performing dimension reduction processing on the feature matrix by:
inputting the feature matrix and multiple feature dimensions for a voice feature vector of the audio information into a neural network, and
obtaining, from the neural network, multiple dimension values representing the multiple feature dimensions based on the feature matrix;

(See Euler, para. [0018]-[0019]: “FIG. 1 shows a block diagram of a developed voice recognition device and a corresponding method, respectively, in a two-channeled embodiment, i.e., including two input signals y1 and y2. Using known methods of extracting features, e.g. MFCC, feature vectors O1 and O2 are separately acquired per channel from input signals y1 and y2. A new sequence of transformed feature vectors is formed from the sequence of these feature vectors by a preferably linear operation according to the relationship:

$$ O_t(l) = T \cdot \begin{bmatrix} O_1(l) \\ O_2(l) \end{bmatrix} \qquad (1) $$

The matrix operation is performed for every signal block in a reduced time cycle 1. The dimension of matrix T is accordingly selected to cause a reduction in the dimension. If both feature vectors U1 and U2 possess n1 and/or n2 components, respectively, and if the transformed feature vector is only to include nt coefficients, matrix T must have dimension nt times (n1+n2). A typical numerical example is n1=39, n2=39, and nt=32. Then transformation matrix T has the dimension 32*78, and the transformation results in the dimension being reduced from a total of 78 components in feature vectors O1 and O2 to 32 components in transformed featured vector Ot.”)
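
For illustration only, the dimension reduction quoted above from Euler can be sketched with the numerical example given there (n1 = 39, n2 = 39, nt = 32). The placeholder matrix T below simply selects the first 32 components and is an assumption made for this example; Euler instead derives T from sample data, for example by linear discriminant analysis or a Karhunen-Loève transform. The helper name `mat_vec` is likewise hypothetical.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

n1, n2, nt = 39, 39, 32

# Made-up per-channel feature vectors O1 and O2 for one signal block.
O1 = [float(i) for i in range(n1)]
O2 = [float(-i) for i in range(n2)]

# Stack the two vectors into a single column of n1 + n2 = 78 components.
stacked = O1 + O2

# Placeholder T with dimension nt x (n1 + n2) = 32 x 78.
# Assumption: it keeps the first nt components; a learned T would mix all 78.
T = [[1.0 if col == row else 0.0 for col in range(n1 + n2)] for row in range(nt)]

# Transformed feature vector Ot with nt = 32 components.
Ot = mat_vec(T, stacked)
```

The point of the sketch is only the shape bookkeeping: a 32 x 78 matrix applied to a 78-component stacked vector yields a 32-component transformed feature vector.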

(See also Euler, para. [0011]: “One advantageous embodiment of the voice recognition device is that the transformation device is a linear transformation device. In this context, suitable measures are that the transformation device is designed to perform a linear discriminant analysis (LDA) or a Karhunen-Loève transform.”)

(See also Euler para. [0013]: “There are also expansions of the LDA that can be used here. Moreover, it is conceivable to select non-linear transformation devices (e.g. so-called “neural networks”). These methods have in common that sample data is required for the design.”)

generating the voice feature vector of the audio information by arranging the multiple dimension values according to a specified order;

(See Euler, para. [0020]: “Based on sample data, transformation matrix T is adjusted so that transformed feature vector Ot has the maximum amount of information for differentiating the individual classes. For this purpose, it is possible to use the known methods of linear discriminant analysis or the Karhunen-Loève transform. Transformed feature vectors Ot(l) are used for training classification unit KL.”)

It would have been obvious to a person having ordinary skill in the art (PHOSITA), at the effective filing date of the Application, to combine the method for authenticating and identifying a voice command for a purchase transaction, as taught by Gilbey, and the use of voice biometric information for a “virtual personal assistant”, as taught by Divakaran above (see para. [0004]: “In some implementations, types of information included in sensory input can include speech information, …”), with the dimension reduction processing taught by Euler above, because Euler’s disclosure is also directed to a “voice recognition device”, and therefore the references are in the same field of endeavor, and because Euler “provide[s] a voice recognition device requiring the lowest expenditure possible with respect to its design and processing performance for the highest possible rate of recognition.” (See Euler para. [0006]).  
However, under a conservative interpretation of Gilbey, Divakaran, and Euler, they do not expressly teach the following features, which are expressly taught by Zhang:
feeding back the at least one piece of payment object information to the user, and

(See Zhang, para. [0112]: “To summarize, FIG. 3 illustrates a payment validation method with the following benefits: by receiving a payment validation request transmitted from a terminal (120), the current voice characteristics associated to the identity information and the text password in the current voice signal. If it is detected that the identification information in the payment validation request is identical to or the same as the pre-stored identification information, the server (140) may transmit a validation reply information to the terminal (120) to authorize for payment transaction, after successfully matching the current voice characteristics and the pre-stored speaker model.”)

receiving a confirmation instruction of the user via the user interface of the application, and using payment object information pointed to by the confirmation instruction as the obtained payment object information; and

(See Zhang, para. [0113]: “The illustrated method replaces the step of generation of SMS validation messages by the server (140) by matching of the current voice signal to the pre-stored speaker model. In effect, the illustrated payment validation method has at least eliminated the extra steps in the prior art method, which requires separately generating a SMS by the server (140) to send to the terminal (120) to be entered by the user for further security verification. Thus the current invention has reduced the operating cost by simplifying the payment validation process using the unique identity of the user (i.e., the voice signature) during the validating process. In addition, the user experience is enhanced with reduced operations as required by the user.”)

(The Examiner notes that Zhang describes Zhang’s invention as being an improvement over prior art that includes this claimed step.)

when the authenticating succeeds based on a successful matching of the voice feature vector and the user feature vector and the user-defined payment keyword in the voice input of the user is the same as the preset keyword, 

(See Zhang, para. [0017]-[0020]: “detecting whether the identification information is identical to a pre-stored identification information; if identical: extracting voice characteristics associated with an identity information and a text password from the current voice signal; matching the current voice characteristics to a pre-stored speaker model; if successfully matched: sending by the server, an validation reply message to the terminal to indicate that payment request has been authorized, wherein wherein the identity information identifies an owner of the current voice signal, and the text password is a password indicated by the current voice signal.”)

sending the payment object information and personal information associated with the user feature vector to a server that performs a payment operation for the payment object information according to a resource account associated with the personal information.

(See Zhang, para. [0071]: “In step 201, the terminal (120) receives identification information input by a user. The user may input relevant identification information as prompted by a payment application program installed on the terminal (120). The identification information may include the user's payment account number, user's name and user's password corresponding to the account number. Such information may have been already been registered ahead of time with the server (140) (belong to a financial institution or to a merchant), prior to processing an on-line payment transaction.”)
 
(See Zhang, para. [0073]: “The registration server here may be the same as or different from the payment server. If the registration server and the payment server are different, then the payment server must first pull the user's identification information from the registration server and take the identification information as the pre-stored identification information. The payment server here refers to the server (140) as shown in FIG. 1.”)
 
(See Zhang, para. [0077]: “In step 205, the server (140) may detect whether the identification information is identical to the pre-stored identification information. In the present embodiment, the server (140) may also communicate with a registration server (locally or remotely).”)
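
For illustration only, the two-part condition in the final limitations of claim 1 (successful vector matching and a spoken keyword equal to the preset keyword) can be sketched as a simple gate before anything is sent to the server. The function name `build_payment_request`, the dictionary fields, and the case-insensitive keyword comparison are assumptions made for this example and are not drawn from Zhang or from the application.

```python
def build_payment_request(speaker_matched, spoken_keyword, preset_keyword,
                          payment_object, personal_info):
    """Return the message to send to the payment server, or None if either check fails.

    Assumption: keywords are compared after trimming whitespace and lowercasing.
    """
    keyword_ok = spoken_keyword.strip().lower() == preset_keyword.strip().lower()
    if not (speaker_matched and keyword_ok):
        return None
    return {"payment_object": payment_object, "personal_info": personal_info}

# Both conditions satisfied: a request is produced.
approved = build_payment_request(True, "Open Sesame", "open sesame",
                                 {"item": "coffee"}, {"user_id": "u-001"})

# Voice matched but the keyword differs: nothing is sent.
wrong_keyword = build_payment_request(True, "hello", "open sesame",
                                      {"item": "coffee"}, {"user_id": "u-001"})

# Keyword correct but the voice did not match: nothing is sent.
wrong_speaker = build_payment_request(False, "open sesame", "open sesame",
                                      {"item": "coffee"}, {"user_id": "u-001"})
```

The gate mirrors the order of checks in Zhang para. [0017]-[0020], where a validation reply is sent only after both the identification information and the voice characteristics have been matched.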

It would have been obvious to a person having ordinary skill in the art (PHOSITA), at the effective filing date of the Application, to combine the method for authenticating and identifying a voice command for a purchase transaction, as taught by Gilbey, and the use of voice biometric information for a “virtual personal assistant”, as taught by Divakaran, with the “voice recognition device”, neural network, and matrices taught by Euler above, and to further include Zhang’s disclosure of “voice matching”, “password matching”, and “confirmation message sending”, because the references are all directed to a “voice recognition device”, and because Zhang discloses the benefit of “significantly enhance[ing] the security of online payment.” (See Macho para. [0004]).
In regards to claim 2, Divakaran discloses: 
2.    (Currently Amended) The method according to claim 1, wherein before the receiving the spoken payment instruction of the user, the method further comprises: obtaining an enabling instruction from the audio information in the voice input of the user.

(See Divakaran, Fig. 18 and para. [0204]: “In various implementations, the voice biometric information can be used to identify a specific speaker. For example, the spoken command analyzer 1800 can use the voice biometric information to determine that an input phrase was spoken by John rather than by Sam. In some implementations, the speaker's identity can be used to authenticate the speaker. For example, The spoken command analyzer 1800 may provide the speaker's identification information and the content of the speaker's input to other devices or systems, which may be configured to authenticate the speaker and/or to execute the speaker's instructions.”)

In regards to claims 3-7, they are cancelled.
In regards to claim 8, it is recited in the alternative. Divakaran discloses at least the highlighted feature: 
8.    (Currently Amended) The method according to claim 1, wherein the step of generating a voice feature vector comprises at least one of the following:
selecting any piece of audio information of the user, to generate the voice feature vector.

(See Divakaran, Fig. 4 and para. [0206]: “In various implementations, the spoken command analyzer 1800 may use a single model to both identify a speaker and to determine what the speaker has said. A “joint” or “combined” speaker and content model models both person-specific and command-specific acoustic properties of a person's speech. The joint speaker and content model can be implemented using, for example, a phonetic model or a i-vector. An i-vector is a compact representation of a speaker's utterance. In various implementations, an i-vector for a short phrase (e.g. one lasting two to five seconds or two to three seconds) can be extracted from training data obtained either during an explicit enrollment process or passively collected as a person speaks while operating a device that includes the spoken command analyzer 1800. I-vector extraction can result in both text identification and speaker identification information being included in the i-vector. I-vectors allows for comparison between similarly constructed i-vectors extract from later-entered speech input.”)

In regards to claim 9, Divakaran discloses: 
9.    (Currently Amended) The method according to claim 8, wherein the sending further comprises:
performing voice recognition on the audio information used to generate the voice feature vector; determining whether a user voice conforms to the specified information; and

(See Divakaran, Fig. 18 and para. [0203]: “In various implementations, the spoken command analyzer 1800 can include a speech recognition component and a voice biometrics component. The speech recognition component can be used to, given a sample of human speech, analyze the sample and determine the content of the sample. For example, the speech recognition component can determine whether the person asked a question or issued a command. The voice biometrics component can be used to derive acoustical properties of the sample, such as frequencies or frequency ranges, pitch or pitch ranges, tone or tonal ranges, durations, volume or volume ranges, timbre, sonic texture, and/or spatial location(s) of the sample with respect to the point at which the sample was captured. …”)

In response to determining that the user voice conforms to the specified information, sending the payment object information and the personal information associated with the user feature vector to the server.

(See Divakaran, Fig. 18 and para. [0204]: “In various implementations, the voice biometric information can be used to identify a specific speaker. For example, the spoken command analyzer 1800 can use the voice biometric information to determine that an input phrase was spoken by John rather than by Sam. In some implementations, the speaker's identity can be used to authenticate the speaker. For example, the speaker's identity can be used to determine whether the speaker is authorized to issue a specific instruction. In some cases, a voice-driven system may be configured to only allow particular people to issue some instructions (e.g., “unlock my car”). In other cases, the system may be configured to allow broad categories of people (e.g., adults only) to issue some instructions, while other instructions can be issued by anyone. In most cases, the spoken command analyzer 1800 can identify and authenticate the speaker from the same speech sample that contains the instruction. The spoken command analyzer 1800 may provide the speaker's identification information and the content of the speaker's input to other devices or systems, which may be configured to authenticate the speaker and/or to execute the speaker's instructions.”)
(See Divakaran, Fig. 4 and para. [0091]: “The web services 442 can, in various implementations, integrate services provided be websites or other networked resources. For example, the web-services can implement a client-server protocol to interface with services provided by remote servers. These services can be used to fulfil a user's intent. For example, when the domain is a map application, the web services 442 can be used to access publically-available maps and address books. In some implementations, the web services 442 can be used for information retrieval and/or to perform transactions on behalf of a device's user. For example, a user can request that the virtual personal assistant system 400 buy a particular product, and the virtual personal assistant system 400 can, using the web services 442, find the product and place the order.”)

In regards to claim 10, it is cancelled. 
In regards to claim 11, it is rejected on the same grounds as claim 1. 
In regards to claim 12, it is rejected on the same grounds as claim 1, except for the following features: 
12.    (Currently Amended) A system, comprising:

a hardware processor; and 
a non-transitory machine-readable storage medium encoded with instructions executable by the hardware processor to perform a operations comprising:

when the authenticating succeeds based on a successful matching of the voice feature vector and the user feature vector and the user-defined payment keyword in the voice input of the user is the same as the preset keyword, 
sending the payment object information and at least one of the audio information in the voice input of the user, the feature matrix generated according to the audio information, or the voice feature vector generated according to the audio information to a server that determines personal information of the user according to the audio information, the feature matrix, or the voice feature vector, to perform a payment operation for the payment object information according to a resource account associated with the personal information.

These features are rejected on the same grounds as claims 8 and 9.
In regards to claims 13 and 19-20, they are respectively rejected on the same grounds as claims 2 and 8-9. 
In regards to claims 21-54, they are cancelled. 
In regards to claim 55, it is rejected on the same grounds as claim 2.
In regards to claim 56, it is rejected on the same grounds as claim 8.
In regards to claim 57, it is rejected on the same grounds as claims 8 and 9.

Response to Arguments
Claim Rejections - 35 USC § 101
Applicant’s previous amendments to the independent claims 1, 11, and 12 overcame the 35 USC § 101 grounds of rejection.  
More specifically, the Examiner held that the independent claims, which recite a method of user authentication, are similar to Example 41 “Cryptographic Communications” in the 2019 Revised Patent Subject Matter Eligibility Guidance (“2019 PEG”), as can be found at pages 14-16 of the following PDF file: https://www.uspto.gov/sites/default/files/documents/101_examples_37to42_20190107.pdf
Example 41 was found to be patent eligible for the following reasons: 
In particular, the combination of additional elements use the mathematical formulas and calculations in a specific manner that sufficiently limits the use of the mathematical concepts to the practical application of transmitting the ciphertext word signal to a computer terminal over a communication channel. Thus, the mathematical concepts are integrated into a process that secures private network communications, so that a ciphertext word signal can be transmitted between computers of people who do not know each other or who have not shared a private key between them in advance of the message being transmitted, where the security of the cipher relies on the difficulty of factoring large integers by computers. Thus, the claim is not directed to the recited judicial exception, and the claim is eligible.

The Applicant’s arguments presented in regards to the patent eligibility of Example 41 are applicable to independent claims 1, 11, and 12 of the present application, as follows: 
In particular, the combination of additional elements performs “authentication” in a practical and specific manner: “when the authenticating succeeds based on a successful matching of the voice feature vector and the user feature vector and the user-defined payment keyword in the voice input of the user is the same as the preset 
These practical and technological features sufficiently limit the use of the business method concept of “sending the payment object information and personal information associated with the user feature vector to a server that performs a payment operation for the payment object information according to a resource account associated with the personal information”. 
Thus, the business method concepts are integrated into a process that performs authentication based on specific technological techniques of “matching of the voice feature vector and the user feature vector” and “matching the user-defined payment keyword in the voice input of the user” to a stored “preset keyword”, so that a payment operation can be transmitted, wherein the authentication relies on the matching of voice data and keywords, and wherein the “voice feature vector” is generated by “arranging the multiple dimension values according to a specified order”, and wherein “the multiple dimension values” are obtained from a neural network. Thus, the claim is not directed to an abstract idea, and the claim is patent eligible. 

Claim Rejections - 35 USC § 103
Applicant’s amendments to the independent claims 1, 11, and 12 have necessitated the new 35 USC § 103 grounds of rejection.  
Some of the features previously rejected in view of Zhang are now rejected in view of Gilbey.  These changes in the rejections render Applicant’s arguments pertaining to the Zhang reference moot. 

Conclusion
Applicants are invited to contact the Office to schedule an in-person interview to discuss and resolve the issues set forth in this Office Action.  Although an interview is not required, the Office believes that an interview can be of use to resolve any issues related to a patent application in an efficient and prompt manner.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
Any inquiry concerning this communication or earlier communications should be directed to Examiner Ayal Sharon, whose telephone number is (571) 272-5614, and fax number is (571) 273-1794.  The Examiner can normally be reached from Monday to Friday between 9 AM and 6 PM.  If attempts to 
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Sincerely,

/Ayal I. Sharon/
Examiner, Art Unit 3695

March 26, 2022