Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to amendments and remarks filed on 3/09/2022. In the current amendments, claims 1-4, 11-14, 16, and 18-20 are amended. Claims 1-20 are pending and have been examined.
In response to the amendments and remarks filed on 3/09/2022, the objections to the Abstract put forth in the previous Office Action have been withdrawn.
In response to the amendments and remarks filed on 3/09/2022, the 35 U.S.C. 112(a) rejection of claims 18-19, the 35 U.S.C. 112(b) rejection of claims 1-20, and the 35 U.S.C. 101 rejection of claims 11-12 made in the previous Office Action have been withdrawn.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1, 10, 13-18, and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1,
Claim 1 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 1 is directed to a method, which is directed to a process, one of the statutory categories. See MPEP 2106.03.
Step 2A Prong One Analysis: The claim recites a method for an artificial reality system. Each of the following limitation(s):
wherein the one or more DL-feature representations are extracted from a region of interest detected… within a first image of a real-world environment captured by the client computing device, region of interest comprising a first depiction of a real-world object;
wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch; 
identifying a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space; 
determining, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets; 
selecting an AR effect associated with the determined matching AR target 
  
as drafted, claim 1 is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including observation, evaluation, judgement, opinion)) but for the generic computer components language and insignificant extra-solution activity. The above limitations in the context of this claim encompass wherein the one or more DL-feature representations are extracted from a region of interest detected… within a first image of a real-world environment captured by the client computing device, region of interest comprising a first depiction of a real-world object (corresponds to evaluation). Further, the claim encompasses wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch (corresponds to evaluation). Further, the claim encompasses identifying a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space (corresponds to observation and judgement). Further, the claim encompasses determining, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets (corresponds to evaluation and judgement).
Further, the claim encompasses selecting an AR effect associated with the determined matching AR target (corresponds to evaluation and judgement).
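For illustration only, the two-stage comparison recited in the limitations above (a coarse match of DL-feature representations within a threshold region of a vector space, followed by a comparison of local-feature descriptors among the resulting candidates, followed by selection of the associated AR effect) can be sketched in Python. This is a minimal sketch and not the applicant's implementation. The function names, the use of Euclidean distance for the threshold region, the representation of local-feature descriptors as binary codes compared by Hamming distance, and the `max_bits` tolerance are all assumptions introduced here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def coarse_match(query_dl, stored_dl, threshold):
    """Stage 1: IDs of AR targets whose stored DL-feature vector lies
    within `threshold` of the received DL-feature vector (assumed metric)."""
    return [tid for tid, vec in stored_dl.items()
            if euclidean(query_dl, vec) <= threshold]


def hamming(a, b):
    """Number of differing bits between two integer-coded descriptors."""
    return bin(a ^ b).count("1")


def fine_match(query_desc, stored_desc, candidates, max_bits=2):
    """Stage 2: among the candidates, pick the target with the most received
    local descriptors that have a stored descriptor within `max_bits` bits."""
    best_id, best_score = None, 0
    for tid in candidates:
        score = sum(
            1 for q in query_desc
            if any(hamming(q, s) <= max_bits for s in stored_desc[tid])
        )
        if score > best_score:
            best_id, best_score = tid, score
    return best_id


def select_effect(query_dl, query_desc, stored_dl, stored_desc, effects, threshold):
    """Run both stages and return the AR effect of the matching target, if any."""
    candidates = coarse_match(query_dl, stored_dl, threshold)
    target = fine_match(query_desc, stored_desc, candidates)
    return effects.get(target)


# Hypothetical data: three stored AR targets and one received query.
stored_dl = {"poster": [0.0, 0.0], "logo": [0.1, 0.0], "mug": [5.0, 5.0]}
stored_desc = {"poster": [0b11110000, 0b00001111],
               "logo": [0b10101010],
               "mug": [0b11111111]}
effects = {"poster": "confetti", "logo": "sparkle"}

effect = select_effect([0.05, 0.0], [0b11110001, 0b00001111],
                       stored_dl, stored_desc, effects, threshold=0.5)
```

In this hypothetical data, the coarse stage keeps "poster" and "logo" and discards "mug", and the descriptor stage then distinguishes between the two remaining candidates.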
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “a server” and “a client computing device”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Further, the limitations of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target”, as drafted, recite insignificant extra-solution activity because they relate to transmitting information for further processing. The insignificant extra-solution activity is recited at a high level of generality such that it amounts to no more than mere receiving and transmitting of data under MPEP 2106.05(d). Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to mere instructions to apply the exception. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The insignificant extra-solution activity of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target” is considered well known, routine, and conventional because of what is recited in MPEP 2106.05(d)(II): “The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity...i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”. Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to mere instructions to apply the exception. See MPEP 2106.05(f). Therefore, these additional elements do not amount to significantly more. The claim is not patent eligible.
Regarding claim 10,
Claim 10 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 10 is directed to a method, which is directed to a process, one of the statutory categories. See MPEP 2106.03.
Step 2A Prong One Analysis: The claim recites a method for an artificial reality system. Each of the following limitation(s):
wherein the comparison of the received one or more DL-feature representation with the plurality of stored DL-feature representations comprises a nearest-neighbor search.
  
as drafted, claim 10 is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including observation, evaluation, judgement, opinion)) but for the generic computer components language and insignificant extra-solution activity. The above limitation in the context of this claim encompasses wherein the comparison of the received one or more DL-feature representation with the plurality of stored DL-feature representations comprises a nearest-neighbor search (corresponds to evaluation).
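For illustration only, a nearest-neighbor search of the kind recited in this limitation can be sketched as a brute-force comparison of a received feature vector against each stored feature vector. This is a minimal sketch under stated assumptions and not the applicant's implementation: the function name `nearest_neighbors`, the parameter `k`, the choice of Euclidean distance, and the sample data are hypothetical.

```python
import heapq
import math


def nearest_neighbors(query, stored, k=1):
    """Return the k (distance, target_id) pairs closest to `query`,
    nearest first, by exhaustively measuring every stored vector."""
    scored = ((math.dist(query, vec), tid) for tid, vec in stored.items())
    return heapq.nsmallest(k, scored)


# Hypothetical stored DL-feature vectors keyed by AR target ID.
stored = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 0.0)}
closest_two = nearest_neighbors((0.0, 0.0), stored, k=2)
```

A production system would more likely use an approximate index than an exhaustive scan, but the exhaustive form shows the comparison itself most directly.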
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “a server” and “a client computing device”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Further, the limitations of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target”, as drafted, recite insignificant extra-solution activity because they relate to transmitting information for further processing. The insignificant extra-solution activity is recited at a high level of generality such that it amounts to no more than mere receiving and transmitting of data under MPEP 2106.05(d). Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to mere instructions to apply the exception. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The insignificant extra-solution activity of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target” is considered well known, routine, and conventional because of what is recited in MPEP 2106.05(d)(II): “The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity...i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”. Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to mere instructions to apply the exception. See MPEP 2106.05(f). Therefore, these additional elements do not amount to significantly more. The claim is not patent eligible.
Regarding claim 13,
Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 13 is directed to a method, which is directed to a process, one of the statutory categories. See MPEP 2106.03.
Step 2A Prong One Analysis: The claim recites a method for an artificial reality system. Each of the following limitation(s):
wherein the real-world object is continuously tracked in real-time.     
  
as drafted, claim 13 is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including observation, evaluation, judgement, opinion)) but for the generic computer components language and insignificant extra solution activity. The above limitation in the context of this claim encompasses wherein the real-world object is continuously tracked in real-time (corresponds to observation).
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “a server” and “a client computing device”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Further, the limitations of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target”, as drafted, recite insignificant extra-solution activity because they relate to transmitting information for further processing. The insignificant extra-solution activity is recited at a high level of generality such that it amounts to no more than mere receiving and transmitting of data under MPEP 2106.05(d). Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to mere instructions to apply the exception. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The insignificant extra-solution activity of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target” is considered well known, routine, and conventional because of what is recited in MPEP 2106.05(d)(II): “The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity...i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”. Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to mere instructions to apply the exception. See MPEP 2106.05(f). Therefore, these additional elements do not amount to significantly more. The claim is not patent eligible.
Regarding claim 14,
Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 14 is directed to a method, which is directed to a process, one of the statutory categories. See MPEP 2106.03.
Step 2A Prong One Analysis: Please see claim 1 analysis.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “a server” and “a client computing device”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Further, the limitations of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target”, as drafted, recite insignificant extra-solution activity because they relate to transmitting information for further processing. The insignificant extra-solution activity is recited at a high level of generality such that it amounts to no more than mere receiving and transmitting of data under MPEP 2106.05(d). Moreover, the recitations of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” and “wherein the AR effect is configured to scale itself based on a location and orientation of the client computing device” amount to mere instructions to apply the exception. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The insignificant extra-solution activity of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target” is considered well known, routine, and conventional because of what is recited in MPEP 2106.05(d)(II): “The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity...i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”. Moreover, the recitations of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” and “wherein the AR effect is configured to scale itself based on a location and orientation of the client computing device” amount to mere instructions to apply the exception. See MPEP 2106.05(f). Therefore, these additional elements do not amount to significantly more. The claim is not patent eligible.
Regarding claim 15,
Claim 15 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 15 is directed to a method, which is directed to a process, one of the statutory categories. See MPEP 2106.03.
Step 2A Prong One Analysis: Please see claim 1 analysis.
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “a server” and “a client computing device”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Further, the limitations of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target”, as drafted, recite insignificant extra-solution activity because they relate to transmitting information for further processing. The insignificant extra-solution activity is recited at a high level of generality such that it amounts to no more than mere receiving and transmitting of data under MPEP 2106.05(d). Moreover, the recitations of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” and “wherein the AR effect is a filter effect” amount to mere instructions to apply the exception. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The insignificant extra-solution activity of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target” is considered well known, routine, and conventional because of what is recited in MPEP 2106.05(d)(II): “The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity...i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”. Moreover, the recitations of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” and “wherein the AR effect is a filter effect” amount to mere instructions to apply the exception. See MPEP 2106.05(f). Therefore, these additional elements do not amount to significantly more. The claim is not patent eligible.
Regarding claim 16,
Claim 16 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 16 is directed to a method, which is directed to a process, one of the statutory categories. See MPEP 2106.03.
Step 2A Prong One Analysis: The claim recites a method for an artificial reality system. Each of the following limitation(s):
authorizing a user

as drafted, claim 16 is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including observation, evaluation, judgement, opinion)) but for the generic computer components language and insignificant extra solution activity. The above limitation in the context of this claim encompasses authorizing a user (corresponds to evaluation and judgement). 
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “a server” and “a client computing device”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Further, the limitations of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, “sending… the AR effect associated with the determined matching AR target”, and “receive the AR effect associated with the determined matching AR target based on information associated with the user”, as drafted, recite insignificant extra-solution activity because they relate to transmitting information for further processing. The insignificant extra-solution activity is recited at a high level of generality such that it amounts to no more than mere receiving and transmitting of data under MPEP 2106.05(d). Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to mere instructions to apply the exception. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The insignificant extra-solution activity of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, “sending… the AR effect associated with the determined matching AR target”, and “receive the AR effect associated with the determined matching AR target based on information associated with the user” is considered well known, routine, and conventional because of what is recited in MPEP 2106.05(d)(II): “The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity...i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”. Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to mere instructions to apply the exception. See MPEP 2106.05(f). Therefore, these additional elements do not amount to significantly more. The claim is not patent eligible.
Regarding claim 17,
Claim 17 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 17 is directed to a method, which is directed to a process, one of the statutory categories. See MPEP 2106.03.
Step 2A Prong One Analysis: The claim recites a method for an artificial reality system. Each of the following limitation(s):
wherein the information associated with the user comprises user affinity information, wherein the user affinity information comprises an affinity coefficient between the user and the AR effect

as drafted, claim 17 is a process that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including observation, evaluation, judgement, opinion)) but for the generic computer components language and insignificant extra solution activity. The above limitation in the context of this claim encompasses wherein the information associated with the user comprises user affinity information, wherein the user affinity information comprises an affinity coefficient between the user and the AR effect (corresponds to evaluation). 
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional element(s) of “a server” and “a client computing device”, as drafted, is reciting a generic computer component. The generic computer components in these steps are recited at a high-level of generality (i.e., as a generic computer component performing a generic computer function) such that it amounts no more than mere instructions to apply the exception using a generic computer component. Further, the limitations of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, “sending… the AR effect associated with the determined matching AR target”, and “receive the AR effect associated with the determined matching AR target based on information associated with the user” as drafted, are reciting insignificant extra solution activity because it relates to transmitting information for further process. The insignificant extra-solution activity are recited at a high level of generality such that it amounts no more than mere receiving and transmitting data under MPEP 2106.05(d). Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to mere instruction to apply the exception. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The insignificant extra-solution activity of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, “sending… the AR effect associated with the determined matching AR target”, and “receive the AR effect associated with the determined matching AR target based on information associated with the user” is considered well-understood, routine, and conventional in view of MPEP 2106.05(d)(II): “The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity...i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”. Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to a mere instruction to apply the exception. See MPEP 2106.05(f). Therefore, these additional elements do not amount to significantly more. The claim is not patent eligible.
Regarding claim 18,
Claim 18 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 18 is directed to a media, which is directed to a manufacture, one of the statutory categories. See MPEP 2106.03.
Step 2A Prong One Analysis: The claim recites a media for an artificial reality system. Each of the following limitation(s):
wherein the one or more DL-feature representations are extracted from a region of interest detected… within a first image of a real-world environment captured by the client computing device, region of interest comprising a first depiction of a real-world object; 
wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch; 
identifying a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space; 
determining, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets; 
selecting an AR effect associated with the determined matching AR target 
  
as drafted, claim 18 is a manufacture that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including observation, evaluation, judgement, opinion)) but for the generic computer components language and insignificant extra-solution activity. The above limitations in the context of this claim encompass wherein the one or more DL-feature representations are extracted from a region of interest detected… within a first image of a real-world environment captured by the client computing device, region of interest comprising a first depiction of a real-world object (corresponds to evaluation). Further, the claim encompasses wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch (corresponds to evaluation). Further, the claim encompasses identifying a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space (corresponds to observation and judgement). Further, the claim encompasses determining, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets (corresponds to evaluation and judgement). 
Further, the claim encompasses selecting an AR effect associated with the determined matching AR target (corresponds to evaluation and judgement).
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or that merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “One or more computer-readable non-transitory storage media embodying software that is operable when executed to” and “a client computing device”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Further, the limitations of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target”, as drafted, recite insignificant extra-solution activity because they relate to transmitting information for further processing. The insignificant extra-solution activity is recited at a high level of generality such that it amounts to no more than mere receiving and transmitting of data under MPEP 2106.05(d). Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to a mere instruction to apply the exception. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The insignificant extra-solution activity of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target” is considered well-understood, routine, and conventional in view of MPEP 2106.05(d)(II): “The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity...i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”. Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to a mere instruction to apply the exception. See MPEP 2106.05(f). Therefore, these additional elements do not amount to significantly more. The claim is not patent eligible.
Regarding claim 20,
Claim 20 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1 Analysis: Claim 20 is directed to a system, which is directed to a machine, one of the statutory categories. See MPEP 2106.03.
Step 2A Prong One Analysis: The claim recites a system for an artificial reality system. Each of the following limitation(s):
wherein the one or more DL-feature representations are extracted from a region of interest detected… within a first image of a real-world environment captured by the client computing device, region of interest comprising a first depiction of a real-world object; 
wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch; 
identifying a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space; 
determining, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets; 
selecting an AR effect associated with the determined matching AR target 
  
as drafted, claim 20 is a machine that, under its broadest reasonable interpretation, covers mental processes (concepts performed in the human mind (including observation, evaluation, judgement, opinion)) but for the generic computer components language and insignificant extra-solution activity. The above limitations in the context of this claim encompass wherein the one or more DL-feature representations are extracted from a region of interest detected… within a first image of a real-world environment captured by the client computing device, region of interest comprising a first depiction of a real-world object (corresponds to evaluation). Further, the claim encompasses wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch (corresponds to evaluation). Further, the claim encompasses identifying a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space (corresponds to observation and judgement). Further, the claim encompasses determining, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets (corresponds to evaluation and judgement). 
Further, the claim encompasses selecting an AR effect associated with the determined matching AR target (corresponds to evaluation and judgement).
Step 2A Prong Two Analysis: This judicial exception is not integrated into a practical application. In particular, the claim only recites additional elements that are mere instructions to implement an abstract idea on a computer, or that merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). The additional elements of “one or more processors”, “one or more computer-readable non-transitory storage media coupled to one or more of the processors and comprising instructions operable when executed by one or more of the processors to cause the system to”, and “a client computing device”, as drafted, recite generic computer components. The generic computer components in these steps are recited at a high level of generality (i.e., as a generic computer component performing a generic computer function) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Further, the limitations of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target”, as drafted, recite insignificant extra-solution activity because they relate to transmitting information for further processing. The insignificant extra-solution activity is recited at a high level of generality such that it amounts to no more than mere receiving and transmitting of data under MPEP 2106.05(d). Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to a mere instruction to apply the exception. See MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B Analysis: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional element of using generic computer components to perform the abstract idea amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The insignificant extra-solution activity of “receiving… one or more deep-learning (DL)-feature representations generated by a machine learning model”, “receiving… one or more local feature descriptors extracted from the region of interest within the first image”, and “sending… the AR effect associated with the determined matching AR target” is considered well-understood, routine, and conventional in view of MPEP 2106.05(d)(II): “The courts have recognized the following computer functions as well‐understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity...i. Receiving or transmitting data over a network, e.g., using the Internet to gather data”. Moreover, the recitation of “wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object” amounts to a mere instruction to apply the exception. See MPEP 2106.05(f). Therefore, these additional elements do not amount to significantly more. The claim is not patent eligible.
 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA  35 U.S.C. 102 and 103 (or as subject to pre-AIA  35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.  
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 6-7, 10, 13, 15-16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Attorre et al. (US 20200134377 A1) in view of Loxam et al. (US 20140225924 A1).
Regarding Claim 1,
Attorre et al. teaches a method comprising, by a server (Attorre et al., Para. [0046] and FIG. 6, “operations in flow chart 600 can be performed by one or more servers in a cloud computing environment as described below with respect to FIG. 7” teaches the server)
receiving, from a client computing device, one or more deep-learning (DL)-feature representations generated by a machine learning model, wherein the one or more DL-feature representations are extracted from a region of interest detected by the client computing device within a first image of a real-world environment captured by the client computing device, the region of interest comprising a first depiction of a real-world object (Attorre et al., Para. [0005], “a method includes receiving a source image at one or more computing devices” teaches receiving from a computing device. Para. [0029] and FIG. 2, “A CNN in logo detection model 220 may then extract feature vectors from each candidate region and classify the candidate region based on the feature vectors” teaches a CNN in the logo detection model (corresponds to the machine learning model) that extracts feature vectors (corresponds to deep-learning feature representations) from each candidate region (corresponds to the region of interest detected). Para. [0034] and FIG. 4, “logo detection model 410 may detect generic logo patches or regions (i.e., regions that are likely to embody a logo). For example, in the example shown in FIG. 4, logo detection model 410 may receive an image 420, which may include images of one or more logos, and identify candidate regions 430, 432, and 434 that likely embody a logo from image 420. Outputs 440 of logo detection model 410 may thus include sub-images 442, 444, and 446 that may be cropped out from image 420 or may include identifications (e.g., coordinates) of candidate regions 430, 432, and 434” teaches determining identification of candidate regions (corresponds to the region of interest) of an image (corresponds to the first image of a real-world environment captured) from the logo detection model when an image is received from the computing device).
receiving, from the client computing device, one or more local feature descriptors extracted from the region of interest within the first image, wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch (Attorre et al., Para. [0005], “a method includes receiving a source image at one or more computing devices” teaches receiving from a computing device. Para. [0034] and FIG. 4, “logo detection model 410 may detect generic logo patches or regions (i.e., regions that are likely to embody a logo). For example, in the example shown in FIG. 4, logo detection model 410 may receive an image 420, which may include images of one or more logos, and identify candidate regions 430, 432, and 434 that likely embody a logo from image 420. Outputs 440 of logo detection model 410 may thus include sub-images 442, 444, and 446 that may be cropped out from image 420 or may include identifications (e.g., coordinates) of candidate regions 430, 432, and 434” teaches determining patches/regions, sub-images, and identifications (corresponds to local-feature descriptors) from the identified candidate region (corresponds to the region of interest) when an image is received (corresponds to the first image)).
identifying a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space (Attorre et al., Para. [0003], “Logo detection or recognition in images and videos can be used in many applications, such as copyright or trademark infringement detection, contextual advertise placement, intelligent traffic control based on vehicle logos, automated computation of brand-related statistics, augmented reality, and the like” teaches the technique being used in many applications such as augmented reality. Para. [0053], “To detect logos in a new source image that may include an image of the new target logo, candidate regions in the new source image that may embody a logo may be determined by the second model, and a feature vector may be extracted from each candidate region and compared with each reference feature vector in the embedding database (including the reference feature vectors extracted from the images of the new target logo) to find a match. As such, new target logos may be detected by existing models or networks without retraining such models or networks” teaches comparing feature vectors (corresponds to DL-feature representations) with the reference feature vectors stored in the database that is associated to a target logo (corresponds to AR target) to identify potential matches. Para. [0020], “The target logo associated with the best matching reference feature vector is determined as present in the candidate region in the source image if the best matching score is greater than a threshold value” teaches determining a threshold value (corresponds to the threshold region in a vector space) for a set of matching reference feature vector (corresponds to deep-learning feature representations)).
determining, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets (Attorre et al., Para. [0036], “The features extracted from each of sub-images 442, 444, and 446 may be compared with reference features stored in database 450 by a comparator 485 to determine if there is a match between any reference features stored in databased 450 and features extracted from sub-image 442, 444, or 446” teaches determining matching features by comparing features extracted from sub images (corresponds to local-feature descriptors) with reference features stored in a database).
Attorre et al. does not appear to explicitly teach selecting an AR effect associated with the determined matching AR target; sending, to the client computing device, the AR effect associated with the determined matching AR target, wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object
However, Loxam et al. teaches selecting an AR effect associated with the determined matching AR target (Loxam et al., FIG. 3A and Para. [0042], “the augmentation engine 375 can start transmitting to the mobile computing device 300 the potential large augmented reality content files such as video files, and advertisements while the object recognition engine 320 determines what the object is. Thus, at approximately at the same time as the object recognition engine 320 is hierarchically filtering or narrowing down the possible known matching images/object to the transmitted features, the augmentation engine 375 can be preparing and selecting augmented reality content to be transmitted back to the video processing module on the mobile computing device 300 for display. Note, similarly, the augmentation engine 316 can be preparing and selecting augmented reality content to be overlaid onto the video frames while the trigger item identification is performing its operations” teaches an augmentation engine that prepares and selects an augmentation reality content to be overlaid (corresponds to the AR effect) onto the video frame based on the determined matching images or object (corresponds to the AR target)).
sending, to the client computing device, the AR effect associated with the determined matching AR target, wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object (Loxam et al., FIG. 3A and Para. [0042], “the augmentation engine 375 can start transmitting to the mobile computing device 300 the potential large augmented reality content files such as video files, and advertisements while the object recognition engine 320 determines what the object is. Thus, at approximately at the same time as the object recognition engine 320 is hierarchically filtering or narrowing down the possible known matching images/object to the transmitted features, the augmentation engine 375 can be preparing and selecting augmented reality content to be transmitted back to the video processing module on the mobile computing device 300 for display. Note, similarly, the augmentation engine 316 can be preparing and selecting augmented reality content to be overlaid onto the video frames while the trigger item identification is performing its operations” teaches an augmentation engine that prepares and selects an augmentation reality content to be overlaid (corresponds to the AR effect) onto the video frame based on the determined matching images or object (corresponds to the AR target). FIG.5-6 and Para. [0071], “The process described in FIGS. 5 and 6 may generally require the detect trigger item engine 370 to perform a one-to-many approach by positively matching the feature points of the one real world trigger item with the indexed feature points of the many known candidate trigger items” teaches the real world trigger item (corresponds to the real world object). Para. [0072], “When it is ready, the user can point the phone at the picture, and it will come to life. If the user was not told what picture to point the smart phone at, then the user can point the camera around the location and the augmented reality application will automatically detect the trigger items in view” teaches the augmented reality application automatically detecting the trigger items (corresponds to the real-world object) in view, which shows the trigger item is being tracked in real-time).
Attorre et al. and Loxam et al. are analogous art because they are from the same field of endeavor and are from the same problem-solving area. Namely, they pertain to the field of “image recognition”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. with Loxam et al., with the motivation of selecting an AR effect associated with the determined matching AR target; sending, to the client computing device, the AR effect associated with the determined matching AR target, wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object. “The systems and methods allow mobile computing devices to identify real world trigger items and to cause augmented reality scenarios associated with a real world trigger item to be presented on a display of the mobile computing device” (Loxam et al., Abstract). The proposed teaching is beneficial in that it helps identify real world trigger items and cause augmented reality scenarios associated with a real world trigger item to be presented.
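For illustration only, the two-stage matching recited in claim 1 (a coarse comparison of DL-feature representations within a threshold region in a vector space, followed by a comparison of local-feature descriptors among the surviving AR targets) may be sketched as below. This is a minimal sketch under stated assumptions and is not the implementation of the applicant, Attorre et al., or Loxam et al. The cosine-distance threshold, the binary local descriptors compared by Hamming distance, and all names (STORED, coarse_candidates, local_score, match_target) are hypothetical choices made for the example.

```python
import math

def cosine_distance(a, b):
    # Distance between two feature vectors; 0.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def coarse_candidates(query_vec, stored, threshold):
    # Stage 1: keep targets whose stored vector lies within the
    # (assumed) cosine-distance threshold of the received vector.
    return [target_id for target_id, entry in stored.items()
            if cosine_distance(query_vec, entry["global"]) <= threshold]

def hamming(a, b):
    # Number of differing bits between two binary descriptors.
    return bin(a ^ b).count("1")

def local_score(query_desc, target_desc, max_bits=2):
    # Count received descriptors that have a stored descriptor
    # within max_bits differing bits (an assumed matching rule).
    return sum(1 for q in query_desc
               if any(hamming(q, t) <= max_bits for t in target_desc))

def match_target(query_vec, query_desc, stored, threshold=0.2):
    # Stage 2: among the coarse candidates, pick the target with
    # the most local-descriptor matches; None if no candidate.
    candidates = coarse_candidates(query_vec, stored, threshold)
    if not candidates:
        return None
    return max(candidates,
               key=lambda tid: local_score(query_desc, stored[tid]["local"]))

# Hypothetical stored targets, each with one global vector,
# a few 8-bit local descriptors, and an associated effect label.
STORED = {
    "poster_a": {"global": [1.0, 0.0, 0.0],
                 "local": [0b10101010, 0b11110000], "effect": "effect_a"},
    "poster_b": {"global": [0.9, 0.1, 0.0],
                 "local": [0b00001111, 0b00110011], "effect": "effect_b"},
    "mug_c": {"global": [0.0, 1.0, 0.0],
              "local": [0b10101010, 0b11110000], "effect": "effect_c"},
}

best = match_target([0.95, 0.05, 0.0], [0b00001111, 0b00110111], STORED)
effect = STORED[best]["effect"] if best is not None else None
```

In this sketch the two posters both pass the coarse stage because their global vectors are nearly parallel to the query, and the local descriptors are what separate them; the effect associated with the selected target is then looked up for sending to the client.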
Regarding Claim 2,
Attorre et al. in view of Loxam et al. teaches the method of Claim 1, 
Attorre et al. further teaches wherein the one or more DL-feature representations are extracted at the client computing device by (Attorre et al., Para. [0005], “The method also includes extracting, from the candidate region and by a neural network implemented using the one or more computing devices, a feature vector of the candidate region” teaches extracting a feature vector from a computing device).
accessing the first image (Attorre et al., Para. [0007], “the first reference feature vector extracted from a first image of a first target logo in the set of target logos” teaches accessing the first image).
generating, by a first machine learning model, an initial feature map associated with the first image (Attorre et al., Para. [0029] and FIG. 2, “the input image may be fed to the CNN to generate a convolutional feature map” teaches utilizing a convolutional neural network (corresponds to the first machine learning model) to generate a feature map associated with the input image (corresponds to the first image)).
identifying the region of interest within the initial feature map (Attorre et al., Para. [0029] and FIG. 2, “candidate regions may be identified from the convolutional feature map” teaches identifying a candidate region (corresponds to region of interest) within the convolutional feature map). 
wherein the region of interest is associated with at least a first real-world-object type, and wherein the region of interest is associated with a portion of the first image corresponding to the first depiction of the real-world object (Attorre et al., Para. [0029], “using selective search techniques and may be reshaped to a predetermined size using, for example, a region-of-interest (ROI) pooling layer” teaches utilizing a selective search technique to select the region of interest of the image that depicts the real-world object). 
extracting, from the region of interest, the one or more DL-feature representations, wherein each extracted DL-feature representation is an output of a second machine learning model that is trained to detect at least objects of the first real-world-object type (Attorre et al., Para. [0053], “To detect logos in a new source image that may include an image of the new target logo, candidate regions in the new source image that may embody a logo may be determined by the second model, and a feature vector may be extracted from each candidate region and compared with each reference feature vector in the embedding database (including the reference feature vectors extracted from the images of the new target logo) to find a match. As such, new target logos may be detected by existing models or networks without retraining such models or networks” teaches a second model trained to determine candidate regions and detect logos (corresponds to objects of the first real-world-object type) in a source image. Feature vectors are then extracted from the candidate regions).
Regarding Claim 3,
Attorre et al. in view of Loxam et al. teaches the method of Claim 2, wherein the one or more local-feature descriptors are extracted at the client computing device by:
Attorre et al. further teaches extracting, from a portion of the first image associated with the region of the interest, one or more local-feature descriptors associated with one or more detected points of interest, wherein each local-feature descriptor is generated based on information associated with a spatially bounded patch within the first image, the spatially bounded patch comprising a respective detected point of interest (Attorre et al., Para. [0034] and FIG. 4, “logo detection model 410 may detect generic logo patches or regions (i.e., regions that are likely to embody a logo). For example, in the example shown in FIG. 4, logo detection model 410 may receive an image 420, which may include images of one or more logos, and identify candidate regions 430, 432, and 434 that likely embody a logo from image 420. Outputs 440 of logo detection model 410 may thus include sub-images 442, 444, and 446 that may be cropped out from image 420 or may include identifications (e.g., coordinates) of candidate regions 430, 432, and 434” teaches determining patches/regions, sub-images, and identifications (corresponds to local-feature descriptors and the coordinates from the identifications corresponds to one or more detected points of interest) from the logo detection model when an image is received). 
Regarding Claim 4,
Attorre et al. in view of Loxam et al. teaches the method of Claim 2, 
Attorre et al. further teaches wherein selecting the region of interest comprises (Attorre et al., Para. [0005], “detecting, in the source image and using a first logo detection model implemented by the one or more computing devices, a candidate region for determining a logo in the source image” teaches detecting the candidate region (corresponds to region of interest)).
calculating, for one or more portions of the first image, a confidence score based on a third machine learning model (Attorre et al., Para. [0046], “the one or more processing devices may implement one or more neural networks for one or more machine learning-based models” teaches the embodiment consisting of multiple machine learning models. Para. [0005], “extracting, from the candidate region and by a neural network implemented using the one or more computing devices, a feature vector of the candidate region, and determining, for each reference feature vector from a set of reference feature vectors stored in a database, a respective matching score” teaches determining a matching score (corresponds to confidence score) based on the neural network implemented). 
selecting, as the region of interest, one or more of the portions of the first image having a confidence score greater than a threshold confidence score (Attorre et al., Para. [0020], “The target logo associated with the best matching reference feature vector is determined as present in the candidate region in the source image if the best matching score is greater than a threshold value” teaches selecting a candidate region (corresponds to region of interest) in the source image (corresponds to the first image) if the best matching score (corresponds to confidence score) is greater than a threshold value).
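For illustration only, the thresholding step mapped above can be sketched as follows. This is a minimal sketch and is not taken from Attorre et al. or from the application; the function name `select_regions`, the portion identifiers, and the score values are hypothetical.

```python
from typing import Dict, List


def select_regions(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return the image portions whose confidence score exceeds the threshold."""
    # Hypothetical helper: keeps only portions scoring strictly above the threshold.
    return [portion for portion, score in scores.items() if score > threshold]


# Hypothetical confidence scores for three portions of a first image.
scores = {"portion_a": 0.91, "portion_b": 0.40, "portion_c": 0.75}
selected = select_regions(scores, threshold=0.7)
print(selected)
```

Under these assumed values, only the portions scoring above 0.7 are retained as regions of interest.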
Regarding Claim 6,
Attorre et al. in view of Loxam et al. teaches the method of Claim 2, wherein the stored DL-feature representations are determined by a process comprising: 
Attorre et al. further teaches passing a plurality of second images comprising second depictions of the real-world object, wherein each of the plurality of second images comprises a variation of the first depiction of the real-world object (Attorre et al., Para. [0034], “Outputs 440 of logo detection model 410 may thus include sub-images 442, 444, and 446 that may be cropped out from image 420” teaches sub-images (corresponds to second images) that are cropped variations of the first image depiction, with the brand logos (corresponds to real-world object) depicted in each of the images. Para. [0036], “Each of sub-images 442, 444, and 446 extracted from image 420 may be passed to a feature extractor 480” teaches passing a plurality of sub-images to the feature extractor).
extracting, from each of the plurality of second images, one or more DL-feature representations (Attorre et al., Para. [0036], “Each of sub-images 442, 444, and 446 extracted from image 420 may be passed to a feature extractor 480 to extract features from each of sub-images 442, 444, and 446 using a feature extractor” teaches utilizing a feature extractor to extract feature vectors from the sub-images).
Regarding Claim 7,
Attorre et al. in view of Loxam et al. teaches the method of Claim 6, further comprising: 
Attorre et al. further teaches representing the DL-feature representations extracted from the plurality of second images as vector representations (Attorre et al., Para. [0036], “The features extracted from each of sub-images 442, 444, and 446 may be compared with reference features stored in database 450 by a comparator 485… Comparator 485 may compare features (e.g., represented by feature vectors) to determine matching scores between features” teaches that the features extracted from the sub-images (corresponds to second images) are represented by feature vectors).
based on the respective vector representations, associating the DL-feature representations with respective AR targets (Attorre et al., Para. [0005], “extracting, from the candidate region and by a neural network implemented using the one or more computing devices, a feature vector of the candidate region, and determining, for each reference feature vector from a set of reference feature vectors stored in a database, a respective matching score between the reference feature vector and the feature vector of the candidate region, where each reference feature vector in the set of reference feature vectors is extracted from a respective image of a target logo in a set of target logos” teaches associating the respective reference feature vector with the respective image of a target logo (corresponds to AR target)).
Regarding Claim 10,
Attorre et al. in view of Loxam et al. teaches the method of Claim 1, 
Attorre et al. further teaches wherein the comparison of the received one or more DL-feature representations with the plurality of stored DL-feature representations comprises a nearest-neighbor search (Attorre et al., Para. [0024], “To detect logos in a new source image that likely includes an image of the new target logo, candidate regions that likely embody a logo are determined by the agnostic logo detection model, a feature vector is extracted from each candidate region and compared with each reference feature vector in the database (including the reference feature vectors extracted from the images of the new target logo) to find a match. As such, new target logos can be detected by existing models or networks without retraining such models or networks using images of the new target logos” teaches a comparison process to identify potential matches (corresponds to a nearest-neighbor search) between feature vectors).
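For illustration only, a brute-force nearest-neighbor search over stored feature vectors can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of Attorre et al. or of the application; the Euclidean metric, the function names, and the reference vectors are all hypothetical choices made for the example.

```python
import math
from typing import Dict, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(
    query: Sequence[float], references: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Compare the query with every stored vector and return the closest one."""
    best = min(references, key=lambda name: euclidean(query, references[name]))
    return best, euclidean(query, references[best])


# Hypothetical stored reference feature vectors, keyed by target name.
references = {"logo_a": [1.0, 0.0], "logo_b": [0.0, 1.0], "logo_c": [0.7, 0.7]}
name, dist = nearest_neighbor([0.9, 0.1], references)
print(name, dist)
```

The sketch compares the query against each reference vector in turn, which parallels the "compared with each reference feature vector in the database" language quoted above.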
Regarding Claim 13,
Attorre et al. in view of Loxam et al. teaches the method of Claim 1, 
Loxam et al. further teaches wherein the real-world object is continuously tracked in real-time (Loxam et al., Para. [0072], “When it is ready, the user can point the phone at the picture, and it will come to life. If the user was not told what picture to point the smart phone at, then the user can point the camera around the location and the augmented reality application will automatically detect the trigger items in view” teaches the augmented reality application automatically detecting the trigger items (corresponds to the real-world object) in view, which shows the trigger item is being tracked in real-time).
Attorre et al. and Loxam et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image recognition”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. with the teachings of Loxam et al., with the motivation of continuously tracking the real-world object in real-time. “The systems and methods allow mobile computing devices to identify real world trigger items and to cause augmented reality scenarios associated with a real world trigger item to be presented on a display of the mobile computing device” (Loxam et al., Abstract). The proposed teaching is beneficial in that it helps identify real world trigger items and cause augmented reality scenarios associated with a real world trigger item to be presented.
Regarding Claim 15,
Attorre et al. in view of Loxam et al. teaches the method of Claim 1, 
Loxam et al. further teaches wherein the AR effect is a filter effect (Loxam et al., Para. [0095], “The object recognition engine distributed across the IDOL server set applies a hierarchical set of filters to the transmitted identified points of interest and their associated major within each frame of a video stream to determine what that one or more potential trigger item are within that frame. Since this is a video feed of a series of closely related frames both in time and in approximate location, the pattern of identified major features of potential trigger item within each frame of a video stream helps to narrow down the matching known object stored in the object database” teaches applying a set of filters).  
Attorre et al. and Loxam et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image recognition”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. with the teachings of Loxam et al. such that the AR effect is a filter effect. “The pattern of identified major features of potential trigger item within each frame of a video stream helps to narrow down the matching known object stored in the object database” (Loxam et al., Para. [0095]). The proposed teaching is beneficial in that it helps to narrow down the matching known object stored in the object database.
Regarding Claim 16,
Attorre et al. in view of Loxam et al. teaches the method of Claim 1, further comprising: 
Loxam et al. further teaches authorizing a user of the client device to receive the AR effect associated with the determined matching AR target based on information associated with the user (Loxam et al., Para. [0036], “The augmentation engine 316 is also configured to allow a user to create augmented reality content from stock locations including any combination of 1) off of the local memory of the smart mobile computing device 300, 2) from Internet sources, 3) from an augment information database 360 maintained at the backend server, 4) from a links database 350, or 5) similar source. The augmentation engine 316 then also allows the user to associate that augmented reality content with at least one trigger item from the trigger item engine 314/330” teaches allowing the user to create augmented reality content (corresponds to AR effect) and associate the content with the trigger item (corresponds to AR target). Para. [0044], “The augmentation engine 375 may select the augmented reality information that is most relevant to the user” teaches the authorization being based on information that is most relevant (corresponds to associated) to the user).
Attorre et al. and Loxam et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image recognition”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. with the teachings of Loxam et al., with the motivation of authorizing a user of the client device to receive the AR effect associated with the determined matching AR target based on information associated with the user. “The systems and methods allow mobile computing devices to identify real world trigger items and to cause augmented reality scenarios associated with a real world trigger item to be presented on a display of the mobile computing device” (Loxam et al., Abstract). The proposed teaching is beneficial in that it helps identify real world trigger items and cause augmented reality scenarios associated with a real world trigger item to be presented.
Regarding Claim 18,
Attorre et al. in view of Loxam et al. teaches one or more computer-readable non-transitory storage media embodying software that is operable when executed to (Attorre et al., Para. [0004], “Various inventive embodiments are described herein, including methods, systems, non-transitory computer-readable storage media storing programs, code, or instructions executable by one or more processors” teaches the embodiment comprising non-transitory computer-readable storage media that store programs, code, or instructions executable by one or more processors):
receive, from a client computing device, one or more deep-learning (DL)-feature representations generated by a machine learning model, wherein the one or more DL-feature representations are extracted from a region of interest detected by the client computing device within a first image of a real-world environment captured by the client computing device, the region of interest comprising a first depiction of a real-world object (Attorre et al., Para. [0005], “a method includes receiving a source image at one or more computing devices” teaches receiving from a computing device. Para. [0029] and FIG. 2, “A CNN in logo detection model 220 may then extract feature vectors from each candidate region and classify the candidate region based on the feature vectors” teaches a CNN in the logo detection model (corresponds to the machine learning model) that extracts feature vectors (corresponds to deep-learning feature representations) from each candidate region (corresponds to the region of interest detected). Para. [0034] and FIG. 4, “logo detection model 410 may detect generic logo patches or regions (i.e., regions that are likely to embody a logo). For example, in the example shown in FIG. 4, logo detection model 410 may receive an image 420, which may include images of one or more logos, and identify candidate regions 430, 432, and 434 that likely embody a logo from image 420. Outputs 440 of logo detection model 410 may thus include sub-images 442, 444, and 446 that may be cropped out from image 420 or may include identifications (e.g., coordinates) of candidate regions 430, 432, and 434” teaches determining identification of candidate regions (corresponds to the region of interest) of an image (corresponds to the first image of a real-world environment captured) from the logo detection model when an image is received from the computing device).
receive, from the client computing device, one or more local feature descriptors extracted from the region of interest within the first image, wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch (Attorre et al., Para. [0005], “a method includes receiving a source image at one or more computing devices” teaches receiving from a computing device. Para. [0034] and FIG. 4, “logo detection model 410 may detect generic logo patches or regions (i.e., regions that are likely to embody a logo). For example, in the example shown in FIG. 4, logo detection model 410 may receive an image 420, which may include images of one or more logos, and identify candidate regions 430, 432, and 434 that likely embody a logo from image 420. Outputs 440 of logo detection model 410 may thus include sub-images 442, 444, and 446 that may be cropped out from image 420 or may include identifications (e.g., coordinates) of candidate regions 430, 432, and 434” teaches determining patches/regions, sub-images, and identifications (corresponds to local-feature descriptors) from the identified candidate region (corresponds to the region of interest) when an image is received (corresponds to the first image)).
identify a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space (Attorre et al., Para. [0003], “Logo detection or recognition in images and videos can be used in many applications, such as copyright or trademark infringement detection, contextual advertise placement, intelligent traffic control based on vehicle logos, automated computation of brand-related statistics, augmented reality, and the like” teaches the technique being used in many applications such as augmented reality. Para. [0053], “To detect logos in a new source image that may include an image of the new target logo, candidate regions in the new source image that may embody a logo may be determined by the second model, and a feature vector may be extracted from each candidate region and compared with each reference feature vector in the embedding database (including the reference feature vectors extracted from the images of the new target logo) to find a match. As such, new target logos may be detected by existing models or networks without retraining such models or networks” teaches comparing feature vectors (corresponds to DL-feature representations) with the reference feature vectors stored in the database that are associated with a target logo (corresponds to AR target) to identify potential matches. Para. [0020], “The target logo associated with the best matching reference feature vector is determined as present in the candidate region in the source image if the best matching score is greater than a threshold value” teaches determining a threshold value (corresponds to the threshold region in a vector space) for a set of matching reference feature vectors (corresponds to deep-learning feature representations)).
determine, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets  (Attorre et al., Para. [0036], “The features extracted from each of sub-images 442, 444, and 446 may be compared with reference features stored in database 450 by a comparator 485 to determine if there is a match between any reference features stored in databased 450 and features extracted from sub-image 442, 444, or 446” teaches determining matching features by comparing features extracted from sub images (corresponds to local-feature descriptors) with reference features stored in a database).
Attorre et al. does not appear to explicitly teach: select an AR effect associated with the determined matching AR target; send, to the client computing device, the AR effect associated with the determined matching AR target, wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object.
However, Loxam et al. teaches select an AR effect associated with the determined matching AR target (Loxam et al., FIG. 3A and Para. [0042], “the augmentation engine 375 can start transmitting to the mobile computing device 300 the potential large augmented reality content files such as video files, and advertisements while the object recognition engine 320 determines what the object is. Thus, at approximately at the same time as the object recognition engine 320 is hierarchically filtering or narrowing down the possible known matching images/object to the transmitted features, the augmentation engine 375 can be preparing and selecting augmented reality content to be transmitted back to the video processing module on the mobile computing device 300 for display. Note, similarly, the augmentation engine 316 can be preparing and selecting augmented reality content to be overlaid onto the video frames while the trigger item identification is performing its operations” teaches an augmentation engine that prepares and selects augmented reality content to be overlaid (corresponds to the AR effect) onto the video frame based on the determined matching images or object (corresponds to the AR target)).
send, to the client computing device, the AR effect associated with the determined matching AR target, wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object (Loxam et al., FIG. 3A and Para. [0042], “the augmentation engine 375 can start transmitting to the mobile computing device 300 the potential large augmented reality content files such as video files, and advertisements while the object recognition engine 320 determines what the object is. Thus, at approximately at the same time as the object recognition engine 320 is hierarchically filtering or narrowing down the possible known matching images/object to the transmitted features, the augmentation engine 375 can be preparing and selecting augmented reality content to be transmitted back to the video processing module on the mobile computing device 300 for display. Note, similarly, the augmentation engine 316 can be preparing and selecting augmented reality content to be overlaid onto the video frames while the trigger item identification is performing its operations” teaches an augmentation engine that prepares and selects augmented reality content to be overlaid (corresponds to the AR effect) onto the video frame based on the determined matching images or object (corresponds to the AR target). FIG. 5-6 and Para. [0071], “The process described in FIGS. 5 and 6 may generally require the detect trigger item engine 370 to perform a one-to-many approach by positively matching the feature points of the one real world trigger item with the indexed feature points of the many known candidate trigger items” teaches the real world trigger item (corresponds to the real-world object). Para. [0072], “When it is ready, the user can point the phone at the picture, and it will come to life. If the user was not told what picture to point the smart phone at, then the user can point the camera around the location and the augmented reality application will automatically detect the trigger items in view” teaches the augmented reality application automatically detecting the trigger items (corresponds to the real-world object) in view, which shows the trigger item is being tracked in real-time).
Attorre et al. and Loxam et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image recognition”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. with the teachings of Loxam et al., with the motivation to select an AR effect associated with the determined matching AR target; send, to the client computing device, the AR effect associated with the determined matching AR target, wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object. “The systems and methods allow mobile computing devices to identify real world trigger items and to cause augmented reality scenarios associated with a real world trigger item to be presented on a display of the mobile computing device” (Loxam et al., Abstract). The proposed teaching is beneficial in that it helps identify real world trigger items and cause augmented reality scenarios associated with a real world trigger item to be presented.
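For illustration only, the "identify" and "determine" limitations of Claim 18 can be read as a two-stage match: a first stage that shortlists AR targets whose stored DL-feature representation lies within a threshold region of the received one, and a second stage that picks one target from the shortlist by comparing local-feature descriptors. The following is a minimal sketch under stated assumptions; it is not taken from Attorre et al., Loxam et al., or the application, and the Euclidean metric, the descriptor-match count, and all names and values are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shortlist(query: Vector, stored: Dict[str, Vector], radius: float) -> List[str]:
    """Stage 1: keep targets whose stored embedding is within `radius` of the query."""
    return [t for t, emb in stored.items() if distance(query, emb) <= radius]


def count_matches(query_descs: List[Vector], target_descs: List[Vector],
                  max_dist: float) -> int:
    """Count query descriptors that have at least one close stored descriptor."""
    return sum(
        1 for q in query_descs
        if any(distance(q, t) <= max_dist for t in target_descs)
    )


def match_target(query_emb: Vector, query_descs: List[Vector],
                 stored_embs: Dict[str, Vector],
                 stored_descs: Dict[str, List[Vector]],
                 radius: float, max_dist: float) -> Optional[str]:
    """Stage 2: among shortlisted targets, pick the one with most descriptor matches."""
    candidates = shortlist(query_emb, stored_embs, radius)
    if not candidates:
        return None
    return max(candidates,
               key=lambda t: count_matches(query_descs, stored_descs[t], max_dist))


# Hypothetical stored data for three AR targets.
stored_embs = {"poster": [0.0, 0.0], "mug": [0.1, 0.1], "car": [5.0, 5.0]}
stored_descs = {"poster": [[1.0, 1.0], [2.0, 2.0]], "mug": [[9.0, 9.0]],
                "car": [[1.0, 1.0]]}
candidates = shortlist([0.05, 0.05], stored_embs, radius=0.5)
result = match_target([0.05, 0.05], [[1.0, 1.1], [2.0, 2.1]],
                      stored_embs, stored_descs, radius=0.5, max_dist=0.2)
print(candidates, result)
```

In this assumed data, two targets fall inside the threshold region in the first stage, and the local-feature comparison in the second stage distinguishes between them.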
Regarding Claim 19,
Attorre et al. in view of Loxam et al. teaches the media of Claim 18, 
Attorre et al. further teaches wherein the software is further operable when executed to extract the one or more DL-feature representations at the client computing device by (Attorre et al., Para. [0005], “The method also includes extracting, from the candidate region and by a neural network implemented using the one or more computing devices, a feature vector of the candidate region” teaches extracting a feature vector using a computing device):
accessing the first image (Attorre et al., Para. [0007], “the first reference feature vector extracted from a first image of a first target logo in the set of target logos” teaches accessing the first image).
generating, by a first machine learning model, an initial feature map associated with the first image (Attorre et al., Para. [0029] and FIG. 2, “the input image may be fed to the CNN to generate a convolutional feature map” teaches utilizing a convolutional neural network (corresponds to the first machine learning model) to generate a feature map associated with the input image (corresponds to the first image)).
identifying the region of interest within the initial feature map (Attorre et al., Para. [0029] and FIG. 2, “candidate regions may be identified from the convolutional feature map” teaches identifying a candidate region (corresponds to region of interest) within the convolutional feature map).  
wherein the region of interest is associated with at least a first real-world-object type, and wherein the region of interest is associated with a portion of the first image corresponding to the first depiction of the real-world object (Attorre et al., Para. [0029], “using selective search techniques and may be reshaped to a predetermined size using, for example, a region-of-interest (ROI) pooling layer” teaches utilizing a selective search technique to select the region of interest of the image that depicts the real-world object).
extracting, from the region of interest, the one or more DL-feature representations, wherein each extracted DL-feature representation is an output of a second machine learning model that is trained to detect at least objects of the first real-world-object type (Attorre et al., Para. [0053], “To detect logos in a new source image that may include an image of the new target logo, candidate regions in the new source image that may embody a logo may be determined by the second model, and a feature vector may be extracted from each candidate region and compared with each reference feature vector in the embedding database (including the reference feature vectors extracted from the images of the new target logo) to find a match. As such, new target logos may be detected by existing models or networks without retraining such models or networks” teaches a second model trained to determine candidate regions and detect logos (corresponds to objects of the first real-world-object type) in a source image. Feature vectors are then extracted from the candidate regions).
Regarding Claim 20,
Attorre et al. in view of Loxam et al. teaches a system comprising: one or more processors; and one or more computer-readable non-transitory storage media coupled to one or more of the processors and comprising instructions operable when executed by one or more of the processors to cause the system to (Attorre et al., Para. [0058], “The depicted example of a computing system 800 includes a processor 802” teaches the system comprising a processor. Para. [0004], “Various inventive embodiments are described herein, including methods, systems, non-transitory computer-readable storage media storing programs, code, or instructions executable by one or more processors, and the like” teaches the embodiment comprising non-transitory computer-readable storage media that store programs, code, or instructions executable by one or more processors):
receive, from a client computing device, one or more deep-learning (DL)-feature representations generated by a machine learning model, wherein the one or more DL-feature representations are extracted from a region of interest detected by the client computing device within a first image of a real-world environment captured by the client computing device, the region of interest comprising a first depiction of a real-world object  (Attorre et al., Para. [0005], “a method includes receiving a source image at one or more computing devices” teaches receiving from a computing device. Para. [0029] and FIG. 2, “A CNN in logo detection model 220 may then extract feature vectors from each candidate region and classify the candidate region based on the feature vectors” teaches a CNN in the logo detection model (corresponds to the machine learning model) that extracts feature vectors (corresponds to deep-learning feature representations) from each candidate region (corresponds to the region of interest detected). Para. [0034] and FIG. 4, “logo detection model 410 may detect generic logo patches or regions (i.e., regions that are likely to embody a logo). For example, in the example shown in FIG. 4, logo detection model 410 may receive an image 420, which may include images of one or more logos, and identify candidate regions 430, 432, and 434 that likely embody a logo from image 420. Outputs 440 of logo detection model 410 may thus include sub-images 442, 444, and 446 that may be cropped out from image 420 or may include identifications (e.g., coordinates) of candidate regions 430, 432, and 434” teaches determining identification of candidate regions (corresponds to the region of interest) of an image (corresponds to the first image of a real-world environment captured) from the logo detection model when an image is received from the computing device).
receive, from the client computing device, one or more local feature descriptors extracted from the region of interest within the first image, wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch (Attorre et al., Para. [0005], “a method includes receiving a source image at one or more computing devices” teaches receiving from a computing device. Para. [0034] and FIG. 4, “logo detection model 410 may detect generic logo patches or regions (i.e., regions that are likely to embody a logo). For example, in the example shown in FIG. 4, logo detection model 410 may receive an image 420, which may include images of one or more logos, and identify candidate regions 430, 432, and 434 that likely embody a logo from image 420. Outputs 440 of logo detection model 410 may thus include sub-images 442, 444, and 446 that may be cropped out from image 420 or may include identifications (e.g., coordinates) of candidate regions 430, 432, and 434” teaches determining patches/regions, sub-images, and identifications (corresponds to local-feature descriptors) from the identified candidate region (corresponds to the region of interest) when an image is received (corresponds to the first image)).
identify a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space (Attorre et al., Para. [0003], “Logo detection or recognition in images and videos can be used in many applications, such as copyright or trademark infringement detection, contextual advertise placement, intelligent traffic control based on vehicle logos, automated computation of brand-related statistics, augmented reality, and the like” teaches the technique being used in many applications such as augmented reality. Para. [0053], “To detect logos in a new source image that may include an image of the new target logo, candidate regions in the new source image that may embody a logo may be determined by the second model, and a feature vector may be extracted from each candidate region and compared with each reference feature vector in the embedding database (including the reference feature vectors extracted from the images of the new target logo) to find a match. As such, new target logos may be detected by existing models or networks without retraining such models or networks” teaches comparing feature vectors (corresponds to DL-feature representations) with the reference feature vectors stored in the database that are associated with a target logo (corresponds to AR target) to identify potential matches. Para. [0020], “The target logo associated with the best matching reference feature vector is determined as present in the candidate region in the source image if the best matching score is greater than a threshold value” teaches determining a threshold value (corresponds to the threshold region in a vector space) for a set of matching reference feature vectors (corresponds to deep-learning feature representations)).
determine, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets (Attorre et al., Para. [0036], “The features extracted from each of sub-images 442, 444, and 446 may be compared with reference features stored in database 450 by a comparator 485 to determine if there is a match between any reference features stored in databased 450 and features extracted from sub-image 442, 444, or 446” teaches determining matching features by comparing features extracted from sub images (corresponds to local-feature descriptors) with reference features stored in a database).
Attorre et al. does not appear to explicitly teach select an AR effect associated with the determined matching AR target; send, to the client computing device, the AR effect associated with the determined matching AR target, wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object
However, Loxam et al. teaches select an AR effect associated with the determined matching AR target (Loxam et al., FIG. 3A and Para. [0042], “the augmentation engine 375 can start transmitting to the mobile computing device 300 the potential large augmented reality content files such as video files, and advertisements while the object recognition engine 320 determines what the object is. Thus, at approximately at the same time as the object recognition engine 320 is hierarchically filtering or narrowing down the possible known matching images/object to the transmitted features, the augmentation engine 375 can be preparing and selecting augmented reality content to be transmitted back to the video processing module on the mobile computing device 300 for display. Note, similarly, the augmentation engine 316 can be preparing and selecting augmented reality content to be overlaid onto the video frames while the trigger item identification is performing its operations” teaches an augmentation engine that prepares and selects augmented reality content to be overlaid (corresponds to the AR effect) onto the video frame based on the determined matching images or object (corresponds to the AR target)).
send, to the client computing device, the AR effect associated with the determined matching AR target, wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object (Loxam et al., FIG. 3A and Para. [0042], “the augmentation engine 375 can start transmitting to the mobile computing device 300 the potential large augmented reality content files such as video files, and advertisements while the object recognition engine 320 determines what the object is. Thus, at approximately at the same time as the object recognition engine 320 is hierarchically filtering or narrowing down the possible known matching images/object to the transmitted features, the augmentation engine 375 can be preparing and selecting augmented reality content to be transmitted back to the video processing module on the mobile computing device 300 for display. Note, similarly, the augmentation engine 316 can be preparing and selecting augmented reality content to be overlaid onto the video frames while the trigger item identification is performing its operations” teaches an augmentation engine that prepares and selects augmented reality content to be overlaid (corresponds to the AR effect) onto the video frame based on the determined matching images or object (corresponds to the AR target). FIG. 5-6 and Para. [0071], “The process described in FIGS. 5 and 6 may generally require the detect trigger item engine 370 to perform a one-to-many approach by positively matching the feature points of the one real world trigger item with the indexed feature points of the many known candidate trigger items” teaches the real world trigger item (corresponds to the real-world object). Para. [0072], “When it is ready, the user can point the phone at the picture, and it will come to life. If the user was not told what picture to point the smart phone at, then the user can point the camera around the location and the augmented reality application will automatically detect the trigger items in view” teaches the augmented reality application automatically detecting the trigger items (corresponds to the real-world object) in view, which shows the trigger item is being tracked in real-time).
Attorre et al. and Loxam et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image recognition”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. with Loxam et al., with motivation to select an AR effect associated with the determined matching AR target; send, to the client computing device, the AR effect associated with the determined matching AR target, wherein the AR effect is rendered by the client computing device so that the AR effect is anchored to the real-world object. “The systems and methods allow mobile computing devices to identify real world trigger items and to cause augmented reality scenarios associated with a real world trigger item to be presented on a display of the mobile computing device” (Loxam et al., Abstract). The proposed teaching is beneficial in that it helps identify real world trigger items and cause augmented reality scenarios associated with a real world trigger item to be presented.
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Attorre et al. in view of Loxam et al. and in further view of Rao et al. (“A Mobile Outdoor Augmented Reality Method Combining Deep Learning Object Detection and Spatial Relationships for Geovisualization”)
Regarding Claim 5,
Attorre et al. in view of Loxam et al. teaches the method of Claim 2, 
Attorre et al. further teaches wherein the second machine learning model is a convolutional neural network (Attorre et al., Para. [0049], “the one or more processing devices may implement a second model that is trained to detect each candidate region in an input image that is likely to embody a logo. The second model may be, for example, a convolutional neural network, such as a Fast R-CNN or any other variation of a R-CNN network” teaches the second model being a convolutional neural network).
Attorre et al. in view of Loxam et al. does not appear to explicitly teach wherein each extracted DL-feature representation is an output of an average pooling layer of the convolutional neural network
However, Rao et al. teaches wherein each extracted DL-feature representation is an output of an average pooling layer of the convolutional neural network (Rao et al., Figure 4, “This architecture follows a design similar to that of the original SSD. The main differences are that it takes a 224 × 224 pixel image as input and then uses a truncated SqueezeNet (rather than VGG-16) and a series of additional layers (at lower depths than the original) to extract features from the image. The features it uses for detection are selected from 5 layers: fire9 (the last fire module in the SqueezeNet), Ex1_2, Ex2_2, Ex3_2 (three convolutional layers) and GAP (a global average pooling layer)” teaches the output of the GAP (corresponding to the average pooling layer) of the convolutional neural network being extracted features (corresponds to DL-feature representation) from the image).
Attorre et al., Loxam et al., and Rao et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image detection” and “convolutional neural network”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. and Loxam et al. with Rao et al., with motivation wherein each extracted DL-feature representation is an output of an average pooling layer of the convolutional neural network. “To significantly reduce the computational cost of the proposed lightweight SSD approach, we use a truncated SqueezeNet architecture (with conv10 and the softmax classifier removed) as the base network and append several additional feature layers (at lower depths than the original) with decaying spatial resolution” (Rao et al., Section 3.1). The proposed teaching is beneficial in that it helps significantly reduce the computational cost of the proposed lightweight Single Shot Detector approach.
Claims 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Attorre et al. in view of Loxam et al. and in further view of Sladojevic et al. (“Deep Neural Networks Based Recognition of Plant Diseases by Leaf Image Classification”)
Regarding Claim 8,
Attorre et al. in view of Loxam et al. teaches the method of Claim 6, 
Attorre et al. in view of Loxam et al. does not appear to explicitly teach wherein one or more of the plurality of second images are synthetically generated using a data augmentation process that automatically varies one or more conditions in the first image to generate one or more second images
However, Sladojevic et al. teaches wherein one or more of the plurality of second images are synthetically generated using a data augmentation process that automatically varies one or more conditions in the first image to generate one or more second images (Sladojevic et al., Section 3.3 and Figure 2, “Transformations applied in augmentation process are illustrated in Figure 2, where the first row represents resulting images obtained by applying affine transformation on the single image; the second row represents images obtained from perspective transformation against the input image and the last row visualizes the simple rotation of the input image. The process of augmentation was chosen to fit the needs; the leaves in a natural environment could vary in visual perspective” teaches applying an augmentation process that generates images based on different transformations).
Attorre et al., Loxam et al., and Sladojevic et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image recognition” and “convolutional neural network”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. and Loxam et al. with Sladojevic et al., with motivation wherein one or more of the plurality of second images are synthetically generated using a data augmentation process that automatically varies one or more conditions in the first image to generate one or more second images. “The main purpose of applying augmentation is to increase the dataset and introduce slight distortion to the images which helps in reducing overfitting during the training stage” (Sladojevic et al., Section 3.3). The proposed teaching is beneficial in that it helps reduce overfitting during the training stage.
Regarding Claim 9,
Attorre et al. in view of Loxam et al. in view of Sladojevic et al. teaches the method of Claim 8, 
Sladojevic et al. further teaches wherein the one or more conditions comprise one or more of: perspectives, orientations, sizes, locations, and lighting conditions (Sladojevic et al., Section 3.3, “The image augmentation contained one of several transformation techniques including affine transformation, perspective transformation, and simple image rotations… Affine transformations were applied to express translations and rotations (linear transformations and vector addition, resp.) where all parallel lines in the original image are still parallel in the output image. To find a transformation matrix, three points from the original image were needed as well as their corresponding locations in the output image. For perspective transformation, a transformation matrix was required. Straight lines would remain straight even after the transformation. For the augmentation process, simple image rotations were applied, as well as rotations on the different axis by various degrees” teaches the one or more conditions, one of which comprises a perspective transformation).
Attorre et al., Loxam et al., and Sladojevic et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image recognition” and “convolutional neural network”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. and Loxam et al. with Sladojevic et al., with motivation wherein the one or more conditions comprise one or more of: perspectives, orientations, sizes, locations, and lighting conditions. “The main purpose of applying augmentation is to increase the dataset and introduce slight distortion to the images which helps in reducing overfitting during the training stage” (Sladojevic et al., Section 3.3). The proposed teaching is beneficial in that it helps reduce overfitting during the training stage.
Claims 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Attorre et al. in view of Loxam et al. and in further view of Ribo et al. (“Hybrid tracking for outdoor augmented reality application”)
Regarding Claim 11,
Attorre et al. in view of Loxam et al. teaches the method of Claim 3, 
Attorre et al. in view of Loxam et al. does not appear to explicitly teach wherein the one or more detected points of interest are corners detected within the first image
However, Ribo et al. teaches wherein the one or more detected points of interest are corners detected within the first image (Ribo et al., Section 7, “we used a complete georeferenced 3D model of a city section, shown in Figure 10a, to derive the most significant corners” teaches detecting significant corners in the 3D model (corresponds to first image). Figure 10a, “3D model with points of interest, roof lines, and camera positions (22 calibrated reference images)” teaches the detected corners being points of interest).
Attorre et al., Loxam et al., and Ribo et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image detection” and “convolutional neural network”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. and Loxam et al. with Ribo et al., with motivation wherein the one or more detected points of interest are corners detected within the first image. “Spatial subpixel analysis aims to estimate model parameters by analyzing the gray levels of the involved pixels within a small neighborhood. Our approach extends this work from edges to corners. Because corners are intersections of two or more edges that border different areas, we can use this approach to improve corner localization accuracy” (Ribo et al., Section 3). The proposed teaching is beneficial in that it helps improve corner localization accuracy.
Regarding Claim 12,
Attorre et al. in view of Loxam et al. teaches the method of Claim 3, 
Attorre et al. in view of Loxam et al. does not appear to explicitly teach wherein one or more of the detected points of interest are associated with the real-world object within the first image
However, Ribo et al. teaches wherein one or more of the detected points of interest are associated with the real-world object within the first image (Ribo et al., Figure 10, “Georeferenced 3D model. (a) 3D model with points of interest, roof lines, and camera positions (22 calibrated reference images). (b) Some of the images used to compute the 3D model. The model was provided by the VRVis Research Center for Virtual Reality and Visualization” teaches a 3D model of a city section (corresponds to real-world object) with detected points of interest).
Attorre et al., Loxam et al., and Ribo et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image detection” and “convolutional neural network”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. and Loxam et al. with Ribo et al., with motivation wherein one or more of the detected points of interest are associated with the real-world object within the first image. “Spatial subpixel analysis aims to estimate model parameters by analyzing the gray levels of the involved pixels within a small neighborhood. Our approach extends this work from edges to corners. Because corners are intersections of two or more edges that border different areas, we can use this approach to improve corner localization accuracy” (Ribo et al., Section 3). The proposed teaching is beneficial in that it helps improve corner localization accuracy.
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Attorre et al. in view of Loxam et al. and in further view of Bae et al. (“Fast and scalable structure-from-motion based localization for high-precision mobile augmented reality systems”)
Regarding Claim 14,
Attorre et al. in view of Loxam et al. teaches the method of Claim 1, 
Attorre et al. in view of Loxam et al. does not appear to explicitly teach wherein the AR effect is configured to scale itself based on a location and orientation of the client computing device 
However, Bae et al. teaches wherein the AR effect is configured to scale itself based on a location and orientation of the client computing device (Bae et al., Para. 15, “Once the 3D physical model is available, a user can take a photo with a mobile device at a random location. HD4AR uses a new image-based localization approach, which takes advantage of a pre-constructed 3D point cloud of target scene to identify a mobile device’s relative location and orientation. The localization process compares the new photo to the generated 3D physical model and estimates the extrinsic camera parameters to find the relative position of the user’s camera. In addition, the HD4AR uses the client-server architecture to further increase the localization speed. The smartphone as the client uploads new photographs to the server for localization and the major image processing load is located on the server. The localization method using a direct 2D-to-3D matching algorithm takes at most few seconds to localize a photograph. After recovering a complete pose of the user’s camera, the server can decide what cyber-information should appear in the user’s photograph and send the cyber object and their associated information to the client. The client app will then draw cyber objects on top of the photograph” teaches determining what cyber information/object (corresponds to scaling AR effect) will be drawn on a photograph based on processing by HD4AR, which takes the location and orientation of a mobile device (corresponds to the client computing device) into consideration).
Attorre et al., Loxam et al., and Bae et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image recognition” and “augmented reality”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. and Loxam et al. with Bae et al., with motivation wherein the AR effect is configured to scale itself based on a location and orientation of the client computing device. "The approach supports near real-time localization and information association regardless of size of physical objects, users location, and number of cyber-physical information items" (Bae et al., Conclusion). The proposed teaching is beneficial in that it helps support near real-time localization and information association.
Claim 17 is rejected under 35 U.S.C. 103 as being unpatentable over Attorre et al. in view of Loxam et al. and in further view of Sanches et al. (“Aspects of User Profiles That Can Improve Mobile Augmented Reality Usage”)
Regarding Claim 17,
Attorre et al. in view of Loxam et al. teaches the method of Claim 16, 
Attorre et al. in view of Loxam et al. does not appear to explicitly teach wherein the information associated with the user comprises user affinity information, wherein the user affinity information comprises an affinity coefficient between the user and the AR effect.
However, Sanches et al. teaches wherein the information associated with the user comprises user affinity information, wherein the user affinity information comprises an affinity coefficient between the user and the AR effect (Sanches et al., Conclusion, “The results showed that the age of the user may be related to their performance in the application and this relation occurs due to the greater interest of certain age groups by the use of this type of applications. The affinity factor with games, in this case, may be implicit in the age factor. Factors related to aging were not relevant for performance reduction when the user touches the device screen to interact with an AR application. Young users, in general, performed the task faster than older users. However, the age factor has a higher correlation with the performance of the users when considering only male users” teaches how age factors into how users interact with and perform in an augmented reality application, based on their interest. Conclusion, “In applications whose target audience are young males, tasks may require more effort, since these users tend to perform well. On the other hand, if the target audience of the application is older users, the complexity of the task to be performed in the AR environment must be reduced so that interest in the application is maintained” teaches the affinity factor of age helping in predicting the probability that a user will perform a particular action based on the user's interest in the action and reducing the AR environment (corresponds to AR effect) based on this factor (corresponds to affinity coefficient)).
Attorre et al., Loxam et al., and Sanches et al. are analogous art because they are from the same field of endeavor and are from the same problem solving area. Namely, they pertain to the field of “image recognition” and “augmented reality”. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Attorre et al. and Loxam et al. with Sanches et al., with motivation wherein the information associated with the user comprises user affinity information, wherein the user affinity information comprises an affinity coefficient between the user and the AR effect. “The results of this research may help developers to use AR technology in their applications” (Sanches et al., Conclusion). The proposed teaching is beneficial in that it helps developers use augmented reality technology in their applications.

Response to Arguments
Applicant's arguments filed 3/9/2022 with respect to the 35 U.S.C. 101 rejection to claims 1, 10, 13-18, and 20 have been fully considered but they are not persuasive. Applicant asserts that “Step 2A, Prong One - The Claims Do Not Recite an Abstract Idea Applicant respectfully submits that Claims 1, 10-18, and 20 do not recite a mental process. As an example, amended independent Claim 1 relates to quickly providing artificial reality (AR) effects (e.g., a virtual avatar) as a user experiences an AR environment by accurately identifying features in the environment. As recited in independent Claim 1, this is achieved by identifying matching DL-feature representations within a threshold region in a vector space and determining a matching AR target based on a comparison of received local-feature descriptors with stored local-feature descriptors. This is not an abstract mental process. Instead, as described above Claim 1 relates to providing AR effects as a user experiences an AR environment. The steps of the method recited in Claim 1, and indeed all pending claims, do not recite a judicial exception. Instead, Claims 1, 10-18, and 20 merely involve, and only in part, what could be argued to be a mental process, which is insufficient to render an invention patent ineligible. While the claims may result in evaluation, observation, and judgment (Office Action at 12), to assert that the claims are directed solely to the high-level concept of a "mental process" fails to account for the technical improvements, discussed herein, resulting from the claims as demonstrated in the Specification. That these technical improvements may also result in additional benefits for an entity performing the claims should not be used to undermine the technical nature of the claims. At some level, all inventions, including those that are patent eligible, embody, use, reflect, rest on, or apply abstract ideas. See Alice Corp., 573 U.S. at 217 (citing Mayo Collaborative Svs. v. Prometheus Labs., Inc., 566 U.S. 66, 71 (2012)).” (Remarks, pg. 12).
Examiner’s Response:
The Examiner respectfully disagrees. The claim merely recites performing a comparison of data to determine matching DL-features within a threshold, and the type of data used in a comparison of received data (local-feature descriptors) with stored data (stored local-feature descriptors) to select a corresponding AR target. The proposed improvement appears to merely suggest an improvement based on performing a comparison of received features with features stored in a database. See MPEP 2106.05(a) (“It is important to note, the judicial exception alone cannot provide the improvement. The improvement can be provided by one or more additional elements. See the discussion of Diamond v. Diehr, 450 U.S. 175, 187 and 191-92, 209 USPQ 1, 10 (1981)) in subsection II, below. In addition, the improvement can be provided by the additional element(s) in combination with the recited judicial exception. See MPEP § 2106.04(d) (discussing Finjan, Inc. v. Blue Coat Sys., Inc., 879 F.3d 1299, 1303-04, 125 USPQ2d 1282, 1285-87 (Fed. Cir. 2018)).”). According to Specification [0017], Applicant discloses technological advantages of the described system and method because “the accuracy and performance of the matching process may be vastly improved. It also serves to reduce the expenditure of computational resources, which may be particularly important when dealing with image comparison, which is a computationally challenging task. The DL-feature comparison may be efficient at quickly narrowing down a large number of AR targets to a small subset of potentially matching AR targets.” In this case, it is not clear how the improvement, namely the “accuracy” of the matching process being improved, is reflected by the “identifying...” limitation because the “identifying...” limitation amounts to evaluating feature representations and identifying matching DL-feature representations, which is a mental process.

Applicant asserts that “Independent Claim 1 of this Application, as amended, recites: "… Applicant submits that the remaining limitations, e.g., limitations relating to the computing system identifying a set of matching DL-feature representations based on a comparison ... resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space; and determining ... a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets are more than mere generic computing systems; provide particular requirements of components specifically configured to affect the technological advancements described in the Specification and recited in the claims. At a minimum, the claims include limitations that reflect an improvement to a particular technical field, and apply the alleged abstract idea in another meaningful way beyond generally linking the use of the idea to a particular technological environment. 2019 Guidance at 55. The Specification, in at least paragraphs 17, discusses the significant technological advantages of the described systems and methods over existing systems and methods. See, e.g., Specification at [0017]. Paragraph 17 of the Specification explains that, by performing the steps outlined in Claim 1: [T]he accuracy and performance of the matching process may be vastly improved. It also serves to reduce the expenditure of computational resources, which may be particularly important when dealing with image comparisons, which is a computationally challenging task. The DL-feature comparison may be efficient at quickly narrowing down a large number of AR targets to a small subset of potentially matching AR targets. 
Thus, the claims represent the practical application of any alleged judicial exception. As such, Step 2A, Prong Two is not satisfied, and cannot be satisfied, which concludes the eligibility analysis. 2019 Guidance at 54 ("When the exception is so integrated, then the claim is not directed to a judicial exception (Step 2A: NO) and is eligible.")” (Remarks, pg. 12-14).
Examiner’s Response:
The Examiner respectfully disagrees. The additional element of “a client computing device”, as drafted, is reciting a generic computer component. The generic computer components in these steps are recited at a high-level of generality (i.e., as a generic computer component performing a generic computer function) such that it amounts no more than mere instructions to apply the exception using a generic computer component. According to Specification [0017], Applicant discloses technological advantages of the described system and method because “the accuracy and performance of the matching process may be vastly improved.  It also serves to reduce the expenditure of computational resources, which may be particularly important when dealing with image comparison, which is a computationally challenging task.  The DL-feature comparison may be efficient at quickly narrowing down a large number of AR targets to a small subset of potentially matching AR targets.” MPEP 2106.05(a): “In computer-related technologies, the examiner should determine whether the claim purports to improve computer capabilities or, instead, invokes computers merely as a tool. Enfish, LLC v. Microsoft Corp., 822 F.3d 1327, 1336, 118 USPQ2d 1684, 1689 (Fed. Cir. 2016). In Enfish, the court evaluated the patent eligibility of claims related to a self-referential database. Id. The court concluded the claims were not directed to an abstract idea, but rather an improvement to computer functionality. Id. It was the specification’s discussion of the prior art and how the invention improved the way the computer stores and retrieves data in memory in combination with the specific data structure recited in the claims that demonstrated eligibility. 822 F.3d at 1339, 118 USPQ2d at 1691. 
The claim was not simply the addition of general purpose computers added post-hoc to an abstract idea, but a specific implementation of a solution to a problem in the software arts.” Unlike the guidance provided in the MPEP, the alleged improvement in present Specification [0017] does not describe an improvement to the computer’s capabilities; instead, it merely further supports that the computer is invoked as a tool to perform the claimed steps, which amount to mental processes, such that the “accuracy and performance” of these mental processes (such as identifying matching DL-features) may be improved. In other words, the Specification at best describes an alleged improvement to the abstract idea (such as evaluation), rather than an improvement to computer functionality or to a specific technology.

Applicant asserts that “Step 2B - The Claims Recite Significantly More than the Alleged Judicial Exception Even if Claim 1 were properly held to be directed to a judicial exception, the claims provide an "'inventive concept,'-i.e., an element or combination of elements that is 'sufficient to ensure that the patent in practice amounts to significantly more than a patent"' upon the alleged abstract idea. Alice Corp., 573 U.S. at 217-218 (quoting Mayo, 566 U.S. at 72-73). The Examiner states that the claims do not include anything significantly more than the cited judicial exception of "a mental process." (Office Action at 13.) The Examiner asserts that any additional elements "amounts to no more than mere instructions to apply the exception using a generic computer component." (Office Action at 13.) Applicant respectfully disagrees. The combination of the steps in Claim 1, for example, operates in a non-conventional and non-generic way to provide artificial reality (AR) effects (e.g., a virtual avatar) as a user experiences an AR environment by quickly and accurately identifying features in the environment that may appear differently than stored features. 
In combination, the recited steps (e.g., (a) receiving, from the client computing device, one or more local feature descriptors extracted from the region of interest within the first image, wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch; (b) identifying a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space; (c) determining, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets; and (d) selecting an AR effect associated with the determined matching AR target) are not merely utilized as tools to implement the abstract idea as "apply it" instructions, but instead set up a sequence of events that address unique problems associated with quickly and accurately providing artificial reality (AR) effects as a user experiences an AR environment. Thus, as in Bascom Global Internet v. AT&T Mobility LLC, the claimed combination of additional elements presents a specific implementation of the abstract idea. 827 F.3d 1341, 1353 (Fed. Cir. 2016). For purposes of step 2B, the present claims are similar to Claim 3 of Example 36 of the Eligibility Guidance.
Claim 3 of Example 36 was deemed patent-eligible because it recited a combination of additional elements that amount to significantly more including: (b) extracting characteristics from the acquired image sequence(s) of an item to form feature vectors, the characteristics comprising contour information and character information that is stored in the inventory record as classification data relating to the acquired image sequence(s); (c) recognizing and tracking the position of item in the image sequence as classification and location data by processing the feature vectors using the stored recognition model and adding the classification and location data to the inventory record; and (d) determining a physical location of the item in the warehouse using the location data relating to the item in the image sequence(s). The additional elements of Claim 3, similar to the additional elements recited in the claims of this Application, "do not simply limit the abstract idea to the technological environment of image processing, but are instead meaningful limitations that integrate the abstract idea into a particular application." Likewise, when viewed in combination the particular elements recited in Claim 1 amounts to significantly more than the abstract idea of mental concepts "performed in the human mind (including observation, evaluation, judgment, opinion)."” (Remarks, pg. 12).
Examiner’s Response:
The Examiner respectfully disagrees. First, the limitation “wherein the one or more DL-feature representations are extracted from a region of interest detected… within a first image of a real-world environment captured by the client computing device, region of interest comprising a first depiction of a real-world object” amounts to the mental process of analyzing (evaluating) a region of interest within a first image to determine (evaluate) the DL-feature representations. The second limitation “wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch” amounts to the mental process of analyzing (evaluating) a patch within the region of interest to determine (evaluate) the local feature descriptors. The third limitation “identifying a set of matching DL-feature representations based on a comparison of the received one or more DL-feature representations with a plurality of stored DL-feature representations associated with a plurality of augmented-reality (AR) targets, the comparison resulting in a determination that the set of matching DL-feature representations and the received one or more DL-feature representations are within a threshold region in a vector space” amounts to the mental process of identifying (observing) DL-feature representations and determining (judgement) matching DL-feature representations within a threshold region.
The fourth limitation “determining, from a set of matching AR targets associated with the set of matching DL-feature representations, a matching AR target based on a comparison of the received one or more local-feature descriptors with stored local-feature descriptors associated with the set of matching AR targets, wherein the stored local-feature descriptors are extracted from the set of matching AR targets” amounts to the mental process of determining (evaluating and judgement) matching AR targets by comparing (evaluating and judgement) local feature descriptors. The fifth limitation “selecting an AR effect associated with the determined matching AR target” amounts to the mental process of selecting (evaluating and judgement) an AR effect. As these limitations are directed to mental processes (an abstract idea), these limitations are not additional elements that can amount to significantly more. See MPEP 2106.05(a) (“It is important to note, the judicial exception alone cannot provide the improvement. The improvement can be provided by one or more additional elements. See the discussion of Diamond v. Diehr, 450 U.S. 175, 187 and 191-92, 209 USPQ 1, 10 (1981)) in subsection II, below. In addition, the improvement can be provided by the additional element(s) in combination with the recited judicial exception. See MPEP § 2106.04(d) (discussing Finjan, Inc. v. Blue Coat Sys., Inc., 879 F.3d 1299, 1303-04, 125 USPQ2d 1282, 1285-87 (Fed. Cir. 2018)).”) Further, Applicant pointed to Example 36, but did not provide specific arguments about how the present claims are analogous to those in Example 36.


Applicant's arguments filed 3/9/2022 with respect to the 35 U.S.C. 103 rejection of claims 1-4, 6-7, 10, 13, 15-16, and 18-20 in the previous Office Action have been fully considered but they are not persuasive.
Applicant asserts that “Independent Claim 1 of this Application, as amended, recites: "Even if the proposed Attorre-Loxam combination were proper, it would still fail to disclose, teach, or suggest all of the limitations of independent Claims 1, 18, and 20. As an example, the proposed Attorre-Loxam combination fails to disclose, teach, or suggest receiving, from the client computing device, one or more local feature descriptors extracted from the region of interest within the first image, wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch, as independent Claim 1 recites.
The Examiner asserts that local feature descriptors are disclosed in paragraphs [0034] of Attorre. Office Action at 38 ("identifications (corresponds to local feature descriptors)"). Applicant respectfully disagrees. As discussed above, Attorre merely discloses that "identifications (e.g., coordinates)" may be included in outputs of the logo detection model. Attorre at [0034]. Loxam does not make up for the deficiencies of Attorre, either alone or in combination, and the Examiner does not assert otherwise. In contrast, Claim 1 as amended recites receiving, from the client computing device, one or more local feature descriptors extracted from the region of interest within the first image, wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch. During an interview on 01 March 2022, the Examiner indicated that the proposed amendments appeared to overcome the proposed Attorre-Loxam combination. For at least these reasons, independent Claims 1, 18, and 20 are allowable over the proposed Attorre-Loxam combination. Applicant respectfully requests the Examiner to reconsider and allow these independent claims and all their dependent claims.” (Remarks, pg. 18).
Examiner’s Response:
The Examiner respectfully disagrees. Attorre et al. teaches “receiving, from the client computing device, one or more local feature descriptors extracted from the region of interest within the first image, wherein each of the one or more local feature descriptors corresponds to a patch within the region of interest within the first image and comprises information that encodes one or more visual features present in the patch” (Attorre et al., Para. [0005], “a method includes receiving a source image at one or more computing devices” teaches receiving from a computing device. Para. [0034] and FIG. 4, “logo detection model 410 may detect generic logo patches or regions (i.e., regions that are likely to embody a logo). For example, in the example shown in FIG. 4, logo detection model 410 may receive an image 420, which may include images of one or more logos, and identify candidate regions 430, 432, and 434 that likely embody a logo from image 420. Outputs 440 of logo detection model 410 may thus include sub-images 442, 444, and 446 that may be cropped out from image 420 or may include identifications (e.g., coordinates) of candidate regions 430, 432, and 434” teaches determining patches/regions, sub-images, and identifications (corresponds to local-feature descriptors because identifications can be “coordinates” that describe the regions) from the identified candidate region (corresponds to the region of interest) when an image is received (corresponds to the first image)). Regarding the dependent claims, Applicant relies on the arguments above. Therefore, the above response is applicable to the dependent claims.

Conclusion
THIS ACTION IS MADE FINAL.  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Henry T Nguyen, whose telephone number is (571) 272-8860. The examiner can normally be reached Monday-Friday 8:00am-4:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached on (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HENRY TRONG NGUYEN/Examiner, Art Unit 2125
/KAMRAN AFSHAR/Supervisory Patent Examiner, Art Unit 2125