DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Allowable Subject Matter
Claims 9 and 28 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-8 and 10-27 are rejected under 35 U.S.C. 103 as being unpatentable over Dai et al. (CN 110245677 A) in view of Szarvas et al. (US 10909442 B1).

Claim 1. Dai et al. disclose an event detection method comprising: 
receiving a query video snippet (read as input image data (English Translation));
encoding the query video snippet into a low dimensional descriptor of a code space (read as the data input from the encoder, encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)), 
wherein the code space includes a plurality of class clusters characterized by one or more of a minimum trending reconstruction error (read as minimizing output data with the reconstruction error of the input data (English Translation)), a minimum trending class entropy, a maximum trending fit of video snippets and a maximum trending compactness of the code space; 
classifying the low dimensional descriptor of the query video snippet based on its proximity to a nearest one (read as to obtain a low-dimensional descriptor … minimizing output data with the reconstruction error of the input data (English Translation)) of a plurality of class clusters of the code space; and 
outputting an indication of an event class of the query video snippet based on the classified class cluster of the low dimensional descriptor of the query video snippet (read as the output of the encoder as the descriptor of the image after dimensionality reduction (English Translation)).
Dai et al. do not explicitly disclose classifying the descriptor to the nearest class cluster. However, in the related field of endeavor, Szarvas et al. disclose: In at least one embodiment, a hinge loss function may be used for at least part of the objective function (Column 19 lines 58-60). The idea of using a loss function is clearly disclosed by Szarvas et al.
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teaching of Dai et al. with the teaching of Szarvas et al. in order to use methods and apparatus for providing interpretable content-based recommendations with respect to a variety of content sources (such as books) using low-dimension descriptors (Szarvas et al.: Column 2 lines 49-52).
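
For illustration only, the classifying limitation of Claim 1 (assigning a low dimensional descriptor to the nearest of a plurality of class clusters) can be sketched as a nearest-centroid lookup. The centroid values, the two-dimensional code space, the use of Euclidean distance, and the names CENTROIDS and classify_descriptor are assumptions made for this sketch; they are not taken from the application, Dai et al., or Szarvas et al.

```python
import math

# Hypothetical cluster centroids in a 2-D code space. The labels and the
# coordinates are illustrative assumptions, not values from any reference.
CENTROIDS = {
    "cycle": (0.0, 0.0),
    "not_cycle": (4.0, 4.0),
}


def classify_descriptor(descriptor, centroids):
    """Return the label of the centroid nearest to the descriptor.

    Proximity is measured with Euclidean distance (an assumption).
    """
    return min(centroids, key=lambda label: math.dist(descriptor, centroids[label]))


if __name__ == "__main__":
    # A descriptor near the origin falls in the "cycle" cluster.
    print(classify_descriptor((0.5, -0.2), CENTROIDS))
    # A descriptor near (4, 4) falls in the "not_cycle" cluster.
    print(classify_descriptor((3.1, 3.9), CENTROIDS))
```

The returned label plays the role of the indication of an event class that is output in the final step of the claim.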

Claim 2. The event detection method of Claim 1, the combination of Dai et al. and Szarvas et al. teaches,
wherein the low dimensional descriptors are encouraged to belong to any one of a plurality of class clusters (Szarvas et al.: read as a clustering algorithm or an algorithm such as K-nearest neighbors may be used on the descriptors (Column 20 lines 1-2)), but discouraged from being between the plurality of class clusters in the code space.

Claim 3. The event detection method of Claim 1, the combination of Dai et al. and Szarvas et al. teaches,
wherein the plurality of class clusters are mapped to corresponding event classes (Szarvas et al.: read as a clustering algorithm or an algorithm such as K-nearest neighbors may be used on the descriptors (Column 20 lines 1-2)).

Claim 4. The event detection method of Claim 2, the combination of Dai et al. and Szarvas et al. teaches,
wherein the event classes include a cycle class (Dai et al.: read as the trained model (English Translation). Models are usually trained through cycles.) and a not cycle class.

Claim 5. The event detection method of Claim 2, the combination of Dai et al. and Szarvas et al. teaches,
wherein the event classes include a plurality of action cycle classes (Dai et al.: read as the trained model (English Translation)).

Claim 6. The event detection method of Claim 1, the combination of Dai et al. and Szarvas et al. teaches,
further comprising: 
encoding a plurality of training video snippets into low dimensional descriptors of the training video snippets in the code space (Dai et al.: read as the data input from the encoder, encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)); 
decoding the low dimensional descriptors of the training video snippets into corresponding reconstructed video snippets (Dai et al.: read as encoder outputs the result decoded by the decoder, and output data (English Translation)); and 
adjusting one or more parameters of the encoding and decoding based on a loss function including (Szarvas et al.: read as In at least one embodiment, a hinge loss function may be used for at least part of the objective function (Column 19 lines 58-60)) one or more objective functions selected from a group including to reduce a reconstruction error (Dai et al.: read as to obtain a low-dimensional descriptor … minimizing output data with the reconstruction error of the input data (English Translation)) between the one or more training video snippets and the corresponding one or more reconstructed video snippets, to reduce class entropy of the plurality of event classes of the code space, to increase fit of the training video snippet, and to increase compactness of the code space (Dai et al.: read as obtain a compression encoder of the high dimensional image descriptor to the descriptor of the low-dimensional space (English Translation)).

Claim 7. The event detection method of Claim 6, the combination of Dai et al. and Szarvas et al. teaches,
further comprising: 
encoding one or more labeled video snippets of a plurality of event classes into low dimensional descriptors of the labeled video snippets in the code space (Dai et al.: read as input image data, performing normalization processing to the data; the convolution of the data input from the encoder, encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)); and 
mapping the plurality of event classes to class clusters corresponding to the low dimensional descriptors of the labeled video snippets (Szarvas et al.: read as a clustering algorithm or an algorithm such as K-nearest neighbors may be used on the descriptors (Column 20 lines 1-2)).

Claim 8. The event detection method of Claim 6, the combination of Dai et al. and Szarvas et al. teaches,
further comprising: 
decoding the low dimensional descriptors of the query video snippets into corresponding reconstructed video snippets (Dai et al.: read as encoder outputs the result decoded by the decoder, and output data…the output of the encoder as the descriptor of the image after dimensionality reduction (English Translation)); and 
further adjusting the one or more parameters of the encoding and decoding based on the loss function (Szarvas et al.: read as In at least one embodiment, a hinge loss function may be used for at least part of the objective function (Column 19 lines 58-60)).
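
For reference, the hinge loss named in the cited passage of Szarvas et al. (Column 19 lines 58-60) is commonly written for a binary label y in {+1, -1} and a raw score s as max(0, 1 - y*s). The sketch below shows that textbook form only; the function names and the batch averaging are assumptions of this sketch and are not asserted to be the formulation used by Szarvas et al. or by the application.

```python
def hinge_loss(label, score):
    """Textbook binary hinge loss.

    label is +1 or -1; score is the raw (unsquashed) model output.
    The loss is zero once the score is on the correct side with margin >= 1.
    """
    return max(0.0, 1.0 - label * score)


def mean_hinge_loss(labels, scores):
    """Average hinge loss over a batch (hypothetical helper)."""
    losses = [hinge_loss(y, s) for y, s in zip(labels, scores)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    print(hinge_loss(1, 2.0))    # correct side, outside the margin
    print(hinge_loss(1, 0.5))    # correct side, inside the margin
    print(hinge_loss(-1, 0.5))   # wrong side
```

A parameter update driven by this loss would lower the value returned by mean_hinge_loss over the training batch.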

Claim 10. Dai et al. disclose an event detection method comprising: 
encoding a plurality of training video snippets into low dimensional descriptors of the training video snippets in a code space (read as input image data … the data input from the encoder, encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)); 
decoding the low dimensional descriptors of the training video snippets into corresponding reconstructed video snippets (read as encoder outputs the result decoded by the decoder, and output data (English Translation)); and 
adjusting one or more parameters of the encoding and decoding based on a loss function including one or more objective functions selected from a group including to reduce a reconstruction error between the one or more training video snippets and the corresponding one or more reconstructed video snippets (read as to obtain a low-dimensional descriptor … minimizing output data with the reconstruction error of the input data (English Translation)), to reduce a class entropy of the plurality of event classes of the code space, to increase fit of the training video snippet, and to increase compactness of the code space (read as obtain a compression encoder of the high dimensional image descriptor to the descriptor of the low-dimensional space (English Translation)).
Dai et al. do not explicitly disclose decoding based on a loss function including one or more objective functions. However, in the related field of endeavor, Szarvas et al. disclose: In at least one embodiment, a hinge loss function may be used for at least part of the objective function (Column 19 lines 58-60). The idea of using a loss function is clearly disclosed by Szarvas et al.
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teaching of Dai et al. with the teaching of Szarvas et al. in order to use methods and apparatus for providing interpretable content-based recommendations with respect to a variety of content sources (such as books) using low-dimension descriptors (Szarvas et al.: Column 2 lines 49-52).
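
For illustration only, a loss function that combines several of the objective functions recited in Claim 10 can be sketched as a weighted sum. Everything specific here is an assumption of the sketch: mean squared error as the reconstruction error, Shannon entropy as the class entropy, squared distance to a centroid as the compactness term, the weights w_rec, w_ent and w_cmp, and all function names. The "fit" objective is omitted because this action does not define it.

```python
import math


def reconstruction_error(original, reconstructed):
    """Mean squared error between a snippet and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def class_entropy(probabilities):
    """Shannon entropy (in nats) of a class-membership distribution.

    Low entropy means the descriptor belongs clearly to one class cluster.
    """
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def compactness_penalty(descriptor, centroid):
    """Squared distance from a descriptor to its assigned cluster centroid."""
    return sum((d - c) ** 2 for d, c in zip(descriptor, centroid))


def total_loss(original, reconstructed, probabilities, descriptor, centroid,
               w_rec=1.0, w_ent=0.1, w_cmp=0.1):
    """Weighted sum of the three terms; the weights are arbitrary assumptions."""
    return (w_rec * reconstruction_error(original, reconstructed)
            + w_ent * class_entropy(probabilities)
            + w_cmp * compactness_penalty(descriptor, centroid))


if __name__ == "__main__":
    loss = total_loss(
        original=[0.2, 0.8, 0.5],
        reconstructed=[0.25, 0.7, 0.5],
        probabilities=[0.9, 0.1],
        descriptor=(0.1, 0.2),
        centroid=(0.0, 0.0),
    )
    print(loss)
```

Adjusting the encoder and decoder parameters to lower total_loss would reduce the reconstruction error and the class entropy and tighten the clusters at the same time.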

Claim 11. The event detection method of Claim 10, the combination of Dai et al. and Szarvas et al. teaches,
wherein the low dimensional descriptors are encouraged to belong to any one of a plurality of class clusters (Szarvas et al.: read as a clustering algorithm or an algorithm such as K-nearest neighbors may be used on the descriptors (Column 20 lines 1-2)), but discouraged from being between the plurality of class clusters in the code space.

Claim 12. The event detection method of Claim 10, the combination of Dai et al. and Szarvas et al. teaches,
wherein the plurality of event classes include a cycle class (Dai et al.: read as the trained model (English Translation). Models are usually trained through cycles.) and a not cycle class.

Claim 13. The event detection method of Claim 10, the combination of Dai et al. and Szarvas et al. teaches,
wherein the plurality of event classes include a plurality of action cycle classes (Dai et al.: read as the trained model (English Translation). Models are usually trained through cycles.).

Claim 14. The event detection method of Claim 10, the combination of Dai et al. and Szarvas et al. teaches,
further comprising: 
encoding one or more labeled video snippets of a plurality of event classes into low dimensional descriptors of the labeled video snippets in the code space (Dai et al.: read as a convolutional neural network to realize the image descriptor dimensionality reduction … encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)); and 
mapping the plurality of event classes to class clusters corresponding to the low dimensional descriptors of the labeled video snippets (Szarvas et al.: read as a clustering algorithm or an algorithm such as K-nearest neighbors may be used on the descriptors (Column 20 lines 1-2)).
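
For illustration only, the K-nearest neighbors approach named in the cited passage of Szarvas et al. (Column 20 lines 1-2) can be sketched as a majority vote among labeled descriptors. The descriptor values, the labels, the choice of k, Euclidean distance, and the names LABELED and knn_label are assumptions of this sketch and are not drawn from either reference.

```python
import math
from collections import Counter

# Hypothetical labeled descriptors in a 2-D code space: (descriptor, event class).
LABELED = [
    ((0.0, 0.1), "cycle"),
    ((0.2, -0.1), "cycle"),
    ((0.1, 0.3), "cycle"),
    ((4.0, 4.1), "not_cycle"),
    ((3.8, 4.2), "not_cycle"),
    ((4.1, 3.9), "not_cycle"),
]


def knn_label(query, labeled_descriptors, k=3):
    """Return the majority label among the k labeled descriptors nearest to query."""
    nearest = sorted(labeled_descriptors,
                     key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_label((0.1, 0.0), LABELED))
    print(knn_label((3.9, 4.0), LABELED))
```

Here the labeled descriptors stand in for the encoded labeled video snippets, and the vote associates a region of the code space with an event class.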

Claim 15. Dai et al. disclose an event detection device comprising: 
a neural network encoder configured to encode a query video snippet into a low dimensional descriptor of a code space (read as a convolutional neural network to realize the image descriptor dimensionality reduction … encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)), wherein the code space includes a plurality of class clusters characterized by one or more of a minimum trending reconstruction error (read as minimizing output data with the reconstruction error of the input data (English Translation)), a minimum trending class entropy, a maximum trending fit of video snippets and a maximum compactness of the code space; and 
a class cluster classifier configured to classify the low dimensional descriptor of the query video snippet based on its proximity to a nearest one of a plurality of class clusters of the code space and output an indication of an event class of the query video snippet based on the classified class cluster of the low dimensional descriptor of the query video snippet (read as the output of the encoder as the descriptor of the image after dimensionality reduction (English Translation)).
Dai et al. do not explicitly disclose classifying the descriptor to the nearest class cluster. However, in the related field of endeavor, Szarvas et al. disclose: In at least one embodiment, a hinge loss function may be used for at least part of the objective function (Column 19 lines 58-60). The idea of using a loss function is clearly disclosed by Szarvas et al.
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teaching of Dai et al. with the teaching of Szarvas et al. in order to use methods and apparatus for providing interpretable content-based recommendations with respect to a variety of content sources (such as books) using low-dimension descriptors (Szarvas et al.: Column 2 lines 49-52).

Claim 16. The event detection device according to Claim 15, the combination of Dai et al. and Szarvas et al. teaches,
wherein the plurality of class clusters are mapped to corresponding event classes (Szarvas et al.: read as a clustering algorithm or an algorithm such as K-nearest neighbors may be used on the descriptors (Column 20 lines 1-2)).

Claim 17. The event detection device according to Claim 16, the combination of Dai et al. and Szarvas et al. teaches,
wherein the event classes include a cycle class (Dai et al.: read as the trained model (English Translation). Models are usually trained through cycles.) and a not cycle class.

Claim 18. The event detection device according to Claim 16, the combination of Dai et al. and Szarvas et al. teaches,
wherein the event classes include a plurality of action cycle classes (Dai et al.: read as the trained model (English Translation). Models are usually trained through cycles.).

Claim 19. The event detection device according to Claim 15, the combination of Dai et al. and Szarvas et al. teaches,
further comprising: 
the neural network encoder further configured to encode one or more labeled video snippets of a plurality of event classes into low dimensional descriptors of the labeled video snippets in the code space (Dai et al.: read as a convolutional neural network to realize the image descriptor dimensionality reduction … encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)); and 
the class cluster classifier further configured to map the plurality of event classes to class clusters corresponding to the low dimensional descriptors of the labeled video snippets (Szarvas et al.: read as a clustering algorithm or an algorithm such as K-nearest neighbors may be used on the descriptors (Column 20 lines 1-2)).

Claim 20. The event detection device according to Claim 15, the combination of Dai et al. and Szarvas et al. teaches,
further comprising: 
the neural network encoder further configured to encode a plurality of training video snippets into low dimensional descriptors of the training video snippets in the code space (Dai et al.: read as a convolutional neural network to realize the image descriptor dimensionality reduction … encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)); 
a neural network decoder configured to decode the low dimensional descriptors of the training video snippets into corresponding reconstructed video snippets (Dai et al.: read as minimizing output data with the reconstruction error of the input data (English Translation)); and 
a loss function configured to adjust (Szarvas et al.: read as In at least one embodiment, a hinge loss function may be used for at least part of the objective function (Column 19 lines 58-60)) one or more parameters of the neural network encoder and the neural network decoder based on one or more objective functions selected from a group including to reduce a reconstruction error between the one or more training video snippets (Dai et al.: read as to obtain a low-dimensional descriptor … minimizing output data with the reconstruction error of the input data (English Translation)) and the corresponding one or more reconstructed video snippets, to reduce class entropy of the plurality of event classes of the code space, to increase fit of the training video snippets, and to increase compactness of the code space (Dai et al.: read as obtain a compression encoder of the high dimensional image descriptor to the descriptor of the low-dimensional space (English Translation)).

Claim 21. The event detection device according to Claim 20, the combination of Dai et al. and Szarvas et al. teaches,
further comprising: 
the neural network decoder further configured to decode the low dimensional descriptors of the query video snippets into corresponding reconstructed video snippets (read as a convolutional neural network to realize the image descriptor dimensionality reduction … encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)); and 
the loss function further configured to adjust the one or more parameters of the neural network encoder and the neural network decoder based on the one or more objective functions (Szarvas et al.: read as In at least one embodiment, a hinge loss function may be used for at least part of the objective function (Column 19 lines 58-60)).

Claim 22. Dai et al. disclose an event detection device comprising: 
a neural network encoder configured to encode a plurality of training video snippets into low dimensional descriptors of the training video snippets in the code space (read as a convolutional neural network to realize the image descriptor dimensionality reduction … encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)); 
a neural network decoder configured to decode the low dimensional descriptors of the training video snippets into corresponding reconstructed video snippets (read as after the trained model, the output of the encoder as the descriptor of the image after dimensionality reduction (English Translation)); and 
a loss function configured to adjust one or more parameters of the neural network encoder and the neural network decoder based on one or more objective functions selected from a group including to reduce a reconstruction error between the one or more training video snippets and the corresponding one or more reconstructed video snippets (read as a convolutional neural network to realize the image descriptor dimensionality reduction … encoded by the encoder, to obtain a low-dimensional descriptor … to optimize the model parameters by minimizing output data with the reconstruction error of the input data (English Translation)), to reduce class entropy of the plurality of event classes of the code space, to increase fit of the training video snippets, and to increase compactness of the code space (read as obtain a compression encoder of the high dimensional image descriptor to the descriptor of the low-dimensional space (English Translation)).
Dai et al. do not explicitly disclose decoding based on a loss function including one or more objective functions. However, in the related field of endeavor, Szarvas et al. disclose using a loss function: In at least one embodiment, a hinge loss function may be used for at least part of the objective function (Column 19 lines 58-60).
Therefore, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the teaching of Dai et al. with the teaching of Szarvas et al. in order to use methods and apparatus for providing interpretable content-based recommendations with respect to a variety of content sources (such as books) using low-dimension descriptors (Szarvas et al.: Column 2 lines 49-52).

Claim 23. The event detection device according to Claim 22, the combination of Dai et al. and Szarvas et al. teaches,
wherein the plurality of class clusters are mapped to corresponding event classes (Szarvas et al.: read as a clustering algorithm or an algorithm such as K-nearest neighbors may be used on the descriptors (Column 20 lines 1-2)).

Claim 24. The event detection device according to Claim 23, wherein the event classes include a cycle class (Dai et al.: read as the trained model (English Translation). Models are usually trained through cycles.) and a not cycle class.

Claim 25. The event detection device according to Claim 23, wherein the event classes include a plurality of action cycle classes (Dai et al.: read as the trained model (English Translation). Models are usually trained through cycles.).

Claim 26. The event detection device according to Claim 22, the combination of Dai et al. and Szarvas et al. teaches,
further comprising: 
the neural network encoder further configured to encode one or more labeled video snippets of a plurality of event classes into low dimensional descriptors of the labeled video snippets in the code space (read as a convolutional neural network to realize the image descriptor dimensionality reduction … encoded by the encoder, to obtain a low-dimensional descriptor (English Translation)), wherein the code space includes a plurality of class clusters characterized by one or more of a minimum trending reconstruction error (read as a convolutional neural network to realize the image descriptor dimensionality reduction … encoded by the encoder, to obtain a low-dimensional descriptor … to optimize the model parameters by minimizing output data with the reconstruction error of the input data (English Translation)), a minimum trending class entropy, a maximum trending fit of video snippets and a maximum compactness of the code space; and 
a class cluster classifier configured to map the plurality of event classes to class clusters corresponding to the low dimensional descriptors of the labeled video snippets (read as obtain a compression encoder of the high dimensional image descriptor to the descriptor of the low-dimensional space (English Translation)).

Claim 27. The event detection device according to Claim 26, the combination of Dai et al. and Szarvas et al. teaches,
further comprising: 
the neural network encoder further configured to encode a query video snippet into a low dimensional descriptor of the code space (read as a convolutional neural network to realize the image descriptor dimensionality reduction … encoded by the encoder, to obtain a low-dimensional descriptor … to optimize the model parameters by minimizing output data with the reconstruction error of the input data (English Translation)), wherein the code space includes a plurality of class clusters characterized by one or more of a minimum trending reconstruction error, a minimum trending class entropy, a maximum trending fit of video snippets and a maximum compactness of the code space (read as obtain a compression encoder of the high dimensional image descriptor to the descriptor of the low-dimensional space (English Translation)); and 
the class cluster classifier further configured to classify the low dimensional descriptor of the query video snippet based on its proximity to a nearest one of a plurality of class clusters of the code space and output an indication of an event class of the query video snippet based on the classified class cluster of the low dimensional descriptor of the query video snippet (Szarvas et al.: read as a clustering algorithm or an algorithm such as K-nearest neighbors may be used on the descriptors (Column 20 lines 1-2)).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMED RACHEDINE whose telephone number is (571)272-9249. The examiner can normally be reached Mon-Fri 8-5.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Lester Kincaid, can be reached at (571)272-7922. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

MOHAMMED RACHEDINE
Examiner
Art Unit 2649



/MOHAMMED RACHEDINE/Primary Examiner, Art Unit 2646