DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
	This is in response to applicant’s amendment/response filed on 10/20/2022, which has been entered and made of record.  Claims 17 and 20-22 have been amended. Claim 18 has been canceled. Claim 23 has been added.  Claims 17 and 19-23 are pending in the application.

Response to Arguments
	Applicant’s arguments, regarding the newly recited claim language, filed 10/20/2022, with respect to the rejection(s) of claim(s) 17-22 under Murphy et al. as modified by Kakuta et al. and Zimerman et al. have been fully considered and are persuasive.  Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Murphy et al. as modified by Bertolami et al. and Kakuta et al.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 17, 19-21, and 23 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. PGPubs 2017/0228937 to Murphy et al. in view of U.S. PGPubs 2010/0287485 to Bertolami et al., further in view of U.S. PGPubs 2013/0194305 to Kakuta et al.

Regarding claim 17, Murphy et al. teach an image processing apparatus (par 0005), comprising: circuitry configured to (Fig 6, par 0044, par 0079): 
acquire content information of an augmented reality (AR) content including image information of an AR object and position information of the AR object in a real space (par 0019-0020, par 0027, par 0041, par 0054-0064, generate AR content with captured image of real environment based on position information and view point of the mobile device), the content information to be added to map information representing the real space (par 0024, par 0027, “the location-based services platform 103 may additionally include location representation data 107, which may include media (e.g., audio, video) or image data (e.g., panoramic images, photographs, etc.) associated with a determined location (e.g., location information specifying coordinates in metadata). In addition, the location representation data 107 can also include map information. Map information may include maps, satellite images, street and path information, point of interest (POI) information, signing information associated with maps, objects and structures associated with the maps, information about people and the locations of people, coordinate information associated with the information, etc., or a combination thereof. A POI can be a specific point location that a person may, for instance, find interesting or useful “, par 0039-0042, “the user interface 211 may be used to display maps, navigation information, camera images and streams, augmented reality application information, POIs, virtual reality map images, panoramic images, etc. from the memory 217 and/or received over the communication interface 213”), 
acquire, from a client terminal which has received an instruction of the AR content to capture a specific real object (Fig 2, par 0041, receive an input from user to capture a specific scene), a plurality of captured images of the specific real object and position information of the client terminal (par 0039-0041, “the image capture module 207 can process incoming data from the media capture devices. For example, the image capture module 207 can receive a video feed of information relating to a real world environment (e.g., while executing the location-based services application 109 via the runtime module 209). The image capture module 207 can capture one or more images from the information and/or sets of images (e.g., video). These images may be processed by the image processing module 215 to include content retrieved from a location-based services platform 103 or otherwise made available to the location-based services application 109 (e.g., via the memory 217)”, par 0059, “a depiction of the user interface 401 is shown with a mixture of live video or image capture elements in view or atop of the imagery as loaded with respect to FIG. 4B. In this example, at 2:18 PM as indicated on the digital clock 403, the mobile device user runs into another associate also scheduled to be at the same location (e.g., the Legacy Corporation Building). As the user waits to cross the street, the user decides to capture a live video image of the current scenery as imposed over the full resolution imagery of the buildings associated with the user's current location (buildings 1 and 2). This live capture includes a video footage of the associate 421 and a passing vehicle 423. Hence, mixed reality applications may also be appropriately supported in the same manner as described above for AR applications”), and 
generate a 3D virtual object by performing 3D modelling of the specific real object on a basis of the analyzed characteristic points of the plurality of captured images and the position information of the client terminal (par 0025, “The 3D models represent an approximation or likeness of the physical objects associated with a particular location—i.e., streets, buildings, landmarks, etc. of an area. Models can be positioned in virtually any angle or perspective for display on the UE 101. The 3D model can include one or more 3D object models (e.g., models of buildings, trees, signs, billboards, lampposts, landmarks, statues, sites, sceneries, etc.). These 3D object models can further comprise one or more other component object models (e.g., a building can include four wall component models; a sign can include a sign component model and a post component model, etc.)“, par 0029, “the user may be presented a GUI including an image of a location. This image can be tied to the 3D world model (e.g., via a subset of the location representation data 107). The user may then select a portion or point on the GUI (e.g., using a touch enabled input). The UE 101 receives this input and determines a point on the 3D world model that is associated with the selected point. This determination can include the determination of an object model and a point on the object model and/or a component of the object model “, par 0051-0053, “the application 109 or the UE 101 is caused to present the first rendering of a graphical user interface based on location information of a three-dimensional model or models, panoramic image data, etc. corresponding to a starting or current location of the UE 101. 
A change in the rendering location is caused, which leads to one or more transition renderings based in part on models and possible image data associated with the intermediate locations, before the finally the device presents the destination rendering similar to the starting rendering (e.g., the high resolution image or textured 3D rendering) “, par 0056-0057, render a 3D model of the real object based on position information and view point of the mobile device).
But Murphy et al. do not explicitly teach acquire, from a plurality of client terminals each of which has received an instruction of the AR content to capture a specific real object, a plurality of captured 2D images of the specific real object having different viewpoints and different angles, and position information of the plurality of client terminals; analyze characteristic points of the plurality of captured 2D images having the different viewpoints and the different capturing angles, generate a 3D virtual object on a basis of the analyzed characteristic points of the plurality of captured 2D images having the different viewpoints and the different capturing angles and the position information of the plurality of client terminals.

	In related endeavor, Bertolami et al. teach acquire, from a plurality of client terminals each of which has received an instruction of the AR content to capture a specific real object, a plurality of captured 2D images of the specific real object having different viewpoints and different angles (Figs 2-3, par 0025-0027, “Device 220 and/or scene-facing detectors 226a and 226b may capture an image of physical area 230, or otherwise detect objects within physical area 230 and/or derive data from physical area 230. Physical area 230 may have within it landmarks that are detectable, including large landmarks such as tree 231 and tower 232”, par 0028-0029, “including scene-facing detectors 227a and 227b and display 223. Device 221 and/or scene-facing detectors 227a and 227b may also capture an image of physical area 230, or otherwise detect objects within physical area 230 and/or derive data from physical area 230, from the perspective of device 221 or user 211”, par 0032, “Image 251 may be an image of physical area 230 composited with augmented reality images. In image 251, because user 211 and/or device 221 are viewing physical area 230 from a different perspective, the positions of objects in image 251 may be different from the positions on the same objects in image 250”), and position information of the plurality of client terminals (par 0020-0022, “Device 100 may be configured with user-facing detector 120 that may be any type of detection component capable of detecting the position of a user or a part of a user relative to device 100 or detector 120, or detecting a representation of user or a part of a user relative to device 100 or detector 120.”, par 0041-0042, “that individual users will initially observe enough common landmarks from their individual perspectives to obtain a confident estimate of their true locations relative to one another. 
Coarse location estimation may be able to provide user positions but may not be able to provide orientations and, if the area is unmapped, there may be no known features with recorded positions that can be used for precise location determination”); 
analyze characteristic points of the plurality of captured 2D images having the different viewpoints and the different capturing angles(par 0058-0061, “the captured scene image is analyzed to determine a precise location. In one embodiment, one or more portions or subsets of the captured scene image, such as particular groups of pixels or data representing groups of pixels, may be analyzed. Any type of analysis may be employed that can facilitate determining a location …Once landmarks, features, or other elements are matched to mapping information known about a scene or area, an orientation and distance from the identified landmarks, features, or other elements may be determined in order to derive a precise location. This may be accomplished by analyzing the captured image(s) and calculating the distance from the identified elements and the particular angle of view or orientation towards the elements by comparing the captured image(s) to images taken at a known distance or orientation to the elements or associated data. The location of the device that captured the image may be determined in three-dimensional space or two-dimensional space ….coordinate system data is received on a user device. This may include the origin of the coordinate system and the relative location of the user device and/or information that will enable a user device to transform its local coordinate system to the unified coordinate system. For example, a server dedicated to coordinating users in an augmented reality system may determine that a particular local point and orientation are the origin of a coordinate system”), 
generate a 3D virtual object on a basis of the analyzed characteristic points of the plurality of captured 2D images having the different viewpoints and the different capturing angles and the position information of the plurality of client terminals (Figs 2-3, par 0027-0032, “Note that the images presented to users 210 and 211, while in one embodiment contain representations of the same objects and/or users, may be presented in different perspectives, such as the perspective of the device or the user of the device. This may be relatively trivial to implement when presenting an image captured of a scene, such as physical area 230, in a device operated by a user since the device capturing the image necessarily captures the image from the perspective of the device. However, when presenting augmented reality images, alone or composited with scene images, a determination may be need to be made as to the coordinates, locations, and/or positions of such augmented reality images in relation to the user and/or the scene with which such images are composited. This may be done so that the augmented reality images can be manipulated to appear more realistic and, in some embodiments, appear to interact naturally with the scene images with which they are composited”, Fig 3, par 0050-0053, par 0062, “ the user device may present images on a display based on the coordinate system. In one embodiment, a user device may have complete image information for an element stored in memory or otherwise accessible, for example, for a virtual character such as characters 263 and 264. The user device may receive element location and orientation information from another device interacting with an augmented reality system or application, and, using the now known common coordinate system, manipulate the stored element image so that is appears to be located at the element's virtual location and in the proper orientation to a user. 
Alternatively, a device may receive images of an element from another device and manipulate them according to the elements relative location in the common coordinate system. In yet another embodiment, no manipulation may be performed by a user device, and the image may be received from another device appropriately altered to reflect how the element should appear to a user located at the user's location”).
		It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Murphy et al. to include acquire, from a plurality of client terminals each of which has received an instruction of the AR content to capture a specific real object, a plurality of captured 2D images of the specific real object having different viewpoints and different angles, and position information of the plurality of client terminals; analyze characteristic points of the plurality of captured 2D images having the different viewpoints and the different capturing angles, generate a 3D virtual object on a basis of the analyzed characteristic points of the plurality of captured 2D images having the different viewpoints and the different capturing angles and the position information of the plurality of client terminals as taught by Bertolami et al. in order to continue to maintain a common coordinate system that enables the execution of an augmented reality application through determining another user's location relative to the location of that user in the virtual environment, thereby presenting a more realistic environment to users. 
But Murphy et al. as modified by Bertolami et al. do not explicitly teach generate a 3D virtual object by performing 3D modelling of the specific real object on a basis of the analyzed characteristic points of the plurality of captured 2D images having the different viewpoints and the different capturing angles and the position information of the plurality of client terminals.

In related endeavor, Kakuta et al. teach acquire a plurality of captured 2D images of the specific real object having the different viewpoints and the different capturing angles and position information of the plurality of client terminals (Figs 1 and 3, par 0013, par 0015, par 0041, par 0045-0046, obtain multiple images of a real object from a plurality of mobile terminals), and 
generate a 3D virtual object by performing 3D modelling of the specific real object on a basis of the analyzed characteristic points of the plurality of captured 2D images having the different viewpoints and the different capturing angles and the position information of the plurality of client terminals (par 0013, par 0048-0049, par 0052, render a 3D image of a real object based on the position information and view point of the plurality of mobile devices).
		It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Murphy et al. as modified by Bertolami et al. to include generate a 3D virtual object by performing 3D modelling of the specific real object on a basis of the analyzed characteristic points of the plurality of captured 2D images having the different viewpoints and the different capturing angles and the position information of the plurality of client terminals as taught by Kakuta et al. in order to obtain the object image from a real scene using multiple devices at different locations and viewpoints and thereby synthesize a more accurate 3D virtual model of the object. 

Regarding claim 19, Murphy et al. as modified by Bertolami et al. and Kakuta et al. teach all the limitation of claim 17, and further teach wherein the circuitry is further configured to provide a client display terminal with a virtual object acquired from the 3D modelling to be superimposed on a first-person view image representing a virtual space (Murphy et al.: par 0019, “location based services may be called upon to support augmented reality (AR) or mixed reality (MR) applications. AR allows a user's view of the real world, as rendered to a GUI, to be overlaid with additional visual information, while MR allows for the merging of real and virtual worlds to produce visualizations and new environments to the GUI of a device”, par 0025, “The 3D models represent an approximation or likeness of the physical objects associated with a particular location—i.e., streets, buildings, landmarks, etc. of an area. Models can be positioned in virtually any angle or perspective for display on the UE 101. The 3D model can include one or more 3D object models (e.g., models of buildings, trees, signs, billboards, lampposts, landmarks, statues, sites, sceneries, etc.). These 3D object models can further comprise one or more other component object models (e.g., a building can include four wall component models; a sign can include a sign component model and a post component model, etc.)”, par 0044, “if the application 109 is an augmented reality application, the location information may be used to establish the viewpoint using a location, directional heading, and/or tilt angle specified as part of the location information. The viewpoint is then used as the basis for rendering a corresponding user interface”, Kakuta et al.: par 0042, par 0049, par 0052, “a 3D CG object as one example of the virtual image is represented. 
The synthesizing means 112 generates an omnidirectional synthesized image by superimposing an omnidirectional image from the omnidirectional image obtaining camera 2 with the 3D CG object generated by the 3D object representing means 111”, Bertolami et al.: Fig 2, par 0019, par 0036, “Device 100 may also have more than one display. For example, device 100 may be a stereoscopic headgear with two displays, one for each eye, that create three-dimensional image effects when viewed”).

Regarding claim 20, method claim 20 is similar in scope to claim 17 and is rejected under the same rationale.

Regarding claim 21, Murphy et al. teach at least one non-transitory computer-readable medium encoded with instructions which, when executed by at least one processor of an image processing apparatus (par 0006, par 0072). The remaining limitations of the claim are similar in scope to claim 17 and rejected under the same rationale.

Regarding claim 23, Murphy et al. as modified by Bertolami et al. and Kakuta et al. teach all the limitation of claim 17, and further teach wherein the circuitry is further configured to control display of the generated 3D virtual object for provision of the AR content (Murphy et al.: par 0044, “a UE 101 receives a request to render a user interface of a location-based service to the GUI of the device. This request may be facilitated by or in response to the application 109 (e.g., an augmented reality application, mixed reality application, etc.) having access to the location-based services platform 103. The request may also include location information associated with the device (e.g., a UE 101), a user of the device, or the like. By way of example, the location information may be used as a location on which the user interface of the application 109 is based. For example, if the application 109 is an augmented reality application, the location information may be used to establish the viewpoint using a location, directional heading, and/or tilt angle specified as part of the location information. The viewpoint is then used as the basis for rendering a corresponding user interface”, Bertolami et al.: par 0019, “Device 100 may also have more than one display. For example, device 100 may be a stereoscopic headgear with two displays, one for each eye, that create three-dimensional image effects when viewed”, Fig 2, par 0026, “Image 250 may also include images of other users, such as user image 261. 
User image 261 may be an image of a user within physical area 230, or user image 261 may be an augmented reality application or system generated image of a user”, Kakuta et al.: par 0020, “A Mixed Reality display system according to the present invention is a Mixed Reality display system which is constructed to perform communication between an image providing server and a plurality of display devices, the image providing server comprising: a virtual object representing means that represents a virtual object”).

Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over U.S. PGPubs 2017/0228937 to Murphy et al. in view of U.S. PGPubs 2010/0287485 to Bertolami et al., further in view of U.S. PGPubs 2013/0194305 to Kakuta et al., further in view of U.S. PGPubs 2018/0014156 to Zimerman et al., further in view of U.S. PGPubs 2017/0280047 to Kinoshita.

Regarding claim 22, Murphy et al. as modified by Bertolami et al. and Kakuta et al. teach all the limitations of claim 17, and further teach the circuitry is further configured to: perform the 3D modelling on a basis of the plurality of candidate images (Murphy et al.: par 0025, par 0029, par 0051-0053, par 0056-0057, render a 3D model of the real object based on position information and view point of the mobile device, Kakuta et al.: par 0013, par 0048-0049, par 0052, render a 3D image of a real object based on the position information and view point of the plurality of mobile devices), but are silent as to wherein the content information further includes event information of the AR object, and the circuitry is further configured to: determine a plurality of candidate images from the plurality of captured 2D images on a basis of the event information and the position information of the plurality of client terminals.
In related endeavor, Zimerman et al. teach wherein the content information further includes event information of the AR object (Zimerman et al.: par 0004, par 0062-0063, “obtaining media content of an event, comprising: identifying a real-life event, a time of the real-life event and a geographic location of the real-life event; identifying a subset of a plurality of client terminals of users located in proximity to the geographic location of the real-life event at the time of the real-life event; sending a message to the subset of client terminals containing a request to acquire media content documenting the real-life event; and receiving at least one media content item documenting the real-life event from at least one client terminal of the subset of client terminals”), but are silent as to the circuitry is further configured to: determine a plurality of candidate images from the plurality of captured images on a basis of the event information and the position information of the plurality of client terminals.
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Murphy et al. as modified by Bertolami et al. and Kakuta et al. to include wherein the content information further includes event information of the AR object as taught by Zimerman et al. in order to provide AR content from a social network based on the live event, allowing users to view media content from an event when such media is created by others who attended the event and uploaded from mobile devices such as smartphones and/or tablet computers to social network platforms. 
In related endeavor, Kinoshita teaches the circuitry is further configured to: determine a plurality of candidate images from the plurality of captured images on a basis of the event information and the position information of the plurality of client terminals (par 0024-0026, par 0087-0097, par 0669, “to execute a positional state determination process of determining a positional state of each of candidate images detected as candidates for a main subject for a plurality of frames of image data within a field of view, a stable presence degree computation process of obtaining a degree of stable presence of each of the candidate images within the image data spanning the plurality of frames from the positional state of each of the candidate images of each frame determined in the positional state determination process, and a main subject determination process of determining a main subject among the candidate images using the degree of stable presence obtained in the stable presence degree computation process”).
		It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify Murphy et al. as modified by Bertolami et al., Kakuta et al. and Zimerman et al. to include the circuitry is further configured to: determine a plurality of candidate images from the plurality of captured images on a basis of the event information and the position information of the plurality of client terminals as taught by Kinoshita in order to determine a target subject desired by a user such as a photographer and set the subject as a main subject without an action of the user intentionally selecting the subject, and to track the main subject using a function of optimally matching various parameters (focus, brightness and the like) of the camera according to the subject’s position and area.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jin Ge whose telephone number is (571)272-5556. The examiner can normally be reached 8:00 to 5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached on (571)272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

JIN GE
Examiner
Art Unit 2612



/JIN GE/Primary Examiner, Art Unit 2612