This service registers photos precisely within a building using AI-based 6D camera pose estimation. In this process, the exact position and orientation of the camera on a building floor are determined.
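
A 6D pose combines three position coordinates with three orientation angles. The following is a minimal sketch of such a pose in Python; the class name `CameraPose`, the field names, the units and the yaw/pitch/roll convention are assumptions made for illustration and are not taken from the service itself.

```python
import math
from dataclasses import dataclass


@dataclass
class CameraPose:
    """Hypothetical 6D pose: position in metres, orientation in radians."""
    x: float
    y: float
    z: float
    yaw: float    # rotation about the z axis (assumed to be vertical)
    pitch: float  # rotation about the y axis
    roll: float   # rotation about the x axis

    def rotation_matrix(self):
        """Return the 3x3 rotation R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
        cy, sy = math.cos(self.yaw), math.sin(self.yaw)
        cp, sp = math.cos(self.pitch), math.sin(self.pitch)
        cr, sr = math.cos(self.roll), math.sin(self.roll)
        return [
            [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
            [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
            [-sp, cp * sr, cp * cr],
        ]
```

Other parameterizations (for example a quaternion or a 4x4 matrix) carry the same six degrees of freedom; the Euler-angle form is used here only because it is easy to read.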

A semantic 3D model previously created by analyzing plans or point clouds is used for the registration. Semantic features such as ceiling, floor, wall, door and window are used to enable precise mapping of the photo to the 3D model. A detailed BIM model is not required.

To resolve ambiguities, the service can return the n most likely camera poses for a photo (n ≥ 1). Filters and prior knowledge, e.g. that all images were taken in the same room, can be used to further refine the result. At a minimum, the floor on which the photo was taken must be known.
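
How a client might narrow down several returned candidates can be sketched as follows. This is an illustrative Python example only: the `PoseCandidate` type, its fields and the `select_poses` helper are hypothetical and do not describe the service's actual interface.

```python
from dataclasses import dataclass


@dataclass
class PoseCandidate:
    """Hypothetical candidate pose with a likelihood score."""
    room: str        # room the candidate pose lies in (assumed field)
    score: float     # higher means more likely (assumed convention)
    position: tuple  # (x, y, z) of the camera


def select_poses(candidates, n=1, room=None):
    """Return the n highest-scoring candidates, optionally limited to one room."""
    if room is not None:
        # Prior knowledge: discard candidates outside the known room.
        candidates = [c for c in candidates if c.room == room]
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:n]


candidates = [
    PoseCandidate("office", 0.61, (2.0, 1.5, 1.6)),
    PoseCandidate("corridor", 0.74, (8.2, 0.9, 1.6)),
    PoseCandidate("office", 0.35, (3.1, 4.0, 1.6)),
]
best_in_office = select_poses(candidates, n=1, room="office")
```

Without the room filter, the corridor candidate would rank first; with it, the best office candidate is chosen instead.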

Overlaying the photo exactly on the model yields semantic information for the photo. Conversely, objects recognized in the image by other services can then be transferred to the model.
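
The overlay idea can be illustrated with a simple pinhole projection: once the camera pose is known, labelled points of the 3D model can be mapped to pixel coordinates. The sketch below is an assumption-laden Python example, not the service's method; the function names, the camera-to-world rotation convention, the camera axes (x right, y down, z forward) and the intrinsics `focal`, `cx`, `cy` are all chosen for illustration.

```python
def project_point(point, cam_pos, cam_rot, focal, cx, cy):
    """Project a world point to pixel coordinates (u, v).

    cam_rot is a 3x3 camera-to-world rotation (list of rows).
    Returns None if the point lies behind the camera.
    """
    # Vector from the camera centre to the point, in world coordinates.
    d = [point[i] - cam_pos[i] for i in range(3)]
    # Rotate into the camera frame by applying the transpose of cam_rot.
    pc = [sum(cam_rot[r][c] * d[r] for r in range(3)) for c in range(3)]
    if pc[2] <= 0:
        return None
    return (focal * pc[0] / pc[2] + cx, focal * pc[1] / pc[2] + cy)


def label_pixels(labelled_points, cam_pos, cam_rot, focal, cx, cy):
    """Return (label, pixel) pairs for all model points in front of the camera."""
    result = []
    for label, point in labelled_points:
        pixel = project_point(point, cam_pos, cam_rot, focal, cx, cy)
        if pixel is not None:
            result.append((label, pixel))
    return result


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model_points = [("door", (1.0, 0.0, 2.0)), ("window", (0.0, 0.0, -1.0))]
visible = label_pixels(model_points, (0.0, 0.0, 0.0), identity, 500.0, 320.0, 240.0)
```

In this example the window point lies behind the camera and is dropped, while the door point receives a pixel position. Transferring detected objects from the image back to the model would follow the reverse direction, casting a ray from the pixel into the model.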


  • Quality: The training data is synthetically created. Test images must contain spatial edges or other geometric features that can be matched to the semantic model.
  • Input data format:
    • 2D RGB image (PNG, JPG).
    • Minimal 3D description (walls, windows, doors) in JSON exchange format. The exchange format will be converted to Structured3D format for internal processing.
    • Optional: Prior knowledge (e.g. room)
  • Output data format: Camera pose as JSON
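
The exact JSON schemas are not specified here. Purely as an illustration of the data flow listed above, the following Python sketch builds a request and parses a pose; every field name (`image`, `model`, `walls`, `doors`, `windows`, `prior`, `position`, `orientation`) and both helper functions are hypothetical and should not be read as the service's actual exchange format.

```python
import json


def build_request(image_path, walls, doors=(), windows=(), room=None):
    """Assemble a hypothetical request: image reference, minimal 3D description, optional prior."""
    request = {
        "image": image_path,
        "model": {
            "walls": list(walls),
            "doors": list(doors),
            "windows": list(windows),
        },
    }
    if room is not None:
        # Optional prior knowledge, e.g. the room the photo was taken in.
        request["prior"] = {"room": room}
    return json.dumps(request)


def parse_pose(response_text):
    """Read a hypothetical pose response into (position, orientation) tuples."""
    data = json.loads(response_text)
    position = tuple(data["position"])
    orientation = tuple(data["orientation"])
    if len(position) != 3 or len(orientation) != 3:
        raise ValueError("expected three position and three orientation values")
    return position, orientation


# Each wall is given here as two floor-plan corner points (an assumption).
walls = [[[0, 0], [5, 0]], [[5, 0], [5, 4]]]
request_text = build_request("photo.jpg", walls, room="office")
position, orientation = parse_pose(
    '{"position": [2.0, 1.5, 1.6], "orientation": [0.3, 0.0, 0.0]}'
)
```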


Niklas Gard, Aleixo Cambeiro Barreiro. Towards automated digital building model generation from floorplans and on-site images. 34. Forum Bauinformatik. 2023.

Niklas Gard, Fraunhofer HHI