Content Capture

The extension of image- and video-based rendering towards dynamic real-world scenery and events has been proposed for years, but is still in its infancy. It involves stitching time-synchronised video sources from different points of view, rather than photographs or a series of video snapshots. Closest to ICoSOLE is the mobicast concept by Kaheel et al. for collaborative event casting using mobile phones. The main scientific challenge is to automatically track and register freely handheld mobile phone video streams w.r.t. a global coordinate space, in real time. A detailed state-of-the-art description of this topic follows below.

Considering these facts, the proposed content generation on the mobile device consists of multiple encoding chains, an innovation investigated in ICoSOLE. One of these encoding chains will be used to produce a high-quality on-demand version of the content, and another will be used for the live multimedia sharing case. This is a challenging architecture, which leverages different resources of the mobile device, e.g. the available hardware resources such as hardware encoders for audio and video, mobile multi-core CPUs as well as GPUs, etc. Furthermore, intelligent content analysis, of at least the audio, will be performed to prevent the streaming of unusable or poor-quality content.
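As a minimal sketch of such an audio pre-filter, the following Python function rejects frames that are heavily clipped or far too quiet before they reach the uplink. The thresholds and the function name are illustrative assumptions, not values defined by ICoSOLE.

```python
import numpy as np

# Illustrative thresholds (assumptions, not project-defined values).
CLIP_RATIO_MAX = 0.02    # reject frames with > 2 % clipped samples
RMS_DBFS_MIN = -50.0     # reject frames quieter than -50 dBFS

def audio_frame_usable(pcm16: np.ndarray) -> bool:
    """Return True if a 16-bit PCM frame looks good enough to stream."""
    x = pcm16.astype(np.float64) / 32768.0
    clip_ratio = float(np.mean(np.abs(x) >= 0.999))
    rms = float(np.sqrt(np.mean(x * x)))
    rms_dbfs = 20.0 * np.log10(max(rms, 1e-12))
    return clip_ratio <= CLIP_RATIO_MAX and rms_dbfs >= RMS_DBFS_MIN

# Example: a quiet, noisy frame is rejected before it reaches the uplink.
rng = np.random.default_rng(0)
frame = rng.normal(0, 30, 1024).astype(np.int16)   # roughly -60 dBFS noise
print(audio_frame_usable(frame))   # False
```

Running such a check on the device keeps unusable contributions off the network entirely, which is cheaper than filtering them out server-side.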

In addition to this, various types of metadata will be integrated into the generated multimedia streams. To that end, existing streaming formats (e.g. DASH) and container formats (e.g. the ISO Base Media File Format) will be extended with timed metadata such as descriptive tags, GPS coordinates, etc. In this context, ICoSOLE will investigate which kind of integration is best suited w.r.t. live streaming, the amount and type of data, support by the formats, etc.
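To illustrate the kind of payload involved, the sketch below aligns descriptive tags and GPS coordinates to the media timeline, mirroring the shape of a sample in an ISO BMFF timed-metadata track. The field names and the JSON serialisation are assumptions for illustration only; which carriage (in-band metadata track, DASH event, or sidecar file) is best suited is exactly the open question named above.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class TimedMetadataSample:
    """One metadata sample aligned to the media timeline.

    Mirrors the shape of a sample in an ISO BMFF timed-metadata track:
    a presentation time, a duration, and a payload. Field names are
    illustrative, not part of any standard.
    """
    pts_ms: int          # presentation timestamp, milliseconds
    duration_ms: int     # validity interval of this sample
    tags: list           # free-text descriptive tags
    lat: float           # GPS latitude, degrees
    lon: float           # GPS longitude, degrees

samples = [
    TimedMetadataSample(0, 5000, ["stage", "crowd"], 51.0543, 3.7174),
    TimedMetadataSample(5000, 5000, ["stage"], 51.0544, 3.7175),
]

# Serialised payload that could be carried in a timed-metadata track or
# as a DASH in-band event; the mapping is what ICoSOLE investigates.
print(json.dumps([asdict(s) for s in samples], indent=2))
```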

So far, virtual camera technology has resulted in speciality camera systems that do not replace other broadcast cameras, or that provide complementary content separate from the usual coverage. ICoSOLE strives for a close integration of virtual camera technology, broadcast cameras and user-contributed content, allowing unprecedented immersive viewing experiences for the spectator at home or on the road, while also aiming at clear cost savings and creating novel creative opportunities in conventional broadcasting of spatially spread-out rather than localised events, as illustrated in the use scenarios.

Technically, ICoSOLE will research and develop a spatiotemporal video navigation, distribution and presentation environment providing a user experience comparable to Photosynth, but exceeding it in the following areas:

  • ICoSOLE organises video of dynamic events rather than photographs of static scenery and adds a timeline as a fourth dimension in addition to 3D spatial navigation in video-on-demand (a minimal sketch of such a space-time clip index follows after this list).
  • ICoSOLE integrates professional broadcast and virtual cameras, as well as user contributed content, in real-time.
  • ICoSOLE provides a tool for conventional television production, and for second screen, in addition to web.
  • ICoSOLE ensures the quality of UGC through the use of metadata and automated content analysis.
  • ICoSOLE creates a new spatial audio experience based on binaural reproduction, navigable just like the imagery (a minimal rendering sketch also follows below).
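
To make the first point concrete, the sketch below shows the kind of space-time query such an environment has to answer: given a viewpoint and a moment on the event timeline, find the registered clips that cover that moment nearby. The names, the flat 2D scene frame and the coordinates are illustrative assumptions, not ICoSOLE's actual data model.

```python
import math
from dataclasses import dataclass

@dataclass
class Clip:
    """A clip registered in the shared scene: where it was shot and
    which interval of the event timeline it covers (placeholders)."""
    clip_id: str
    x: float                # position in a local scene frame, metres
    y: float
    t_start: float          # event-timeline interval, seconds
    t_end: float

def clips_at(clips, x, y, t, radius):
    """Return clips covering time t within `radius` metres of (x, y),
    nearest first -- the core query behind 3D + time navigation."""
    hits = [c for c in clips
            if c.t_start <= t <= c.t_end
            and math.hypot(c.x - x, c.y - y) <= radius]
    return sorted(hits, key=lambda c: math.hypot(c.x - x, c.y - y))

catalogue = [
    Clip("ugc-001", 10.0, 4.0, 0.0, 90.0),     # user-contributed phone clip
    Clip("cam-02", 12.5, 3.0, 30.0, 600.0),    # broadcast camera feed
]
nearby = clips_at(catalogue, x=11.0, y=4.0, t=45.0, radius=5.0)
print([c.clip_id for c in nearby])   # ['ugc-001', 'cam-02']
```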
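For the binaural point, the following sketch renders a mono source to the two ears by convolution with a head-related impulse response (HRIR) pair, the standard building block of binaural reproduction. The HRIRs here are crude delay-and-attenuation placeholders rather than measured responses, and the function name is hypothetical; navigation would amount to swapping in the HRIR pair for the listener's current viewing direction.

```python
import numpy as np
from scipy.signal import fftconvolve

def binauralise(mono: np.ndarray, hrir_l: np.ndarray,
                hrir_r: np.ndarray) -> np.ndarray:
    """Render a mono source to a 2-channel binaural signal by
    convolving it with a left/right HRIR pair for one direction."""
    left = fftconvolve(mono, hrir_l, mode="full")
    right = fftconvolve(mono, hrir_r, mode="full")
    return np.stack([left, right], axis=0)

# Placeholder HRIRs (pure delay + attenuation instead of measured
# responses): the source ends up slightly to the listener's left.
fs = 48_000
hrir_l = np.zeros(64); hrir_l[0] = 1.0
hrir_r = np.zeros(64); hrir_r[12] = 0.6    # ~0.25 ms later and quieter
mono = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
stereo = binauralise(mono, hrir_l, hrir_r)
print(stereo.shape)   # (2, 48063)
```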