We approach the problem by building a hardware prototype and characterize the end-to-end system bottlenecks of power and performance. The prototype has 6 IMX274 cameras and uses Nvidia Jetson TX2 development board for capture and computation. We found that capturing is bottlenecked by sensor power and data-rates across interfaces, whereas compute is limited by the total number of computations per frame. Our characterization shows that redundant capture and redundant computations lead to high power, huge memory footprint, and high latency. The existing systems lack hardware-software co-design aspects, leading to excessive data transfers across the interfaces and expensive computations within the individual subsystems. Finally, we propose mechanisms to optimize the system for low power and low latency. We emphasize the importance of co-design of different subsystems to reduce and reuse the data. For example, reusing the motion vectors of the ISP stage reduces the memory footprint of the stereo correspondence stage. Our estimates show that pipelining and parallelization on custom FPGA can achieve real time stitching.
The work characterizes the thermal implications of using 3D stacked image sensors with near-sensor vision processing units. The characterization reveals that near-sensor processing reduces system power but degrades image quality. For reasonable image fidelity, the sensor temperature needs to stay below a threshold, situationally determined by application needs. Fortunately, the characterization also identifies opportunities -- unique to the needs of near-sensor processing -- to regulate temperature based on dynamic visual task requirements and rapidly increase capture quality on demand.
Based on the characterization, the work proposes and investigate two thermal management strategies -- stop-capture-go and seasonal migration -- for imaging-aware thermal management. The work present parameters that govern the policy decisions and explore the trade-offs between system power and policy overhead. The work's evaluation shows that the novel dynamic thermal management strategies can unlock the energy-efficiency potential of near-sensor processing with minimal performance impact, without compromising image fidelity.
To evaluate the runtime performance and perceptual efficacy of the system, GLEAM was implemented on the Unity 3D Game Engine. The implementation was deployed on Android and iOS devices. On these implementations, GLEAM can prioritize dynamic estimation with update intervals as low as 15 ms or prioritize high spatial quality with update intervals of 200 ms. User studies across 99 participants and 26 scene comparisons reported a preference towards GLEAM over other lighting techniques in 66.67% of the presented augmented scenes and indifference in 12.57% of the scenes. A controlled lighting user study on 18 participants revealed a general preference for policies that strike a balance between resolution and update rate.
Keywords: virtual reality, film, videogame, sandbox
Java Mission-planning and Analysis for Remote Sensing (JMARS) is a geospatial software that provides mission planning and data-analysis tools with access to orbital data for planetary bodies like Mars and Venus. Using JMARS, terrain scenes can be prepared with an assortment of data layers along with any additional data sets. These scenes can then be exported into the JMARS extended reality platform, which includes both augmented reality and virtual reality experiences. JMARS VR Viewer is a virtual reality experience that allows users to view three-dimensional terrain data in a fully immersive and interactive way. This tool also provides a collaborative environment for users to host a terrain scene where people can analyze the data together. The purpose of the project is to design a set of interactions in virtual reality to try and address these questions: (1) how do we make sense of larger complex geospatial datasets, (2) how can we design interactions that assist users in understanding layered data in both an individual and collaborative work environment, and (3) what are the effects on the user’s cognitive overload while using these interfaces.