Stereo based visual odometry

Description

The exponential rise in unmanned aerial vehicles has created a need for accurate pose estimation under extreme conditions. Visual Odometry (VO) is the estimation of the position and orientation of a vehicle based on analysis of a sequence of images captured from a camera mounted on it. VO offers a cheap and relatively accurate alternative to conventional odometry techniques such as wheel odometry, inertial measurement systems and the global positioning system (GPS). This thesis implements and analyzes the performance of a two-camera VO system called Stereo based visual odometry (SVO) in the presence of adverse factors such as shadows, extremely bright outdoor scenes and wet conditions. To allow the implementation of VO on any generic vehicle, a discussion of porting the VO algorithm to Android handsets is also presented. The SVO is implemented in three steps. In the first step, a dense disparity map for a scene is computed. To achieve this, we use the sum of absolute differences (SAD) technique for stereo matching on rectified and pre-filtered stereo frames. Epipolar geometry is used to simplify the matching problem. The second step involves feature detection and temporal matching. Feature detection is carried out by the Harris corner detector. These features are matched between two consecutive frames using the Lucas-Kanade feature tracker. The 3D coordinates of this matched set of features are computed from the disparity map obtained in the first step, and the two resulting point sets are related to each other by a rotation and a translation. The rotation and translation are computed using least squares minimization with the aid of Singular Value Decomposition. Random Sample Consensus (RANSAC) is used for outlier detection. This constitutes the third step. The accuracy of the algorithm is quantified based on the final position error, which is the difference between the final position computed by the SVO algorithm and the final ground truth position as obtained from the GPS.
The SVO showed an error of around 1% under normal conditions for a path length of 60 m and around 3% in bright conditions for a path length of 130 m. The algorithm suffered in the presence of shadows and vibrations, with errors of around 15% over path lengths of 20 m and 100 m, respectively.
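
As a rough illustration of the first step described above, the following minimal sketch performs sum-of-absolute-differences matching along a single rectified scanline. It is not the thesis implementation: the function names, the window size and the toy pixel values are assumptions made for this example, and a real system would operate on full pre-filtered images. Because the frames are assumed rectified, the epipolar constraint reduces the search to the same row of the other image.

```python
def sad(left_row, right_row, x, d, half):
    """SAD between a window centred at x in the left row and at x - d in the right row."""
    return sum(abs(left_row[x + k] - right_row[x - d + k])
               for k in range(-half, half + 1))


def disparity_row(left_row, right_row, max_disp, half=1):
    """Disparity per pixel for one rectified scanline (None where the window does not fit)."""
    width = len(left_row)
    out = [None] * width
    for x in range(half, width - half):
        best_d, best_cost = None, None
        for d in range(max_disp + 1):
            if x - d - half < 0:
                break  # candidate window would leave the right image
            cost = sad(left_row, right_row, x, d, half)
            if best_cost is None or cost < best_cost:
                best_d, best_cost = d, cost
        out[x] = best_d
    return out


# Toy scanline (hypothetical intensities): the left view is the right view
# shifted by two pixels, so the true disparity is 2 wherever the window fits.
right = [0, 10, 50, 20, 90, 30, 70, 5, 60, 40, 0, 0]
left = [0, 0] + right[:-2]
disparities = disparity_row(left, right, max_disp=3)
```

Restricting the search to one row is what keeps the cost of dense matching manageable; the window half-width trades robustness to noise against blurring of depth edges.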

Contributors: Dhar, Anchit (Author) / Saripalli, Srikanth (Thesis advisor) / Li, Baoxin (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)

Created: 2010

Towards building cyber-human systems for individuals with visual impairment

Description

Many strides have been made in enabling technologies that help individuals with visual impairment live an independent life. The advent of smart devices and the participatory web has especially facilitated new interactions to aid everyday tasks. Current systems, however, tend to be complex and require multiple cumbersome devices that invariably come with steep learning curves. Building new cyber-human systems with simple integrated interfaces, while keeping in mind the specific requirements of the target users, would help alleviate their mundane yet significant daily needs. Navigation is one such significant need that forms an integral part of everyday life and is one of the areas where individuals with visual impairment face the most discomfort. There is little technology available to help travelers navigate new routes. A number of research prototypes have been proposed, but none of them are available to the general population. This may be because they need special equipment that requires expertise to deploy, because trained professionals must calibrate the devices, or because the systems are simply not scalable. Another area that needs assistance is education. Much of the classroom and textbook material is not readily available in alternate formats. A further area that requires attention is information delivery in the age of Web 2.0. Popular websites such as Facebook and Amazon are designed with sighted people as the target audience. While the pared-down mobile editions are easier to navigate with screen readers, there is still a long way to go in making such websites truly accessible.

Contributors: Paladugu, Devi Archana (Author) / Li, Baoxin (Thesis advisor) / Hedgpeth, Terri (Committee member) / Atkinson, Robert (Committee member) / Walker, Erin (Committee member) / Arizona State University (Publisher)

Created: 2016

Reconstruction-free inference from compressive measurements

Description

As a promising solution to the problem of acquiring and storing large amounts of image and video data, spatial-multiplexing camera architectures have received a lot of attention in the recent past. Such architectures have the attractive feature of combining the two-step process of acquisition and compression of pixel measurements in a conventional camera into a single step. A popular variant is the single-pixel camera, which obtains measurements of the scene using a pseudo-random measurement matrix. Advances in compressive sensing (CS) theory in the past decade have supplied the tools that, in theory, allow near-perfect reconstruction of an image from these measurements even at sub-Nyquist sampling rates. However, current state-of-the-art reconstruction algorithms suffer from two drawbacks: they are (1) computationally very expensive and (2) incapable of yielding high-fidelity reconstructions at high compression ratios. In computer vision, the final goal is usually to perform an inference task using the acquired images, not to recover the signal. With this motivation, this thesis considers the possibility of inference directly from compressed measurements, thereby obviating the need for expensive reconstruction algorithms. Non-linear features are often used for inference tasks in computer vision. However, it is currently unclear how to extract such features from compressed measurements. Instead, using the theoretical basis provided by the Johnson-Lindenstrauss lemma, discriminative features are derived using smashed correlation filters, and it is shown that reconstruction-free inference is indeed possible at high compression ratios with only a marginal loss in accuracy. As a specific inference problem in computer vision, face recognition is considered, mainly beyond the visible spectrum, such as in the short-wave infrared (SWIR) region, where sensors are expensive.
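
The idea of inferring directly from compressed measurements can be sketched with a toy nearest-template classifier that works entirely in the measurement domain. This is a simplified stand-in, not the smashed correlation filters derived in the thesis: the Gaussian measurement matrix, the three synthetic templates, the noise level and all function names are assumptions made for illustration. The Johnson-Lindenstrauss lemma suggests that distances between signals are approximately preserved by such a random projection, so the nearest template can be sought without reconstructing the signal.

```python
import math
import random


def random_projection(m, n, rng):
    """Return an m x n Gaussian measurement matrix with entries ~ N(0, 1/m)."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(m)]


def measure(phi, x):
    """Compressive measurement y = phi * x (matrix-vector product)."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def classify_compressed(y, compressed_templates):
    """Index of the compressed template closest to y in squared Euclidean distance."""
    dists = [sum((u - v) ** 2 for u, v in zip(y, t)) for t in compressed_templates]
    return min(range(len(dists)), key=dists.__getitem__)


rng = random.Random(0)
n, m = 64, 16  # signal length and number of measurements (4x compression)

# Three hypothetical, well-separated class templates.
templates = [
    [math.sin(2 * math.pi * i / n) for i in range(n)],   # sine
    [1.0 if i < n // 2 else -1.0 for i in range(n)],     # square wave
    [2.0 * i / (n - 1) - 1.0 for i in range(n)],         # ramp
]

phi = random_projection(m, n, rng)
compressed_templates = [measure(phi, t) for t in templates]

# Classify a noisy copy of each template from its m measurements alone.
predictions = []
for t in templates:
    noisy = [v + rng.gauss(0.0, 0.05) for v in t]
    predictions.append(classify_compressed(measure(phi, noisy), compressed_templates))
```

The templates are projected once and stored, so each query costs only one projection and a few distance computations, with no reconstruction step.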

Contributors: Lohit, Suhas Anand (Author) / Turaga, Pavan (Thesis advisor) / Spanias, Andreas (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created: 2015
