Sparse methods in image understanding and computer vision

Description

Image understanding has been playing an increasingly crucial role in vision applications. Sparse models form an important component in image understanding, since the statistics of natural images reveal the presence of sparse structure. Sparse methods lead to parsimonious models, in addition to being efficient for large scale learning. In sparse modeling, data is represented as a sparse linear combination of atoms from a "dictionary" matrix. This dissertation focuses on understanding different aspects of sparse learning, thereby enhancing the use of sparse methods by incorporating tools from machine learning. With the growing need to adapt models for large scale data, it is important to design dictionaries that can model the entire data space and not just the samples considered. By exploiting the relation of dictionary learning to 1-D subspace clustering, a multilevel dictionary learning algorithm is developed, and it is shown to outperform conventional sparse models in compressed recovery, and image denoising. Theoretical aspects of learning such as algorithmic stability and generalization are considered, and ensemble learning is incorporated for effective large scale learning. In addition to building strategies for efficiently implementing 1-D subspace clustering, a discriminative clustering approach is designed to estimate the unknown mixing process in blind source separation. By exploiting the non-linear relation between the image descriptors, and allowing the use of multiple features, sparse methods can be made more effective in recognition problems. The idea of multiple kernel sparse representations is developed, and algorithms for learning dictionaries in the feature space are presented. Using object recognition experiments on standard datasets it is shown that the proposed approaches outperform other sparse coding-based recognition frameworks. 
Furthermore, a segmentation technique based on multiple kernel sparse representations is developed, and successfully applied for automated brain tumor identification. Using sparse codes to define the relation between data samples can lead to a more robust graph embedding for unsupervised clustering. By performing discriminative embedding using sparse coding-based graphs, an algorithm for measuring the glomerular number in kidney MRI images is developed. Finally, approaches to build dictionaries for local sparse coding of image descriptors are presented, and applied to object recognition and image retrieval.
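The sparse model described above, in which a signal is approximated by a few atoms drawn from a dictionary, can be illustrated with a minimal sketch. The greedy matching pursuit below is a generic textbook technique, not the dissertation's multilevel dictionary learning algorithm; the function name, the dictionary, and the signal are hypothetical and chosen only for illustration.

```python
def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding: approximate `signal` using at most `n_nonzero`
    selections from a list of unit-norm `atoms` (illustrative sketch only)."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Correlate every atom with the part of the signal not yet explained.
        scores = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        # Pick the atom with the largest absolute correlation.
        best = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        coeffs[best] += scores[best]
        # Remove that atom's contribution from the residual.
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Hypothetical dictionary of three unit-norm atoms in R^3 (assumption).
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
signal = [2.0, 0.0, -0.5]
coeffs, residual = matching_pursuit(signal, atoms, n_nonzero=2)
```

Here `coeffs` holds the sparse code (at most two nonzero entries) and `residual` holds whatever the selected atoms fail to explain. A learned, overcomplete dictionary would replace the toy `atoms` list in practice.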

Contributors: Jayaraman Thiagarajan, Jayaraman (Author) / Spanias, Andreas (Thesis advisor) / Frakes, David (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2013

MRI visualization and mathematical modeling of local drug delivery

Description

Controlled release formulations for local, in vivo drug delivery are of growing interest to device manufacturers, research scientists, and clinicians; however, most research characterizing controlled release formulations occurs in vitro because the spatial and temporal distribution of drug delivery is difficult to measure in vivo. In this work, in vivo magnetic resonance imaging (MRI) of local drug delivery is performed to visualize and quantify the time-resolved distribution of MRI contrast agents. I find it is possible to visualize contrast agent distributions in near real time from local delivery vehicles using MRI. Three-dimensional T1 maps are processed to produce in vivo concentration maps of contrast agent for individual animal models. The method for obtaining concentration maps is analyzed to estimate errors introduced at various steps in the process. The method is used to evaluate different controlled release vehicles, vehicle placement, and type of surgical wound in rabbits as a model for antimicrobial delivery to orthopaedic infection sites. I am able to see differences between all these factors; however, all images show that contrast agent remains fairly local to the wound site and does not distribute to tissues far from the implant in therapeutic concentrations. I also produce a mathematical model that investigates important mechanisms in the transport of antimicrobials in a wound environment. It is determined from both the images and the mathematical model that antimicrobial distribution in orthopaedic wounds is dependent on both diffusive and convective mechanisms. Furthermore, I began development of MRI-visible therapeutic agents to examine active drug distributions. I hypothesize that this work can be developed into a non-invasive, patient-specific clinical tool to evaluate the success of interventional procedures using local drug delivery vehicles.
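As a rough illustration of how a T1 map might be turned into a concentration map, the sketch below uses the standard linear relaxivity relation 1/T1 = 1/T1_0 + r1 * C. The abstract does not state which conversion the thesis uses, so this relation, the function name, and all numeric values are assumptions made for the example.

```python
def concentration_from_t1(t1_s, t1_baseline_s, relaxivity_per_mM_s):
    """Contrast agent concentration (mM) from a measured T1 (seconds),
    assuming the linear relaxivity model 1/T1 = 1/T1_0 + r1 * C."""
    return (1.0 / t1_s - 1.0 / t1_baseline_s) / relaxivity_per_mM_s


# Hypothetical values: a 2x2 T1 map in seconds, a baseline tissue T1 of 1.0 s,
# and a relaxivity of 4.0 per (mM * s). None of these come from the thesis.
t1_map = [[0.5, 1.0], [0.8, 0.25]]
concentration_map = [
    [concentration_from_t1(t1, 1.0, 4.0) for t1 in row] for row in t1_map
]
```

A voxel whose T1 equals the baseline maps to zero concentration, and shorter T1 values map to higher concentrations, which is the behavior a concentration map built this way would rely on.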

Contributors: Giers, Morgan (Author) / Caplan, Michael R (Thesis advisor) / Massia, Stephen P (Committee member) / Frakes, David (Committee member) / McLaren, Alex C. (Committee member) / Vernon, Brent L (Committee member) / Arizona State University (Publisher)

Created: 2013

Feature extraction from compressive cameras with application to activity recognition

Description

Recent advances in camera architectures and associated mathematical representations now enable compressive acquisition of images and videos at low data rates. While most computer vision applications of today are composed of conventional cameras, which collect a large amount of redundant data, and power-hungry embedded systems, which compress the collected data for further processing, compressive cameras offer the advantage of direct acquisition of data in the compressed domain and hence readily promise to find applicability in computer vision, particularly in environments hampered by limited communication bandwidths. However, despite the significant progress in theory and methods of compressive sensing, little headway has been made in developing systems for such applications by exploiting the merits of compressive sensing. In such a setting, we consider the problem of activity recognition, which is an important inference problem in many security and surveillance applications. Since all successful activity recognition systems involve detection of a human, followed by recognition, a potential fully functioning system motivated by a compressive camera would involve tracking the human, which requires the reconstruction of at least the initial few frames to detect the human. Once the human is tracked, the recognition part of the system requires only the features to be extracted from the tracked sequences, which can be the reconstructed images or the compressed measurements of such sequences. However, it is desirable in resource-constrained environments that these features be extracted from the compressive measurements without reconstruction. Motivated by this, in this thesis, we propose a framework for understanding activities as a non-linear dynamical system, and propose a robust, generalizable feature that can be extracted directly from the compressed measurements without reconstructing the original video frames.
The proposed feature is termed recurrence texture and is motivated by recurrence analysis of non-linear dynamical systems. We show that it is possible to obtain discriminative features directly from the compressed stream and show their utility in recognition of activities at very low data rates.
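Recurrence analysis, which motivates the feature above, starts from a recurrence matrix marking which pairs of time steps have nearby states. The sketch below builds that generic matrix from a short sequence of measurement vectors. It is not the thesis's recurrence texture feature, whose construction the abstract does not specify; the threshold `eps` and the sample vectors are hypothetical.

```python
import math


def recurrence_matrix(states, eps):
    """Binary recurrence matrix: entry (i, j) is 1 when the states at
    time steps i and j lie within Euclidean distance `eps` of each other."""
    n = len(states)
    return [
        [1 if math.dist(states[i], states[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical per-frame measurement vectors (assumption, not real data).
states = [[0.0, 0.0], [1.0, 0.0], [0.1, 0.0], [1.0, 0.1]]
R = recurrence_matrix(states, eps=0.2)
```

In this toy sequence the state alternates between two neighborhoods, so `R` shows a checkerboard pattern. Texture descriptors computed on such a matrix are one plausible route to a feature of the kind the abstract names.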

Contributors: Kulkarni, Kuldeep Sharad (Author) / Turaga, Pavan (Thesis advisor) / Spanias, Andreas (Committee member) / Frakes, David (Committee member) / Arizona State University (Publisher)

Created: 2012

Camera calibration using adaptive segmentation and ellipse fitting for localizing control points

Description

There is growing interest in improved high-accuracy camera calibration methods due to the increasing demand for 3D visual media in commercial markets. Camera calibration is used widely in the fields of computer vision, robotics, and 3D reconstruction. It is the first step in extracting 3D data from a 2D image, and it plays a crucial role because the accuracy of the reconstruction and of 3D coordinate determination relies to a great extent on the accuracy of the camera calibration. This thesis presents a novel camera calibration method using a circular calibration pattern. The disadvantages and issues with existing state-of-the-art methods are discussed and are overcome in this work. The implemented system consists of techniques of local adaptive segmentation, ellipse fitting, projection, and optimization. Simulation results are presented to illustrate the performance of the proposed scheme. These results show that the proposed method reduces the error as compared to the state of the art for high-resolution images, and that the proposed scheme is more robust to blur in the imaged calibration pattern.

Contributors: Prakash, Charan Dudda (Author) / Karam, Lina J (Thesis advisor) / Frakes, David (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)

Created: 2012

Clinically relevant classification and retrieval of diabetic retinopathy images

Description

Diabetic retinopathy (DR) is a common cause of blindness occurring due to prolonged presence of diabetes. The risk of developing DR or having the disease progress is increasing over time. Despite advances in diabetes care over the years, DR remains a vision-threatening complication and one of the leading causes of blindness among American adults. Recent studies have shown that diagnosis based on digital retinal imaging has potential benefits over traditional face-to-face evaluation. Yet there is a dearth of computer-based systems that can match the level of performance achieved by ophthalmologists. This thesis takes a fresh perspective in developing a computer-based system aimed at improving diagnosis of DR images. These images are categorized into three classes according to their severity level. The proposed approach explores effective methods to classify new images and retrieve clinically-relevant images from a database with prior diagnosis information associated with them. Retrieval provides a novel way to utilize the vast knowledge in the archives of previously-diagnosed DR images and thereby improve a clinician's performance while classification can safely reduce the burden on DR screening programs and possibly achieve higher detection accuracy than human experts. To solve the three-class retrieval and classification problem, the approach uses a multi-class multiple-instance medical image retrieval framework that makes use of spectrally tuned color correlogram and steerable Gaussian filter response features. The results show better retrieval and classification performances than prior-art methods and are also observed to be of clinical and visual relevance.

Contributors: Chandakkar, Parag Shridhar (Author) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Frakes, David (Committee member) / Arizona State University (Publisher)

Created: 2012

3D Printing Sensor-Stents

Description

This paper summarizes the [1] ideas behind, [2] need for, [3] development of, and [4] testing of 3D-printed sensor-stents known as Stentzors. The sensor was developed entirely from scratch, tested, and found to have an output of 3.2 × 10⁻⁶ volts per pascal of RMS pressure. This paper also recommends further work to render the Stentzor deployable in live subjects, including [1] further design optimization, [2] electrical isolation, [3] wireless data transmission, and [4] testing for aneurysm prevention.

Contributors: Meidinger, Aaron Michael (Author) / LaBelle, Jeffrey (Thesis director) / Frakes, David (Committee member) / Barrett, The Honors College (Contributor) / Mechanical and Aerospace Engineering Program (Contributor)

Created: 2014-05

PIV ANALYSIS OF BASILAR TIP ANEURYSM HEMODYNAMICS, AND THE EFFECTS OF ENTERPRISE STENT TREATMENT

Description

Intracranial aneurysms, which form in the blood vessels of the brain, are particularly dangerous because of the importance and fragility of the human brain. When an intracranial aneurysm gets large, it poses a significant risk of bursting and causing subarachnoid hemorrhage (SAH), a possibly fatal condition. One possible treatment involves placing a stent in the vessel to act as a flow diverter. In this study we look at the hemodynamics of two geometries of idealized basilar tip aneurysms, at 2, 3, and 4 ml/s pulsatile flow, at three different points in the cardiac cycle. The smaller model had neck and dome diameters of 2.67 mm and 4 mm respectively, while the larger aneurysm had neck and dome diameters of 3 mm and 6 mm respectively. Both diameters and the dome-to-neck ratio increased in the second model, representing growth over time. Flow was analyzed using stereoscopic particle image velocimetry (PIV) for both geometries in untreated models, as well as after treatment with a high-porosity Enterprise stent (Codman and Shurtleff Inc.). Flow in the models was characterized by root mean square velocity in the aneurysm and neck plane, cross-neck flow, max aneurysm vorticity, and total aneurysm kinetic energy. It was found that in the smaller aneurysm model (model 1), Enterprise stent treatment reduced all flow parameters substantially. The smallest reduction was in max vorticity, at 42.48%, and the largest in total kinetic energy, at 75.69%. In the larger model (model 2) there was a 52.18% reduction in cross-neck flow, but a 167.28% increase in aneurysm vorticity. The other three parameters experienced little change. These results, along with observed velocity vector fields, indicate a noticeable diversion of flow away from the aneurysm in the stent-treated model 1. Treatment in model 2 had a small flow diversion effect, but also altered flow in unpredictable ways, in some cases having a detrimental effect on aneurysm hemodynamics.
The results of this study indicate that Enterprise stent treatment is only effective in small, relatively undeveloped aneurysm geometries, and waiting until an aneurysm has grown too large can eliminate this treatment option altogether.
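The geometric and percentage figures quoted above can be checked with a few lines of arithmetic. The sketch below computes the dome-to-neck ratio of each model from the stated diameters and shows how a reported percent change translates into a multiplier on the untreated value. The helper names are hypothetical, and the untreated baseline is normalized to 1.0 because the abstract gives no absolute values.

```python
def dome_to_neck_ratio(dome_mm, neck_mm):
    """Ratio of dome diameter to neck diameter for an aneurysm model."""
    return dome_mm / neck_mm


def apply_percent_change(baseline, percent_change):
    """Value after a signed percent change (negative means a reduction)."""
    return baseline * (1.0 + percent_change / 100.0)


# Diameters as stated in the abstract.
ratio_model_1 = dome_to_neck_ratio(4.0, 2.67)  # roughly 1.5
ratio_model_2 = dome_to_neck_ratio(6.0, 3.0)   # exactly 2.0

# Reported changes for model 2, applied to a normalized untreated value of 1.0.
cross_neck_flow_treated = apply_percent_change(1.0, -52.18)
max_vorticity_treated = apply_percent_change(1.0, 167.28)
```

Read this way, the 167.28% increase means treated vorticity in model 2 was about 2.67 times the untreated value, while cross-neck flow fell to a little under half.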

Contributors: Lindsay, James Bryan (Author) / Frakes, David (Thesis director) / LaBelle, Jeffrey (Committee member) / Nair, Priya (Committee member) / Barrett, The Honors College (Contributor) / School of Humanities, Arts, and Cultural Studies (Contributor)

Created: 2013-05

Image processing using approximate data-path units

Description

In this work, we present approximate adders and multipliers to reduce data-path complexity of specialized hardware for various image processing systems. These approximate circuits have lower area, latency, and power consumption compared to their accurate counterparts and produce fairly accurate results. We build upon the work on approximate adders and multipliers presented in [23] and [24]. First, we show how the choice of algorithm and parallel adder design can be used to implement the 2D Discrete Cosine Transform (DCT) algorithm with good performance but low area. Our implementation of the 2D DCT has comparable PSNR performance with respect to the algorithm presented in [23] with ~35-50% reduction in area. Next, we use the approximate 2x2 multiplier presented in [24] to implement parallel approximate multipliers. We demonstrate that if some of the 2x2 multipliers in the design of the parallel multiplier are accurate, the accuracy of the multiplier improves significantly, especially when two large numbers are multiplied. We choose Gaussian FIR Filter and Fast Fourier Transform (FFT) algorithms to illustrate the efficacy of our proposed approximate multiplier. We show that application of the proposed approximate multiplier improves the PSNR performance of a 32x32 FFT implementation by 4.7 dB compared to the implementation using the approximate multiplier described in [24]. We also implement a state-of-the-art image enlargement algorithm, namely Segment Adaptive Gradient Angle (SAGA) [29], in hardware. The algorithm is mapped to pipelined hardware blocks, and we synthesized the design using 90 nm technology. We show that a 64x64 image can be processed in 496.48 µs when clocked at 100 MHz. The average PSNR performance of our implementation using accurate parallel adders and multipliers is 31.33 dB and that using approximate parallel adders and multipliers is 30.86 dB, when evaluated against the original image.
The PSNR performance of both designs is comparable to the performance of the double precision floating point MATLAB implementation of the algorithm.
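Two of the quantities quoted above lend themselves to a quick check. The sketch below computes PSNR from its usual definition for 8-bit images, 10·log10(255² / MSE), and converts the reported processing time into clock cycles. The function name and pixel values are hypothetical; only the 496.48 µs, 100 MHz, and 64x64 figures come from the abstract.

```python
import math


def psnr_db(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit pixel values standing in for exact and approximate outputs.
reference_pixels = [0, 255, 128, 64]
approx_pixels = [0, 254, 128, 66]
example_psnr = psnr_db(reference_pixels, approx_pixels)

# Timing figures taken from the abstract: 496.48 microseconds at 100 MHz.
cycles = round(496.48e-6 * 100e6)
cycles_per_pixel = cycles / (64 * 64)
```

The timing works out to 49,648 cycles for the whole image, or roughly 12 cycles per pixel, which gives a sense of the pipeline's throughput.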

Contributors: Vasudevan, Madhu (Author) / Chakrabarti, Chaitali (Thesis advisor) / Frakes, David (Committee member) / Gupta, Sandeep (Committee member) / Arizona State University (Publisher)

Created: 2013
