Filtering by
- All Subjects: Machine Learning
- All Subjects: Signal processing--Digital techniques.
- Creators: Li, Baoxin
- Creators: Turaga, Pavan
![156084-Thumbnail Image.png](https://d1rbsgppyrdqq4.cloudfront.net/s3fs-public/styles/width_400/public/2021-08/156084-Thumbnail%20Image.png)
Feature extraction processes can be categorized into three groups. The first group contains processes that are hand-crafted for a specific task. Hand-engineering features requires domain expertise and manual labor, but the resulting feature extraction process is interpretable and explainable. The next group contains latent-feature extraction processes. While the original features lie in a high-dimensional space, the factors relevant to a task often lie on a lower-dimensional manifold. Latent-feature extraction employs hidden variables to expose underlying data properties that cannot be measured directly from the input; these methods impose a specific structure, such as sparsity or low rank, on the derived representation through sophisticated optimization techniques. The last category is that of deep features, obtained by passing raw input data with minimal pre-processing through a deep network whose parameters are computed by iteratively minimizing a task-based loss.
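The abstract above names the first two categories without giving concrete instances. As an illustrative sketch only (these examples are not from the dissertation), a color histogram is a typical hand-crafted feature, and a PCA projection is a simple latent-feature extraction that exposes a lower-dimensional structure:

```python
import numpy as np

def color_histogram(image, bins=8):
    """Hand-crafted feature: a normalized per-channel intensity histogram."""
    feats = []
    for c in range(image.shape[-1]):
        h, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        feats.append(h / h.sum())  # normalize each channel to a distribution
    return np.concatenate(feats)

def pca_latent_features(X, k=2):
    """Latent feature: project data onto its top-k principal directions."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data give the principal axes.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3))
hist = color_histogram(img)          # 24-dim hand-crafted descriptor
X = rng.normal(size=(100, 10))
Z = pca_latent_features(X, k=2)      # 2-dim latent representation
print(hist.shape, Z.shape)
```

The histogram is interpretable bin by bin, while the PCA coordinates are hidden variables with no direct physical reading, mirroring the trade-off the abstract describes.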
In this dissertation, I present four pieces of work in which I create and learn suitable data representations. The first task employs hand-crafted features to perform clinically relevant retrieval of diabetic retinopathy images. The second task uses latent features to perform content-adaptive image enhancement. The third task ranks a pair of images based on their aestheticism. The goal of the last task is to capture localized image artifacts in small datasets with patch-level labels. For the latter two tasks, I propose novel deep architectures and show significant improvement over previous state-of-the-art approaches. A suitable combination of feature representations, augmented with an appropriate learning approach, can increase performance for most visual computing tasks.
![158654-Thumbnail Image.png](https://d1rbsgppyrdqq4.cloudfront.net/s3fs-public/styles/width_400/public/2021-09/158654-Thumbnail%20Image.png)
In the context of common naturally occurring image distortions, a metric is proposed to identify the DNN convolutional filters most susceptible to distortion and rank them in order of the highest gain in classification accuracy upon correction. The proposed approach, called DeepCorrect, applies small stacks of convolutional layers with residual connections at the outputs of these ranked filters and trains them to correct the most distortion-affected filter activations, while leaving the rest of the pre-trained filter outputs in the network unchanged. Performance results show that applying DeepCorrect models to common vision tasks significantly improves the robustness of DNNs against distorted images and outperforms alternative approaches.
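The abstract does not spell out the susceptibility metric. A hedged toy sketch of the ranking idea (not DeepCorrect's actual metric or architecture) is to measure, for each filter, how much a downstream read-out's accuracy drops when only that filter's activations are distorted, and rank filters by that drop:

```python
import numpy as np

rng = np.random.default_rng(1)

def accuracy(acts, labels, w):
    """Toy linear read-out accuracy computed from filter activations."""
    preds = (acts @ w > 0).astype(int)
    return (preds == labels).mean()

# Toy setup: 200 samples, 16 "filter" activations, a fixed read-out w.
n, f = 200, 16
acts = rng.normal(size=(n, f))
w = rng.normal(size=f)
labels = (acts @ w > 0).astype(int)   # clean accuracy is 1.0 by construction

def susceptibility(i, noise=3.0):
    """Accuracy lost when only filter i's activations are distorted."""
    distorted = acts.copy()
    distorted[:, i] += noise * rng.normal(size=n)
    return accuracy(acts, labels, w) - accuracy(distorted, labels, w)

drops = np.array([susceptibility(i) for i in range(f)])
ranked = np.argsort(drops)[::-1]      # most susceptible filters first
print(ranked[:4])
```

In DeepCorrect, the correction stacks would then be attached only at the top-ranked filters, leaving the rest of the network frozen.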
In the context of universal adversarial perturbations, and departing from existing defense strategies that work mostly in the image domain, a novel and effective defense that operates only in the DNN feature domain is presented. This approach identifies the pre-trained convolutional features most vulnerable to adversarial perturbations and deploys trainable feature regeneration units that transform these DNN filter activations into resilient features robust to universal perturbations. Regenerating only the top 50% of adversarially susceptible activations in at most six DNN layers, while leaving all remaining DNN activations unchanged, can outperform existing defense strategies across different network architectures and various universal attacks.
Classification is central to many of the problems machine learning is asked to solve today, so it is important to understand one's problem and develop an efficient model for it. One technique that aids model selection, and thus problem solving, is estimation of the Bayes Error Rate. This paper develops and analyzes two methods for estimating the Bayes Error Rate of a given dataset. The first method takes a "global" approach, looking at the data as a whole; the second is more "local," partitioning the data at the outset and then building up to a Bayes Error estimate for the whole. One method is found to provide an accurate estimate of the true Bayes Error Rate when the data are high-dimensional, while the other provides an accurate estimate at large sample sizes. The second conclusion, in particular, has significant ramifications for "big data" problems, where an accurate Bayes Error Rate estimate helps clarify the underlying distribution.
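The abstract does not describe the two estimators themselves. As an illustrative baseline only (not the paper's methods), the classical Cover-Hart result brackets the Bayes error with the leave-one-out 1-nearest-neighbour error: asymptotically, for two classes, half the 1-NN error is a lower bound and the 1-NN error itself is an upper bound:

```python
import numpy as np

def nn_error(X, y):
    """Leave-one-out 1-nearest-neighbour error rate."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)        # exclude each point as its own neighbour
    nearest = d.argmin(axis=1)
    return (y[nearest] != y).mean()

def bayes_error_bounds(X, y):
    """Cover-Hart asymptotic bounds (binary): R_NN/2 <= Bayes error <= R_NN."""
    r = nn_error(X, y)
    return r / 2, r

# Two overlapping Gaussian classes, so the true Bayes error is nonzero.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-1.0, 1.0, size=(300, 2)),
               rng.normal(+1.0, 1.0, size=(300, 2))])
y = np.r_[np.zeros(300), np.ones(300)]
lo, hi = bayes_error_bounds(X, y)
print(lo, hi)
```

Any practical estimator, global or local, is judged by how tightly it pins down the true Bayes error between such bounds as dimension and sample size grow.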
![158896-Thumbnail Image.png](https://d1rbsgppyrdqq4.cloudfront.net/s3fs-public/styles/width_400/public/2021-09/158896-Thumbnail%20Image.png)
For each of these cases, I present a feasible image restoration pipeline to correct for its particular limitations. For the pinhole camera, I present an early pipeline that makes pinhole photography practical by reducing the noise caused by low-light imaging, enhancing exposure levels, and sharpening the blur introduced by the pinhole. For lensless cameras, we explore a neural network architecture that performs joint image reconstruction and point spread function (PSF) estimation to robustly recover images captured with multiple PSFs from different cameras. Using adversarial learning, this approach achieves improved reconstruction results that do not require explicit knowledge of the PSF at test time, and it generalizes better to variations in the camera's PSF. This allows lensless cameras to be used in a wider range of applications requiring multiple cameras, without training a separate model for each new camera. For UDCs, we use a multi-stage approach to correct for low light transmission, blur, and haze: a PyNET deep neural network performs the majority of the restoration, while a traditional optimization approach is fused with it in a learned manner in the second stage to improve high-frequency features. I show that this novel fusion approach is on par with the state of the art.
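The abstract's methods are learned, but they are measured against classical deblurring. As a hedged point of comparison (not the pipeline described above), Wiener deconvolution inverts a *known* PSF in the frequency domain with a noise-regularization constant, which is exactly the test-time knowledge the joint reconstruction/PSF-estimation network avoids needing:

```python
import numpy as np

def wiener_deconvolve(blurred, psf, k=1e-4):
    """Classical Wiener filter: regularized inversion of a known PSF."""
    H = np.fft.fft2(psf, s=blurred.shape)
    G = np.fft.fft2(blurred)
    W = np.conj(H) / (np.abs(H) ** 2 + k)   # regularized inverse filter
    return np.real(np.fft.ifft2(G * W))

# Blur a toy image with a 3x3 box PSF (circular convolution), then restore it.
rng = np.random.default_rng(3)
img = rng.random((64, 64))
psf = np.ones((3, 3)) / 9.0
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(psf, s=img.shape)))
restored = wiener_deconvolve(blurred, psf)

err_blur = np.abs(blurred - img).mean()
err_rest = np.abs(restored - img).mean()
print(err_blur, err_rest)
```

When the PSF is misestimated or varies across cameras, this filter degrades sharply, which is the motivation for estimating the PSF jointly with the reconstruction.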
![158864-Thumbnail Image.png](https://d1rbsgppyrdqq4.cloudfront.net/s3fs-public/styles/width_400/public/2021-09/158864-Thumbnail%20Image.png)
It is observed that the performance of the algorithm with respect to the kernels used is consistent with the theoretical performance of those kernels as presented in a previous work. The theoretical approach has also been automated in this work, and the various implementation challenges have been addressed.
Standardization is sorely lacking in the field of musical machine learning. This thesis project endeavors to contribute to this standardization by training three machine learning models on the same dataset and comparing them using the same metrics. The music-specific metrics utilized provide more relevant information for diagnosing the shortcomings of each model.