Breast cancer is one of the most common cancers worldwide, and early detection and diagnosis are crucial for improving the chances of successful treatment and survival. In this thesis, a range of machine learning algorithms were evaluated and compared for predicting breast cancer malignancy from diagnostic features extracted from digitized images of fine-needle aspirates of breast tissue. Breast cancer diagnosis typically involves a combination of mammography, ultrasound, and biopsy, but machine learning algorithms can assist by analyzing large amounts of data and identifying patterns that may not be discernible to the human eye. By using these algorithms, healthcare professionals can potentially detect breast cancer at an earlier stage, leading to more effective treatment and better patient outcomes. The results showed that the gradient boosting classifier performed best, achieving an accuracy of 96% on the test set, which indicates that this algorithm can be a useful tool for healthcare professionals in the early detection and diagnosis of breast cancer.
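As a point of reference, the gradient boosting result is easy to reproduce in spirit with scikit-learn. The sketch below assumes the Wisconsin Diagnostic Breast Cancer dataset (FNA-derived features) bundled with scikit-learn and default hyperparameters; the thesis's exact data split and tuning are not specified here, so the resulting score will differ.

```python
# Minimal sketch: gradient boosting on FNA-derived diagnostic features.
# Assumes the Wisconsin Diagnostic dataset bundled with scikit-learn;
# the thesis's exact data, features, and hyperparameters are unknown.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

clf = GradientBoostingClassifier(random_state=42)  # default hyperparameters
clf.fit(X_train, y_train)

print(f"test accuracy: {accuracy_score(y_test, clf.predict(X_test)):.3f}")
```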
The aim of this project is to understand the basic algorithmic components of the transformer deep learning architecture. At a high level, a transformer is a machine learning model that dispenses with recurrence in favor of a self-attention mechanism, which weighs the significant parts of sequential input data and is therefore very useful for problems in natural language processing and computer vision. Earlier approaches to these problems (e.g., convolutional and recurrent neural networks) suffer from the vanishing gradient problem when an input becomes too long (the network effectively loses its memory and stops learning) and are slow to train in general. The transformer architecture's features enable a much better "memory" and faster training, making it a better-suited architecture for these problems. Most of this project will be spent producing a survey that captures the current state of research on the transformer, along with the background material needed to understand it. First, I will do a keyword search of the most well-cited and up-to-date peer-reviewed publications on transformers to understand them conceptually. Next, I will investigate the programming frameworks required to implement the architecture, and use them to build a simplified version of the architecture or follow an accessible guide or tutorial. Once the programming aspects of the architecture are understood, I will implement a transformer based on the academic paper "Attention Is All You Need," and then tweak this model using my understanding of the architecture to improve its performance. Once finished, the details of the implementation (successes, failures, process, and inner workings) will be evaluated and reported, along with the fundamental concepts surveyed. The motivation behind this project is to explore the rapidly growing area of AI algorithms; the transformer in particular was chosen because it is a major milestone for engineering with AI and software. Since their introduction, transformers have provided a very effective way of solving natural language processing problems, allowing related applications to run at high speed while maintaining accuracy. Models of this type have since been applied to more cutting-edge natural language processing applications, such as extracting semantic information from a text description and generating an image that satisfies it.
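To make the self-attention mechanism concrete, here is a minimal single-head scaled dot-product attention sketch in NumPy. The projection sizes and random inputs are illustrative only; a full transformer adds multi-head attention, positional encodings, and feed-forward layers.

```python
# Minimal sketch of scaled dot-product self-attention, the core operation
# of the transformer ("Attention Is All You Need"). Single-head, no mask;
# shapes and random data are illustrative only.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                         # 5 tokens, d_model=16
Wq, Wk, Wv = (rng.normal(size=(16, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)           # (5, 8)
```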
We first consider sensor fusion, a typical multimodal fusion problem critical to building a pervasive computing platform. A systematic fusion technique is described that supports both multiple sensors and multiple descriptors for activity recognition. Multiple Kernel Learning (MKL) algorithms, which learn the optimal combination of kernels, have been successfully applied to numerous fusion problems in computer vision and related fields. Building on the MKL formulation, we next describe an auto-context algorithm that learns image context through fusion with low-level descriptors. Furthermore, a principled fusion algorithm that uses deep learning to optimize kernel machines is developed. By bridging deep architectures with kernel optimization, this approach leverages the benefits of both paradigms and applies to a wide variety of fusion problems.
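As a simplified stand-in for MKL (not the fusion algorithms developed here), the sketch below grid-searches the convex combination of two base kernels that maximizes cross-validated SVM accuracy; real MKL solvers optimize the kernel weights jointly with the SVM objective. The synthetic data and kernel choices are assumptions for illustration.

```python
# Simplified stand-in for Multiple Kernel Learning: pick the convex
# combination of two base kernels (one per "modality") that maximizes
# cross-validated SVM accuracy. True MKL learns the weights jointly
# with the classifier; this only illustrates kernel-level fusion.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import linear_kernel, rbf_kernel
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
K1, K2 = linear_kernel(X), rbf_kernel(X)      # two base kernels to fuse

best = max(
    (cross_val_score(SVC(kernel="precomputed"),
                     w * K1 + (1 - w) * K2, y, cv=5).mean(), w)
    for w in np.linspace(0, 1, 11))
print(f"best linear-kernel weight: {best[1]:.1f}, CV accuracy: {best[0]:.3f}")
```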
In many real-world applications, the modalities exhibit highly specific data structures, such as time sequences and graphs, so the learning architecture requires special design. To improve temporal modeling for multivariate sequences, we developed two architectures centered on attention models. A novel clinical time series analysis model is proposed for several critical problems in healthcare. A second model, which couples attention with a triplet ranking loss in a metric learning framework, is described for speaker diarization. Compared to state-of-the-art recurrent networks, these attention-based multivariate analysis tools achieve improved performance at lower computational complexity. Finally, to perform community detection on multilayer graphs, a fusion algorithm is described that derives node embeddings from word embedding techniques and exploits the complementary relational information contained in each layer of the graph.
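For concreteness, a minimal sketch of the triplet ranking loss used in such metric learning frameworks is shown below (e.g., pulling same-speaker segment embeddings together for diarization). The embeddings, dimensions, and margin are illustrative, not those of the model described above.

```python
# Minimal sketch of the triplet ranking loss for metric learning:
# anchor and positive (same speaker) should end up closer than anchor
# and negative (different speaker), by at least the margin.
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the gap between positive and negative distances."""
    d_pos = np.linalg.norm(anchor - positive)   # same-speaker distance
    d_neg = np.linalg.norm(anchor - negative)   # different-speaker distance
    return max(0.0, d_pos - d_neg + margin)

rng = np.random.default_rng(1)
a = rng.normal(size=32)             # anchor segment embedding
p = a + 0.1 * rng.normal(size=32)   # same speaker: near the anchor
n = rng.normal(size=32)             # different speaker: unrelated
print(triplet_loss(a, p, n))        # ~0 once the embedding space is good
```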
Recovery of compressively sensed (CS) video frames is far more complex than recovery of still images, which are known to be (approximately) sparse in a linear basis such as the discrete cosine transform. By combining the sparsity of individual frames with an optical flow-based model of inter-frame dependence, both the perceptual quality and the peak signal-to-noise ratio (PSNR) of reconstructed frames are improved. The efficacy of this approach is demonstrated for the cases of a priori known image motion and unknown but constant image-wide motion.
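As an illustration of how such a combination might be posed (the notation is mine, not necessarily the thesis's), a joint recovery objective can couple per-frame sparsity with an optical-flow warp between consecutive frames:

```latex
% Illustrative joint-recovery objective; notation assumed, not taken
% from the thesis. y_t = \Phi x_t are the CS measurements of frame x_t,
% \Psi is a sparsifying basis such as the DCT, and W_t warps frame
% x_{t-1} onto frame t according to the optical-flow motion model.
\min_{\{x_t\}} \sum_t \left(
      \tfrac{1}{2} \lVert y_t - \Phi x_t \rVert_2^2
    + \lambda \lVert \Psi^{\top} x_t \rVert_1
    + \mu \lVert x_t - W_t x_{t-1} \rVert_2^2 \right)
```

The first term enforces fidelity to the measurements, the second per-frame sparsity, and the third consistency with the motion model; the weights λ and μ trade the three off.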
Although video sequences can be reconstructed from CS measurements, the process is computationally costly. In autonomous systems, this reconstruction step is unnecessary if higher-level conclusions can be drawn directly from the CS data. A tracking algorithm is described and evaluated that can maintain track on target vehicles at compression levels so high that reconstruction of video frames fails. The algorithm performs tracking by detection using a particle filter, with the likelihood given by a maximum average correlation height (MACH) target template model.
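A minimal particle-filter skeleton for tracking by detection is sketched below. The Gaussian motion model and the stand-in likelihood() are placeholders for illustration; the algorithm described above instead evaluates a MACH correlation template response on the CS measurements.

```python
# Minimal particle-filter skeleton for tracking by detection.
# likelihood() is a Gaussian stand-in for the MACH template response;
# the true target location and all noise scales are hypothetical.
import numpy as np

rng = np.random.default_rng(2)

def likelihood(positions):
    """Stand-in for the MACH response at each candidate position."""
    target = np.array([50.0, 30.0])             # hypothetical true location
    d2 = ((positions - target) ** 2).sum(axis=1)
    return np.exp(-d2 / (2 * 5.0 ** 2))

n = 500
particles = rng.uniform(0, 100, size=(n, 2))    # candidate (x, y) positions
weights = np.full(n, 1.0 / n)

for _ in range(10):                             # one iteration per frame
    particles += rng.normal(scale=2.0, size=particles.shape)  # motion model
    weights *= likelihood(particles)            # measurement update
    weights /= weights.sum()
    idx = rng.choice(n, size=n, p=weights)      # multinomial resampling
    particles, weights = particles[idx], np.full(n, 1.0 / n)

print("estimated target position:", particles.mean(axis=0))
```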
Motivated by the possibility of improving on the MACH filter-based likelihood estimation of the tracking algorithm, the application of deep learning models to the detection and classification of compressively sensed images is explored. In tests, a Deep Boltzmann Machine trained directly on CS measurements outperforms a naive reconstruct-first approach.
Taken together, progress in these three areas of CS inference has the potential to lower system cost and improve performance, opening up new applications of CS video cameras.