Matching Items (7)

Description
Surgery as a profession requires significant training to improve both clinical decision making and psychomotor proficiency. In the medical knowledge domain, tools have been developed, validated, and accepted for evaluating surgeons' competencies. However, assessment of psychomotor skills still relies on the Halstedian model of apprenticeship, wherein surgeons are observed during residency so their skills can be judged. Although the value of this method of skills assessment cannot be ignored, novel methodologies for objective skills assessment need to be designed, developed, and evaluated to augment the traditional approach. Several sensor-based systems have been developed to measure a user's skill quantitatively, but sensors can interfere with skill execution and thus limit the potential for evaluating real-life surgery. A method that judges skills automatically under real-life conditions should nevertheless be the ultimate goal, since only with such capability would a system be widely adopted. This research proposes a novel video-based approach for observing surgeons' hand and surgical tool movements in minimally invasive surgical training exercises as well as during laparoscopic surgery. Because our system does not require surgeons to wear special sensors, it has the distinct advantage over alternatives of offering skills assessment in both learning and real-life environments. The system automatically detects major skill-measuring features from surgical task videos using a series of computer vision algorithms and provides on-screen real-time performance feedback for more efficient skill learning. Finally, a machine-learning approach is used to develop an observer-independent composite scoring model through objective and quantitative measurement of surgical skills. To increase the effectiveness and usability of the developed system, it is integrated with a cloud-based tool that automatically assesses surgical videos uploaded to the cloud.
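As an illustration of the kind of skill-measuring features such a video-based system might extract, the sketch below computes standard kinematic metrics (path length, economy of motion, and motion smoothness) from a tracked tool path. The abstract does not enumerate the exact features used, so these metrics, the function name, and the assumed 30 fps frame rate are illustrative, not the thesis's actual feature set.

```python
import numpy as np

def kinematic_features(track, dt=1.0 / 30):
    """Common kinematic skill metrics from a tracked tool path.

    track: (T, 2) array of per-frame (x, y) tool positions produced by
           the vision-based tracker; dt is the frame interval in seconds.
    """
    steps = np.diff(track, axis=0)                 # per-frame displacement
    path_length = np.linalg.norm(steps, axis=1).sum()
    straight = np.linalg.norm(track[-1] - track[0])
    economy = straight / (path_length + 1e-8)      # 1.0 = perfectly direct
    velocity = steps / dt
    jerk = np.diff(velocity, n=2, axis=0) / dt**2  # third derivative of position
    smoothness = np.linalg.norm(jerk, axis=1).mean()
    return np.array([path_length, economy, smoothness])
```

Feature vectors like this one could then be fed to a regression or classification model to produce the observer-independent composite score described above.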
ContributorsIslam, Gazi (Author) / Li, Baoxin (Thesis advisor) / Liang, Jianming (Thesis advisor) / Dinu, Valentin (Committee member) / Greenes, Robert (Committee member) / Smith, Marshall (Committee member) / Kahol, Kanav (Committee member) / Patel, Vimla L. (Committee member) / Arizona State University (Publisher)
Created2013
Description
Unsupervised learning of time series data, also known as temporal clustering, is a challenging problem in machine learning. This thesis presents a novel algorithm, Deep Temporal Clustering (DTC), that naturally integrates dimensionality reduction and temporal clustering into a single, fully unsupervised, end-to-end learning framework. The algorithm uses an autoencoder for temporal dimensionality reduction and a novel temporal clustering layer for cluster assignment, and jointly optimizes the clustering objective and the dimensionality reduction objective. Depending on the requirements of the application, the temporal clustering layer can be customized with any temporal similarity metric; several similarity metrics and state-of-the-art algorithms are considered and compared. To gain insight into the temporal features that the network has learned for its clustering, a visualization method is applied that generates a region-of-interest heatmap for the time series. The viability of the algorithm is demonstrated using time series data from diverse domains, ranging from earthquakes to spacecraft sensor data. In each case, the proposed algorithm outperforms traditional methods, a result attributed to the fully integrated temporal dimensionality reduction and clustering criterion.
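A minimal sketch of such an architecture, assuming a DEC-style clustering layer with Student's-t soft assignments over pooled latents (one of several similarity metrics one could plug in); all layer shapes and names are illustrative, and the input length is assumed divisible by four.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DTCSketch(nn.Module):
    """Temporal autoencoder plus a DEC-style clustering layer (illustrative)."""
    def __init__(self, n_clusters, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(32, latent_dim, 5, stride=2, padding=2))
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(latent_dim, 32, 5, stride=2,
                               padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(32, 1, 5, stride=2,
                               padding=2, output_padding=1))
        self.centroids = nn.Parameter(torch.randn(n_clusters, latent_dim))

    def forward(self, x):                  # x: (B, 1, T), T divisible by 4
        z = self.encoder(x)                # (B, latent_dim, T/4)
        recon = self.decoder(z)
        z_pooled = z.mean(dim=2)           # temporal pooling -> (B, latent_dim)
        dist2 = torch.cdist(z_pooled, self.centroids).pow(2)
        q = (1.0 + dist2).reciprocal()     # Student's-t kernel
        q = q / q.sum(dim=1, keepdim=True) # soft cluster assignments
        return recon, q

def target_distribution(q):
    """Sharpened targets, as in Deep Embedded Clustering."""
    w = q**2 / q.sum(dim=0)
    return w / w.sum(dim=1, keepdim=True)

# Joint optimization of both objectives (sketch):
# recon, q = model(x); p = target_distribution(q).detach()
# loss = F.mse_loss(recon, x) + F.kl_div(q.log(), p, reduction="batchmean")
```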
ContributorsMadiraju, NaveenSai (Author) / Liang, Jianming (Thesis advisor) / Wang, Yalin (Thesis advisor) / He, Jingrui (Committee member) / Arizona State University (Publisher)
Created2018
Description
The ubiquity of single-camera systems in society has made improving monocular depth estimation a topic of increasing interest in the broader computer vision community. Inspired by recent work in sparse-to-dense depth estimation, this thesis focuses on sparse patterns generated by feature-detection-based algorithms, as opposed to the regular-grid sparse patterns used in previous work. These feature-based sparse patterns are used to generate additional depth information by interpolating regions between clusters of samples that lie in close proximity to one another, and the interpolated sparse depths are used to enforce additional constraints on the network's predictions. In addition to the improved depth prediction observed when the sparse sample information is incorporated into the network, compared to pure RGB-based methods, the experiments show that actively retraining the network on a small number of samples that deviate most from the interpolated sparse depths leads to better depth prediction overall.
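The sketch below shows one plausible form of the interpolation step: densify the feature-based samples by linear interpolation and keep only pixels supported by a nearby sample, approximating "clusters of samples in close proximity". The max_gap threshold, helper names, and use of SciPy are assumptions rather than the thesis's exact procedure.

```python
import numpy as np
from scipy.interpolate import griddata
from scipy.spatial import cKDTree

def interpolated_depth_targets(coords, depths, shape, max_gap=20.0):
    """Densify sparse depth samples and mark which pixels are supported.

    coords: (N, 2) array of (row, col) sample locations (N >= 3)
    depths: (N,) sparse depth values at those locations
    shape:  (H, W) of the image
    """
    H, W = shape
    grid_r, grid_c = np.mgrid[0:H, 0:W]
    dense = griddata(coords, depths, (grid_r, grid_c), method="linear")
    # Distance from every pixel to its nearest real sample; pixels far
    # from all samples are dropped from the interpolated targets.
    pix = np.stack([grid_r.ravel(), grid_c.ravel()], axis=1)
    nearest, _ = cKDTree(coords).query(pix)
    mask = np.isfinite(dense) & (nearest.reshape(shape) <= max_gap)
    return dense, mask

# Training constraint (sketch): an extra L1 term between the network's
# prediction and the interpolated depths on the supported mask,
# added to the usual supervised loss.
```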

This thesis also introduces a new metric, titled Edge, to quantify model performance in regions of an image that show the highest change in ground-truth depth values along either the x-axis or the y-axis. Existing depth estimation metrics such as Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) quantify model performance across the entire image and do not focus on the specific regions that are hard to predict. The proposed Edge metric focuses specifically on these hard-to-classify regions. The experiments also show that adding the Edge metric as a small term alongside existing loss functions such as the L1 loss in current state-of-the-art methods leads to vastly improved performance in these regions, while also improving performance across the board on every other metric.
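A sketch of an Edge-style metric under the description above: restrict the error to pixels with the largest ground-truth depth change along the x- or y-axis. The percentile threshold and the choice of RMSE inside the edge mask are assumptions; the thesis defines its own criterion.

```python
import numpy as np

def edge_metric(pred, gt, pct=95):
    """RMSE restricted to high depth-gradient regions of the ground truth.

    pred, gt: (H, W) predicted and ground-truth depth maps
    pct: percentile above which a pixel counts as an "edge" (assumed)
    """
    grad_y, grad_x = np.gradient(gt)                # depth change along y, x
    change = np.maximum(np.abs(grad_x), np.abs(grad_y))
    mask = change >= np.percentile(change, pct)     # hardest regions only
    return np.sqrt(np.mean((pred[mask] - gt[mask]) ** 2))
```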
ContributorsRai, Anshul (Author) / Yang, Yezhou (Thesis advisor) / Zhang, Wenlong (Committee member) / Liang, Jianming (Committee member) / Arizona State University (Publisher)
Created2019
Description
With improvements in technology, intensive longitudinal studies that permit the investigation of daily and weekly cycles in behavior have increased exponentially over the past few decades. Traditionally, when data have been collected on two variables over time, multivariate time series approaches that remove trends, cycles, and serial dependency have been used. These analyses permit the study of the relationship between random shocks (perturbations) in the presumed causal series and changes in the outcome series, but do not permit the study of the relationships between cycles. Liu and West (2016) proposed a multilevel approach that permits the study of potential between-subject relationships between features of the cycles in two series (e.g., amplitude). However, I show that the application of the Liu and West approach is restricted to a small set of features and types of relationships between the series. Several authors (e.g., Boker & Graham, 1998) proposed a connected mass-spring model that appears to permit modeling of more general cyclic relationships. I show that the undamped connected mass-spring model is also limited and may be unidentified. To test the severity of the restrictions on the motion trajectories producible by the undamped connected mass-spring model, I mathematically derived their connection to the force equations of the undamped connected mass-spring system. The mathematical solution describes the domain of the trajectory pairs producible by the undamped connected mass-spring model. This set of producible trajectory pairs is highly restricted, and the restriction places major limitations on the application of the connected mass-spring model to psychological data. I used a simulation to demonstrate that even if a pair of psychological time-varying variables behaved exactly like two masses in an undamped connected mass-spring system, the connected mass-spring model would not yield adequate parameter estimates. The simulation probed the performance of the connected mass-spring model as a function of several aspects of data quality, including number of subjects, series length, sampling rate relative to the cycle, and measurement error in the data. The findings can be extended to damped and nonlinear connected mass-spring systems.
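For reference, the force equations of a standard undamped connected mass-spring system (notation assumed) are shown below, where x_1 and x_2 are the displacements of the two masses, k_1 and k_2 the outer spring constants, and k_c the coupling spring constant. Every trajectory pair such a model can produce is a superposition of the system's two normal modes, which is the source of the restriction described above.

```latex
\begin{aligned}
m_1 \ddot{x}_1 &= -k_1 x_1 + k_c \,(x_2 - x_1) \\
m_2 \ddot{x}_2 &= -k_2 x_2 - k_c \,(x_2 - x_1)
\end{aligned}
```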
ContributorsMartynova, Elena (M.A.) (Author) / West, Stephen G. (Thesis advisor) / Amazeen, Polemnia (Committee member) / Tein, Jenn-Yun (Committee member) / Arizona State University (Publisher)
Created2019
Description
This work addresses the problem of incorrect screen rotations on handheld devices. Two new methods that improve upon previous work are explored. The first method uses an infrared camera to capture and detect the user's face position and orient the display accordingly. The second method feeds gyroscope and accelerometer data into a machine learning model to classify correct and incorrect rotations. Experiments show that these methods achieve an overall success rate of 67% for the first and 92% for the second, which sets a new high for this performance category. The paper also discusses logistical and legal reasons for implementing this feature in an end-user product from a business perspective. Lastly, the monetary incentive behind a feature like irRotate in a consumer device is discussed and related patents are explored.
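A hedged sketch of the second method: summary statistics over a short window of gyroscope and accelerometer samples feed a binary classifier of wanted versus unwanted rotations. The abstract does not name the model or features, so the random forest, the window statistics, and all identifiers here are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def imu_window_features(gyro, accel):
    """Summary statistics over one window of IMU samples.

    gyro, accel: (T, 3) arrays of angular velocity and acceleration.
    Returns an 18-dimensional feature vector (mean, std, peak per axis).
    """
    feats = []
    for signal in (gyro, accel):
        feats += [signal.mean(axis=0), signal.std(axis=0),
                  np.abs(signal).max(axis=0)]
    return np.concatenate(feats)

# Sketch of training and use:
# X = np.stack([imu_window_features(g, a) for g, a in windows])
# clf = RandomForestClassifier(n_estimators=100).fit(X, y)  # y: 1 = wanted rotation
# At run time, suppress the OS auto-rotate whenever the classifier
# predicts that a rotation is unwanted.
```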
ContributorsTallman, Riley (Author) / Yang, Yezhou (Thesis advisor) / Liang, Jianming (Committee member) / Chen, Yinong (Committee member) / Arizona State University (Publisher)
Created2020
Description
In recent years, a well-designed and well-trained neural network has been able to yield state-of-the-art results across many domains, including data mining, computer vision, and medical image analysis. But progress has been limited for tasks where labels are difficult or impossible to obtain. This reliance on exhaustive labeling is a critical limitation to the rapid deployment of neural networks. Moreover, current research scales poorly to large numbers of unseen concepts and is passively spoon-fed data and supervision.

To overcome these data scarcity and generalization issues, in my dissertation I first propose two unsupervised conventional machine learning algorithms, hyperbolic stochastic coding and multi-resemble multi-target low-rank coding, to address the incomplete-data and missing-label problems. I further introduce a deep multi-domain adaptation network that leverages the power of deep learning by transferring rich knowledge from a large labeled source dataset. I also develop a novel time-sequence dynamically hierarchical network that adaptively simplifies itself to cope with scarce data.

To learn a large number of unseen concepts, lifelong machine learning enjoys many advantages, including abstracting knowledge from prior learning and using that experience to aid future learning, regardless of how much data is currently available. To incorporate this capability and make it versatile, I propose deep multi-task weight consolidation, which accumulates knowledge continuously and significantly reduces data requirements in a variety of domains. Inspired by recent breakthroughs in automatically learning suitable neural network architectures (AutoML), I develop a nonexpansive AutoML framework to train an online model without an abundance of labeled data. This framework automatically expands the network to increase model capability when necessary, then compresses the model to maintain efficiency.
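One plausible reading of "weight consolidation" is an elastic-weight-consolidation-style quadratic penalty that anchors parameters important to earlier tasks. The sketch below reflects that interpretation, not the dissertation's actual formulation; all names and the Fisher-information importance are assumptions.

```python
import torch

def consolidation_penalty(model, anchors, importance, lam=1.0):
    """EWC-style penalty pulling important weights toward old values.

    anchors:    {name: tensor} parameter values saved after a previous task
    importance: {name: tensor} per-parameter importance (e.g., Fisher info)
    """
    penalty = 0.0
    for name, param in model.named_parameters():
        if name in anchors:
            penalty = penalty + (importance[name]
                                 * (param - anchors[name]) ** 2).sum()
    return lam * penalty

# Multi-task training step (sketch):
# loss = task_loss + consolidation_penalty(model, anchors, fisher)
```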

In ongoing work, I propose an alternative method of supervised learning that does not require direct labels. It derives various forms of supervision from an image or object and uses them as target values for the target tasks, which turns out to be surprisingly effective. The proposed method requires only few-shot labeled data to train, learns the remaining information it needs in a self-supervised manner, and generalizes to datasets not seen during training.
ContributorsZhang, Jie (Author) / Wang, Yalin (Thesis advisor) / Liu, Huan (Committee member) / Stonnington, Cynthia (Committee member) / Liang, Jianming (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2020
Description
In recent years, Convolutional Neural Networks (CNNs) have been widely used not only in the computer vision community but also in the medical imaging community. Specifically, the use of CNNs pre-trained on large-scale datasets (e.g., ImageNet) via transfer learning for a variety of medical imaging applications has become the de facto standard within both communities.

However, to fit the current paradigm, 3D imaging tasks have to be reformulated and solved in 2D, losing rich 3D contextual information. Moreover, models pre-trained on natural images have never seen biomedical images and have no knowledge of the anatomical structures present in medical images. To overcome these limitations, this thesis proposes an image out-painting self-supervised proxy task to develop pre-trained models directly from medical images, without requiring systematic annotation. The idea is to randomly mask an image and train the model to predict the missing region. It is demonstrated that by predicting missing anatomical structures when seeing only parts of the image, the model learns a generic representation that yields better performance on various medical imaging applications via transfer learning.
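A minimal sketch of the out-painting proxy task as described: keep a random inner window of each image, mask everything outside it, and train the model to reconstruct the full image. The window-size fractions and mean-squared-error loss are assumptions about details the abstract leaves open.

```python
import torch
import torch.nn.functional as F

def outpaint_batch(images, min_keep=0.25, max_keep=0.5):
    """Zero out everything except a random inner window of each image.

    images: (B, C, H, W) tensor; keep fractions are assumed, not the
    thesis's exact masking schedule.
    """
    B, C, H, W = images.shape
    masked = torch.zeros_like(images)
    for i in range(B):
        fh = torch.empty(1).uniform_(min_keep, max_keep).item()
        fw = torch.empty(1).uniform_(min_keep, max_keep).item()
        h, w = max(1, int(H * fh)), max(1, int(W * fw))
        top = torch.randint(0, H - h + 1, (1,)).item()
        left = torch.randint(0, W - w + 1, (1,)).item()
        window = images[i, :, top:top + h, left:left + w]
        masked[i, :, top:top + h, left:left + w] = window
    return masked

# Self-supervised step (sketch): predict the missing surroundings.
# pred = model(outpaint_batch(x))
# loss = F.mse_loss(pred, x)
```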

Extensive experiments demonstrate that the proposed proxy task outperforms training from scratch in six out of seven medical imaging applications covering 2D and 3D classification and segmentation. Moreover, the image out-painting proxy task offers performance competitive with state-of-the-art models pre-trained on ImageNet and with other self-supervised baselines such as in-painting. Owing to this strong performance, out-painting is utilized as one of the self-supervised proxy tasks providing generic 3D pre-trained models for medical image analysis.
ContributorsSodha, Vatsal Arvindkumar (Author) / Liang, Jianming (Thesis advisor) / Devarakonda, Murthy (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created2020