Description
Image Understanding is a long-established discipline in computer vision that encompasses a body of advanced image processing techniques used to locate (“where”), characterize, and recognize (“what”) objects, regions, and their attributes in an image. However, the notion of “understanding” (and the goal of artificially intelligent machines) goes beyond factual recall of the recognized components and includes reasoning and thinking beyond what can be seen (or perceived). Understanding is often evaluated by asking questions of increasing difficulty. Thus, the expected functionalities of an intelligent Image Understanding system can be expressed in terms of the functionalities required to answer questions about an image. Answering questions about images requires primarily three components: image understanding, question (natural language) understanding, and reasoning based on knowledge. Any question that asks beyond what can be directly seen requires modeling of commonsense (or background/ontological/factual) knowledge and reasoning.

Knowledge and reasoning have seen scarce use in image understanding applications. In this thesis, we demonstrate the utility of incorporating background knowledge and using explicit reasoning in image understanding applications. We first present a comprehensive survey of previous work that utilized background knowledge and reasoning in understanding images; this survey outlines the limited use of commonsense knowledge in high-level applications. We then present a set of vision- and reasoning-based methods for several applications and show that these approaches benefit, in both accuracy and interpretability, from the explicit use of knowledge and reasoning. We propose novel knowledge representations of images, knowledge acquisition methods, and a new implementation of an efficient probabilistic logical reasoning engine that can utilize publicly available commonsense knowledge to solve applications such as visual question answering and image puzzles. Additionally, we identify the need for new datasets that explicitly require external commonsense knowledge to solve. We propose the new task of Image Riddles, which requires a combination of vision and reasoning based on ontological knowledge, and we collect a sufficiently large dataset to serve as an ideal testbed for vision and reasoning research. Lastly, we propose end-to-end deep architectures that combine vision, knowledge, and reasoning modules and achieve large performance boosts over state-of-the-art methods.
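The combination of vision outputs with commonsense knowledge described above can be illustrated with a toy sketch. This is not the dissertation's reasoning engine; the detections, relatedness weights, and candidate answers are all invented for illustration, and the scoring rule is a generic soft conjunction of evidence.

```python
# Toy sketch (illustrative only): score candidate answers by combining
# detector confidences with commonsense relatedness weights, in the
# spirit of vision-plus-knowledge reasoning. All values are made up.

detections = {"dog": 0.9, "frisbee": 0.7}   # hypothetical vision-module outputs
relatedness = {                              # hypothetical commonsense knowledge
    ("dog", "play"): 0.8,
    ("frisbee", "play"): 0.9,
    ("dog", "sleep"): 0.4,
}

def score(candidate):
    """Sum, over detected concepts, of detection confidence times
    knowledge-based relatedness to the candidate answer."""
    return sum(conf * relatedness.get((obj, candidate), 0.0)
               for obj, conf in detections.items())

best = max(["play", "sleep"], key=score)     # "play" wins: 1.35 vs 0.36
```

A real engine would replace the hand-written weights with a knowledge base such as ConceptNet and perform joint probabilistic inference rather than independent scoring, but the core idea of weighting visual evidence by background knowledge is the same.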
Contributors: Aditya, Somak (Author) / Baral, Chitta (Thesis advisor) / Yang, Yezhou (Thesis advisor) / Aloimonos, Yiannis (Committee member) / Lee, Joohyung (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
A growing understanding of the neural code and how to speak it has allowed for notable advancements in neural prosthetics. With commercially available implantable systems with bi-directional neural communication on the horizon, there is an increasing imperative to develop high-resolution interfaces that can survive the implant environment and be well tolerated by the nervous system under chronic use. The sensory encoding aspect optimally interfaces at a scale sufficient to evoke perception but focal in nature, to maximize resolution and evoke more complex and nuanced sensations. Microelectrode arrays can maintain high spatial density, operating on the scale of cortical columns, and can be either penetrating or non-penetrating. The non-penetrating subset sits on the tissue surface without puncturing the parenchyma and is known to engender minimal tissue response and less damage than its penetrating counterpart, improving long-term viability in vivo. Provided non-penetrating microelectrodes can consistently evoke perception and maintain a localized region of activation, they may provide an ideal platform for a high-performing neural prosthesis; this dissertation explores their functional capacity.

The scale at which non-penetrating electrode arrays can interface with cortex is first evaluated in the context of extracting useful information. Articulate movements were decoded from surface microelectrodes, and additional spatial analysis revealed unique signal content despite dense electrode spacing. With a basis for data extraction established, the focus shifts to the information-encoding half of neural interfaces. Finite element modeling was used to compare tissue recruitment under surface stimulation across electrode scales. Results indicated that charge density-based metrics provide a reasonable approximation of the current levels required to evoke a visual sensation, and showed that tissue recruitment increases exponentially with electrode diameter. Micro-scale electrodes (0.1–0.3 mm diameter) could sufficiently activate layers II/III in a model tuned to striate cortex while maintaining focal radii of activated tissue.
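The charge-based metrics referenced above follow from simple arithmetic: charge per phase is current times pulse width, and charge density divides that by the electrode's geometric area. A minimal sketch, with illustrative parameter values not taken from the dissertation, and the widely used Shannon (1992) safety criterion as an example bound:

```python
# Sketch of the standard charge-per-phase and charge-density arithmetic
# for a disc surface electrode. Parameter values are illustrative only.
import math

def charge_per_phase_uC(current_mA, pulse_width_us):
    """Charge per phase, Q = I * t, returned in microcoulombs."""
    return (current_mA * 1e-3) * (pulse_width_us * 1e-6) * 1e6

def charge_density_uC_cm2(charge_uC, diameter_mm):
    """Charge density over the geometric area of a disc electrode."""
    radius_cm = (diameter_mm / 2.0) / 10.0
    area_cm2 = math.pi * radius_cm ** 2
    return charge_uC / area_cm2

def within_shannon_limit(charge_uC, density_uC_cm2, k=1.85):
    """Shannon safety criterion: log10(D) <= k - log10(Q)."""
    return math.log10(density_uC_cm2) + math.log10(charge_uC) <= k

# Example: 0.2 mm diameter electrode, 0.1 mA current, 200 us per phase
q = charge_per_phase_uC(0.1, 200)      # 0.02 uC per phase
d = charge_density_uC_cm2(q, 0.2)      # ~63.7 uC/cm^2
safe = within_shannon_limit(q, d)      # True for these example values
```

This illustrates why tissue recruitment scales so strongly with electrode diameter: for a fixed charge density target, deliverable charge grows with the square of the radius.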

In vivo testing proceeded in a nonhuman primate model. Stimulation consistently evoked visual percepts at safe current thresholds. Tracking perception thresholds across one year showed stable values with minimal fluctuation. Modulating waveform parameters proved useful in reducing the charge required to evoke perception: pulse frequency and phase asymmetry were each used to reduce thresholds, improve charge efficiency, and lower the charge-per-phase and charge-density metrics associated with tissue damage. No impairments to photic perception were observed during the course of the study, suggesting limited tissue damage from array implantation or electrically induced neurotoxicity. The subject consistently identified stimulation on closely spaced electrodes (2 mm center-to-center) as separate percepts, indicating that discrete resolution finer than one visual degree may be feasible with this platform. Although continued testing is necessary, preliminary results support epicortical microelectrode arrays as a stable platform for interfacing with neural tissue and a viable option for bi-directional BCI applications.
Contributors: Oswalt, Denise (Author) / Greger, Bradley (Thesis advisor) / Buneo, Christopher (Committee member) / Helms-Tillery, Stephen (Committee member) / Mirzadeh, Zaman (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)
Created: 2018