Matching Items (212)
- Genre: Academic theses
- Creators: Davulcu, Hasan
- Creators: Jiang, Hanqing
- Member of: Theses and Dissertations
Text search is a very useful way of retrieving document information from a particular website. The public generally uses internet search engines rather than local enterprise search engines, because enterprise content is not cross-linked and does not lend itself to a page-rank algorithm. On the other hand, an enterprise search engine can use metadata, which allows the user to specify conditions that any retrieved document should meet. Searching with metadata is therefore also very useful. My thesis aims to develop an enterprise search engine that uses metadata to provide advanced features such as faceted navigation. The search engine data was extracted from various Indonesian web sources. Metadata such as person, organization, location, and sentiment-analytic keyword entities must be tagged in each document to provide faceted search capability. A shallow parsing technique, a named entity recognizer, is used for this purpose. More than 1500 entities have been tagged in this process. The documents have been converted into XML format and indexed with Apache Solr, an open source enterprise search engine with full-text search and faceted search capabilities. The entities help users specify conditions and search faster through the large collection of documents. A user who clicks on a metadata condition is assured of results. Since the sentiment-analytic keywords are tagged with positive and negative values, social scientists can use these results to check for overlapping or conflicting organizations and ideologies. In addition, this tool is the first of its kind for the Indonesian language. The results are fetched much faster and with better accuracy.
The US Senate is the venue of political debates where federal bills are formed and voted on. Senators show their support for or opposition to bills with their votes. This information makes it possible to extract the polarity of the senators. Similarly, the blogosphere plays an increasingly important role as a forum for public debate. Authors display sentiment toward issues, organizations, or people using natural language.
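As a rough illustration of the faceted navigation described above, the sketch below counts facet values over tagged documents and narrows the result set when a facet is selected. It is plain Python, not Solr; the documents, field names, and values are hypothetical and do not come from the thesis.

```python
from collections import Counter

# Hypothetical tagged documents; field names and values are illustrative only.
DOCS = [
    {"id": 1, "organization": ["OrgA"], "location": ["Jakarta"], "sentiment": ["positive"]},
    {"id": 2, "organization": ["OrgA", "OrgB"], "location": ["Bandung"], "sentiment": ["negative"]},
    {"id": 3, "organization": ["OrgB"], "location": ["Jakarta"], "sentiment": ["negative"]},
]


def facet_counts(docs, field):
    """Count how many documents carry each value of a metadata field."""
    counts = Counter()
    for doc in docs:
        # set() so a value repeated inside one document is counted once
        for value in set(doc.get(field, [])):
            counts[value] += 1
    return counts


def filter_docs(docs, field, value):
    """Keep only the documents whose field contains the selected facet value."""
    return [doc for doc in docs if value in doc.get(field, [])]


# Select one facet value, then recompute the remaining facets on the subset.
JAKARTA_DOCS = filter_docs(DOCS, "location", "Jakarta")
JAKARTA_ORGS = facet_counts(JAKARTA_DOCS, "organization")
```

Because the facet values offered to the user are counted from the documents currently in the result set, every value shown matches at least one document, which is one way to read the claim that clicking a metadata condition is assured to return results.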
In this research, given a mixed set of senators/blogs debating a set of political issues from opposing camps, I use signed bipartite graphs for modeling debates, and I propose an algorithm for partitioning both the opinion holders (senators or blogs) and the issues (bills or topics) comprising the debate into binary opposing camps. Simultaneously, my algorithm places the entities on a univariate scale. Using this scale, a researcher can identify moderate and extreme senators/blogs within each camp, and polarizing versus unifying issues. Through performance evaluations I show that my proposed algorithm provides an effective solution to the problem, and performs much better than existing baseline algorithms adapted to solve this new problem. In my experiments, I used real data from the political blogosphere and US Congress records, as well as synthetic data obtained by varying the polarization and the degree distribution of the vertices of the graph, to show the robustness of my algorithm.
I also applied my algorithm to all terms of the US Senate to date for longitudinal analysis and developed a web-based interactive user interface, www.PartisanScale.com, to visualize the analysis.
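The idea of partitioning and scaling a signed bipartite graph at the same time can be sketched with a generic spectral baseline. The code below is not the algorithm proposed in the thesis; it assumes a vote matrix with entries +1 (support), -1 (opposition), and 0 (no vote), and uses power iteration to approximate the leading singular vector pair. The sign of each score assigns a camp and the magnitude gives a position on a single scale. The matrix `VOTES` is hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def scale_signed_bipartite(votes, iterations=100):
    """Approximate the leading singular vectors of a signed vote matrix.

    Rows are opinion holders, columns are issues. Returns (row_scores,
    column_scores); signs indicate camps, magnitudes indicate extremity.
    """
    n_rows, n_cols = len(votes), len(votes[0])
    # Start from the row with the most recorded votes.
    start = max(votes, key=lambda row: sum(abs(x) for x in row))
    col_scores = normalize([float(x) for x in start])
    row_scores = [0.0] * n_rows
    for _ in range(iterations):
        row_scores = normalize(
            [sum(votes[i][j] * col_scores[j] for j in range(n_cols)) for i in range(n_rows)]
        )
        col_scores = normalize(
            [sum(votes[i][j] * row_scores[i] for i in range(n_rows)) for j in range(n_cols)]
        )
    return row_scores, col_scores


# Hypothetical votes: two camps of two holders each, four issues.
VOTES = [
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [-1, -1, 1, 1],
    [-1, 0, 1, 1],
]
HOLDER_SCORES, ISSUE_SCORES = scale_signed_bipartite(VOTES)
```

In this toy matrix the holders with an abstention (rows 1 and 3) receive scores of smaller magnitude than their camp mates, which loosely mirrors the distinction between moderate and extreme members.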
US politics is most often polarized with respect to the left/right alignment of the entities. However, certain issues do not reflect the polarization due to political parties, but observe a split correlating to the demographics of the senators, or simply receive consensus. I propose a hierarchical clustering algorithm that identifies groups of bills that share the same polarization characteristics. I developed a web based interactive user interface www.ControversyAnalysis.com to visualize the clusters while providing a synopsis through distribution charts, word clouds, and heat maps.
With the advent of the Internet, the amount of data added online is increasing at an enormous rate. Although search engines use information retrieval (IR) techniques to serve users' search requests, the results often do not match the user's query well. The user has to go through several webpages before reaching the one he or she wanted. This problem of information overload can be addressed using automatic text summarization. Summarization is the process of obtaining an abridged version of a document so that the user can quickly understand what the document is about. Email threads from W3C are used in this system. Apart from common IR features such as Term Frequency and Inverse Document Frequency, the system implements Term Rank, a variation of page rank based on a graph model, which can cluster words with respect to word ambiguity. Term Rank also considers the co-occurrence of words within the corpus and evaluates the rank of each word accordingly. Sentences of email threads are ranked according to these features, and summaries are generated. The system implements the concept of pyramid evaluation in content selection. The system can be considered a framework for unsupervised learning in text summarization.
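A minimal sketch of feature-based sentence ranking is given below, using only Term Frequency and Inverse Document Frequency. Term Rank and pyramid evaluation are not reproduced here. The tokenizer, the choice to treat each sentence as a document for IDF, and the length normalization are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_sentences(sentences):
    """Rank sentences by their mean TF-IDF score, highest first."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences in which each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        total = sum(count * math.log(n / doc_freq[term]) for term, count in term_freq.items())
        # Divide by length so long sentences are not favored automatically.
        scores.append(total / len(tokens) if tokens else 0.0)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [(sentences[i], scores[i]) for i in order]


# Hypothetical email sentences.
SENTENCES = [
    "the meeting is on monday",
    "the meeting is on monday",
    "the schema draft needs review",
]
RANKED = rank_sentences(SENTENCES)
```

A summary could then be formed by taking the top few entries of `RANKED`; terms that occur in every sentence contribute nothing to the score because their IDF is zero.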
Hydrogen embrittlement (HE) is a phenomenon that affects both the physical and chemical properties of several intrinsically ductile metals. Consequently, understanding the mechanisms behind HE has been of particular interest in both experimental and modeling research. Discrepancies between experimental observations and modeling results have led to various proposals for HE mechanisms. Therefore, to gain insights into HE mechanisms in iron, this dissertation aims to investigate several key issues involving HE such as: a) the incipient crack tip events; b) the cohesive strength of grain boundaries (GBs); c) the dislocation-GB interactions and d) the dislocation mobility.
The crack tip, which presents a preferential trap site for hydrogen segregation, was examined using atomistic methods and the continuum-based Rice-Thomson criterion, since a sufficient concentration of hydrogen can alter the crack tip deformation mechanism. Results suggest a plausible co-existence of the adsorption-induced dislocation emission and hydrogen-enhanced decohesion mechanisms. In the case of GB-hydrogen interaction, we observed that the segregation of hydrogen along the interface leads to a reduction in cohesive strength, resulting in intergranular failure. A methodology was further developed to quantify the role of the GB structure in this behavior.
GBs play a fundamental role in determining the strengthening mechanisms acting as an impediment to the dislocation motion; however, the presence of an unsurmountable barrier for a dislocation can generate slip localization that could further lead to intergranular crack initiation. It was found that the presence of hydrogen increases the strain energy stored within the GB which could lead to a transition in failure mode. Finally, in the case of body centered cubic metals, understanding the complex screw dislocation motion is critical to the development of an accurate continuum description of the plastic behavior. Further, the presence of hydrogen has been shown to drastically alter the plastic deformation, but the precise role of hydrogen is still unclear. Thus, the role of hydrogen on the dislocation mobility was examined using density functional theory and atomistic simulations. Overall, this dissertation provides a novel atomic-scale understanding of the HE mechanism and development of multiscale tools for future endeavors.
A comprehensive study of impact of growth conditions on structural and magnetic properties of CZTB thin films
Soft magnetic materials have been studied extensively in the recent past due to their applications in micro-transformers, micro-inductors, spin dependent memories etc. The unique features of these materials are the high frequency operability and high magnetic anisotropy. High uniaxial anisotropy is one of the most important properties for these materials. There are many methods to achieve high anisotropy energy (Hk) which include sputtering with presence of magnetic field, exchange bias and oblique angle sputtering.
This research project focuses on analyzing different growth techniques for thin films of Cobalt Zirconium Tantalum Boron (CZTB) and the quality of the resulting films. The measurements include magnetic moment measurements using a Vibrating Sample Magnetometer, electrical measurements using four-point resistivity methods, and structural characterization using Scanning Electron Microscopy. Subtle changes in the growth mechanism result in films with different properties, each best suited to certain applications.
The growth methods presented in this research are oblique-angle sputtering with a localized magnetic field and oblique-angle sputtering without a magnetic field. The uniaxial anisotropy can be controlled by changing the angle during sputtering. The resulting CZTB film is tested for magnetic anisotropy and soft magnetism at room temperature using a Lakeshore 7500 Vibrating Sample Magnetometer. The results are presented, analyzed, and explained using characterization techniques. Future work includes applying a magnetic field during deposition and building magnetic devices from this film with gigahertz-range operating frequencies.
Micro-blogging platforms like Twitter have become some of the most popular sites for people to share and express their views and opinions about public events such as debates, sports events, or news stories. These social updates complement the written news articles or transcripts of events by conveying popular public opinion about them. It would therefore be useful to annotate the transcript with tweets. The technical challenge is to align the tweets with the correct segment of the transcript. ET-LDA by Hu et al. addresses this issue by modeling the whole process with an LDA-based graphical model. The system segments the transcript into coherent and meaningful parts and also determines whether a tweet is a general tweet about the event or refers to a particular segment of the transcript. One characteristic of Hu et al.'s model is that it expects all the data to be available upfront and uses a batch inference procedure. In many cases, however, the data is not available beforehand and arrives as a stream. In such cases it is infeasible to run the batch inference algorithm repeatedly. My thesis presents an online inference algorithm for the ET-LDA model that handles a continuous stream of tweet data, and compares its runtime and performance to existing algorithms.
Browsing Twitter users, or browsers, often find it increasingly cumbersome to attach meaning to tweets that are displayed on their timeline as they follow more and more users or pages. The tweets being browsed are created by Twitter users called originators, and are of some significance to the browser, who has chosen to subscribe to the originator's tweets by following the originator. Although hashtags are used to tag tweets in an effort to attach context to them, many tweets do not have a hashtag. Such tweets are called orphan tweets, and they adversely affect the experience of a browser.
A hashtag is a type of label or meta-data tag used in social networks and micro-blogging services which makes it easier for users to find messages with a specific theme or content. The context of a tweet can be defined as a set of one or more hashtags. Users often do not use hashtags to tag their tweets. This leads to the problem of missing context for tweets. To address the problem of missing hashtags, a statistical method was proposed which predicts most likely hashtags based on the social circle of an originator.
In this thesis, we propose to improve on the existing context recovery system by selectively limiting the candidate set of hashtags to be derived from the intimate circle of the originator rather than from every user in the social network of the originator. This helps in reducing the computation, increasing speed of prediction, scaling the system to originators with large social networks while still preserving most of the accuracy of the predictions. We also propose to not only derive the candidate hashtags from the social network of the originator but also derive the candidate hashtags based on the content of the tweet. We further propose to learn personalized statistical models according to the adoption patterns of different originators. This helps in not only identifying the personalized candidate set of hashtags based on the social circle and content of the tweets but also in customizing the hashtag adoption pattern to the originator of the tweet.
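The effect of restricting candidates to an intimate circle can be sketched as follows. The thesis does not define the intimate circle in this abstract, so the code assumes it is the k users the originator interacts with most often; the data structures, user names, and hashtags are hypothetical, and the ranking is a plain frequency count rather than the personalized statistical model described above.

```python
from collections import Counter


def intimate_circle(interactions, k):
    """Return the k users the originator interacts with most (assumed definition)."""
    return [user for user, _ in Counter(interactions).most_common(k)]


def candidate_hashtags(interactions, hashtags_by_user, k, top_n):
    """Rank hashtags used by the intimate circle by how often they appear."""
    counts = Counter()
    for user in intimate_circle(interactions, k):
        counts.update(hashtags_by_user.get(user, []))
    return [tag for tag, _ in counts.most_common(top_n)]


# Hypothetical interaction counts between the originator and other users.
INTERACTIONS = {"alice": 12, "bob": 7, "carol": 1, "dave": 0}
# Hypothetical recent hashtags per user.
HASHTAGS_BY_USER = {
    "alice": ["#election", "#debate", "#election"],
    "bob": ["#election", "#sports"],
    "carol": ["#cooking", "#cooking", "#cooking", "#cooking"],
}
CANDIDATES = candidate_hashtags(INTERACTIONS, HASHTAGS_BY_USER, k=2, top_n=2)
```

With `k=2`, hashtags from `carol` are never scanned, which illustrates how a smaller circle reduces computation at the cost of possibly missing a relevant tag.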
Environmentally responsive hydrogels are one interesting class of soft materials. Due to their remarkable responsiveness to stimuli such as temperature, pH, or light, they have attracted widespread attention in many fields. However, certain functionality of these materials alone is often limited in comparison to other materials such as silicon; thus, there is a need to integrate soft and hard materials for the advancement of environmentally responsive materials.
Conventional hydrogels lack good mechanical properties and have inherently slow response time, important characteristics which must be improved before the hydrogels can be integrated with silicon. In the present dissertation work, both these important attributes of a temperature responsive hydrogel, poly(N-isopropylacrylamide) (PNIPAAm), were improved by adopting a low temperature polymerization process and adding a silicate compound, tetramethyl orthosilicate. Furthermore, the transition temperature was modulated by adjusting the media quality in which the hydrogels were equilibrated, e.g. by adding a co-solvent (methanol) or an anionic surfactant (sodium dodecyl sulfate). Interestingly, the results revealed that, based on the hydrogels’ porosity, there were appreciable differences when the PNIPAAm hydrogels interacted with the media molecules.
Next, an adhesion mechanism was developed in order to transfer silicon thin film onto the hydrogel surface. This integration provided a means of mechanical buckling of the thin silicon film due to changes in environmental stimuli (e.g., temperature, pH). We also investigated how novel transfer printing techniques could be used to generate patterned deformation of silicon thin film when integrated on a planar hydrogel substrate. Furthermore, we explore multilayer hybrid hydrogel structures formed by the integration of different types of hydrogels that have tunable curvatures under the influence of different stimuli. Silicon thin film integration on such tunable curvature substrates reveals characteristic reversible buckling of the thin film in the presence of multiple stimuli.
Finally, different approaches to incorporating visible light response in PNIPAAm are discussed. Specifically, a chemical chromophore, spirobenzopyran, was synthesized and integrated through chemical cross-linking into the PNIPAAm hydrogels. Further, methods of improving the light response and mechanical properties were also demonstrated. Interestingly, such a system was shown to have potential application as a light-modulated, topography-altering system.
Lighting systems and air-conditioning systems are two of the largest energy-consuming end uses in buildings. Lighting control in smart buildings and homes can be automated with computer-controlled lights and window blinds along with illumination sensors distributed in the building, while temperature control can be automated with computer-controlled air-conditioning systems. However, programming actuators in a large-scale environment for buildings and homes can be time consuming and expensive. This dissertation presents an approach that algorithmically sets up a control system that can automate any building without requiring custom programming. This is achieved by imbuing the system with self-calibrating and self-learning abilities.
For lighting control, the dissertation describes how the problem is non-deterministic polynomial-time hard (NP-hard) but can be addressed with heuristics. The resulting system controls blinds to ensure uniform lighting and also adds artificial illumination to ensure light coverage remains adequate at all times of the day, while adjusting for weather and seasons. In the absence of daylight, the system resorts to artificial lighting.
For temperature control, the dissertation describes how the problem is modeled using convex quadratic programming. The impact of every air conditioner on each sensor at a particular time is learned using a linear regression model. The resulting system controls air-conditioning equipment to maintain user comfort while keeping the cost of energy consumption low. The system can be deployed in large-scale environments. It can accept multiple target setpoints at a time, which improves the flexibility and efficiency of cooling systems requiring temperature control.
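Learning the impact of one air conditioner on one sensor with linear regression can be sketched as a one-variable least-squares fit. This is only an illustration: the dissertation's actual regression features are not given in the abstract, and the readings below are hypothetical. The quadratic programming step is not shown.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical readings: air conditioner output level versus the
# temperature change (in degrees) observed at one sensor.
AC_OUTPUT = [0.0, 1.0, 2.0, 3.0]
TEMP_CHANGE = [0.1, -0.4, -0.9, -1.4]
SLOPE, INTERCEPT = fit_line(AC_OUTPUT, TEMP_CHANGE)


def predict_change(output_level):
    """Predict the sensor's temperature change for a given output level."""
    return SLOPE * output_level + INTERCEPT
```

Repeating the fit for every air conditioner and sensor pair would give a table of learned coefficients, which an optimizer could then use to choose output levels that approach the target setpoints.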
The methods proposed work as generic control algorithms and are not preprogrammed for a particular place or building. The feasibility, adaptivity and scalability features of the system have been validated through various actual and simulated experiments.
Techniques for supporting prediction of security breaches in critical cloud infrastructures using Bayesian network and Markov decision process
Emerging trends in cyber system security breaches in critical cloud infrastructures show that attackers have abundant resources (human and computing power), expertise, and the support of large organizations and possibly foreign governments. To greatly improve the protection of critical cloud infrastructures, human behavior must be incorporated into the prediction of potential security breaches. To achieve such prediction, we envision a probabilistic modeling approach that accurately captures both the system-wide causal relationships among the observed operational behaviors in the critical cloud infrastructure and the probabilistic behaviors of human users on the subsystems, since the subsystems interact directly with humans. In our conceptual approach, the system-wide causal relationships are captured by a Bayesian network, and the probabilistic human behavior in the subsystems is captured by Markov Decision Processes (MDPs). The interactions between the dynamically changing state graphs of the MDPs and the dynamic causal relationships in the Bayesian network are key components in such probabilistic modeling applications. In this thesis, two techniques are presented to support this vision of predicting potential security breaches in critical cloud infrastructures. The first technique evaluates the conformance of the Bayesian network with the multiple MDPs. The second technique evaluates the dynamically changing Bayesian network structure for conformance with the rules of the Bayesian network using a graph checker algorithm. A case study and its simulation are presented to show how the two techniques support the specific parts of our conceptual approach to predicting system-wide security breaches in critical cloud infrastructures.
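One structural rule that any Bayesian network must satisfy is that its graph is directed and acyclic. The sketch below checks that rule with Kahn's topological-sort algorithm. It is an assumption that acyclicity is among the rules the thesis's graph checker enforces; the node names and edges are hypothetical.

```python
from collections import deque


def is_valid_dag(nodes, edges):
    """Return True if the directed graph has no cycle and no unknown nodes."""
    indegree = {node: 0 for node in nodes}
    children = {node: [] for node in nodes}
    for parent, child in edges:
        if parent not in indegree or child not in indegree:
            return False  # edge refers to a node that was never declared
        children[parent].append(child)
        indegree[child] += 1
    # Kahn's algorithm: repeatedly remove nodes that have no remaining parents.
    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Nodes left over can only be part of, or downstream of, a cycle.
    return removed == len(nodes)


# Hypothetical network: two observed behaviors influence a breach variable.
NODES = ["login_anomaly", "data_exfiltration", "breach"]
EDGES = [("login_anomaly", "data_exfiltration"), ("data_exfiltration", "breach")]
VALID_BEFORE = is_valid_dag(NODES, EDGES)
# A structural update that adds a back edge creates a cycle.
VALID_AFTER = is_valid_dag(NODES, EDGES + [("breach", "login_anomaly")])
```

Running such a check after each structural update is one simple way to reject changes that would leave the network in an invalid state.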