Matching Items (2)

Description
Over 2 billion people use online social network services such as Facebook, Twitter, Google+, LinkedIn, and Pinterest. Users update their status, post photos, share information, and chat with others on these sites every day; however, not everyone shares the same amount of information. This thesis explores methods of linking publicly available data sources as a means of extrapolating information that is missing from Facebook profiles. An application named "Visual Friends Income Map" has been created on Facebook to collect social network data and to explore geodemographic properties that link it to publicly available data, such as US census data. Multiple predictors are implemented to link the data sets and extrapolate the missing Facebook information. The location-based predictor matches Facebook users' locations with census data at the city level for income and demographic predictions. Age- and relationship-based predictors are created to improve the accuracy of the proposed location-based predictor by using social network link information. For the case where a user does not share any location information on their Facebook profile, a kernel density estimation location predictor is created. This predictor uses publicly available telephone record information for all people in the US who share the user's surname to create a likelihood distribution of the user's location. This distribution is combined with the user's IP-level information to narrow the probability estimate down to a local region.
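The surname-based location predictor described above can be pictured with a minimal sketch. It is not the thesis implementation: the coordinates, city list, bounding box, bandwidth, and the function names `gaussian_kde_2d` and `predict_location` are all hypothetical, and plain Euclidean distance in latitude/longitude degrees stands in for a proper geographic distance. The sketch builds a Gaussian kernel density estimate from the locations of records sharing a surname, then optionally restricts the candidate cities to a region implied by IP-level information.

```python
import math

# Hypothetical (lat, lon) locations of telephone records sharing one surname.
SURNAME_POINTS = [
    (33.4, -112.1), (33.5, -112.0), (33.3, -111.9),   # near Phoenix
    (41.9, -87.6), (41.8, -87.7), (42.0, -87.8),      # near Chicago
    (41.7, -87.5), (41.9, -87.9),
]

# Hypothetical candidate cities to score.
CANDIDATES = {
    "Phoenix": (33.45, -112.07),
    "Tucson": (32.22, -110.97),
    "Chicago": (41.88, -87.63),
}

# Hypothetical region implied by the user's IP: (lat_min, lat_max, lon_min, lon_max).
ARIZONA_BOX = (31.0, 37.0, -115.0, -109.0)


def gaussian_kde_2d(points, bandwidth):
    """Return a function that estimates the density at (lat, lon)."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)

    def density(lat, lon):
        total = 0.0
        for p_lat, p_lon in points:
            # Simplification: squared Euclidean distance in degrees.
            d2 = (lat - p_lat) ** 2 + (lon - p_lon) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * total

    return density


def predict_location(points, candidates, ip_region=None, bandwidth=1.0):
    """Return normalized likelihoods over candidate cities.

    If ip_region is given, only candidates inside that box are scored.
    """
    density = gaussian_kde_2d(points, bandwidth)
    if ip_region is not None:
        lat_min, lat_max, lon_min, lon_max = ip_region
        candidates = {
            name: (lat, lon)
            for name, (lat, lon) in candidates.items()
            if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
        }
    scores = {name: density(lat, lon) for name, (lat, lon) in candidates.items()}
    total = sum(scores.values())
    if total == 0.0:
        return {}
    return {name: score / total for name, score in scores.items()}
```

Calling `predict_location(SURNAME_POINTS, CANDIDATES)` ranks cities by surname density alone, while passing `ip_region=ARIZONA_BOX` drops every candidate outside the box before normalizing, which is one simple way to apply a regional constraint.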
Contributors: Mao, Jingxian (Author) / Maciejewski, Ross (Thesis advisor) / Farin, Gerald (Committee member) / Wang, Yalin (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
The process of combining data is one in which information from disjoint datasets sharing at least some common variables is merged. This process is commonly referred to as data fusion, with the main objective of creating a new dataset that permits more flexible analyses than the separate analysis of each individual dataset. Many data fusion methods have been proposed in the literature, although most utilize the frequentist framework. This dissertation investigates a new approach called Bayesian Synthesis, in which information obtained from one dataset acts as the prior for the next analysis. This process continues sequentially until a single posterior distribution is created using all available data. These informative, augmented, data-dependent priors provide an extra source of information that may aid the accuracy of estimation. To examine the performance of the proposed Bayesian Synthesis approach, results from simulated data with known population values were first examined under a variety of conditions. Next, these results were compared to those from the traditional maximum likelihood approach to data fusion, as well as the data fusion approach analyzed via Bayes. Parameter recovery under the proposed Bayesian Synthesis approach was evaluated using four criteria reflecting raw bias, relative bias, accuracy, and efficiency. Subsequently, empirical analyses with real data were conducted. For this purpose, data were fused from five longitudinal studies of mathematics ability that varied in their assessment of ability and in the timing of measurement occasions. Results from the Bayesian Synthesis and data fusion approaches with combined data using Bayesian and maximum likelihood estimation methods were reported. The results illustrate that Bayesian Synthesis with data-driven priors is a highly effective approach, provided that the sample sizes for the fused data are large enough to provide unbiased estimates. Bayesian Synthesis provides another beneficial approach to data fusion that can effectively be used to enhance the validity of conclusions obtained from the merging of data from different studies.
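The sequential idea described above, where the posterior from one dataset becomes the prior for the next, can be illustrated with a toy sketch. This is not the dissertation's model: it assumes a single unknown mean with normally distributed observations and a known noise variance, so that each update has a closed form. The function names, the example datasets, and the prior values are all hypothetical.

```python
def update_normal_mean(prior_mean, prior_var, data, noise_var):
    """Conjugate update for a normal mean with known noise variance.

    Returns the posterior mean and variance after observing `data`.
    """
    n = len(data)
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    post_mean = post_var * (prior_mean / prior_var + sum(data) / noise_var)
    return post_mean, post_var


def sequential_synthesis(datasets, prior_mean, prior_var, noise_var):
    """Analyze datasets one at a time, reusing each posterior as the next prior."""
    mean, var = prior_mean, prior_var
    for data in datasets:
        mean, var = update_normal_mean(mean, var, data, noise_var)
    return mean, var


# Hypothetical measurements of the same quantity from three separate studies.
STUDIES = [
    [4.8, 5.1, 5.3],
    [5.0, 4.9],
    [5.2, 5.4, 5.1, 4.7],
]

# Sequential analysis: study 1 informs the prior for study 2, and so on.
seq_mean, seq_var = sequential_synthesis(
    STUDIES, prior_mean=0.0, prior_var=100.0, noise_var=1.0
)

# Single analysis of the pooled data, starting from the same initial prior.
pooled = [x for study in STUDIES for x in study]
pooled_mean, pooled_var = update_normal_mean(0.0, 100.0, pooled, 1.0)
```

In this simple conjugate setting the sequential and pooled analyses agree up to floating-point error, and the posterior variance shrinks with each added study. More complex models, such as those fitted to longitudinal data, generally do not have closed-form updates and would need the posterior to be summarized or approximated before it is reused as a prior.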
Contributors: Marcoulides, Katerina M (Author) / Grimm, Kevin (Thesis advisor) / Levy, Roy (Thesis advisor) / MacKinnon, David (Committee member) / Suk, Hye Won (Committee member) / Arizona State University (Publisher)
Created: 2017