We collect data for 13,383 first-time code contributions from 45 projects on the website GitHub and analyze behavior of developers before submitting code as well as community response to code contributions. Our findings differ from previous research on open source software communities and social theories of learning in communities of practice. We find most users do not participate in GitHub peripheral activities before submitting code changes. We also find that community response to these submitted code changes is a poor predictor of whether or not the code is accepted. M.S. in Information Architecture, May 2014
We present a new approach to measuring political polarization, including a novel algorithm and open source Python code, which leverages Twitter content to produce measures of polarization for both users and hashtags. #Polar scores provide advantages over existing measures because they (1) can be calculated throughout the legislative cycle, (2) allow for easy differentiation between users with similar scores, (3) are chamber-agnostic, and (4) are a generic approach that can be applied beyond the U.S. Congress. #Polar scores leverage available information such as party labels, word frequency, and hashtags to create an accessible, straightforward algorithm for estimating polarity using text. (from the paper: Hemphill, L., Culotta, A., and Heston, M. (forthcoming) #Polar Scores: Measuring partisanship using social media content. Journal of Information Technology & Politics.) The dataset contains one plain text TSV file with the following information for each of the 55,244 tweets used to develop #Polar scores: tweet_id, created_at, user_id, screen_name, tag, shortid, sex, party, state, chamber, name. The file contains one row per hashtag, and therefore tweets may appear more than once. The Python code for calculating #Polar scores is available here: http://doi.org/10.5281/zenodo.53888
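Because the file holds one row per hashtag, a tweet with several hashtags spans several rows, so counting rows is not the same as counting tweets. The following is a minimal sketch of how the file might be read and regrouped by tweet. It is not the authors' code (that is at the DOI above). The column order is taken from the list in the description; whether the file has a header row is an assumption exposed as the `has_header` flag, the helper names are hypothetical, and the sample rows are invented purely for illustration.

```python
import csv
import io
from collections import defaultdict

# Column order as listed in the dataset description.
COLUMNS = [
    "tweet_id", "created_at", "user_id", "screen_name", "tag", "shortid",
    "sex", "party", "state", "chamber", "name",
]


def read_rows(handle, has_header=False):
    """Yield one dict per TSV row, i.e. one per (tweet, hashtag) pair.

    has_header is an assumption: set it to True if the real file
    starts with a line of column names.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:
        next(reader, None)  # skip the header line
    for fields in reader:
        if fields:  # ignore blank lines
            yield dict(zip(COLUMNS, fields))


def hashtags_by_tweet(rows):
    """Group hashtags under their tweet_id so each tweet appears once."""
    tags = defaultdict(set)
    for row in rows:
        tags[row["tweet_id"]].add(row["tag"])
    return dict(tags)


# Invented sample rows (not taken from the dataset): tweet "1" carries
# two hashtags and therefore occupies two rows.
SAMPLE = "\n".join(
    "\t".join(fields)
    for fields in [
        ["1", "2014-01-01", "10", "example_a", "jobs", "a1",
         "F", "D", "XX", "house", "Example A"],
        ["1", "2014-01-01", "10", "example_a", "economy", "a1",
         "F", "D", "XX", "house", "Example A"],
        ["2", "2014-01-02", "20", "example_b", "jobs", "b2",
         "M", "R", "YY", "senate", "Example B"],
    ]
)

if __name__ == "__main__":
    rows = list(read_rows(io.StringIO(SAMPLE)))
    grouped = hashtags_by_tweet(rows)
    print(len(rows), "rows;", len(grouped), "distinct tweets")
```

To run this against the real file, pass an open file handle instead of the `io.StringIO` sample, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the downloaded TSV was saved.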