Classifying latent attributes of social media users has many applications in public health, politics, and marketing. For example, web-based studies of public health require monthly estimates of the health status and demographics of users based on their public communications. Most existing approaches are based on supervised learning, which requires human-annotated labeled data. Such data can be expensive to obtain, and many attributes, such as health status, are hard to annotate at the user level. In this thesis, we investigate classification algorithms that use population-level statistical constraints, such as demographics, names, polls, and social network followers, to predict individual user attributes. For example, the racial makeup of counties, as reported by the U.S. Census, is a source of light supervision for training classification models. These statistics are usually easy to obtain, and a large amount of unlabeled data from social media sites (e.g., Twitter) is available. Learning from Label Proportions (LLP) is a lightly supervised approach in which the training data consist of multiple sets of unlabeled samples, and only the label distribution of each set is known. Because social media users are not a representative sample of the population and the constraints are noisy, existing LLP models (e.g., linear models, label regularization) are insufficient. We develop several new LLP algorithms to deal with this bias, including bag selection and robust classification models. We also propose a scalable model to infer political sentiment from big data with high temporal resolution and to estimate the daily conditional probability of different attributes, as a supplement to polls for social scientists. Because constraints are often unavailable in some domains (e.g., blogs), we propose a self-training algorithm to gradually adapt a classifier trained on social media to a different but similar domain. We also extend our framework to deep learning and provide empirical results for demographic classification using user profile images. Finally, when both text and a profile image are available for a user, we provide a co-training algorithm that iteratively improves the accuracy of both the image and text classifiers, and we apply an ensemble method to achieve the highest precision.
Ph.D. in Computer Science, May 2017
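To make the LLP setting described above concrete, the following is a minimal sketch and not one of the algorithms developed in the thesis. It assumes a binary attribute, synthetic two-dimensional data, and a plain logistic model trained so that the mean predicted probability in each bag matches that bag's known label proportion under a squared-error loss. The function names (`make_bags`, `fit_llp`, `predict`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_bags(proportions, bag_size=200, seed=0):
    """Synthetic bags (assumption): positives ~ N(+2, I), negatives ~ N(-2, I)."""
    rng = np.random.default_rng(seed)
    bags, labels = [], []
    for p in proportions:
        n_pos = int(round(p * bag_size))
        y = np.array([1] * n_pos + [0] * (bag_size - n_pos))
        shift = np.where(y[:, None] == 1, 2.0, -2.0)
        bags.append(rng.normal(size=(bag_size, 2)) + shift)
        labels.append(y)  # kept only for evaluation, never used in training
    return bags, labels


def fit_llp(bags, proportions, lr=0.5, epochs=500):
    """Fit logistic weights using only per-bag positive-class proportions."""
    w = np.zeros(bags[0].shape[1])
    b = 0.0
    for _ in range(epochs):
        grad_w = np.zeros_like(w)
        grad_b = 0.0
        for X, p in zip(bags, proportions):
            probs = sigmoid(X @ w + b)
            q = probs.mean()  # predicted proportion for this bag
            # Gradient of (q - p)^2 with respect to each instance's logit.
            dz = 2.0 * (q - p) * probs * (1.0 - probs) / len(X)
            grad_w += X.T @ dz
            grad_b += dz.sum()
        w -= lr * grad_w / len(bags)
        b -= lr * grad_b / len(bags)
    return w, b


def predict(X, w, b):
    """Instance-level labels from the bag-trained model."""
    return (sigmoid(X @ w + b) >= 0.5).astype(int)


proportions = [0.1, 0.3, 0.7, 0.9]
bags, labels = make_bags(proportions)
w, b = fit_llp(bags, proportions)
accuracy = (predict(np.vstack(bags), w, b) == np.concatenate(labels)).mean()
print(f"instance-level accuracy: {accuracy:.3f}")
```

The individual labels are used only to score the result. Training sees nothing but the unlabeled bags and their proportions, which mirrors the role that population statistics such as census figures play in the abstract. The sketch does not address the sampling bias or constraint noise that motivate the thesis's bag selection and robust models.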