Search results
(1 - 6 of 6)
- Title
- ANALYZING THE LINGUISTIC CHARACTERISTICS OF MARIJUANA USE BY INCOME USING SOCIAL MEDIA
- Creator
- Zeinali, Sahand
- Date
- 2018, 2018-05
- Description
-
Marijuana use and legality have been widely discussed topics in recent years. Because marijuana affects health, mood, and behavior, it is important to understand the underlying causes of its use. As marijuana use becomes more prevalent, it is crucial to know the motives behind users' tendency to smoke marijuana. To identify the words and patterns associated with marijuana use before it occurs, we need a real-time method that examines the problem more deeply than user surveys can. In our study, we aim to understand the linguistic characteristics of marijuana users based on their income. Social media data on people's behavior can help reveal the contrast between social classes prior to marijuana use and the underlying causes of that use. In our experiment, we use social media to analyze the patterns and characteristics of marijuana use by income class. We collect data from Twitter and then classify users by income, predicting each user's income from their Twitter activity and the linguistic characteristics of their tweets. Through the experiment, we identify patterns among marijuana users in two income classes and predict, with good accuracy, which class a user belongs to based on their recent Twitter activity.
M.S. in Computer Science, May 2018
- Title
- REVEALING LINGUISTIC BIAS
- Creator
- Karmarkar, Sathyaveer S.
- Date
- 2021
- Description
-
Readers currently encounter bias in articles by writers who favor a particular person or organization over presenting the facts. This study aims to detect and reveal such bias and to present the facts without partiality toward or against any person or organization. The data were gathered by selecting articles found through Google, especially those containing some bias. Bias was assessed by measuring each article's subjectivity and polarity using multiple libraries, such as NLTK. We created a Google Form that randomly showed readers either the biased article or the improved article in which the bias had been revised, and collected their opinions.
- Title
- Towards Assisting Human-Human Conversations
- Creator
- Nanaware, Tejas Suryakant
- Date
- 2021
- Description
-
The idea of this research is to understand open-topic conversations and to find ways of assisting people who have difficulty initiating conversations, helping them overcome social anxiety so that they can talk and have successful conversations. By providing humans with assistive conversational support, we can augment the conversations they carry out. The AdvisorBot can also reduce the time taken to type and convey a message, provided it is context aware and capable of giving good responses. There has been significant research on conversational chatbots for open-domain conversations, some of which are claimed to have passed the Turing Test and to converse with humans without seeming like a bot. However, if these chatbots can converse like humans, can they provide actual assistance in human conversations? This research study observes and improves the advanced open-domain conversational chatbots that are put into practice to provide conversational assistance. For this thesis research, the chatbots were deployed to provide conversational assistance, and a human study was performed to identify and improve ways of tackling social anxiety by connecting strangers for conversations aided by the AdvisorBot. Through the questionnaires that the research subjects completed during their participation, and through linguistic analysis, the quality of the AdvisorBot can be improved so that humans can achieve better conversational skills and convey their message clearly while conversing. The results were further enhanced by using transfer learning techniques to quickly improve the quality of the AdvisorBot.
- Title
- Large Language Model Based Machine Learning Techniques for Fake News Detection
- Creator
- Chen, Pin-Chien
- Date
- 2024
- Description
-
With advanced technology, it is widely recognized that everyone owns one or more personal devices. Consequently, people are becoming content creators on social media or streaming platforms, sharing their personal ideas regardless of their education or level of expertise. Distinguishing fake news is therefore becoming increasingly crucial. However, recent research only compares fake news detection between one or more models across different datasets. In this work, we applied Natural Language Processing (NLP) techniques with the Naïve Bayes and DistilBERT machine learning methods, combining and augmenting four datasets. The results show that the balanced accuracy is higher than the average reported in recent studies. This suggests that our approach holds promise for improving fake news detection in the era of widespread content creation.
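As a reader's aid for the metric named in this abstract: balanced accuracy is commonly defined as the mean of per-class recall, which keeps a majority class from dominating the score. The sketch below is illustrative only. The function name and the labels are made up for the example, and it is not the thesis's evaluation code.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = defaultdict(int)  # correctly predicted items per true class
    total = defaultdict(int)    # number of items per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced labels: 2 "fake" items and 8 "real" items.
y_true = ["fake", "fake"] + ["real"] * 8
y_pred = ["fake", "real"] + ["real"] * 8

# Recall is 0.5 for "fake" and 1.0 for "real", so the mean is 0.75,
# even though plain accuracy on these labels is 0.9.
print(balanced_accuracy(y_true, y_pred))
```

On imbalanced data such as this toy case, the gap between plain accuracy and balanced accuracy shows how much the minority class is being missed.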
- Title
- Evaluating Speech Separation Through Pre-Trained Deep Neural Network Models
- Creator
- Prabhakar, Deeksha
- Date
- 2023
- Description
-
Speaker separation involves separating individual speakers from a mixture of voices or background noise, a task known as the "cocktail party problem," which refers to the ability to focus on a specific sound while filtering out other distractions. In this analysis, we propose extracting features present in the original data and evaluating their impact on the model's ability to separate mixed audio streams. The dataset is prepared so that these feature values can serve as predictor variables for various models, including Logistic Regression, Decision Trees, SVM (with both RBF and linear kernels), XGBoost, and AdaBoost, to identify the most contributing features, that is, the features that lead to better separation. These results are then analyzed to determine which features most affect the separation of the audio streams. Initially, 400 audio streams are selected from the VoxCeleb dataset and combined to form 200 mixed utterances. After the mixes are obtained, the pre-trained SpeechBrain model sepformer-whamr is used. This model separates each input mix and produces two outputs that should be as close as possible to the original streams. A feature list is obtained from the 400 chosen audio streams, and the effect of certain features on the model's ability to distinguish between multiple audio sources in a mixed recording is then assessed. Two analysis methods, permutation feature importance and SHAP values, are used to determine which features have the greatest effect on separation. Our hypothesis is that the features contributing most to good separation are invariant across datasets. To test this hypothesis, we obtain 1,000 audio streams from the Mozilla Common Voice dataset and apply the same experimental methodology. Our results demonstrate that the features we extract from the VoxCeleb dataset are indeed invariant and aid in separating the audio streams of the Mozilla Common Voice dataset.
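As a reader's aid for one of the two analysis methods named in this abstract: permutation feature importance is usually computed by shuffling one feature column at a time and measuring how much a model's score drops. The sketch below is a minimal pure-Python illustration under stated assumptions. The toy model, the random data, and all function names are hypothetical, and the abstract does not say which implementation the thesis used.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other column untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [value] + row[j + 1:]
                      for row, value in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical classifier that looks only at feature 0."""
    return int(row[0] > 0.5)


# Hypothetical data: two random features, labels produced by the toy model.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because the toy model ignores feature 1, shuffling that column leaves the accuracy unchanged and its importance is zero, while shuffling feature 0 produces a clear drop.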