Search results
(1 - 2 of 2)
- Title
- ANALYZING THE LINGUISTIC CHARACTERISTICS OF MARIJUANA USE BY INCOME USING SOCIAL MEDIA
- Creator
- Zeinali, Sahand
- Date
- 2018, 2018-05
- Description
- Marijuana use and legality have been widely discussed topics in recent years. Because marijuana has different effects on health, mood, and behavior, it is important to understand the underlying causes of its use. As marijuana use becomes more prevalent, it is crucial to know the motives behind users' tendency to smoke marijuana. To identify the words and patterns associated with marijuana use before that use occurs, we need a real-time method that examines the problem more deeply than surveying users can. In our study, we aim to understand the different linguistic characteristics of marijuana users based on their income. Social media data can be very beneficial for understanding and tracking people's behavior, for understanding the contrast between social classes prior to marijuana use, and for identifying the underlying causes of that use. In our experiment, we use social media to analyze the patterns and characteristics of marijuana use by income class. After collecting data from Twitter, we classify users based on their income, predicting each user's income from their Twitter activity and the linguistic characteristics of their tweets. Through the experiment, we identify patterns among marijuana users in two income classes and predict, with good accuracy, which class a user will be placed in based on their recent Twitter activity.
M.S. in Computer Science, May 2018
- Title
- Towards In-Network Semantic Analysis: A Case Study involving Spam Classification
- Creator
- Gueyraud, Cyprien, Sultana, Nik
- Date
- 2023-03-06
- Description
- Analyzing free-form natural language expressions “in the network” (that is, on programmable switches and smart NICs) would enable packet-handling decisions that are based on the textual content of flows. This analysis would support richer, latency-critical data services that depend on language analysis, such as emergency response, misinformation classification, customer support, and query-answering applications. But packet forwarding and processing decisions usually rely on simple analyses based on table look-ups that are keyed on well-defined (and usually fixed-size) header fields. P4 is the state-of-the-art domain-specific language for programming network equipment, but, to the best of our knowledge, analyzing free-form text using P4 has not yet been investigated. Although an increasing variety of P4-programmable commodity network hardware is available, using P4 presents considerable technical challenges for text analysis, since the language lacks loops and fractional datatypes. This paper presents the first Bayesian spam classifier written in P4 and evaluates it using a standard dataset. The paper contributes techniques for the tokenization, analysis, and classification of free-form text using P4, and investigates trade-offs between classification accuracy and resource usage. It shows how classification accuracy can be tuned between 69.1% and 90.4%, and how resource usage can be reduced to 6% by trading off accuracy. It uses the spam-filtering use case to motivate the need for more research into in-network text analysis to enable future “semantic analysis” applications in programmable networks.