Discovering High-Quality Threaded Discussions in Online Forums

Source: Journal of Computer Science & Technology
Archives of threaded discussions generated by users in online forums and discussion boards contain valuable knowledge on various topics. However, not all threads are useful, because deliberate abuses such as trolling and flaming are commonly observed in online conversations. The presence of users with different levels of expertise also makes it difficult to assume that every discussion thread stored online contains high-quality content. Although finding high-quality threads automatically can help both users and search engines sift through huge thread archives and make effective use of these potentially useful resources, to our knowledge no previous work has studied this task. In this paper, we propose an automatic method for distinguishing high-quality threads from low-quality ones in online discussion sites. We first suggest four different artificial measures for inducing the overall quality of a thread from the ratings of its posts. We then propose two tasks involving the prediction of thread quality without using post rating information. We adopt a popular machine learning framework to solve the two prediction tasks. Experimental results on a real-world forum archive demonstrate that our method can significantly improve prediction performance across all four measures of thread quality on both tasks. We also compare how different types of features derived from various aspects of threads contribute to the overall performance, and we investigate key features that play a crucial role in discovering high-quality threads in online discussion sites.
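The abstract does not say how the four thread-quality measures are defined, so the following is only a hypothetical sketch of the general idea of inducing a thread-level score from per-post ratings. The function name `thread_quality_measures`, the `good_threshold` parameter, and the four aggregates shown are illustrative assumptions, not the measures used in the paper.

```python
from statistics import mean


def thread_quality_measures(post_ratings, good_threshold=4):
    """Aggregate per-post ratings into thread-level scores.

    All four aggregates are hypothetical examples; the paper's
    actual measures are not given in the abstract.
    """
    if not post_ratings:
        raise ValueError("a thread needs at least one rated post")
    n_good = sum(r >= good_threshold for r in post_ratings)
    return {
        # Average rating over all posts in the thread.
        "mean_rating": mean(post_ratings),
        # Rating of the single best post.
        "max_rating": max(post_ratings),
        # Rating of the post that opened the thread.
        "first_post_rating": post_ratings[0],
        # Share of posts rated at or above the (assumed) threshold.
        "good_post_fraction": n_good / len(post_ratings),
    }
```

Scores of this kind could serve as training labels, while a classifier that predicts them would see only features that do not depend on the ratings, as the prediction tasks above require.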