Affiliation: Computer and Information Technology, Beijing Jiaotong University, Beijing, China
Source: The 6th China Computer Federation (CCF) Academic Conference on Big Data
Abstract:
Collaborative filtering (CF) plays a key role in various recommendation systems, but its effectiveness is limited by highly sparse user-image click-through data when CF is deployed for image recommendation applications. Some existing methods apply clustering techniques to mitigate the sparseness issue. However, there is still considerable room to improve recommendation performance, because little is known about taking both click-through data and image visual information into account. In this paper, we propose an Asynchronously Bi-Clustering (ABC) CF approach to improve CF-based image recommendation. Our ABC approach consists of two coupled clustering solutions. Concretely, it first implements image clustering based on image click-through data and visual features, and then conducts user clustering in a low-dimensional subspace spanned by the image clusters. The final recommendation is accomplished based on both user clusters and image clusters by a similarity fusion strategy. An empirical study shows that our ABC approach is beneficial to CF-based image recommendation, and that the proposed scheme is significantly more effective than some existing methods.
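The three stages named in the abstract (image clustering, user clustering in the image-cluster subspace, similarity fusion) can be illustrated with a small sketch. The abstract does not specify the clustering algorithm, the feature combination, or the fusion rule, so everything concrete below is an assumption made for illustration: plain k-means, concatenation of click columns with visual features, a per-cluster click count as the user projection, and a linear fusion weight `lam`. The function names `kmeans` and `abc_scores` are hypothetical and do not come from the paper.

```python
# Illustrative sketch only. k-means, feature concatenation, and the linear
# fusion weight `lam` are assumptions, not the method described in the paper.

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic init (evenly spaced points as centroids)."""
    n = len(points)
    centroids = [list(points[(j * n) // k]) for j in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Recompute each centroid as the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def abc_scores(clicks, visual, k_img, k_usr, lam=0.5):
    """clicks: users x images 0/1 matrix; visual: one feature vector per image."""
    n_users, n_imgs = len(clicks), len(clicks[0])

    # Stage 1: cluster images on click columns concatenated with visual features.
    img_points = [[clicks[u][i] for u in range(n_users)] + list(visual[i])
                  for i in range(n_imgs)]
    img_labels = kmeans(img_points, k_img)

    # Stage 2: describe each user by click counts per image cluster
    # (a k_img-dimensional vector), then cluster users in that space.
    usr_points = [[sum(clicks[u][i] for i in range(n_imgs) if img_labels[i] == c)
                   for c in range(k_img)]
                  for u in range(n_users)]
    usr_labels = kmeans(usr_points, k_usr)

    # Stage 3: fuse user-cluster and image-cluster evidence linearly.
    scores = []
    for u in range(n_users):
        peers = [v for v in range(n_users) if usr_labels[v] == usr_labels[u]]
        row = []
        for i in range(n_imgs):
            # Share of users in u's cluster who clicked image i.
            s_usr = sum(clicks[v][i] for v in peers) / len(peers)
            # Share of images in i's cluster that user u clicked.
            group = [j for j in range(n_imgs) if img_labels[j] == img_labels[i]]
            s_img = sum(clicks[u][j] for j in group) / len(group)
            row.append(lam * s_usr + (1 - lam) * s_img)
        scores.append(row)
    return scores, img_labels, usr_labels


# Toy data: two groups of users, two visually distinct groups of images.
clicks = [[1, 1, 0, 0],
          [1, 0, 0, 0],
          [0, 0, 1, 1],
          [0, 0, 0, 1]]
visual = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
scores, img_labels, usr_labels = abc_scores(clicks, visual, k_img=2, k_usr=2)
```

On this toy data, user 1 has not clicked image 1, yet image 1 scores higher for that user than image 2 does, because both the user's cluster peers and the user's own history point toward the first image cluster. This is how clustering on both sides can compensate for sparse click-through data, under the assumptions stated above.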