Cardinality Estimator: Processing SQL with a Vertical Scanning Convolutional Neural Network

Source: Journal of Computer Science and Technology (English edition)
Although popular database systems perform well at query optimization, they still produce poor query execution plans when join operations across multiple tables are complex. Bad execution plans usually result from bad cardinality estimates. The cardinality estimation models in traditional databases cannot provide high-quality estimates because they do not effectively capture the correlations between multiple tables. Recently, state-of-the-art learning-based cardinality estimators have been reported to work better than traditional empirical methods; they use deep neural networks to model the relationships and correlations between tables. In this paper, we propose a vertical scanning convolutional neural network (abbreviated as VSCNN) that captures the relationships between words in the word vector in order to generate a feature map. The proposed learning-based cardinality estimator converts a Structured Query Language (SQL) query from a sentence into a word vector. We encode table names with one-hot encoding and data samples as bitmaps, separately, and then merge them to obtain enough semantic information from the data samples. In particular, the feature map obtained by VSCNN contains semantic information about the SQL query, including its tables, joins, and predicates. Importantly, to improve the accuracy of cardinality estimation, we propose a negative sampling method that trains the word vector by gradient descent from the base table and compresses it into a bitmap. Extensive experiments show that the q-error of the proposed VSCNN-based model is reduced by at least 14.6% compared with the estimators in traditional databases.
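The encoding steps and the evaluation metric named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the table names, sample rows, and helper functions below are hypothetical, the merge step is assumed to be a plain concatenation, and `q_error` uses the common definition max(estimate/truth, truth/estimate) with both values clamped to at least 1, which the abstract itself does not spell out.

```python
from typing import Callable, Dict, List, Sequence


def one_hot(table: str, vocabulary: Sequence[str]) -> List[int]:
    """One-hot encode a table name against a fixed table vocabulary."""
    return [1 if table == name else 0 for name in vocabulary]


def sample_bitmap(
    sample_rows: Sequence[Dict[str, int]],
    predicate: Callable[[Dict[str, int]], bool],
) -> List[int]:
    """Mark which sampled rows of a base table satisfy a predicate."""
    return [1 if predicate(row) else 0 for row in sample_rows]


def q_error(estimate: float, truth: float) -> float:
    """Common q-error definition (assumed here): the factor by which the
    estimate is off, in either direction. Values are clamped to >= 1
    to avoid division by zero."""
    est = max(estimate, 1.0)
    tru = max(truth, 1.0)
    return max(est / tru, tru / est)


# Hypothetical schema and data sample, for illustration only.
tables = ["title", "movie_info", "cast_info"]
rows = [{"year": 1995}, {"year": 2004}, {"year": 2011}, {"year": 1987}]

table_vec = one_hot("movie_info", tables)
bitmap = sample_bitmap(rows, lambda r: r["year"] > 2000)

# Assumption: "merge" is modelled as simple concatenation.
features = table_vec + bitmap

print(features)
print(q_error(50, 200), q_error(200, 50))
```

Because q-error is symmetric, an estimate that is four times too small is penalized the same as one that is four times too large, and a perfect estimate scores 1.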