When we think about big data, we notice the word “big”, but when building the infrastructure we should also pay attention to “distributed”. Big data applications do need to process large volumes of information, and that volume grows further as data is replicated to multiple locations for resilience. However, the most important attribute of big data is not its size but its ability to split a large job into many small ones, so that the resources handling a single task are spread across multiple locations and the work is processed in parallel. When large scale is combined with a distributed architecture, we find