论文部分内容阅读
Computer vision ( CV) is widely expected to be the next big thing in emerging applications. So many heterogeneous architectures for computer vision emerge. However, plenty of data need to be transferred between different structures for heterogeneous architecture. The long data transfer delay becomes the mainly problem to limit the processing speed for computer vision applications. For re-ducing data transfer delay and fasting computer vision applications, a clustered data-driven array processor is proposed. A three-level pipelining processing element is designed which supports two-buffer data flow interface and 8 bits, 16 bits, 32 bits subtext parallel computation. At the same time, for accelerating transcendental function computation, a four-way shared pipelining transcen-dental function accelerator is designed, which is based on Y-intercept adjusted piecewise linear seg-ment algorithm. A distributed shared memory structure based on unified addressing is also em-ployed. To verify efficiency of architecture, some image processing algorithms are implemented on proposed architecture. Simultaneously the proposed architecture has been implemented on Xilinx ZC 706 development board. The same circuitry has been synthesized using SMIC 130 nm CMOS tech-nology. The circuitry is able to run at 100 MHz. Area is 26. 58 mm2 .