A pipelined vector processor and memory architecture for cyclostationary processing
Bernstein, Raymond F.
Loomis, Herschel H., Jr.
This work describes a scalable, high-performance, pipelined vector processor architecture. Special emphasis is placed on performing fast Fourier transforms with mixed-radix butterfly operations. The initial motivation for the architecture was the computation of cyclostationary algorithms; however, the resulting architecture is also capable of general-purpose vector processing. A major factor affecting the performance of the architecture is the memory system design, because the use of pipelining techniques, coupled with vector processing, places a substantial burden on memory system performance. The memory design is based on an interleaved memory philosophy with a buffering technique referred to as split transaction memory (STM). A crucial aspect of the memory design is the memory decoding scheme. A design methodology is described for the specification of permutation matrices that yield near-optimal performance for the memory system. Another important aspect of this work is the development of a software-based simulator that allows an STM to be specified. The simulator, operating at the register transfer level, emulates the processing of an address stream by the STM and records the events for post-processing. The STM simulator was used to evaluate three types of vector processing address patterns: constant stride, constant geometry radix-r butterfly, and digit reversed. A random address pattern was also analyzed in the context of general-purpose computing. The simulations verified the near-optimal performance of the STM.
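The address patterns named in the abstract can be illustrated with a minimal Python sketch. It is not the STM simulator described in the work: it assumes plain low-order interleaving (bank = address mod number of banks) rather than the permutation-matrix decoding scheme, omits the constant geometry butterfly pattern, and uses hypothetical function names throughout. It only shows why a naive decoding scheme can concentrate a constant-stride stream onto a single bank.

```python
from collections import Counter


def constant_stride(start, stride, count):
    """Return `count` addresses beginning at `start`, spaced `stride` apart."""
    return [start + i * stride for i in range(count)]


def digit_reverse(index, radix, num_digits):
    """Reverse the base-`radix` digits of `index`, treated as `num_digits` wide."""
    rev = 0
    for _ in range(num_digits):
        index, digit = divmod(index, radix)  # peel off the lowest digit
        rev = rev * radix + digit            # append it at the low end of the result
    return rev


def digit_reversed(radix, num_digits):
    """Return the digit-reversed address stream for radix**num_digits points."""
    return [digit_reverse(i, radix, num_digits) for i in range(radix ** num_digits)]


def bank_histogram(addresses, num_banks):
    """Count accesses per bank, ASSUMING simple low-order interleaving."""
    return Counter(a % num_banks for a in addresses)


if __name__ == "__main__":
    # Stride equal to the bank count: every access lands on the same bank.
    print(bank_histogram(constant_stride(0, 8, 8), 8))
    # Unit stride: accesses spread evenly over all banks.
    print(bank_histogram(constant_stride(0, 1, 8), 8))
    # Radix-4, two-digit reversal of indices 0..15.
    print(digit_reversed(4, 2))
```

Under this assumed decoding, a stride that is a multiple of the bank count serializes every access on one bank, which is the kind of conflict a better-chosen address-to-bank mapping is meant to avoid.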
Rights: This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.
Showing items related by title, author, creator and subject.
Architectural development and performance analysis of a primary data cache with read miss address prediction capability. Christensen, Kathryn S. (Monterey, California. Naval Postgraduate School, 1998-06). This work is part of an ongoing effort to bridge the cycle time gap between high speed processing units and low speed main memories through the use of memory hierarchies. Cache memory exploits the principle of locality by ...
Çamligüney, Altay (1996-09). Memory subsystem bandwidth and latency are two major problems for modern computer architectures because memory speed should grow linearly with central processing unit (CPU) speed to maintain balanced system performance. ...
Afinidad, Francis B.; Irvine, Cynthia E.; Nguyen, Thuy D.; Levin, Timothy E. (Monterey, California. Naval Postgraduate School, 2005-11); NPS-CS-06-002. Time is often a critical factor for making decisions regarding access to information. To manage and protect critical data in this regard implies that information systems need to enforce temporal security policies. However, ...