A resource constrained loop pipelining technique for perfectly-nested loop structures.
Loading...
Authors
Aakre, Thor Davis
Subjects
Loop pipelining
Inter-iteration initiative interval
Modulo scheduling
Data dependence graph
Unimodular transformation
Module variable expansion
Inter-iteration initiative interval
Modulo scheduling
Data dependence graph
Unimodular transformation
Module variable expansion
Advisors
Zaky, Amr M.
Date of Issue
1993-09
Date
1993-Sep
Publisher
Monterey, California. Naval Postgraduate School
Language
en_US
Abstract
This thesis presents a new technique for loop pipelining of perfectly-nested for-loop structures which is designed to optimize loop execution on VLIW machines. Previously implemented loop pipelining techniques provide limited performance because they explicitly include the constraints imposed by a loop's cyclic dependences in their loop pipelining process. Some loop pipelining techniques have also ignored the realistic constraint of finite resource availability in the creation of final pipelined execution schedules. The new approach presented in this thesis eliminates the problem of cyclic dependences by first applying a linear transformation to the nested loop index space to ensure a cycle-free innermost loop, which is then pipelined using modulo scheduling for a known set of resources. The transformation guarantees that the target machine's available resources are the only limit to the amount of exploitable fine-grained parallelism within the innermost loop. This results in pipelined execution schedules having near-optimal, Inter-Iteration Initiation Intervals (IMu) with the achievable performance being scalable with the addition of resources. Consequently. our loop pipelining method' utilizes more fine-grained parallelism than other loop pipelining techniques which directly incorporate a loop's cyclic dependences in their pipelinlng process. We also explicitly provide a procedure for creating the resultant pipelied execution schedules. In addition, we investigate the negative effect that the transformation has on data locality and the cache miss rate, as well as the use of iteration space tiling to restore data locality and cache miss rate to the levels expected from sequential loop execution.
Type
Thesis
Description
Series/Report No
Department
Department of Computer Science
Organization
Naval Postgraduate School (U.S.)
Identifiers
NPS Report Number
Sponsors
Funder
Format
168 p.
Citation
Distribution Statement
Approved for public release; distribution is unlimited.
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.
