RUNTIME RECONFIGURATION

A reconfigurable computing system can have its functionality updated during execution, which reduces resource requirements. A runtime reconfigurable system partitions a design temporally so that the entire design does not need to be resident in the FPGA at any given moment (38, 39). Configuration and execution can be overlapped to hide reconfiguration latency and improve performance. Using this technique, designs that are larger than the physical hardware resources can be realized efficiently.
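The benefit of overlapping configuration with execution can be illustrated with a small timing model. This is only a sketch under stated assumptions: the design is split into temporal partitions, each described by a hypothetical (configuration time, execution time) pair, and the device is assumed to be able to load the next partition while the current one runs. The function names and the numbers are illustrative and do not come from the cited work.

```python
def serial_time(partitions):
    """Total time when each partition is configured, then executed, in turn."""
    return sum(config + execute for config, execute in partitions)


def overlapped_time(partitions):
    """Total time when the next partition is configured during execution.

    Assumption: configuration of partition i+1 starts at the moment
    partition i starts executing, and execution of i+1 starts once both
    its configuration and the execution of partition i have finished.
    """
    if not partitions:
        return 0
    first_config, first_execute = partitions[0]
    exec_start = first_config            # nothing to overlap with at the start
    exec_end = exec_start + first_execute
    for config, execute in partitions[1:]:
        config_end = exec_start + config     # loaded while predecessor runs
        exec_start = max(exec_end, config_end)
        exec_end = exec_start + execute
    return exec_end


# Hypothetical (configuration, execution) times in arbitrary units.
design = [(4, 10), (4, 2), (4, 10)]
print(serial_time(design), overlapped_time(design))
```

In this model the configuration latency is hidden completely whenever a partition runs at least as long as its successor takes to load; a short-running partition only hides part of it.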

Dharma, a time-sharing FPGA architecture that contains a functional block and an interconnect network, has been proposed (40). Both the interconnect and the logic can be time-shared. The authors proposed that the emulated design topology be levelized in a folded pipeline manner; this topology simplifies the architecture and provides predictable interconnect delay (Fig. 7).

Figure 4. Example of a logic emulation system. Arrays of FPGAs and FPICs reside on the emulation modules. The user inputs the emulated design netlist and commands from the workstation. The workstation and control processor personalize the emulation modules, which are used in place of the emulated chip.

Figure 5. SPLASH2 architecture. Each board contains 16 FPGAs, X1 through X16. The blocks M1 through M16 are local memories of the FPGAs. A simplified 36-bit bus crossbar, with no permutation of the bit-lines within each bus, interconnects the 16 FPGAs. Another 36-bit bus connects the FPGAs in daisy-chain fashion. The local memories are dual ported, with one port connecting to the FPGAs and the other port connecting to the external bus.

Single context, partially reconfigurable, and multiple context architectures have been proposed. In a single context system, any change to the functionality of the FPGA involves reloading the entire bitstream; early FPGAs were of this type. This scheme has the disadvantage of long reconfiguration time. Partial reconfiguration, as supported by the Xilinx Virtex FPGAs (10), allows portions of the FPGA to be changed via a memory-mapped scheme while the other portions of the FPGA continue functioning. Compared with a single context scheme, providing this feature incurs an area overhead. Multiple context architectures, such as NEC’s Dynamically Reconfigurable Processor (DRP) (41), allow a number of complete configurations to be stored in the fabric simultaneously, so reconfiguration can be achieved in a small number of cycles. This architecture has the shortest context switch time; however, a larger area overhead is associated with its implementation.
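The difference in reconfiguration time between the three styles can be made concrete with a rough cycle-count model. This is a sketch only: the `Device` fields, the function names, and all numbers are hypothetical, and the model assumes that loading time is proportional to the number of configuration bits written through a fixed-width configuration port.

```python
from dataclasses import dataclass


@dataclass
class Device:
    bitstream_bits: int          # size of a complete configuration
    config_bits_per_cycle: int   # width of the configuration port
    context_switch_cycles: int   # cost of selecting a stored context


def _ceil_div(a, b):
    return -(-a // b)


def single_context_cycles(dev):
    # Any change requires reloading the entire bitstream.
    return _ceil_div(dev.bitstream_bits, dev.config_bits_per_cycle)


def partial_cycles(dev, changed_bits):
    # Only the changed portion is rewritten; the rest keeps running.
    return _ceil_div(changed_bits, dev.config_bits_per_cycle)


def multi_context_cycles(dev):
    # The new configuration is already stored in the fabric.
    return dev.context_switch_cycles


# Hypothetical device: 8 Mbit bitstream, 8-bit port, 1-cycle context switch.
dev = Device(bitstream_bits=8_000_000, config_bits_per_cycle=8,
             context_switch_cycles=1)
print(single_context_cycles(dev))
print(partial_cycles(dev, changed_bits=800_000))
print(multi_context_cycles(dev))
```

The model deliberately leaves out area, which is where the partially reconfigurable and multiple context schemes pay for their shorter switch times.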

The logical unit of reconfiguration can be at a number of levels, including the application, instruction, task, block, or sub-block level. An example of application-level reconfiguration could simply involve loading a runtime-dependent bitstream to support a particular coding standard in a video coding application. The Dynamic Instruction Set Computer (DISC) (42) supported demand-driven modification of the instruction set through partial reconfiguration. The commercial Stretch processor (43) combines a processor with a reconfigurable fabric on which custom instructions are implemented and executed. Furthermore, the fabric can be reconfigured at runtime, and the design environment is software-centric, with the processor programmed in Stretch C.

[Figure 6 diagram labels: Control FPGA; FPGA + 4x 4GB DDR2 DRAM + 4x MGT; 4x IB4X Infiniband.]

Figure 6. BEE2 Compute Module block diagram. Compute modules can be interconnected via the Infiniband IB4X connectors, either directly or via a 10-Gigabit Ethernet switch. The 100Base-T Ethernet can be used for control, monitoring, or data archiving.

Figure 7. Dynamic Architecture for FPGA-based systems. The architecture contains a functional block and an interconnect network. The interconnect and the logic can be time shared. The emulated design topology is levelized in a folded pipeline manner. The levelized topology simplifies the architecture with predictable interconnect delay.

An operating system for guarantee-based scheduling of hard real-time tasks has been proposed (44). Under control of software running on a microprocessor, task circuits can be scheduled online and placed in a suitable free space in a hardware task area. Communications between tasks and I/O are done through a task communication bus, and termination of a task frees the reconfigurable resources used. It was shown that hardware in the hardware task area can be shared by tasks and that the overheads associated with its implementation on a partially reconfigurable platform were acceptably low.
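Online placement of task circuits into free space, and release of that space on termination, can be sketched with a simple one-dimensional model. This is not the scheduler of (44): the `TaskArea` class, the column-based view of the hardware task area, and the first-fit policy are all assumptions chosen for illustration, and real-time guarantees are not modeled.

```python
class TaskArea:
    """Hardware task area modeled as a row of equally sized columns."""

    def __init__(self, columns):
        self.owner = [None] * columns   # None marks a free column

    def place(self, task, width):
        """First-fit: return the start column, or None if no span is free."""
        if width <= 0:
            raise ValueError("width must be positive")
        run = 0
        for i, occupant in enumerate(self.owner):
            run = run + 1 if occupant is None else 0
            if run == width:
                start = i - width + 1
                for j in range(start, i + 1):
                    self.owner[j] = task
                return start
        return None

    def release(self, task):
        """Terminating a task frees every column it occupied."""
        self.owner = [None if o == task else o for o in self.owner]


area = TaskArea(columns=8)
print(area.place("filter", 3))   # placed at column 0
print(area.place("fft", 4))      # placed at column 3
print(area.place("crc", 2))      # no contiguous span of two columns: None
area.release("filter")
print(area.place("crc", 2))      # reuses the freed columns: 0
```

The third request fails even though the area is shared, because only one column is free; once the first task terminates, its columns are reused by the waiting task.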

A pipeline stage is often a convenient block-level unit for reconfiguration. In incremental pipeline reconfiguration (45), an application with S pipeline stages can be implemented in an FPGA with fewer than S physical pipeline stages. This is done by adding one pipeline stage and removing one pipeline stage at each step of the computation. Configuration and execution can be overlapped.
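The add-one, remove-one behavior can be traced with a short simulation. This is a minimal sketch, assuming that one virtual stage is configured per step, that the oldest resident stage is the one removed, and that the virtual stages are revisited cyclically; the function name and parameters are hypothetical and data movement between stages is not modeled.

```python
from collections import deque


def resident_stages(num_virtual, num_physical, steps):
    """Return the virtual stages resident in the fabric after each step."""
    if num_physical < 1 or num_virtual < 1:
        raise ValueError("stage counts must be positive")
    # A bounded deque drops the oldest stage when a new one is configured
    # into a full fabric, mirroring "add one stage, remove one stage".
    fabric = deque(maxlen=num_physical)
    history = []
    for step in range(steps):
        fabric.append(step % num_virtual)
        history.append(list(fabric))
    return history


# S = 5 virtual stages on 3 physical stages.
for step, stages in enumerate(resident_stages(5, 3, 6)):
    print(step, stages)
```

After the fabric fills, each step shows a sliding window of three consecutive virtual stages, which is how an S-stage application fits in fewer than S physical stages.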

Runtime reconfiguration can be done at even lower levels. A crossbar switch that employs runtime reconfiguration of the FPGA’s routing resources has been described (46). This scheme achieved better density, switch update latency, and performance than is possible using conventional means.

Tools have been developed to support runtime reconfiguration. For example, JBits (47) is a set of Java classes that provide an application programming interface to the Xilinx FPGA bitstream. The interface operates on either bitstreams generated by Xilinx design tools or on bitstreams read back from actual hardware and allows the FPGA logic and routing resources to be modified.
