
2400-MFLOPS Reconfigurable Parallel VLSI Processor

A new concept was introduced in [14]: a reconfigurable floating-point multiply-adder that reduces the latency of robot control computations. The reconfiguration involves direct hardware connections between the multipliers and the adders. A parallel VLSI processor composed of several processor elements (PEs) was proposed. In each PE, a switching circuit changes the connections between the multipliers and the adders, so that a multiply-adder with any desired number of multipliers can be constructed.

Each PE consists of two multipliers, two adders, a local memory (LM) and a switch circuit (SC) as shown in Figure 9. The inner connection of the SC is changed every clock cycle to reconfigure the multiply-adder. Figure 10 shows an example of a reconfigured multiply-adder that contains four multipliers.

Figure 9: Reconfigurable parallel VLSI processor.  

Figure 10: Reconfiguration of a multi-operand multiply-adder.  
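The reconfiguration idea can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuit from [14]: the class `PE` and the function `reconfigured_multiply_add` are hypothetical names, and the wiring chosen here (one adder sums the two local products, the other folds in a partial sum passed from the neighboring PE) is one plausible way a switch circuit could chain PEs; the source says only that the SC changes the multiplier-adder connections.

```python
from typing import List, Sequence, Tuple


class PE:
    """Toy model of one processor element: two multipliers and two adders."""

    def multiply_add(self, a: Sequence[float], b: Sequence[float],
                     partial_in: float = 0.0) -> float:
        # The two multipliers work on two operand pairs at once.
        p0 = a[0] * b[0]
        p1 = a[1] * b[1]
        # Assumed wiring: the first adder sums the two products, the second
        # adder folds in the partial sum arriving from the neighboring PE.
        return (p0 + p1) + partial_in


def reconfigured_multiply_add(a: Sequence[float],
                              b: Sequence[float]) -> Tuple[float, int]:
    """Chain enough PEs to form a multiply-adder with len(a) multipliers.

    Returns the sum of products and the number of PEs that were chained.
    """
    if len(a) != len(b):
        raise ValueError("operand vectors must have the same length")
    xs: List[float] = list(a)
    ys: List[float] = list(b)
    if len(xs) % 2:
        # Pad with a zero pair so that every PE receives two operand pairs.
        xs.append(0.0)
        ys.append(0.0)

    partial = 0.0
    pes_used = 0
    for i in range(0, len(xs), 2):
        partial = PE().multiply_add(xs[i:i + 2], ys[i:i + 2], partial)
        pes_used += 1
    return partial, pes_used


result, pes = reconfigured_multiply_add([1.0, 2.0, 3.0, 4.0],
                                        [5.0, 6.0, 7.0, 8.0])
print(result, pes)
```

With four operand pairs the model chains two PEs, which matches the four-multiplier configuration described for Figure 10. The loop here runs sequentially, whereas in hardware the chained PEs would operate as one combined circuit.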

The following examples show the speed improvement obtained with this processor. The latency for differential inverse kinematics (DIK) computation of a twelve-DOF manipulator is about tex2html_wrap_inline2767, which is about 180 times shorter than that of a parallel-processor approach using general-purpose microprocessors. Likewise, the latency for resolved acceleration control of a twelve-DOF manipulator is tex2html_wrap_inline2769, about 60 times shorter than that of a parallel-processor approach using conventional DSPs.

Figure 11 shows the reconfigured floating-point multi-operand multiply-adder, in which a pre-normalize circuit precedes each stage of addition and a single post-normalize circuit follows the final-stage adder. Compared with the conventional multi-operand adder shown in Figure 12, this arrangement reduces the time needed for pre- and post-normalization of the operands by about one half.

Figure 11: Reconfiguration for the floating-point multi-operand multiply-adder.  

Figure 12: Conventional floating-point multi-operand multiply-adder.  
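The difference between the two adder organizations can be sketched in software. The toy format below is an assumption made for illustration only: a value is an integer mantissa times a power of two, `MANT_BITS` is an arbitrary width, and the names `Fp`, `pre_normalize`, `post_normalize` and `add_chain` are hypothetical. With `post_each_stage=True` the chain post-normalizes after every addition, as in a conventional adder; with `False` it aligns exponents before each addition and post-normalizes once at the end, as described for the reconfigured adder.

```python
from dataclasses import dataclass
from typing import List, Tuple

MANT_BITS = 24  # assumed mantissa width of this toy format


@dataclass
class Fp:
    mant: int  # integer mantissa
    exp: int   # the represented value is mant * 2**exp

    def value(self) -> float:
        return self.mant * 2.0 ** self.exp


def pre_normalize(a: Fp, b: Fp) -> Tuple[int, int, int]:
    """Align both mantissas to the larger of the two exponents."""
    exp = max(a.exp, b.exp)
    return a.mant >> (exp - a.exp), b.mant >> (exp - b.exp), exp


def post_normalize(x: Fp) -> Fp:
    """Shift the mantissa into [2**(MANT_BITS-1), 2**MANT_BITS)."""
    mant, exp = x.mant, x.exp
    while abs(mant) >= 1 << MANT_BITS:
        mant >>= 1
        exp += 1
    while mant != 0 and abs(mant) < 1 << (MANT_BITS - 1):
        mant <<= 1
        exp -= 1
    return Fp(mant, exp)


def add_chain(operands: List[Fp], post_each_stage: bool) -> Tuple[Fp, int]:
    """Add the operands in sequence and count the post-normalizations."""
    acc = operands[0]
    posts = 0
    for nxt in operands[1:]:
        ma, mb, exp = pre_normalize(acc, nxt)  # before every addition
        acc = Fp(ma + mb, exp)
        if post_each_stage:                    # conventional organization
            acc = post_normalize(acc)
            posts += 1
    if not post_each_stage:                    # single final post-normalize
        acc = post_normalize(acc)
        posts += 1
    return acc, posts


ops = [Fp(3 << 20, 0), Fp(5 << 20, -2), Fp(7 << 20, 1), Fp(1 << 20, -1)]
conventional, n_conv = add_chain(ops, post_each_stage=True)
deferred, n_def = add_chain(ops, post_each_stage=False)
print(conventional.value(), n_conv)
print(deferred.value(), n_def)
```

For these four operands both variants produce the same sum, but the deferred variant performs one post-normalization instead of three. The sketch counts operations only; it does not model circuit delay, so it illustrates where the saving comes from rather than reproducing the one-half figure.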

To perform multiplication in one clock cycle, the PE contains pipeline registers, as shown in Figure 13. Figure 14 shows a reconfigurable parallel VLSI processor for matrix operations. In this configuration, each PE has seven sixty-four-bit-wide I/O channels, which are used to construct a two-dimensional linear array processor. Three of the channels serve the common data buses; the other four connect neighboring PEs for reconfiguration.

Figure 13: Structure of the PE.  

Figure 14: Reconfigurable parallel VLSI for matrix operations.  

Figure 15 shows the chip layout of the PE, and Figure 16 shows the features of this chip.

Figure 15: Chip layout of the PE.  

Figure 16: Features of the PE.  



Matanya Elchanani
Wed Dec 18 17:00:21 EST 1996