TMS320C4x

The TMS320C4x is the second generation of 32-bit floating-point digital signal processors from Texas Instruments. The first family member, the TMS320C40, was introduced in 1990. TMS320C4x family members target multiprocessor floating-point DSP systems for scientific, industrial, and military applications. The TMS320C4x is similar to (and object-code compatible with) its predecessor, the TMS320C3x.

Key features of the TMS320C4x

The TMS320C4x has several key features:

Architecture

Central Processing Unit (CPU)

The ’C4x’s CPU has a register-based architecture. The CPU consists of several components:

Floating-point/integer multiplier- The multiplier performs single-cycle multiplications on 32-bit integer and 40-bit floating-point values. The ’C4x implementation of floating-point arithmetic allows floating-point operations at fixed-point speeds through a 25-ns instruction cycle and a high degree of parallelism.

Arithmetic Logic Unit (ALU)- The ALU performs single-cycle operations on 32-bit integer, 32-bit logical, and 40-bit floating-point data, including single-cycle integer and floating-point conversions. Results of the ALU are always maintained in 32-bit integer or 40-bit floating-point formats.

32-bit barrel shifter- The barrel shifter is coupled to the ALU and can perform shifts of up to 32 bits left or right. The shifter supports arithmetic shifts, logical shifts, and rotate-through-carry operations.
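The three shift types named above can be illustrated with a small model. This is a generic sketch of 32-bit shift behaviour, not the ’C4x instruction semantics; the function names are hypothetical and the rotate is shown for a single bit position only.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits

def logical_shift_right(value, count):
    """Shift right, filling the vacated high bits with zeros."""
    return (value & MASK32) >> count

def arithmetic_shift_right(value, count):
    """Shift right, replicating the sign bit (bit 31)."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret as a negative Python int
    return (value >> count) & MASK32

def rotate_left_through_carry(value, carry):
    """Rotate one bit left through a carry flag (a 33-bit rotation)."""
    value &= MASK32
    new_carry = value >> 31                      # bit 31 moves into carry
    result = ((value << 1) | (carry & 1)) & MASK32  # old carry enters bit 0
    return result, new_carry
```

For the input 0x80000000 shifted right by 4, the logical shift clears the upper bits while the arithmetic shift fills them with the sign bit.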

Internal buses (CPU1/CPU2 and REG1/REG2)- Four internal buses, CPU1, CPU2, REG1, and REG2, carry two operands from memory and two operands from the register file, thus allowing parallel multiplies and adds/subtracts on four integer or floating-point operands in a single cycle.
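One common use of a parallel multiply and add is a dot product in which each step multiplies two new operands while adding the previous product to an accumulator. The sketch below models that overlap in software under simplifying assumptions; the function name is hypothetical and it does not represent actual ’C4x instructions or timing.

```python
def dot_product_overlapped(a, b):
    """Dot product where each modelled cycle does one multiply and one add."""
    acc = 0.0   # running sum (a register operand in the model)
    prod = 0.0  # product from the previous cycle (a register operand)
    for x, y in zip(a, b):
        # One modelled cycle: add the previous product to the accumulator
        # while multiplying the two operands fetched from memory.
        acc, prod = acc + prod, x * y
    return acc + prod  # drain the last product

result = dot_product_overlapped([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```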

Auxiliary register arithmetic units (ARAU)- The two auxiliary register arithmetic units (ARAU0 and ARAU1) can generate two addresses in a single cycle. The ARAUs operate in parallel with the multiplier and ALU. They support addressing with displacements, index registers (IR0 and IR1), and circular and bit-reversed addressing.
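Circular and bit-reversed addressing can be sketched in general terms as follows. This is a minimal model of the two ideas using plain modulo arithmetic and bit reversal; it is an assumption-laden illustration and does not reproduce the ARAU's actual address-generation algorithm or its buffer alignment rules. The function names are hypothetical.

```python
def circular_next(index, step, block_size):
    """Advance index by step, wrapping within a buffer of block_size."""
    return (index + step) % block_size

def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

# Order in which an 8-point FFT would visit its data with bit-reversed indexing.
fft_order = [bit_reverse(i, 3) for i in range(8)]
```

Circular addressing suits delay lines and FIR filter buffers, while bit-reversed addressing reorders the inputs or outputs of radix-2 FFTs.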

CPU Primary register file- The ’C4x primary register file provides 32 registers in a multiport register file that is tightly coupled to the CPU. All of the primary register file registers can be operated upon by the multiplier and ALU and can be used as general-purpose registers.

CPU Expansion Register File- In addition to the CPU primary register file, the ’C4x has an expansion register file that contains two special registers that act as pointers:

  1. The IVTP register points to the interrupt-vector table (IVT), which defines vectors for all interrupts.
  2. The TVTP register points to the trap vector table (TVT), which defines vectors for 512 traps.
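A table pointer of this kind is typically used as a base address to which a vector number is added. The sketch below models that lookup for the trap vector table, assuming one 32-bit word per vector and a simple base-plus-index scheme; the function name and the base address in the usage line are hypothetical.

```python
NUM_TRAPS = 512  # from the text: the TVT defines vectors for 512 traps

def trap_vector_address(tvtp, trap_number):
    """Address of a trap's vector, modelled as table base + trap number."""
    if not 0 <= trap_number < NUM_TRAPS:
        raise ValueError("trap number out of range")
    return (tvtp + trap_number) & 0xFFFFFFFF  # stay within 32-bit addresses

# Hypothetical table base, chosen only for illustration.
address = trap_vector_address(0x00001000, 5)
```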

Memory organization

The total memory reach of the ’C4x is 4G 32-bit words. Program memory (on-chip RAM or ROM and external memory) as well as registers affecting timers, communication ports, and DMA channels are contained within this space. This allows tables, coefficients, program code, and data to be stored in either RAM or ROM. Thus, memory usage is maximized, and memory space can be allocated as desired.

Memory Map- The memory map for each processor is shown in the figure. The level at the external pin ROMEN determines whether the first megaword of memory addresses the internal ROM or external memory. The maps illustrate the entire address space of the ’C40 and ’C44. The value of ROMEN affects only the first megaword of memory.

Memory Addressing Modes- The ’C4x supports a base set of general-purpose instructions as well as arithmetic-intensive instructions that are particularly suited for digital signal processing and other numeric-intensive applications.

The following list shows the addressing modes with their addressing types:

Internal buses

A large portion of the ’C4x’s high performance is due to internal busing and parallelism. Separate buses allow program fetches, data accesses, and DMA accesses to proceed in parallel.

External bus operation

The ’C4x provides two identical external interfaces: the global memory interface and the local memory interface. Each consists of a 32-bit data bus, a 31-bit (’C40) or 24-bit (’C44) address bus, and two sets of control signals. Both buses can be used to address external program/data memory or I/O space.
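The address bus widths translate directly into the number of words each interface can reach. The short calculation below works this out, assuming every address line selects one 32-bit word and that "G" and "M" denote powers of two; the helper name is hypothetical.

```python
def addressable_words(address_bits):
    """Number of distinct word addresses for a given address bus width."""
    return 1 << address_bits

c40_words_per_interface = addressable_words(31)  # 2G words
c44_words_per_interface = addressable_words(24)  # 16M words

# Two 31-bit interfaces together span 2**32 words.
c40_total_words = 2 * c40_words_per_interface
```

Under these assumptions, the two ’C40 interfaces together cover 4G words, which agrees with the total memory reach given under Memory organization.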

Interrupts

The ’C4x supports four external interrupts (IIOF3–0), a number of internal interrupts, a non-maskable external NMI interrupt, and a non-maskable external RESET signal, which sets the processor to a known state. The DMA and communication ports have their own internal interrupts. When the CPU responds to the interrupt, the IACK pin can be used to signal an external interrupt acknowledge.

Peripherals

All ’C4x on-chip peripherals are controlled through memory-mapped registers on a dedicated peripheral bus. This peripheral bus is composed of a 32-bit data bus and a 32-bit address bus. The ’C4x peripherals include two timers and six (’C40) or four (’C44) communication ports.

Pipeline operation

Two characteristics of the ’C4x that contribute to its high performance are pipelining and concurrent I/O and CPU operation. Four functional units control ’C4x pipeline operation: fetch, decode, read, and execute. Pipelining is the overlapping, or parallel operation, of the fetch, decode, read, and execute levels of a basic instruction.
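The overlap can be visualised by listing which instruction occupies each stage in each cycle. The following is an idealised sketch that assumes one cycle per stage and no pipeline conflicts or stalls; the function name is hypothetical and real ’C4x timing may differ.

```python
STAGES = ["fetch", "decode", "read", "execute"]

def pipeline_schedule(num_instructions):
    """Per cycle, map each stage to the instruction index in it (or None)."""
    total_cycles = num_instructions + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            instr = cycle - depth  # instruction that entered `depth` cycles ago
            row[stage] = instr if 0 <= instr < num_instructions else None
        schedule.append(row)
    return schedule

for cycle, row in enumerate(pipeline_schedule(5)):
    print(cycle, row)
```

Once the pipeline is full, one instruction completes its execute stage every cycle even though each instruction takes four cycles to pass through all stages.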

The four major units of the ’C4x pipeline structure and their functions are as follows: