ADSP-21369

SIMD Computational Engine

The processor contains two computational processing elements that operate as a single-instruction, multiple-data (SIMD) engine. The processing elements are referred to as PEX and PEY and each contains an ALU, multiplier, shifter, and register file. PEX is always active, and PEY may be enabled by setting the PEYEN mode bit in the MODE1 register. When this mode is enabled, the same instruction is executed in both processing elements, but each processing element operates on different data. This architecture is efficient at executing math-intensive DSP algorithms.

Entering SIMD mode also has an effect on the way data is transferred between memory and the processing elements. When in SIMD mode, twice the data bandwidth is required to sustain computational operation in the processing elements. Because of this requirement, entering SIMD mode also doubles the bandwidth between memory and the processing elements. When using the DAGs to transfer data in SIMD mode, two data values are transferred with each access of memory or the register file.

Independent, Parallel Computation Units

Within each processing element is a set of computational units. The computational units consist of an arithmetic/logic unit (ALU), multiplier, and shifter. These units perform all operations in a single cycle. The three units within each processing element are arranged in parallel, maximizing computational throughput. Single multifunction instructions execute parallel ALU and multiplier operations. In SIMD mode, the parallel ALU and multiplier operations occur in both processing elements. These computation units support IEEE 32-bit single-precision floating-point, 40-bit extended precision floating-point, and 32-bit fixed-point data formats.

Data Register File

A general-purpose data register file is contained in each processing element. The register files transfer data between the computation units and the data buses, and store intermediate results. These 10-port, 32-register (16 primary, 16 secondary) register files, combined with the enhanced Harvard architecture of the ADSP-21369 processor, allow unconstrained data flow between computation units and internal memory. The registers in PEX are referred to as R0–R15 and in PEY as S0–S15.

Context Switch

Many of the processor’s registers have secondary registers that can be activated during interrupt servicing for a fast context switch. The data registers in the register file, the DAG registers, and the multiplier result registers all have secondary registers. The primary registers are active at reset, while the secondary registers are activated by control bits in a mode control register.

Universal Registers

These registers can be used for general-purpose tasks. The four USTAT registers allow easy bit manipulations (Set, Clear, Toggle, Test, XOR) for all system registers (control/status) of the core.

The data bus exchange register (PX) permits data to be passed between the 64-bit PM data bus and the 64-bit DM data bus, or between the 40-bit register file and the PM data bus. These registers contain hardware to handle the data width difference.

Timer

A core timer can generate periodic software interrupts. The core timer can be configured to use FLAG3 as a timer expired signal.

Single-Cycle Fetch of Instruction and Four Operands

The processor features an enhanced Harvard architecture in which the data memory (DM) bus transfers data and the program memory (PM) bus transfers both instructions and data (see Figure 2). With separate program and data memory buses and on-chip instruction cache, the processors can simultaneously fetch four operands (two over each data bus) and one instruction (from the cache), all in a single cycle.

Instruction Cache

The processors include an on-chip instruction cache that enables three-bus operation for fetching an instruction and four data values. The cache is selective: only the instructions whose fetches conflict with PM bus data accesses are cached. This cache allows full-speed execution of core, looped operations such as digital filter multiply-accumulates and FFT butterfly processing.

Data Address Generators with Zero-Overhead Hardware Circular Buffer Support

The processor has two data address generators (DAGs). The DAGs are used for indirect addressing and implementing circular data buffers in hardware. Circular buffers allow efficient programming of delay lines and other data structures required in digital signal processing, and are commonly used in digital filters and Fourier transforms. The two DAGs contain sufficient registers to allow the creation of up to 32 circular buffers (16 primary register sets, 16 secondary). The DAGs automatically handle address pointer wraparound, reduce overhead, increase performance, and simplify implementation. Circular buffers can start and end at any memory location.

Flexible Instruction Set

The 48-bit instruction word accommodates a variety of parallel operations for concise programming. For example, the ADSP-21369 processor can conditionally execute a multiply, an add, and a subtract in both processing elements while branching and fetching up to four 32-bit values from memory, all in a single instruction.

Rev.
H | Page 5 of 60 | March 2019
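The address pointer wraparound described under Data Address Generators can be pictured with a small software model. The following C sketch is only an illustration of the idea, not the hardware implementation: the type `circ_t`, its fields, and the function `circ_post_modify` are hypothetical names introduced here, and the modulo arithmetic stands in for what the DAGs do without any instruction overhead.

```c
#include <assert.h>

/* Hypothetical software model of one circular-buffer register set.
   The field names are illustrative and are not hardware register definitions. */
typedef struct {
    int base;    /* first address of the buffer (any memory location) */
    int length;  /* number of locations in the buffer */
    int index;   /* current address pointer, always inside the buffer */
} circ_t;

/* Post-modify access: return the current address, then advance the pointer
   by `modify` (positive or negative), wrapping it back into the range
   [base, base + length). */
static int circ_post_modify(circ_t *c, int modify)
{
    assert(c->length > 0);

    int addr = c->index;
    int offset = (c->index - c->base + modify) % c->length;

    /* The C remainder can be negative; fold it back into the buffer. */
    if (offset < 0) {
        offset += c->length;
    }
    c->index = c->base + offset;
    return addr;
}
```

With a four-location buffer based at address 5, repeated calls with a modifier of 1 return addresses 5, 6, 7, 8 and then 5 again, which is the pattern a delay line relies on. On the processor this bookkeeping costs no extra instructions, whereas the model above spends several operations per access.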
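The SIMD behavior described under SIMD Computational Engine (one instruction, two processing elements, two data values per access) can be sketched as a multiply-accumulate loop in C. This is a rough conceptual model under stated assumptions: `simd_result_t`, `mac_model`, and `mac_total` are hypothetical names, the `peyen` argument merely mimics the effect of the PEYEN mode bit, and the assignment of consecutive array elements to PEX and PEY is an assumption made for illustration rather than a statement of the exact memory mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one accumulator per processing element. */
typedef struct {
    float acc_pex;  /* accumulator of the always-active element (PEX) */
    float acc_pey;  /* accumulator of the optional element (PEY) */
    size_t steps;   /* loop iterations, standing in for executed instructions */
} simd_result_t;

/* Multiply-accumulate over n element pairs. With peyen == 0 only the PEX
   lane works and one value per operand is consumed per step. With
   peyen != 0 both lanes execute the same operation on different data,
   so two values per operand are consumed per step (n must be even). */
static simd_result_t mac_model(const float *x, const float *y,
                               size_t n, int peyen)
{
    simd_result_t r = { 0.0f, 0.0f, 0 };
    size_t stride = peyen ? 2 : 1;

    assert(!peyen || n % 2 == 0);

    for (size_t i = 0; i < n; i += stride) {
        r.acc_pex += x[i] * y[i];              /* PEX lane */
        if (peyen) {
            r.acc_pey += x[i + 1] * y[i + 1];  /* PEY lane, same operation */
        }
        r.steps++;
    }
    return r;
}

/* Combine the per-element partial sums into the final result. */
static float mac_total(simd_result_t r)
{
    return r.acc_pex + r.acc_pey;
}
```

For a four-element input the model takes four steps with `peyen` cleared and two steps with it set, while `mac_total` gives the same sum in both cases. The halved step count is why twice the memory bandwidth is needed to keep both elements supplied with data.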