DESCRIPTION
DYNAMICALLY CONFIGURABLE PROCESSOR CORE AND METHOD FOR MAKING THE SAME
Technical Field
The present invention relates to electronic circuitry and, more particularly, to dynamically configurable processor cores and methods for making the same.
Background of the Invention
It is well known that the general model for executing tasks described in software code is to load the code into electronic circuitry developed to receive the code and then cause the code to control the operation of the electronic circuitry. This model requires that the software code be interpreted or translated while it controls the operation of the electronic circuitry.
It is desirable to have a flexible way to develop prototypes and embedded applications by directly executing code in circuitry such as a field programmable gate array (FPGA) or application specific integrated circuit (ASIC). It is also desirable to have ways to generate prototypes and embedded applications that include parameterized circuitry, to codesign and coverify hardware and software in prototypes and embedded applications, and to develop custom embedded applications.
Summary of the Invention
According to one aspect, the invention is a processor core for implementation in an electronic circuit. The processor core includes a memory, an instruction register, a register signal circuit, a register file, and an arithmetic logic unit. The memory is responsive to memory write instructions to store address and data signals and responsive to memory read instructions to read and produce the stored address and data signals. The instruction register receives the produced stored address and data signals and holds the stored address and data signals in response to information contained in the produced stored address and data signals. The register signal circuit receives the produced stored address and data signals, the held stored address and data signals, and address signals and produces register signals in response thereto. The register file is adapted to receive the register signals produced by the register signal circuit, index registration signals and signal write instructions and to produce information, including the data signals stored in the memory. The arithmetic logic unit is adapted to receive the information produced by the register file and to produce the address signals received by the memory and by the register signal circuit.
According to another aspect, the invention is a method for implementing a processor core. The method includes the steps of a) producing a description of a memory responsive to memory write instructions to store address and data signals and responsive to memory read instructions to read and produce the stored address and data signals, and b) producing a description of an instruction register to receive the produced stored address and data signals and to hold the stored address and data signals in response to information contained in the produced stored address and data signals. The method also includes the steps of c) producing a description of a register signal circuit to receive the produced stored address and data signals, the held stored address and data signals, and address signals and to produce register signals in response thereto, and d) producing a description of a register file adapted to receive the register signals produced by the register signal circuit, index registration signals and signal write instructions and to produce information including the data signals stored in the memory. The method also includes the steps of e) producing a description of an arithmetic logic unit adapted to receive the information produced by the register file and to produce the address signals received by the memory and by the register signal circuit, and f) implementing the descriptions produced in steps a) through e).
Brief Description of the Drawings
Figure 1 is a block diagram of the architecture of a first preferred embodiment of the invention.
Figure 2 is a block diagram of the architecture of a second preferred embodiment of the invention.
Figure 3 is a block diagram of the design environment of the present invention.
Detailed Description of the Preferred Embodiments of the Invention
The present invention provides previously unattained flexibility for a wide range of prototyping and embedded applications. For embedded applications that require small memory footprints and flexible reconfigurable architectures, the present invention minimizes runtime by using direct bytecode execution.
While the present invention is applicable to a wide range of implementations, the following detailed description will be given in terms of a configurable processor core that executes Java® bytecode. This processor core is targeted to the Xilinx Virtex FPGA architecture. The core provides a clean-room implementation of the Java® Virtual Machine, and is provided as a synthesizable "soft-core" with a suite of tools for parameterized core generation, hardware/software co-design, co-verification, and custom Java® application development. Those skilled in the relevant arts will know how to implement the present invention using other bytecode, as well as in ASICs.
In the case of a Java® processor core, both the application and the operating system code can be developed in Java® and compiled to native code by standard Java® compilers. As such, the Java® processor core brings embedded Java® technology to field-reconfigurable technology. The customizable core and dynamic reconfiguration take full advantage of the features of programmable hardware. The exemplary implementation represents the first known formally synthesized Java® processor, as well as the first known configurable Java® processor. It also represents the first known Java® processor IP core targeted to programmable logic devices (PLDs) for dynamic reconfiguration.
Figure 1 is a block diagram of the architecture of a first preferred embodiment of the invention. The processor core 10 has a 32-bit architecture with 32-bit address and data paths. As will be understood by those skilled in the relevant arts, the processor core 10 can be implemented in a Virtex XCV300 BG352-4 FPGA. The processor core 10 includes a 16-32-bit dual-ported RAM 12 that implements a register file. The processor core 10 also includes an instruction register 14 that is composed of three 8-bit registers denoting the instruction (IR) 16, byte one (b1) 18, and byte two (b2) 20 of an instruction stream. In the processor core 10, a 32-bit arithmetic logic unit (ALU) 22 performs arithmetic and logical operations in response to data and information sent to the ALU 22 from the RAM 12 and sends its address and data information to a memory 24. The memory 24 receives the address and data information, which it writes in response to memory write instructions (mwi), and produces data (ro) in response to memory read instructions (mri). The RAM 12 receives the information it stores from a circuit 26, which combines the data it receives from the instruction register 14 with the data produced by the memory 24 and address information produced by the ALU 22. The RAM 12 also responds to index registers 26 and write instructions (swi). The instruction register 14 also receives the data produced by the memory 24. This embodiment provides a 32-bit direct execution Java® processor that executes Java® Virtual Machine bytecode in hardware in either a standalone (system on a chip) or core configuration. The implementation can have a built-in hardware encryption module and a fully synthesizable FPGA core.
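The data flow of Figure 1 can be illustrated with a small behavioral model. The following Python sketch is not the synthesized core; the class names, method names, and the choice of a 16-entry register file addressed by a 4-bit index are assumptions made for illustration only. It models the three 8-bit instruction register fields (IR, b1, b2), a memory that responds to write (mwi) and read (mri) requests, a register file written under swi, and a 32-bit ALU operation that wraps on overflow.

```python
from dataclasses import dataclass, field

MASK32 = 0xFFFFFFFF  # 32-bit address and data paths


@dataclass
class InstructionRegister:
    """Three 8-bit registers: opcode (IR), byte one (b1), byte two (b2)."""
    ir: int = 0
    b1: int = 0
    b2: int = 0

    def load(self, stream, pc):
        # Latch up to three consecutive bytes of the stream, padding with zeros.
        window = list(stream[pc:pc + 3]) + [0, 0, 0]
        self.ir, self.b1, self.b2 = (b & 0xFF for b in window[:3])


@dataclass
class CoreModel:
    """Hypothetical behavioral model of the Figure 1 blocks."""
    memory: dict = field(default_factory=dict)
    regs: list = field(default_factory=lambda: [0] * 16)  # assumed 16 entries
    ireg: InstructionRegister = field(default_factory=InstructionRegister)

    def memory_write(self, address, data):
        # Models the memory responding to a write instruction (mwi).
        self.memory[address & MASK32] = data & MASK32

    def memory_read(self, address):
        # Models the memory producing data (ro) on a read instruction (mri).
        return self.memory.get(address & MASK32, 0)

    def register_write(self, index, value):
        # Models a register file write (swi) at the given index.
        self.regs[index & 0xF] = value & MASK32

    def alu_add(self, index_a, index_b):
        # One example ALU operation: 32-bit addition of two register operands.
        return (self.regs[index_a & 0xF] + self.regs[index_b & 0xF]) & MASK32
```

For example, writing 0xFFFFFFFF and 2 into two registers and calling `alu_add` on them yields 1, because the sum wraps at 32 bits; the result can then be passed to `memory_write` in the same way the ALU 22 drives the memory 24.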
Figure 2 is a block diagram of the architecture of a second preferred embodiment of the invention. For this implementation of a larger system 40, a floating point interface 42, an external memory interface 44 and a "garbage collector" 46 are defined to operate with the processor core 48. The processor core 48 has the architecture shown in Figure 1, but also includes conventional programmable timers, interrupt controllers and encryption unit circuitry.
Figure 3 is a block diagram of the design environment of the present invention. Software support available for implementation includes parameterized core generation, an operating system runtime environment, co-simulation/verification tools, and a hardware debugger. The parameterized core generation of this implementation automatically synthesizes VHDL, Verilog, or EDIF gate-level netlists.
The parameterized core generator (PCG) operates via an intuitive graphical interface and includes user selectable parameters for a wide range of configurable options. Every implementation of the PCG can be application specific, leading to optimal solutions in terms of power, speed, and area.
Typically, specialized embedded applications use only a subset of the full Java® Virtual Machine instruction set. By analyzing the application, the designer can determine which Java® bytecode instructions can be omitted or moved from hardware to software, improving cost criteria such as power, speed, and area. Additional options allow the designer to include functional components of the synthesized architecture such as the built-in encryption engine, programmable timers, and interrupt controllers.
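The application analysis described above can be sketched as a scan over a method's bytecode that records which opcodes actually occur. The Python sketch below is illustrative only: the function names are hypothetical, and the operand-length table covers just a handful of Java® Virtual Machine opcodes, whereas a real analysis would need the complete instruction set, including variable-length instructions.

```python
# Operand byte counts for a small, illustrative subset of JVM opcodes.
OPERAND_BYTES = {
    0x03: 0,  # iconst_0
    0x04: 0,  # iconst_1
    0x10: 1,  # bipush
    0x11: 2,  # sipush
    0x1A: 0,  # iload_0
    0x1B: 0,  # iload_1
    0x3C: 0,  # istore_1
    0x60: 0,  # iadd
    0x64: 0,  # isub
    0x68: 0,  # imul
    0xA7: 2,  # goto
    0xAC: 0,  # ireturn
    0xB1: 0,  # return
}


def used_opcodes(code):
    """Return the set of opcodes that appear in a bytecode sequence."""
    used = set()
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        if opcode not in OPERAND_BYTES:
            raise ValueError(f"opcode 0x{opcode:02X} is not in the sketch table")
        used.add(opcode)
        # Skip the operand bytes so they are not mistaken for opcodes.
        pc += 1 + OPERAND_BYTES[opcode]
    return used


def omittable_opcodes(code, hardware_opcodes):
    """Opcodes implemented in hardware that this application never executes."""
    return set(hardware_opcodes) - used_opcodes(code)
```

Given the sequence bipush 0x68, istore_1, iload_1, iconst_1, iadd, ireturn, the scan reports six opcodes in use. The operand byte 0x68 is skipped rather than counted as imul, so imul remains a candidate for omission from the hardware.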
Once parameterized options are selected, the PCG automatically synthesizes a gate-level implementation in VHDL, Verilog, or EDIF netlist format. Along with this softcore, an HDL testbench and a customized runtime are generated. The synthesized core can then be directly input to Xilinx Foundation Series or Alliance Series software for place and route. The HDL testbench is used to test both the hardcore and the softcore. Figure 1 shows the data flow architecture of one instance of the customized runtime.
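The selection of parameterized options can be pictured as a small configuration record. The Python sketch below is a hypothetical illustration, not the PCG's actual interface: the field names, the `CoreConfig` class, and the helper functions are assumptions. Only the three output formats, the three optional components, and the three generated artifacts are taken from the description.

```python
from dataclasses import dataclass

NETLIST_FORMATS = ("VHDL", "Verilog", "EDIF")


@dataclass(frozen=True)
class CoreConfig:
    """Hypothetical record of user-selected core generation options."""
    output_format: str = "VHDL"
    encryption_engine: bool = False
    programmable_timers: bool = False
    interrupt_controllers: bool = False

    def __post_init__(self):
        # Reject formats other than the three named in the description.
        if self.output_format not in NETLIST_FORMATS:
            raise ValueError(f"unsupported netlist format: {self.output_format}")


def optional_components(config):
    """List the optional functional components the configuration enables."""
    names = []
    if config.encryption_engine:
        names.append("encryption engine")
    if config.programmable_timers:
        names.append("programmable timers")
    if config.interrupt_controllers:
        names.append("interrupt controllers")
    return names


def planned_artifacts(config):
    """Name the three outputs generated once the options are selected."""
    return [
        f"softcore netlist ({config.output_format})",
        "HDL testbench",
        "customized runtime",
    ]
```

Making the record immutable keeps a configuration from drifting between the point where options are chosen and the point where the softcore, testbench, and runtime are produced from it.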
Standard Java® class files, generated from third party commercial Java® development environments, are statically resolved to build executables. In addition, a linker that is part of the builder builds the boot tables and class initialization code, and assigns needed interrupt and trap handlers. The executable image incorporates the runtime environment. It is designed for embedded applications and is small enough to be implemented in the internal block RAM. In a standalone configuration, the inventive system architecture can incorporate the entire runtime environment and a Java® application within the Virtex block RAM.
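Static resolution can be illustrated by a toy linker that lays method bodies out in a single image and records the address assigned to each symbolic name. The Python sketch below is an assumption-laden illustration: the input layout, the `link` and `fits_in_block_ram` names, and the alphabetical placement order are all hypothetical, and the actual builder also produces boot tables, class initialization code, and handler assignments that are not modeled here.

```python
def link(classes, base=0):
    """Lay out method bodies contiguously and map each name to its address.

    `classes` maps a class name to a dict of method name -> bytecode bytes.
    Returns (symbol_table, image).
    """
    symbol_table = {}
    image = bytearray()
    # Sort names so the layout is deterministic from one build to the next.
    for class_name in sorted(classes):
        methods = classes[class_name]
        for method_name in sorted(methods):
            symbol_table[f"{class_name}.{method_name}"] = base + len(image)
            image += methods[method_name]
    return symbol_table, bytes(image)


def fits_in_block_ram(image, capacity_bytes):
    """Check the image against a caller-supplied block RAM capacity."""
    return len(image) <= capacity_bytes
```

Because every reference is resolved to a fixed address at build time, no symbolic lookup is needed at run time, which is consistent with keeping the executable image small enough for internal block RAM. The capacity is left as a parameter because it depends on the target device.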
As shown in Figure 3, the design environment 60 includes a core simulator 62 and a hardware interface 64. The core simulator 62 has a built-in debugger that features single stepping, memory and register file monitors, and conditional break points. The core simulator 62 and the included debugger share a graphical user interface to display and debug the execution of the design, at both the instruction level and the micro-step level.
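Single stepping and conditional break points can be sketched with a thin wrapper around any step function. The Python sketch below is a generic illustration rather than the debugger of the core simulator 62; the class name, the predicate-based break points, and the toy step function in the usage note are assumptions.

```python
class SteppingDebugger:
    """Hypothetical debugger wrapper offering single-step and conditional breaks."""

    def __init__(self, step_fn, state):
        self.step_fn = step_fn    # advances the simulated state by one step
        self.state = state        # simulated state, e.g. registers and memory
        self.breakpoints = []     # predicates evaluated after every step

    def add_breakpoint(self, predicate):
        # A conditional break point is any function of the state returning a bool.
        self.breakpoints.append(predicate)

    def step(self):
        # Single stepping: execute exactly one step and return the state.
        self.step_fn(self.state)
        return self.state

    def run(self, max_steps=1000):
        # Run until a break point fires; return the number of steps executed,
        # or None if the step budget is exhausted first.
        for count in range(1, max_steps + 1):
            self.step()
            if any(bp(self.state) for bp in self.breakpoints):
                return count
        return None
```

As a usage example, a toy step function that adds a counter `pc` to an accumulator `acc` and then increments `pc`, started from zero with a break point on `acc >= 10`, stops after five steps with `acc` equal to 10. The state dictionary plays the role of the memory and register file monitors, since it can be inspected after every step.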
The simulation environment consists of three separate components that validate all three aspects of the simulation environment. These are the instruction tester, the application simulator, and the hardware testbench for the synthesized core.
The hardware interface 64 provides an implementation debugging bridge for the synthesized hardware. The same debugging environment and user interface are used for the testing of the target hardware.
A generator 66 receives core configuration data from a source 68 and specifies the core. This specification is then transmitted to an HDL testbench 70. The generator 66 also transmits data to a runtime environment module 72 and the core specification to an HDL softcore module 74. The runtime environment module 72 and the softcore module 74 send data to the core simulator 62. The core simulator 62 also receives data from a Java application 76. The design environment of Figure 3 also receives a description of a logic synthesis, which is synthesized in the form of a core. This synthesis also employs the hardware interface 64, which further transmits and receives information from the core simulator 62.
While the foregoing is a detailed description of the preferred embodiment of the invention, there are many alternative embodiments of the invention that would occur to those skilled in the art and which are within the scope of the present invention.