WO1995015527A1 - A bus interface with graphics and system paths for an integrated memory system - Google Patents

A bus interface with graphics and system paths for an integrated memory system

Info

Publication number
WO1995015527A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
graphics
data
controller
cpu
Prior art date
Application number
PCT/US1994/013550
Other languages
French (fr)
Inventor
Judson Lehman
Shih-Ho Wu
Original Assignee
Vlsi Technology, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vlsi Technology, Inc.
Priority to JP7515680A priority Critical patent/JPH09505424A/en
Priority to KR1019960702848A priority patent/KR960706658A/en
Publication of WO1995015527A1 publication Critical patent/WO1995015527A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/16Handling requests for interconnection or transfer for access to memory bus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/16Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1668Details of memory controller
    • G06F13/1684Details of memory controller using multiple buses
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management

Definitions

  • the present invention relates generally to computer architecture, and more particularly, to memory-sharing architectures which include graphics capabilities.
  • the foregoing memory capacity problem is typically referred to as the memory granularity problem: expensive chips can be purchased and used efficiently or inexpensive memory chips can be purchased and used inefficiently.
  • This problem is especially significant in computer systems which implement graphics, since these systems typically include a dedicated, high speed display memory. Specialized display memories are usually required because typically refresh for the graphics display (e.g., for a 1280 x 1024 display) consumes virtually all of the available bandwidth of a typical dynamic random access memory (DRAM).
  • DRAM dynamic random access memory
  • graphics refresh optimally requires a memory having a short, wide aspect ratio.
  • Display memories used as frame buffers for high resolution graphics displays have therefore become an increasingly larger fraction of a system's overall cost due to the foregoing memory problem.
  • An exemplary display memory for a current high-end display of 1280 x 1024 pixels requires just over one megabyte of memory. Thus, almost one-half of the display memory remains unused.
  • Figure 1 illustrates a typical computer system 100 which includes graphics capabilities.
  • the Figure 1 computer system includes a central processing unit (CPU) 102, a graphics controller 104 and a system controller 106 all connected to a common bus 108 having a data portion 110 and an address portion 112.
  • CPU central processing unit
  • graphics controller 104 graphics controller
  • the graphics controller 104 is connected to display memory 114 (e.g., random access memory, or RAM) by a memory bus having a memory address bus 116 and a memory data bus 118.
  • RAMDAC 120 performs digital-to-analog conversion (DAC) of signals (e.g., analog RGB color signals) used to drive a graphics display.
  • DAC digital-to-analog conversion
  • the system controller is connected to system memory 122 by a separate memory address bus 124.
  • a memory data bus 126 is connected directly between the common data bus 108 and the system memory.
  • the system memory can also include a separate cache memory 128 connected to the common bus to provide a relatively high-speed portion for the system memory.
  • the graphics controller 104 mediates access of the CPU 102 to the display memory 114.
  • the system controller 106 mediates access of the CPU 102 to system memory 122, and can include a cache controller for mediating CPU access to the cache memory 128.
  • DMA direct memory access
  • the Figure 1 configuration suffers significant drawbacks, including the granularity problem discussed above.
  • the display memory 114 is limited to use in connection with the graphics controller and cannot be used for general system needs. Further, because separate memories are used for the main system and for the graphics memory, a higher pin count renders integration of the Figure 1 computer system difficult.
  • the use of separate controllers and memories for the main system and the graphics also results in significant duplication of bus interfaces, memory control and so forth, thus leading to increased cost. For example, the maximum memory required to handle worst case requirements for each of the system memory and the graphics memory must be separately satisfied, even though the computer system will likely never run an application that would require the maximum amount of graphics and main store memory simultaneously.
  • Examples of computers using time-slice access to an integrated memory are the Commodore and the Amiga.
  • the Apple II computer also used a single memory for system and display purposes.
  • the recently-released Polar™ chip set of the present assignee, for portable and laptop systems, makes provision for integrated memory.
  • Wingine is similar to the conventional computer architecture of Figure 1 but with the addition of a separate path that enables the system controller to perform write operations to graphics memory.
  • Wingine may be viewed as a variation on previous integrated-memory architectures. Part of the system memory is replaced with VRAM, thereby eliminating the bandwidth contention problem using a more expensive memory (VRAM is typically at least twice as expensive as DRAM). In the Wingine implementation, VRAM is not shared but is dedicated for use as graphics memory.
  • one version of the Alpha microprocessor sold by Digital Equipment Corporation reportedly has on board a memory controller that allows VRAM to be used to alleviate the bandwidth contention problem. The CPU performs a role analogous to that of a graphics controller, viewing the VRAM frame buffer as a special section of system RAM. As with Wingine, the VRAM is not shared.
  • the present invention provides a low-cost computer system which includes a single shared memory that can be independently accessible as graphics memory or main store system memory without performance degradation. Because the "appetite" for main system memory (unlike that of a display memory) is difficult to satisfy, the memory granularity problem can be addressed by programmably reallocating an unused portion of a display memory for system memory use. Reallocation of the unused display memory alleviates any need to oversize the display memory, yet realizes the cost effectiveness of using readily available memory sizes. Further, reallocation of the graphics memory avoids any need to separately consider both the system memory and the display memory in accommodating worst case operational requirements.
  • performance penalties can be minimized by dynamically allocating the memory bandwidth between graphics and system memory operations on demand, thereby avoiding use of fixed time slices.
  • graphics refresh functions can be accommodated with little or no effect on system memory demands.
  • Exemplary embodiments achieve concurrent graphics and system operations by using a memory controller for controlling access to the shared memory, and an arbiter for arbitrating among requests for access to the memory.
  • a first data path is connected to the arbiter and includes a first buffer store for facilitating exchange of data with the memory
  • a second data path is connected to the arbiter and includes a second buffer store for facilitating exchange of data with the memory.
  • Configuration registers can programmably configure the apparatus such that a first portion of the memory is allocated as display memory and a second portion of the memory is allocated as main memory.
  • Control circuitry connected to the configuration registers and responsive to one or more signals applied to the apparatus, including address, data and control signals, can cause at least some of the data signals to be applied to only one of the first and second data paths.
  • separate buffer stores can be provided for graphics and system accesses, thereby achieving improved latency times for both graphics and system cycles.
  • the queues are serviced in parallel and independently of each other.
  • Figure 1 is a system block diagram of a conventional computer system
  • FIG. 2 is a block diagram of another conventional computer system
  • FIG. 3 is a system block diagram of a base computer system in accordance with an exemplary embodiment of the present invention.
  • FIG. 4 is a more detailed block diagram of the graphics controller of Figure 3;
  • FIG. 5 is a more detailed block diagram of the bus interface of Figure
  • Figure 6 is a more detailed diagram of the bus status and configuration registers and decode block of Figure 5;
  • Figure 7 is a block diagram illustrating a remapping of memory in accordance with an exemplary embodiment of the present invention.
  • Figure 3 illustrates an exemplary embodiment of an apparatus for processing data in accordance with the present invention.
  • the Figure 3 apparatus generally labeled 300, can be a computer system which includes a main CPU 302.
  • the main CPU 302 can, for example, be any available microprocessor, such as any standard 486-based processor.
  • the Figure 3 apparatus includes a means for storing data, generally represented as a memory 304.
  • the data storing means 304 includes a system memory portion (e.g., random access memory, or RAM) and a display memory portion (e.g., RAM) addressed via common address lines 306 labeled MA.
  • the display (e.g., graphics) memory portion can include an address space from an address 0 to an address (B-1) for a data storing means having B bytes.
  • the display memory portion and the system memory portion read and write data via common memory data lines 308 labeled MD.
  • the Figure 3 apparatus includes means for controlling a display operation of the Figure 3 system independently of the system controller.
  • the display controlling means is generally represented as a display (e.g., graphics) controller 400.
  • the graphics controller 400 is connected to the CPU 302 via CPU address lines 310 and CPU data lines 312 of a main CPU bus 314.
  • the graphics controller 400 controls access to the graphics memory portion of the data storing means.
  • the Figure 3 computer system further includes means for controlling access to the system memory portion of the data storing means 304.
  • the means for controlling access to the system memory portion is generally represented as a system controller 316 which is interfaced to the CPU 302 and the graphics controller 400 via the main CPU bus 314.
  • although the graphics controller and the system controller are indicated as separate blocks, in a physical implementation they may reside on the same integrated circuit chip or on separate chips.
  • the signal lines 318, 322 and 324 permit the Figure 3 computer system to provide cache support for the system memory via the graphics controller
  • a cache memory 326 can be included for this purpose. Memory reads and writes can be performed to the data storing means in both burst and non-burst modes.
  • the signal line 322 labeled DRAM# indicates to the graphics controller when an addressable location exists within the shared memory and is not in the L2 cache.
  • the signal line 324 labeled ERDY# is an early ready signal from the graphics controller to the system controller to verify that valid data has been read from the shared memory and will be valid for reading by the CPU in a predetermined time.
  • L2 cache level-two cache
  • where system memory is located in a single system memory 122, a cache controller included within the system controller 106 can function independently of the graphics controller 104.
  • system memory is located in the shared data storing means 304.
  • existing cache control capabilities of the system controller 316 can still be used by establishing communication between the graphics controller 400 and the system controller 316.
  • system memory is located in both the data storing means represented by memory 304, and an optional expansion memory 328. A failure to detect data in the L2 cache may therefore result in the data being found in the shared memory or in expansion memory. Again, communication between the graphics controller 400 and the system controller 316 can handle this situation.
  • FIG. 3 illustrates the manner in which efficient L2 cache memory support is provided for a system wherein a system controller 316 has an integrated L2 cache controller, a graphics controller, and a shared memory system.
  • L2 cache support is provided for all system memory, regardless of the controller to which it is connected. Such support requires coordination between the system controller (with its integrated L2 cache controller) and the graphics controller.
  • L2 cache support may be provided using the existing backoff (i.e., BOFF#) CPU bus signal and the two new signals referred to herein as the DRAM# and ERDY# signals.
  • DRAM# is driven by the system controller and ERDY# is driven by the graphics controller.
  • the system controller 316 monitors memory cycles and notifies the graphics controller when to ignore a particular memory cycle by deasserting the DRAM# on the signal line 322 at a predetermined time in the memory cycle.
  • a system controller instructs the graphics controller to ignore a particular memory cycle when the addressable location is to a location other than the graphics portion of the data storing means (e.g., if the addressable location is to an ISA or PCI bus of the system, or if it's a location within the cache, or in another separate memory and so forth).
  • the graphics controller 400 also monitors memory cycles and begins a memory cycle when an addressable location is within the range of addressable locations for which the graphics controller is enabled to respond. In operation, the graphics controller tests the DRAM# on the signal line 322 at a predetermined time to determine whether it should respond to a current memory cycle. If the DRAM# signal on the signal line 322 has been deasserted by the system controller (i.e., false) the graphics controller 400 aborts the current memory cycle.
  • otherwise, the memory cycle continues and the graphics controller 400 asserts the signal ERDY# on the signal line 324 to indicate to the system controller that the graphics controller is ready to read data.
  • the ERDY# signal represents an early ready signal which occurs a fixed number of clock cycles before data which is to be read becomes valid.
  • the cache controller 320 integrated within the system controller 316 senses the ERDY# signal on signal line 324 and initiates a writing of data into the cache 326.
  • the graphics controller can also be programmed to drive ERDY# at the end of a memory read cycle to signal to the system controller if a parity error occurred during the read.
  • the graphics controller also monitors BOFF# and, when it is asserted, is alerted that the read has been aborted. If the write-back is to memory outside the graphics controller's address space, the graphics controller may allow the read to continue, assuming that by the time the read has completed, the write-back may also be done, reducing latency time. The write-back may also be to memory in the graphics controller's address space. In this case, the system controller keeps BOFF# asserted and "masters" the write-back on the CPU bus by driving the bus just as the CPU would do if it were initiating the write. After the write-back has been completed, BOFF# is deasserted, and the CPU restarts the read operation.
  • ERDY# may be driven by multiple sources in an "open-drain" configuration. Multiple DRAM# lines can be used or encoded together to signal to multiple devices.
  • the graphics controller 400 can include means for reallocating addressable locations of the data storing means 304 as display memory which is accessible by the graphics controller
  • the exemplary graphics controller 400 can include means for dynamically controlling access of the system controller and the display controlling means to the display memory portion and the system memory portion, respectively.
  • the reallocating means and access controlling means are generally represented as block 500, included within the graphics controller 400.
  • the Figure 3 computer system can provide significant advantages.
  • the Figure 3 system represents a scalable architecture which can be configured for various price/performance alternatives.
  • the Figure 3 system represents a relatively low-cost system which includes a single bank of shared memory (represented by the data storing means 304) which can be concurrently used, and dynamically reconfigured for both graphics and system functions.
  • the allocation of memory bandwidth between display access and system access is not fixed; rather, memory bandwidth is dynamically allocated on demand between display access and system access.
  • Exemplary embodiments of the present invention can achieve enhanced performance by adding a second bank of memory represented by the expansion memory means 328.
  • B bytes of memory in the shared memory can be allocated to system use, with an address space from address locations zero through address (B-1).
  • the expansion memory can be considered to contain E bytes of expansion system memory (e.g., RAM).
  • the E bytes can be addressed beginning with starting address B and ending with address (E + B - 1).
  • the data storing means 304 can continue to be shared between the graphics controller and the system controller.
  • a relatively high level of performance can be achieved by dedicating all of the data storing means 304 to graphics, reserving only the relatively fast portion of the data storing means or the expansion memory means for system use.
  • exemplary embodiments of the present invention provide significant advantages. By providing a single sharable block of memory that is independently accessible as graphics memory or as main store memory, improved performance at relatively low-cost can be realized. By rendering allocation of the shared memory programmably selectable, any need to have maximum memory size for each of the independent graphics and main memory functions can be eliminated. Further, memory bandwidth can be dynamically allocated on demand rather than via fixed time slices, further improving performance.
  • the graphics controller 400 interfaces to the CPU bus 314 via the reallocating means represented as bus interface 500.
  • the graphics controller interfaces to the data storing means 304 via the access controlling means, represented as a memory interface 408.
  • Commands and data from the Figure 3 CPU 302 are distributed to various logic blocks of the graphics controller 400 on two main buses represented by a display access bus 405 and a system access bus 407, indicated by thick, heavy lines in Figure 4.
  • the system access bus 407 is connected to the memory interface 408.
  • the display access bus 405 is connected to various graphics controller logic blocks which are responsive to commands or programming instructions from the CPU. These logic blocks include a CRT controller (CRTC) 404, a sequencer (SEQ) 410, a RAMDAC interface 412, a clock synthesizer interface 418, an attribute controller (ATT) 422, a hardware cursor (HWC) 428, a graphics accelerator (Accel) 414 and pixel logic 416. In other implementations, other logic blocks may be included or some of the foregoing logic blocks may be omitted.
  • the CRTC 404 provides vertical and horizontal sync signals to a raster-scan CRT display.
  • the sequencer 410 provides basic timing control for the CRTC 404 and the attribute controller 422.
  • the RAMDAC interface 412 provides for programming of a RAMDAC (i.e., external or integrated) such as the RAMDAC of Figure 1.
  • the RAMDAC is a combination random access memory and digital-to-analog converter that functions as a color palette which drives the CRT.
  • the RAMDAC 120 in Figure 1 can be a look-up table used to convert the data associated with a pixel in the display memory into a color (e.g., RGB analog output).
  • the attribute controller 422 provides processing for alphanumeric and graphics modes.
  • the hardware cursor 428 provides for display of any of a number of user-definable cursors.
  • the accelerator 414 and pixel logic 416 assist the host CPU in graphics-related operations.
  • the pixel logic 416 of Figure 4 may also function as a pixel cache.
  • the clock synthesizer interface 418 provides for programming of a programmable clock synthesizer (i.e., external or internal). Operation of the clock synthesizer interface, along with the other various graphics logic blocks in Figure 3, is well-known to one of ordinary skill in the art.
  • the memory interface 408, which functions as the access controlling means, arbitrates memory access between a number of different entities: the system access bus 407, the pixel logic 416, the display refresh logic 426, and the hardware cursor 428. Priority between these entities can vary according to system activity and the degree to which various buffers are full or empty. The priority scheme takes into account whether a particular access relates to a "mission-critical" function, so as to prevent such functions from being disrupted. For example, display refresh can be classified as a mission-critical function.
  • the exemplary Figure 3 system allocates a portion of the graphics controller's memory to the CPU for system use such that a single shared memory can be used to concurrently implement display functions and system memory functions.
  • latency times for both graphics and system cycles can be further improved by providing separate queues for graphics and system accesses, with the separate queues being serviced in parallel independently of each other.
  • Figure 5 shows the reallocating means represented by the bus interface 402 of Figure 4 in greater detail.
  • a bus state machine 502 connects to the CPU bus and executes bus cycles involving the graphics controller. Commands or data from the CPU are latched in a command latch 504.
  • the command latch is connected to both a graphics queue 506 and a system queue 508.
  • the graphics queue 506 establishes bi-directional operation using two separate, uni-directional queues: one queue that stores commands from the CPU and outputs them from the bus interface for use by the graphics controller, and one queue that stores data from the graphics controller and outputs it to the CPU.
  • the system queue 508 is a bi-directional queue composed of two unidirectional queues.
  • the output buses of the graphics queue and the system queue are therefore bi-directional and are connected to an output latch 510 in order to drive data from the graphics controller to the CPU.
  • the graphics queue 506 and the system queue 508 are controlled by a graphics queue state machine 512 and a system queue state machine 514, respectively. These state machines are in turn controlled by the bus state machine 502.
  • a bus status/configuration registers/address decode block 520 is connected to the bus state machine 502. Further, block 520 is connected with an output multiplexer 516 of the output latch, and an output multiplexer ("mux") 518 of the command latch.
  • Bus status registers of block 520 contain information regarding the state of the graphics controller and the amount of available space in the graphics and system queues. The bus status registers may be read directly through the output mux 516 without putting a read command into either queue. Configuration registers of block 520 are written to from the bus state machine 502 and are used to select modes of operation in addition to those provided in a typical video graphics array (VGA) implementation.
  • VGA video graphics array
  • programming flexibility can be improved by providing remapping registers which allow the CPU to reallocate the addresses to which the graphics controller responds.
  • Address decoding is programmable, such that the graphics controller responds to a CPU command if the command is to an address within the graphics controller's designated address space.
  • address decode logic included in block 520 allows a register/memory location to be reallocated (i.e., remapped) from an original address to a new address more suitable to the CPU. This address decode logic therefore maps the new CPU address back to its original address.
  • An exemplary sequence would be as follows.
  • the CPU issues a read command of a particular address.
  • the graphics controller's address decode logic included in block 520 determines that the address is within the graphics controller's range, but that the desired register/memory location has been remapped from its original address to a new address more suitable to the CPU.
  • the address decode logic in block 520 maps the CPU address back to the original address and latches that address into the appropriate queue via the mux 518.
  • registers and memory are always at fixed addresses, simplifying decoding of the graphics and system queue buses.
  • a separate latch (one-stage queue) 522 can be provided for the hardware cursor.
  • in Figure 6, the bus status/configuration registers/address decode block 520 of Figure 5 is illustrated in greater detail.
  • the block 520 includes address decode logic 602, configuration registers 604 and status registers 606.
  • the address decode logic 602 examines the CPU control lines that define whether the command is to memory or I/O and is a read or a write operation.
  • the address decode logic 602 further compares the CPU address on the address bus to addresses programmed for various logic groups. If a match is found, the appropriate select line is asserted.
  • VGA mode I/O VGA mode frame buffer
  • Windows mode registers Windows mode frame buffer
  • system memory system memory
  • configuration registers or the status registers address space (which is within the configuration registers address space).
  • the configuration registers 604 are initialized to some pre-determined value at power-on reset.
  • the configuration registers remap some of the address spaces within the graphics controller. This remapping allows software to access a particular register or logic block at an address different from the one to which it was initialized. Additional capability can be added to inhibit the graphics controller from responding to accesses of particular logic or memory. This may be done in various ways, for example explicitly via enable/disable bits in a register or implicitly by programming the low and high address boundaries for a group to be the same.
  • the configuration registers can be read by the CPU via a port 608.
  • the status registers 606 are read only. They contain information such as queue status (how full the queues are), what the accelerator is doing, what errors have occurred, and so forth. Certain bits of the status registers may be cleared by being read. The CPU reads the status registers directly without having to go through the graphics or system queues.
  • Figure 7 illustrates a reallocation of addressable locations in memory when the expansion memory means 328 of Figure 3 is used.
  • the reallocation of Figure 7 ensures that addressable locations of any expansion memory are added to the bottom of available system memory. This ensures that expansion memory will always be accessed first by the CPU to accommodate system upgrades to high-speed memory.
  • the remapping operation depicted generally in Figure 7 is described in greater detail in U.S. Patent Application Serial Number 08/159,224 entitled Method and Apparatus for Providing and
  • the present architecture allows system cost to be significantly reduced. Further, by providing a bus interface with separate graphics and system paths, the cost savings described can be achieved with a minimal performance penalty. In a system complete with separate expansion memory, performance at least as good as in conventional memory systems is obtained. In some cases, the possibility of parallel main memory access to two or more possible memory paths results in increased performance by effectively overlapping accesses.
  • although the invention has been described in terms of a two-bank system having graphics and main store system memory, the invention can be extended to any arbitrary number of concurrently operating memory banks.
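
By way of illustration only, the address decoding and remapping behavior attributed above to block 520 can be sketched in Python. The class `AddressGroup`, the function `decode`, the group names and every address value are hypothetical and chosen for the example. The half-open `[low, high)` boundary convention is also an assumption, adopted so that programming equal low and high boundaries yields an empty range, which matches the implicit disable described above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AddressGroup:
    """One decodable logic group (hypothetical model of a select line)."""
    name: str             # label for the select line
    low: int              # first CPU address the group responds to
    high: int             # one past the last CPU address (assumed half-open)
    original_base: int    # fixed internal address of the group's first location
    enabled: bool = True  # explicit enable/disable bit

    def matches(self, cpu_addr: int) -> bool:
        # low == high gives an empty range, which implicitly disables the group.
        return self.enabled and self.low <= cpu_addr < self.high


def decode(groups: List[AddressGroup], cpu_addr: int) -> Optional[Tuple[str, int]]:
    """Return (select line, original address), or None if no group responds."""
    for group in groups:
        if group.matches(cpu_addr):
            # Map the remapped CPU address back to the fixed original address.
            return group.name, group.original_base + (cpu_addr - group.low)
    return None


# Illustrative values only; none of these addresses come from the source.
EXAMPLE_GROUPS = [
    AddressGroup("frame_buffer", low=0xA0000, high=0xC0000, original_base=0x0),
    AddressGroup("system_memory", low=0x100000, high=0x200000, original_base=0x40000),
    AddressGroup("disabled_group", low=0x300000, high=0x300000, original_base=0x80000),
]
```

Because `decode` always returns the original address, anything downstream of the decoder in this sketch sees registers and memory at fixed locations, whatever window the CPU has been given.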

Abstract

The present invention provides a low-cost computer system which includes a single shared memory that can be independently accessible as graphics memory or main store system memory without performance degradation. Because the 'appetite' for main system memory (unlike that of a display memory) is difficult to satisfy, the memory granularity problem can be addressed by programmably reallocating an unused portion of a display memory for system memory use. Reallocation of the unused display memory alleviates any need to oversize the display memory, yet realizes the cost effectiveness of using readily available memory sizes. Further, reallocation of the graphics memory avoids any need to separately consider both the system memory and the display memory in accommodating worst case operational requirements.

Description

A BUS INTERFACE WITH GRAPHICS AND SYSTEM PATHS FOR AN INTEGRATED MEMORY SYSTEM
BACKGROUND OF THE INVENTION
Field of the Invention: The present invention relates generally to computer architecture, and more particularly, to memory-sharing architectures which include graphics capabilities.
State of the Art:
As the density of solid state memories increases, oversized memories are being wastefully used for purposes which optimally require specialized memory configurations (e.g., graphics refresh). One reason for this is that manufacturers attempt to produce memory sizes which will achieve a broad range of applicability and a high volume of production. The more popular, and thus more cost-effective memories, tend to be fabricated with square aspect ratios or with tall, thin aspect ratios (i.e., a large number of fixed length words) that are not readily suited to specialized uses.
Although uses which can exploit memories with these popular aspect ratios can be implemented in a relatively cost-effective manner, specialized uses which cannot exploit these aspect ratios can be proportionately more expensive to implement. The expense associated with implementing specialized uses assumes one of two forms: (1) the increased cost associated with purchasing a memory which does not conform to a readily available and widely used memory configuration; or (2) the increased cost associated with purchasing a readily available memory which is much larger than needed to implement a specialized use (e.g., a relatively square memory which must be tall enough to obtain a desired width, even though only a relatively small number of rows in the memory are needed for the purpose at hand).
The foregoing memory capacity problem is typically referred to as the memory granularity problem: expensive chips can be purchased and used efficiently, or inexpensive memory chips can be purchased and used inefficiently. This problem is especially significant in computer systems which implement graphics, since these systems typically include a dedicated, high speed display memory. Specialized display memories are usually required because refresh for the graphics display (e.g., for a 1280 x 1024 display) typically consumes virtually all of the available bandwidth of a typical dynamic random access memory (DRAM).
To update a video line on a high resolution graphics display, graphics refresh optimally requires a memory having a short, wide aspect ratio. Display memories used as frame buffers for high resolution graphics displays have therefore become an increasingly larger fraction of a system's overall cost due to the foregoing memory problem. For display memories, even a two megabit memory can be unnecessarily large, such that it cannot be effectively used. An exemplary display memory for a current high-end display of 1280 x 1024 pixels requires just over one megabyte of memory. Thus, almost one-half of the display memory remains unused.
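The frame buffer arithmetic behind these figures can be checked with a short sketch. The pixel depth of 8 bits and the 2 MiB part size are assumptions chosen for illustration; the text states neither.

```python
def frame_buffer_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Bytes needed to hold one full frame at the given pixel depth."""
    return width * height * bits_per_pixel // 8


MIB = 1024 * 1024

needed = frame_buffer_bytes(1280, 1024, 8)  # assumed 8 bits per pixel
part_size = 2 * MIB                         # assumed 2 MiB display memory part
unused = part_size - needed                 # bytes left idle in the part

print(f"frame buffer: {needed} bytes ({needed / MIB:.2f} MiB)")
print(f"unused: {unused} bytes ({unused / part_size:.1%} of the part)")
```

Under these assumptions the frame buffer occupies 1.25 MiB and 768 KiB of the part sits idle. The exact fraction depends on the pixel depth and on which memory sizes are readily available.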
For example, Figure 1 illustrates a typical computer system 100 which includes graphics capabilities. The Figure 1 computer system includes a central processing unit (CPU) 102, a graphics controller 104 and a system controller 106 all connected to a common bus 108 having a data portion 110 and an address portion 112.
The graphics controller 104 is connected to display memory 114 (e.g., random access memory, or RAM) by a memory bus having a memory address bus 116 and a memory data bus 118. RAMDAC 120 performs digital-to-analog conversion (DAC) of signals (e.g., analog RGB color signals) used to drive a graphics display.
The system controller is connected to system memory 122 by a separate memory address bus 124. A memory data bus 126 is connected directly between the common data bus 108 and the system memory. The system memory can also include a separate cache memory 128 connected to the common bus to provide a relatively high-speed portion for the system memory.
The graphics controller 104 mediates access of the CPU 102 to the display memory 114. For system memory transfers not involving direct memory access (DMA), the system controller 106 mediates access of the CPU 102 to system memory 122, and can include a cache controller for mediating CPU access to the cache memory 128.
However, the Figure 1 configuration suffers significant drawbacks, including the granularity problem discussed above. The display memory 114 is limited to use in connection with the graphics controller and cannot be used for general system needs. Further, because separate memories are used for the main system and for the graphics memory, a higher pin count renders integration of the Figure 1 computer system difficult. The use of separate controllers and memories for the main system and the graphics also results in significant duplication of bus interfaces, memory control and so forth, thus leading to increased cost. For example, the maximum memory required to handle worst case requirements for each of the system memory and the graphics memory must be separately satisfied, even though the computer system will likely never run an application that would require the maximum amount of graphics and main store memory simultaneously. In addition, transfers between the main memory and the graphics require that either the CPU or a DMA controller intervene, thus blocking use of the system bus.
Attempts have been made to alleviate the foregoing drawbacks of the Figure 1 system by integrating system memory with display memory. However, these attempts have reduced duplication of control features at the expense of system performance. These attempts have not adequately addressed the granularity problem.
Some attempts have been made, particularly in the area of portable and laptop systems, to unify display memory and system memory. For example, one approach to integrated display memory and system memory is illustrated in Figure 2. However, approaches such as that illustrated in Figure 2 suffer significant drawbacks. For example, refreshing of the display via the graphics controller requires that cycles be stolen from the main memory, rendering performance unpredictable. Further, these approaches use a time-sliced arbitration mode for allocating specific time slots among the system controller and the graphics controller, such that overall system performance is further degraded.
In other words, overall performance of the Figure 2 system is limited by the bandwidth of the single memory block, and the high demands of graphics refresh function alone introduce significant performance degradation. The allocation of memory bandwidth between display access and system access using fixed time-slots only adds to performance degradation. Because the time slots must be capable of handling the worst case requirements for each of the system memory and display memory subsystems, the worst possible memory allocation is forced to be the normal case.
Examples of computers using time-slice access to an integrated memory are the Commodore and the Amiga. The Apple II computer also used a single memory for system and display purposes. In addition, the recently-released Polar™ chip set of the present assignee, for portable and laptop systems, makes provision for integrated memory. A different approach is described in a document entitled "64200 (Wingine™) High Performance 'Windows™ Engine'", available from Chips and Technologies, Inc. In one respect, Wingine is similar to the conventional computer architecture of Figure 1 but with the addition of a separate path that enables the system controller to perform write operations to graphics memory.
The graphics controller, meanwhile, performs screen refresh only. In another respect, Wingine may be viewed as a variation on previous integrated-memory architectures. Part of the system memory is replaced with VRAM, thereby eliminating the bandwidth contention problem using a more expensive memory (VRAM is typically at least twice as expensive as DRAM). In the Wingine implementation, VRAM is not shared but is dedicated for use as graphics memory. Similarly, one version of the Alpha microprocessor sold by Digital Equipment Corporation reportedly has on board a memory controller that allows VRAM to be used to alleviate the bandwidth contention problem. The CPU performs a role analogous to that of a graphics controller, viewing the VRAM frame buffer as a special section of system RAM. As with Wingine, the VRAM is not shared.
Thus, traditional computer architectures cannot efficiently integrate a single memory to accommodate the two different functions of display memory and system memory without significantly degrading system performance. What is needed, then, is a new computer architecture that allows display memory and system memory to be integrated while still achieving high system performance. Such an architecture should, desirably, allow for memory expansion and use with cache memory. Further, any such system should provide an upgrade path to existing and planned high performance memory chips, including VRAM, synchronous DRAM (SDRAM) and extended data out DRAM (EDO DRAM).
SUMMARY OF THE INVENTION
The present invention provides a low-cost computer system which includes a single shared memory that can be independently accessible as graphics memory or main store system memory without performance degradation. Because the "appetite" for main system memory (unlike that of a display memory) is difficult to satisfy, the memory granularity problem can be addressed by programmably reallocating an unused portion of a display memory for system memory use. Reallocation of the unused display memory alleviates any need to oversize the display memory, yet realizes the cost effectiveness of using readily available memory sizes. Further, reallocation of the graphics memory avoids any need to separately consider both the system memory and the display memory in accommodating worst case operational requirements.
In exemplary embodiments, performance penalties can be minimized by dynamically allocating the memory bandwidth between graphics and system memory operations on demand, thereby avoiding use of fixed time slices. By eliminating use of fixed time slices to arbitrate between display memory and system memory accesses, graphics refresh functions can be accommodated with little or no effect on system memory demands.
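The difference between fixed time slices and on-demand allocation can be illustrated with a toy model. This is a minimal sketch, not the arbitration logic of the described system: the function names, the strict alternation of slots, the graphics-first policy, and the assumption that all requests are pending from the start are all simplifications introduced here.

```python
from collections import deque


def run_fixed_slots(gfx_requests, sys_requests, slots):
    """Alternate slots between graphics and system; an idle owner wastes its slot."""
    gfx, sys_q = deque(gfx_requests), deque(sys_requests)
    served = 0
    for slot in range(slots):
        owner = gfx if slot % 2 == 0 else sys_q
        if owner:
            owner.popleft()
            served += 1
    return served


def run_on_demand(gfx_requests, sys_requests, slots):
    """Give each slot to whichever requester has work, graphics first."""
    gfx, sys_q = deque(gfx_requests), deque(sys_requests)
    served = 0
    for _ in range(slots):
        if gfx:
            gfx.popleft()
            served += 1
        elif sys_q:
            sys_q.popleft()
            served += 1
    return served


# Two graphics requests and six system requests competing for eight slots.
print(run_fixed_slots(range(2), range(6), 8))
print(run_on_demand(range(2), range(6), 8))
```

In this toy workload the fixed scheme leaves two graphics slots idle while system requests wait, whereas the on-demand scheme fills every slot.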
Exemplary embodiments achieve concurrent graphics and system operations by using a memory controller for controlling access to the shared memory, and an arbiter for arbitrating among requests for access to the memory. A first data path is connected to the arbiter and includes a first buffer store for facilitating exchange of data with the memory, and a second data path is connected to the arbiter and includes a second buffer store for facilitating exchange of data with the memory. Configuration registers can programmably configure the apparatus such that a first portion of the memory is allocated as display memory and a second portion of the memory is allocated as main memory. Control circuitry connected to the configuration registers and responsive to one or more signals applied to the apparatus, including address, data and control signals, can cause at least some of the data signals to be applied to only one of the first and second data paths.
In accordance with further embodiments, separate buffer stores, or queues, can be provided for graphics and system accesses, thereby achieving improved latency times for both graphics and system cycles. The queues are serviced in parallel and independently of each other.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention can be further understood with reference to the following description and the appended drawings, wherein like elements are provided with the same reference numerals. In the drawings:
Figure 1 is a system block diagram of a conventional computer system;
Figure 2 is a block diagram of another conventional computer system;
Figure 3 is a system block diagram of a base computer system in accordance with an exemplary embodiment of the present invention;
Figure 4 is a more detailed block diagram of the graphics controller of Figure 3;
Figure 5 is a more detailed block diagram of the bus interface of Figure 3;
Figure 6 is a more detailed diagram of the bus status and configuration registers and decode block of Figure 5; and
Figure 7 is a block diagram illustrating a remapping of memory in accordance with an exemplary embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Figure 3 illustrates an exemplary embodiment of an apparatus for processing data in accordance with the present invention. The Figure 3 apparatus, generally labeled 300, can be a computer system which includes a main CPU 302. The main CPU 302 can, for example, be any available microprocessor, such as any standard 486-based processor.
The Figure 3 apparatus includes a means for storing data, generally represented as a memory 304. In accordance with the present invention, the data storing means 304 includes a system memory portion (e.g., random access memory, or RAM) and a display memory portion (e.g., RAM) addressed via common address lines 306 labeled MA. The display (e.g., graphics) memory portion can include an address space from an address 0 to an address (B-1) for a data storing means having B bytes. Further, the display memory portion and the system memory portion read and write data via common memory data lines 308 labeled MD.
The Figure 3 apparatus includes means for controlling a display operation of the Figure 3 system independently of the system controller. The display controlling means is generally represented as a display (e.g., graphics) controller 400. The graphics controller 400 is connected to the CPU 302 via CPU address lines 310 and CPU data lines 312 of a main CPU bus 314. The graphics controller 400 controls access to the graphics memory portion of the data storing means.
The Figure 3 computer system further includes means for controlling access to the system memory portion of the data storing means 304. The means for controlling access to the system memory portion is generally represented as a system controller 316 which is interfaced to the CPU 302 and the graphics controller 400 via the main CPU bus 314. Although the graphics controller and the system controller are indicated as separate blocks, in a physical implementation, they may reside on the same integrated circuit chip or on separate chips.
The signal lines 318, 322 and 324 permit the Figure 3 computer system to provide cache support for the system memory via the graphics controller 400, where the cache controller is included within the system controller. In accordance with exemplary embodiments, a cache memory 326 can be included for this purpose. Memory reads and writes can be performed to the data storing means in both burst and non-burst modes.
Generally speaking, the signal line 322 labeled DRAM# indicates to the graphics controller when an addressable location exists within the shared memory and is not in the L2 cache. The signal line 324 labeled ERDY# is an early ready signal from the graphics controller to the system controller to verify that valid data has been read from the shared memory and will be valid for reading by the CPU in a predetermined time.
More particularly, typical personal computer systems feature an on-chip level-one (L1) cache of, for example, 8 kilobytes within the CPU. Any external cache therefore functions as a level-two (L2) cache; i.e., data sought by the CPU is first sought in the L1 cache, then sought in the L2 cache, if necessary, and then sought in system memory if the data has not been found.
In the conventional computer architecture of Figure 1, since system memory is located in a single system memory 122, a cache controller included within the system controller 106 can function independently of the graphics controller 104.
In the system of Figure 3, on the other hand, system memory is located in the shared data storing means 304. However, in accordance with exemplary embodiments, existing cache control capabilities of the system controller 316 can still be used by establishing communication between the graphics controller 400 and the system controller 316. Further, in the system of Figure 3, system memory is located in both the data storing means represented by memory 304, and an optional expansion memory 328. A failure to detect data in the L2 cache may therefore result in the data being found in the shared memory or in expansion memory. Again, communication between the graphics controller 400 and the system controller 316 can handle this situation.
Figure 3 illustrates the manner in which efficient L2 cache memory support is provided for a system wherein a system controller 316 has an integrated L2 cache controller, a graphics controller, and a shared memory system. L2 cache support is provided for all system memory, regardless of the controller to which it is connected. Such support requires coordination between the system controller (with its integrated L2 cache controller) and the graphics controller.
In a 486-like or VL-Bus-based personal computer, L2 cache support may be provided using the existing backoff (i.e., BOFF#) CPU bus signal and the two new signals referred to herein as the DRAM# and ERDY# signals. DRAM# is driven by the system controller and ERDY# is driven by the graphics controller.
The system controller 316 monitors memory cycles and notifies the graphics controller when to ignore a particular memory cycle by deasserting the DRAM# on the signal line 322 at a predetermined time in the memory cycle. A system controller instructs the graphics controller to ignore a particular memory cycle when the addressable location is to a location other than the graphics portion of the data storing means (e.g., if the addressable location is to an ISA or PCI bus of the system, or if it's a location within the cache, or in another separate memory and so forth).
The graphics controller 400 also monitors memory cycles and begins a memory cycle when an addressable location is within the range of addressable locations for which the graphics controller is enabled to respond. In operation, the graphics controller tests the DRAM# on the signal line 322 at a predetermined time to determine whether it should respond to a current memory cycle. If the DRAM# signal on the signal line 322 has been deasserted by the system controller (i.e., false) the graphics controller 400 aborts the current memory cycle.
Conversely, if the DRAM# on the signal line 322 has been asserted by the system controller (i.e., tests true), the memory cycle continues and the graphics controller 400 asserts the signal ERDY# on the signal line 324 to indicate to the system controller that the graphics controller is ready to read data. In this sense, the ERDY# signal represents an early ready signal which occurs a fixed number of clock cycles before data which is to be read becomes valid. In this instance, the cache controller 320 integrated within the system controller 316 senses the ERDY# signal on signal line 324 and initiates a writing of data into the cache 326.
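The DRAM#/ERDY# handshake described above can be summarized as a small decision function. This is a behavioral sketch only: signals are modeled as logical asserted/deasserted booleans rather than electrical levels, timing is ignored, and the names `CycleResult` and `graphics_controller_cycle` are hypothetical.

```python
from enum import Enum


class CycleResult(Enum):
    NOT_CLAIMED = "address outside the graphics controller's range"
    ABORTED = "DRAM# deasserted: cycle aborted"
    ERDY_ASSERTED = "DRAM# asserted: cycle continues and ERDY# is driven"


def graphics_controller_cycle(address, claimed_range, dram_asserted):
    """Model one memory cycle as seen by the graphics controller.

    claimed_range is a (low, high) pair with high exclusive.
    dram_asserted is the logical state of DRAM# when sampled at the
    predetermined time in the cycle.
    """
    low, high = claimed_range
    if not (low <= address < high):
        return CycleResult.NOT_CLAIMED
    # The controller has already begun the cycle; it now samples DRAM#.
    if not dram_asserted:
        return CycleResult.ABORTED
    return CycleResult.ERDY_ASSERTED


print(graphics_controller_cycle(0x1000, (0x0000, 0x200000), dram_asserted=True))
```

Starting the cycle before DRAM# is sampled is what lets the shared memory access overlap with the cache lookup; the abort path discards that speculative work when the system controller signals that the cycle belongs elsewhere.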
The graphics controller can also be programmed to drive ERDY# at the end of a memory read cycle to signal to the system controller if a parity error occurred during the read.
Write-backs, for read-miss-dirty cycles and the like, are also supported using the BOFF# CPU bus signal. When write-back is required in response to a read request, the system controller asserts BOFF# (backoff), causing the CPU to abort the read cycle. Meanwhile, the graphics controller will have already started a memory read if the read address was within its address space.
The graphics controller also monitors BOFF# and, when it is asserted, is alerted that the read has been aborted. If the write-back is to memory outside the graphics controller's address space, the graphics controller may allow the read to continue, assuming that by the time the read has completed, the write-back may also be done, reducing latency time. The write-back may also be to memory in the graphics controller's address space. In this case, the system controller keeps BOFF# asserted and "masters" the write-back on the CPU bus by driving the bus just as the CPU would do if it were initiating the write. After the write-back has been completed, BOFF# is deasserted, and the CPU restarts the read operation.
This approach can be extended to provide L2 cache support for memory or other devices connected to the CPU bus. ERDY# may be driven by multiple sources in an "open-drain" configuration. Multiple DRAM# lines can be used or encoded together to signal to multiple devices.
In accordance with exemplary embodiments, the graphics controller 400 can include means for reallocating addressable locations of the data storing means 304 as display memory which is accessible by the graphics controller 400, or as system memory which is independently accessible by the system controller 316. Further, the exemplary graphics controller 400 can include means for dynamically controlling access of the system controller and the display controlling means to the display memory portion and the system memory portion, respectively. The reallocating means and access controlling means are generally represented as block 500, included within the graphics controller 400.
The Figure 3 computer system can provide significant advantages. For example, the Figure 3 system represents a scalable architecture which can be configured for various price/performance alternatives. The Figure 3 system represents a relatively low-cost system which includes a single bank of shared memory (represented by the data storing means 304) which can be concurrently used, and dynamically reconfigured for both graphics and system functions. Unlike previous shared memory systems, the allocation of memory bandwidth between display access and system access is not fixed; rather, memory bandwidth is dynamically allocated on demand between display access and system access.
Exemplary embodiments of the present invention, such as that illustrated in Figure 3, can achieve enhanced performance by adding a second bank of memory represented by the expansion memory means 328. In accordance with the exemplary embodiment wherein expansion memory is used, B bytes of memory in the shared memory can be allocated to system use, with an address space from address locations zero through address (B-1). The expansion memory can be considered to contain E bytes of expansion system memory (e.g., RAM). In an exemplary embodiment, the E bytes can be addressed beginning with starting address B and ending with address (E + B - 1).
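The address layout just described (shared memory at 0 through B-1, expansion memory at B through E+B-1) amounts to a simple bank decode. The sketch below assumes that layout only; the function name `locate` and the example sizes are hypothetical.

```python
def locate(address: int, shared_bytes: int, expansion_bytes: int):
    """Return (bank, offset) for a system address.

    Shared memory occupies 0 .. B-1 and expansion memory occupies
    B .. E+B-1, where B = shared_bytes and E = expansion_bytes.
    """
    if address < 0 or address >= shared_bytes + expansion_bytes:
        raise ValueError("address outside installed memory")
    if address < shared_bytes:
        return ("shared", address)
    return ("expansion", address - shared_bytes)


MIB = 1024 * 1024
B, E = 4 * MIB, 8 * MIB  # example sizes, chosen only for illustration

print(locate(B - 1, B, E))      # last byte of the shared bank
print(locate(B, B, E))          # first byte of the expansion bank
print(locate(E + B - 1, B, E))  # last byte of the expansion bank
```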
In such an alternate embodiment, the data storing means 304 can continue to be shared between the graphics controller and the system controller.
However, in accordance with alternate embodiments, a relatively high level of performance can be achieved by dedicating all of the data storing means 304 to graphics, reserving only the relatively fast portion of the data storing means or the expansion memory means for system use.
By adding expansion memory via an independent, separately controlled memory bus, system performance can be further enhanced, while using the same cache controller integrated in the system controller. With the addition of a simple memory interface block, concurrent accesses can occur to both the data storing means 304 and the expansion memory means 328. In this case, performance can be further improved. For example, the possibility of parallel main memory accesses to two possible memory paths can result in increased performance by effectively overlapping accesses.
Thus, exemplary embodiments of the present invention provide significant advantages. By providing a single sharable block of memory that is independently accessible as graphics memory or as main store memory, improved performance at relatively low cost can be realized. By rendering allocation of the shared memory programmably selectable, any need to have maximum memory size for each of the independent graphics and main memory functions can be eliminated. Further, memory bandwidth can be dynamically allocated on demand rather than via fixed time slices, further improving performance.
Referring to Figure 4, the graphics controller 400 interfaces to the CPU bus 314 via the reallocating means represented as bus interface 500. The graphics controller interfaces to the data storing means 304 via the access controlling means, represented as a memory interface 408.
Commands and data from the Figure 3 CPU 302 are distributed to various logic blocks of the graphics controller 400 on two main buses represented by a display access bus 405 and a system access bus 407, indicated by thick, heavy lines in Figure 4. The system access bus 407 is connected to the memory interface 408.
The display access bus 405 is connected to various graphics controller logic blocks which are responsive to commands or programming instructions from the CPU. These logic blocks include a CRT controller (CRTC) 404, a sequencer (SEQ) 410, a RAMDAC interface 412, a clock synthesizer interface 418, an attribute controller (ATT) 422, a hardware cursor (HWC) 428, a graphics accelerator (Accel) 414 and pixel logic 416. In other implementations, other logic blocks may be included or some of the foregoing logic blocks may not be included.
The CRTC 404 provides vertical and horizontal sync signals to a raster-scan CRT display. The sequencer 410 provides basic timing control for the CRTC 404 and the attribute controller 422. The RAMDAC interface 412 provides for programming of a RAMDAC (i.e., external or integrated) such as the RAMDAC of Figure 1. The RAMDAC is a combination random access memory and digital-to-analog converter that functions as a color palette which drives the CRT. The RAMDAC 120 in Figure 1 can be a look-up table used to convert the data associated with a pixel in the display memory into a color (e.g., RGB analog output).
The attribute controller 422 provides processing for alphanumeric and graphics modes. The hardware cursor 428 provides for display of any of a number of user-definable cursors. The accelerator 414 and pixel logic 416 assist the host CPU in graphics-related operations. The pixel logic 416 of Figure 4 may also function as a pixel cache.
The clock synthesizer interface 418 provides for programming of a programmable clock synthesizer (i.e., external or internal). Operation of the clock synthesizer interface, along with the other various graphics logic blocks in Figure 3, is well-known to one of ordinary skill in the art.
The memory interface 408, which functions as the access controlling means, arbitrates memory access between a number of different entities: the system access bus 407, the pixel logic 416, the display refresh logic 426, and the hardware cursor 428. Priority between these entities can vary according to system activity and the degree to which various buffers are full or empty. The priority scheme takes into account whether a particular access relates to a "mission-critical" function, so as to prevent such functions from being disrupted. For example, display refresh can be classified as a mission-critical function.
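One way to picture a priority scheme that varies with buffer state and protects mission-critical functions is sketched below. The policy shown (a mission-critical requester wins once its urgency passes a threshold, otherwise the most urgent requester wins) is an assumption made for illustration, as are the `Request` fields, the `urgency` scale, and the threshold value; the text does not specify the actual scheme.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    pending: bool
    mission_critical: bool
    urgency: float  # 0.0 = buffer comfortable, 1.0 = about to underrun or overflow


CRITICAL_THRESHOLD = 0.5  # hypothetical tuning value


def pick_winner(requests):
    """Return the name of the requester granted the next memory access, or None."""
    candidates = [r for r in requests if r.pending]
    if not candidates:
        return None

    def rank(r):
        # Urgent mission-critical work outranks everything else;
        # remaining ties are broken by urgency alone.
        boosted = r.mission_critical and r.urgency > CRITICAL_THRESHOLD
        return (boosted, r.urgency)

    return max(candidates, key=rank).name


requests = [
    Request("system access bus", True, False, 0.9),
    Request("display refresh", True, True, 0.6),
    Request("hardware cursor", False, False, 0.0),
]
print(pick_winner(requests))
```

With the refresh buffer comfortably full (urgency below the threshold), the same function would instead grant the system access bus, which is the sense in which priority varies with system activity.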
The exemplary Figure 3 system allocates a portion of the graphics controller's memory to the CPU for system use such that a single shared memory can be used to concurrently implement display functions and system memory functions. In accordance with alternate embodiments of the present invention, latency times for both graphics and system cycles can be further improved by providing separate queues for graphics and system accesses, with the separate queues being serviced in parallel independently of each other.
More particularly, Figure 5 shows the reallocating means represented by the bus interface 402 of Figure 4 in greater detail. As illustrated in Figure 5, a bus state machine 502 connects to the CPU bus and executes bus cycles involving the graphics controller. Commands or data from the CPU are latched in a command latch 504. The command latch is connected to both a graphics queue 506 and a system queue 508. The graphics queue 506 establishes bi-directional operation using two separate, uni-directional queues: one queue that stores commands from the CPU and outputs them from the bus interface for use by the graphics controller, and one queue that stores data from the graphics controller and outputs it to the CPU. Likewise, the system queue 508 is a bi-directional queue composed of two unidirectional queues. The output buses of the graphics queue and the system queue are therefore bi-directional and are connected to an output latch 510 in order to drive data from the graphics controller to the CPU.
Separate memory and input/output (I/O) address ranges are defined for each queue such that the graphics and system queues are independently accessible. The graphics queue 506 and the system queue 508 are controlled by a graphics queue state machine 512 and a system queue state machine 514, respectively. These state machines are in turn controlled by the bus state machine 502. A bus status/configuration registers/address decode block 520 is connected to the bus state machine 502. Further, block 520 is connected with an output multiplexer 516 of the output latch, and an output multiplexer ("mux") 518 of the command latch.
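The routing of latched commands into two independently accessible queues can be modeled roughly as follows. This sketch covers only the CPU-to-controller direction and omits the state machines; the class name, method names, and example address ranges are hypothetical.

```python
from collections import deque


class BusInterfaceModel:
    """Toy model: route each CPU command to the graphics or system queue by address."""

    def __init__(self, graphics_range: range, system_range: range):
        self.graphics_range = graphics_range
        self.system_range = system_range
        self.graphics_queue = deque()
        self.system_queue = deque()

    def latch_command(self, address: int, payload):
        """Place the command in the matching queue and report which one took it."""
        if address in self.graphics_range:
            self.graphics_queue.append((address, payload))
            return "graphics"
        if address in self.system_range:
            self.system_queue.append((address, payload))
            return "system"
        return None  # address not claimed by this interface


bus = BusInterfaceModel(
    graphics_range=range(0xA0000, 0xC0000),    # example window only
    system_range=range(0x100000, 0x200000),    # example window only
)
print(bus.latch_command(0xA0010, "write pixel"))
print(bus.latch_command(0x100400, "read word"))
```

Because each queue has its own address range and its own storage, a backlog of graphics commands does not delay a system access, which is the latency benefit the separate paths are meant to provide.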
Bus status registers of block 520 contain information regarding the state of the graphics controller and the amount of available space in the graphics and system queues. The bus status registers may be read directly through the output mux 516 without putting a read command into either queue. Configuration registers of block 520 are written to from the bus state machine 502 and are used to select modes of operation in addition to those provided in a typical video graphics array (VGA) implementation.
In accordance with exemplary embodiments, programming flexibility can be improved by providing remapping registers which allow the CPU to reallocate the addresses to which the graphics controller responds. Address decoding is programmable, such that the graphics controller responds to a CPU command if the command is to an address within the graphics controller's designated address space.
Outside the bus interface 402 of Figure 4, the graphics controller assumes that registers and memory are always at fixed addresses. Within the bus interface, address decode logic included in block 520 allows a register/memory location to be reallocated (i.e., remapped) from an original address to a new address more suitable to the CPU. This address decode logic therefore maps the new CPU address back to its original address.
An exemplary sequence would be as follows. The CPU issues a read command of a particular address. The graphics controller's address decode logic included in block 520 determines that the address is within the graphics controller's range, but that the desired register/memory location has been remapped from its original address to a new address more suitable to the CPU. In this case, the address decode logic in block 520 maps the CPU address back to the original address and latches that address into the appropriate queue via the mux 518. Below the queues 506 and 508, registers and memory are always at fixed addresses, simplifying decoding of the graphics and system queue buses. In addition to the graphics queue 506 and the system queue 508, a separate latch (one-stage queue) 522 can be provided for the hardware cursor.
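The translation step in this sequence, mapping a remapped CPU address back to the fixed original address, can be sketched as a window lookup. The function name and the example base addresses are hypothetical and serve only to show the arithmetic.

```python
def to_original_address(cpu_address, original_base, remapped_base, size):
    """Translate a CPU address in the remapped window back to its original address.

    Returns None when the CPU address falls outside the remapped window.
    """
    offset = cpu_address - remapped_base
    if 0 <= offset < size:
        return original_base + offset
    return None


# A 64 KiB block originally at 0xA0000 has been remapped to 0xE0000000.
original = to_original_address(0xE0000010, 0xA0000, 0xE0000000, 0x10000)
print(hex(original))
```

Doing this translation once, inside the bus interface, is what allows everything below the queues to keep decoding fixed addresses.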
Referring to Figure 6, the bus status/configuration registers/address decode block 520 of Figure 5 is illustrated in greater detail. As shown in Figure 6, the block 520 includes address decode logic 602, configuration registers 604 and status registers 606. The address decode logic 602 examines the CPU control lines that define whether the command is to memory or I/O and is a read or a write operation. The address decode logic 602 further compares the CPU address on the address bus to addresses programmed for various logic groups. If a match is found, the appropriate select line is asserted. Separate lines out of the address decode logic signal if the CPU address is within the address space of one of the following exemplary groups: VGA mode I/O, VGA mode frame buffer, Windows mode registers, Windows mode frame buffer, system memory, configuration registers, or the status registers address space (which is within the configuration registers address space).
The configuration registers 604 are initialized to some pre-determined value at power-on reset. The configuration registers remap some of the address spaces within the graphics controller. This remapping allows software to access a particular register or logic block at an address different from the one to which it was initialized. Additional capability can be added to inhibit the graphics controller from responding to accesses of particular logic or memory. This may be done in various ways, for example explicitly via enable/disable bits in a register and implicitly by programming the low and high address boundaries for a group to be the same. The configuration registers can be read by the CPU via a port 608.
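Both ways of inhibiting a response, the explicit enable bit and the implicit empty range, fall out of a single range check. The sketch assumes that the high boundary is exclusive, which is a modeling choice made here; the function and parameter names are hypothetical.

```python
def group_responds(address: int, low: int, high: int, enabled: bool = True) -> bool:
    """Decide whether an address group responds to a CPU access.

    The group is inhibited explicitly when enabled is False, and
    implicitly when low == high, because the range is then empty.
    """
    if not enabled:
        return False
    return low <= address < high


print(group_responds(0xA0000, 0xA0000, 0xC0000))                  # normal decode
print(group_responds(0xA0000, 0xA0000, 0xC0000, enabled=False))   # explicit disable
print(group_responds(0xA0000, 0xA0000, 0xA0000))                  # low == high
```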
The status registers 606 are read only. They contain information such as queue status (how full the queues are), what the accelerator is doing, what errors have occurred, and so forth. Certain bits of the status registers may be cleared by being read. The CPU reads the status registers directly without having to go through the graphics or system queues.
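A clear-on-read status bit can be modeled with the minimal sketch below. The `StatusRegister` class, the bit assignments, and the choice of which bits clear on read are assumptions made for illustration; the text does not specify them.

```python
class StatusRegister:
    """Hypothetical model of a read-only register with clear-on-read bits."""

    def __init__(self, clear_on_read_mask):
        self._value = 0
        self._clear_on_read_mask = clear_on_read_mask

    def hw_set(self, bits):
        """Called by the modeled hardware, never by the CPU."""
        self._value |= bits

    def read(self):
        """CPU read: return the current value, then clear the sticky bits."""
        value = self._value
        self._value &= ~self._clear_on_read_mask
        return value


# Illustrative bit assignments only.
ERROR = 0x1        # cleared by being read
ACCEL_BUSY = 0x2   # reflects live state, not cleared by a read

status = StatusRegister(clear_on_read_mask=ERROR)
status.hw_set(ERROR | ACCEL_BUSY)
first = status.read()
second = status.read()
```

Here the first read reports both bits and the second reports only the busy bit, because the error bit was cleared by the first read.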
Figure 7 illustrates a reallocation of addressable locations in memory when the expansion memory means 328 of Figure 3 is used. The reallocation of Figure 7 ensures that addressable locations of any expansion memory are added to the bottom of available system memory. This ensures that expansion memory will always be accessed first by the CPU to accommodate system upgrades to high-speed memory. The remapping operation depicted generally in Figure 7 is described in greater detail in U.S. Patent Application Serial Number 08/159,224 entitled Method and Apparatus for Providing and Maximizing Concurrent Operations in a Shared Memory System, by W. Desi Rhoden and Rajeev Jayavant, commonly assigned and filed on even date herewith and incorporated herein by reference.
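One possible reading of this reallocation is sketched below. It assumes that "bottom" means the lowest system addresses, so that installed expansion memory occupies addresses starting at zero and the system portion of the integrated memory is shifted upward. The function names and sizes are hypothetical, and the actual remapping is the one described in the referenced application.

```python
def build_system_map(expansion_size, integrated_system_size):
    """Return a list of (name, start, end) regions with half-open ranges.

    Assumption of this sketch: expansion memory, when present, is placed
    at the lowest addresses and the integrated system memory follows it.
    """
    regions = []
    base = 0
    if expansion_size > 0:
        regions.append(("expansion", base, base + expansion_size))
        base += expansion_size
    regions.append(("integrated", base, base + integrated_system_size))
    return regions


def locate(addr, regions):
    """Return (region_name, offset_within_region), or None if unmapped."""
    for name, start, end in regions:
        if start <= addr < end:
            return name, addr - start
    return None


MIB = 1 << 20
without_expansion = build_system_map(0, 4 * MIB)
with_expansion = build_system_map(8 * MIB, 4 * MIB)
```

Under this assumption, low addresses resolve to expansion memory whenever it is installed, which matches the statement that expansion memory is accessed first by the CPU.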
In summary, by integrating graphics memory and system memory, the present architecture allows system cost to be significantly reduced. Further, by providing a bus interface with separate graphics and system paths, the cost savings described can be achieved with a minimal performance penalty. In a system complete with separate expansion memory, performance at least as good as in conventional memory systems is obtained. In some cases, the possibility of parallel main memory access to two or more possible memory paths results in increased performance by effectively overlapping accesses. Although the invention has been described in terms of a two-bank system having graphics and main store system memory, the invention can be extended to any arbitrary number of concurrently operating memory banks.
It will be appreciated by those skilled in the art that the present invention can be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The presently disclosed embodiments are therefore considered in all respects to be illustrative and not restrictive. The scope of the invention is indicated by the appended claims rather than the foregoing description, and all changes that come within the meaning and range of equivalents thereof are intended to be embraced therein.

Claims

WHAT IS CLAIMED IS:
1. For use in a computing apparatus including a CPU and memory means, apparatus comprising:
memory controller means for controlling access to said memory means, including arbitration means for arbitrating among a plurality of requests for access to said memory means;
first data path means connected to said arbitration means and including first buffer storage means for facilitating exchange of data with said memory means;
second data path means connected to said arbitration means and including second buffer means for facilitating exchange of data with said memory means;
configuration means for programmably configuring said apparatus such that a first portion of said memory means is allocated as display memory and a second portion of said memory means is allocated as main memory; and
control means connected to said configuration means and responsive to one or more signals applied to said apparatus, said signals including address, data and control signals, for causing at least some of said data signals to be applied to only one of said first and second data path means.
2. The apparatus of Claim 1 wherein said control means comprises a bus controller.
3. The apparatus of Claim 2 wherein said first buffer storage means comprises a graphics queue.
4. The apparatus of Claim 3 wherein said first buffer storage means further comprises a graphics queue controller connected to said graphics queue and to said bus controller.
5. The apparatus of Claim 4 wherein said graphics queue comprises first data staging means that inputs data from the CPU and outputs data directed to said memory means and second data staging means that inputs data from the memory means and outputs data directed to the CPU.
6. The apparatus of Claim 2 wherein said second buffer storage means comprises a system queue.
7. The apparatus of Claim 6 wherein said second buffer storage means further comprises a system queue controller connected to said system queue and to said bus controller.
8. The apparatus of Claim 7 wherein said system queue comprises first data staging means that inputs data from the CPU and outputs data directed to said memory means and second data staging means that inputs data from the memory means and outputs data directed to the CPU.
PCT/US1994/013550 1993-11-30 1994-11-23 A bus interface with graphics and system paths for an integrated memory system WO1995015527A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP7515680A JPH09505424A (en) 1993-11-30 1994-11-23 Bus interface with graphical and system paths for integrated memory system
KR1019960702848A KR960706658A (en) 1993-11-30 1994-11-23 A BUS INTERFACE WITH GRAPHICS AND SYSTEM PATHS FOR AN INTEGRATED MEMORY SYSTEM

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/158,968 1993-11-30
US08/158,968 US5450542A (en) 1993-11-30 1993-11-30 Bus interface with graphics and system paths for an integrated memory system

Publications (1)

Publication Number Publication Date
WO1995015527A1 true WO1995015527A1 (en) 1995-06-08

Family

ID=22570516

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1994/013550 WO1995015527A1 (en) 1993-11-30 1994-11-23 A bus interface with graphics and system paths for an integrated memory system

Country Status (4)

Country Link
US (1) US5450542A (en)
JP (1) JPH09505424A (en)
KR (1) KR960706658A (en)
WO (1) WO1995015527A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010006633A1 (en) * 2008-07-18 2010-01-21 Siemens Aktiengesellschaft Method for operating an automation system, computer program, and computer program product
JP2012515992A (en) * 2009-01-23 2012-07-12 アドバンスト・マイクロ・ディバイシズ・インコーポレイテッド Non-graphics use of graphics memory

Families Citing this family (62)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5404437A (en) * 1992-11-10 1995-04-04 Sigma Designs, Inc. Mixing of computer graphics and animation sequences
KR960706657A (en) 1993-11-30 1996-12-09 토마스 씨. 토코스 METHOD AND APPARATUS FOR PROVIDING AND MAXIMIZING CONCURRENT OPERATIONS IN A SHARED MEMORY SYSTEM
US5515107A (en) * 1994-03-30 1996-05-07 Sigma Designs, Incorporated Method of encoding a stream of motion picture data
US5598576A (en) * 1994-03-30 1997-01-28 Sigma Designs, Incorporated Audio output device having digital signal processor for responding to commands issued by processor by emulating designated functions according to common command interface
US5528309A (en) 1994-06-28 1996-06-18 Sigma Designs, Incorporated Analog video chromakey mixer
US5579473A (en) * 1994-07-18 1996-11-26 Sun Microsystems, Inc. Interface controller for frame buffer random access memory devices
US6330644B1 (en) 1994-10-27 2001-12-11 Canon Kabushiki Kaisha Signal processor with a plurality of kinds of processors and a shared memory accessed through a versatile control means
US5790881A (en) * 1995-02-07 1998-08-04 Sigma Designs, Inc. Computer system including coprocessor devices simulating memory interfaces
US6373493B1 (en) * 1995-05-01 2002-04-16 Apple Computer, Inc. Hardware graphics accelerator having access to multiple types of memory including cached memory
US5767866A (en) * 1995-06-07 1998-06-16 Seiko Epson Corporation Computer system with efficient DRAM access
US6204864B1 (en) 1995-06-07 2001-03-20 Seiko Epson Corporation Apparatus and method having improved memory controller request handler
US5731809A (en) * 1995-07-10 1998-03-24 Silicon Integrated Systems Corp. Adaptive display memory management system
US5682522A (en) * 1995-07-18 1997-10-28 Silicon Integrated Systems Corp. Shared memory architecture of graphics frame buffer and hard disk cache
US5818464A (en) * 1995-08-17 1998-10-06 Intel Corporation Method and apparatus for arbitrating access requests to a shared computer system memory by a graphics controller and memory controller
US5854637A (en) * 1995-08-17 1998-12-29 Intel Corporation Method and apparatus for managing access to a computer system memory shared by a graphics controller and a memory controller
US5537353A (en) * 1995-08-31 1996-07-16 Cirrus Logic, Inc. Low pin count-wide memory devices and systems and methods using the same
US6058459A (en) 1996-08-26 2000-05-02 Stmicroelectronics, Inc. Video/audio decompression/compression device including an arbiter and method for accessing a shared memory
US6108015A (en) * 1995-11-02 2000-08-22 Cirrus Logic, Inc. Circuits, systems and methods for interfacing processing circuitry with a memory
US5872998A (en) * 1995-11-21 1999-02-16 Seiko Epson Corporation System using a primary bridge to recapture shared portion of a peripheral memory of a peripheral device to provide plug and play capability
US5822545A (en) * 1995-12-04 1998-10-13 Cypress Semiconductor Corp. Method and apparatus for eliminating electromagnetic interference and noise caused by all unnecessary switching/toggling of bus signals
US5790138A (en) * 1996-01-16 1998-08-04 Monolithic System Technology, Inc. Method and structure for improving display data bandwidth in a unified memory architecture system
EP0786756B1 (en) * 1996-01-23 2009-03-25 Hewlett-Packard Company, A Delaware Corporation Data transfer arbitration for display controller
US5719511A (en) * 1996-01-31 1998-02-17 Sigma Designs, Inc. Circuit for generating an output signal synchronized to an input signal
US5854638A (en) * 1996-02-02 1998-12-29 Opti Inc. Unified memory architecture with parallel access by host and video controller
US5734875A (en) * 1996-02-05 1998-03-31 Seiko Epson Corporation Hardware that rotates an image for portrait-oriented display
US6226016B1 (en) 1996-02-05 2001-05-01 Seiko Epson Corporation Display apparatus and method capable of rotating an image by 180 degrees
WO1997029432A1 (en) * 1996-02-09 1997-08-14 Intel Corporation Apparatus for programmably defining the access latency
US5748203A (en) * 1996-03-04 1998-05-05 United Microelectronics Corporation Computer system architecture that incorporates display memory into system memory
JPH09245179A (en) * 1996-03-08 1997-09-19 Mitsubishi Electric Corp Computer graphic device
US6128726A (en) * 1996-06-04 2000-10-03 Sigma Designs, Inc. Accurate high speed digital signal processor
US5818468A (en) * 1996-06-04 1998-10-06 Sigma Designs, Inc. Decoding video signals at high speed using a memory buffer
US5793385A (en) * 1996-06-12 1998-08-11 Chips And Technologies, Inc. Address translator for a shared memory computing system
US5815167A (en) * 1996-06-27 1998-09-29 Intel Corporation Method and apparatus for providing concurrent access by a plurality of agents to a shared memory
US6104417A (en) * 1996-09-13 2000-08-15 Silicon Graphics, Inc. Unified memory computer architecture with dynamic graphics memory allocation
US6437829B1 (en) 1997-01-16 2002-08-20 Display Laboratories, Inc. Alignment of cathode ray tube displays using a video graphics controller
JP3739888B2 (en) 1997-03-27 2006-01-25 株式会社ソニー・コンピュータエンタテインメント Information processing apparatus and method
US6052133A (en) * 1997-06-27 2000-04-18 S3 Incorporated Multi-function controller and method for a computer graphics display system
US6057862A (en) * 1997-07-01 2000-05-02 Memtrax Llc Computer system having a common display memory and main memory
US6118462A (en) * 1997-07-01 2000-09-12 Memtrax Llc Computer system controller having internal memory and external memory control
US6075546A (en) 1997-11-10 2000-06-13 Silicon Grahphics, Inc. Packetized command interface to graphics processor
TW466405B (en) * 1998-03-17 2001-12-01 Via Tech Inc Device and method of cache in computer system
US6128026A (en) * 1998-05-04 2000-10-03 S3 Incorporated Double buffered graphics and video accelerator having a write blocking memory interface and method of doing the same
US6145033A (en) * 1998-07-17 2000-11-07 Seiko Epson Corporation Management of display FIFO requests for DRAM access wherein low priority requests are initiated when FIFO level is below/equal to high threshold value
US6119207A (en) * 1998-08-20 2000-09-12 Seiko Epson Corporation Low priority FIFO request assignment for DRAM access
US6262751B1 (en) 1998-10-26 2001-07-17 Seiko Epson Corporation Hardware rotation of an image on a computer display
US6362826B1 (en) * 1999-01-15 2002-03-26 Intel Corporation Method and apparatus for implementing dynamic display memory
US6377268B1 (en) 1999-01-29 2002-04-23 Micron Technology, Inc. Programmable graphics memory apparatus
US6414688B1 (en) 1999-01-29 2002-07-02 Micron Technology, Inc. Programmable graphics memory method
US6341338B1 (en) * 1999-02-04 2002-01-22 Sun Microsystems, Inc. Protocol for coordinating the distribution of shared memory
US6574691B1 (en) 1999-07-28 2003-06-03 Koninklijke Philips Electronics N.V. Apparatus and method for interfacing a non-sequential 486 interface burst interface to a sequential ASB interface
JP4236359B2 (en) * 1999-12-13 2009-03-11 株式会社ルネサステクノロジ Microcomputer with screen display device
US7035989B1 (en) 2000-02-16 2006-04-25 Sun Microsystems, Inc. Adaptive memory allocation
US7406681B1 (en) 2000-10-12 2008-07-29 Sun Microsystems, Inc. Automatic conversion of source code from 32-bit to 64-bit
US6957208B1 (en) 2000-10-31 2005-10-18 Sun Microsystems, Inc. Method, apparatus, and article of manufacture for performance analysis using semantic knowledge
US7106339B1 (en) * 2003-04-09 2006-09-12 Intel Corporation System with local unified memory architecture and method
FI115006B (en) * 2003-06-13 2005-02-15 Nokia Corp Method and device for connection improved interface layer in between of computer screen and processor
US6977656B1 (en) * 2003-07-28 2005-12-20 Neomagic Corp. Two-layer display-refresh and video-overlay arbitration of both DRAM and SRAM memories
US20050270297A1 (en) * 2004-06-08 2005-12-08 Sony Corporation And Sony Electronics Inc. Time sliced architecture for graphics display system
US7417637B1 (en) 2004-09-01 2008-08-26 Nvidia Corporation Fairly arbitrating between clients
US8719547B2 (en) * 2009-09-18 2014-05-06 Intel Corporation Providing hardware support for shared virtual memory between local and remote physical memory
US8943108B2 (en) * 2009-12-23 2015-01-27 International Business Machines Corporation Hardware off-load memory garbage collection acceleration
TW201349167A (en) * 2012-05-22 2013-12-01 Hon Hai Prec Ind Co Ltd Display card device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3530237A1 (en) * 1984-08-24 1986-03-13 Mitsubishi Denki K.K., Tokio/Tokyo NUMERIC CONTROL DEVICE
EP0379436A1 (en) * 1989-01-20 1990-07-25 CNRS, Centre National de la Recherche Scientifique Apparatus and process for request arbitration and resolution of conflicts linked with the access to memories with independent banks for computing machines
EP0571099A1 (en) * 1992-05-19 1993-11-24 Sun Microsystems, Inc. Bus architecture for integrated data and video memory

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5335322A (en) * 1992-03-31 1994-08-02 Vlsi Technology, Inc. Computer display system using system memory in place or dedicated display memory and method therefor

Also Published As

Publication number Publication date
JPH09505424A (en) 1997-05-27
US5450542A (en) 1995-09-12
KR960706658A (en) 1996-12-09

Similar Documents

Publication Publication Date Title
US5450542A (en) Bus interface with graphics and system paths for an integrated memory system
US5659715A (en) Method and apparatus for allocating display memory and main memory employing access request arbitration and buffer control
US5454107A (en) Cache memory support in an integrated memory system
US5696924A (en) Memory access circuit with address translation performing auto increment of translated address on writes and return to translated address on reads
US5655151A (en) DMA controller having a plurality of DMA channels each having multiple register sets storing different information controlling respective data transfer
US5805927A (en) Direct memory access channel architecture and method for reception of network information
US20050160218A1 (en) Highly integrated mass storage device with an intelligent flash controller
US7127573B1 (en) Memory controller providing multiple power modes for accessing memory devices by reordering memory transactions
JPH09219094A (en) Memory, memory sub-system, memory device and memory system addressing method
KR100847968B1 (en) Dual-port semiconductor memories
US5682522A (en) Shared memory architecture of graphics frame buffer and hard disk cache
US6734862B1 (en) Memory controller hub
US5696923A (en) Graphics processor writing to shadow register at predetermined address simultaneously with writing to control register
US6046952A (en) Method and apparatus for optimizing memory performance with opportunistic refreshing
US6434688B1 (en) Method and apparatus for providing and maximizing concurrent operations in a shared memory system which includes display memory
US5859649A (en) Data processing system having display controller with bursting direct memory access
US8629880B2 (en) Two dimensional memory access controller
US5546344A (en) Extended data output DRAM interface
US6735643B2 (en) Electronic card with dynamic memory allocation management
JPH08255107A (en) Display controller
US5748203A (en) Computer system architecture that incorporates display memory into system memory
US7310717B2 (en) Data transfer control unit with selectable transfer unit size
US8212829B2 (en) Computer using flash memory of hard disk drive as main and video memory
US5666556A (en) Method and apparatus for redirecting register access requests wherein the register set is separate from a central processing unit
US6938118B1 (en) Controlling access to a primary memory

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP KR

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase