WO2000052576A1 - Methods and systems for implementing shared disk array management functions - Google Patents
- Publication number
- WO2000052576A1 (PCT/US2000/003275)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- amf
- amfs
- resource
- lock
- redundancy group
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/065—Replication mechanisms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1076—Parity data used in redundant arrays of independent storages, e.g. in RAID systems
- G06F11/1088—Reconstruction on already foreseen single or plurality of spare disks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1076—Parity data used in redundant arrays of independent storages, e.g. in RAID systems
- G06F11/1096—Parity calculation or recalculation after configuration or reconfiguration of the system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2089—Redundant storage control functionality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2097—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements maintaining the standby controller/processing unit updated
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0604—Improving or facilitating administration, e.g. storage management
- G06F3/0607—Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
- G06F3/0613—Improving I/O performance in relation to throughput
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0629—Configuration or reconfiguration of storage systems
- G06F3/0632—Configuration or reconfiguration of storage systems by initialisation or re-initialisation of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0683—Plurality of storage devices
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
- G06F9/526—Mutual exclusion algorithms
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2094—Redundant storage or storage space
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/85—Active fault masking without idle spares
Definitions
- the present invention relates in general to systems and methods for eliminating bottlenecks in data storage networks, and in direct server attached storage, and more specifically to systems and methods for implementing dynamically shared redundancy group management between multiple disk array management functions.
- N servers are clustered together for a proportional performance gain, and a SAN (e.g., a Fiber Channel based SAN) is added between the servers and various RAID ("Redundant Array of Inexpensive Disks") storage systems/arrays.
- the SAN allows any server to access any storage element.
- each RAID system has an associated RAID controller that must be accessed in order to access data stored on that particular RAID system.
- One solution for providing fault tolerance is to include a redundant controller in a master/slave arrangement.
- the master controller has primary control, and only when the master fails does the slave controller take over. This solution is very inefficient, however, as the slave controller is not used until a failure in the master has occurred.
- Another solution is to use the master/slave controller architecture, but to split the storage array into two redundancy groups, each of which is controlled by one and only one of the two controllers (each controller is a "master" vis-a-vis the redundancy group it controls). In this manner, both controllers are operational at the same time, thereby improving the efficiency of the system. In the event one controller fails, the other controller assumes control of the failed controller's redundancy group. This solution also prevents "collisions", which occur, for example, when more than one controller tries to write data to a redundancy group. However, this solution also has some performance drawbacks. For example, the performance in such a master/slave architecture is bound by the speed of the master controller such that performance is not scalable.
- the present invention provides such a peer-to-peer controller architecture solution for data storage management.
- the systems and methods of the present invention implement a novel type of RAID Array Management Function that is useful for building highly scalable disk arrays.
- the systems and methods of the present invention provide for sharing redundancy group management between multiple (two or more) Array Management Functions.
- multiple Array Management Functions are connected to multiple redundancy groups over an interconnect medium.
- the Array Management Functions are connected to the redundancy groups over any storage area network (SAN), such as a fiber-channel based SAN.
- the multiple AMFs share management responsibility of the redundancy groups, each of which typically includes multiple resources spread over multiple disks.
- the AMFs provide concurrent access to the redundancy groups for associated host systems.
- the AMF synchronizes with the other AMFs sharing control of the redundancy group that includes the resource to be operated on, so as to obtain a lock on the resource.
- the AMF sends replication data and state information associated with the resource such that if the AMF fails, any of the other AMFs is able to complete the operation and maintain data reliability and coherency.
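The synchronize-then-replicate sequence described above can be sketched as follows. This is a minimal Python illustration, not the patent's implementation; the class names (`AMF`, `Resource`) and the in-memory lock are hypothetical stand-ins for the distributed mechanism.

```python
class Resource:
    """A lockable unit within a redundancy group, e.g. a stripe."""
    def __init__(self, rid):
        self.rid = rid
        self.owner = None   # name of the AMF currently holding the lock

class AMF:
    """Minimal stand-in for an Array Management Function peer."""
    def __init__(self, name):
        self.name = name
        self.peers = []        # other AMFs sharing the redundancy group
        self.replica_log = {}  # resource id -> (data, state) received from peers

    def acquire_lock(self, res):
        # Synchronize with peers: the lock succeeds only if no AMF holds it.
        if res.owner is None:
            res.owner = self.name
            return True
        return False

    def perform(self, res, data):
        if not self.acquire_lock(res):
            raise RuntimeError("resource locked by %s" % res.owner)
        # Replicate data and state to the peers before completing, so that
        # any peer can finish the operation if this AMF fails mid-way.
        for p in self.peers:
            p.replica_log[res.rid] = (data, "in-progress")
        # ... the actual update of the backing disks would happen here ...
        for p in self.peers:
            p.replica_log[res.rid] = (data, "done")
        res.owner = None       # release the lock
```

If the performing AMF dies between the two replication rounds, its peers hold the data tagged "in-progress" and can replay the update, which is the failover property the text describes.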
- As used herein, the terms "Array Management Function," "Redundancy Group," and "Redundancy Group Management" are defined as set forth in The RAID Advisory Board's (RAB) Handbook on System Storage Technology, 6th edition, the contents of which are herein incorporated by reference for all purposes.
- An AMF generally refers to the body that provides common control and management for one or more disk or tape arrays.
- An AMF presents the arrays of tapes or disks it controls to the operating environment as one or more virtual disks or tapes.
- An AMF typically executes in a disk controller, an intelligent host bus adapter or in a host computer. When it executes in a disk controller, an AMF is often referred to as firmware.
- One or more AMFs can execute in each controller, adapter or host as desired for the particular application.
- "Redundancy Group" generally refers to a collection of p_extents organized by an AMF for the purpose of providing data protection. Within one redundancy group, a single type of data protection is used.
- Redundancy groups typically include logical entities composed of many resources such as stripes, data blocks, cached data, map tables, configuration tables, state tables, etc.
- Redundancy Group Management generally refers to the responsibilities, processes and actions of an AMF associated with a given redundancy group.
- updates of the check data within a redundancy group are dynamically coordinated and synchronized between the various AMFs sharing the redundancy group.
- Such updating is facilitated using coherency and locking/unlocking techniques.
- Coherency and locking are typically performed as a function of a block, a group of blocks, a stripe or a group of stripes. Locking is performed dynamically using any of a variety of well known or proprietary coherency protocols such as MESI.
- the coherency between the caches associated with a redundancy group and the data contained within the redundancy group is synchronized and maintained.
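Check-data updates of the kind described above can be illustrated with RAID-5 style XOR parity. The Python sketch below (hypothetical helper names, not the patent's code) shows why an AMF must hold a lock on a stripe for the whole update: the new parity depends on both the old data block and the old parity block, so an unsynchronized concurrent update would corrupt the check data.

```python
def xor_blocks(*blocks):
    # Bytewise XOR across equal-length blocks: RAID-5 style check data.
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)

def small_write_parity(old_data, new_data, old_parity):
    # Small-write update: new parity = old parity XOR old data XOR new data.
    # Only the changed block and the parity block need to be read, but the
    # reads and the write-back must all happen under one lock on the stripe.
    return xor_blocks(old_parity, old_data, new_data)
```

The small-write identity holds because XOR-ing the old data out of the parity and the new data in is equivalent to recomputing parity across all blocks.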
- a data storage network typically comprises a redundancy group including a plurality of resources, and two or more array management functions (AMFs) sharing access to the redundancy group.
- the AMFs provide concurrent access to the redundancy group for associated host systems.
- the network also typically includes a storage area network for connecting the AMFs with the redundancy group.
- the first AMF arbitrates with the other AMFs sharing access to the redundancy group for a lock on the first resource.
- the first AMF performs the operation on the first resource and concurrently sends replication data and state information associated with the first resource to the other AMFs such that if the first AMF fails while performing the operation, one of the other AMFs is able to complete the operation.
- a method is provided of dynamically sharing management of a redundancy group between two or more array management functions (AMFs), where the AMFs are able to concurrently access the redundancy group, which includes a plurality of resources.
- the method typically comprises the steps of receiving a request from a host by a first one of the AMFs to perform a first operation on a first one of the resources, synchronizing with the other AMFs so as to acquire access to the first resource, and performing the first operation on the first resource.
- a data storage network system typically comprises one or more redundancy groups, each redundancy group including multiple resources spread over multiple disks, and two or more array management functions (AMFs) sharing redundancy group management of the one or more redundancy groups, wherein the AMFs are able to concurrently access the one or more redundancy groups.
- the system also typically comprises a storage area network for interconnecting the AMFs with the redundancy groups.
- a method is provided of reconstructing a redundancy group when one of its disks fails in a data storage network system comprising two or more array management functions (AMFs) interconnected with the redundancy group over a storage area network, wherein the AMFs all share management of the redundancy group, and wherein the AMFs are able to concurrently access the redundancy group.
- the redundancy group includes multiple resources spread over multiple disks and a replacement disk.
- the method typically comprises the steps of arbitrating for control of a first resource of the redundancy group by a first one of the AMFs, arbitrating for control of a second resource of the redundancy group by a second one of the AMFs, and concurrently reconstructing the first and second resources using the replacement disk.
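The reconstruction scheme above, in which each AMF arbitrates for a different resource and the rebuilds proceed in parallel, can be sketched as follows. This is a hypothetical Python illustration: threads stand in for independent AMFs, and a mutex-protected work list stands in for the distributed arbitration.

```python
import threading

def reconstruct_concurrently(resources, amf_names, rebuild_one):
    """Each AMF repeatedly claims the next unreconstructed resource and
    rebuilds it onto the replacement disk; claims are serialized, but the
    rebuilds themselves run in parallel."""
    lock = threading.Lock()
    pending = list(resources)
    done = []

    def worker(amf):
        while True:
            with lock:              # arbitration for the next resource
                if not pending:
                    return
                res = pending.pop(0)
            rebuild_one(amf, res)   # regenerate data from the surviving disks
            with lock:
                done.append((amf, res))

    threads = [threading.Thread(target=worker, args=(name,))
               for name in amf_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done
```

Because each resource is claimed by exactly one AMF, no stripe is rebuilt twice, and adding AMFs shortens the overall reconstruction window rather than lengthening it.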
- a method is provided of expanding a redundancy group when an extra disk is added to it, in a data storage network system that typically comprises two or more array management functions (AMFs) interconnected with the redundancy group over a storage area network.
- the redundancy group includes multiple resources spread over multiple disks.
- the method typically comprises the steps of arbitrating for control of a first resource by a first one of the AMFs, arbitrating for control of a second resource by a second one of the AMFs, and concurrently expanding the first and second resources using the extra disk.
- a method is provided of pipelining replication of incoming host data, in a data storage network system that typically comprises a redundancy group interconnected with two or more array management functions (AMFs) over a storage area network.
- the redundancy group includes multiple resources spread over multiple disks.
- the AMFs all share management of the redundancy group, and are able to concurrently access the redundancy group.
- the method typically comprises the steps of receiving a write command by a first AMF from a host to write at least two data sets to two or more of the resources, and acquiring a lock by the first AMF on the first resource to which the first data set is to be written.
- the method also typically includes the steps of writing the first data set to the first resource, and concurrently performing a first replication operation wherein replication data and state information associated with the first resource is sent to the other AMFs, such that if the first AMF fails while performing the write operation, one of the other AMFs is able to complete the write operation.
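A hedged sketch of the pipelining just described: the first AMF starts replication of a data set in the background so the next write does not wait for it, and all replications are confirmed before the host write is acknowledged. The function names and the executor-based overlap are assumptions for illustration, not the patent's mechanism.

```python
from concurrent.futures import ThreadPoolExecutor

def pipelined_writes(datasets, write, replicate):
    """Write each data set under its resource lock while replication of the
    previous data set is still in flight; acknowledge the host only after
    every replication has completed."""
    with ThreadPoolExecutor(max_workers=1) as replicator:
        futures = []
        for data in datasets:
            write(data)                       # write to the locked resource
            futures.append(replicator.submit(replicate, data))
        for f in futures:                     # all replicas durable before
            f.result()                        # the host write is acknowledged
```

The overlap matters because replication crosses the interconnect: without pipelining, each write would stall for a full round trip to the peer AMFs before the next could begin.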
- a method is provided for dynamically sharing management of a redundancy group between two or more array management functions (AMFs) in a data storage system.
- the AMFs are able to concurrently access the redundancy group, which includes a plurality of resources.
- the method typically comprises the step of determining an arbiter AMF for a first one of the resources, wherein the arbiter AMF is one of the two or more AMFs sharing management of the redundancy group.
- the arbiter AMF is able to grant a lock for the first resource.
- the method also typically comprises the steps of communicating a lock request from a first one of the AMFs to the arbiter AMF requesting a lock on the first resource, and performing an operation on the first resource by the first AMF once the lock on the first resource has been granted by the arbiter AMF.
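The arbiter scheme above can be sketched in Python as below. The modulo assignment of arbiters to resources is an assumed policy (the text leaves the determination method open), and the class and method names are hypothetical.

```python
def arbiter_for(resource_id, amfs):
    # Deterministically pick the arbiter AMF for a resource, e.g. by
    # taking the resource id modulo the number of AMFs. Every AMF computes
    # the same answer, so no negotiation is needed to find the arbiter.
    return amfs[resource_id % len(amfs)]

class ArbiterAMF:
    """Sketch of an AMF acting as arbiter: it serializes lock grants
    for the resources it arbitrates."""
    def __init__(self, name):
        self.name = name
        self.granted = {}   # resource id -> name of the AMF holding the lock

    def request_lock(self, resource_id, requester):
        if resource_id not in self.granted:
            self.granted[resource_id] = requester
            return True     # lock granted; requester may operate
        return False        # lock held elsewhere; requester must wait/retry

    def release(self, resource_id, requester):
        if self.granted.get(resource_id) == requester:
            del self.granted[resource_id]
```

Concentrating grants for a given resource in one arbiter keeps each lock decision to a single request/response exchange instead of a broadcast among all AMFs.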
- Figures 1 to 7 show exemplary configurations useful for providing data from one or more redundancy groups to one or more host systems using controllers sharing access to and control of redundancy groups according to the present invention
- Figure 8 shows a multiple controller configuration and the internal configuration of the controllers according to the present invention
- Figure 9 shows an operation using a general synchronization sequence according to an embodiment of the present invention
- Figure 10 shows an operation using a general replication sequence according to an embodiment of the present invention
- Figure 11a shows the flow for read operations when the redundancy group is in a normal, non-degraded mode, according to an embodiment of the present invention
- Figure 11b shows the flow for read operations when the redundancy group is in a degraded mode, according to an embodiment of the present invention
- Figure 12 shows the flow for pipelining the replication of incoming host data according to an embodiment of the present invention
- Figure 13a shows the flow for a write operation when the redundancy group is in a normal, non-degraded mode according to an embodiment of the present invention
- Figure 13b shows the flow for a recovery process when the AMF updating the stripe as shown in Figure 13a fails before completing the update according to an embodiment of the present invention
- Figure 14a shows the flow for a write operation when the redundancy group is in a degraded (with a failed drive) mode, according to an embodiment of the present invention
- Figure 14b shows the flow for a recovery process when the AMF updating the stripe as shown in Figure 14a fails before completing the update according to an embodiment of the present invention
- Figure 15 shows the flow for a background reconstruction process according to an embodiment of the present invention
- Figure 16 shows the general sequence flow for a background expansion process according to an embodiment of the present invention
- Figures 17a and 17b illustrate AMF communication without, and with, the message gathering techniques of the present invention, respectively;
- Figure 18a illustrates a basic arbitration process where an AMF requests a lock for a particular resource according to the present invention
- Figure 18b illustrates the general process flow of the generalized arbitration process according to the present invention
- Figure 19 illustrates a simplified arbitration process between two AMFs in a cluster configuration for a single resource
- Figure 20 illustrates exemplary resource arbitration sequences for a cluster including four AMFs according to the present invention.
- the present invention provides for shared redundancy group management (SRGM) between multiple AMFs so that multiple AMFs can simultaneously access the same redundancy group.
- distributed synchronization and replication techniques are used to coordinate the activities of all AMFs sharing a redundancy group and to maintain data reliability. Access to any redundancy group can be gained through any controller that includes an AMF that is sharing control of that redundancy group.
- the AMFs sharing a resource group are therefore peers.
- redundancy groups are preferably shared on a group by group basis. That is, some redundancy groups may be shared by a first group of AMFs, other redundancy groups may be shared by a second group of AMFs, and still other redundancy groups may not be shared at all.
- FIG. 1 shows a basic network configuration according to the present invention.
- a plurality of network clients 10 1 to 10 N are communicably coupled with a plurality of servers 20 1 to 20 N, each of which includes a controller 30.
- N is used herein to indicate an indefinite plurality, so that the number "N" when referring to one component does not necessarily equal the number "N" of a different component.
- Each network client 10 is coupled to one or more of servers 20 over any of a number of connection schemes as required for the specific application and geographical location relative to servers 20, including, for example, an internet connection, any local area network (LAN) type connection, any wide area network (WAN) type connection, any proprietary network connection, etc.
- Each controller 30 includes one or more AMFs, and is communicably coupled with the multiple arrays 40 of disk drives 45 over an interconnect medium, such as a storage area network (SAN) 50.
- SAN 50 is a fiber-channel based SAN.
- any SAN type such as a SCSI-based SAN, or any direct server interconnect such as a direct SCSI or FC connection may be used without departing from the spirit of the invention.
- Because each controller 30 has direct access to each array 40 over SAN 50, redundancy group management can be shared by all of the controllers 30.
- a fiber-channel based SAN is preferred because the fiber-channel standard is an open standard that supports several network topologies including point-to-point, switched fabric, arbitrated loop, and any combination of these topologies.
- Fiber-channel presently provides for data transfer speeds of up to 100 MBps (200 MBps duplex) at distances of up to 30 meters over copper cabling and up to 10 kilometers over fiber-optic cabling.
- FIG. 2 shows an example of multiple hosts, each with a controller configured in a switch-based fiber-channel SAN according to the present invention.
- Each controller 30 is coupled to switches 55 in the SAN through two fiber-channel ports as shown.
- each controller 30 is in communication with all other controllers 30 and with disk array 40.
- Each controller 30 communicates with its host system over a PCI bus 35.
- Switches 55 are coupled to disk array 40 using the loop topology as shown.
- many loops can be supported through any of a number of switching topologies. In general, the more loops, the greater the data transfer rates that can be supported.
- the system redundancy as shown in Figure 2 is N-1, meaning that given N controllers (30 1 to 30 N), up to N-1 controllers can fail before data availability is lost
- a controller failure for a specific host causes a loss of data availability for that specific host, but not for the entire system. Controller environmental faults, such as power supply failures, are protected against in this system configuration because the data from one host system is synchronized to data on the other host systems according to the present invention, as will be described in more detail below.
- There is a recovery period associated with a controller failure. This is the time it takes for the surviving controllers to make sure that all critical data is again replicated within the cluster.
- FIG. 3 shows an example of multiple controllers and a single host configured in a switch-based fiber-channel SAN according to the present invention.
- Each controller 30 is coupled to the switches 55 in the SAN through two fiber-channel ports as shown; however, from 1 to N ports may be used as desired for the particular application.
- each controller 30 is in communication with all other controllers 30 and with disk array 40 over the fiber-channel SAN.
- each controller 30 communicates with the host system over one or more PCI buses 35.
- the controllers 30 are also able to communicate with each other over the PCI buses 35.
- Switches 55 are coupled to disk array 40 using the loop topology as shown.
- FIG. 4 shows an example of multiple hosts each with multiple controllers configured in a switch-based fiber-channel SAN according to the present invention.
- Each controller 30 is coupled to the switches 55 in the SAN through two fiber-channel ports as shown; however, from 1 to N ports may be used as desired for the particular application.
- each controller 30 is in communication with all other controllers 30 and with disk array 40 over the fiber-channel SAN.
- each controller 30 communicates with its host system over one or more PCI buses 35.
- the controllers 30 are also able to communicate with each other over the PCI buses 35.
- Switches 55 are coupled to disk array 40 using the loop topology as shown.
- Each controller 30 is coupled to the loop through the two fiber-channel ports as shown. Thus, each controller 30 is in communication with all other controllers 30 and with disk array 40 over the FC-AL. Further, each controller 30 communicates with its host system over one or more PCI buses 35. In this configuration, redundancy and synchronization exist between two or more controllers within each host system. Where each host system includes N controllers 30, up to N-1 controllers can fail before loss of data availability to the host system. Further, if a host system fails, no data will be lost on array 40 when controllers 30 on other hosts are configured to share management of array 40 with the controllers 30 of the failed host system according to the present invention.
- FIG. 6 shows two independent redundancy groups managed by two independent controllers according to the present invention.
- Redundancy group A is managed by controller 30A of host system A
- redundancy group B is managed by controller 30B of host system B.
- external host system C and external host system D are also shown.
- the FC ports of controllers 30A and 30B function as both device and host channels. This allows each controller 30A or 30B to respond to service requests from its associated PCI bus 35 or from an external FC host such as external host system C, external host system D or another controller such as controller 30B or 30A, respectively. In this manner, redundancy group A is made accessible to host system B and redundancy group B is made accessible to host system A.
- From the perspective of controller 30A, for example, a request received from host system B to read or write data is treated as if it were received over associated PCI bus 35. Similarly, external host systems C and D are able to access data on redundancy groups A and B by issuing read or write commands to the appropriate controller 30 over the fiber-channel SAN. Any number of host systems can be interconnected in this manner. Further, although only a single controller configuration is shown, other configurations can be used, such as the configuration as illustrated in Figure 4. The use of switches helps isolate the disk arrays for performance scalability.
- FIG. 7 shows an example of an external RAID system including multiple controllers in passive PCI backplane(s) configured in a switch-based fiber-channel SAN according to the present invention.
- the controllers 30₁ to 30N are installed into one or more passive PCI backplanes, and are configured to accept host commands from the FC ports and/or the associated PCI buses.
- external servers are able to access data on the various redundancy groups controlled by the controllers 30₁ to 30N by issuing read or write requests to the appropriate controller 30.
- only one controller 30 is required; performance and redundancy scale as more controllers are added.
- an FC-AL similar to the configuration shown in Figure 5 can alternatively be used if the use of switches 55 is undesired or impractical.
- Figure 8 shows a multiple controller configuration and the internal configuration of controllers 30 according to the present invention.
- One or more of the controllers 30₁ to 30N shown in Figure 8 may be located in a separate host or on passive PCI backplanes.
- each controller 30 may be located in a separate host system, or each of multiple host systems may include one or more of the controllers 30.
- PCI host connection 60 provides a connection path for receiving and processing commands from host systems and for providing inter-controller link (ICL) services with other controllers.
- Fiber-channel (FC) host connection 65 provides a connection means for receiving and processing commands from host systems and for providing ICL services with other controllers.
- each controller includes two physical FC ports (not shown in Figure 8, but see Figures 2 through 7 for examples), both of which are used for disk drive access, receiving and processing host commands and ICL services. It will be apparent to one skilled in the art that each controller can include from 1 to N FC ports as desired for the particular application.
- Each controller 30 includes one or more virtual disk ports 70 each of which provides access to a virtual disk 75.
- Virtual disks 75 are basically partitions of an array. (A "Redundancy Group" is generally synonymous with "Array".) Each array may be partitioned into as many virtual disks 75 as desired. Each virtual disk is associated with, and controlled by, one or more AMFs 80.
- Many virtual disk ports 70 can exist for the same virtual disk 75, but each must exist on a separate controller. For example, as shown in Figure 8, virtual disk ports 70YR₁ and 70YRN associated with virtual disk YR are provided to hosts on controller 30₁ and controller 30N, respectively. Each virtual disk port YR provides access to the same virtual disk YR.
- Virtual disk YR is a partition of array Y, the control and management of which is shared by AMFs 80Y₁ and 80YN.
- Virtual disk ports can be added to a controller at any time by creating a virtual disk and assigning an IO port address to the virtual disk port.
- a virtual disk must exist before a virtual disk port is created and assigned to it, but the creation of a virtual disk is not coincident with the creation of a virtual disk port.
- a virtual disk port is created right after the redundancy group is created.
- Virtual disk ports can then be created at any time thereafter, but the creation of the virtual disk is only done once. Virtual disk ports can also be deleted at any time. All host operations in progress on the port are allowed to complete. While these operations are completing, new host operations are rejected, for example, by returning a not_ready status signal to the host.
- AMF 80Y₁ synchronizes and replicates with AMF 80YN (and with any other AMF associated with array Y, e.g., AMF 80Y₂ (not shown)).
- AMF 80G₁ synchronizes and replicates with AMF 80GN
- AMF 80T₁ synchronizes and replicates with AMF 80TN.
- virtual disk ports on one controller synchronize and replicate with related virtual disk ports on other controllers.
- Synchronization and replication ensure that the operations performed by the different AMFs sharing a redundancy group (array) do not destructively interfere with each other (e.g., "collisions" do not occur).
- Synchronization requires that any AMF which needs to access a resource associated with a shared redundancy group arbitrate with the other AMFs for access rights (lock) on the resource before using it.
- Arbitration is accomplished by sending arbitration messages between the AMFs over the PCI and/or FC ICL links.
- FIG. 9 shows a general synchronization sequence for an operation according to an embodiment of the present invention.
- the operation is started. For example, a host may send a request that requires data be written to a particular resource.
- the AMF determines whether it already has a lock on the desired resource. If not, the AMF arbitrates with other AMFs for a lock on the desired resource in step 130. Once a lock has been acquired (or it is determined that the AMF already has the lock), the desired operation is performed on the resource by the AMF in step 140.
- Once a lock is acquired by an AMF, it is preferably not released until another AMF needs the lock (i.e., another AMF arbitrates for the lock); this helps cut shared redundancy group management (SRGM) overhead in many applications.
- a first-come-first-served arbitration scheme is preferably used, but a priority-based or any other arbitration scheme can be used.
- arbitration typically involves making a request to use a resource to a resource controller (typically software, but sometimes hardware based). The resource controller grants access to the resource based on the arbitration algorithm used.
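The synchronization sequence of Figure 9 and the resource-controller arbitration it relies on can be sketched as follows. This is a minimal single-process sketch; the class and function names are illustrative, not taken from the patent.

```python
class ResourceArbiter:
    """Hypothetical software resource controller granting locks
    first-come-first-served (the preferred scheme in the text)."""
    def __init__(self):
        self.owner = None
        self.waiters = []

    def request(self, amf_id):
        # Queue the arbitration request; grant immediately if the
        # resource is free. Returns True if the lock was granted.
        if self.owner is None:
            self.owner = amf_id
        else:
            self.waiters.append(amf_id)
        return self.owner == amf_id

    def release(self):
        # Hand the lock to the next waiting AMF, if any.
        self.waiters and None  # no-op placeholder for readability
        self.owner = self.waiters.pop(0) if self.waiters else None


def run_operation(amf_id, arbiter, held_locks, resource, operation):
    """Sketch of the Figure 9 sequence: check for an existing lock
    (step 120), arbitrate if needed (step 130), then operate (step 140)."""
    if resource not in held_locks:
        arbiter.request(amf_id)   # step 130: arbitrate for the lock
        held_locks.add(resource)  # lock kept until another AMF wants it
    return operation(resource)    # step 140: perform the operation
```

Note how `held_locks` models the lazy-release behavior described above: the lock stays with the AMF between operations until another AMF arbitrates for it.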
- Each AMF is able to execute many types of operations on a redundancy group, including, for example, host reads, host writes, background writes, regenerations, reconstructions, online expansion, parity scrubbing, etc.
- An extended sequence of such operations is termed a "process". Examples of processes include reconstructions, online expansion, and parity scrubbing. All AMF operation types require synchronization arbitration in order to proceed. Once an operation on a resource is completed by a given AMF, other AMFs are able to use the resource.
- Synchronization is preferably performed at the operation level as opposed to the process level. That is, for each operation to be executed, the basic synchronization sequence shown in Figure 9 is performed. For a process wherein some function must be performed on the entire redundancy group (such as a reconstruction), the processing is broken up into a sequence of operations. Where each operation operates on a different resource, arbitration for synchronization of those resources required for one operation is done independently of the resources required for other operations in the process. Using synchronization at the operation level instead of the process level allows AMFs to share resources while a process is in progress. If synchronization were performed at the process level instead of the operation level, some AMFs would have to wait until the entire process is complete before they could use the resources, thereby resulting in host timeouts.
- Replication accommodates AMF failures.
- Resources and their state information are replicated so that if an AMF fails the resource and its state information is available via another AMF that has a copy of the resource and its state information.
- a copy of the modified resource and/or the resource's operation state is sent to other AMFs sharing the resource.
- These other AMFs are called replication partners.
- AMF 80Y₁ and AMF 80YN are replication partners, as each shares control of Array Y.
- the replicated information is used by the replication partners to complete the operation in the event that the AMF updating the resource fails during the operation.
- Figure 10 shows a general replication sequence for an operation according to an embodiment of the present invention.
- the start of the sequence is the basic synchronization sequence as shown in Figure 9.
- the operation is started. For example, a host may send a request that requires writing data to a particular resource.
- the AMF determines whether it already has a lock on the desired resource. If not, the AMF arbitrates with other AMFs for a lock on the desired resource in step 320. Once a lock has been acquired the operation can be performed. As shown, the operation performed by the AMF is broken into a number, i, of steps.
- In step 240₁ the replication data and state information associated with the resource and the first operation step are sent to each replication partner.
- the first step of the operation is performed.
- subsequent operation steps 250₂ to 250ᵢ are performed in sequence, as are the replication steps 240₂ to 240ᵢ.
- the replication information is sent to the replication partners associated with the given resource.
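The Figure 10 sequence can be sketched as a loop that interleaves replication and operation steps, so a partner can finish the operation if the updating AMF fails mid-way. The message format and `send` callback are assumptions made for illustration.

```python
def replicated_operation(resource, op_steps, partners, send):
    """Sketch of the Figure 10 sequence: before each operation step i
    (step 250_i), replication data and state for that step are sent to
    every replication partner (step 240_i)."""
    results = []
    for i, op_step in enumerate(op_steps, start=1):
        for partner in partners:                   # step 240_i: replicate
            send(partner, {"resource": resource, "op_step": i})
        results.append(op_step(resource))          # step 250_i: operate
    return results
```

In the sketch each replication message precedes its operation step, matching the ordering shown in Figure 10.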
- N-1 concurrent AMF failures are accommodated if N copies of a resource and its state information exist within the AMF cluster (i.e., those AMFs sharing the resource), where N is defined as the replication dimension.
- replication information is sent to the N-1 replication partners associated with the given resource.
- Replication can be targeted to specific replication groups or specific AMFs.
- N-way replication is performed without defined replication groups.
- replication takes place with any N-l AMFs in the cluster that are sharing the resource being replicated.
- replication is performed with N-1 other replication groups.
- a replication group is a group of AMFs that replicate critical data to AMFs in replication groups other than their own.
- An example of this is a set of controllers, each controller including one or more AMFs, at one physical site and another set of controllers at another physical site.
- Another example is a set of controllers inside a host system and another set external to a host. Using replication groups helps ensure that if one group of controllers all fail, the other group(s) have the information necessary to maintain data reliability.
- the replication can be targeted to specific replication groups or specific AMFs.
- a given replication group preferably replicates with any AMF outside of the replicating AMF's replication group.
- the set of replication groups to which a given replication group replicates may be specified by an operator.
- synchronous replication is the preferred replication mode. In the synchronous replication mode, completion status information for an operation is returned to a host after all replication targets have received the replication data. Alternate replication modes include asynchronous replication and pseudo-synchronous replication. In the asynchronous replication mode, completion status information for an operation is returned to a host before replication data has been transmitted. In the pseudo-synchronous replication mode, completion status information for an operation is returned to a host after the replication data has been transmitted, but before all replication targets have acknowledged receiving the data.
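The difference between the three modes reduces to the ordering of three actions: transmitting replication data, awaiting acknowledgements, and returning status to the host. A hedged sketch (mode names follow the text; the event-recording mechanism is illustrative):

```python
def complete_write(mode, events):
    """Record the order in which a write's replication traffic and host
    completion status occur under each replication mode."""
    transmit = lambda: events.append("transmit")      # send replication data
    await_acks = lambda: events.append("acks")        # wait for target acks
    status = lambda: events.append("status")          # return status to host

    if mode == "asynchronous":
        # Status returned before replication data is even transmitted.
        status(); transmit(); await_acks()
    elif mode == "pseudo-synchronous":
        # Status returned after transmit, before all targets acknowledge.
        transmit(); status(); await_acks()
    else:
        # "synchronous" -- the preferred mode: status only after all
        # replication targets have received the data.
        transmit(); await_acks(); status()
```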
- multiple AMFs are able to read a resource concurrently. That is, multiple read locks can be outstanding at any time.
- FIG. 11a shows the general sequence flow for a read operation when the redundancy group (RG) is in a normal, non-degraded mode according to an embodiment of the present invention.
- “Non-degraded” generally refers to the case where all drives in the redundancy group are operational, whereas “degraded” generally refers to the case where one or more drives in the redundancy group have failed.
- step 310 the read operation is started.
- the AMF receives a request from a host to read a particular resource.
- a lock on the particular resource is required. This is basically the same as steps 120 and 130 of Figure 9.
- multiple locks can be outstanding. This enables multiple AMFs to read a resource concurrently.
- Figure 11b shows the general sequence flow for a read operation when the redundancy group (RG) is in a degraded mode according to an embodiment of the present invention.
- the read operation is started.
- the AMF receives a request from a host to read a particular resource.
- a lock on the particular resource is required.
- the AMF reads the data and parity from the particular resource at step 340, and regenerates any missing data at step 350.
- the data (regenerated) is transferred to the host that issued the read request.
- Figure 12 shows the general sequence flow for replicating incoming host data in a pipelined manner for a write operation according to an embodiment of the present invention.
- Pipelining of the replication data helps to minimize replication latency.
- the operation is started. For example, a host issues a write command to write one or more blocks of data to one or more resources.
- the host command is received from the host and parsed. The host command is processed as a sequence of data block sets.
- the appropriate lock for the starting set of blocks is acquired.
- the starting block set is transferred to the AMF from the host.
- the block set replication is started for the starting set of blocks.
- the AMF does not wait for the block set replication to complete; the AMF immediately determines whether any more sets of blocks need to be processed at step 460. If so, the AMF immediately starts acquiring the appropriate lock for the next set of blocks in step 430, and repeats steps 440, 450 and 460 for the next block set. If all block sets have been received and processed, the AMF waits for all replication operations to complete in step 470. When all replication operations are complete, the AMF sends status to the host in step 480.
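The pipelined flow of Figure 12 can be sketched as a loop that starts replication for each block set without awaiting it, then drains all pending replications at the end. The callback decomposition below is an assumption made for illustration.

```python
def pipelined_write(block_sets, acquire_lock, transfer, start_replication,
                    wait_all, send_status):
    """Sketch of the Figure 12 pipeline: replication of each block set is
    started (step 450) but not awaited before the next set is locked and
    transferred (steps 430/440). Only after the last block set does the
    AMF wait for all replications (step 470) and return status (480)."""
    pending = []
    for block_set in block_sets:
        acquire_lock(block_set)                       # step 430
        transfer(block_set)                           # step 440: host -> AMF
        pending.append(start_replication(block_set))  # step 450, not awaited
    wait_all(pending)                                 # step 470
    send_status()                                     # step 480
```

Overlapping replication with the transfer of subsequent block sets is what minimizes the replication latency noted above.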
- FIG. 13a shows the general sequence flow for a write operation when the redundancy group (RG) is in a normal, non-degraded mode according to an embodiment of the present invention.
- the operation is started.
- a host issues a write command to write data to a resource.
- Step 520 is the process of acquiring the synchronization lock for the resource required as shown in Figure 9.
- the resource is a stripe write lock, but it may also be any other lock as required by the particular operation.
- the AMF reads the old data and parity from the RG's disks in step 530.
- Concurrent with the disk read operation of step 530, the AMF sends a state notification signal to its replication partners for this resource in step 540.
- the replication partners include all other AMFs to which this AMF replicates state information and data for the particular resource.
- the number of replication partners is equal to N-1, where N is the replication dimension.
- the replication dimension N is from 1 to 8, but N may be any number as desired.
- the state notification signal is a 'begin update' type signal, which tells the replication partners that a stripe update has started. The replication partners need to know this information because they will be responsible for cleaning up in the event the writing AMF fails before completing the operation.
- the AMF writes the new data to the RG member disks in step 550. Concurrent with the new data write step 550 is the generation of the new parity in step 560. Thereafter, in step 570 the AMF writes the new parity to the RG member disks.
- the AMF sends an 'end update' notification to its replication partners in step 580. Upon receiving this notification, the replication partners release their state data associated with the stripe update.
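The parity computation behind steps 530 through 570 follows the standard read-modify-write pattern. The patent does not mandate a particular RAID level, but assuming RAID-5-style XOR parity, the update rule can be sketched as:

```python
def raid5_rmw_parity(old_data: int, old_parity: int, new_data: int) -> int:
    """Read-modify-write parity update (an assumption of RAID-5 XOR
    parity): new parity = old parity XOR old data XOR new data, so only
    the updated data member and the parity member need to be read and
    rewritten, not the whole stripe."""
    return old_parity ^ old_data ^ new_data
```

This is why step 530 reads the old data and parity before the new data and parity writes of steps 550 and 570.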
- Figure 13b shows the general sequence flow for a recovery process when the AMF updating the stripe as shown in Figure 13a fails before completing the update according to an embodiment of the present invention.
- the replication partners assume the responsibility of recovering from the failed update operation.
- the recovery operation begins when one or more of the replication partner AMFs either detects a failure or is notified of a failure.
- the replication partners arbitrate for ownership of the stripe lock in step 620.
- the AMF that wins the arbitration (the recovery AMF) is responsible for executing recovery of the failed update operation.
- Failure notification typically comes from the inter-controller link (ICL) component. If a controller fails, the AMFs on that controller lose communication with the other AMFs they were sharing the redundancy group with. The ICL periodically sends a 'ping' message to all the other AMFs it is sharing the redundancy group with. If any of these AMFs fails to respond to the ping message, then the AMF that sent the ping message assumes the AMF has failed and begins recovery action. Recovery is also triggered if the ICL encounters a transport failure when sending synchronization or replication messages to the destination AMF.
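A minimal sketch of ping-based failure detection of the kind described for the ICL. The timeout value and the injected clock are assumptions made so the behavior is deterministic and testable.

```python
def make_ping_monitor(peers, timeout, clock):
    """Hypothetical ICL failure detector: any peer AMF that has not
    answered a ping within `timeout` time units is presumed failed,
    which would trigger recovery action for the shared redundancy group."""
    last_seen = {p: clock() for p in peers}

    def note_response(peer):
        # Record that a ping response arrived from this peer.
        last_seen[peer] = clock()

    def failed_peers():
        # Peers whose last response is older than the timeout.
        now = clock()
        return {p for p in peers if now - last_seen[p] > timeout}

    return note_response, failed_peers
```

A transport failure while sending a synchronization or replication message would trigger the same recovery path directly, without waiting for a missed ping.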
- the recovery process includes two basic steps: recalculation of the stripe parity and rewriting the data.
- the recovery AMF reads all the data for the stripe segments affected by the failed update operation. Concurrent with the data read step 630, the recovery AMF assigns one or more new replication partners and sends a 'begin update' notification to all of its replication partners in step 640.
- When the data read operation is complete, the recovery AMF generates new parity in step 650. This new parity calculation does not include the new data. It is simply a regeneration of parity for data on the RG member disks.
- the recovery AMF writes the new parity to RG member disks in step 660.
- the recovery AMF sends an 'end update' notification to the replication partners in step 670.
- the cache write back scheduling algorithm causes one of the replication partners to write the new data to the RG member disks in step 680, which is a normal (non-recovery mode) stripe update operation as shown in Figure 13a.
- the caching functionality is a part of the AMF.
- FIG. 14a shows the flow for a write operation when the redundancy group (RG) is in a degraded (with a failed drive) mode, according to an embodiment of the present invention.
- This sequence is similar to that of the non-degraded case shown in Figure 13a, with the inclusion of regeneration step 744 and replication step 746 as will be described below.
- step 710 the operation is started.
- a host issues a write command to write data to a resource.
- Step 720 is the process of acquiring the synchronization lock for the resource required as shown in Figure 9.
- the resource is a stripe write lock, but it may also be any other lock as required by the particular operation.
- the AMF reads the old data and parity from the RG's disks in step 730.
- Concurrent with the disk read operation of step 730, the AMF sends a state notification signal to its replication partners for this resource in step 740.
- the replication partners include all other AMFs to which this AMF replicates state information and data for the particular resource.
- the state notification signal is a 'begin update' type signal, which tells the replication partners that a stripe update has started.
- the replication partners need to know this information because they will be responsible for cleaning up in the event the writing AMF fails before completing the operation.
- the AMF regenerates the data that was on the failed disk in step 744.
- the old data including regenerated data, is replicated to the replication partners. Replication of this data to the replication partners is necessary for recovery in the event the updating AMF fails before completing the operation.
- Following step 746, the new data is written to the RG member disks in step 750. Concurrent with the new data write step 750 is the generation of the new parity in step 760. Thereafter, in step 770 the AMF writes the new parity to the RG member disks. Once the parity write operation is complete, the AMF sends an 'end update' notification to its replication partners in step 780. Upon receiving this notification, the replication partners release their state data associated with the stripe update.
- Figure 14b shows the general sequence flow for a recovery process when the AMF updating the stripe as shown in Figure 14a fails before completing the update according to an embodiment of the present invention.
- This scenario differs from the non- degraded recovery case shown in Figure 13b in that the recovery AMF uses the replicated old data to reconstruct the RG disk segments that were affected by the updating AMF's failure.
- the replication partners assume the responsibility of recovering from the failed update operation.
- the recovery operation begins when one or more of the replication partner AMFs either detects a failure or is notified of a failure, for example by a host.
- the replication partners arbitrate for ownership of the stripe lock in step 820.
- the AMF that wins the arbitration (the recovery AMF) is responsible for executing recovery of the failed update operation.
- step 830 new parity is generated from the old data supplied by replication step 746 of Figure 14a.
- the recovery AMF assigns one or more new replication partners and sends a 'begin update' notification to all of its replication partners in step 840.
- step 850 the old data is written to the disks.
- step 860 the replication partners are informed that the old data has been written back to the disks. The replication partners can now discard their copies of the old data.
- the remainder of the recovery sequence is the same as in the non-degraded case. Specifically, the new parity is written to the RG member disks in step 870.
- the recovery AMF sends an 'end update' notification to the replication partners in step 880.
- the cache write back scheduling algorithm causes one of the replication partners to write the new data to the RG member disks in step 890, which is a normal (non-recovery mode) stripe update operation as shown in Figure 13a.
- FIG. 15 shows the general sequence flow for a background reconstruction process, according to an embodiment of the present invention.
- Each operation is started in step 910, and the appropriate lock is acquired in step 920, in this case a stripe lock.
- the AMF reads the data and parity for the stripe.
- the AMF regenerates missing data, and in step 950 writes the data to the replacement disk.
- the AMF updates the map table to reflect the fact that blocks that originally mapped to the failed disk now map to the replacement disk in step 960.
- the map table maps host logical blocks to RG member disks and blocks on the disks.
- each AMFi reconstructs stripe(s) Mod(S/i), where S is the stripe number.
- RG expansion is the addition of drive members to an existing RG.
- a unique advantage of SRGM is that it allows expansion processing to be distributed to all AMFs sharing a RG. This results in faster expansion times and a reduction in the increased response times normally encountered by a host during expansion.
- Distributed expansion is accomplished by having a subset (or all) of the AMFs sharing a RG arbitrate for which stripes they will be responsible for expanding. If any of these AMFs fail or shut down during expansion, then the remaining AMFs re-arbitrate expansion responsibilities. For example, suppose there are N AMFs sharing a redundancy group that needs expansion. These AMFs talk to each other (by sending messages) and determine which ones are to participate in the expansion, e.g., a subset of N, denoted by M. These M AMFs determine expansion responsibilities by determining which AMFs will expand which stripe(s). This can be determined by any algorithm. In one embodiment of the invention, for example, each AMFi expands stripe(s) Mod(S/i), where S is the stripe number.
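One plausible reading of the Mod(S/i) rule is a modulo partition of stripe numbers over the M participating AMFs; the exact formula below is an assumption, since the text allows any assignment algorithm.

```python
def stripes_for_amf(amf_index, num_amfs, total_stripes):
    """Hypothetical Mod(S/i) assignment: participating AMF number i takes
    every stripe S with S mod M == i, where M is the number of
    participating AMFs. Every stripe lands on exactly one AMF, so the
    expansion (or reconstruction) work is evenly distributed."""
    return [s for s in range(total_stripes) if s % num_amfs == amf_index]
```

If an AMF fails mid-expansion, re-running the same formula over the surviving M-1 AMFs re-partitions the remaining stripes deterministically.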
- Figure 16 shows the general sequence flow for a background expansion process according to an embodiment of the present invention.
- the process is started in step 1010, and the appropriate stripe lock is acquired in step 1020.
- the expansion case is different from the preceding examples in that multiple locks must be acquired.
- An expansion operation will involve 2 or more stripes.
- One stripe is the expansion stripe whose stripe width will be increased from W to W+N.
- the other stripes involved are stripes containing host data that will be migrated from those stripes to the expanded stripe.
- step 1030 the data on the stripe to be expanded is read.
- step 1040 the data is replicated so that if the operation fails before completion the replication partners will be able to clean up after the failure and continue the expansion process.
- the source data stripe containing data to be migrated to the expanded stripe is read in step 1045.
- the AMF notifies its replication partners that it is beginning the expansion stripe update in step 1050.
- step 1055 the AMF generates parity information for the expanded stripe.
- the data for the expanded stripe is written to the disks in step 1060.
- when the parity generation step 1055 and the notify-begin-update step 1050 are complete, the parity is written to the disks in step 1070.
- the AMF notifies its replication partners that the update is complete in step 1080.
- the replication partners then update their map tables to reflect the increased stripe width and migrated host data. They also discard the data replicated in step 1040.
- the map table maps host logical blocks to RG member disks and blocks on the disks.
- In step 1090 it is determined whether any more stripes are to be expanded by the AMF. If so, the sequence is repeated. This sequence repeats as long as there are more stripes that need to be expanded to utilize the capacity of the new RG member disks. Note that this is a process; what makes it a process is the looping that results from step 1090. Steps 1020 through 1090 comprise an operation.
- Message gathering is, generally, an algorithm that combines many small messages destined for a particular cluster node (i.e. a controller, in which may reside many AMFs) into one big message packet and sends it as one message to the particular node. This dramatically reduces processing overhead and IO channel loading, and contrasts with the approach of sending individual messages to a cluster node.
- Figure 17a illustrates AMF communication without the message gathering techniques of the present invention.
- a collection of AMFs 1100 and an Inter-Controller Link (ICL) entity 1105 compose a SRGM node 1110.
- a node is typically a hardware entity such as a controller.
- ICL 1105 is a software entity that is responsible for routing synchronization and replication messages 1120 from one AMF to another. As shown in Figure 17a, only one of many similar nodes is shown as being connected to the SAN 1130.
- the AMFs 1100 within node 1110 are sending and receiving synchronization and replication messages with other AMFs on other nodes that share the same redundancy group. Each AMF within node 1110 generates independent streams of synchronization and replication messages, all destined for one or more other nodes on SAN 1130. The messages being sent or received by a particular AMF are independent of the messages being sent or received by other AMFs on the same node. As shown in Figure 17a, three AMFs 1100 are sending a total of nine messages 1140 to AMFs on other nodes. Without message gathering, ICL 1105 has to send nine messages to other nodes. Also, without message gathering, all synchronization and replication messages generated by all AMFs within a SAN node are processed and sent through the SAN individually. Each message takes a fixed amount of processing overhead, regardless of size.
- Figure 17b illustrates AMF communication with the message gathering techniques of the present invention.
- Message gathering is where many smaller messages destined for a particular node are packed together to form one larger message. This larger message can be sent over SAN 1130 as one message and then unpacked on the receiving node back into the individual messages. For example as shown, the nine messages 1120 are destined for three different nodes. In this example, then, if message gathering is used, ICL 1105 only needs to send three messages 1150 - one for each node (not counting itself). ICL 1105 takes on the responsibility of packing and unpacking individual AMF messages.
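Message gathering and its inverse can be sketched as a pack/unpack pair keyed by destination node; the message shape used here is illustrative, not from the patent.

```python
def gather(messages):
    """Pack many small AMF messages into one packet per destination node,
    as the ICL does on the sending side."""
    packets = {}
    for msg in messages:
        packets.setdefault(msg["dest_node"], []).append(msg["payload"])
    # One packed message per destination node instead of one send per
    # individual AMF message.
    return [{"dest_node": n, "payloads": p} for n, p in packets.items()]


def scatter(packet):
    """Unpack a gathered packet back into individual AMF messages on the
    receiving node."""
    return [{"dest_node": packet["dest_node"], "payload": p}
            for p in packet["payloads"]]
```

With nine messages destined for three nodes, `gather` reduces the nine sends of Figure 17a to the three sends of Figure 17b, and the fixed per-message processing overhead is paid three times instead of nine.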
- FIG. 18a illustrates a basic arbitration process where an AMF requests a lock for a particular resource according to the present invention.
- AMF 1200 and AMF 1210 each request a lock on a particular resource, and the lock requests are queued in an arbitration queue 1205.
- the arbitration process for an AMF begins when a request is placed in arbitration queue 1205.
- the requests are processed in some order such that all requests are satisfied in priority order.
- the request queue priority is established through any well known algorithm (e.g. FIFO, LIFO).
- Each requesting AMF must wait until its request is processed to obtain the lock.
- Each AMF obtains a lock on the resource at successful completion of the arbitration process.
- An AMF fails to lock the resource if arbitration fails.
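The arbitration queue of Figure 18a can be sketched as a grant queue with a FIFO policy (any priority scheme could replace the `popleft` discipline); the class and method names are illustrative.

```python
from collections import deque


class ArbitrationQueue:
    """Sketch of arbitration queue 1205: lock requests are queued and
    granted one at a time, in arrival (FIFO) order."""
    def __init__(self):
        self.waiting = deque()
        self.owner = None

    def request(self, amf_id):
        # Place the request in the queue; grant it immediately if the
        # resource is free. Returns True if this AMF now holds the lock.
        self.waiting.append(amf_id)
        self._grant_next()
        return self.owner == amf_id

    def release(self, amf_id):
        # Releasing the lock hands it to the next queued requester.
        if self.owner == amf_id:
            self.owner = None
            self._grant_next()

    def _grant_next(self):
        if self.owner is None and self.waiting:
            self.owner = self.waiting.popleft()
```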
- FIG. 18b illustrates the general process flow of the generalized arbitration process according to the present invention.
- Arbitration involves coordinating the resource lock transfer between two AMFs: the requesting AMF 1225 and the AMF 1230 with the resource lock.
- AMF 1225 sends a Request Lock message to an arbitrator 1220 (the entity performing the arbitration process), which queues the message until a time defined by its priority algorithm.
- arbitrator 1220 processes the request by issuing a Release Lock message to AMF 1230 which currently has the resource lock.
- AMF 1230 releases the lock and notifies arbitrator 1220 that the lock is released.
- Arbitrator 1220 then signals requesting AMF 1225 that it has been granted the resource lock.
- AMF 1225 continues to hold the lock until arbitrator 1220 calls for it to release the resource.
- optimizations of the arbitration queue are possible when one or more AMFs request a read lock for a particular resource.
- the arbitration process simultaneously grants read locks in these situations, as long as command ordering is preserved.
- An AMF (or controller) manages the arbitration process for a resource within the redundancy group cluster. This AMF is known as the resource arbitrator. Assignment of the arbitrator for a specific resource can be accomplished using any of multiple methods (e.g. single arbitrator for all resources, load balancing assignment, etc.). The preferred methods for arbitration assignment according to the present invention are based on the number of controllers and the resource range. For cluster configurations with one or two AMFs, the assigned arbitrator is the last AMF with a Write Lock.
- FIG. 19 illustrates a simplified arbitration process between two AMFs in a cluster configuration for a single resource.
- First AMF 1300 (AMF #1) issues a Read Lock request 1320 for a resource to second AMF 1310 (AMF #2), which currently has a Write Lock on the resource.
- AMF#2 issues a Grant Lock (read) message 1330 to AMF #1 indicating that a resource lock has been granted.
- AMF #1 now has read access to the resource.
- The sequence continues when AMF #1 issues a Write Lock request 1340 to AMF #2.
- AMF #2 responds with a Grant Lock (write) message 1350.
- AMF #1 issues a Read Lock request 1360, and since AMF #1 already has a Write Lock, it handles its own arbitration and demotes the Write Lock to a Read Lock.
- AMF #2 has no locks on the resource at this time, so it does not need to be notified.
- AMF #2 issues a Read Lock request 1370 to AMF #1, which responds immediately with a Grant Lock (read) message 1380 since the resource supports multiple read locks.
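The two-AMF sequence of FIG. 19 can be sketched as a pair of peers in which the write-lock holder arbitrates, a holder demotes its own Write Lock locally on a read request, and multiple read locks coexist. This is an assumed simplification of the protocol, not the patent's implementation; in particular, the rule that a granting AMF gives up any conflicting lock is an assumption here.

```python
class PeerAMF:
    """Two-AMF cluster peer; self.lock is None, 'read', or 'write'."""

    def __init__(self, name):
        self.name = name
        self.lock = None
        self.peer = None

    def request(self, mode):
        if self.lock == "write":
            # already the writer: handle its own arbitration; a read
            # request demotes the Write Lock locally, with no message
            # to the peer
            self.lock = mode
            return
        self.peer.grant(self, mode)

    def grant(self, requester, mode):
        # multiple read locks may coexist; in any other combination this
        # AMF gives up its lock before granting (simplifying assumption)
        if not (mode == "read" and self.lock == "read"):
            self.lock = None
        requester.lock = mode
```

Replaying messages 1320 through 1380 against this sketch ends with both peers holding read locks, matching the figure.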
- FIG. 20 illustrates exemplary resource arbitration sequences for a cluster including four AMFs according to the present invention.
- The preferred arbitrator assignment method for clusters containing three or more AMFs is to select the arbitrator using a fixed mapping algorithm. This has the effect of permanently associating an arbitrator with a single AMF. In the event of AMF resource arbitration failure, the resource arbitrator is reassigned according to the mapping algorithm.
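A fixed mapping of this kind can be sketched as below. The modulo mapping and the re-application over surviving AMFs on failure are illustrative assumptions; the patent does not specify a particular mapping function.

```python
def assigned_arbitrator(resource_id, amfs, failed=frozenset()):
    """Return the AMF that arbitrates for this resource.

    The same resource always maps to the same AMF (fixed mapping).
    If that AMF has failed, the mapping is re-applied over the
    surviving AMFs, which reassigns the arbitrator deterministically.
    """
    live = [a for a in amfs if a not in failed]
    if not live:
        raise RuntimeError("no surviving AMF available to arbitrate")
    return live[resource_id % len(live)]
```

Because the mapping is a pure function of the resource identifier and the cluster membership, every AMF can compute the arbitrator independently, with no extra messages.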
- First AMF 1400 (AMF #1) issues a write lock request 1420 to the resource X arbitrator on third AMF 1410 (AMF #3).
- The arbitrator on AMF #3 issues a release lock request 1422 to second AMF 1405 (AMF #2) to release its lock on resource X.
- AMF #1 issues a read lock request 1424 for resource Y.
- Fourth AMF 1415 is the assigned arbitrator for resource Y.
- AMF #4 immediately grants a read lock 1426 since no other AMFs currently have a lock.
- AMF #4 issues a write lock request 1428 for resource X, which is queued by the arbitrator on AMF #3 since it is currently processing write lock request 1420.
- AMF #2 sends a lock released message 1430 to AMF #3, which sends a grant lock (write) message 1432 to AMF #1.
- Embedded within grant lock message 1432 is a flag indicating that AMF #1 should release the lock when finished. This optimization eliminates the need for AMF #3 to send a release lock message to AMF #1.
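The release-when-done optimization amounts to the arbitrator setting a flag in the grant whenever another request is already queued for the resource. A minimal sketch, with assumed message field names:

```python
from dataclasses import dataclass

@dataclass
class GrantLock:
    resource: str
    mode: str                  # "read" or "write"
    release_when_done: bool    # holder releases without being asked

def make_grant(resource, mode, waiters):
    # set the flag whenever another AMF is already queued for the
    # resource, eliminating the later release lock message
    return GrantLock(resource, mode, release_when_done=len(waiters) > 0)
```

With the flag set, the holder sends its lock released message as soon as it finishes, saving one round trip per queued request.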
- AMF #1 sends a lock released message 1434 to AMF #3, which sends a grant lock message 1436 to AMF #4 (which is next in the queue for a write lock on resource X).
- The sequence beginning with request lock (read) message 1440 shows a multiple read lock condition. With the reception of the grant lock (read) message 1442, both AMF #2 and AMF #1 have simultaneous read locks on resource Y.
- The write lock request 1444 sent by AMF #3 causes AMF #4 to issue release lock messages 1446 and 1448 to AMF #2 and AMF #1, respectively. This results in both lock-released message 1450 and lock-released message 1452 being sent to AMF #4.
- Prior to AMF #4 granting a lock to AMF #3, AMF #1 sends a request read lock message 1454, which is queued by AMF #4.
- AMF #3 receives the grant write lock message 1456 for resource Y which contains a flag indicating that it should release the lock when complete.
- AMF #3 issues a lock released message 1458 when done with resource Y.
- AMF #4 then issues a grant lock (read) message 1460 notifying AMF #1 that it has obtained a read lock on resource Y.
- Resource arbitration is also optimized through the use of lock prefetch.
- An AMF can specify additional prefetch resources when arbitrating for a lock. If some or all of the prefetch resources are not locked, the arbitrator will lock them for the AMF as well. Thus, when the AMF later requests a lock on one of these prefetched resources, it gains the lock quickly (since it already holds it).
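Lock prefetch can be sketched as follows. The `lock_table` mapping of resource name to current holder is an assumed data structure for illustration, and the sketch assumes the primary request itself is grantable:

```python
def lock_with_prefetch(requested, prefetch, lock_table, amf):
    """Grant `requested` to `amf`, and opportunistically lock any
    currently unlocked prefetch resources for it as well."""
    lock_table[requested] = amf           # the lock actually arbitrated for
    granted = [requested]
    for resource in prefetch:
        if lock_table.get(resource) is None:
            lock_table[resource] = amf    # prefetched: locked while free
            granted.append(resource)
    return granted
```

Prefetch resources already held by another AMF are simply skipped; only the free ones are locked, so the optimization never blocks on a prefetch.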
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA2363726A CA2363726C (en) | 1999-03-03 | 2000-02-08 | Methods and systems for implementing shared disk array management functions |
KR1020017011191A KR20020012539A (en) | 1999-03-03 | 2000-02-08 | Methods and systems for implementing shared disk array management functions |
JP2000602929A JP2002538549A (en) | 1999-03-03 | 2000-02-08 | Method and system for performing a shared disk array management function |
EP00911736A EP1171820B1 (en) | 1999-03-03 | 2000-02-08 | Methods and systems for implementing shared disk array management functions |
DE60034327T DE60034327T2 (en) | 1999-03-03 | 2000-02-08 | METHOD AND SYSTEMS FOR IMPLEMENTING ADMINISTRATIVE FUNCTIONS FOR COMMONLY USED DISK ASSEMBLIES |
NZ513789A NZ513789A (en) | 1999-03-03 | 2000-02-08 | Methods and systems for implementing shared disk array management functions |
AU33586/00A AU3358600A (en) | 1999-03-03 | 2000-02-08 | Methods and systems for implementing shared disk array management functions |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/261,906 | 1999-03-03 | ||
US09/261,906 US6148414A (en) | 1998-09-24 | 1999-03-03 | Methods and systems for implementing shared disk array management functions |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2000052576A1 true WO2000052576A1 (en) | 2000-09-08 |
WO2000052576A9 WO2000052576A9 (en) | 2001-08-30 |
Family
ID=22995391
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2000/003275 WO2000052576A1 (en) | 1999-03-03 | 2000-02-08 | Methods and systems for implementing shared disk array management functions |
Country Status (12)
Country | Link |
---|---|
US (3) | US6148414A (en) |
EP (2) | EP1171820B1 (en) |
JP (1) | JP2002538549A (en) |
KR (1) | KR20020012539A (en) |
CN (1) | CN100489796C (en) |
AT (2) | ATE359552T1 (en) |
AU (1) | AU3358600A (en) |
CA (1) | CA2363726C (en) |
DE (2) | DE60042225D1 (en) |
NZ (1) | NZ513789A (en) |
TW (1) | TW468107B (en) |
WO (1) | WO2000052576A1 (en) |
Families Citing this family (229)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3360719B2 (en) * | 1998-06-19 | 2002-12-24 | 日本電気株式会社 | Disk array clustering reporting method and system |
US6148414A (en) * | 1998-09-24 | 2000-11-14 | Seek Systems, Inc. | Methods and systems for implementing shared disk array management functions |
JP2000187561A (en) * | 1998-12-24 | 2000-07-04 | Hitachi Ltd | Storage device system |
JP4294142B2 (en) | 1999-02-02 | 2009-07-08 | 株式会社日立製作所 | Disk subsystem |
US6397350B1 (en) * | 1999-02-19 | 2002-05-28 | International Business Machines Corporation | Method of providing direct data processing access using a queued direct input-output device |
US7266706B2 (en) * | 1999-03-03 | 2007-09-04 | Yottayotta, Inc. | Methods and systems for implementing shared disk array management functions |
US6449731B1 (en) * | 1999-03-03 | 2002-09-10 | Tricord Systems, Inc. | Self-healing computer system storage |
US6400730B1 (en) * | 1999-03-10 | 2002-06-04 | Nishan Systems, Inc. | Method and apparatus for transferring data between IP network devices and SCSI and fibre channel devices over an IP network |
US6341328B1 (en) * | 1999-04-20 | 2002-01-22 | Lucent Technologies, Inc. | Method and apparatus for using multiple co-dependent DMA controllers to provide a single set of read and write commands |
JP4461511B2 (en) * | 1999-06-09 | 2010-05-12 | 株式会社日立製作所 | Disk array device and data read / write method to disk device |
US6519679B2 (en) * | 1999-06-11 | 2003-02-11 | Dell Usa, L.P. | Policy based storage configuration |
JP2001034427A (en) * | 1999-07-23 | 2001-02-09 | Fujitsu Ltd | Device controller and its method |
US6961749B1 (en) * | 1999-08-25 | 2005-11-01 | Network Appliance, Inc. | Scalable file server with highly available pairs |
US6499058B1 (en) * | 1999-09-09 | 2002-12-24 | Motokazu Hozumi | File shared apparatus and its method file processing apparatus and its method recording medium in which file shared program is recorded and recording medium in which file processing program is recorded |
US6343324B1 (en) * | 1999-09-13 | 2002-01-29 | International Business Machines Corporation | Method and system for controlling access share storage devices in a network environment by configuring host-to-volume mapping data structures in the controller memory for granting and denying access to the devices |
US8165146B1 (en) | 1999-10-28 | 2012-04-24 | Lightwaves Systems Inc. | System and method for storing/caching, searching for, and accessing data |
US6944654B1 (en) * | 1999-11-01 | 2005-09-13 | Emc Corporation | Multiple storage array control |
US6351776B1 (en) | 1999-11-04 | 2002-02-26 | Xdrive, Inc. | Shared internet storage resource, user interface system, and method |
US20100185614A1 (en) | 1999-11-04 | 2010-07-22 | O'brien Brett | Shared Internet storage resource, user interface system, and method |
US6721900B1 (en) * | 1999-12-22 | 2004-04-13 | Rockwell Automation Technologies, Inc. | Safety network for industrial controller having reduced bandwidth requirements |
US6834326B1 (en) * | 2000-02-04 | 2004-12-21 | 3Com Corporation | RAID method and device with network protocol between controller and storage devices |
US6877044B2 (en) | 2000-02-10 | 2005-04-05 | Vicom Systems, Inc. | Distributed storage management platform architecture |
US6772270B1 (en) * | 2000-02-10 | 2004-08-03 | Vicom Systems, Inc. | Multi-port fibre channel controller |
US20010044879A1 (en) * | 2000-02-18 | 2001-11-22 | Moulton Gregory Hagan | System and method for distributed management of data storage |
US7509420B2 (en) * | 2000-02-18 | 2009-03-24 | Emc Corporation | System and method for intelligent, globally distributed network storage |
US6826711B2 (en) * | 2000-02-18 | 2004-11-30 | Avamar Technologies, Inc. | System and method for data protection with multidimensional parity |
CA2404095A1 (en) * | 2000-03-22 | 2001-09-27 | Yottayotta, Inc. | Method and system for providing multimedia information on demand over wide area networks |
US6490659B1 (en) * | 2000-03-31 | 2002-12-03 | International Business Machines Corporation | Warm start cache recovery in a dual active controller with cache coherency using stripe locks for implied storage volume reservations |
US6754718B1 (en) | 2000-05-10 | 2004-06-22 | Emc Corporation | Pushing attribute information to storage devices for network topology access |
US6950871B1 (en) * | 2000-06-29 | 2005-09-27 | Hitachi, Ltd. | Computer system having a storage area network and method of handling data in the computer system |
US6820171B1 (en) * | 2000-06-30 | 2004-11-16 | Lsi Logic Corporation | Methods and structures for an extensible RAID storage architecture |
US6625747B1 (en) * | 2000-06-30 | 2003-09-23 | Dell Products L.P. | Computer storage system and failover method |
US7188157B1 (en) * | 2000-06-30 | 2007-03-06 | Hitachi, Ltd. | Continuous update of data in a data server system |
US7281032B2 (en) * | 2000-06-30 | 2007-10-09 | Hitachi, Ltd. | File sharing system with data mirroring by storage systems |
US6928470B1 (en) * | 2000-07-31 | 2005-08-09 | Western Digital Ventures, Inc. | Transferring scheduling data from a plurality of disk storage devices to a network switch before transferring data associated with scheduled requests between the network switch and a plurality of host initiators |
WO2002015018A1 (en) * | 2000-08-11 | 2002-02-21 | 3Ware, Inc. | Architecture for providing block-level storage access over a computer network |
US6603625B1 (en) * | 2000-09-11 | 2003-08-05 | Western Digital Technologies, Inc. | Spindle synchronizing a plurality of spindles connected to a multi-dimensional computer network |
US6810491B1 (en) * | 2000-10-12 | 2004-10-26 | Hitachi America, Ltd. | Method and apparatus for the takeover of primary volume in multiple volume mirroring |
US6721902B1 (en) * | 2000-10-12 | 2004-04-13 | Hewlett-Packard Development Company, L.P. | Method and system for providing LUN-based backup reliability via LUN-based locking |
US6901451B1 (en) * | 2000-10-31 | 2005-05-31 | Fujitsu Limited | PCI bridge over network |
US6671773B2 (en) * | 2000-12-07 | 2003-12-30 | Spinnaker Networks, Llc | Method and system for responding to file system requests |
US6857059B2 (en) * | 2001-01-11 | 2005-02-15 | Yottayotta, Inc. | Storage virtualization system and methods |
US6907457B2 (en) * | 2001-01-25 | 2005-06-14 | Dell Inc. | Architecture for access to embedded files using a SAN intermediate device |
US6862692B2 (en) | 2001-01-29 | 2005-03-01 | Adaptec, Inc. | Dynamic redistribution of parity groups |
US20020156973A1 (en) * | 2001-01-29 | 2002-10-24 | Ulrich Thomas R. | Enhanced disk array |
US6990667B2 (en) | 2001-01-29 | 2006-01-24 | Adaptec, Inc. | Server-independent object positioning for load balancing drives and servers |
US20020191311A1 (en) * | 2001-01-29 | 2002-12-19 | Ulrich Thomas R. | Dynamically scalable disk array |
US20020138559A1 (en) * | 2001-01-29 | 2002-09-26 | Ulrich Thomas R. | Dynamically distributed file system |
US7054927B2 (en) | 2001-01-29 | 2006-05-30 | Adaptec, Inc. | File system metadata describing server directory information |
US6990547B2 (en) * | 2001-01-29 | 2006-01-24 | Adaptec, Inc. | Replacing file system processors by hot swapping |
IES20010611A2 (en) * | 2001-03-08 | 2002-09-18 | Richmount Computers Ltd | Distributed lock management chip |
US8766773B2 (en) | 2001-03-20 | 2014-07-01 | Lightwaves Systems, Inc. | Ultra wideband radio frequency identification system, method, and apparatus |
US7545868B2 (en) * | 2001-03-20 | 2009-06-09 | Lightwaves Systems, Inc. | High bandwidth data transport system |
US8270452B2 (en) * | 2002-04-30 | 2012-09-18 | Lightwaves Systems, Inc. | Method and apparatus for multi-band UWB communications |
US7533132B2 (en) * | 2001-03-21 | 2009-05-12 | Sap Ag | Parallel replication mechanism for state information produced by serialized processing |
US7231430B2 (en) * | 2001-04-20 | 2007-06-12 | Egenera, Inc. | Reconfigurable, virtual processing system, cluster, network and method |
US7062704B2 (en) * | 2001-04-30 | 2006-06-13 | Sun Microsystems, Inc. | Storage array employing scrubbing operations using multiple levels of checksums |
US7017107B2 (en) * | 2001-04-30 | 2006-03-21 | Sun Microsystems, Inc. | Storage array employing scrubbing operations at the disk-controller level |
US6839815B2 (en) * | 2001-05-07 | 2005-01-04 | Hitachi, Ltd. | System and method for storage on demand service in a global SAN environment |
US6915397B2 (en) * | 2001-06-01 | 2005-07-05 | Hewlett-Packard Development Company, L.P. | System and method for generating point in time storage copy |
US6757753B1 (en) | 2001-06-06 | 2004-06-29 | Lsi Logic Corporation | Uniform routing of storage access requests through redundant array controllers |
JP4232357B2 (en) * | 2001-06-14 | 2009-03-04 | 株式会社日立製作所 | Computer system |
JP4175788B2 (en) | 2001-07-05 | 2008-11-05 | 株式会社日立製作所 | Volume controller |
US7076510B2 (en) * | 2001-07-12 | 2006-07-11 | Brown William P | Software raid methods and apparatuses including server usage based write delegation |
US7289499B1 (en) * | 2001-07-16 | 2007-10-30 | Network Appliance, Inc. | Integrated system and method for controlling telecommunication network data communicated over a local area network and storage data communicated over a storage area network |
US7239642B1 (en) | 2001-07-16 | 2007-07-03 | Network Appliance, Inc. | Multi-protocol network interface card |
US7404206B2 (en) * | 2001-07-17 | 2008-07-22 | Yottayotta, Inc. | Network security devices and methods |
US7127565B2 (en) * | 2001-08-20 | 2006-10-24 | Spinnaker Networks, Inc. | Method and system for safely arbitrating disk drive ownership using a timestamp voting algorithm |
US7257815B2 (en) * | 2001-09-05 | 2007-08-14 | Microsoft Corporation | Methods and system of managing concurrent access to multiple resources |
US7330892B2 (en) * | 2001-09-07 | 2008-02-12 | Network Appliance, Inc. | High-speed data transfer in a storage virtualization controller |
US7472231B1 (en) | 2001-09-07 | 2008-12-30 | Netapp, Inc. | Storage area network data cache |
US20030055932A1 (en) * | 2001-09-19 | 2003-03-20 | Dell Products L.P. | System and method for configuring a storage area network |
US7421509B2 (en) * | 2001-09-28 | 2008-09-02 | Emc Corporation | Enforcing quality of service in a storage network |
US7404000B2 (en) * | 2001-09-28 | 2008-07-22 | Emc Corporation | Protocol translation in a storage system |
US6976134B1 (en) | 2001-09-28 | 2005-12-13 | Emc Corporation | Pooling and provisioning storage resources in a storage network |
US7185062B2 (en) * | 2001-09-28 | 2007-02-27 | Emc Corporation | Switch-based storage services |
US7864758B1 (en) | 2001-09-28 | 2011-01-04 | Emc Corporation | Virtualization in a storage system |
US7707304B1 (en) | 2001-09-28 | 2010-04-27 | Emc Corporation | Storage switch for storage area network |
US7558264B1 (en) | 2001-09-28 | 2009-07-07 | Emc Corporation | Packet classification in a storage system |
US20030079018A1 (en) * | 2001-09-28 | 2003-04-24 | Lolayekar Santosh C. | Load balancing in a storage network |
US7499986B2 (en) | 2001-10-04 | 2009-03-03 | International Business Machines Corporation | Storage area network methods with event notification conflict resolution |
US20030149762A1 (en) * | 2001-10-05 | 2003-08-07 | Knight Gregory John | Storage area network methods and apparatus with history maintenance and removal |
US20030154271A1 (en) * | 2001-10-05 | 2003-08-14 | Baldwin Duane Mark | Storage area network methods and apparatus with centralized management |
US6996670B2 (en) * | 2001-10-05 | 2006-02-07 | International Business Machines Corporation | Storage area network methods and apparatus with file system extension |
US20030109679A1 (en) * | 2001-10-10 | 2003-06-12 | Green Brent Everett | Flax protein isolate and production |
US6880101B2 (en) * | 2001-10-12 | 2005-04-12 | Dell Products L.P. | System and method for providing automatic data restoration after a storage device failure |
US6545872B1 (en) | 2001-10-12 | 2003-04-08 | Compaq Information Technologies Group, L.P. | Heat sink for edge connectors |
US6912599B2 (en) * | 2001-10-19 | 2005-06-28 | Hewlett-Packard Development Company, L.P. | Method and apparatus for sensing positions of device enclosures within multi-shelf cabinets |
US6920511B2 (en) * | 2001-10-19 | 2005-07-19 | Hewlett-Packard Development Company, L.P. | Method and apparatus for controlling communications in data storage complexes |
US6889345B2 (en) * | 2001-10-19 | 2005-05-03 | Hewlett-Packard Development Company, Lp. | System and method for locating a failed storage device in a data storage system |
US6988136B2 (en) * | 2001-10-19 | 2006-01-17 | Hewlett-Packard Development Company, L.P. | Unified management system and method for multi-cabinet data storage complexes |
US8046469B2 (en) * | 2001-10-22 | 2011-10-25 | Hewlett-Packard Development Company, L.P. | System and method for interfacing with virtual storage |
US6931487B2 (en) * | 2001-10-22 | 2005-08-16 | Hewlett-Packard Development Company L.P. | High performance multi-controller processing |
US6895467B2 (en) | 2001-10-22 | 2005-05-17 | Hewlett-Packard Development Company, L.P. | System and method for atomizing storage |
US6883065B1 (en) * | 2001-11-15 | 2005-04-19 | Xiotech Corporation | System and method for a redundant communication channel via storage area network back-end |
JP2003162377A (en) * | 2001-11-28 | 2003-06-06 | Hitachi Ltd | Disk array system and method for taking over logical unit among controllers |
US7194656B2 (en) * | 2001-11-28 | 2007-03-20 | Yottayotta Inc. | Systems and methods for implementing content sensitive routing over a wide area network (WAN) |
US6732201B2 (en) * | 2001-12-17 | 2004-05-04 | Lsi Logic Corporation | Hardware speed selection behind a disk array controller |
US7159080B1 (en) * | 2001-12-20 | 2007-01-02 | Network Appliance, Inc. | System and method for storing storage operating system data in switch ports |
US7296068B1 (en) | 2001-12-21 | 2007-11-13 | Network Appliance, Inc. | System and method for transfering volume ownership in net-worked storage |
US7650412B2 (en) * | 2001-12-21 | 2010-01-19 | Netapp, Inc. | Systems and method of implementing disk ownership in networked storage |
US7093043B2 (en) | 2001-12-27 | 2006-08-15 | Hewlett-Packard Development Company, L.P. | Data array having redundancy messaging between array controllers over the host bus |
EP1429249A3 (en) * | 2002-01-04 | 2009-09-09 | Egenera, Inc. | Virtual networking system and method in a processing system |
US7281044B2 (en) * | 2002-01-10 | 2007-10-09 | Hitachi, Ltd. | SAN infrastructure on demand service system |
US20030140128A1 (en) * | 2002-01-18 | 2003-07-24 | Dell Products L.P. | System and method for validating a network |
US6880052B2 (en) * | 2002-03-26 | 2005-04-12 | Hewlett-Packard Development Company, Lp | Storage area network, data replication and storage controller, and method for replicating data using virtualized volumes |
US6947981B2 (en) * | 2002-03-26 | 2005-09-20 | Hewlett-Packard Development Company, L.P. | Flexible data replication mechanism |
US8051197B2 (en) * | 2002-03-29 | 2011-11-01 | Brocade Communications Systems, Inc. | Network congestion management systems and methods |
US7120826B2 (en) * | 2002-03-29 | 2006-10-10 | International Business Machines Corporation | Partial mirroring during expansion thereby eliminating the need to track the progress of stripes updated during expansion |
US20040006635A1 (en) * | 2002-04-19 | 2004-01-08 | Oesterreicher Richard T. | Hybrid streaming platform |
US7899924B2 (en) * | 2002-04-19 | 2011-03-01 | Oesterreicher Richard T | Flexible streaming hardware |
US20040006636A1 (en) * | 2002-04-19 | 2004-01-08 | Oesterreicher Richard T. | Optimized digital media delivery engine |
US7328284B2 (en) * | 2002-05-06 | 2008-02-05 | Qlogic, Corporation | Dynamic configuration of network data flow using a shared I/O subsystem |
US7356608B2 (en) * | 2002-05-06 | 2008-04-08 | Qlogic, Corporation | System and method for implementing LAN within shared I/O subsystem |
US7447778B2 (en) * | 2002-05-06 | 2008-11-04 | Qlogic, Corporation | System and method for a shared I/O subsystem |
US7404012B2 (en) * | 2002-05-06 | 2008-07-22 | Qlogic, Corporation | System and method for dynamic link aggregation in a shared I/O subsystem |
JP2003330782A (en) * | 2002-05-10 | 2003-11-21 | Hitachi Ltd | Computer system |
US8140622B2 (en) | 2002-05-23 | 2012-03-20 | International Business Machines Corporation | Parallel metadata service in storage area network environment |
US7448077B2 (en) | 2002-05-23 | 2008-11-04 | International Business Machines Corporation | File level security for a metadata controller in a storage area network |
US7010528B2 (en) * | 2002-05-23 | 2006-03-07 | International Business Machines Corporation | Mechanism for running parallel application programs on metadata controller nodes |
US6732171B2 (en) * | 2002-05-31 | 2004-05-04 | Lefthand Networks, Inc. | Distributed network storage system with virtualization |
US7024586B2 (en) * | 2002-06-24 | 2006-04-04 | Network Appliance, Inc. | Using file system information in raid data reconstruction and migration |
GB0214669D0 (en) * | 2002-06-26 | 2002-08-07 | Ibm | Method for maintaining data access during failure of a controller |
US20040024807A1 (en) * | 2002-07-31 | 2004-02-05 | Microsoft Corporation | Asynchronous updates of weakly consistent distributed state information |
US6928509B2 (en) * | 2002-08-01 | 2005-08-09 | International Business Machines Corporation | Method and apparatus for enhancing reliability and scalability of serial storage devices |
MXPA05001357A (en) * | 2002-08-02 | 2005-08-26 | Grass Valley Inc | Real-time fail-over recovery for a media area network. |
US7418702B2 (en) * | 2002-08-06 | 2008-08-26 | Sheng (Ted) Tai Tsao | Concurrent web based multi-task support for control management system |
US7283560B1 (en) | 2002-08-08 | 2007-10-16 | Vicom Systems, Inc. | Method and apparatus for address translation between fibre channel addresses and SCSI addresses |
US7571206B2 (en) * | 2002-08-12 | 2009-08-04 | Equallogic, Inc. | Transparent request routing for a partitioned application service |
US7134044B2 (en) * | 2002-08-16 | 2006-11-07 | International Business Machines Corporation | Method, system, and program for providing a mirror copy of data |
JP2004086721A (en) * | 2002-08-28 | 2004-03-18 | Nec Corp | Data reproducing system, relay system, data transmission/receiving method, and program for reproducing data in storage |
US20040085908A1 (en) * | 2002-10-31 | 2004-05-06 | Brocade Communications Systems, Inc. | Method and apparatus for managing locking of resources in a cluster by use of a network fabric |
US7461146B2 (en) * | 2003-01-20 | 2008-12-02 | Equallogic, Inc. | Adaptive storage block data distribution |
US7627650B2 (en) * | 2003-01-20 | 2009-12-01 | Equallogic, Inc. | Short-cut response for distributed services |
US7127577B2 (en) * | 2003-01-21 | 2006-10-24 | Equallogic Inc. | Distributed snapshot process |
US7937551B2 (en) * | 2003-01-21 | 2011-05-03 | Dell Products L.P. | Storage systems having differentiated storage pools |
US20040210724A1 (en) * | 2003-01-21 | 2004-10-21 | Equallogic Inc. | Block data migration |
US8037264B2 (en) * | 2003-01-21 | 2011-10-11 | Dell Products, L.P. | Distributed snapshot process |
US8499086B2 (en) * | 2003-01-21 | 2013-07-30 | Dell Products L.P. | Client load distribution |
US7181574B1 (en) * | 2003-01-30 | 2007-02-20 | Veritas Operating Corporation | Server cluster using informed prefetching |
US20040181707A1 (en) * | 2003-03-11 | 2004-09-16 | Hitachi, Ltd. | Method and apparatus for seamless management for disaster recovery |
US7817583B2 (en) * | 2003-04-28 | 2010-10-19 | Hewlett-Packard Development Company, L.P. | Method for verifying a storage area network configuration |
JP2005157825A (en) * | 2003-11-27 | 2005-06-16 | Hitachi Ltd | Computer system with function to recover from failure and method for recovery from failure |
JP2005004350A (en) * | 2003-06-10 | 2005-01-06 | Sony Ericsson Mobilecommunications Japan Inc | Resource management method and device, resource management program, and storage medium |
US7065589B2 (en) | 2003-06-23 | 2006-06-20 | Hitachi, Ltd. | Three data center remote copy system with journaling |
US20050015655A1 (en) * | 2003-06-30 | 2005-01-20 | Clayton Michele M. | Intermediate station |
US7379974B2 (en) * | 2003-07-14 | 2008-05-27 | International Business Machines Corporation | Multipath data retrieval from redundant array |
JP4313650B2 (en) * | 2003-11-07 | 2009-08-12 | 株式会社日立製作所 | File server, redundancy recovery method, program, and recording medium |
US7446433B2 (en) * | 2004-01-23 | 2008-11-04 | American Power Conversion Corporation | Methods and apparatus for providing uninterruptible power |
CN100452673C (en) * | 2004-02-16 | 2009-01-14 | 上海欣国信息技术有限公司 | Digital auendant console |
JP2005275829A (en) * | 2004-03-25 | 2005-10-06 | Hitachi Ltd | Storage system |
US7529291B2 (en) * | 2004-04-13 | 2009-05-05 | Raytheon Company | Methods and structures for rapid code acquisition in spread spectrum communications |
US7409494B2 (en) * | 2004-04-30 | 2008-08-05 | Network Appliance, Inc. | Extension of write anywhere file system layout |
US20050289143A1 (en) | 2004-06-23 | 2005-12-29 | Exanet Ltd. | Method for managing lock resources in a distributed storage system |
JP2008506195A (en) * | 2004-07-07 | 2008-02-28 | ヨッタヨッタ インコーポレイテッド | System and method for providing distributed cache coherence |
KR100899462B1 (en) * | 2004-07-21 | 2009-05-27 | 비치 언리미티드 엘엘씨 | Distributed storage architecture based on block map caching and vfs stackable file system modules |
US7412545B2 (en) * | 2004-07-22 | 2008-08-12 | International Business Machines Corporation | Apparatus and method for updating I/O capability of a logically-partitioned computer system |
US20060146780A1 (en) * | 2004-07-23 | 2006-07-06 | Jaques Paves | Trickmodes and speed transitions |
US7240155B2 (en) * | 2004-09-30 | 2007-07-03 | International Business Machines Corporation | Decision mechanisms for adapting RAID operation placement |
US7529967B2 (en) * | 2004-11-04 | 2009-05-05 | Rackable Systems Inc. | Method and system for network storage device failure protection and recovery |
US7535832B2 (en) * | 2004-11-22 | 2009-05-19 | International Business Machines Corporation | Apparatus and method to set the signaling rate of a switch domain disposed within an information storage and retrieval system |
US9495263B2 (en) * | 2004-12-21 | 2016-11-15 | Infortrend Technology, Inc. | Redundant SAS storage virtualization subsystem and system using the same, and method therefor |
JP4563794B2 (en) * | 2004-12-28 | 2010-10-13 | 株式会社日立製作所 | Storage system and storage management method |
CN100373354C (en) * | 2005-01-20 | 2008-03-05 | 英业达股份有限公司 | Data accessing system and method capable of recognizing disk cache content validity |
US7401260B2 (en) * | 2005-01-28 | 2008-07-15 | International Business Machines Corporation | Apparatus, system, and method for performing storage device maintenance |
US7535917B1 (en) | 2005-02-22 | 2009-05-19 | Netapp, Inc. | Multi-protocol network adapter |
JP4815825B2 (en) * | 2005-03-10 | 2011-11-16 | 日本電気株式会社 | Disk array device and method for reconstructing the same |
US7676688B1 (en) * | 2005-03-16 | 2010-03-09 | Symantec Corporation | Concurrent data broadcast of selected common groups of data blocks |
US7600214B2 (en) | 2005-04-18 | 2009-10-06 | Broadcom Corporation | Use of metadata for seamless updates |
CA2615324A1 (en) * | 2005-07-14 | 2007-07-05 | Yotta Yotta, Inc. | Maintaining write order fidelity on a multi-writer system |
US8819088B2 (en) * | 2005-07-14 | 2014-08-26 | International Business Machines Corporation | Implementing storage management functions using a data store system |
CN100405313C (en) * | 2005-07-22 | 2008-07-23 | Hong Fu Jin Precision Industry (Shenzhen) Co., Ltd. | Link control card detecting system and method |
US7502957B2 (en) * | 2005-09-09 | 2009-03-10 | International Business Machines Corporation | Method and system to execute recovery in non-homogeneous multi processor environments |
US8396966B2 (en) * | 2005-10-26 | 2013-03-12 | Hewlett-Packard Development Company, L.P. | Method and an apparatus for creating visual representations of farms that enables connecting farms securely |
US8347010B1 (en) * | 2005-12-02 | 2013-01-01 | Branislav Radovanovic | Scalable data storage architecture and methods of eliminating I/O traffic bottlenecks |
US9118698B1 (en) | 2005-12-02 | 2015-08-25 | Branislav Radovanovic | Scalable data storage architecture and methods of eliminating I/O traffic bottlenecks |
US8275949B2 (en) * | 2005-12-13 | 2012-09-25 | International Business Machines Corporation | System support storage and computer system |
EP1985046B1 (en) | 2006-02-14 | 2016-09-14 | EMC Corporation | Systems and methods for obtaining ultra-high data availability and geographic disaster tolerance |
US7676702B2 (en) * | 2006-08-14 | 2010-03-09 | International Business Machines Corporation | Preemptive data protection for copy services in storage systems and applications |
JP5179031B2 (en) * | 2006-09-13 | 2013-04-10 | Hitachi, Ltd. | Storage system that makes effective use of available ports |
US20080181243A1 (en) * | 2006-12-15 | 2008-07-31 | Brocade Communications Systems, Inc. | Ethernet forwarding in high performance fabrics |
US20080159277A1 (en) * | 2006-12-15 | 2008-07-03 | Brocade Communications Systems, Inc. | Ethernet over fibre channel |
US7882393B2 (en) * | 2007-03-28 | 2011-02-01 | International Business Machines Corporation | In-band problem log data collection between a host system and a storage system |
US7779308B2 (en) * | 2007-06-21 | 2010-08-17 | International Business Machines Corporation | Error processing across multiple initiator network |
US8892942B2 (en) * | 2007-07-27 | 2014-11-18 | Hewlett-Packard Development Company, L.P. | Rank sparing system and method |
US8583780B2 (en) * | 2007-11-20 | 2013-11-12 | Brocade Communications Systems, Inc. | Discovery of duplicate address in a network by reviewing discovery frames received at a port |
US8108454B2 (en) * | 2007-12-17 | 2012-01-31 | Brocade Communications Systems, Inc. | Address assignment in Fibre Channel over Ethernet environments |
JP5072692B2 (en) * | 2008-04-07 | 2012-11-14 | Hitachi, Ltd. | Storage system with multiple storage system modules |
US8347182B2 (en) * | 2008-07-01 | 2013-01-01 | International Business Machines Corporation | Ensuring data consistency |
US20100043006A1 (en) * | 2008-08-13 | 2010-02-18 | Egenera, Inc. | Systems and methods for a configurable deployment platform with virtualization of processing resource specific persistent settings |
US20100083268A1 (en) * | 2008-09-29 | 2010-04-01 | Morris Robert P | Method And System For Managing Access To A Resource By A Process Processing A Media Stream |
US8086911B1 (en) * | 2008-10-29 | 2011-12-27 | Netapp, Inc. | Method and apparatus for distributed reconstruct in a raid system |
US7882389B2 (en) * | 2008-11-18 | 2011-02-01 | International Business Machines Corporation | Dynamic reassignment of devices attached to redundant controllers |
US8495417B2 (en) | 2009-01-09 | 2013-07-23 | Netapp, Inc. | System and method for redundancy-protected aggregates |
US8848575B2 (en) | 2009-02-23 | 2014-09-30 | Brocade Communications Systems, Inc. | High availability and multipathing for fibre channel over ethernet |
US8719829B2 (en) * | 2009-03-09 | 2014-05-06 | International Business Machines Corporation | Synchronizing processes in a computing resource by locking a resource for a process at a predicted time slot |
US20100293145A1 (en) * | 2009-05-15 | 2010-11-18 | Hewlett-Packard Development Company, L.P. | Method of Selective Replication in a Storage Area Network |
US8074003B1 (en) * | 2009-12-28 | 2011-12-06 | Emc Corporation | Host-based storage controller providing block devices in geographically distributed storage |
CN102209097A (en) * | 2010-03-31 | 2011-10-05 | Inventec Corporation | System for allocating storage resources of storage local area network |
US8738724B2 (en) | 2010-05-25 | 2014-05-27 | Microsoft Corporation | Totally ordered log on appendable storage |
WO2012020505A1 (en) * | 2010-08-13 | 2012-02-16 | Fujitsu Limited | Memory control device, information processing device, and control method for a memory control device |
CN102129400B (en) | 2010-12-29 | 2013-12-04 | Huawei Digital Technologies (Chengdu) Co., Ltd. | Storage system connection configuration method and equipment and storage system |
CN104094603B (en) * | 2011-12-28 | 2018-06-08 | Intel Corporation | System and method for integrated metadata insertion in a video encoding system |
CN103838515B (en) * | 2012-11-23 | 2016-08-03 | Institute of Acoustics, Chinese Academy of Sciences | Method and system for server cluster access scheduling in a multi-controller disk array |
CN103701925B (en) * | 2013-12-31 | 2017-04-05 | Beijing Netentsec Technology Co., Ltd. | Resource synchronization management and control method |
US9547448B2 (en) * | 2014-02-24 | 2017-01-17 | Netapp, Inc. | System and method for transposed storage in raid arrays |
US10455019B2 (en) * | 2014-09-10 | 2019-10-22 | Oracle International Corporation | Highly performant reliable message storage using in-memory replication technology |
US9853873B2 (en) | 2015-01-10 | 2017-12-26 | Cisco Technology, Inc. | Diagnosis and throughput measurement of fibre channel ports in a storage area network environment |
US9900250B2 (en) | 2015-03-26 | 2018-02-20 | Cisco Technology, Inc. | Scalable handling of BGP route information in VXLAN with EVPN control plane |
US10222986B2 (en) | 2015-05-15 | 2019-03-05 | Cisco Technology, Inc. | Tenant-level sharding of disks with tenant-specific storage modules to enable policies per tenant in a distributed storage system |
US11588783B2 (en) | 2015-06-10 | 2023-02-21 | Cisco Technology, Inc. | Techniques for implementing IPV6-based distributed storage space |
US10778765B2 (en) | 2015-07-15 | 2020-09-15 | Cisco Technology, Inc. | Bid/ask protocol in scale-out NVMe storage |
US9892075B2 (en) | 2015-12-10 | 2018-02-13 | Cisco Technology, Inc. | Policy driven storage in a microserver computing environment |
CN105677499B (en) * | 2015-12-29 | 2018-10-12 | Dawning Information Industry (Beijing) Co., Ltd. | Hardware-based time-out management platform |
CN105786656B (en) * | 2016-02-17 | 2019-08-13 | Chengdu Information Technology of Chinese Academy of Sciences Co., Ltd. | Disaster-tolerant storage method for a redundant array of independent disks based on random matrices |
US10331353B2 (en) | 2016-04-08 | 2019-06-25 | Branislav Radovanovic | Scalable data access system and methods of eliminating controller bottlenecks |
US10140172B2 (en) | 2016-05-18 | 2018-11-27 | Cisco Technology, Inc. | Network-aware storage repairs |
US20170351639A1 (en) | 2016-06-06 | 2017-12-07 | Cisco Technology, Inc. | Remote memory access using memory mapped addressing among multiple compute nodes |
US10664169B2 (en) | 2016-06-24 | 2020-05-26 | Cisco Technology, Inc. | Performance of object storage system by reconfiguring storage devices based on latency that includes identifying a number of fragments that has a particular storage device as its primary storage device and another number of fragments that has said particular storage device as its replica storage device |
US11563695B2 (en) | 2016-08-29 | 2023-01-24 | Cisco Technology, Inc. | Queue protection using a shared global memory reserve |
CN106527978B (en) * | 2016-10-19 | 2019-07-09 | Huazhong University of Science and Technology | Multi-controller implementation method based on ring-shaped virtual dual control |
US10545914B2 (en) | 2017-01-17 | 2020-01-28 | Cisco Technology, Inc. | Distributed object storage |
US10243823B1 (en) | 2017-02-24 | 2019-03-26 | Cisco Technology, Inc. | Techniques for using frame deep loopback capabilities for extended link diagnostics in fibre channel storage area networks |
US10713203B2 (en) | 2017-02-28 | 2020-07-14 | Cisco Technology, Inc. | Dynamic partition of PCIe disk arrays based on software configuration / policy distribution |
US10254991B2 (en) | 2017-03-06 | 2019-04-09 | Cisco Technology, Inc. | Storage area network based extended I/O metrics computation for deep insight into application performance |
CN109089255B (en) * | 2017-06-14 | 2022-01-25 | Research Institute of China Mobile Communications Co., Ltd. | User position notification control method, device, system, equipment and storage medium |
US10303534B2 (en) | 2017-07-20 | 2019-05-28 | Cisco Technology, Inc. | System and method for self-healing of application centric infrastructure fabric memory |
US10404596B2 (en) | 2017-10-03 | 2019-09-03 | Cisco Technology, Inc. | Dynamic route profile storage in a hardware trie routing table |
US10942666B2 (en) | 2017-10-13 | 2021-03-09 | Cisco Technology, Inc. | Using network device replication in distributed storage clusters |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5574851A (en) * | 1993-04-19 | 1996-11-12 | At&T Global Information Solutions Company | Method for performing on-line reconfiguration of a disk array concurrent with execution of disk I/O operations |
US5694581A (en) * | 1993-09-07 | 1997-12-02 | Industrial Technology Research Institute | Concurrent disk array management system implemented with CPU executable extension |
US5818754A (en) * | 1995-12-27 | 1998-10-06 | Nec Corporation | Nonvolatile memory having data storing area and attribute data area for storing attribute data of data storing area |
US5826001A (en) * | 1995-10-13 | 1998-10-20 | Digital Equipment Corporation | Reconstructing data blocks in a raid array data storage system having storage device metadata and raid set metadata |
Family Cites Families (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5671436A (en) | 1991-08-21 | 1997-09-23 | Norand Corporation | Versatile RF data capture system |
WO1993018456A1 (en) * | 1992-03-13 | 1993-09-16 | Emc Corporation | Multiple controller sharing in a redundant storage array |
US5459857A (en) * | 1992-05-15 | 1995-10-17 | Storage Technology Corporation | Fault tolerant disk array data storage subsystem |
US5611049A (en) | 1992-06-03 | 1997-03-11 | Pitts; William M. | System for accessing distributed data cache channel at each network node to pass requests and data |
US5758058A (en) * | 1993-03-31 | 1998-05-26 | Intel Corporation | Apparatus and method for initializing a master/checker fault detecting microprocessor |
TW252248B (en) | 1994-08-23 | 1995-07-21 | Ibm | A semiconductor memory based server for providing multimedia information on demand over wide area networks |
US6085234A (en) | 1994-11-28 | 2000-07-04 | Inca Technology, Inc. | Remote file services network-infrastructure cache |
US5875456A (en) | 1995-08-17 | 1999-02-23 | Nstor Corporation | Storage device array and methods for striping and unstriping data and for adding and removing disks online to/from a raid storage array |
US5657468A (en) | 1995-08-17 | 1997-08-12 | Ambex Technologies, Inc. | Method and apparatus for improving performance in a redundant array of independent disks |
US5862312A (en) * | 1995-10-24 | 1999-01-19 | Seachange Technology, Inc. | Loosely coupled mass storage computer cluster |
US6073218A (en) * | 1996-12-23 | 2000-06-06 | Lsi Logic Corp. | Methods and apparatus for coordinating shared multiple raid controller access to common storage devices |
JPH10289524A (en) * | 1997-04-11 | 1998-10-27 | Sony Corp | Recording medium drive device |
US6151297A (en) | 1997-07-08 | 2000-11-21 | Hewlett-Packard Company | Method and system for link level server/switch trunking |
US6049833A (en) | 1997-08-29 | 2000-04-11 | Cisco Technology, Inc. | Mapping SNA session flow control to TCP flow control |
US6216173B1 (en) | 1998-02-03 | 2001-04-10 | Redbox Technologies Limited | Method and apparatus for content processing and routing |
US6138247A (en) * | 1998-05-14 | 2000-10-24 | Motorola, Inc. | Method for switching between multiple system processors |
US6243829B1 (en) * | 1998-05-27 | 2001-06-05 | Hewlett-Packard Company | Memory controller supporting redundant synchronous memories |
US6148414A (en) * | 1998-09-24 | 2000-11-14 | Seek Systems, Inc. | Methods and systems for implementing shared disk array management functions |
US6405219B2 (en) | 1999-06-22 | 2002-06-11 | F5 Networks, Inc. | Method and system for automatically updating the version of a set of files stored on content servers |
US7343413B2 (en) | 2000-03-21 | 2008-03-11 | F5 Networks, Inc. | Method and system for optimizing a network by independently scaling control segments and data flow |
CA2404095A1 (en) | 2000-03-22 | 2001-09-27 | Yottayotta, Inc. | Method and system for providing multimedia information on demand over wide area networks |
US8281022B1 (en) | 2000-06-30 | 2012-10-02 | Emc Corporation | Method and apparatus for implementing high-performance, scaleable data processing and storage systems |
US6658478B1 (en) | 2000-08-04 | 2003-12-02 | 3Pardata, Inc. | Data storage system |
- 1999
  - 1999-03-03 US US09/261,906 patent/US6148414A/en not_active Expired - Lifetime
- 2000
  - 2000-02-08 AU AU33586/00A patent/AU3358600A/en not_active Abandoned
  - 2000-02-08 KR KR1020017011191A patent/KR20020012539A/en not_active Application Discontinuation
  - 2000-02-08 DE DE60042225T patent/DE60042225D1/en not_active Expired - Lifetime
  - 2000-02-08 WO PCT/US2000/003275 patent/WO2000052576A1/en active IP Right Grant
  - 2000-02-08 CN CN00804532.1A patent/CN100489796C/en not_active Expired - Lifetime
  - 2000-02-08 CA CA2363726A patent/CA2363726C/en not_active Expired - Lifetime
  - 2000-02-08 AT AT00911736T patent/ATE359552T1/en not_active IP Right Cessation
  - 2000-02-08 AT AT07104609T patent/ATE431589T1/en not_active IP Right Cessation
  - 2000-02-08 NZ NZ513789A patent/NZ513789A/en unknown
  - 2000-02-08 EP EP00911736A patent/EP1171820B1/en not_active Expired - Lifetime
  - 2000-02-08 DE DE60034327T patent/DE60034327T2/en not_active Expired - Lifetime
  - 2000-02-08 JP JP2000602929A patent/JP2002538549A/en active Pending
  - 2000-02-08 EP EP07104609A patent/EP1796001B1/en not_active Expired - Lifetime
  - 2000-03-20 TW TW089103707A patent/TW468107B/en not_active IP Right Cessation
  - 2000-09-07 US US09/657,258 patent/US6912668B1/en not_active Expired - Lifetime
- 2005
  - 2005-03-15 US US11/082,178 patent/US7246260B2/en not_active Expired - Lifetime
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6654830B1 (en) | 1999-03-25 | 2003-11-25 | Dell Products L.P. | Method and system for managing data migration for a storage system |
GB2351375B (en) * | 1999-03-25 | 2001-11-14 | Dell Usa Lp | Storage domain management system |
GB2351375A (en) * | 1999-03-25 | 2000-12-27 | Dell Usa Lp | Storage Domain Management System |
US6446141B1 (en) | 1999-03-25 | 2002-09-03 | Dell Products, L.P. | Storage server system including ranking of data source |
US6553408B1 (en) | 1999-03-25 | 2003-04-22 | Dell Products L.P. | Virtual device architecture having memory for storing lists of driver modules |
US6640278B1 (en) | 1999-03-25 | 2003-10-28 | Dell Products L.P. | Method for configuration and management of storage resources in a storage network |
US6904544B2 (en) | 2001-01-30 | 2005-06-07 | Sun Microsystems, Inc. | Method, system, program, and data structures for testing a network system including input/output devices |
WO2002062014A3 (en) * | 2001-01-30 | 2003-11-06 | Sun Microsystems Inc | Method and system for testing a network system |
WO2002062014A2 (en) * | 2001-01-30 | 2002-08-08 | Sun Microsystems, Inc. | Method and system for testing a network system |
EP1370945A4 (en) * | 2001-02-13 | 2008-01-02 | Candera Inc | Failover processing in a storage system |
EP1370945A1 (en) * | 2001-02-13 | 2003-12-17 | Candera, Inc. | Failover processing in a storage system |
EP1402420A4 (en) * | 2001-06-26 | 2008-01-09 | Emc Corp | Mirroring network data to establish virtual storage area network |
EP1402420A2 (en) * | 2001-06-26 | 2004-03-31 | EMC Corporation | Mirroring network data to establish virtual storage area network |
US7548975B2 (en) | 2002-01-09 | 2009-06-16 | Cisco Technology, Inc. | Methods and apparatus for implementing virtualization of storage within a storage area network through a virtual enclosure |
US8805918B1 (en) | 2002-09-11 | 2014-08-12 | Cisco Technology, Inc. | Methods and apparatus for implementing exchange management for virtualization of storage within a storage area network |
US9733868B2 (en) | 2002-09-11 | 2017-08-15 | Cisco Technology, Inc. | Methods and apparatus for implementing exchange management for virtualization of storage within a storage area network |
US7056718B2 (en) | 2003-08-08 | 2006-06-06 | Novozymes, Inc. | Polypeptides having oxaloacetate hydrolase activity and nucleic acids encoding same |
US7934023B2 (en) | 2003-12-01 | 2011-04-26 | Cisco Technology, Inc. | Apparatus and method for performing fast fibre channel write operations over relatively high latency networks |
WO2009032360A1 (en) * | 2007-09-05 | 2009-03-12 | Nec Laboratories America, Inc. | Storage over optical/ wireless integrated broadband access network (soba) architecture |
EP2830284A4 (en) * | 2012-12-28 | 2015-05-20 | Huawei Tech Co Ltd | Caching method for distributed storage system, node and computer readable medium |
US9424204B2 (en) | 2012-12-28 | 2016-08-23 | Huawei Technologies Co., Ltd. | Caching method for distributed storage system, a lock server node, and a lock client node |
Also Published As
Publication number | Publication date |
---|---|
US7246260B2 (en) | 2007-07-17 |
ATE359552T1 (en) | 2007-05-15 |
DE60034327D1 (en) | 2007-05-24 |
EP1796001A3 (en) | 2007-06-27 |
CA2363726C (en) | 2010-06-29 |
US6148414A (en) | 2000-11-14 |
JP2002538549A (en) | 2002-11-12 |
US6912668B1 (en) | 2005-06-28 |
ATE431589T1 (en) | 2009-05-15 |
US20060005076A1 (en) | 2006-01-05 |
NZ513789A (en) | 2003-10-31 |
DE60034327T2 (en) | 2008-01-03 |
EP1171820B1 (en) | 2007-04-11 |
CN100489796C (en) | 2009-05-20 |
EP1796001A2 (en) | 2007-06-13 |
EP1796001B1 (en) | 2009-05-13 |
KR20020012539A (en) | 2002-02-16 |
EP1171820A4 (en) | 2005-06-15 |
TW468107B (en) | 2001-12-11 |
EP1171820A1 (en) | 2002-01-16 |
CN1350674A (en) | 2002-05-22 |
WO2000052576A9 (en) | 2001-08-30 |
DE60042225D1 (en) | 2009-06-25 |
AU3358600A (en) | 2000-09-21 |
CA2363726A1 (en) | 2000-09-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6148414A (en) | Methods and systems for implementing shared disk array management functions | |
US7721144B2 (en) | Methods and systems for implementing shared disk array management functions | |
US7181578B1 (en) | Method and apparatus for efficient scalable storage management | |
US6571354B1 (en) | Method and apparatus for storage unit replacement according to array priority | |
US8639878B1 (en) | Providing redundancy in a storage system | |
US20180181430A1 (en) | Storage virtual machine relocation | |
US8335899B1 (en) | Active/active remote synchronous mirroring | |
KR100995466B1 (en) | Methods and apparatus for implementing virtualization of storage within a storage area network | |
US7173929B1 (en) | Fast path for performing data operations | |
US8521685B1 (en) | Background movement of data between nodes in a storage cluster | |
US6973549B1 (en) | Locking technique for control and synchronization | |
US20030140209A1 (en) | Fast path caching | |
US20030188218A1 (en) | System and method for active-active data replication | |
US20020161983A1 (en) | System, method, and computer program product for shared device of storage compacting | |
AU2003238219A1 (en) | Methods and apparatus for implementing virtualization of storage within a storage area network | |
US9058127B2 (en) | Data transfer in cluster storage systems | |
CN108205573B (en) | Data distributed storage method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 00804532.1 Country of ref document: CN |
|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
WWE | Wipo information: entry into national phase |
Ref document number: IN/PCT/2001/871/KOL Country of ref document: IN |
|
ENP | Entry into the national phase |
Ref document number: 2363726 Country of ref document: CA Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 513789 Country of ref document: NZ |
|
AK | Designated states |
Kind code of ref document: C2 Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: C2 Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
COP | Corrected version of pamphlet |
Free format text: PAGES 1/19-19/19, DRAWINGS, REPLACED BY NEW PAGES 1/19-19/19; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE |
|
ENP | Entry into the national phase |
Ref document number: 2000 602929 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1020017011191 Country of ref document: KR |
|
WWE | Wipo information: entry into national phase |
Ref document number: 33586/00 Country of ref document: AU |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2000911736 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
WWP | Wipo information: published in national office |
Ref document number: 2000911736 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 1020017011191 Country of ref document: KR |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 1020017011191 Country of ref document: KR |
|
WWG | Wipo information: grant in national office |
Ref document number: 2000911736 Country of ref document: EP |